Overview
Google Docs and Sheets (documents and spreadsheets in the formats native to the Google Drive web platform) can be collected, preserved, and shared with Archive-It’s current software suite. For reference, here are the Archive-It team’s current recommendations for archiving them:
Known issues
There are currently no known issues for archiving Google Docs and Sheets. Please note that we do not currently support archiving documents that follow the drive.google.com/file/ URL path. For a full list of known issues for archiving various platforms, please visit our Status of monitored platforms page.
On this page
- How to archive Google Docs and Sheets individually
- How to archive links to Google Docs and Sheets automatically
- What to expect from archived Google Docs, Sheets, and Drive
How to archive Google Docs and Sheets individually
To collect each directly, format the Google Doc or Sheet's seed URL exactly as it appears in the archived example cases below, substituting your document's unique identifying alphanumeric string where appropriate:
- Google doc: https://docs.google.com/document/d/124LjR2jsB8YYbxSHc09NZE1QEKm9iI_f5e3_x4skQc0/edit
- Google sheet: https://docs.google.com/spreadsheets/u/0/d/1BhiZ2lDuKVk3RewgTZC_SqdD4XWZxvvCnr6EDC188OQ/htmlview
Do not put a trailing slash (/) at the end of the seed URL. Use the One Page seed type in order to archive only the doc or sheet seen at your seed URL, or One Page Plus in order to archive it and its links out to other web pages.
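As an illustration of the formatting guidance above, the following Python sketch rewrites a Google Doc or Sheet link into the seed format shown in the two examples. The function name `to_seed_url` and the regular expression are hypothetical and are not part of any Archive-It tool. The sketch assumes that document identifiers contain only letters, digits, underscores, and hyphens.

```python
import re
from urllib.parse import urlparse

# Hypothetical pattern: matches /document/d/<id> or /spreadsheets/d/<id>,
# with an optional /u/<n>/ segment as seen in the sheet example above.
# Assumption: identifiers use only letters, digits, "_" and "-".
_DOC_PATH = re.compile(r"^/(document|spreadsheets)/(?:u/\d+/)?d/([A-Za-z0-9_-]+)")


def to_seed_url(url: str) -> str:
    """Return the URL in the seed format from the examples, with no trailing slash."""
    parsed = urlparse(url)
    if parsed.netloc != "docs.google.com":
        raise ValueError(f"Not a docs.google.com URL: {url}")
    match = _DOC_PATH.match(parsed.path)
    if match is None:
        raise ValueError(f"No document identifier found in: {url}")
    kind, doc_id = match.groups()
    if kind == "document":
        return f"https://docs.google.com/document/d/{doc_id}/edit"
    return f"https://docs.google.com/spreadsheets/u/0/d/{doc_id}/htmlview"


if __name__ == "__main__":
    # The stray trailing slash is dropped when the seed URL is rebuilt.
    print(to_seed_url(
        "https://docs.google.com/document/d/"
        "124LjR2jsB8YYbxSHc09NZE1QEKm9iI_f5e3_x4skQc0/edit/"
    ))
```

Rebuilding the URL from the identifier, rather than editing the original string, is one simple way to guarantee that no trailing slash or extra query string ends up in the seed.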
How to archive links to Google Docs and Sheets automatically
To archive the Google Docs and Sheets that your seeds link to, add the following scoping rules to those seeds and make sure to collect them with Brozzler:
- Expand scope to include URL if it contains the text: docs.google.com/document/
- Expand scope to include URL if it contains the text: docs.google.com/spreadsheets/
Note that capture tools cannot yet navigate the links between Google Drive folders and their contained docs, so Archive-It does not recommend using Google Drive folders as seed URLs.
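To preview which links the two scoping rules above would bring into scope, the "contains the text" test can be approximated as a plain substring check. This is only a sketch for checking a list of links by hand; it is not how Archive-It or Brozzler evaluates scope internally, and the names `SCOPE_EXPANSION_TEXTS` and `matches_scope_expansion` are hypothetical.

```python
# The two text fragments from the scoping rules listed above.
SCOPE_EXPANSION_TEXTS = (
    "docs.google.com/document/",
    "docs.google.com/spreadsheets/",
)


def matches_scope_expansion(url: str) -> bool:
    """Return True if the URL contains either scoping-rule text."""
    return any(text in url for text in SCOPE_EXPANSION_TEXTS)


if __name__ == "__main__":
    links = [
        "https://docs.google.com/document/d/124LjR2jsB8YYbxSHc09NZE1QEKm9iI_f5e3_x4skQc0/edit",
        "https://docs.google.com/spreadsheets/u/0/d/1BhiZ2lDuKVk3RewgTZC_SqdD4XWZxvvCnr6EDC188OQ/htmlview",
        "https://drive.google.com/file/d/example/view",  # hypothetical link, not matched
    ]
    for link in links:
        print(matches_scope_expansion(link), link)
```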
What to expect from archived Google Docs, Sheets, and Drive
Google Docs
Some archived Google Docs will display error messages in addition to their contents. If you click on the 'help us improve' link in the error message, the Google Doc will replay as expected. At this time there is no other solution for this Wayback replay issue.
Google Sheets
Some Google spreadsheets archived automatically will display error messages in addition to their contents. There is no known solution to this Wayback replay issue yet; however, the contents of these crawls may be collected and preserved while replay improvements are made.
Google Drive
Please note that we do not currently support archiving documents that follow the drive.google.com/file/ URL path. Google Drive seeds will generally only show the Drive interface's loading wheel in Wayback.
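One way to catch such seeds before a crawl is to screen a seed list for drive.google.com URLs. The sketch below is a hypothetical pre-flight check, not an Archive-It feature; the function name `drive_seed_warning` and the warning texts are illustrative only.

```python
from typing import Optional
from urllib.parse import urlparse


def drive_seed_warning(url: str) -> Optional[str]:
    """Return a warning for Google Drive seeds, or None for any other URL."""
    parsed = urlparse(url)
    if parsed.netloc != "drive.google.com":
        return None
    if parsed.path.startswith("/file/"):
        # The drive.google.com/file/ URL path is not currently supported.
        return "unsupported: drive.google.com/file/ URLs cannot currently be archived"
    # Other Drive URLs, such as folders, are not recommended as seeds.
    return "not recommended: Google Drive seeds generally replay as a loading wheel"


if __name__ == "__main__":
    # Hypothetical seed used only to demonstrate the check.
    print(drive_seed_warning("https://drive.google.com/file/d/example/view"))
```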