Overview
Google Docs and Sheets (documents and spreadsheets in the formats native to the Google Drive web platform) can be collected, preserved, and shared with Archive-It's current software suite. We do not currently support archiving documents that follow the drive.google.com/file/ URL path.
Known issues
There are currently no known issues for archiving Google Docs and Sheets.
For a full list of known issues with archiving various platforms, please visit our Status of monitored platforms page.
On this page
- How to archive Google Docs and Sheets individually
- How to archive links to Google Docs and Sheets automatically
- What to expect from archived Google Docs, Sheets, and Drive
How to archive Google Docs and Sheets individually
To collect Google Docs and Sheets individually, format each Google Doc or Sheet's seed URL exactly as it appears in the example cases below, substituting your document's unique identifying alphanumeric string where appropriate:
- Google doc: https://docs.google.com/document/d/124LjR2jsB8YYbxSHc09NZE1QEKm9iI_f5e3_x4skQc0/edit
- Google sheet: https://docs.google.com/spreadsheets/u/0/d/1BhiZ2lDuKVk3RewgTZC_SqdD4XWZxvvCnr6EDC188OQ/htmlview
Do not put a trailing slash (/) at the end of the seed URL. Use the One Page seed type in order to archive only the doc or sheet seen at your seed URL, or One Page Plus in order to archive it and its links out to other web pages.
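The formatting rules above can be sketched as a small helper that builds seed URLs from a document's identifier. This is an illustrative sketch only, not an Archive-It tool: the function name `build_seed_url`, the `kind` parameter, and the assumption that identifiers contain only letters, digits, hyphens, and underscores are hypothetical.

```python
import re

# URL shapes copied from the example seeds above.
DOC_TEMPLATE = "https://docs.google.com/document/d/{doc_id}/edit"
SHEET_TEMPLATE = "https://docs.google.com/spreadsheets/u/0/d/{doc_id}/htmlview"

# Assumption: identifiers use only letters, digits, hyphens, and underscores.
ID_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def build_seed_url(doc_id: str, kind: str) -> str:
    """Return a seed URL for a Google Doc ("doc") or Google Sheet ("sheet")."""
    if not ID_PATTERN.fullmatch(doc_id):
        raise ValueError(f"unexpected characters in identifier: {doc_id!r}")
    if kind == "doc":
        url = DOC_TEMPLATE.format(doc_id=doc_id)
    elif kind == "sheet":
        url = SHEET_TEMPLATE.format(doc_id=doc_id)
    else:
        raise ValueError("kind must be 'doc' or 'sheet'")
    # The templates never end in a slash, matching the guidance above.
    return url


if __name__ == "__main__":
    print(build_seed_url("124LjR2jsB8YYbxSHc09NZE1QEKm9iI_f5e3_x4skQc0", "doc"))
    print(build_seed_url("1BhiZ2lDuKVk3RewgTZC_SqdD4XWZxvvCnr6EDC188OQ", "sheet"))
```

The resulting URLs can then be added as One Page or One Page Plus seeds in the usual way.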
How to archive links to Google Docs and Sheets automatically
To automatically archive the Google Docs and Sheets that your seeds link to, add the following scoping rules to those seeds and make sure to run the crawl using Brozzler crawling technology:
- Expand scope to include URL if it contains the text: docs.google.com/document/
- Expand scope to include URL if it contains the text: docs.google.com/spreadsheets/
Note: Capture tools cannot yet navigate the links between Google Drive folders and their contained docs, so Archive-It does not recommend using Google Drive folders as seed URLs.
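To see which links the two "contains the text" rules above would bring into scope, the matching can be approximated as a plain substring check. This is a rough sketch under that assumption, not the crawler's actual implementation; the function name `in_expanded_scope` is hypothetical.

```python
# The two scope-expansion strings from the rules above.
SCOPE_RULES = (
    "docs.google.com/document/",
    "docs.google.com/spreadsheets/",
)


def in_expanded_scope(url: str) -> bool:
    """Return True if the URL contains the text of either rule.

    Assumption: "contains the text" behaves like a simple substring match.
    """
    return any(text in url for text in SCOPE_RULES)


if __name__ == "__main__":
    links = [
        "https://docs.google.com/document/d/124LjR2jsB8YYbxSHc09NZE1QEKm9iI_f5e3_x4skQc0/edit",
        "https://docs.google.com/spreadsheets/u/0/d/1BhiZ2lDuKVk3RewgTZC_SqdD4XWZxvvCnr6EDC188OQ/htmlview",
        "https://example.org/about",
    ]
    for link in links:
        print(link, in_expanded_scope(link))
```

Running a list of a seed page's outbound links through a check like this can help estimate how many additional documents a crawl might pick up before it is started.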
What to expect from archived Google Docs, Sheets, and Drive
Google Docs
When replaying archived Google Docs, some may display error messages in addition to their contents. To remove messages, click or double-click the Help us improve link and/or Reload. At this time there is no other solution to remove error messages.
Google Sheets
When replaying archived Google Sheets, some may display error messages in addition to their contents. Check that your seed URL is formatted with '/htmlview'. At this time there is no other solution to remove error messages.
Google Drive
We do not currently support archiving documents that follow the drive.google.com/file/ URL path. Google Drive seeds that follow this path will generally only show the Drive interface's loading wheel in Wayback.
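The replay caveats in this section can be turned into a simple check to run on seed URLs before crawling. The sketch below is illustrative only: the function name `seed_warnings` and the warning wording are hypothetical, and the checks use plain substring matching rather than anything Archive-It provides.

```python
def seed_warnings(url: str) -> list[str]:
    """Return human-readable warnings for a candidate seed URL."""
    warnings = []
    # Documents under drive.google.com/file/ are not currently supported.
    if "drive.google.com/file/" in url:
        warnings.append("drive.google.com/file/ URLs are not currently supported")
    # Sheets seeds should be formatted with /htmlview.
    if "docs.google.com/spreadsheets/" in url and not url.endswith("/htmlview"):
        warnings.append("Google Sheets seeds should end with /htmlview")
    return warnings


if __name__ == "__main__":
    candidates = [
        "https://docs.google.com/spreadsheets/u/0/d/1BhiZ2lDuKVk3RewgTZC_SqdD4XWZxvvCnr6EDC188OQ/edit",
        "https://docs.google.com/document/d/124LjR2jsB8YYbxSHc09NZE1QEKm9iI_f5e3_x4skQc0/edit",
    ]
    for candidate in candidates:
        print(candidate, seed_warnings(candidate))
```

An empty list means none of the issues described above were detected; it does not guarantee that the archived document will replay without error messages.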