New Research Tool: GovScape (US Gov PDFs)
govscape.net ||| Research Paper (preprint) About #GovScape arxiv.org/abs/2511.11010 #govdocs @eotarchive.org
Our EOT2024 partner @harvardlil.bsky.social was interviewed by @jed.co on @source.coop about archiving government data.
Listen & learn how 300,000+ federal datasets are archived for posterity
Inside Harvard’s Data.gov Archive, A Conversation with Jack Cushman: www.youtube.com/watch?v=XYMb...
GovInfo Day(s) Spring 2025 brought together librarians, archivists & digital stewards to champion long-term access to government info. Hosted by Internet Archive Canada, SFU & UBC. Inspiring sessions on digital preservation & public access.
internetarchivecanada.org/2025/05/07/r...
CC @archive.org
@eotarchive.org is in the middle of harvesting the ERIC database. We’ve currently collected 24,000 PDF files (going on 500,000+... should be done later today). There’s a full-text keyword collection search at web.archive.org where you can choose to search the Eric.ed.gov collection. @archive.org FTW!
SF Chronicle: “Trump’s War on Information Meets a Dedicated Adversary: University #Librarians” www.infodocket.com/2025/05/02/s... #libraries #archives @archive.org @eotarchive.org @envirodgi.bsky.social @datarescueproject.org
We have a busy day today full of rescue and talking. The RDAP leadership is killing it with a data rescue event. And later, we will be presenting with @pegiproject.bsky.social and @eotarchive.org with some awesome Canadians at GovInfo Day! www.eventbrite.com/e/govinfo-da...
BBC Report: "Inside The Desperate Rush to Save Decades of US Scientific #Data From Deletion"
www.bbc.co.uk/future/artic... #webarchives #dataresecue @eotarchive.org @datarescueproject.org @archive.org
University of North Texas is now accepting seed nominations for post-EOT crawls:
digital2.library.unt.edu/nomination/G...
Crawls from this seed list will not be a part of the #EOT2024 #EOTArchive
We are wrapping up seed list nominations for #EOT2024.
Going forward, you can submit government URL nominations using a similar tool: digital2.library.unt.edu/nomination/G...
Crawls of post-EOT seeds won’t be part of the #EOTArchive, but will appear in the @archive.org Wayback Machine
@archive.org is an #EOTArchive team member that has been archiving the public web since 1996.
As with many supporting orgs, they crawl every seeded federal government URL. These crawl efforts contribute to our 700+ terabytes of public WebArchive (WARC) files #EOT2024
Here’s a new user interface for select #EOT2024 WACZ files that were archived in high fidelity!
Great work coming from our archiving team member
@webrecorder.net
Each playback collection is a standalone mirror, e.g., usaid.govarchive.us/about-us/mis...
🗽 Give us your hidden, your overlooked,
Your orphaned Gov URLs yearning to be preserved,
The forgotten databases of your civic shore.
Send these, the neglected, the soon-to-vanish, to us,
We lift our crawler beside the open web. 🗽
#EOTArchive #EOT2024
Screenshot of Webrecorder's public collection gallery of US government sites web archive on Browsertrix
We’re excited to share the first batch of US Government websites that Webrecorder has archived as part of the
@eotarchive.org initiative. They’re now available on our public collections gallery app.browsertrix.com/explore/usgo...
#WebArchiving #Browsertrix #EOTarchive
We've posted on our friends @eotarchive.org. They do great work and are a great option for rescues. www.datarescueproject.org/end-of-term-...
bsky.app/profile/eota...
Answer: They are quadrennial events!
The FIFA World Cup, the Olympics, and the End of Term Web Archive all occur on four-year cycles.
Hint: We begin web archiving months before the election and continue for months after the inauguration. Our partners collaborate, no matter if a new or incumbent President takes office.
Question: What do the End Of Term web archive, Olympics and the FIFA World Cup have in common?
A recent @muckrock.com-hosted discussion on data preservation efforts included our Mark Graham @archive.org, Sarah Cohen @biglocalnews.bsky.social, Jack Cushman @harvardlil.bsky.social, and Lynda Kellam @iassistdata.bsky.social, a member organization of the @datarescueproject.org
Missed our data preservation event yesterday? We've uploaded the entire discussion on our YouTube channel:
Some federal webpages contain interactive elements that are best preserved using high-fidelity web archiving. Our team member @webrecorder.net has written about handling these efforts in a blog by @ilya.webrecorder.net
#EOTArchive #EOT2024
Today @mark.bsky.social spoke with Amy Goodman on @democracynow.org about our quadrennial effort to archive federal websites before, during, and after U.S. presidential administration changes.
Mark Graham, director of Internet Archive's Wayback Machine, will join this conversation ⬇️
Join us on Thursday, Feb. 13 as we host a discussion with organizations that are helping lead the efforts to preserve the public’s data.
The crawls for the current transition are ongoing. However, a portion of them has been made available by EOT team member Internet Archive.
This could be a starting point for your search:
web.archive.org/collection-s...
Whitehouse.gov captures from: 2008 Sept. 15; 2013 Mar. 21; 2017 Feb. 3; and 2021 Feb. 25
Every 4 years, a team of libraries & research organizations works together to preserve material from U.S. government websites during the transition of administrations. 🗳️
Get the latest on the 2024/2025 End of Term Web Archive @eotarchive.org ➡️ blog.archive.org/2025/02/06/u...
Internet Archive Provides “Update on the 2024/2025 End of Term Web Archive” www.infodocket.com/2025/02/06/i... #eot #archives #webarchives @archive.org @eotarchive.org
We’re proud to be collaborating with the @eotarchive.org and EDGI to capture and preserve critical government websites before they’re lost, using our high-fidelity web archiving tools: Browsertrix and ArchiveWeb.page
webrecorder.net/blog/2025-02...
#webarchiving #digitalpreservation #EOTarchive
The White House homepage on the morning of Inauguration Day 2025, with an image of a smiling Joe Biden in the Oval Office.
The White House homepage on the afternoon of Inauguration Day 2025, with an image of Donald Trump pointing.
The US Constitution states that the power of the President changes at noon on the 20th. At noon in DC (3hrs ago) the WH homepage changed as well from
web.archive.org/web/20250105... to www.whitehouse.gov
We'll include both versions in the End Of Term web archive #EOT2024 #EOTArchive
Thank you for your government URL nominations!
Bulk Nomination - 886,998 Nominated URLs
Human Nominations - 9,267
More than 200 Nominators!
Gov URLs for the End Of Term web crawl are still welcome, but send them soon: we cut off nominations in a few days.