‘Internet archiving’ directory

See Also

Gwern

“Design Graveyard”, Gwern 2010

“Research Bounties On Fulltexts”, Gwern 2018

“Internet Search Case Studies”, Gwern 2019

“Internet Search Tips”, Gwern 2018

“Design Of This Website”, Gwern 2010

“Archiving URLs”, Gwern 2011

“The `sort --key` Trick”, Gwern 2014

“Darknet Market Archives (2013–2015)”, Gwern 2013

“Predicting Google Closures”, Gwern 2013

“Easy Cryptographic Timestamping of Files”, Gwern 2015

“Writing a Wikipedia Link Archive Bot”, Gwern 2008

“Archiving GitHub”, Gwern 2011

“Writing a Wikipedia RSS Link Archive Bot”, Gwern 2009

“Resilient Haskell Software”, Gwern 2008

Links

“The Making of Printing Types”, Rougeux 2025

“`cyc-archive`: An Archive of Material Related to the Cyc Project”, Liu 2025

“‘New York and Erie Railroad Organizational Diagram’: A Recreation of One of the First Charts of Its Kind from the 19th Century”, Rougeux 2025

“Archival Storage”, Rosenthal 2025

“ELIZA Reanimated: The World’s First Chatbot Restored on the World’s First Time Sharing System”, Lane et al 2025

“Visualizing All Books of the World in ISBN-Space”
View HTML: /doc/www/phiresky.github.io/84680245f7ced6598124c0590fe3c560a6b4734f.html

“How Do Archivists Package Things? The Battle of the Boxes”
View HTML: /doc/www/peelarchivesblog.com/4aaed5369561e3f8412b4677b0a0ef5ada26f79e.html

“HUGE Google Search Document Leak Reveals Inner Workings of Ranking Algorithm: The Documents Reveal How Google Search Is Using, or Has Used, Clicks, Links, Content, Entities, Chrome Data and More for Ranking.”, Goodwin 2024

“Insights from a Laboratory Fire”, Jones et al 2023

“Introducing A Dark Web Archival Framework”, Brunelle et al 2021

“gscan2pdf: A GUI to Produce PDFs from Scanned Documents”, Ratcliffe 2019

“When Nothing Ever Goes Out of Print: Maintaining Backlist Ebooks”, Elsey 2016

“Memory and the Construction of Scientific Meaning: Michael Faraday’s Use of Notebooks and Records”, Tweney & Ayala 2015

“Scholarly Context Not Found: One in Five Articles Suffers from Reference Rot”, Klein et al 2014

“Perma: Scoping and Addressing the Problem of Link and Reference Rot in Legal Citations”, Zittrain & Albert 2013

“John Bruno Hare Obituary”

“The Prevalence and Inaccessibility of Internet References in the Biomedical Literature at the Time of Publication”, Aronsky et al 2007

“More Product, Less Process: Revamping Traditional Archival Processing”, Greene & Meissner 2005

“How Large Is the World Wide Web?”, Dobra & Fienberg 2004

“The Data Deluge: An E-Science Perspective”, Hey & Trefethen 2003
View PDF: /doc/cs/linkrot/archiving/2003-hey.pdf

“The Little Engines That Could: Modeling the Performance of World Wide Web Search Engines”, Bradlow & Schmittlein 2000

Unforgotten Dreams: Poems by the Zen Monk Shōtetsu, Shōtetsu & Carter 1997

“Space Jam Homepage”
View HTML: /doc/www/www.spacejam.com/49ee87c66adeab7abdfb2cfa7e538b9a0d4ddce9.html

“Faraday’s Notebooks: the Active Organization of Creative Science”, Tweney 1991

“The Other Pínakes and Reference Works of Callimachus”, Witty 1973

“The Pínakes of Callimachus”, Witty 1958

“How Archives Can Make—Or Break—A Philosopher’s Reputation”
View HTML: /doc/www/aeon.co/766da564e49ec69a2508d3163b2a599bc2911663.html

“The Backrooms of the Internet Archive”
View External Link: https://blog.archive.org/2024/06/01/the-backrooms-of-the-internet-archive/

“The Original WWW Proposal Is a Word for Macintosh 4.0 File from 1990, Can We Open It?”
View HTML: /doc/www/blog.jgc.org/5de902c855481694e164eb20360ae0d20ab0b519.html

“The Old Family Photos Project: Lessons in Creating Family Photos That People Want to Keep”, Schindler 2025
View HTML: /doc/www/estherschindler.medium.com/faa2ec65e779050c656cf94c8c63a47c56cce835.html

“SingleFile”, Lormeau 2025

“Century-Scale Storage: If You Had to Store Something for 100 Years, How Would You Do It?”, Neely-Cohen 2025

“A Lunar Library”
View HTML: /doc/www/longnow.org/9c24dfe17baca487a4b25f821ebde9c2433992bb.html

“2024 Guide on Removing DRM from Kobo & Kindle Ebooks”
View HTML: /doc/www/old.reddit.com/83b0da81af7d8ab6e36a8c434a7d1f7644baa158.html

“Internet Archive Hacked, Data Breach Impacts 31 Million Users”

“The Forgotten Pixel Art Masterpieces of the PlayStation 1 Era by Richmond Lee”

“Policymakers Don’t Have Access to Paywalled Articles”
View HTML: /doc/www/www.greaterwrong.com/ddb4f069f6aacf1e86e4f40cc9018c0999360dca.html

“To Preserve Their Work—And Drafts of History—Journalists Take Archiving into Their Own Hands”
View HTML: /doc/www/www.niemanlab.org/67b44a6e7fbc96b17afd7f50996f5bb50ec03fd9.html

Wikipedia (6)

Miscellaneous

Bibliography

https://arxiv.org/abs/2501.06707: “ELIZA Reanimated: The World’s First Chatbot Restored on the World’s First Time Sharing System”, Rupert Lane, Anthony Hay, Arthur Schwarz, David M. Berry, Jeff Shrager

https://github.com/gildas-lormeau/SingleFile/: “SingleFile”, Gildas Lormeau
