
shell directory


“Rare Greek Variables”, Branwen 2021

Variables: “Rare Greek Variables”, Gwern Branwen (2021-04-08):

I scrape Arxiv to find underused Greek variables which can add some diversity to math; the top 10 underused letters are ϰ, ς, υ, ϖ, Υ, Ξ, ι, ϱ, ϑ, & Π. Avoid overused letters like λ, and spice up your next paper with some memorable variables!

Some Greek alphabet variables are just plain overused. It seems like no paper is complete without a bunch of E or μ or α variables splattered across it—and they all mean different things in different papers, and that’s when they don’t mean different things in the same paper! In the spirit of offering constructive criticism, might I suggest that, based on Arxiv frequency of usage, you experiment with more recherché, even, outré variables?

Instead of reaching for that exhausted π, why not use… ϰ (variant kappa)? (It looks like a Hebrew escapee…) Or how about ς (variant sigma), which is calculated to get your reader’s attention by making them go “ςςς” and exclaim “these letters are Greek to me!”

The top 10 least-used Greek variables on Arxiv⁠, rarest to more common:

  1. \varkappa (ϰ)
  2. \varsigma (ς)
  3. \upsilon (υ)
  4. \varpi (ϖ)
  5. \Upsilon (Υ)
  6. \Xi (Ξ)
  7. \iota (ι)
  8. \varrho (ϱ)
  9. \vartheta (ϑ)
  10. \Pi (Π)
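
A frequency ranking like the one above can be approximated by tallying Greek macros in LaTeX text. The following is a minimal Python sketch, assuming the tally runs over LaTeX source; the macro list is abbreviated, and the function names and sample string are hypothetical illustrations, not the scraper actually used:

```python
import re
from collections import Counter

# Abbreviated, illustrative macro list (assumption: a real tally would
# cover every Greek macro, not just this handful).
GREEK_MACROS = {
    "alpha", "lambda", "mu", "pi", "Pi", "Xi", "iota", "upsilon", "Upsilon",
    "varkappa", "varsigma", "varpi", "varrho", "vartheta",
}

# Match a whole macro name, so that \varpi is never miscounted as \pi.
MACRO_PATTERN = re.compile(r"\\([A-Za-z]+)")


def count_greek_macros(latex_source):
    """Count occurrences of each known Greek macro, including zero counts."""
    counts = Counter({name: 0 for name in GREEK_MACROS})
    for name in MACRO_PATTERN.findall(latex_source):
        if name in GREEK_MACROS:
            counts[name] += 1
    return counts


def rarest_first(counts):
    """Return macro names ordered from least to most used (ties by name)."""
    return sorted(counts, key=lambda name: (counts[name], name))


# Hypothetical sample text standing in for a scraped paper.
sample = r"$\lambda x + \lambda y = \mu$, with $\pi$ and $\varpi$ kept distinct."
counts = count_greek_macros(sample)
print(rarest_first(counts)[:5])
```

Matching the full macro name before looking it up is the one design choice that matters here: a substring search for `pi` would inflate the count for π with every ϖ.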

“GPT-2 Preference Learning for Music Generation”, Branwen 2019

GPT-2-preference-learning: “GPT-2 Preference Learning for Music Generation”, Gwern Branwen (2019-12-16):

Experiments with OpenAI’s ‘preference learning’ approach, which trains a NN to predict global quality of datapoints, and then uses reinforcement learning to optimize that directly, rather than proxies. I am unable to improve quality, perhaps due to too-few ratings.

Standard language generation neural network models, like GPT-2, are trained via likelihood training to imitate human text corpuses. Generated text suffers from persistent flaws like repetition, due to myopic generation word-by-word, and cannot improve on the training data because the models are trained to predict ‘realistic’ completions of the training data.

A proposed alternative is to use reinforcement learning to train the NNs, to encourage global properties like coherence & lack of repetition, and potentially improve over the original corpus’s average quality. Preference learning trains a reward function on human ratings, and uses that as the ‘environment’ for a blackbox DRL algorithm like PPO⁠.

OpenAI released a codebase implementing this dual-model preference learning approach for textual generation, based on GPT-2. Having previously used GPT-2 for poetry & music generation⁠, I experimented with GPT-2 preference learning for unconditional music and poetry generation.

I found that preference learning seemed to work better for music than poetry, and seemed to reduce the presence of repetition artifacts, but the results, at n ≈ 7,400 ratings compiled over 23 iterations of training+sampling November 2019–January 2020, are not dramatically better than alternative improvements like scaling up models or more thorough data-cleaning or more stringent sample curation. My blind ratings using n ≈ 200 comparisons showed no large advantage for the RL-tuned samples (winning only 93 of 210 comparisons, or 44%).

This may be due to insufficient ratings, bad hyperparameters, or not using samples generated with common prefixes, but I suspect it’s the first, as some NLP tasks in Ziegler et al 2019 required up to 60k ratings for good performance, and the reward model appeared to achieve poor performance & succumb to adversarial examples easily.

Working with it, I suspect that preference learning is unnecessarily sample-inefficient & data-inefficient, and that the blackbox reinforcement learning approach is inferior to directly using the reward model to optimize text samples, and propose two major architectural overhauls: have the reward model directly model the implied ranking of every datapoint, and drop the agent model entirely in favor of backprop-powered gradient ascent which optimizes sequences to maximize the reward model’s output⁠.

“The Most ‘Abandoned’ Books on GoodReads”, Branwen 2019

GoodReads: “The Most ‘Abandoned’ Books on GoodReads”, Gwern Branwen (2019-12-09):

Which books on GoodReads are most difficult to finish? Estimating proportions in December 2019 gives an entirely different result than absolute counts.

What books are hardest for a reader who starts them to finish, and most likely to be abandoned? I scrape a crowdsourced tag⁠, abandoned, from the GoodReads book social network on 2019-12-09 to estimate conditional probability of being abandoned.

The default GoodReads tag interface presents only raw counts of tags, not counts divided by total ratings ( = reads). This conflates popularity with probability of being abandoned: a popular but rarely-abandoned book may have more abandoned tags than a less popular but often-abandoned book. There is also residual error from the winner’s curse where books with fewer ratings are more mis-estimated than popular books. I fix that to see what more correct rankings look like.

Correcting for both changes the top-5 ranking completely, from (raw counts):

  1. The Casual Vacancy, J. K. Rowling
  2. Catch-22, Joseph Heller
  3. American Gods, Neil Gaiman
  4. A Game of Thrones, George R. R. Martin
  5. The Book Thief, Markus Zusak

to (shrunken posterior proportions):

  1. Black Leopard, Red Wolf, Marlon James
  2. Space Opera⁠, Catherynne M. Valente
  3. Little, Big, John Crowley
  4. The Witches: Salem, 1692⁠, Stacy Schiff
  5. Tender Morsels, Margo Lanagan
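
The gap between the two rankings above can be illustrated with a simple beta-binomial shrinkage estimate. This is a minimal sketch only: the prior parameters, the book data, and the function name are hypothetical, and the essay's actual model may differ.

```python
def shrunken_rate(abandoned, ratings, prior_a=1.0, prior_b=199.0):
    """Posterior mean abandonment rate under a Beta(prior_a, prior_b) prior.

    The default prior (an assumption) has mean 0.5% and acts like 200
    pseudo-ratings, pulling small-sample books toward that mean.
    """
    return (prior_a + abandoned) / (prior_a + prior_b + ratings)


# Hypothetical books: (title, 'abandoned' tag count, total ratings).
books = [
    ("Popular novel", 500, 100_000),
    ("Obscure novel", 5, 10),
    ("Mid-list novel", 60, 2_000),
]

# Raw counts reward popularity; shrunken proportions reward abandonment rate
# while discounting books with too few ratings to trust.
by_raw_count = sorted(books, key=lambda b: b[1], reverse=True)
by_shrunken = sorted(books, key=lambda b: shrunken_rate(b[1], b[2]), reverse=True)

for title, abandoned, ratings in by_shrunken:
    print(title, round(shrunken_rate(abandoned, ratings), 4))
```

With these made-up numbers the popular book tops the raw-count list but falls to the bottom once counts are divided by ratings and shrunk toward the prior.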

I also consider a model adjusting for covariates (author/​average-rating/​year), to see what books are most surprisingly often-abandoned given their pedigrees & rating etc. Abandon rates increase the newer a book is, and the lower the average rating.

Adjusting for those, the top-5 are:

  1. The Casual Vacancy, J. K. Rowling
  2. The Chemist⁠, Stephenie Meyer
  3. Infinite Jest, David Foster Wallace
  4. The Glass Bead Game, Hermann Hesse
  5. Theft by Finding: Diaries (1977–2002), David Sedaris

Books at the top of the adjusted list appear to reflect a mix of highly-popular authors changing genres, and ‘prestige’ books which are highly-rated but a slog to read.

These results are interesting for how they highlight how people read books for many reasons (such as marketing campaigns, literary prestige, or following a popular author), and this is reflected in their decision whether to continue reading or to abandon a book.

“GPT-2 Folk Music”, Branwen & Presser 2019

GPT-2-music: “GPT-2 Folk Music”, Gwern Branwen, Shawn Presser (2019-11-01):

Generating Irish/​folk/​classical music in ABC format using GPT-2-117M, with good results.

In November 2019, I experimented with training a GPT-2 neural net model to generate folk music in the high-level ABC music text format, following previous work in 2016 which used a char-RNN trained on a ‘The Session’ dataset. A GPT-2 hypothetically can improve on an RNN by better global coherence & copying of patterns, without problems with the hidden-state bottleneck.

I encountered problems with the standard GPT-2 model’s encoding of text which damaged results, but after fixing that, I successfully trained it on n = 205,304 ABC music pieces taken from The Session, among other sources. The resulting music samples are, in my opinion, quite pleasant. (A similar model was later retrained by Geerlings & Meroño-Peñuela 2020.)

The ABC folk model & dataset are available for download⁠, and I provide for listening selected music samples as well as medleys of random samples from throughout training.

We followed the ABC folk model with an ABC-MIDI model: a dataset of 453k ABC pieces decompiled from MIDI pieces, which fit into GPT-2-117M with an expanded context window when trained on TPUs. The MIDI pieces are far more diverse and challenging, and GPT-2 underfits and struggles to produce valid samples, but when sampling succeeds, it can generate even better musical samples.

“GPT-2 Neural Network Poetry”, Branwen & Presser 2019

GPT-2: “GPT-2 Neural Network Poetry”, Gwern Branwen, Shawn Presser (2019-03-03):

Demonstration tutorial of retraining OpenAI’s GPT-2 (a text-generating Transformer neural network) on large poetry corpuses to generate high-quality English verse.

In February 2019, following up on my 2015–2016 text-generation experiments with char-RNNs⁠, I experiment with the cutting-edge Transformer NN architecture for language modeling & text generation. Using OpenAI’s GPT-2-117M (117M) model pre-trained on a large Internet corpus and nshepperd’s finetuning code, I retrain GPT-2-117M on a large (117MB) Project Gutenberg poetry corpus. I demonstrate how to train 2 variants: “GPT-2-poetry”, trained on the poems as a continuous stream of text, and “GPT-2-poetry-prefix”, with each line prefixed with the metadata of the PG book it came from. In May 2019, I trained the next-largest GPT-2, GPT-2-345M, similarly, for a further quality boost in generated poems. In October 2019, I retrained GPT-2-117M on a Project Gutenberg corpus with improved formatting, and combined it with a contemporary poem dataset based on Poetry Foundation’s website⁠.

With just a few GPU-days on 1080ti GPUs, GPT-2-117M finetuning can produce high-quality poetry which is more thematically consistent than my char-RNN poems, capable of modeling subtle features like rhyming, and sometimes even a pleasure to read. I list the many possible ways to improve poem generation and further approach human-level poems. For the highest-quality AI poetry to date, see my followup pages, “GPT-3 Creative Writing”⁠/​“GPT-3 Non-Fiction”⁠.

For anime plot summaries, see TWDNE⁠; for generating ABC-formatted folk music, see “GPT-2 Folk Music” & “GPT-2 Preference Learning for Music and Poetry Generation”⁠; for playing chess, see “A Very Unlikely Chess Game”⁠; for the Reddit comment generator, see SubSimulatorGPT-2⁠; for fanfiction, the Ao3⁠; and for video games, the walkthrough model⁠. For OpenAI’s GPT-3 followup, see “GPT-3: Language Models are Few-Shot Learners”⁠.

“This Waifu Does Not Exist”, Branwen 2019

TWDNE: “This Waifu Does Not Exist”, Gwern Branwen (2019-02-19):

I describe how I made the website (TWDNE) for displaying random anime faces generated by StyleGAN neural networks, and how it went viral.

Generating high-quality anime faces has long been a task neural networks struggled with. The invention of StyleGAN in 2018 has effectively solved this task and I have trained a StyleGAN model which can generate high-quality anime faces at 512px resolution. To show off the recent progress, I made a website, “This Waifu Does Not Exist” for displaying random StyleGAN 2 faces. TWDNE displays a different neural-net-generated face & plot summary every 15s. The site was popular and went viral online, especially in China. The model can also be used interactively for exploration & editing in the Artbreeder online service⁠.

TWDNE faces have been used as screensavers, user avatars, character art for game packs or online games, painted watercolors, uploaded to Pixiv, given away in streams, and used in a research paper (Noguchi & Harada 2019). TWDNE results also helped inspire Sizigi Studio’s online interactive waifu GAN, Waifu Labs, which generates even better anime faces than my StyleGAN results.

“Internet Search Tips”, Branwen 2018

Search: “Internet Search Tips”, Gwern Branwen (2018-12-11):

A description of advanced tips and tricks for effective Internet research of papers/​books, with real-world examples.

Over time, I developed a certain google-fu and expertise in finding references, papers, and books online. Some of these tricks are not well-known, like checking the Internet Archive (IA) for books. I try to write down my search workflow, and give general advice about finding and hosting documents, with demonstration case studies⁠.

“The Kelly Coin-Flipping Game: Exact Solutions”, Branwen et al 2017

Coin-flip: “The Kelly Coin-Flipping Game: Exact Solutions”, Gwern Branwen, Arthur B., nshepperd, FeepingCreature, Gurkenglas (2017-01-19):

Decision-theoretic analysis of how to optimally play Haghani & Dewey 2016’s 300-round double-or-nothing coin-flipping game with an edge and ceiling better than using the Kelly Criterion. Computing and following an exact decision tree increases earnings by $6.6 over a modified KC.

Haghani & Dewey 2016 experiment with a double-or-nothing coin-flipping game where the player starts with $30.4 ($25.0 in 2016 dollars) and has an edge of 60%, and can play 300 times, choosing how much to bet each time, winning up to a maximum ceiling of $303.8 ($250.0 in 2016 dollars). Most of their subjects fail to play well, earning an average $110.6 ($91.0 in 2016 dollars), compared to Haghani & Dewey 2016’s heuristic benchmark of ~$291.6 ($240.0 in 2016 dollars) in winnings achievable using a modified Kelly Criterion as their strategy. The KC, however, is not optimal for this problem as it ignores the ceiling and limited number of plays.

We solve the problem of the value of optimal play exactly by using decision trees & dynamic programming for calculating the value function, with implementations in R, Haskell⁠, and C. We also provide a closed-form exact value formula in R & Python, several approximations using Monte Carlo/​random forests⁠/​neural networks, visualizations of the value function, and a Python implementation of the game for the OpenAI Gym collection. We find that optimal play yields $246.61 on average (rather than ~$240), and so the human players actually earned only 36.8% of what was possible, losing $155.6 in potential profit. Comparing decision trees and the Kelly criterion for various horizons (bets left), the relative advantage of the decision tree strategy depends on the horizon: it is highest when the player can make few bets (at b = 23, with a difference of ~$36), and decreases with number of bets as more strategies hit the ceiling.
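
The dynamic-programming idea can be sketched in a few lines of Python. This is a minimal illustration, not the essay's R/Haskell/C implementations: it assumes whole-dollar bets, a 60% win probability per flip, and the 2016-dollar ceiling of $250, and the function name is hypothetical.

```python
def kelly_game_values(rounds, cap=250, p_win=0.6):
    """Value function for the capped coin-flipping game by backward induction.

    Returns a list V where V[w] is the expected final wealth when holding
    w dollars with `rounds` bets remaining, under optimal whole-dollar bets
    (the whole-dollar grid is a simplifying assumption).
    """
    # With no bets left, the value of a wealth level is the wealth itself.
    values = [float(w) for w in range(cap + 1)]
    for _ in range(rounds):
        new_values = []
        for wealth in range(cap + 1):
            best = values[wealth]  # betting nothing is always allowed
            for bet in range(1, wealth + 1):
                win = min(wealth + bet, cap)  # winnings are capped
                expected = p_win * values[win] + (1 - p_win) * values[wealth - bet]
                if expected > best:
                    best = expected
            new_values.append(best)
        values = new_values
    return values


# A short horizon runs instantly; the full 300-round game uses the same
# recurrence with rounds=300 (roughly ten million inner-loop steps).
short_horizon = kelly_game_values(rounds=5)
print(short_horizon[25])
```

Each pass over the wealth grid adds one more remaining bet, so the ceiling and the finite horizon, which the Kelly Criterion ignores, are both built directly into the recurrence.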

In the Kelly game, the maximum winnings, number of rounds, and edge are fixed; we describe a more difficult generalized version in which the 3 parameters are drawn from Pareto, normal, and beta distributions and are unknown to the player (who can use Bayesian inference to try to estimate them during play). Upper and lower bounds are estimated on the value of this game. In the variant of this game where subjects are not told the exact edge of 60%, a Bayesian decision tree approach shows that performance can closely approach that of the decision tree, with a penalty for 1 plausible prior of only $1. Two deep reinforcement learning agents, DQN & DDPG⁠, are implemented but DQN fails to learn and DDPG doesn’t show acceptable performance, indicating better deep RL methods may be required to solve the generalized Kelly game.

“Internet WiFi Improvement”, Branwen 2016

WiFi: “Internet WiFi improvement”, Gwern Branwen (2016-10-20):

After putting up with slow glitchy WiFi Internet for years, I investigate improvements. Upgrading the router, switching to a high-gain antenna, and installing a buried Ethernet cable all offer increasing speeds.

My laptop in my apartment receives Internet via a WiFi repeater to another house, yielding slow speeds and frequent glitches. I replaced the obsolete WiFi router, which increased connection speeds somewhat, but they were still inadequate. For a better solution, I used a directional antenna to connect directly to the new WiFi router, which, contrary to my expectations, yielded a ~6× increase in speed. Extensive benchmarking of all possible arrangements of laptops/dongles/repeaters/antennas/routers/positions shows that the antenna+router combination is inexpensive and near optimal in speed, and that the only possible improvement would be a hardwired Ethernet line, which I installed a few weeks later after learning it was not as difficult as I thought it would be.

“Easy Cryptographic Timestamping of Files”, Branwen 2015

Timestamping: “Easy Cryptographic Timestamping of Files”, Gwern Branwen (2015-12-04):

Scripts for convenient free secure Bitcoin-based dating of large numbers of files/​strings

Local archives are useful for personal purposes, but sometimes, in investigations that may be controversial, you want to be able to prove that the copy you downloaded was not modified and you need to timestamp it and prove the exact file existed on or before a certain date. This can be done by creating a cryptographic hash of the file and then publishing that hash to global chains like centralized digital timestampers or the decentralized Bitcoin blockchain. Current timestamping mechanisms tend to be centralized, manual, cumbersome, or cost too much to use routinely. Centralization can be overcome by timestamping to Bitcoin; costing too much can be overcome by batching up an arbitrary number of hashes and creating just 1 hash/​timestamp covering them all; manual & cumbersome can be overcome by writing programs to handle all of this and incorporating them into one’s workflow. So using an efficient cryptographic timestamping service (the OriginStamp Internet service), we can write programs to automatically & easily timestamp arbitrary files & strings, timestamp every commit to a Git repository, and webpages downloaded for archival purposes. We can implement the same idea offline, without reliance on OriginStamp, but at the cost of additional software dependencies like a Bitcoin client.
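
The hashing and batching steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the batching scheme (hashing the sorted list of per-item hashes) is one simple choice rather than the scheme any particular service uses, and the final step of publishing the digest to a timestamping service or blockchain is left out.

```python
import hashlib
from typing import Iterable


def sha256_hex(data: bytes) -> str:
    """SHA-256 digest of one file's contents or one string, as hex."""
    return hashlib.sha256(data).hexdigest()


def batch_digest(items: Iterable[bytes]) -> str:
    """Collapse many items into a single digest to be timestamped once.

    Assumption: each item is hashed individually, the hex digests are
    sorted so the result does not depend on input order, and the joined
    list is hashed again. Keeping the list of per-item digests lets any
    one item be checked against the batch later.
    """
    digests = sorted(sha256_hex(item) for item in items)
    return sha256_hex("\n".join(digests).encode("ascii"))


# Hypothetical documents standing in for downloaded files.
documents = [b"first archived page", b"second archived page"]
print(batch_digest(documents))
```

Only the one batch digest needs to be published, which is what makes routine timestamping of arbitrary numbers of files cheap.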

“RNN Metadata for Mimicking Author Style”, Branwen 2015

RNN-metadata: “RNN Metadata for Mimicking Author Style”, Gwern Branwen (2015-09-12):

Teaching a text-generating char-RNN to automatically imitate many different authors by labeling the input text by author; additional experiments include imitating Geocities and retraining GPT-2 on a large Project Gutenberg poetry corpus.

Char-RNNs are unsupervised generative models which learn to mimic text sequences. I suggest extending char-RNNs with inline metadata such as genre or author prefixed to each line of input, allowing for better & more efficient metadata, and more controllable sampling of generated output by feeding in desired metadata. A 2015 experiment using torch-rnn on a set of ~30 Project Gutenberg e-books (1 per author) to train a large char-RNN shows that a char-RNN can learn to remember metadata such as authors, learn associated prose styles, and often generate text visibly similar to that of a specified author.
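
The inline-metadata idea amounts to a small preprocessing step before training. A minimal Python sketch follows; the `|` separator, the function name, and the sample text are hypothetical choices for illustration, not necessarily the format used in the experiment.

```python
def add_metadata_prefix(text, label, separator="|"):
    """Prefix every non-empty line of `text` with its metadata label.

    Training on lines like 'AUTHOR|some prose' lets a character-level model
    associate the label with the style that follows it; at sampling time,
    feeding 'AUTHOR|' as the prompt requests text in that style.
    """
    return "\n".join(
        f"{label}{separator}{line}"
        for line in text.splitlines()
        if line.strip()
    )


# Hypothetical two-author corpus.
corpus = "\n".join([
    add_metadata_prefix("It was a dark night.\nThe rain fell.", "AUTHOR_A"),
    add_metadata_prefix("Call me Ishmael.", "AUTHOR_B"),
])
print(corpus)
```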

I further try & fail to train a char-RNN on Geocities HTML for unclear reasons.

More successfully, I experiment in 2019 with a recently-developed alternative to char-RNNs, the Transformer NN architecture, by finetuning OpenAI’s GPT-2-117M Transformer model on a much larger (117MB) Project Gutenberg poetry corpus using both unlabeled lines & lines with inline metadata (the source book). The generated poetry is much better. And GPT-3 is better still.

“The Sort --key Trick”, Branwen 2014

Sort: “The sort --key Trick”, Gwern Branwen (2014-03-03):

Commandline folklore: sorting files by filename or content before compression can save large amounts of space by exposing redundancy to the compressor. Examples and comparisons of different sorts.

Programming folklore notes that one way to get better lossless compression efficiency is by the precompression trick of rearranging files inside the archive to group ‘similar’ files together and expose redundancy to the compressor, in accordance with information-theoretical principles. A particularly easy and broadly-applicable way of doing this, which does not require using any unusual formats or tools and is fully compatible with the default archive methods, is to sort the files by filename and especially file extension.

I show how to do this with the standard Unix command-line sort tool, using the so-called “sort --key trick”, and give examples of the large space-savings possible from my archiving work for personal website mirrors and for making darknet market mirror datasets where the redundancy at the file level is particularly extreme and the sort --key trick shines compared to the naive approach.
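
The effect can be reproduced in miniature with Python's `zlib`, as a toy analogue of sorting a file list before archiving. Everything here is a constructed illustration: the file names and contents are hypothetical, and the file size is chosen so that DEFLATE's 32 KB window can reach the previous similar file only when similar files sit next to each other.

```python
import random
import zlib


def make_files(n_per_type=2, size=20_000, seed=0):
    """Build toy files: every .html file shares one body, every .css another."""
    rng = random.Random(seed)
    html_body = bytes(rng.getrandbits(8) for _ in range(size))
    css_body = bytes(rng.getrandbits(8) for _ in range(size))
    files = {}
    for i in range(n_per_type):
        files[f"{i:03d}.html"] = html_body
        files[f"{i:03d}.css"] = css_body
    return files


def compressed_size(files, order):
    """Size of the concatenated files, in the given order, after zlib level 9."""
    return len(zlib.compress(b"".join(files[name] for name in order), 9))


def extension_key(name):
    """Sort key that groups by extension first, then by the rest of the name."""
    stem, _, extension = name.rpartition(".")
    return (extension, stem)


files = make_files()
by_name = sorted(files)                          # interleaves .css and .html
by_extension = sorted(files, key=extension_key)  # groups same-type files

print("by name:     ", compressed_size(files, by_name))
print("by extension:", compressed_size(files, by_extension))
```

In name order each file's twin lies 40,000 bytes back, beyond the window, so nothing matches; grouped by extension the twin is 20,000 bytes back and the second copy compresses to almost nothing.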

“Darknet Market Archives (2013–2015)”, Branwen 2013

DNM-archives: “Darknet Market Archives (2013–2015)”, Gwern Branwen (2013-12-01):

Mirrors of ~89 Tor-Bitcoin darknet markets & forums 2011–2015, and related material.

Dark Net Markets (DNM) are online markets typically hosted as Tor hidden services providing escrow services between buyers & sellers transacting in Bitcoin or other cryptocoins, usually for drugs or other illegal/​regulated goods; the most famous DNM was Silk Road 1, which pioneered the business model in 2011.

From 2013–2015, I scraped/mirrored on a weekly or daily basis all existing English-language DNMs as part of my research into their usage, lifetimes/characteristics, & legal riskiness; these scrapes covered vendor pages, feedback, images, etc. In addition, I made or obtained copies of as many other datasets & documents related to the DNMs as I could.

This uniquely comprehensive collection is now publicly released as a 50GB (~1.6TB uncompressed) collection covering 89 DNMs & 37+ related forums, representing <4,438 mirrors, and is available for any research.

This page documents the download, contents, interpretation, and technical methods behind the scrapes.

“Alerts Over Time”, Branwen 2013

Google-Alerts: “Alerts Over Time”, Gwern Branwen (2013-07-01):

Does Google Alerts return fewer results each year? A statistical investigation

Has Google Alerts been sending fewer results the past few years? Yes. Responding to rumors of its demise, I investigate the number of results in my personal Google Alerts notifications 2007-2013, and find no overall trend of decline until I look at a transition in mid-2011 where the results fall dramatically. I speculate about the cause and implications for Alerts’s future.

“‘HP: Methods of Rationality’ Review Statistics”, Branwen 2012

hpmor: “‘HP: Methods of Rationality’ review statistics”, Gwern Branwen (2012-11-03):

Recording fan speculation for retrospectives; statistically modeling reviews for ongoing story with R

The unprecedented gap in Methods of Rationality updates prompts musing about whether readership is increasing enough & what statistics one would use; I write code to download reviews, clean it, parse it, load into R, summarize the data & depict it graphically, run linear regression on a subset & all reviews, note the poor fit, develop a quadratic fit instead, and use it to predict future review quantities.

Then, I run a similar analysis on a competing fanfiction to find out when they will have equal total review-counts. A try at logarithmic fits fails; fitting a linear model to the previous 100 days of MoR and the competitor works much better, and they predict a convergence in <5 years.

A survival analysis finds no major anomalies in reviewer lifetimes, but an apparent increase in mortality for reviewers who started reviewing with later chapters, consistent with (but far from proving) the original theory that the later chapters’ delays are having negative effects.

“Treadmill Desk Observations”, Branwen 2012

Treadmill: “Treadmill desk observations”, Gwern Branwen (2012-06-19):

Notes relating to my use of a treadmill desk and 2 self-experiments showing walking treadmill use interferes with typing and memory performance.

It has been claimed that doing spaced repetition review while on a walking treadmill improves memory performance. I did a randomized experiment August 2013 – May 2014 and found that using a treadmill damaged my recall performance.

“A/B Testing Long-form Readability on”, Branwen 2012

AB-testing: “A/B testing long-form readability on”, Gwern Branwen (2012-06-16):

A log of experiments done on the site design, intended to render pages more readable, focusing on the challenge of testing a static site, page width, fonts, plugins, and effects of advertising.

To gain some statistical & web development experience and to improve my readers’ experiences, I have been running a series of CSS A/​B tests since June 2012. As expected, most do not show any meaningful difference.

“Redshift Sleep Experiment”, Branwen 2012

Redshift: “Redshift sleep experiment”, Gwern Branwen (2012-05-09):

Self-experiment on whether screen-tinting software such as Redshift/f.lux affects sleep times and sleep quality; Redshift lets me sleep earlier but doesn’t improve sleep quality.

From 2012–2013, I ran a randomized experiment with a free program (Redshift) which reddens screens at night to avoid tampering with melatonin secretion & sleep, measuring sleep changes with my Zeo. With 533 days of data, the main result is that Redshift causes me to go to sleep half an hour earlier but otherwise does not improve sleep quality.

“Time-lock Encryption”, Branwen 2011

Self-decrypting-files: “Time-lock encryption”, Gwern Branwen (2011-05-24):

How do you encrypt a file such that it can be decrypted after a date, but not before? Use serial computations for proof-of-work using successive squaring, chained hashes, or witness encryption on blockchains.

In cryptography, it is easy to adjust encryption of data so that one, some, or all people can decrypt it, or some combination thereof. It is not so easy to achieve adjustable decryptability over time, a “time-lock crypto”: for some uses (data escrow, leaking, insurance, last-resort Bitcoin backups etc), one wants data which is distributed only after a certain point in time.

I survey techniques for time-lock crypto. Proposals often resort to trusted-third-parties, which are vulnerabilities. A better time-lock crypto proposal replaces trusted-third-parties with forcibly serial proof-of-work using number squaring and guaranteeing unlocking not after a certain point in time but after sufficient computation-time has been spent; it’s unclear how well number-squaring resists optimization or shortcuts. I suggest a new time-lock crypto based on chained hashes; hashes have been heavily attacked for other purposes, and may be safer than number-squaring. Finally, I cover obfuscation & witness-encryption which, combined with proof-of-work, can be said to solve time-lock crypto but currently remain infeasible.
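
The chained-hash idea can be sketched in Python: a key is derived by hashing a seed over and over, and since each step needs the previous output, the work cannot be spread across parallel machines. This is a minimal illustration only; the function name, the choice of SHA-256, and the iteration count are assumptions, and the step of encrypting data under the derived key is omitted.

```python
import hashlib


def hash_chain(seed: bytes, iterations: int) -> bytes:
    """Apply SHA-256 to `seed` `iterations` times, one after another.

    Step k cannot begin until step k-1 has finished, so the elapsed time
    to reach the final digest scales with `iterations` no matter how many
    processors are available.
    """
    digest = seed
    for _ in range(iterations):
        digest = hashlib.sha256(digest).digest()
    return digest


# Hypothetical parameters: a published seed and a small demo iteration
# count; a real lock would use a count sized to the intended delay.
public_seed = b"example seed"
key = hash_chain(public_seed, 100_000)
print(key.hex())
```

Anyone holding the seed can recover the key, but only after performing the full chain of hashes, which is what ties unlocking to computation time rather than to a calendar date.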

“Archiving URLs”, Branwen 2011

Archiving-URLs: “Archiving URLs”, Gwern Branwen (2011-03-10):

Archiving the Web, because nothing lasts forever: statistics, online archive services, extracting URLs automatically from browsers, and creating a daemon to regularly back up URLs to multiple sources.

Links on the Internet last forever or a year, whichever comes first. This is a major problem for anyone serious about writing with good references, as link rot will cripple several percent of all links each year, and the losses compound.

To deal with link rot, I present my multi-pronged archival strategy using a combination of scripts, daemons, and Internet archival services: URLs are regularly dumped from both my web browser’s daily browsing and my website pages into an archival daemon I wrote, which pre-emptively downloads copies locally and attempts to archive them in the Internet Archive. This ensures a copy will be available indefinitely from one of several sources. Link rot is then detected by regular runs of linkchecker, and any newly dead links can be immediately checked for alternative locations, or restored from one of the archive sources.

As an additional flourish, my local archives are efficiently cryptographically timestamped using Bitcoin in case forgery is a concern, and I demonstrate a simple compression trick for substantially reducing sizes of large web archives such as crawls (particularly useful for repeated crawls such as my DNM archives).

“Website Traffic”, Branwen 2011

Traffic: “Website Traffic”, Gwern Branwen (2011-02-03):

Meta page describing editing activity, traffic statistics, and referrer details, primarily sourced from Google Analytics (2011-present).

On a semi-annual basis, since 2011, I review website traffic using Google Analytics; although what most readers value is not what I value, I find it motivating to see total traffic statistics reminding me of readers (writing can be a lonely and abstract endeavour), and useful to see what the major referrers are. The site typically enjoys steady traffic in the 50–100k range per month, with occasional spikes from social media, particularly Hacker News; over the first decade (2010–2020), there were 7.98m pageviews by 3.8m unique users.

“Nootropics”, Branwen 2010

Nootropics: “Nootropics”, Gwern Branwen (2010-01-02)

“In Defense of Inclusionism”, Branwen 2009

In-Defense-Of-Inclusionism: “In Defense of Inclusionism”, Gwern Branwen (2009-01-15):

Iron Law of Bureaucracy: the downwards deletionism spiral discourages contribution and is how Wikipedia will die.

English Wikipedia is in decline. As a long-time editor & former admin, I was deeply dismayed by the process. Here, I discuss UI principles, changes in Wikipedian culture, the large-scale statistical evidence of decline, run small-scale experiments demonstrating the harm, and conclude with parting thoughts.