
Julian Assange/Wikileaks made some headlines in 2010 when they released an “insurance file”, a 1.4GB AES-256-encrypted file available through BitTorrent. It’s generally assumed that copies of the encryption key have been left with Wikileaks supporters who will, in the appropriate contingency like Assange being assassinated, leak the key online to the thousands of downloaders of the insurance file, who will then read and publicize whatever contents are in it (speculated to be additional US documents Manning gave Wikileaks); this way, if the worst happens and Wikileaks cannot afford to keep digesting the files to eventually release them at its leisure in the way it calculates will have the most impact, the files will still be released and someone will be very unhappy.

Of course, this is an all-or-nothing strategy. Wikileaks has no guarantees that the file will not be released prematurely, nor guarantees that it will eventually be released. Any one of those Wikileaks supporters could become disaffected and leak the key at any time - or if there’s only 1 supporter, they might lose the key to a glitch or become disaffected in the opposite direction and refuse to transmit the key to anyone. (Hope Wikileaks kept backups of the key!) If one trusts the person with the key absolutely, that’s fine. But wouldn’t it be nice if one didn’t have to trust another person like that? Cryptography does really well at eliminating the need to trust others, so maybe there’re better schemes.

Now, it’s hard to imagine how some abstract math could observe an assassination and decrypt embarrassing files. Perhaps a different question could be answered - can you design an encryption scheme which requires no trusted parties but can only be broken after a certain date?

Uses

This sort of cryptography would be useful for many things; from “Time-Lock Puzzles in the Random Oracle Model” (Mahmoody et al 2011):

In addition to the basic use of ‘sending messages to the future’, there are many other potential uses of timed-release crypto. Rivest, Shamir and Wagner 1996 suggest, among other uses, delayed digital cash payments, sealed-bid auctions and key escrow. Boneh & Naor 2000 define timed commitments and timed signatures and show that they can be used for fair contract signing, honesty-preserving auctions and more.

Document embargoes (eg. legal or classified documents or confessions or coercible diaries) and receipt-free voting are other applications; a cute albeit completely useless application is “Offline Submission with RSA Time-Lock Puzzles”, Jerschow & Mauve 2010:

Our main contribution is an offline submission protocol which enables an author being currently offline to commit to his document before the deadline by continuously solving an RSA puzzle based on that document. When regaining Internet connectivity, he submits his document along with the puzzle solution which is a proof for the timely completion of the document.

One could also lock away files from oneself: seal one’s diary for a long time, or use it to enforce computer time-outs (change the password and require X minutes to decrypt it). When Ross Ulbricht/Dread Pirate Roberts of Silk Road was arrested in October 2013 and his computer with his Bitcoin hoard seized, I thought of another use: conditional transfers of wealth. Ulbricht could create a time-locked copy of his bitcoins and give it to a friend. Because bitcoins can be transferred, he can at any time use his own unlocked copy to render the copy useless and doesn’t have to worry about the friend stealing all the bitcoins from him, but if he is, say, “shot resisting arrest”, his friend can still - eventually - recover the coins. This specific Bitcoin scenario may be possible with the Bitcoin protocol, in which case one has a nice offline digital cash system, as betterunix suggests:

Here is one possible use case: imagine an offline digital cash system, so i.e. the bank will not accept the same token twice. To protect against an unscrupulous seller, the buyer pays by giving the seller a time-lock puzzle with the tokens; if the goods are not delivered by some deadline, the buyer will deposit the tokens at the bank, thus preventing the seller from doing so. Otherwise the seller solves the puzzle and makes the deposit. This is basically an anonymity-preserving escrow service, though in practice there are probably simpler approaches.

This use generalizes beyond black-markets to all services or entities holding Bitcoins: the service can provide users with a time-locked version of the private keys, and regularly move its funds - if the service ever goes down, the users can retrieve their funds. One not-so-cute use is in defeating antivirus software. “Anti-Emulation Through Time-Lock Puzzles”, Ebringer 2008 outlines it: one starts a program with a small time-lock puzzle which must be solved before the program does anything evil, in the hopes that the antivirus scanner will give up or stop watching before the puzzle has been solved and the program decrypts the evil payload; the puzzle’s math backing means no antivirus software can analyze or solve the puzzle first. The basic functionality cannot be blacklisted as it is used by legitimate cryptography software such as OpenSSL which would be expensive collateral damage.

No trusted third-parties

Note that this bars a lot of the usual suggestions for cryptography schemes. For example, the general approach of key escrow (eg. Bellare & Goldwasser 1996): if you trust some people, you can just adopt a secret-sharing protocol where they XOR together their keys to get the master key for the publicly distributed encrypted file. Or if you only trust some of those people (but are unsure which will try to betray you and either release early or late), you can adopt a threshold scheme where any k of the n people suffice to reconstruct the master key, like Rabin & Thorpe 2006. (And you can connect multiple groups, so each decrypts some necessary keys for the next group; but this gives each group a consecutive veto on release…) Or perhaps something could be devised based on trusted timestamping like Crescenzo et al 1999 or Blake & Chan 2004; but then don’t you need the trusted third party to survive on the network? Or secure multi-party computation (but don’t you need to be on the network, or risk all the parties saying ‘screw it, we’re too impatient, let’s just pool our secrets and decrypt the file now’?). Or you could exploit physics and use the speed of light to communicate with a remote computer on a spacecraft (except now we’re trusting the spacecraft as our third party, hoping no one stole its onboard private key & is able to decrypt our transmissions instantaneously)…
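
For concreteness, the n-of-n XOR secret-sharing mentioned above is only a few lines of code; here is a minimal Python sketch (illustrative only - for the k-of-n threshold variant one would use something like Shamir's secret sharing, and a real deployment would want an audited library):

# n-of-n XOR secret-sharing of a master key: n-1 shares are uniformly random,
# the last share is the key XORed with all of them; all n shares are needed to
# reconstruct, and any smaller subset reveals nothing about the key.
import os

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def split_key(key, n):
    shares = [os.urandom(len(key)) for _ in range(n - 1)]
    last = key
    for s in shares:
        last = xor_bytes(last, s)
    return shares + [last]

def reconstruct(shares):
    key = bytes(len(shares[0]))
    for s in shares:
        key = xor_bytes(key, s)
    return key

master_key = os.urandom(32)          # 256-bit key for the encrypted file
shares = split_key(master_key, 5)    # hand one share to each of 5 trustees
assert reconstruct(shares) == master_key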

One approach is to focus on creating problems which can be solved with a large but precise amount of work, reasoning that if the problem can’t be solved in less than a month, then you can use that as a way to guarantee the file can’t be decrypted within a month’s time. (This would be a proof-of-work system.) This has its own problems1, but it at least delivers what it promises.

Weak keys

One could encrypt the file against information that will be known in the future, like stock prices - except wait, how can you find out what stock prices will be a year from now? You can’t use anything that is public knowledge now because that’d let the file be decrypted immediately, and by definition you don’t have access to information currently unknown but which will be known in the future, and if you generate the information yourself planning to release it, now you have problems - you can’t even trust yourself (what if you are abruptly assassinated like Gerald Bull?) much less your confederates.

The first approach that jumps to mind is to encrypt the file, but with a relatively short or weak key, one which will take years to bruteforce. Instead of a symmetric key of 256 bits, perhaps one of 50 bits or 70 bits?

This fails because we realize that we can guarantee on average how much work it will take to bruteforce the key, but we cannot guarantee how much time it will take. It may take a CPU years to bruteforce a chosen key-length, but take a cluster of CPUs just months. Or worse than that, without invoking clusters or supercomputers - devices can differ dramatically now even in the same computers; to take the example of Bitcoin mining, my laptop’s 2GHz CPU can search for hashes at 4k/sec, or its single outdated GPU can search at 54M/second2. This 1200x speedup is not because the GPU’s clockspeed is 2400GHz or anything like that, it is because the GPU has hundreds of small specialized processors which are able to compute hashes, and the particular application does not require the processors to coordinate or anything like that which might slow them down. Incidentally, this imbalance between CPUs and highly-parallel specialized chips has had the negative effect of centralizing Bitcoin mining power, reducing the security of the network.3 Many scientific applications have moved to clusters of GPUs because they offer such great speedups; as have a number of cryptographic applications such as generating4 rainbow tables. Someone who tries to time-lock a file using a parallelizable form of work renders themselves vulnerable to any attackers like the NSA or botnets with large numbers of computers, but may also render themselves vulnerable to an ordinary computer-gamer with 2 new GPUs: it would not be very useful to have a time-lock which guarantees the file will be locked between a year and a millennium, depending on how many & what kind of people bother to attack it and whether Moore’s law continues to increase the parallel-processing power available.

So anything which is parallel, like using short keys, is probably useless. We need an approach which is inherently serial.

Hashing

Serial

For example, one could take a hash like bcrypt, give it a random input, and hash it for a month. Each hash depends on the previous hash, and there’s no way to skip from the first hash to the trillionth hash. After a month, you use the final hash as the encryption key, and then release the encrypted file and the random input to all the world. The first person who wants to decrypt the file has no choice but to redo the trillion hashes in serial order to reach the same encryption key you used.

Nor can the general public (or the NSA) exploit any parallelism they have available, because each hash depends sensitively on the hash before it - the avalanche effect is a key property of cryptographic hashes.

This is simple to implement: take a seed, hash it, and feed the resulting hash into the algorithm repeatedly until enough time or iterations have passed; then encrypt the file with it as a passphrase. Here is a straightforward shell implementation using SHA-512 & a specified number of iterations:

timelock () {
    local -i I="$1"       # number of hash iterations
    local HASH_SEED="$2"  # public starting seed

    while [ "$I" -gt 0 ]
    do
        HASH_SEED=$(echo "$HASH_SEED" | sha512sum | cut --delimiter=' ' --fields=1)
        I=$((I - 1))
    done

    echo "$HASH_SEED"
}

timelock_encrypt () {
    local KEY=$(timelock "$1" "$2")
    local PLAINTEXT_FILE="$3"
    echo "$KEY" | gpg --passphrase-fd 0 --symmetric --output "$PLAINTEXT_FILE.gpg" "$PLAINTEXT_FILE"
}
# same as timelock_encrypt except the GPG command changes to do decryption
timelock_decrypt () {
    local KEY=$(timelock "$1" "$2")
    local ENCRYPTED_FILE="$3"
    echo "$KEY" | gpg --passphrase-fd 0 --decrypt "$ENCRYPTED_FILE"
}

# Example usage:
#
# $ rm foo.*
# $ echo "this is a test" > foo.txt
# $ timelock_encrypt 100 random_seed foo.txt
# Reading passphrase from file descriptor 0
# $ rm foo.txt
# $ timelock_decrypt 100 random_seed foo.txt.gpg
# Reading passphrase from file descriptor 0
# gpg: CAST5 encrypted data
# gpg: encrypted with 1 passphrase
# this is a test

Another simple implementation of serial hashing (SHA-256) for time-locking has been written in Python.
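
For comparison with the shell version, the same serial construction is a few lines of Python; this is just a sketch (not the linked implementation), and the iteration count would have to be calibrated against the intended hardware and delay:

# Serial time-lock keying: iterate SHA-256 from a published seed; the final
# digest becomes the symmetric key. Anyone can recompute it, but only by
# performing every iteration in order.
import hashlib

def timelock_key(seed, iterations):
    h = seed
    for _ in range(iterations):
        h = hashlib.sha256(h).digest()
    return h

key = timelock_key(b"public random seed", 10_000_000)  # tune the count to the target delay
print(key.hex())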

Chained hashes

On the other hand, there seems to be a way that the original person running this algorithm can run it in parallel: one generates n random inputs (for n CPUs, presumably), and sets them hashing as before for say a month. Then, one sets up a chain between the n results - the final hash of seed 1 is used to encrypt seed 2, the final hash of which is used to encrypt seed 3, and so on. Then one releases the encrypted file, the first seed, and the n−1 encrypted seeds. Now the public has to hash the first seed for a month, and only then can it decrypt the second seed, and start hashing that for a month, and so on. Karl Gluck suggests that, like repeated squaring, it’s even possible to add error-detection to the procedure5. (A somewhat similar scheme, “A Guided Tour Puzzle for Denial of Service Prevention”, uses network latency rather than hash outputs as the chained data - clients bounce from resource to resource - but this obviously requires an online server and is unsuitable for our purposes, among other problems.)

This is pretty clever. If one has a thousand CPUs handy, one can store up 3 years’ worth of computation-resistance in just a day. This satisfies a number of needs. (Maybe; please do not try to use this for a real application before getting a proof from a cryptographer that chained hashing is secure.) Implementing chained hashing is not as easy as single-threaded hashing, but there are currently 2 implementations:

  1. Peter Todd & Amir Taaki: implementation in Python (announcement with bounties on example time-locked files; Reddit & HN discussion), with some clever Bitcoin integration
  2. Dorian Johnson, implementation in Go
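
To make the chaining concrete, here is a toy Python sketch (not either of the implementations above; it encrypts each successive seed with a 32-byte one-time pad purely for simplicity):

# Toy sketch of parallel setup / serial solving with chained hash seeds.
# Each segment's final hash is used as a one-time pad over the next segment's
# seed, so the public must solve the segments strictly in order.
import hashlib, os

def iterate(seed, n):
    h = seed
    for _ in range(n):
        h = hashlib.sha256(h).digest()
    return h

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

SEGMENTS, ITERS = 4, 100_000          # e.g. 4 CPUs, each doing one segment

# Setup (parallelizable): hash each random 32-byte seed independently...
seeds = [os.urandom(32) for _ in range(SEGMENTS)]
finals = [iterate(s, ITERS) for s in seeds]   # these can run on separate CPUs

# ...then chain them: encrypt seed i+1 under the final hash of segment i.
encrypted_seeds = [xor(seeds[i + 1], finals[i]) for i in range(SEGMENTS - 1)]
key = finals[-1]                      # encrypt the actual file with this

# Published: seeds[0], encrypted_seeds, and the encrypted file.

# Solving (inherently serial): each seed is recovered only after finishing
# the previous segment's hashing.
h = iterate(seeds[0], ITERS)
for blob in encrypted_seeds:
    h = iterate(xor(blob, h), ITERS)
assert h == key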

But what about people who only have a normal computer? Fundamentally, this repeated hashing requires you to put in as much computation as you want your public to expend reproducing the computation, which is not enough. We want to force the public to expend more computation - potentially much more - than we put in. How can we do this?

It’s hard to see. At least, I haven’t thought of anything clever. Homomorphic encryption promises to let us encode arbitrary computations into an encrypted file, so one could imagine implementing the above hash chains inside the homomorphic computation, or perhaps just encoding a loop counting up to a large number. There are two problems with trying to apply homomorphic encryption:

  1. I don’t know how one would let the public decrypt the result of the homomorphic encryption without also letting them tamper with the loop.

    Suppose one specifies that the final output is encrypted to a weak key, so one simply has to run the homomorphic system to completion and then put in a little effort to break the homomorphic encryption to get the key which unlocks a file; what stops someone from breaking the homomorphic encryption at the very start, manually examining the running program, and shortcutting to the end result? Of course, one could postulate that what was under the homomorphic encryption was something like a hash-chain where predicting the result is impossible - but then why bother with the homomorphic encryption? Just use that instead!
  2. and in any case, homomorphic encryption as of 2013 is a net computational loss: it takes as much or more time to create such an encrypted program as it would take to simply run the program in the clear, so nothing is gained.

Vulnerability of one-way functions

As it turns out, “Time-Lock Puzzles in the Random Oracle Model” (Mahmoody, Moran, and Vadhan 2011; slides) directly & formally analyzes the general power of one-way functions used for time-lock puzzles assuming a random oracle. Unfortunately, they find an opponent can exploit the oracle to gain speedups. Fortunately, the cruder scheme where one ‘stores up’ computation (repeatedly asking the oracle at inputs based on its previous output) still works under their assumptions:

A time-lock puzzle with a linear gap in parallel time. Although our negative results rule out ‘strong’ time-lock puzzles, they still leave open the possibility for a weaker version: one that can be generated with n parallel queries to the oracle but requires n rounds of adaptive queries to solve. In a positive result, we show that such a puzzle can indeed be constructed…Although this work rules out black-box constructions (with a super-constant gap) from one-way permutations and collision-resistant hash functions, we have no reason to believe that time-lock puzzles based on other concrete problems (e.g., lattice-based problems) do not exist. Extending our approach to other general assumptions (e.g., trapdoor permutations) is also an interesting open problem.

That is, the puzzle constructor can construct the puzzle in parallel, and the solver has to solve it serially.

Successive squaring

At this point, let’s see what the crypto experts have to say. Googling to see what the existing literature was (after I’d thought of the above schemes), I found that the relevant term is “time-lock puzzles” (from analogy with the bank vault time lock). In particular, Rivest/Shamir/Wagner have published a 1996 paper on the topic, “Time-lock puzzles and timed-release crypto”. Apparently the question was first raised by Timothy C. May on the Cypherpunks mailing list in an email from 10 February 1993. May also discusses the topic briefly in his Cyphernomicon, ch14.5; unfortunately, May’s solution (14.5.1) is essentially to punt to the legal system and rely on legal privilege and economic incentives to keep keys private.

Rivest et al agree with us that

There are 2 natural approaches to implementing timed-release crypto:

  • Use ‘time-lock puzzles’ - computational problems that can not be solved without running a computer continuously for at least a certain amount of time.
  • Use trusted agents who promise not to reveal certain information until a specified date.

And that for time-lock puzzles:

Our goal is thus to design time-lock puzzles that, to the great extent possible, are ‘intrinsically sequential’ in nature, and can not be solved substantially faster with large investments in hardware. In particular, we want our puzzles to have the property that putting computers to work together in parallel doesn’t speed up finding the solution. (Solving the puzzle should be like having a baby: two women can’t have a baby in 4.5 months.)

Rivest et al then point out that the most obvious approach - encrypt the file to a random short key, short enough that brute-forcing takes only a few months/years as opposed to eons - is flawed because brute-forcing a key is very parallelizable and amenable to special hardware6. (And as well, the randomness of searching a key space means that the key might be found very early or very late; any estimate of how long it will take to brute force is just a guess.) One cute application of the same brute-forcing idea is Merkle’s Puzzles, where the time-lock puzzle is used to hide a key for a second party to communicate with the first-party creator, but it has the same drawback: it has the creator make many time-lock puzzles (any of which could be used by the second party) and raises the cost to the attacker (who might have to crack each puzzle), but can be defeated by a feasibly wealthy attacker, and offers only probabilistic guarantees (what if the attacker cracks the same puzzle the second party happens to choose?).

Rivest et al propose a scheme in which one encrypts the file with a very strong key as usual, but then one encrypts the key in such a way that one must calculate encryptedKey^(2^t) mod n, where t is the adjustable difficulty factor. With the original numbers used to construct the modulus n (its secret prime factors), one can reduce the exponent modulo φ(n) and so easily avoid doing the successive squarings. This has the nice property that the puzzle constructor invests only O(log(n)) computing power, but the solver has to spend O(n^2) computing power. (This scheme works in the random oracle model, but Barak & Mahmoody-Ghidary 2009 proves that is the best you can do.)
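
A toy Python sketch of the construction, with absurdly small numbers purely to show the asymmetry between the trapdoor-equipped constructor and the trapdoor-less solver (real parameters are thousands of bits, and the file key is then blinded with the result, e.g. key + b mod n as in the paper):

# Toy sketch of the successive-squaring puzzle with tiny insecure numbers.
p, q = 61, 53                  # secret primes: the constructor's trapdoor
n = p * q                      # public modulus
phi = (p - 1) * (q - 1)
t = 10**6                      # difficulty: number of sequential squarings
a = 2                          # public starting value, coprime to n

# Constructor: using the trapdoor, reduce the exponent 2^t modulo phi(n) and
# do a single modular exponentiation - only O(log t + log n) multiplications.
b_fast = pow(a, pow(2, t, phi), n)

# Solver: without the trapdoor, all t squarings must be done one after another.
b_slow = a % n
for _ in range(t):
    b_slow = (b_slow * b_slow) % n

assert b_fast == b_slow        # both arrive at a^(2^t) mod n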

Rivest has actually used this scheme for a time capsule commemorating the MIT Computer Science and Artificial Intelligence Laboratory; he expects his puzzle to take ~35 years. As of 14 years later, minimal progress seems to have been made; if anything, the breakdown of CPU clockspeeds seems to imply that it will take far more than 35 years.7 Rivest offers some advice for anyone attempting to unlock this time-lock puzzle (which may or may not be related to Mao’s 2000 paper “Time-Lock Puzzle with Examinable Evidence of Unlocking Time”):

An interesting question is how to protect such a computation from errors. If you have an error in year 3 that goes undetected, you may waste the next 32 years of computing. Adi Shamir has proposed a slick means of checking your computation as you go, as follows. Pick a small (50-bit) prime c, and perform the computation modulo c·n rather than just modulo n. You can check the result modulo c whenever you like; this should be an extremely effective check on the computation modulo n as well.
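
A toy Python sketch of Shamir's check, with tiny illustrative numbers; the point is only that the running value modulo the small prime c can be verified cheaply at any step (via Fermat's little theorem) while the real computation proceeds modulo c·n:

# Toy sketch of Shamir's error-check: square modulo c*n for a small prime c,
# and periodically compare the running value modulo c against an expected
# value that is cheap to compute directly.
n = 3233            # toy modulus (61*53); a real n would be thousands of bits
c = 1000003         # small checking prime (Shamir suggests ~50 bits)
a = 2               # starting value
t = 10**6           # total number of squarings

x = a % (c * n)
for i in range(1, t + 1):
    x = (x * x) % (c * n)
    if i % 100_000 == 0:
        # a^(2^i) mod c is cheap: reduce the exponent 2^i modulo c-1 first.
        expected = pow(a, pow(2, i, c - 1), c)
        assert x % c == expected, "error detected at squaring %d" % i

answer = x % n      # the desired a^(2^t) mod n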

Constant factors

How well does this work? The complexity seems correct, but I worry about the constant factors. Back in 1996, computers were fairly homogeneous, and Rivest et al could reasonably write

We know of no obvious way to parallelize it to any large degree. (A small amount of parallelization may be possible within each squaring.) The degree of variation in how long it might take to solve the puzzle depends on the variation in the speed of single computers and not on one’s total budget. Since the speed of hardware available to individual consumers is within a small constant factor of what is available to large intelligence organizations, the difference in time to solution is reasonably controllable.

But I wonder how true this is. Successive squaring does not seem to be a very complex algorithm to implement in hardware. There are more exotic technologies than GPUs we might worry about, like field-programmable gate arrays which may be specialized for successive squaring; if problems like the n-body problem can be handled with custom chips, why not multiplication? Or, since squaring seems simple, is it relevant to forecasts of serial speed that there are graphene transistor prototypes going as high as 100GHz? Offhand, I don’t know of any compelling argument to the effect that there are no large constant-factor speedups possible for multiplication/successive-squaring. Indeed, the general approach of exponentiation and factoring has to worry about the fact that the complexity of factoring has never been proven (and could still be very fast) and that there are speedups with quantum techniques like Shor’s algorithm.

Memory-bound hashes

Of course, one could ask the same question of my original proposal - what makes you think that hashing can’t be sped up? You already supplied an example where cryptographic hashes were sped up astonishingly by a GPU, Bitcoin mining.

The difference is that hashing can be made to stress the weakest part of any modern computer system, the memory hierarchy’s terrible bandwidth and latency8; the hash can blow out the fast on-die CPU caches and force constant fetches from main RAM. Such memory-bound functions were devised for anti-spam proof-of-work systems that wouldn’t unfairly penalize cellphones & PDAs while still being costly on desktops & workstations (which rules out the usual functions like Hashcash that stress the CPU). For example, the 2003 paper “On Memory-Bound Functions for Fighting Spam”; from the abstract:

Burrows suggested that, since memory access speeds vary across machines much less than do CPU speeds, memory-bound functions may behave more equitably than CPU-bound functions; this approach was first explored by Abadi, Burrows, Manasse, and Wobber [8]. We further investigate this intriguing proposal. Specifically, we…

2. Provide an abstract function and prove an asymptotically tight amortized lower bound on the number of memory accesses required to compute an acceptable proof of effort; specifically, we prove that, on average, the sender of a message must perform many unrelated accesses to memory, while the receiver, in order to verify the work, has to perform significantly fewer accesses; 3. Propose a concrete instantiation of our abstract function, inspired by the RC4 stream cipher; 4. Describe techniques to permit the receiver to verify the computation with no memory accesses; 5. Give experimental results showing that our concrete memory-bound function is only about four times slower on a 233 MHz settop box than on a 3.06 GHz workstation, and that speedup of the function is limited even if an adversary knows the access sequence and uses optimal off-line cache replacement.

Abadi 2005, “Moderately hard, memory-bound functions”, develops more memory-bound functions and benchmarks them (partially replicated by Das & Doshi 2004):

…we give experimental results for five modern machines that were bought within a two-year period in 2000-2002, and which cover a range of performance characteristics. All of these machines are sometimes used to send e-mail - even the settop box, which is employed as a quiet machine in a home…None of the machines have huge caches - the largest was on the server machine, which has a 512KB cache. Although the clock speeds of the machines vary by a factor of 12, the memory read times vary by a factor of only 4.2. This measurement confirms our premise that memory read latencies vary much less than CPU speeds.

…At the high end, the server has lower performance than one might expect, because of a complex pipeline that penalizes branching code. In general, higher clock speeds correlate with higher performance, but the correlation is far from perfect…Second, the desktop machine is the most cost-effective one for both CPU-bound and memory-bound computations; in both cases, attackers are best served by buying the same type of machines as ordinary users. Finally, the memory-bound functions succeed in maintaining a performance ratio between the slowest and fastest machines that is not much greater than the ratio of memory read times.

Colin Percival continues the general trend in the context of finding password schemes which are resistant to cheap brute-forcing, inventing scrypt in the 2009 paper “Stronger Key Derivation via Sequential Memory-Hard Functions”. Percival notes that designing a really good memory-bound function requires not relying too heavily on latency, since his proofs do not incorporate latency, although in practice this might not be so bad:

Existing widely used hash functions produce outputs of up to 512 bits (64 bytes), closely matching the cache line sizes of modern CPUs (typically 32-128 bytes), and the computing time required to hash even a very small amount of data (typically 200-2000 clock cycles on modern CPUs, depending on the hash used) is sufficient that the memory latency cost (typically 100-500 clock cycles) does not dominate the running time of ROMix.

However, as semiconductor technology advances, it is likely that neither of these facts will remain true. Memory latencies, measured in comparison to CPU performance or memory bandwidth, have been steadily increasing for decades, and there is no reason to expect that this will cease — to the contrary, switching delays impose a lower bound of Ω(log N ) on the latency of accessing a word in an N-byte RAM, while the speed of light imposes a lower bound of Ω( √N ) for 2-dimensional circuits. Furthermore, since most applications exhibit significant locality of reference, it is reasonable to expect cache designers to continue to increase cache line sizes in an attempt to trade memory bandwidth for (avoided) memory latency.

In order to avoid having ROMix become latency-limited in the future, it is necessary to apply it to larger hash functions. While we have only proved that ROMix is sequential memory-hard under the Random Oracle model, by considering the structure of the proof we note that the full strength of this model does not appear to be necessary.

Percival constructs a password algorithm based on his new hash function and then calculates costs using 2002 circuit prices:

When used for interactive logins, it is 35 times more expensive than bcrypt and 260 times more expensive than PBKDF2; and when used for file encryption — where, unlike bcrypt and PBKDF2, scrypt uses not only more CPU time but also increases the die area required — scrypt increases its lead to a factor of 4000 over bcrypt and 20000 over PBKDF2.

That is quite a difference between the hashes, especially considering that bcrypt and PBKDF2 were already engineered to have adjustable difficulty for similar reasons to our time-lock crypto puzzles.
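
To make the memory-bound idea concrete, here is a toy Python sketch in the spirit of (but much cruder than) Percival's ROMix: fill a table too large for any CPU cache, then make data-dependent pseudo-random reads into it, so that memory latency rather than clock speed dominates the running time:

# Toy memory-hard derivation: build a table far larger than any CPU cache,
# then read from it at unpredictable, hash-dependent indices.
import hashlib

def H(x):
    return hashlib.sha256(x).digest()

def memory_hard_key(seed, table_size, iterations):
    # Phase 1: fill the table (roughly table_size * 32 bytes of hashed data,
    # more with Python object overhead) from the public seed.
    table = []
    x = H(seed)
    for _ in range(table_size):
        table.append(x)
        x = H(x)
    # Phase 2: reads at indices derived from the running hash, so the access
    # pattern is unpredictable and cannot be prefetched or cached away.
    for _ in range(iterations):
        j = int.from_bytes(x[:8], "big") % table_size
        x = H(x + table[j])
    return x

key = memory_hard_key(b"public seed", table_size=1 << 20, iterations=1 << 20)
print(key.hex())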


  1. Ebringer 2008, applying time-lock puzzles to enhancing the ability of computer viruses & trojans to defeat anti-virus scanners, describes Rivest’s original successive-squaring solution somewhat sarcastically:

    Even in the original paper, the authors struggled to find a plausible use for it. To actually use the construction as a “time-lock” requires predicting the speed of CPUs in the future, resulting, at best, in a fuzzy release-date. This assumes that someone cares enough to want what is allegedly wrapped up in the puzzle to bother to compute the puzzle in the first place. It is not obvious that in the majority of situations, this would have a clear advantage over, say, leaving the information with a legal firm with instructions to release it on a particular date. Although this paper proposes a practical use for time-lock puzzles, the original authors would probably be dismayed that there is still not a widespread usage that appears to be of net benefit to humanity.

    On the other hand, a similar criticism could and has been made about Bitcoin (supporters/users must expend massive computing power constantly just to keep it working, with no computational advantage over attackers), and that system has worked well in practice.

  2. Actual numbers; the difference really is that large.

  3. While there are tens or hundreds of thousands of nodes in the Bitcoin P2P network, only a few of them are actual miners because CPU mining has become useless - the big miners, who have large server farms of GPUs or ASICs, collectively control much of the hash power. This has not yet been a problem, but it may become one. Using a (partially) memory-bound hash function is one of the selling points of a competing cryptocurrency, Litecoin.

  4. eg. the 2008 Graves thesis, “High performance password cracking by implementing rainbow tables on nVidia graphics cards (IseCrack)” claims a 100x speedup over CPU generation of rainbow tables, or the actively developed utility, RainbowCrack (which you can even buy the generated rainbow tables from).

  5. On Hacker News

    To add checkpoints, one could release both the original seed of the chain A, and a number of pairs of hashes (x_0,y_0), (x_1,y_1), … Let’s say you wanted to do 1-month chains. Hash the seed A for a week, then take the current value x_0 such that H(B) = x_0. You know the value of B, since you’ve been computing the chain. Pick another random value y_0, and continue the chain with H(B || y_0). Write (x_0,y_0) in the output, and hash for another week. Do the same for (x_1,y_1), (x_2,y_2) and (x_3,y_3). Each chain then has a seed value and 4 pairs of ‘checkpoints’. When unlocking the crypto puzzle, these checkpoints can’t be used to jump ahead in the computation, but they can tell you that you’re on the right track. I think that you could even use a secondary hash chain for the y_n values, so y_(n+1) = H(y_n). If you also derived y_0 from A (e.g. y_0 = H(A || const)), you would just need to publish the seed value A and each checkpoint hash x_n in order to have a fully checkpointed crypto puzzle.

  6. Colin Percival’s “Insecurity in the Jungle (disk)” presents a table giving times for brute-forcing MD5 hashes given various hardware; most dramatically, <$1m of custom ASIC hardware could bruteforce a random 10 character string in 2 hours. (Hardware reaps extreme performance gains mostly when few memory accesses are required, and a few fast operations are applied to small amounts of data; this is because flexibility imposes overhead, and when the overhead is incurred just to run fast instructions, the overhead dominates the entire operation. For example, graphics chips do just a relative handful of math to a frame, again and again, and so they gain orders of magnitude speedups by being specialized chips - as does any other program which is like that, which includes cryptographic hashes designed for speed like the ones Bitcoin uses.)

    Wikipedia gives an older example using FPGAs (also being used for Bitcoin hashing):

    An important consideration to be made is that CPU-bound hash functions are still vulnerable to hardware implementations. For example, the literature provides efficient hardware implementations of SHA-1 in as low as 5000 gates, and able to produce a result in less than 400 clock cycles. Since multi-million gate FPGAs can be purchased at less than $100 price points, it follows that an attacker can build a fully unrolled hardware cracker for about $5000. Such a design, if clocked at 100MHz, can try about 300,000 keys/second for the algorithm proposed above.

  7. As of 2013, the highest-frequency consumer CPU available, the AMD FX-9000, tops out at 5GHz with few prospects for substantial increases in frequency; Rivest projected 10GHz and increasing:

    Based on the SEMATECH National Technology Roadmap for Semiconductors (1997 edition), we can expect internal chip speeds to increase by a factor of approximately 13 overall up to 2012, when the clock rates reach about 10GHz. After that improvements seem more difficult, but we estimate that another factor of five might be achievable by 2034. Thus, the overall rate of computation should go through approximately six doublings by 2034.

    Moore’s law, in the original formulation of transistors per dollar, may have continued post-1997, but the gains increasingly came by parallelism, which is not useful for repeated squaring.

  8. From Abadi 2005:

    Fast CPUs run much faster than slow CPUs - consider a 2.5GHz PC versus a 33MHz Palm PDA. Moreover, in addition to high clock rates, higher-end computer systems also have sophisticated pipelines and other advantageous features. If a computation takes a few seconds on a new PC, it may take a minute on an old PC, and several minutes on a PDA. That seems unfortunate for users of old PCs, and probably unacceptable for users of PDAs…we are concerned with finding moderately hard functions that most computer systems will evaluate at about the same speed. We envision that high-end systems might evaluate these functions somewhat faster than low-end systems, perhaps even 2-10 times faster (but not 10-100 faster, as CPU disparities might imply). Moreover, the best achievable price-performance should not be significantly better than that of a typical legitimate client…A memory-bound function is one whose computation time is dominated by the time spent accessing memory. The ratios of memory latencies of machines built in the last five years is typically no greater than two, and almost always less than four. (Memory throughput tends to be less uniform, so we focus on latency.) A memory-bound function should access locations in a large region of memory in an unpredictable way, in such a way that caches are ineffectual…Other possible applications include establishing shared secrets over insecure channels and the timed release of information, using memory-bound variants of Merkle puzzles [Merkle 1978] and of time-lock puzzles [May 1993; Rivest et al. 1996], respectively. We discuss these also in section 4.

    And here I thought I was being original in suggesting memory-bound functions for time-lock puzzles! Truly, “there is nothing new under the sun”.