Tuesday, February 12, 2013

The Birthday Attack Myth


[EDIT: Rewritten because the first version sucked.]

In cryptographic circles, it's often said that a hash is at best as strong as the square root of the size of its output space. In other words, assume that we have a hash consisting of 2N bits. Then the number of possible hash values is at most H = 2^(2N). The difficulty of "cracking the hash" -- that is, finding any 2 different messages with the same hash, known as a "hash collision" -- is then no harder than O(2^N). This is the fundamental theory behind a birthday attack. (Such a collision might allow someone to "prove" that bogus credentials were legitimate, for instance.)
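To make the O(2^N) operation count concrete, here is a minimal sketch (my own illustration, not part of the original argument) of a birthday attack against a deliberately weakened hash: only the first 32 bits of SHA-256 are kept, so a collision typically appears after on the order of 2^16 random messages.

```python
# Birthday attack against a deliberately truncated hash. Only the first
# n_bytes of SHA-256 are kept, so a collision is expected after roughly
# 2^(4*n_bytes) random messages. Function names are my own.
import hashlib
import os

def truncated_hash(message: bytes, n_bytes: int = 4) -> bytes:
    """Return the first n_bytes of SHA-256(message) -- a toy small hash."""
    return hashlib.sha256(message).digest()[:n_bytes]

def find_collision(n_bytes: int = 4):
    """Hash random messages until two different ones share a truncated hash."""
    seen = {}  # truncated hash -> message that produced it
    attempts = 0
    while True:
        msg = os.urandom(16)
        h = truncated_hash(msg, n_bytes)
        attempts += 1
        if h in seen and seen[h] != msg:
            return seen[h], msg, attempts
        seen[h] = msg

if __name__ == "__main__":
    m1, m2, attempts = find_collision()
    print(f"collision after {attempts} messages")
    print(m1.hex(), m2.hex(), "->", truncated_hash(m1).hex())
```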

The birthday attack myth is that this complexity estimate applies to the execution time (in processor-seconds), and not only the operation count. It does not, for the reasons that follow.

The first problem is one of practicality: the attack only bounds the cost of producing 2 messages with equal hashes -- not the cost of producing a message with one specific hash, which would be O(H), assuming an optimally hard hash. So it is not a useful tool for modifying an existing authenticated message while keeping its hash unchanged. Granted, if on the order of 2^N messages are available in the wild as targets for malicious substitution, then generating an additional 2^N fake messages is moderately likely to produce one whose hash matches one of the originals, the exact probability being given in the article linked above.
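For a rough sense of "moderately likely", the standard birthday-style approximation p ≈ 1 - exp(-mn/H) gives the chance that at least one of m forgeries matches the hash of one of n existing targets. The snippet below (again my own illustration) evaluates it for the case just described.

```python
# Back-of-the-envelope estimate of the chance that at least one of m forged
# messages shares a hash with one of n existing target messages, for a hash
# with hash_bits = 2N bits and H = 2^(2N) possible values. Uses the
# standard approximation p ~= 1 - exp(-m*n / H).
import math

def cross_collision_probability(m: int, n: int, hash_bits: int) -> float:
    ratio = (m * n) / (2.0 ** hash_bits)
    return 1.0 - math.exp(-ratio)

# 2^N forgeries against 2^N targets under a 2N-bit hash gives m*n/H = 1,
# i.e. a collision probability of about 1 - 1/e ~= 0.63.
print(cross_collision_probability(2**64, 2**64, 128))
```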

The second problem goes to the heart of the complexity question: interprocessor communication. There are 2 extremes to consider, with other configurations lying somewhere in between:

At the serial extreme, assume we have a single processor. The fastest way to solve the problem is probably to sort the list of (2^N) hashes (in O(2^N) time, with a radix sort), then scan through the sorted hashes looking for a value that occurs twice, again in O(2^N) time.
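A toy version of that serial approach might look like the sketch below (mine, not the post's). It uses Python's built-in comparison sort for brevity, which costs O(N*2^N) rather than the O(2^N) of a radix sort, but the duplicate scan is the same.

```python
# Sort-and-scan collision detection: order the hashes, then check adjacent
# entries for a repeated value. Sorting here is Python's comparison sort;
# a radix sort over the hash bytes would replace it in the O(2^N) version.
from typing import Optional, Sequence, Tuple

def find_duplicate(hashes: Sequence[bytes]) -> Optional[Tuple[int, int]]:
    """Return the indices of two messages whose hashes are equal, if any."""
    order = sorted(range(len(hashes)), key=lambda i: hashes[i])
    for a, b in zip(order, order[1:]):
        if hashes[a] == hashes[b]:
            return a, b
    return None
```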

At the (classically) parallelized extreme, assume we have a processor for each hash, and that each processor is connected to every other. Then the hashes could sort themselves by moving to their expected position in the sort, and adjusting dynamically. A given hash might find its proper place in the list in O(N) time (probably worse on account of network congestion). In any event, this is certainly much faster than the serial case, but the computational complexity is actually greater, because we're using many processors inefficiently in order to run faster than one processor used efficiently. It's O(N*2^N), at best.

Thus we can't expect to do better at minimizing the computational complexity than to stick with the serial case. But that's fine because, as the birthday attack predicts, it's O(2^N), right?

Well, not exactly. The operation count is indeed O(2^N). But what if the time that an operation takes is itself a function of N? Then the execution time must be disproportionately larger.

The problem is one of memory size. The larger the number of hash samples, the physically larger the memory that holds them. Memory latency is bounded below by the time it takes a signal to traverse a 3D memory, which is O((2^N)^(1/3)), assuming that all the samples are packed into the smallest possible space, which is approximately a sphere.

Thus the execution time is O(2^N) times O((2^N)^(1/3)), which is O((2^N)^(4/3)). (Parallelizing the system will reduce the "wall clock" execution time, but not the execution time measured in processor-seconds. Moreover, the same (2^N)^(1/3) factor applies to the parallel case as well, because hashes must fly a certain distance before arriving at their sorted locations.)
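To put a number on that gap, under the post's assumptions the latency-adjusted cost is 2^(4N/3) rather than 2^N, so the penalty factor is about 2^(N/3). The loop below (illustrative only) prints it for a few hash sizes.

```python
# Gap between the birthday-attack operation count O(2^N) and the
# latency-adjusted execution time O(2^(4N/3)), under the assumption that
# memory latency scales as the cube root of the number of stored hashes.
for n in (40, 64, 128):          # N for 80-, 128-, and 256-bit hashes
    ops_exp = n                  # operations ~ 2^N
    time_exp = 4 * n / 3         # processor-seconds ~ 2^(4N/3)
    print(f"N={n}: ops ~ 2^{ops_exp}, time ~ 2^{time_exp:.0f}, "
          f"penalty ~ 2^{time_exp - ops_exp:.0f}")
```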

By the way, given that the rate at which a sphere can radiate heat is proportional to its surface area rather than its volume, the maximum degree of parallelism would occur with all the processors packed tightly on the surface of a sphere. Any attempt at tighter packing (using the interior of the sphere) while running all processors in parallel could only be sustained at such a reduced rate of computation as to render the topology performance-equivalent to the surface case. (This thermodynamic point bears an uncanny resemblance to the Holographic Principle's implication that the information content of a black hole is proportional merely to its surface area.) The upshot is that the interprocessor communication latency given above is an underestimate, in the regime of maximum-efficiency packing of processors executing at maximum sustainable speed.

As to quantum computers, I was originally hopeful that Grover's algorithm could reduce the execution time. On second thought, I don't think so. The problem is that the hashes must either be loaded into the quantum computer from classical media, or else be generated from candidate messages produced inside a quantum superposition. But the branches of a superposition can't communicate with one another, so far as I know. So in either case we would have to search for each hash via classical media, and there would be no complexity reduction.

The point of all this is simply that the computational complexity (operation count) of a birthday attack understates its execution time in processor-seconds by a factor that is itself exponential in N, and that this appears to hold even in the quantum computing regime. Because the latter metric is what matters in the physical world, the implication is that hashes are actually stronger than supposed, independent of the question of weaknesses unrelated to birthday attacks.
