Tuesday, December 18, 2012

GetLMD with Checksum Capability

GetLMD source code is here (and more specifically, here) on Github for Linux (32/64-bit), Mac (64-bit), and Windows (32-bit) now has the ability to calculate 8-bit, 16-bit, 32-bit, and 64-bit checksums over nested folders, in addition to LMD, LMD2, LMD3, LMD7, and LMD8. Precompiled binaries included. See readme.txt or run it (getlmd64 for X64 Linux, getlmd32 for X86 Linux, or getlmd.exe for Windows) for syntax instructions. As with LMDx, checksums are evaluated as though the file is a multiple of the checksum width, padded with high 0 bytes.

Thursday, September 20, 2012

GetLMD, a File Hashing Utility

Want an easy way to find duplicate files, or check for errors? Try GetLMD. It's a command line utility that I wrote long ago, and only just got around to publishing. It was intended to demonstrate the LMD hash family, so the code is written in an obtuse manner which forgoes some obvious optimizations in exchange for clarity. Nevertheless, it's very fast. The output looks like this, for example:

GetLMD build 47
Computes the LMD/LMD2/LMD3 of a file or directory.
Copyright (c) 2012 Russell Leidich, all rights reserved.
http://leidich-message-digest.blogspot.com

D2E2C9132E17DF1A: getlmd/build.h
BE37DA2850485A67: getlmd/constant.h
94EED84F26B6CC2C: getlmd/COPYING
844D8481E6860212: getlmd/file_sys.h
5056B4444D7F617C: getlmd/getlmd.c
6E6A243555254A52: getlmd/getlmd.exe
9A74D885D069BF92: getlmd/getlmd32
F9EA958FDAC7F040: getlmd/getlmd64
373446334609A081: getlmd/linux32_build.sh
93E60B9867442314: getlmd/linux64_build.sh
253A2F6F4ED2A8BC: getlmd/readme.txt
59997B11FC9D6990: getlmd/win32_build.bat

476542E8D2313940 (Sum of file-size-hashed (LMD2)s modulo 2^64.)

The source code is included under the licensing terms explained in COPYING. It does, however, seem to work in a manner consistent with the hash definitions provided in this blog, and has been tested against the LMD2 and LMD3 reference code. See readme.txt for details. Really, this was all just intended as a demo, so the code could be much more elegant and performant. Nevertheless, it's quite useful.

For example, you can capture the hashes to a file, like this (in 64-bit Linux):

./getlmd64 your_folder >your_output_file.txt

Or detect duplicate files by looking for the same LMD twice in a row:

./getlmd64 your_folder |sort

I don't have time to do further development, but I'll try to help if you find any bugs. pkerjjy@gmail.com, without the fourth letter.

Enjoy, and by all means feel free to build something better with it!

Wednesday, April 4, 2012

The LMD4, LMD5, and LMD6 Secure Hashes

If you're just looking for source code, it's here. But it's poorly documented (sorry), so definitely check out my expanded notes and examples here.

This post won't make any sense unless you've read the previous one.

First of all, a semantic point: I use the word "hash" to mean "digest". But I use the former because industry experts often discuss "secure hashes" but rarely "secure digests". The rest of this article presumes that you understand the basic requirements for a secure hash: (1) that it be intractable to construct 2 messages with the same hash and (2) that the message cannot be derived with any useful statistical certainty from the hash. If this doesn't make sense, then Google around first.

Secure hash design is a humbling process (as distinct from evolution, which gave rise to the LMD family), because computers are so good at demonstrating the hidden statistical weaknesses of hashes which appear strong. To make the point obvious, consider a checksum. A checksum is a crude hash. At first blush, a 128-bit checksum is almost impossible to defeat because there are 2^128 possible checksums. But on closer examination, it turns out that it's very easy to get the same hash from 2 different messages -- even by accident! Namely, 2 bit flips in the same column simply add and subtract the same power of 2 from the checksum, resulting in no change. This concept extends even to choatic iterators, which, contrary to common sense, often show predictable behavior when seen through the right statistical analytic lens.

LMD4, LMD5, and LMD6 (abstractly, "LMDx"), when used in tandem with a symmetric encryption algo, are secure hashes, provided that said algo is harder to crack than the hashes themselves. That's because they rely on the algo for pseudorandom data.

These 3 hashes share a common precision-abstract algo, as defined below. So first, without further comment, I will provide their parameters, which constitute the entirety of their mutually distinguishing features:

LMD4 (256 bits)
N=128
A=(2^128-2^125-2^110-2^100)
B=(2^128-2^101-2^98-2^76)

LMD5 (512 bits)
N=256
A=(2^256-2^243-2^236-2^194)
B=(2^256-2^220-2^206-2^183)

LMD6 (1024 bits)
N=512
A=(2^512-2^498-2^496-2^427)
B=(2^512-2^481-2^404-2^362)

The number of bits in each hash is 2N. Half this value happens to be a useful quantity for the sake of defining the algo.

A and B are multiply-with-carry generator coefficients, in the LMD and LMD2 sense, chosen such that:


A*(2^N)-1 is prime;
A*(2^(N-1))-1 is prime;
B*(2^N)-1 is prime;
B*(2^(N-1))-1 is prime;

and A and B are easy to multiply by (for speed).


This is critical so that we can guarantee that the period of the generators (A,X0,C0) and (B,Y0,D0), in which (X0, C0)<>(0,0) and (Y0, D0)<>(0,0), is given by:


A*(2^(N-1))-1 or B*(2^(N-1))-1, respectively.


The important thing is simply that these huge cycle lengths can be guaranteed ahead of time, so we're not just stuck hoping that they're "long". And for their part, transient subcycle resonances are thwarted by virtue of the LMD generators' compliance with the reflexive density constant.

Granted, it's possible to maliciously construct a block L of data which exhibits resonance with the outputs of a particular LMDx generator. That's why, as shown below, (1) the seeds are implied by the cryptographic context, but not stored; and (2) we use 2 orthogonal generators to the compute the hash. Such construction is still possible because, after all, the hash is always many times shorter than L itself. However, I suspect that it's computationally intractable for reasons discussed below..

LMDx should be sufficient to satisfy both hash criteria above. However, it's still encrypted, as a measure of cheap insurance. It's specifically designed to plug into an encryption algo which has the following properties:


1. It can produce 4 pseudorandom seeds, each of N bits, to initiate LMDx. These seeds -- (x0, c0) and (y0, d0) -- are known only to the 2 parties which share a private symmetric encryption key. They are implied by the cryptographic context, and thus not transmitted in any form.

2. It can produce a 2N-bit xor mask, M, which is used to encrypt (by xoring) and decrypt (by xoring again) to the LMDx.

3. Neither of the above pseudorandom values is to be used for other purpose, which might permit statistical analysis attacks.


LMDx operates on a block of 2^15 bits (2^12 bytes, or 4KiB) of data -- regardless of N. This block is denoted as L. L[N,r] means the (r+1)th N-bit value of the block. For example, L[256,1] means the second 256 bits of the block. The maximum value j of r is thus:


j=(2^15)/N-1


Here, then, is the LMDx algo in terms of N, A, and B above; the seeds (X0, C0) and (Y0, D0); and the xor mask, M; where "x/y" denotes the integer quotient obtained by dividing x by y.


j=(2^15)/N-1
x=X0
c=C0
y=Y0
d=D0
FOR (i=0 to j){
x=x xor y
p=A*(x xor L[N, i])+c
x=p MOD 2^N
c=p/(2^N)
q=B*(y xor L[N, i xor ((j+1)/2)])+d
y=q MOD 2^N
d=q/(2^N)
}
p=p+(y*(2^N))+d
p=p MOD 2^(2N)
z=p xor M


The LMDx of the block is then given by z, which is 2N bits long.

The algo above may appear arbitrary, but every step in the process serves a specific purpose, without which the hash would have known weaknesses. These weaknesses were extrapolated by statistical analysis of comparable hashes where N={8,12,13}. It's possible that other weaknesses (or strengths) occur when N>=128, but I lack the computational tools to expose them. However, the behavior in the aforementioned "small N" cases is so consistent, as to lend confidence to the latter case.

Assume that z is stored in plain view. Thus we could crack the hash if we could find U such that:


LMDx(L,N,A,B,X0,C0,Y0,D0,M)=LMDx(L',N,A,B,X0,C0,Y0,D0,M xor U)


where L' is a (probably maliciously) modified derivative of L, because we could simply xor U to z.

At first blush, it's impossible to find U because the 4 seeds are secret; one must crack the encryption in order to find them, in which case the cracking the LMDx is probably a moot consideration anyway. One might surmise that 2^(2N) values of U are possible, given that U consists of 2N bits.

While this may be true in the general case where L and L' do not resemble one another, we want to concentrate on cases where L and L' differ by only a bit. In the worst case, this amounts to setting a single bit in a field of 0s, or clearing a single bit in a field of 1s. (Anything more complex is probably an exercise in self-deceipt with respect to hash strength.) In these painfully simple cases, it appears that only about 63% of all possible U values are actually roots (solutions) of the above equation.

63% is very close to the reflexive density constant. The upshot is thus that the roots of the hash appear as though chosen at random. In turn, this lends credence to the notion that we have a strong hash, the binary equations for which being as complex as possible, at least in the sense of root obfuscation, given the computational complexity required to compute the hash in the first place. (A more complex algo would create more obfuscated roots, but given the computational cost that we have, we appear to be getting maximum bang for the buck, because if any piece of it is removed, it exhibits statistical weaknesses.)

Again, this algo didn't come from thin air. It started as something more like LMD2, and evolved, step by step, until it satisfied the reflexive density constant and Poisson Criterion constraints sufficiently well.

So I hope you'll excuse its "warts", which are a necessary evil. In particular, the "x=x xor y" may seem arbitrary, particularly as it communicates data from the second generator to the first, but not the other way round. Without this instruction, the algo is a lot weaker. But with it, we seem to have achieved our goal. Providing a reverse dataflow is then just a waste of time, and might actually allow for easier hash compensation.

Secondly, the 2 instructions before the final xor may seem weird; why are we endian-flipping the second generator state before adding it to the first, and why not xor, which is faster than add due to the lack of carry propagation? Well, in the first case, the endian flip is required to ensure that the high bit of the generator state is not biased, which it would (slightly) be otherwise. (This is important with respect to the Poisson Criterion.) And we use addition instead of xor because -- you guessed it -- the algo would otherwise exhibit weaknesses.

Lastly, what's up with the "i xor ((j+1)/2)" in the expression for the second generator? All this does is ensure that the 2 generators are processing data which is halfway out of phase. What this means is that they are 128, 64, and 32 iterations apart for LMD4, LMD5, and LMD6, respectively. This guards against final-iteration attacks, in which changes to the last chunk of data in the block may fail to result in sufficient bit flips in the hash.

What about files which do not happen to be a multiple of 2^15 bits? And how do I compute the LMDx of a null file? That's all discussed here, in the "UPDATE" at bottom.

What's that? You want a demo? OK fine. Compile the source code from here.

After all that, did you manage to crack LMDx with "arbitrary" seeds, even in the case of a single-bit message modification? Cool! Please tell me how you did it, at hpkejjy@gmail.com without the "h".

Tuesday, April 3, 2012

The Reflexive Density Constant

Before proceeding with further discussion, it's useful at this point to introduce what I call the "reflexive density constant", or "R". This may sound totally unrelated to error detection and secure hashes, but bear with me.

What is R? The best way to define it is procedurally:


1. Set all N bits in a bitfield to 0.

2. N times, choose one of the bits in the bitfield at (true) random. Set the bit to 1, regardless of its current value.

3. Count all the 1s in the bitfield. Divide by N. R is the limit of this ratio as N approaches infinity.


There is an infinite expression tree for R, which I don't have time to investigate. So I approximated it using a true random number generator. It appears to be close to 0.6321...

This value appears to equal 1-1/e. (Isn't transcendental reverse approximation cool? Thank you, WolframAlpha!) [EDIT: This turns out to be correct, as proven here.] Granted, this could all just be an elegant coincidence, as I have no formal proof. Either way, I surmise that an infinite Bayesian logic tree is at the heart of R. [EDIT: Alternatively, one could use the Jytter userspace true random number generator to arrive at an empirical approximation of R.]

1-1/e also comes up in the online stochastic bipartite matching problem. Is this significant? I don't know and don't have time to find out, but maybe you do.

Anyway, what about the name? "Reflexive" refers to choosing N bits at random from a bitfield which contains equally many bits. "Density" refers to the fraction of 1s. "Constant" reflects the apparent constancy of R in the infinite limit of N.

Now, how does this relate to pseudorandom number generators (PRNGs), EDCs, and secure hashes? Well, it turns out that compliance with R is a very good measure of the quality of each of these:

R AND PSEUDORANDOM NUMBER GENERATORS

For a PRNG to look convincing, its outputs must mimic an unbiased true random number generator (TRNG). We can ascertain the quality of this mimicry by running the test above using the PRNG instead of a TRNG. The result should be close to R. If it's much less than R, then the PSNG is getting trapped in a small subset of possible outputs (a tight orbit). If it's much more than R -- in the worst case, 1 -- then it's starting to act like an out-of-order counter.

R AND ERROR DETECTION CODES

A good EDC requires a PRNG of the above criteria. Otherwise, who knows what data resonances might occur, which would weaken said EDC. All multiply-with-carry generators in the LMD family appear to adhere to within 1% of R.

R AND (SECURE) HASH FUNCTIONS

Consider the hash function H(L), where L is a message and H(L) is its hash. Given L', a derivative of L, which is probably only slightly different, e.g. a single bit flipped. We can then crack this hash by finding some integer U such that:


H(L)=H(L') xor U


Simply xor U to H(L'), and presto, L' appears to be legit. (The xor could be replaced by addition or some other operator, depending on the cryptographic context. The point is, if it's easy to find U, the "difference" of the hashes, then it's easy to crack the hash.)

We can also crack the hash by finding L' such that:


H(L)=H(L')


but in practice, because U is generally much shorter than L', it's easier to take L' as a given, and solve for U.

Now... how do we know how easy it is to solve for U, in the general case of a long hash? We don't -- at least, not precisely. However, by gathering statistics from similar hashes of tractable length (perhaps 24 bits or less, given the limits of desktop computer power), we can often extrapolate to hashes of similar construction but merely longer length. Is this foolproof? No. But it's probably the best that we can do, absent more computer power. (And if we had enough computer power to do a complete analysis of the real hash, then, prima facia, the hash would be of insufficient length to be secure.) At least, if the short hash turns out to be weak, then we know that the long hash is in jeopardy. If the short hash turns out to be strong, then we can attempt to try as many samples as possible, in order to bolster our faith that the real hash is also strong.

How does all this relate to R? Well, think of U as an N-bit integer, where N is equal to the number of bits in the hash itself. Then R tells us that ~63% of the 2^N possible U values are actually roots of the hash with respect to L vs. L'. The number of actual roots, divided by 2^N, is what I call the "root density", D. If D is not close to R, then we have a weak (or, suboptimally strong) hash. Why?

Here's where your brain has to twist... fasten your seatbelt.

First of all, remember that we can't gather statistics on a hash of "interesting" length because, after all, the length of the hash is meant to hamper all attempts to solve for U. So, we're solving for U on hashes of tractable length, but similar construction, in the hopes that the statistics are extapolatable to the actual hash in question. (It's easy to solve for U in the above case, for example. Just compute H(L) and H(L'), then xor them together. Depending on the cryptographic context, this process could be more complex, but the concept is the same.)

So just create 2^N versions of L', and count the populations of all associated values of U. What you get is a population histogram of U over all 2^N versions of L'. (The same process allows us to compare how H(L) differs from H'(L), where H and H' differ by only their pseudorandom seeds. Suffice to say, this technique can be applied to a wide variety of hash modifications.)

So the question is, how many roots (unique values of U) are there, compared to how many there could possibly be (2^N)? For example, if we find only 3 unique values of U, which occur over and over again, then the answer is 3/(2^N). If we find all 2^N values, then the answer is 1. I would suggest that we want the answer (D) to be as close to R as possible.

To understand why, first consider the extreme cases: 1 unique root, and 2^N unique roots. In the first case, it might be that the binary equation trees for deriving U from L and L' (and yes, there are such equations, despite the fact that you couldn't write them down in a lifetime) might have a hidden simplicity, which a smart computer could discover. In the second case, a comparable argument prevails: the set of solutions is "too perfect", such that every possible root is in fact a root. In this case, there might be a boring linear relationship between L, L', and U. In the absence of any other information, then, it's compelling to believe that the binary equations reach maximum complexity when D=R.

The point here is wider than simply creating memory capacity pains for classical computers attempting to subvert the hash. There's a quantum computer issue here, as well: if the roots appear to have been chosen at random, then it stands to reason that a quantum computer will behave as if they are in fact arranged at random, which is to say that they will represent (unstable) excited states for the wave function, eluding the quantum computer's attempts to discover the global minimum, i.e. the particular value of U which legitimizes a particular L'.

But there's one other consideration for the quantum case: root multiplicity. Essentially, the larger the ratio between maximum and (nonzero) minimum root multiplicity (the "root multiplicity ratio"), the greater the odds that a quantum computer will find the root with maximum multiplicity.

What distribution of root multiplicities would we expect if we chose 2^N values of U at random? That's a question related to Poisson distributions and beyond the scope of this blog. (I call it the "Poisson Criterion".) Suffice to say that if my extrapolations from shorter hashes are valid, then LMD4, LMD5, and LMD6 all exhibit root multiplicity ratios close enough to what a Poisson distribution would dictate, as to be unhelpful to the best known attack method, which is a birthday attack.

Got it? Good. Now we can talk about secure hashes...

Monday, March 26, 2012

A Word on File Size

For those of you who wish to use LMDx for purposes other than detecting accidental errors on fixed-block-size storage devices...

As explained in previous posts, LMD2 and LMD3, which I published in the previous post, act on 32-bit values ("u32" in the code). This means that a file which is a multiple of 4 bytes in size will have the same LMD2 (and LMD3) as another file whose length is 1, 2, or 3 bytes longer, provided that those additional bytes are all 0s.

While this seems bad, it's not actually abnormal. After all, most other digest algos operate on bytes, even though a file may end on a bit boundary. If you want to include the exact size (to the nearest bit, or byte, or whatever), then simply ensure that you integrate it into the LMDx that you're using. For example, you could xor the low bits of the generator seeds (X0 being lower in bit position than C0) with the file's size. Or, you could simply ensure that its size is embedded in a header, which itself falls under LMDx protection. [EDIT: A temptingly simple approach is to set the first bit of the padding, and clear it beyond that bit. However, I think the aforementioned seed xor method is superior. Appending a 1 creates a case in which it's easy to generate the same hash maliciously, or even accidentally.]

Of course, if the file size is implicit based on other content, then you already have a way to know whether it has been accidentally truncated or expanded, independent of LMDx.

UPDATE: Likewise, LMD4, LMD5, and LMD6 operate on blocks of 2^12 bytes, so files which do not have this granularity may have the same hash due to similar 0 padding, with the same solution as for LMD2 above. (The hash of a null file is the hash of a block of 0s.)

Lastly, why does this font suck? 0 (zero) comes out like O (the letter "oh"). Ah, the mysteries of Blogger...

C Source Code for LMD2 through LMD8

It's available here (and more specifically, here) on Github, for Linux and Windows. See the file "COPYING" for information on licensing terms, which are very liberal but different than for (obsolete) LMD. See README.txt for instructions on building the demo.

LMD4 through LMD8 are very different creatures from LMD2. I will discuss them in a future post. (I'm the sort of person who prefers to get the code done first, then the documentation, and not the other way around.) And no, these new algos do not obsolete LMD2. LMD2 (or LMD3, if you prefer) are still the recommended EDCs for the rapid detection of accidental (as opposed to malicious) errors.

NOTES:

1. Do not call the C functions with pointers to overlapping memory regions.

2. A and B in the post linked above are known as LMDx_A0 and LMDx_A1 in the source code, respectively, where x=[4,5,6].

3. LMD6 comes in 2 flavors, one optimized for 32-bit systems, and the other for 64-bit.

Friday, March 16, 2012

LMD3 for 64-bit Pseudorandom Sequences

[EDIT: This post was revised on September 16, 2012 because the LMD3 code was right, but this documentation was wrong.]

LMD3 was introduced informally in the previous post. It is not indended to be used as an error detection algo, unless you're more interested in long nonzero sequences than maximimizing double-bit error detection coverage length. For the latter, LMD2 still serves its purpose.

Alternatively, LMD3 can supplement LMD2, if you're paranoid and want to use a 128-bit EDC. In such a regime, 64-bit pseudorandom numbers can be rapidly generated, by concatenating the "x" values of LMD2 and LMD3. (Remember, "c" is not considered part of the pseudorandom output; it's just present to prevent "x" from behaving too predictably.) A smooth transition between 32 and 64 bits can occur.

Let me explain. I happen to be working on an unrelated algo at the moment, which requires that 32-bit pseudorandom numbers be used. However, when the data volume reaches a critical point, 32 bits is no longer enough; I need 64 bits. The problem is that I need a relatively smooth transition between the two, such that the 32-bit fraction at iteration "n" is very close to the 64-bit fraction at iteration "n". Otherwise, the transition would be awkward to the user, in the sense that the program's behavior would be radically different.

So, for instance, if one of the pseudorandom values is 0x8D903EE7 in the first sequence, then a compatible value would be, say, 0x8D903EE79CA4B29B. If we look at these values as fractions on [0,1), then as you can see, the second fraction is within 1 part in 2^32 of the first, which constitutes a smooth transition within the limits of 32-bit precision.

One could do this using any 2 32-bit pseudorandom sequences. But we want the high and low parts to be uncorrelated, yet rapidly generated without the need to compute 128-bit products. Double use of LMD2_A (0xFE001000) helps in this regard.

LMD3 has the seeds that it has because it caters to the cases wherein one is more concerned with long nonzero sequences than double-bit coverage. But if used in the manner above, then it can also provide a smooth (and fast) transition between 32-bit and 64-bit sequences. Granted, because the multipier is still LMD2_A, the period of the resulting sequence is the same as for LMD2, regardless of whether LMD2 and LMD3 are combined. For practical applications, this probably doesn't matter. This could be avoided by using a different multiplier, for example, 0xF7FBFFFF, which is only slightly slower than LMD2_A.

Following is the definition of LMD3. Note that it behaves exactly like LMD2, except for the seeds:


ITERATE_NO_ZERO_CHECK means:

p = 0xFE001000 * x + c
x <- p MOD 2^32
c <- p / 2^32 (Shift p right by 32.)

(x0, c0) = (0x0, 0xDA6D32BA)

(That's not a typo. "x" should rightly start at 0 in order to maximize the number of subsequent nonzero terms.)

Thus:

(x1, c1) = (0xDA6D32BA, 0x0)
(x2, c2) = (0x5F2BA000, 0xD8B865FB)
(x3, c3) = (0x92B865FB, 0x5E6D4EB3)

Thus:

(LMD3 of null) = 0x38DA816D92B865FB


If what you really want is a pseudorandom sequence with a period roughly the square as long, then LMD3 will not help you. Instead, try combining LMD2 with a sequence using the multiplier 0xF7FBFFFF:


0xF7FBFFFF * x = (x * 2^32) - (x * 2^27) - (x * 2^18) - x


As required, (0xF7FBFFFF * 2^32 - 1) and (0xF7FBFFFF * 2^31 - 1) are both prime. The cycle length (in the absence of LMD2) is the latter, or 8,934,578,708,602,159,103, which is enough 32-bit "x" values to exceed the size of a 64-bit address space. Because LMD2 and this new sequence have prime and unequal cycle lengths, the cycle length of their concatenated "x" values is the product of their cycle lengths, or 81,763,217,765,900,274,931,684,699,996,617,179,137, which is just under 2^126.

And finally, just because I happened to take the time to compute it: if a long nonzero initial sequence is what you want, then try the following with 0xF7FBFFFF:


(x0, c0) = (0, 0x938A52)


It will produce 44,342,898,605 nonzero values, starting with the output of the first iteration.

Using LMD2 to Generate a Long Nonzero Pseudorandom Sequence

Borrowing the multiplier from LMD2, we can rapidly create a long (49,327,206,862 outputs, excluding the initial x0=0) sequence of nonzero 32-bit values by simply altering (x0, c0) as follows, while keeping ITERATE_NO_ZERO_CHECK as described in the previous post:


As usual, ITERATE_NO_ZERO_CHECK means:

p = 0xFE001000 * x + c
x <- p MOD 2^32
c <- p / 2^32 (Shift p right by 32.)

Except that, unlike with LMD2:

(x0, c0) = (0, 0xDA6D32BA)

Thus:

(x1, c1) = (0xDA6D32BA, 0)
(x2, c2) = (0x5F2BA000, 0xD8B865FB)
(x3, c3) = (0x92B865FB, 0x5E6D4EB3)

(Only the "x" values are the pseudorandom outputs.)


Don't forget that 0xFE001000 is easy to multiply by, as described in the previous post.

Someone out there probably has a use for such a sequence. And it's not bad from error detection standpoint, either.

(Blogger's font antics are driving me insane. I've changed to this idiotic font that displays hexadecimal numbers with different number and letter heights. Of course, it's the default. So let's hope it fixes the problem.)

LMD2: Breaking the Megabyte Barrier

Perhaps this sounds like the title to an article in an issue of Byte Magazine from 1988. But I'm not talking about memory capacity. The focus of this post is LMD2, an algorithm closely related to LMD, but with the capacity to offer comparable 2-bit error protection strength over a full megabyte, instead of slightly less than that. This is significant because it permits the use of megabyte block size for storage or transmission protocols, without having to worry about 1-bit or 2-bit errors (except in the negligible case where an error bit in the message compensates for an error bit in the LMD2, which appears to have odds of 1:2^64 in the long message limit).

The LMD2 algo is identical to LMD, with the exception of the seed values and the multiplier. Specifically:


ITERATE_NO_ZERO_CHECK means:

p = 0xFE001000 * x + c
x <- p MOD 2^32
c <- p / 2^32 (Shift p right by 32.)

(x0, c0) = (
0x129E5CFA, 0xC97A34B3)

Here are the first few values:

(x1,c1) = (0xBB49D4B3, 0x1279216A)
(x2, c2) = (0x49C4516A, 0xB9D34CBE)
(x3, c3) = (0x2AE9ECBE, 0x4930CD64)

which implies that:

(LMD2 of null) = 0x12AB02173D8849B8


You might reasonably wonder why we multiply by 0xFE001000 on each every iteration. This value is chosen so that (0xFE001000 * 2^32 - 1) and (0xFE001000 * 2^31 - 1) are both prime. This is the same constraint satisfied by 0x7FFFFDCD in LMD. This requirement comes from the multiply-with-carry generator previously discussed.

The cycle length is thus (0xFE001000 * 2^31 - 1), or 9,151,323,238,909,870,079 -- about double that of LMD. This is just under 2^63, which implies that a complete cycle of 32-bit "x" outputs would exceed the size of a modern 64-bit memory space. The reason LMD was limited to half as much was some doubt on my part as to whether or not it was safe to use a coefficient with MSB 31, as opposed to MSB 30. That doubt has now been removed.

But there's something else which is special about 0xFE001000: it's easier to multiply by, which may accelerate the process on certain microprocessor architectures:


0xFE001000 * x = (x * 2^32) + (x * 2^12) - (x * 2^25)


Apart from the coefficient, LMD2 has a longer unique shiftoid sequence length. Specifically, the first 263,837 x outputs (excluding the seeds themselves) constitute unique shiftoids. Because each "x" is 32 bits, this implies error coverage for up to 1,055,348 bytes -- enough to cover a 1MiB data block plus 6772 bytes of metadata.

The noise quality appears to be similar to LMD.

All of these features suggest that LMD2 should replace LMD in practice.

Sorry again for the incorrigible font problems. Blogger sucks, but it's free.

The Finer Points of Digest Finalization

I want to go into a bit more detail regarding the digest finalization process, reiterated below:


(Finish computing dot product, y, above, leaving (x, c) as its most recent value, such that x has been used to compute the last term of y.)

z = (y + (c * 2^32) + x) MOD 2^64
x <- z MOD 2^32
c <- z / 2^32
ITERATE_NO_ZERO_CHECK
ITERATE_NO_ZERO_CHECK
ITERATE_NO_ZERO_CHECK
z = (z + (c * 2^32) + x) MOD 2^64

z is the (final) digest. Done!


It occurred to me the that the above process appears somewhat arbitrary. It is not.

The first step, in which we add the dot product, y, to the current (x,c), is important because we want the LMD of a null file to be "random". Otherwise, imagine how easy it would be to have a null file with an LMD of 0, which appears to be valid. Because y can be any of 2^64 states, z has the same domain.

We pass through the iterator 3 times because, as explained previously, that seems to be optimal from an entropy accrual standpoint, balanced against the need to avoid excessive latency.

In the last stage, we combine z with (x,c) from the beginning of the process. This is critical in order to ensure that z can assume any of 2^64 states; otherwise, it would be constrained by virtue of having passed through the iterator thrice.