More Candy for You

Hey internet friends!

I’ve been coding candy a lot this week and now I want to share it with the world! The following utility uses SSE2 assembly intrinsics to speed up SHA-1 hashing. This was just to challenge myself - to see how low I could go - so don’t expect particularly verbose comments; but you may find it useful if you ever have the need to hash a lot of stuff all at once.

The current benchmark is over a million hashs in half a second, details about my prior benchmarks are also included.

I also recorded copies of the source for every major revision to this code, and if I can figure out a good way to visualize it, ill upload a chart or something showing how I optimized the program.

Requirements:

  • G++ and GOMP or Visual Studios 2008
  • An SSE2 CPU

As always, ping me if you find a bug, I know about a few on windows and I will post about them later, as they are kind of WTF.

Download Here

Some Source code for you

Here is the code that I have been using to play with random numbers these past few days. It comes with the 3D thing and a few statistical tests. It will compile on any platform that has some form of GLUT.

Code here.

Compile and run on Linux:

# g++ -lglut rand.cpp && ./a.out

On Mac OSX(thanks Aaron):

# g++ rand.cpp -framework GLUT -framework OpenGL && ./a.out

But you may have to adjust the header files(Instructions in the source)

Click and drag to rotate.
keys ‘1′,’2′, and ‘3′ manipulate the points in various ways
‘[' and ']‘ increase and decrease the number of dots
‘+’ and ‘-’ zoom out and in.

And enjoy, I don’t plan on touching this code again unless someone finds a bug; but I would appreciate it if anyone with a mac could send back any interesting results.

Post 2^7 and the Marsaglia effect

Another Milestone, 128 posts and I still haven’t been kicked off the internet due to lack of lulz. SUCCESS!

Now onto business. I wrote that 3d program to get my fingers wet with OpenGL(yet again) and to test out the Marsaglia effect, where random numbers become not-so random when graphed in 3D.
Interesting links:

Ben had a stroke of genius(and also a stroke, but Ill save that for another post) by suggesting I multiply two contiguous random numbers in the Microsoft random number set together and then modulo that by RANDMAX. It produced the following.

Pretty good eh? I couldn’t construct a visualization with a reoccurring pattern and the numeric series passed a few of the Die Hard tests that I implemented in my program. I thought it was fool proof until I realized that you could attack this based on the flaws in the initial PRNG. After all, Seq[t]*Seq[t+1] only requires you to break that weak random number generator.

After collecting a few other random number lists(including QBASIC’s random sequence), I started looking into academic PRNG algorithms and as it turns out, there are quite a few simple to implement ones. Here is ‘xor and shift’ designed by George Marsaglia:

Here is an example with k=5, period about 2^160,
one of the fastest long period RNGs, returns more than
120 million random 32-bit integers/second (1.8MHz CPU),
seems to pass all tests:

int x=123456789,y=362436069,z=521288629,w=88675123,v=886756453;
/* replace default x,y,z,w,v with five random seed values in calling program */
int xorshift() {
        int t;
        t=(x^(x>>7));
        x=y;
        y=z;
        z=w;
        w=v;
        v=(v^(v<<6))^(t^(t<<13));
        return (y+y+1)*v;
}

-geo@stat.fsu.edu

And how well does that PRNG fair?

Hey, that looks exactly like that badPRNG * badPRNG discussed above, in fact the box is in the exact same position which could probably lead to a little bit of confusion. But they are completely different, one was generated by multiplying two predictable positions of a bad PRNG together -thus providing a thin veil of randomness - and the other was produced by an actually ‘good’ PRNG.

I probably don’t need to explain the ramifications of a poor random number generator as my audience is mostly tech savvy[1]; and I also shouldn’t have to explain how simple the xorshift algorithm above is, and yet Microsoft decided to go with ’something else’. Recently a bug was discovered in Nmap where the random host generation[2] would start duplicating IP’s after ~500 and would reach 50% duplication around ~1000. The problem Turns out to be Microsoft using a legacy PRNG for their standard c/c++ rand() and a ‘better’ PRNG for their proprietary rand_s().

Alas, this isn’t an anti-Microsoft rant, and later next week I will be putting Linux, IRIX, and possibly others to the same task.

Until then.

[1] My audience consists of my friends, random people from the Google summer of code mailing list, a Godaddy employee, and my father - who doesn’t understand a thing I write about.
[2] Thread here. If your interested in the math side of things keep reading this thread, Kris Katterjohn, Jah, and Brandon Enright all post extremely insightful comments

Problems with Microsofts PRNG

The bounding box is the the max value generated by Microsoft’s pseudo random number generator, 2,147,483,647. The equation I used to generate this is as follows:
x = Seq[t] - Seq[t-1]
y = Seq[t-1] - Seq[t-2]
z = Seq[t-1] - Seq[t-3]

It may be hard to see from the image but all the vertices could be be bound by a rectangular parallelepiped that comfortably sits within the bounds of the PRNG limit. Why aren’t any random numbers generated near any of the other six corners?

Ill leave that question open till I port my program to Linux.

Choices choices choices…

It is a generally accepted practice in GUI design to exclude buttons that have no purpose and to avoid unnecessary input options to keep a GUI as simple as possible. So why does windows vista offer us the option to quit in the middle of an install? I think its pretty safe to assume that no user would want to format a drive, install half of windows vista, and then abort; even if they did there is always the power button, why have a GUI control to quit in an operating system install?

install-copy.JPG