This post originated from an RSS feed registered with Java Buzz
by James Gosling.
Original Post: Cosine redux
Feed Title: James Gosling: on the Java road...
Feed URL: /jag/feed/entries/rss?cat=%2FJava
Feed Description: I've been inescapably tagged as "the Java guy", despite the fact that I havn't worked in the Java product organization for a couple of years (I do still kibbitz). These days I work at Sun's research lab about two thirds of the time, and do customer visits and talks the other third. For more detail...
It's been quite a while since my last blog entry: I had a bit of a technology meltdown
that I'm not quite done with yet :-(
I was doing some recreational hacking over the holidays that involved
evaluating cosines. I ended up doing (once again!) an implementation of
cosine (don't ask why). Given the confused flamage that my previous posts
on cosine generated, I figure that showing some code would be useful. I
would never have thought that cosine would generate so much email traffic.
Yes, I know about Taylor series. No I wasn't trying to insult centuries of
standard mathematical practice. Performance is often the art of cheating
carefully.
public static float cost(float f) {
int bits = Float.floatToRawIntBits(f);
int mantissa = (bits&0x7FFFFF)|(1<<23);
int shift = (bits<<1>>>24)-127+9; // exponent, unbiased, with shift
if(shift>=32 || shift<=-32) return 1;
int fractionBits = shift>=0 ? mantissa<<shift : mantissa>>-shift;
int tableIndex = (fractionBits>>>(30-resultPrecision))&tableSizeMask;
switch(fractionBits>>>30) { // Quadrant is top two bits
case 0: return cosTable[tableIndex];
case 1: return -cosTable[tableSizeMask-tableIndex];
case 2: return -cosTable[tableIndex];
default/*case 3*/: return cosTable[tableSizeMask-tableIndex];
}
}
Lets go through this slowly:
int bits = Float.floatToRawIntBits(f); Get the IEEE 754 bits
int mantissa = (bits&0x7FFFFF)|(1<<23); The mantissa is the bottom 23 bits - to which the hidden bit must be prepended.
int shift = (bits<<1>>>24)-127+9; Extract the exponent, correct for the exponent bias, then add a bias to move the binary point to the top of the word.
if(shift>=32 || shift<=-32) return 1; If the shift is too large, the fraction bits would be zero, therefore the result is 1.
int fractionBits = shift>=0 ? mantissa<<shirt : mantissa>>-shift; Shift the mantissa so that it's a fixed point number with the binary point at the top
of the 32 bit int. The magic is in what's not here: because the argument is in turns, I get
to ignore all of the integer bits (range reduction made trivial); and because it's the
cosine function, which is symmetric about the origin, I get to ignore the sign.
int tableIndex = (fractionBits>>>(30-resultPrecision))&tableSizeMask; The top two bits are the quadrant... extract the bits below that to derive a
table index.
switch(fractionBits>>>30) { One case for each quadrant. This switch could be eliminated by
making the table 4 times larger.
case 0: return cosTable[tableIndex]; Yes! It's just a table lookup! Truly trivial.
case 1: return -cosTable[tableSizeMask-tableIndex]; ...
Since this is just a table
lookup, the resulting approximation can be pretty jagged if the table is small. But it's easy to tune the
table size depending on accuracy needs. The smaller the table is, the higher the cache hit rate will
be, and the more likely it is that the whole table will fit in cache.
Table lookups are a very common way to implement mathematical functions, particularly the periodic
ones like cosine. There are all kinds of elaborations. One of the most common for improving accuracy
is to do some sort of interpolation between table elements (linear or cubic, usually).