GCD and destroy

Some weeks ago I decided to fix once and for all this bloody hdrr. As it requires the framebuffer average log luminance, I decided to compute it be generating a pixel accurate mipmap. So for a 320×240 framebuffer we will have the following downsampling steps : (4,4) (4,4) (5,5) (4,3).
Don’t ask me why but instead of working on a way to decompose a number, I decided that I needed to find the greatest common divisor of width and height! 10 seconds later thanks to the almighty web search engine I started reading the Wikipedia entry about Binary GCD. And after some interweb jumps I ended on this article. The Stein’s algorithm looked cool.
If you look at Sputsoft code, you’ll realize that the shift_to_uneven function removes trailing zeros. This can be optimized using GCC builtins. The builtins we need is __builtin_ctz that counts trailing zeros.
We can then replace shift_to_uneven by:

u >>= __builtin_ctz(u);

And the GCD function now looks like this:

uint32_t gcd( uint32_t c, uint32_t d )
{
uint32_t u,v,s;

u = __builtin_ctz(c);
v = __builtin_ctz(d);

s = (u < v) ? u : v; u = c >> u;
v = d >> v;

while(u != v)
{
if(u > v)
{
u -= v;
u >>= __builtin_ctz(u);
}
else
{
v -= u;
v >>= __builtin_ctz(v);
}
}
return u << s; }[/sourcecode] Tadam!

Bookmark the permalink.

2 Comments

mr.t
September 12, 2011 at 4:55 am

You have an infinite loop for gcd(x, 0), x != 0.
Vincent
September 12, 2011 at 8:58 am

Yep, I forgot to add a simple sanity check.
And to make things worse __builtin_ctz(d) result is undefined for d = 0 (source).

This site uses Akismet to reduce spam. Learn how your comment data is processed.

GCD and destroy

2 Comments

Leave a Reply

Archives

Recent Posts

Music

My other stuffs

PCEngine

GCD and destroy

2 Comments

Leave a Reply

Archives

Recent Posts

Tags

Music

My other stuffs

PCEngine