GCD and destroy

Some weeks ago I decided to fix once and for all this bloody hdrr. As it requires the framebuffer average log luminance, I decided to compute it be generating a pixel accurate mipmap. So for a 320×240 framebuffer we will have the following downsampling steps : (4,4) (4,4) (5,5) (4,3).
Don’t ask me why but instead of working on a way to decompose a number, I decided that I needed to find the greatest common divisor of width and height! 10 seconds later thanks to the almighty web search engine I started reading the Wikipedia entry about Binary GCD. And after some interweb jumps I ended on this article. The Stein’s algorithm looked cool.
If you look at Sputsoft code, you’ll realize that the shift_to_uneven function removes trailing zeros. This can be optimized using GCC builtins. The builtins we need is __builtin_ctz that counts trailing zeros.
We can then replace shift_to_uneven by:

u >>= __builtin_ctz(u);

And the GCD function now looks like this:

uint32_t gcd( uint32_t c, uint32_t d )
uint32_t u,v,s;

u = __builtin_ctz(c);
v = __builtin_ctz(d);

s = (u < v) ? u : v; u = c >> u;
v = d >> v;

while(u != v)
if(u > v)
u -= v;
u >>= __builtin_ctz(u);
v -= u;
v >>= __builtin_ctz(v);
return u << s; }[/sourcecode] Tadam!

Bookmark the permalink.


  1. You have an infinite loop for gcd(x, 0), x != 0.

  2. Yep, I forgot to add a simple sanity check.
    And to make things worse __builtin_ctz(d) result is undefined for d = 0 (source).

Leave a Reply

Your email address will not be published. Required fields are marked *

What is 14 + 15 ?
Please leave these two fields as-is:
IMPORTANT! To be able to proceed, you need to solve the following simple math (so we know that you are a human) :-)