Upcoming Events
Unite 2010
11/10 - 11/12 @ Montréal, Canada

GDC China
12/5 - 12/7 @ Shanghai, China

Asia Game Show 2010
12/24 - 12/27  

GDC 2011
2/28 - 3/4 @ San Francisco, CA

More events...
Quick Stats
67 people currently visiting GDNet.
2406 articles in the reference section.

Help us fight cancer!
Join SETI Team GDNet!
Link to us Events 4 Gamers
Intel sponsors gamedev.net search:

Contents
 Page 1
 Page 2

 Printable version
 Discuss this article

Concept

Alpha blending is slow. It always involves lots of multiplication, or division. No way around that. Either you use lookup tables (and array accessing requires multiplication), or you use multiplication, or you use division. You can get a good approximation of alpha blending with bit-shifting, but that’s just an approximation.

What I set out to do was to create as fast an alpha blending formula as possible, without wasting lots of memory on lookup tables, or lots of clock cycles on math. And as far as I can tell, I did it.

Benchmarks

Alright, before I waste your time with my code, here’s some proof that my method is fast:

Note the horrible floating point performance of the P2-350. I wrote two other versions of my alpha blending algorithm, one using a << 1 bitshift, and one using a * float multiply. All used the same skeleton as my enhanced blend routines. The tests were conducted using a small optimized VB application, and each test involved blending 10 million (that’s right, 10 million) pairs of 32-bit color values. Each test used a different set of values. The additive blend is faster than my alpha blend, as demonstrated by the benchmark – it only has to lookup 3 values instead of 6. I’ve provided the code used for each of the three alpha blend test functions, so you can see that I attempted to optimize all three.

Code

newcolor.channels.red = clipByte((source.channels.red+dest.channels.red) >> 1);
newcolor.channels.green = clipByte((source.channels.green+dest.channels.green) >> 1);
newcolor.channels.blue = clipByte((source.channels.blue+dest.channels.blue) >> 1);

float alphaS, alphaD;
alphaS = (float)alpha / (float)255; alphaD = 1 - alphaS;
newcolor.channels.red = clipByte((source.channels.red*alphaS)+(dest.channels.red*alphaD));
newcolor.channels.green = clipByte((source.channels.green*alphaS)+(dest.channels.green*alphaD));
newcolor.channels.blue = clipByte((source.channels.blue*alphaS)+(dest.channels.blue*alphaD));

f2level * sourceTable; f2level * destTable;
sourceTable = aTable(alpha); destTable = aTable(255 - alpha);
newcolor.channels.red = clipByte(sourceTable->values[source.channels.red] +
                        destTable->values[dest.channels.red]);
newcolor.channels.green = clipByte(sourceTable->values[source.channels.green] +
                          destTable->values[dest.channels.green]);
newcolor.channels.blue = clipByte(sourceTable->values[source.channels.blue] +
                         destTable->values[dest.channels.blue]);




Next : Page 2