  Contents

 Introduction
 Convolution
 Code
 Filters
 Edge Enhancement
 Edge Detection
 Tips and Tricks

The most famous (and easiest to program) form of digital filtering is convolution, a close relative of correlation (the two differ only in whether the kernel is flipped before it is applied). The technique is based on weighting: each output pixel's color is computed from the colors of its neighboring pixels, each multiplied by a weight. This is very useful for edge detection, adjusting sharpness and contrast, or making a nice special effect.

A convolution kernel is a table that looks something like this:

1 2 3
4 5 6
7 8 9
45  

Each number in it is known as a convolution coefficient. Of course, none of that terminology is important, as long as you understand what the kernel actually is. The above kernel is a 3x3 square plus an extra number called the normalization constant. The center of the square represents the pixel in question (the pixel you're calculating), and the surrounding numbers correspond to its neighboring pixels. The "45" is the normalization constant: all the coefficients in the 3x3 square are added up to produce it (1+2+3+4+5+6+7+8+9=45).

The process of digital filtering is to multiply each pixel in the neighborhood by its corresponding convolution coefficient (the upper-left pixel by the upper-left coefficient, the lower-right pixel by the lower-right coefficient, and so on), add the products together, and divide the sum by the normalization constant. That is how the kernel works. If you do this with every single pixel in the image, you have digitally filtered it. What does this mean in practice? Say you're applying the above kernel at position <100,100> in an image, where <99,99> is one pixel up and to the left and <101,101> is one pixel down and to the right. To digitally filter that pixel, you do this (a short code sketch follows the list):

  • Multiply <99,99> by 1 (the upper-left coefficient).
  • Multiply <100,99> by 2 (the upper-mid coefficient).
  • Multiply <101,99> by 3 (the upper-right coefficient).
  • Multiply <99,100> by 4 (the mid-left coefficient).
  • Multiply <100,100> by 5 (the original pixel - mid-mid coefficient).
  • Multiply <101,100> by 6 (the mid-right coefficient).
  • Multiply <99,101> by 7 (the lower-left coefficient).
  • Multiply <100,101> by 8 (the lower-mid coefficient).
  • Multiply <101,101> by 9 (the lower-right coefficient).
  • Add all these multiplications up.
  • Divide the result by 45 (the normalization constant).
  • Repeat with every other pixel.
  • You have digitally filtered it!
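
To make those steps concrete, here is a minimal sketch in C of the calculation for one pixel. It assumes an 8-bit grayscale image stored as a row-major width*height byte array; the function and parameter names are my own for illustration, not taken from the article's code (which starts in the next section).

/* A sketch of the per-pixel calculation for an 8-bit grayscale image
   stored as a row-major width*height byte array. Names are illustrative. */
int filter_pixel(const unsigned char *image, int width,
                 int x, int y,
                 const int kernel[3][3], int norm)
{
    int sum = 0;

    /* Multiply each pixel in the 3x3 neighborhood by its coefficient
       and add the products together. */
    for (int ky = -1; ky <= 1; ky++)
        for (int kx = -1; kx <= 1; kx++)
            sum += kernel[ky + 1][kx + 1] *
                   image[(y + ky) * width + (x + kx)];

    /* Divide the sum by the normalization constant. */
    return sum / norm;
}

For a color image, the same calculation would simply be run separately on the red, green, and blue values of each pixel.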

That is the entire process of digital filtering. There is an alternate way to write the kernel; it can be expressed like this as well:

1/45 2/45 3/45
4/45 5/45 6/45
7/45 8/45 9/45
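
Stored in code, that pre-divided form might look like this (again just a sketch; the array name is mine):

/* The same kernel with the normalization constant divided in up front;
   applying it needs no final division. */
float kernel_normalized[3][3] = {
    { 1.0f / 45.0f, 2.0f / 45.0f, 3.0f / 45.0f },
    { 4.0f / 45.0f, 5.0f / 45.0f, 6.0f / 45.0f },
    { 7.0f / 45.0f, 8.0f / 45.0f, 9.0f / 45.0f }
};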

However, for the purposes of this article, I'll keep the kernel in its original format for clarity's sake.

This is a relatively simple process. However, there are some rules. It is important to note that these rules can (and sometimes must) be bent, but they provide a nice place to start; a sketch that puts all four into code follows the list.

  1. The sum of the 3x3 square should be equal to the normalization constant. A nice rule - too bad it isn't always true. Sometimes some of the numbers in the 3x3 square are negative, so it is possible (and sometimes necessary) for the sum to be zero, and a computer can't divide by zero and get a real number. So: the sum of the 3x3 square should equal the normalization constant, unless the sum is zero, in which case the normalization constant is set to one.
  2. A negative result is thrown out and replaced with zero. This is just what I do - other methods include taking the absolute value of the result, using the original pixel, adding the negative number to 256 (256 being the number of possible values per channel, so the result comes out below 256), or other crazy stuff. So: a negative result is thrown out and replaced with zero, unless you do something else. In most of the kernels I use, it is beneficial to throw out the negative values; I'll get to that later.
  3. Don't calculate the pixels that lie on the edge of the image. This means you don't calculate the leftmost, rightmost, topmost, or bottommost pixels. Why not? They don't have neighbors on every side (think about it - the pixels along the top have no neighboring pixels above them). There are ways around this, such as wrapping the coordinates - in an 800x600 image, a request for pixel <801,600> uses <1,600> instead, grabbing the pixel from the other side. For simplicity, I don't calculate the edge pixels, though it may be beneficial to do so. So: don't calculate the pixels that lie on the edge of the image unless you can deal with them.
  4. Assume 3x3 kernels. OK, I almost never follow this in real convolution applications. Larger kernels provide less of an effect, and kernels can be other shapes (2x7, 7x4, 9x9 - they aren't limited to squares), but the kernel must have a center, so 2x2 and 4x6 would not work; avoid even dimensions at all costs. However, for clarity's sake, I'll at least start out with 3x3 kernels.
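
To show how the rules fit together, here is a sketch of a complete pass over the image, building on the filter_pixel() sketch from earlier. The buffer layout is the same assumption as before, and src and dst are separate buffers so that pixels which have already been filtered don't feed back into the sums.

/* A sketch of a full pass that follows the four rules above. */
void filter_image(const unsigned char *src, unsigned char *dst,
                  int width, int height, const int kernel[3][3])
{
    /* Rule 1: the normalization constant is the sum of the coefficients,
       unless that sum is zero, in which case it becomes one. */
    int norm = 0;
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++)
            norm += kernel[i][j];
    if (norm == 0)
        norm = 1;

    /* Rule 3: skip the border pixels, which lack a full set of neighbors. */
    for (int y = 1; y < height - 1; y++)
    {
        for (int x = 1; x < width - 1; x++)
        {
            int value = filter_pixel(src, width, x, y, kernel, norm);

            /* Rule 2: throw out negative results (and clamp the high end,
               in case the coefficients add up to more than the constant). */
            if (value < 0)   value = 0;
            if (value > 255) value = 255;

            dst[y * width + x] = (unsigned char)value;
        }
    }
}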

There, that's not so bad. I've outlined the process; now let's get down and dirty with some source code!




Next : Code