Image compression using the Haar wavelet transform

0
628

The wavelet transform is a relatively new arrival on the mathematical scene. We give a brief introduction to the subject by showing how the Haar wavelet transform allows information to be encoded according to \levels of detail.” In one sense, this parallels the way in which we often process information in our everyday lives. Our investigations enable us to give two interesting applications of wavelet methods to digital images: compression and progressive transmission. The mathematical prerequesites will be kept to a minimum; indeed, the main concepts can be understood in terms of addition, subtraction and division by two. We also present a linear algebra implementation of the Haar wavelet transform, and mention important recent generalizations. This material is suitable for undergraduate research by both mathematics and computer science majors, and has been used successfully in this way at Spelman College since 1995. 1 So much information In the real world, we are constantly faced with having to decide how much information to absorb or impart. Example 1: An election is impending, and you and your best friend have yet to decide for whom you will cast your votes. Your friend is content to base her decision solely on the political party a liations of the available candidates, whereas you take the trouble to nd out how the candidates stand on a variety of issues, and to investigate their records for integrity, e ectiveness, and vision. Example 2: Your younger brother is across the room, working on a project for Black History Month. From where you are sitting, you can see that he is putting a caption on a photograph. \Who is in that picture?” you ask. \A person sitting in a bus,” he says evasively. \I can see that,” you reply impatiently, \But who is it?” \A woman,” he says, grinning impishly. \Is that all you are going to tell me?” you say in exasperation, getting up. \A young woman,” he volunteers. Noticing the look of displeasure on your face, he quickly adds, \It’s Rosa Parks.” What these examples all have in common is information transmission at various levels of detail. In the rst example, each person received information up to, but not beyond, a certain level of detail. The second example illustrates progressive information transmission: We start with a crude approximation, and over time more and more details are added in, until nally a \full picture” emerges. Figure 1 illustrates this pictorially: looking from left to right we see progressively more recognizable images of Rosa Parks. Figure 1: Person in bus Woman in bus Young woman in bus Rosa Parks in bus Dr. Colm Mulcahy is Associate Professor of Mathematics at Spelman College, where he has taught since 1988. In recent years, his mathematical interests have broadened to include computational and visualization issues, such as computer aided geometric design (CAGD), computer graphics, image processing, and wavelets, and he has directed undergraduate student research in all of these topics. 22 / Spelman Science and Math Journal There are two observations worth making here. Firstly, there are circumstances under which any one of these approximations would be adequate for our immediate purposes. For instance, viewed from a su cient distance, they all look the same. Thus, if one of these was to form a small part of a much larger picture, say a photo on a mantlepiece, or a eeting image in a video, there would be no need to display a high quality version. Secondly, progressive transmission of a sequence of ever-improving approximations to the \real picture” is natural: it’s the way many of us relay information, and the way we learn new subjects. It’s also the way the popular Netscape World Wide Web browser delivers images to users of the web: when we call up a URL (WWW address) which contains an image, that image appears in installments, starting with an approximation and working up to the nal complete image. All forms of progressive information transmission have one major advantage: the receiver can halt the process and move onto something else if she decides, based on early information, that she does not want \the whole picture.” This applies equally to learning about a candidate for election, listening to people recount their vacation experiences, or grabbing an image on the World Wide Web using Netscape. Wavelets provide a mathematical way of encoding numerical information (data) in such a way that it is layered according to level of detail. This layering not only facilitates the progressive data transmission mentioned above, but also approximations at various intermediate stages. The point is that these approximations can be stored using a lot less space than the original data, and in situations where space is tight, this data compression is well worthwhile. 2 The wavelet transform In this section, we introduce the simplest wavelet transform, the so-called Haar wavelet transform, and explain how it can be used to produce images like the rst three in Figure 1, given the last, complete image of Rosa Parks (this image was extracted from a .gif image le downloaded from the World Wide Web.) Matlab numerical and visualization software was used to perform all of the calculations and generate and display all of the pictures in this manuscript. Each of the digital images in Figure 1 is represented mathematically by a 128 128 matrix (array) of numbers, ranging from 0 (representing black) to some positive whole number (representing white). The last image uses 32 = 2 di erent shades of gray, and as such is said to be a 5-bit image. The numbers in the speci c matrix we used to represent this image range from 0 to 1984, in 31 increments of 64 (these exact numbers are not important; they were chosen to avoid fractions in certain calculations later on). Each matrix entry gives rise to a small square which is shaded a constant gray level according to its numerical value. We refer to these little squares as pixels; they are more noticable as individual squares when we view the images on a larger scale, such as in Figure 2(a). Given enough of them on a given region of paper, as in the 256 256 pixel 8-bit image of Nelson Mandela in Figure 2(b), we experience the illusion of a continuously shaded photograph. Figure 2: Rosa Parks (1955) and Nelson Mandela (1990) The matrices which specify these images have 128 = 16; 384 and 256 = 65; 536 elements, respectively. This fact poses immediate storage problems. Color images are even bigger, though in one sense they can be handled by decomposing into three \grayscale-like” arrays, one for each of the colors red, green and blue. A standard 1.44MB high density oppy disc can only accommodate a handful of large (say, 1024 1024 pixel) high quality color images; furthermore, as many of us Review Articles / 23 can attest from personal experience, such images are very time consuming to download on the World Wide Web.