Progress Accomplished Towards the Intended Goal of Accomplishing Adaptive Handwriting Recognition and Identification

Sam Boling
William Laub
AiSC 2004-2005
Team 025
Final Report

I. INTRODUCTION

We are Sam Boling and William Laub. We represent Manzano and Eldorado High Schools and Team 025 in the Adventures in Supercomputing Challenge. Our goal in this project was to construct a program capable of reading and identifying uniform text samples, such as scanned images of printed text. Though we did not accomplish our goal, we developed a rather functional class for managing and manipulating bitmaps. It can read, write, and binarize an image, change the number of colors in it, shift the red, green and blue values by a specified amount, and perform various other operations. This report is intended to serve as thorough documentation for this class and its manipulation functions.

II. STRUCTURES AND DATA MEMBERS
The bitmap.h header file contains declarations for the bitmap class's data members and member functions. Its include statements link it to cstdio, used for standard input and output, cstdlib, the standard general utilities header for C++, and header files containing the structures used to hold bitmap data. These structures are for the file header, the information header, and the red, green and blue data.

bmFileHead.h defines bmFileHead, a class which holds some very basic information about the file and comes at the top of each bitmap file. bmFileHead’s members are bmType, filesize, reserved1, reserved2, and offset. The class also defines a constructor and a destructor. bmType specifies the type of bitmap stored in the file. The file format requires that the first two bytes be equal to “BM.” The filesize member declares the width of Jupiter in terms of adult male badgers (it actually declares the size of the file in bytes). The reserved members always need to be zero: the format sets these bytes aside between filesize and offset. The offset member contains the distance in bytes from the beginning of the file to the beginning of the data.

bmInfoHead contains further information on the file. The information in bmInfoHead’s members pertains to the data and specifications of the image rather than to the file itself. Its members are infoHeadLength, which specifies the length in bytes of the information header, width and height, which do the obvious, planes, which contains the number of planes in the image (always 1 for no discernible reason), bitsPerPixel, compressionType, dataLength, dataXPPM and dataYPPM, which hold the data’s X and Y pixels per meter, colorQuant, the number of colors the bitmap can contain, and importantQuant, the number of colors which are considered important.

rgbQuad has few members. They are r, g, b, and reserved. An array of rgbQuads is declared to contain the color data of the file. The first three members contain values from 0 to 255 to represent the individual bytes of color data. The reserved member serves as a spacer between pixels and is always 0.

The class in bitmap.h itself declares a bmFileHead (FileHeader), a bmInfoHead (InfoHeader), a color array (colors) and a data array (bits). The header file also declares several member functions. Among them are the functions which retrieve relevant private data members. Additionally, the header declares functions which alter the bitmap in some way. These are binarize(), invert(), and colorshift (int shiftr, int shiftg, int shiftb). There is also the function checkbw (int xpos, int ypos), which determines whether the pixel at the given position is black or white. Finally, the header contains declarations for openBitmap (const char* filename) and saveBitmap (const char* filename).
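As a rough illustration of the layout described in this section, the three structures might be sketched as follows. The field widths are assumptions taken from the common Windows bitmap layout rather than from bmFileHead.h itself, plain structs stand in for the classes (constructors and destructors are omitted), and the helper hasBitmapSignature is hypothetical.

```cpp
#include <cstdint>

// File header: sits at the very top of the bitmap file.
// Field widths are assumed, not taken from the project's headers.
struct bmFileHead {
    std::uint16_t bmType;     // the two signature bytes "BM"
    std::uint32_t filesize;   // total size of the file
    std::uint16_t reserved1;  // zero
    std::uint16_t reserved2;  // zero
    std::uint32_t offset;     // bytes from start of file to the data
};

// Information header: describes the image rather than the file.
struct bmInfoHead {
    std::uint32_t infoHeadLength;   // length in bytes of this header
    std::int32_t  width;
    std::int32_t  height;
    std::uint16_t planes;
    std::uint16_t bitsPerPixel;
    std::uint32_t compressionType;
    std::uint32_t dataLength;
    std::int32_t  dataXPPM;         // horizontal pixels per meter
    std::int32_t  dataYPPM;         // vertical pixels per meter
    std::uint32_t colorQuant;
    std::uint32_t importantQuant;
};

// One color entry: three color bytes plus a spacer byte.
struct rgbQuad {
    std::uint8_t r;
    std::uint8_t g;
    std::uint8_t b;
    std::uint8_t reserved;  // always 0
};

// Hypothetical helper: "BM" read as a little-endian 16-bit value is 0x4D42.
bool hasBitmapSignature(const bmFileHead& head) {
    return head.bmType == 0x4D42;
}
```

Note that a compiler may insert padding after bmType in memory, so a reader built on this sketch would be safer reading the header one field at a time than reading the whole struct in a single call.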

III. THEORY

a. Binarization
The first step in preparing a handwriting sample for analysis and identification is binarization. Binarization changes all the data from a combination of three values ranging from 0 to 255 to 255, 255, and 255 for white or 0, 0, and 0 for black. This is done by taking the average of the three color values and making the pixel black if the average is less than 128, or white if the average is 128 or greater. Black is interpreted as a 1 and white is interpreted as a 0. This guy got binarized. Look how happy he is! It looks like our incredible thinning function, anorexia(), even thinned him. Call today and order YOUR treatment in a dozen payments of $19.95!

Before                                                      After
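The averaging rule can be sketched as below. This is a minimal illustration, not the class's actual binarize(): it works on a free-standing vector of pixels, the rgbQuad here is a stand-in with assumed 8-bit members, the helper isDark is hypothetical, and an average of exactly 128 is treated as white, matching the "greater than 127" test described in the Implementation section.

```cpp
#include <cstdint>
#include <vector>

// Stand-in for the project's rgbQuad (member types are assumed).
struct rgbQuad {
    std::uint8_t r, g, b, reserved;
};

// Hypothetical helper: true if the pixel's average color is below 128.
bool isDark(const rgbQuad& p) {
    const int average = (p.r + p.g + p.b) / 3;
    return average < 128;
}

// Sets every pixel to pure black (0, 0, 0) or pure white (255, 255, 255).
void binarize(std::vector<rgbQuad>& pixels) {
    for (rgbQuad& p : pixels) {
        const std::uint8_t value = isDark(p) ? 0 : 255;
        p.r = p.g = p.b = value;
    }
}
```

For instance, a pixel of 253, 128, 156 averages 179 and becomes white, while a pixel of 10, 20, 30 averages 20 and becomes black.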

b. Thinning

This was our incomplete function. Thinning is the process of reducing a large group of pixels to a single, one-pixel-wide line. A thinned handwriting sample is like a spatial average of the data. That "average" can be used in a self-organizing map to find the best-fit thinned sample. Below is a link to a movie (windows media player) that shows the thinning process.

c. colorshift

Are you tired of your boring tan to brown complexion? Is your permanently binarized body not what you expected? WE THOUGHT SO! That's why we wrote colorshift(). For example, we used an image with color values of 253, 128, and 156 and shifted the red value by 24. Because 253 plus 24 exceeds 255, an impossible value for red, the shift was applied as -24, so the resulting red value was 229. The function takes three values, shiftr, shiftg, and shiftb, adds them to any color value that won't become impossible, and subtracts them from those that would. Thus the result of 253, 128, 156 with a shift of 24, 48, 96 would be 229 (253-24), 176 (128+48), and 252 (156+96). We performed a 10, 40, 90 LSDcolor shift on the previous image and received the following results.


Before                                                      After
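The add-or-subtract rule can be sketched as below. This is only an illustration under stated assumptions: the rgbQuad is a stand-in with assumed 8-bit members, the per-channel helper shiftChannel is hypothetical, the function works on a free-standing vector rather than on the class's own data, and shifts are assumed to lie between 0 and 128, a range in which the subtracted result cannot fall below 0.

```cpp
#include <cstdint>
#include <vector>

// Stand-in for the project's rgbQuad (member types are assumed).
struct rgbQuad {
    std::uint8_t r, g, b, reserved;
};

// Hypothetical helper: add the shift if the result fits in 0..255,
// otherwise subtract it. Assumes 0 <= shift <= 128.
std::uint8_t shiftChannel(std::uint8_t value, int shift) {
    const int raised = value + shift;
    const int result = (raised > 255) ? value - shift : raised;
    return static_cast<std::uint8_t>(result);
}

// Applies the three shifts to every pixel.
void colorshift(std::vector<rgbQuad>& pixels, int shiftr, int shiftg, int shiftb) {
    for (rgbQuad& p : pixels) {
        p.r = shiftChannel(p.r, shiftr);
        p.g = shiftChannel(p.g, shiftg);
        p.b = shiftChannel(p.b, shiftb);
    }
}
```

With the example from the text, a pixel of 253, 128, 156 shifted by 24, 48, 96 comes out as 229, 176, 252.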

d. changeClrs

The changeClrs function returns true or false based on the number that it is given. It was originally intended to binarize the bitmap by changing the bits per pixel to 1, but that didn't work, so after the binarize function was written, it was adapted to binarize the image for an input of 1 and to do nothing for any other input.

e. invert

The invert function reverses the black and white in a binarized image. It is similar to the binarize function except that it makes white pixels black and black pixels white. It is important to use an "else if" statement in the second part of the function, which converts black to white. Until recently, our untested function did not include the "else," so it first converted white to black, leaving the entire image black, and then converted all of that black to white, leaving the entire image white. This problem has been corrected and the result is shown below.


Before                                                      After
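The role of the "else if" can be seen in a minimal sketch like the one below. It is an illustration rather than the class's actual invert(): the rgbQuad is a stand-in with assumed 8-bit members, the function works on a free-standing vector, and the image is assumed to be binarized already, so only the red value needs to be tested.

```cpp
#include <cstdint>
#include <vector>

// Stand-in for the project's rgbQuad (member types are assumed).
struct rgbQuad {
    std::uint8_t r, g, b, reserved;
};

// Swaps black and white in an already binarized image.
void invert(std::vector<rgbQuad>& pixels) {
    for (rgbQuad& p : pixels) {
        if (p.r == 255) {
            // White pixel becomes black.
            p.r = p.g = p.b = 0;
        } else if (p.r == 0) {
            // Black pixel becomes white. Without the "else", a pixel that
            // was just turned black above would be turned white again here.
            p.r = p.g = p.b = 255;
        }
    }
}
```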

f. The Coordinate System

The coordinate system is an important component of the thinning function. In order to thin, it is necessary to have data about the surrounding pixels. The coordinates of a pixel (using the World Coordinate System) are converted to a position in the data by the go function. The go function takes two variables, xpos and ypos. xpos is the x position and ypos is the y position. The position in the data is found by multiplying ypos and the width and then subtracting xpos.

g. checkBW

It is also necessary to know whether the pixel is black or white. This is done by the checkBW function. The checkBW function takes xpos and ypos and then uses the go function to get the position of the pixel. It returns true if the pixel is black and false if the pixel is white.

IV. IMPLEMENTATION

The functions openBitmap and saveBitmap are fairly simple: they read or write the headers, then loop dataLength times to read or write the data. Nor is checkbw particularly interesting: it checks whether the image is binarized, binarizes it if it is not, locates the pixel’s representation in the data, and returns true on a value of 0 or false on 255.

The functions of interest are those which manipulate the images. Among these, some are less complex than others. The binarize function averages each pixel’s red, green and blue values, checks whether the average is less than 128 or greater than 127, and sets the values to black or white accordingly. It also sets binarized, a Boolean member of the class, to true so that other functions can check whether it has been called.

The invert function checks for binarization, binarizes if that has not been done already, and then checks all red values in a loop. If the red value of a pixel in a binarized image is 255, the pixel must be white. If it is 0, it must be black. The invert function checks these and then sets each value for the pixel to the opposite value.

The colorshift function takes numbers by which to shift each pixel’s red, green and blue values, and increases each value by the corresponding number if possible, or decreases it by that number if increasing it would cause it to exceed 255. The function is written so that it is impossible for a value to be less than 0 unless the function is given a number greater than 255.

We also began work on a thinning function which is far from fully functional. The intended rules for this algorithm (which checked each pixel individually for its quantity of edge points) were as follows.
• If the point has no neighbors, remove the edge point.
• If the point has one neighbor, find the neighbor which is closest and close the distance.
• If the point has two neighbors, there are three possible situations. If the point is next to two pixels which form a section of a diagonal line, remove it. If the point is “sticking out” of a straight line, check its distance from the line; if moving it would reduce that distance, move the point. Otherwise, the point is a valid point on a line and does not require adjustment.
• If the point has more than two neighbors and is not a link between multiple lines, remove the point.
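
All four rules start from the number of neighbors a point has. A minimal sketch of that count is shown below, assuming that a neighbor is any black pixel among the eight surrounding positions; the report does not say whether four or eight positions were intended. The BinaryImage type, its row-by-row storage, and the countNeighbors name are all hypothetical and are not taken from the bitmap class.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical binary image: true = black (1), false = white (0),
// stored row by row. Positions outside the image count as white.
struct BinaryImage {
    int width;
    int height;
    std::vector<bool> black;

    bool at(int x, int y) const {
        if (x < 0 || y < 0 || x >= width || y >= height) {
            return false;
        }
        return black[static_cast<std::size_t>(y) * width + x];
    }
};

// Counts the black pixels among the eight positions around (x, y).
int countNeighbors(const BinaryImage& img, int x, int y) {
    int count = 0;
    for (int dy = -1; dy <= 1; ++dy) {
        for (int dx = -1; dx <= 1; ++dx) {
            if (dx == 0 && dy == 0) {
                continue;  // skip the point itself
            }
            if (img.at(x + dx, y + dy)) {
                ++count;
            }
        }
    }
    return count;
}
```

A thinning pass built on this sketch would call countNeighbors for each black pixel and then pick the matching rule from the list above.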

V. WHAT NEEDS TO BE ACCOMPLISHED

The thinning function needs to be completed. After that, a self-organizing map (SOM) function needs to be written to compare input data with images of optimal letters. This would allow the program to identify the letters in a fairly uniform image.