The first step in creating our voice analysis program was to decide which file format was most suitable for
comparing audio signals. Initially we believed we would be comparing audio files directly, but after gaining a better
grasp of Fast Fourier Transforms (FFTs) and the mathematical side of our problem, we realized that what we most
needed was a spectrogram expressed as text (see Appendix A). By using a fairly complex mathematical formulation
(see Appendix A) and a set of 500,000 points, we believed that the program could achieve a high recognition rate.
Having decided how we wanted to solve our problem, we began gathering the resources necessary to complete it.
The first problem we tackled was analyzing an audio file and producing a list of points for it (see Appendix A).
Writing such a program from the ground up would have been immensely time consuming and would have required
extensive digital-audio expertise, so for the analysis stage of the project we used a program written by Richard
Horne, with his permission. This program analyzes an audio file in WAV format and produces a spectrographic
diagram of the audio pattern. We then used the program to analyze the spectrogram and create a large text file
containing the Time, Amplitude, Frequency, and Phase points of the WAV file in four columns.
In accordance with our plan, we wrote an original program that takes the text data produced by the spectrogram
program, one column at a time, and substitutes it into the FFT equation that best suited the application (see
Appendix C). Using the vi UNIX text editor, our team also wrote a program that splits the large initial text file
into four separate files, containing the Time, Amplitude, Frequency, and Phase points respectively (see Appendix B).
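
The splitting step can be illustrated with a minimal sketch. It is not the program listed in Appendix B. It assumes
that the four columns are separated by whitespace and hold plain decimal numbers; the function name split_columns
and the output precision are hypothetical choices made for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Split a whitespace-separated four-column text file (assumed layout:
 * time, amplitude, frequency, phase) into four single-column files.
 * Returns the number of rows written, or -1 if a file cannot be opened. */
long split_columns(const char *in_path, const char *out_paths[4])
{
    FILE *in;
    FILE *out[4] = {NULL, NULL, NULL, NULL};
    double v[4];
    long rows = 0;
    int i, ok;

    assert(in_path != NULL && out_paths != NULL);

    in = fopen(in_path, "r");
    ok = (in != NULL);

    for (i = 0; ok && i < 4; i++) {
        out[i] = fopen(out_paths[i], "w");
        if (out[i] == NULL)
            ok = 0;
    }

    /* Stop at end of file or at the first row that is not four numbers. */
    while (ok && fscanf(in, "%lf %lf %lf %lf",
                        &v[0], &v[1], &v[2], &v[3]) == 4) {
        for (i = 0; i < 4; i++)
            fprintf(out[i], "%.10g\n", v[i]);
        rows++;
    }

    for (i = 0; i < 4; i++)
        if (out[i] != NULL)
            fclose(out[i]);
    if (in != NULL)
        fclose(in);

    return ok ? rows : -1;
}
```

Writing one value per line keeps each output file readable by the same kind of loop that reads the original file.
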
The program that analyzes the spectrographic points, written in C, substitutes the first 120,000 sets of four
points from a user-designated file into our FFT equation. Each set of four points (time, amplitude, frequency, and
phase) is evaluated in the equation, and the resulting value is added to a floating-point variable. This loop runs
until it reaches the end of the file or the 120,000th point. Performing this task on a PC or comparably slow machine
would have taken between four and eight minutes, while the same task takes less than one second on the
supercomputer. The variable holding the accumulated results of the FFT equation is then divided by the number of
entries in the file or by 120,000, whichever is smaller, giving the average summation of the voice that the text
file represents.
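
The accumulate-and-average loop described above might look roughly like the following sketch. The FFT equation
itself is given in Appendix C and is not reproduced here: placeholder_term is a hypothetical stand-in that simply
adds its four inputs, and it is passed in as a function pointer so that the real equation could be substituted. The
names average_summation and MAX_POINTS, and the use of double rather than float, are likewise assumptions.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_POINTS 120000L  /* upper limit on the number of sets read */

/* Signature for the per-set equation; the real one is in Appendix C. */
typedef double (*point_fn)(double t, double amp, double freq, double phase);

/* Hypothetical stand-in, NOT the authors' FFT equation. */
double placeholder_term(double t, double amp, double freq, double phase)
{
    return t + amp + freq + phase;
}

/* Read up to MAX_POINTS sets of four values from `in`, add f(...) of each
 * set to a running sum, and store sum / count in *avg_out.
 * Returns 0 on success, or -1 if no complete set could be read. */
int average_summation(FILE *in, point_fn f, double *avg_out)
{
    double t, amp, freq, phase;
    double sum = 0.0;
    long n = 0;

    assert(in != NULL && f != NULL && avg_out != NULL);

    while (n < MAX_POINTS &&
           fscanf(in, "%lf %lf %lf %lf", &t, &amp, &freq, &phase) == 4) {
        sum += f(t, amp, freq, phase);
        n++;
    }

    if (n == 0)
        return -1;  /* avoid dividing by zero on an empty file */

    /* n is the number of entries in the file or 120,000, whichever is smaller. */
    *avg_out = sum / (double)n;
    return 0;
}
```

Counting the sets as they are read means the divisor never has to be computed separately.
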
After writing the program, we created a text-point file for each of our voices and ran it through the program. For
each voice, we then created a new text file containing its average summation. We added several lines of code to our
analysis program to compare the average summation of the file being tested with the summations stored in the text
files we created. If two average summations are equal to within six decimal places, the program prints the name of
the person who has been identified.
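
The comparison step can be sketched as follows. The sketch interprets "equal within six decimal places" as an
absolute difference smaller than 0.000001, which is an assumption, as are the struct voice_ref layout and the
function name identify_speaker. The stored averages are passed in as an array rather than read from the text files.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed reading of "equal within six decimal places". */
#define MATCH_TOLERANCE 1e-6

/* One stored reference: a person's name and that person's average summation. */
struct voice_ref {
    const char *name;
    double average;
};

/* Return the name of the first reference whose average summation is within
 * MATCH_TOLERANCE of `measured`, or NULL if no reference matches. */
const char *identify_speaker(double measured,
                             const struct voice_ref *refs, size_t count)
{
    size_t i;

    assert(refs != NULL || count == 0);

    for (i = 0; i < count; i++) {
        double diff = measured - refs[i].average;
        if (diff < 0.0)
            diff = -diff;  /* absolute difference */
        if (diff < MATCH_TOLERANCE)
            return refs[i].name;
    }
    return NULL;
}
```

Comparing against a tolerance, rather than testing the two floating-point values for exact equality, avoids
rejecting a match because of rounding in the last digits.
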