July 14, 2005

SPEECH RECOGNITION

Well, I was looking for a new project, and decided I would use several PIC microcontrollers connected together. There would be some kind of display to show menus to the user. But the main thing would be that the computer would talk to the user, and the user would reply by voice.

So I have already mastered using the Chipcorder with which the computer can voice a variety of messages. But I also want the computer to be able to recognize "yes", "no" and maybe up to a dozen words and numbers.

So what is SPEECH RECOGNITION?

Well, it entails accepting words spoken into a microphone, which are then processed electronically. The analog sounds can be amplified and compared with a previously recorded template of some kind. One approach is to identify phonemes, of which there are about 40 in the English language. Phonemes are the basic vowel and consonant sounds, like oo, ah, oh, ee, uh and k, t, p, ch, z, s. Now I suspect my best tool for speech recognition will be DSP (digital signal processing). This involves changing the analog sounds to digitized form. It has been found that analog values like temperature, pressure and voltage can be processed much more effectively once they have been digitized. One computer byte of data can hold a value from 0 to 255. So 2.5 volts on a scale of 5 volts might become a byte value of 127. After computer processing, the values can be changed back to regular voltage values.
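To make the voltage-to-byte idea concrete, here is a small sketch in Python on a desktop PC, not code for the PIC itself. The 5 volt reference and the function names are my own assumptions for illustration, and a real A/D converter may round slightly differently.

```python
V_REF = 5.0      # assumed full-scale reference voltage, in volts
MAX_BYTE = 255   # largest value one byte can hold

def volts_to_byte(volts):
    """Scale a voltage in the range 0..V_REF to a byte value 0..255."""
    clamped = min(max(volts, 0.0), V_REF)   # keep out-of-range inputs on the scale
    return int(clamped / V_REF * MAX_BYTE)  # int() drops the fraction, so 127.5 becomes 127

def byte_to_volts(value):
    """Convert a byte value back to an approximate voltage."""
    return value * V_REF / MAX_BYTE

half_scale = volts_to_byte(2.5)          # the 2.5 volt example from the text
restored = byte_to_volts(half_scale)     # slightly under 2.5 volts
```

The restored voltage is not exactly 2.5 volts, because the fraction thrown away during conversion cannot be recovered. That small loss is the price of fitting the value into a single byte.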

So with analog sounds, we can digitize by taking numerous samples of the familiar waveforms. Sometimes it is hard to believe that the sounds of an entire symphony orchestra can be carried in a single complex analog waveform. This sound can evidently be efficiently sampled and digitized, as we realize when we listen to music on our iPod.
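As a rough picture of what sampling means, this Python sketch takes evenly spaced byte samples of a pure sine tone. The tone frequency, the 8000 samples per second rate and the function name are just assumptions for the example. Real speech is far messier than a single tone.

```python
import math

def sample_tone(freq_hz, sample_rate_hz, count):
    """Return `count` byte samples of a sine tone, centered on the value 128."""
    samples = []
    for n in range(count):
        t = n / sample_rate_hz                        # time of this sample, in seconds
        level = math.sin(2 * math.pi * freq_hz * t)   # waveform height, -1.0 to 1.0
        samples.append(round(128 + 127 * level))      # shift and scale into 1..255
    return samples

# One full cycle of a 1000 Hz tone sampled 8000 times per second (assumed values).
samples = sample_tone(1000, 8000, 8)
```

The more samples taken per second, the more closely the list of bytes follows the original waveform.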

I plan to experiment using my PIC microcontroller to sample sound signatures and find ways of comparing the digitized templates. It may even come down to pattern recognition, since you can hardly expect to find exact matches. But it will be fun seeing what I find out.
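One simple way to compare a digitized sound with stored templates when exact matches are unlikely is to add up how far apart the samples are and pick the template with the smallest total. The Python sketch below shows that idea only. The sample values are made up, the function names are hypothetical, and it assumes every recording has the same length and starts at the same moment, which real speech will not.

```python
def distance(sample, template):
    """Sum of absolute differences between two equal-length byte sequences."""
    if len(sample) != len(template):
        raise ValueError("sequences must have the same length")
    return sum(abs(a - b) for a, b in zip(sample, template))

def best_match(sample, templates):
    """Return the (word, distance) pair for the closest stored template."""
    scored = ((word, distance(sample, t)) for word, t in templates.items())
    return min(scored, key=lambda pair: pair[1])

# Made-up templates and a made-up new recording, for illustration only.
templates = {
    "yes": [128, 200, 90, 40, 128, 180],
    "no": [128, 150, 160, 150, 128, 110],
}
heard = [130, 195, 95, 45, 125, 175]

word, score = best_match(heard, templates)   # closest template and its distance
```

A smaller distance means a closer match, so a practical version would probably also need a cutoff above which the word is rejected as unrecognized.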

I have already found there is a 40-pin speech recognition chip for about $10. I may eventually wind up with something like that, but even setting that up will be very challenging as far as hardware and software implementation.

I'll let everyone know if I am having fun with this next project.

Posted by larrykeegan at July 14, 2005 09:43 PM