By: Giannina Lodato Rakoczi
Finding a voice recognizable to readers is tricky. Finding a voice recognizable to a computer is even trickier. After a 25-year fight against the assault of multiple sclerosis, my hands can no longer type an entire document. In order to continue writing, I must rely on voice recognition technology to do my typing for me.
Until now, my husband has been my voice recognition machine. But he talks back and I don't appreciate that. Besides, his spelling is atrocious – living proof one doesn't have to know English well to make money in Silicon Valley. I look forward to finding a voice that doesn't give me any lip.
Given the condition of my health, I am the perfect guinea pig for both the bio-tech and the high-tech industries. Because I spend so much of the day in a wheelchair, I sit around a lot. When my hands worked fine, I was able to type at my computer and produce some quality documents. Now that my hands no longer work, I must try the newest high-technology gizmos to produce documents for me: voice recognition technology and head movement control of the mouse.
Both technologies require a headset: voice recognition for its microphone, and mouse control for the gyroscope that sits atop my head. In order to dictate text through voice recognition and manipulate it in my word processor, I must wear two separate headsets. Aside from looking like an alien, I am weighed down by the bulkiness of two headsets, one piled on top of the other.
When I sit in front of my computer, I feel completely naked unless I have my headset for voice recognition sitting on my head. Once connected to my new technology, I feel connected to the world via e-mail and word processing. Really, what the two headset technologies I have now do for me is expand my frustration and therefore my patience. Both technologies are in their infancy, yet I am lucky there is something to work with.
The microphone into which I dictate sits right in front of my mouth, jutting out from a headset with one earphone. The microphone is so sensitive that it even translates a heavy sigh into “a,” “of,” “the,” or “what.” A loud sneeze (“achoo”) from my husband in a room nearby inspires the computer to type “aha.” Otherwise, if I control my breathing, monitor the whereabouts of my allergy-prone husband and enunciate clearly, the computer usually understands my words perfectly on only the second attempt.
Voice recognition software has two modes of operation: dictate mode and command mode. Dictate mode is the usual method people use when speaking into the microphone. However, it is often necessary to access command mode to make changes in a document, for example, to capitalize or spell a word.
To access command mode, one must first use a cue word. In the case of my software, I have programmed the word “computer” into the machine to act as the cue.
All I need to say is, “computer, select right one word,” “computer, capitalize this,” or “computer, move left one word” and the software goes into command mode, then right back into dictate mode. If I say “computer, begin spell,” I have accessed command mode for spelling and am ready to spell. When finished, I say, “computer, return,” and the software automatically switches back into dictate mode.
Lots of word training, trial, error and patience are required when working with voice recognition software. The software recognizes my voice best when I speak in full sentences. Words unique to my writing must be trained. If the software mistakes one of my words for something else, I need only call up the correction window that appears off to the side of my document to pick the correct alternative. Once I choose the correct alternative, the software automatically substitutes it for the mistaken word.
If my software does not provide me with the correct alternative, I break down and just spell the word I want using command mode and the rules of voice recognition spelling. I often resort to spelling just because I lose my patience trying to get the machine to recognize what I have said.
My husband tells me I am like a parent spoiling a child when I fail to teach my new software the right spelling of the words I say. For example, whenever I begin a letter or an e-mail with the salutation “Dear” the computer insists on typing “Der.” Apparently, I need to teach it the proper spelling by calling up the correction window and using the choices it gives me, or by typing the proper spelling.
Writing is different when you dictate your thoughts instead of typing them out. You must have everything organized in your mind before you open your mouth and tell somebody else how to put it down on paper. It is difficult for me to be so organized from the start. It's a new way of thinking.
Sitting in a room nearby, my husband is easily frustrated when he hears me dictate a few words, then stop to change into command mode to correct the words the computer thought it heard me say. He doesn't like the fact that I only dictate a few words at a time.
One day, he got up and placed a piece of paper on my computer screen so I couldn't see what the computer was typing. He told me to say a few sentences at a time in a natural way of speaking. I did it, and after a paragraph or so, I took down the piece of paper and looked at what the computer had typed. Amazingly, the machine understood my words very well and I didn't have too many corrections to make. The lesson is to speak in a normal manner, and perhaps to prepare several sentences on a piece of paper so you can read them into the microphone at a normal pace.
Magically, the new technology knows there are multiple spellings for some words – to, too, two – and gives me choices for spelling in the correction window off to the side of the document. I need only pick the correct spelling and the technology inserts it into the document for me. It's wonderful!
At times, voice recognition even helps me spell. When I wanted to write the word “onslaught,” I did not know if the word was written with an “a” or an “o” in the middle. All I had to do was say the word into the microphone and the computer typed it for me on the screen. I now know the word is written with an “a” and I am sure that is correct because I believe the programmer writing this software used a dictionary when putting in the vocabulary.
I’ve always found typographical errors (typos) amusing, but my new software’s typos take the cake. Among the most comical computer interpretations are:
- eat March for “emerge”
- in edit a bowl for “inevitable”
- not see for “Nazi”
- loss low for my husband’s name, “Laszlo”
- multiple skull roses for “multiple sclerosis”
- HBO sink receives for “idiosyncrasies”
- skits of frantic for “schizophrenic”
I never know if the word I want to use is already in the vocabulary or not. When I wanted to use the word “marzipan,” the computer first gave me “Mars see pan.” After I got over the giggles, I tried again a couple of times and the computer finally typed the right word on the screen. I was shocked to learn “marzipan” was actually in the vocabulary. You never know until you try.
The more I get to know my new software, the less I rely on the typing skills of my Hungarian husband. He never fails to believe that whatever I say in writing can always be better said in HIS words. When I exercise ample patience, voice recognition technology in combination with my husband's editorial input produces written documents I can be proud of. The software has even broadened the scope of my marital situation. Because my husband thinks so much like a computer, my understanding of the new technology contributes to the relationship between my husband and me.
Baby boomer that I am, I must learn new tricks of the trade with new tools of the trade, taking into account my handicap. Adaptability is the key. And now, thanks to high technology, even people with disabilities can find a voice.