Thursday, September 20, 2007

Send In The Goats

One of the most amusing conceits with which voice recognition vendors pitch their software is the acknowledgment that a certain percentage of dictating doctors will never be able to achieve sufficient accuracy with their software. Reports dictated by these doctors -- sarcastically labeled as “goats” by voice recognition’s most ardent supporters -- will still need to be transcribed the old-fashioned way (using real, live medical transcriptionists).

The industry joke is that voice recognition software has gained such momentum that it will outlast all those goats who are still practicing medicine. Theoretically, once the goats retire, everyone in the medical field will be using voice recognition. Even if one clings to another popular theory (that following a nuclear war the only life forms to survive on this planet will be cockroaches, Cher and Tammy Faye), I have a pretty strong hunch that old goats will still be practicing medicine.

My suspicion has nothing to do with a resistance to embracing the wonders of technology or with accepting the fact that many of today’s technophobic physicians will retire. It has a lot more to do with listening to how people speak and acknowledging that their speech reflects a dramatic diminution of language skills. A growing inability to precisely communicate one’s thoughts is a professional handicap which medical transcriptionists encounter on a day-to-day basis.

The problems are not limited to medical terminology and go far beyond the misuse of such terms as “its” versus “it’s” or “there” versus “their.” The problems are not limited to dictating physicians who routinely substitute the plural for the singular of the noun they are using. The problems have to do with steadily rising levels of functional illiteracy in the medical profession. Just as global warming is causing the oceans to rise, the dumbing down of medicine’s educational system is creating a population of physicians whose pathetic language skills are allowed to pass as acceptable in a society that rewards mediocrity and elevates genuine dummies to positions of power.

If you think I’m making a mountain out of a molehill, try this test. Put all partisan passions to the side and think back to last year’s Presidential debates. A great deal of speculation focused on the fact that the public’s lowered expectations for George W. Bush actually helped him make a better showing than originally expected. But if one goes back and examines some quotes from the so-called “education candidate” during the 2000 primary campaign, one sees a clear pattern of functional illiteracy. Take a moment to examine the following statements for grammar and clarity of thought:

  • "Will the highways on the Internet become more few?" (January 30, 2000)

  • "I've changed my style somewhat, as you know. I'm less I pontificate less, although it may be hard to tell it from this show. And I'm more interacting with people." (February 15, 2000)

  • "The senator has got to understand if he's going to have he can't have it both ways. He can't take the high horse and then claim the low road." (February 19, 2000)

  • "Laura and I really don't realize how bright our children is sometimes until we get an objective analysis." (May 10, 2000)

Next steps?

Watch a television program that is closed-captioned for the hearing impaired. I don’t care whether you watch CNN News or some talk show. The translation you see flashing across the screen is being processed by a fairly sophisticated voice recognition program. If you watch closely, you will notice a series of gaffes that make no sense at all.

Then start reading the news as it comes off the wires from Associated Press. You might be surprised to discover typos and grammatical mistakes that should never have gotten past a copy editor. The reason why such mistakes are now appearing on a regular basis in revered publications like The New York Times may well be that too many people are using voice recognition and spell checkers without ever reading the final product to see if it makes any sense.

“Just because everybody wishes voice would succeed does not mean that it will,” states Jean Ichbiah, creator of Instant Text, a popular word expander software program. “When I started working with computers, the hot thing was automatic translation. Solutions would be available in five years. In 30 years, the research on automatic translation has swallowed hundreds of millions of dollars and achieved very little. By dealing with the phonetic level, voice technology quickly reached 95% accuracy. The remaining 5% could take thirty or fifty years. Not 5 years. Not 10 years. That’s because, at anything less than 99.5%, voice recognition is likely to be of limited use -- the time spent correcting may be more than the time spent typing.”

To illustrate his point, Ichbiah stresses that it is the context of a sentence that allows humans to distinguish

1. ‘urine’ from ‘you're in’

2. ‘dilate’ from ‘die late’

3. ‘cauterize’ from ‘caught her eyes’

4. ‘nitrate’ from ‘night rate’

“OCR (optical character recognition), which is a much simpler technology than voice, at least makes it easy to know what to correct. If it produces "xy69@@&&ft" your eye is attracted by the obvious typo,” he explains. “With voice recognition, the choice of seemingly correct words makes it more difficult to spot mistakes.”

Ichbiah is certainly not alone in his thinking. While some medical transcriptionists have embraced voice recognition as a valuable tool, total market acceptance is a long way off. Lately, when I’ve received inquiries from doctors who are looking for transcription services, they seem to be very concerned about whether the work is being done by a voice recognition product or by a professional medical transcriptionist. Why? They’ve already been burned by a software solution that didn’t pass muster. Eager to cut costs on transcription, some physicians were quick to purchase voice recognition software but disappointed with the constraints it put on their dictation style, the amount of focus it required, and the fact that it reduced them to glorified word-processing drones. In the long run, the product wasn’t as cost-effective as they had been led to believe.

Others were lured into using a service that depended on voice recognition but were appalled to see the results delivered to them as a finished product. For them, the poor quality of the final product was not worth the savings. I’m not saying that a happy medium can’t exist. But I think it’s a little bit like trying to cram Brunnhilde through a keyhole.

“We have to have an honest appreciation for how little progress we've made in this area,” stresses programming wunderkind Jaron Lanier. “The reason the Hippocratic oath exists is to place the priority on helping individual people rather than medical science in the abstract. As a physician it would be wrong to choose furthering your agenda of future medicine at the expense of a patient. And yet computer science thinks it's perfectly fine to further its agenda of trying to make computers autonomous at the expense of everyday users.”

In a recent interview published on, Lanier described his frustrations in dealing with the so-called “intelligent” features of Word, PowerPoint and other Microsoft programs. The auto-correcting features of Word would not allow him to type a term he had devised as an abbreviation for “tele-immersion.” As he encountered a similar kind of artistic stonewalling with regard to font size in PowerPoint, Lanier came to believe that “programmers are sacrificing the user in order to have this fantasy that the computers are turning into creatures. These features found their way in not because developers think people want them, but because this idea of making autonomous computers has gotten into their heads.”

“This crazy artificial intelligence philosophy -- which I used to think of as a quirky eccentricity -- has taken over the way people can use English. And we've lost something,” Lanier warns. “If the tools people use to express themselves and do their work are written by people who have a certain ideology, then it is going to bleed through to everyone.... When you have a generation who believes that a computer is an independent entity that's on its way to becoming smarter and smarter, then your design aesthetic shifts so that you further its progress toward that goal. That's a very different design criteria than just making something that's best for the people.”

Where does that leave medical transcriptionists? In a perverse way, it leaves them to defend the English language against the never-ending onslaught of functionally illiterate entrepreneurs who would profit from its abuse. Last year, in response to one of my columns, I received an e-mail promoting a product which had been developed for use in data warehousing situations. Essentially, the software captured a dictator’s speech through an enhanced approach to voice recognition. It would extract key words from the report and parse the information in such a way as to make it valuable to market researchers and number crunchers. When I raised a question about protecting confidentiality, I received a response from the company president which was most unsettling.

As you might have guessed, the author was a physician. A very rich physician who had secured quite a lot of government funding for his product. A very practical physician who knew that many doctors can’t dictate their way out of a paper bag. A very proud physician who saw nothing wrong with his lack of scientific method in approaching the English language. And alas, a very egotistical physician who, by failing to use a spellchecker, had revealed a tragic flaw of Shakespearean proportions. What follows is my response:

“Dear Doctor:

I don't doubt that you and your colleagues spent $10 million on software. Or that it still has ‘some ways to go.’ However, the message you just sent me contains enough spelling errors (including my name) and grammatical mistakes to make it unacceptable as work that must be produced for a professional medical transcription service. We deal with many people -- including physicians -- who may be brilliant healthcare providers but are nevertheless functionally illiterate. Many -- even though they are board certified -- could not get hired for work as an office temp. That's the reality we deal with on a day-to-day basis. Your software may be fantastic. But if the people who are creating this program are writing and using the English language at the level of the message you just sent me, then I would worry about the quality assurance of the documents you intend to submit as patient documentation.”

Proving that he remained unclear on the concept, the doctor staunchly defended his software. He did, however, apologize for misspelling my name.


Shortly after this article appeared in print, the following email exchange took place:

Hello George,

This is Judy Wolf from Stenograph. I just read your article in For The Record, called Send in the Goats. I believe that you were given some slightly incorrect information when you were told that if you watch CNN and other shows, that the translation you see flashing across the screen is being processed by a "fairly sophisticated voice recognition program." I'd like you to know that the 'fairly sophisticated voice recognition program' is actually a human being using a steno (shorthand) machine from Stenograph. Yep, there are court reporters who are contracted to listen to the shows, write their translation of what was said, and funnel it to the TV broadcaster for display as captions. One company with offices in California and Florida does all of the CNN captioning. Others caption the various talk shows and other news stories.

If you'd like to read about it, visit

Best wishes,

Judy Wolf, Product Manager
Stenograph, L.L.C.


Dear Judy:

Thanks for writing and correcting me about the online captioning. My impression of CNN and some other captioned programs was from what I observed while watching them (at the gym and in other situations). Since I once transcribed for court reporters, I'm happy to hear that CSRs are working in this medium instead of pure VR. Still, some of the mistakes you see on the screen could really make you wonder!



Dear George,

Yes, all captioners are not created equal -- skill level, that is.


