AT LAST! COMPUTERS YOU CAN TALK TO

Sick of dealing with keyboards and mice? The decades-old dream of directing computers by spoken commands is rapidly becoming reality at work and at home.
By Gene Bylinsky
Reporter Associate: Alicia Hills Moore

(FORTUNE Magazine) – WHEN Jean Kovacs comes into the office each day, she dons a little headset and greets her computer with a brisk ''Good morning!'' In response, her Sun workstation lights up its screen. ''Start mail!'' commands Kovacs, executive vice president of Qualix Group, a company in San Mateo, California, that markets SayIt, the $295 speech-recognition software package she uses on her machine. The computer obliges, displaying an E-mail message that has arrived via the office network. Kovacs reads it, directs the machine to forward it to a colleague, then asks for her next message. Talking her way through the morning's chores, she schedules appointments and scans sales reports without touching her keyboard or mouse.

The computer can respond to any of 200 commands but doesn't always understand what Kovacs says. So a little cartoonish character named Simon in the corner of the screen provides instant feedback. Whenever the workstation hears a word it can't make out, Simon scratches his head; Kovacs repeats herself and the computer usually gets the word right on the second try. If the telephone rings, Kovacs says, ''Cover your ears!'' and Simon covers his ears; she can then converse without worrying about inadvertently triggering the computer into action.

Dr. Paul H. Klainer, director of emergency services at Milford-Whitinsville Regional Hospital near Boston, also talks to his computer. He and the 12 doctors he supervises share a $40,000 system called VoiceEM, developed by Kurzweil Applied Intelligence of Waltham, Massachusetts. To Klainer the machine, which transcribes spoken comments on patients' conditions, is a clerical miracle: He and his associates used to write their reports by hand. At hospitals where reports previously had to be dictated into tape recorders and transcribed by overworked typists, VoiceEM has brought turnaround time down from days to minutes. VoiceEM also has a built-in ''knowledge base'' of medical data that prompts the doctors to check for symptoms they may have overlooked, thus improving the accuracy of diagnoses and reducing the threat of malpractice suits.

The decades-old dream of directing computers by spoken commands is rapidly becoming reality in workplaces and homes. And not a moment too soon for people who have never mastered a keyboard or mouse, for those who always seem to be doing two things at once, and for those whose physical disabilities make typing difficult -- or impossible. Computer companies -- speech-recognition pioneers like Qualix and Kurzweil as well as the industry giants -- are introducing powerful software and add-on equipment that endow ordinary PCs and workstations with the ability to understand their master's voice. Phone companies, eager to streamline service and cut operator time, have begun integrating speech-recognition equipment into their networks so consumers can converse directly with information and call-routing computers. If you've made a collect call from a pay phone recently, you may have had a computerized voice help make the connection.

In the past year IBM has unveiled four speech-recognition products, ranging from a $129 software package for PCs to Speech Server, a $6,995 program for the company's RISC workstation that can transcribe dictation from as many as eight users simultaneously. In September, Microsoft announced Windows Sound, a $289 package that lets users of ordinary PCs do many of the things Jean Kovacs does on her workstation.
Later this year Apple Computer is expected to introduce Casper, a voice-command system for the Macintosh. No wonder analyst John Oberteuffer, president of Voice Information Associates of Lexington, Massachusetts, expects annual sales of speech-recognition equipment, $159 million in 1992, to quadruple by 1996 and to top $1 billion by the year 2000. Says Nathan P. Myhrvold, vice president for advanced technology and business development at Microsoft: ''Clearly we're reaching a crossover point where speech recognition gets to be mainstream.''

SPEECH-RECOGNITION technology is finally outgrowing the annoying limitations that have confined it mostly to the lab. Early systems had tiny vocabularies; the latest dictation machines, such as Kurzweil's and IBM's, can recognize up to 50,000 words, or ten times the range needed for routine business correspondence. Where once systems required each user to spend many hours reading test words into a mike to ''train'' the machine, it took Jean Kovacs only five seconds per command to train her workstation, and some new systems require no training at all. Most products still suffer from the worst bugaboo of speech recognition: an inability to process continuous speech. For example, users of Speech Server must learn to pause for a fraction of a second after each word they dictate. But even that nuisance will vanish in systems being developed for use in tourist information kiosks and other settings where the vocabulary is predictable and relatively small. Within five years, experts say, the power to recognize continuous speech should extend to large-vocabulary machines.

So rapidly is speech recognition evolving that it has caught up with speech synthesis -- the playback of digitally stored sounds that enables computers to speak. Integrate the two technologies, says Arno Penzias, vice president of research at AT&T Bell Labs, and the world will fill with machines that listen and talk to people: ''The widespread deployment of computers that can converse now has as much to do with limits on our imaginations as it does with limits on the technology itself.'' Among the applications he and other technologists see as already here or on the way:

-- Talking to your phone. Last summer AT&T began phasing in a nationwide service that automates operator assistance. When a caller dials O, he actually gets C -- a computer that asks whether he means to make a collect, person-to-person, or credit-card call, and carries out the instructions he gives. If difficulties arise -- say the machine can't make sense of a person's response -- it will signal a human operator to intervene. Local phone companies, meanwhile, are experimenting with voice-command services. In the New York City suburb of Bay Shore, Long Island, latchkey kids in 400 households can pick up the phone and simply say ''Mom'' to reach their mothers at work. Their speech is processed by a computer at Nynex's central office. In Boise, Idaho, US West is testing a system it calls Voice Interactive Phone, or VIP. By dialing the octothorpe (#) and 44, then saying ''Messages,'' a subscriber can retrieve voice mail. By saying ''Return call,'' the subscriber can order the system to dial the party who phoned most recently. Bell Canada wants to make it easy for customers to check their portfolios: It offers a toll-free line on which you name the stock exchange and the company, and a machine responds with the current stock price.

-- Going beyond keyboards. As every user knows, modern desktop computer software typically can perform far more functions than there are keys on the keyboard. Consequence: Giving a single command may require a dozen keystrokes or mouse movements. But if a computer understands spoken commands, the inconvenience vanishes. Users can also perform two tasks at once: A graphics designer drawing an image with a mouse can simultaneously adjust the color of the image by talking to the machine.

-- Lording it over other gadgets. How about barking out an order from your couch and having your VCR obey? Voice Powered Technology of Canoga Park, California, offers a $169 voice-activated remote control designed to work with most VCRs and cable TV boxes. It lets you channel-hop and start and stop tapes; you can even skip commercials during playback by simply saying ''Zap it.'' The controller causes the VCR to fast-forward past the next 60 seconds of tape. As Apple Computer and other manufacturers prepare to market ''personal digital assistants,'' paperback-size computers intended to make address books, calendars, and laptop computers obsolete, some technologists believe the machines will need speech recognition to succeed. Referring wryly to the difficulty of typing on miniature keyboards, Janet Baker, president of Dragon Systems, a Newton, Massachusetts, speech-recognition pioneer, observes, ''Nobody wants to enter text by toothpick.'' She predicts that at least three years will pass before PDAs pack sufficient speech-recognition capability to achieve widespread use.

-- Dispensing with clerks. Talking information machines will pop up in all sorts of unexpected places, predicts IBM market strategist Elton B. Scherwin Jr. He says hotel and shopping mall operators have expressed intense interest in systems that can answer customers' routine questions without coming across like automatons. An IBM videotape gives a futuristic look at how one might work on a city street. A tourist couple inquires about the location of the nearest Chinese restaurant. The machine, programmed with the knowledge that one is across the street, wickedly asks, ''Have you looked behind you?'' At MIT, researcher Victor Zue has built a demonstration system that can talk about Cambridge, Massachusetts, almost as fluently as a person. Asked ''How can I get to Harvard from here?'' it responds, ''Take the Red Line two stops to Harvard Square.'' It is also full of tips about restaurants and museums. Airlines are interested in adapting Zue's and similar systems to stand in for clerks. The machines would be able to answer travelers' questions about flight schedules and issue tickets. Eventually, speech-equipped computers may even replace workers at fast-food chains. In Japan, Toshiba is at work on a machine that will produce a hamburger and a soft drink in response to a spoken command. In an unintentional pun on hamburger flipping, Toshiba calls it the Tosburg.

What makes such intriguing progress possible is the growing availability of cheap computer power and the development of better algorithms, or formulas, for processing speech. Work in the field dates back to the early 1970s, when the Defense Department's Advanced Research Projects Agency, or ARPA (at that time called DARPA), began underwriting speech-recognition programs at Carnegie Mellon and MIT. At first the investigators tried word matching: using computers to compare the energies and frequencies of speech sounds with the stored acoustic profiles of whole words. But because pronunciation varies wildly from speaker to speaker and also depends on the context in which a word is spoken, the approach proved impractical. ''Can you imagine asking for samples of 30,000 words each time a new person wants to use the computer?'' asks Dragon Systems' Baker.

Investigators next tried matching spoken sounds with acoustic profiles of ''phones,'' the basic building blocks of speech. The word cat, for example, consists of three phones: k, ae, and t. In one sense this made the computers' task more manageable, since a few thousand phones can account for the English language. The hard part lay in sorting out the phones' seemingly infinite combinations, a task that could defeat even the fastest computer.

THE KEY to that puzzle was uncovered in 1971, when Baker's husband, James Baker, a mathematician, applied the work of a turn-of-the-century Russian mathematician named Andrei A. Markov. Markov had invented a method for statistically predicting the sequence of letters as they appeared in Pushkin's verse novel, Eugene Onegin. Hidden Markov Modeling, as the technique came to be called, worked so well that the U.S. National Security Agency used it to crack codes. Applied to speech recognition, it offered a crucial shortcut: Once a computer identifies the first phone in a sequence, it can narrow its search for the next by statistically calculating which sounds are most likely to follow. Coping with a large range of possible words still requires millions of calculations for each word -- which is why speech-recognition programs originally required large, fast mainframes. Only recently have desktop computers become powerful enough to handle the job.
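That shortcut is easy to see in miniature. The sketch below is a toy illustration in Python, not code from Kurzweil, IBM, or Dragon; the three-phone alphabet, the transition probabilities, and the ''acoustic scores'' are all invented for the example. It uses the classic Viterbi algorithm, the standard way of searching a hidden Markov model, to pick the likeliest phone sequence by weighing the acoustic evidence for each sound against the statistics of which phones tend to follow which:

```python
# Toy illustration of the Hidden Markov Model shortcut described above.
# The phone set and every probability here are invented for the example;
# real recognizers use thousands of context-dependent phones.

# Transition probabilities: having heard one phone, which phones are
# likely to follow? This is the statistical narrowing of the search.
TRANSITIONS = {
    "k":  {"ae": 0.7, "t": 0.2, "k": 0.1},
    "ae": {"t": 0.8, "k": 0.1, "ae": 0.1},
    "t":  {"k": 0.3, "ae": 0.3, "t": 0.4},
}
START = {"k": 0.6, "ae": 0.2, "t": 0.2}  # how utterances tend to begin

def viterbi(acoustic_scores):
    """Find the most probable phone sequence given, for each time slice,
    the acoustic evidence P(sound | phone) reported by the front end."""
    # best[phone] = (probability of best path ending in phone, that path)
    best = {p: (START[p] * acoustic_scores[0][p], [p]) for p in START}
    for frame in acoustic_scores[1:]:
        new_best = {}
        for phone in frame:
            # Extend every surviving path by this phone and keep only
            # the most probable extension -- the crucial pruning step.
            prob, path = max(
                (best[prev][0] * TRANSITIONS[prev][phone] * frame[phone],
                 best[prev][1] + [phone])
                for prev in best
            )
            new_best[phone] = (prob, path)
        best = new_best
    return max(best.values())  # (probability, phone sequence)

# Three time slices of made-up acoustic evidence for the word "cat":
frames = [
    {"k": 0.8, "ae": 0.1, "t": 0.1},
    {"k": 0.2, "ae": 0.7, "t": 0.1},
    {"k": 0.1, "ae": 0.2, "t": 0.7},
]
prob, phones = viterbi(frames)
print(phones)  # ['k', 'ae', 't'] -- the phones of "cat"
```

A real recognizer runs the same kind of search over thousands of phones and a vocabulary of tens of thousands of words -- hence the millions of calculations.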
ARPA gave the field a second big boost in 1986 when it issued grants to break down the barrier between speech recognition and a branch of linguistics called natural-language studies. Experts from the two disciplines almost never crossed paths: Speech researchers had concentrated on acoustics, while natural-language scholars had devoted themselves to the study of syntax and semantics. Their government-prompted collaboration helped establish the U.S. as by far the world leader in speech recognition, a remarkable example of how industrial policy can foster competitiveness.

Within a few years the researchers had programmed dictation systems to resolve ambiguity. Confronted with the trick sentence ''Our last two presenters were one hour too long,'' for example, IBM's Speech Server distinguishes ''hour'' from ''our'' and ''two'' from ''too'' and delivers an accurate transcription with amazing ease.
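How can a machine tell apart words that sound identical? The acoustic evidence alone cannot decide between ''hour'' and ''our,'' so the language side of the system scores each possible reading of the sentence by how plausibly its words follow one another. The fragment below is a hypothetical sketch of that idea, not IBM's method; its bigram probabilities -- estimates of how often one word follows another -- are invented, where a real system derives them from millions of words of text:

```python
# Hypothetical sketch of homophone disambiguation with a bigram language
# model. All probabilities are invented for the example.
from itertools import product

# For each position, the words the acoustics cannot tell apart.
SOUNDS_LIKE = [["our", "hour"], ["last"], ["two", "too"], ["presenters"],
               ["were"], ["one"], ["our", "hour"], ["two", "too"], ["long"]]

# P(word | previous word). Pairs not listed get a small "smoothing"
# probability instead of zero; "<s>" marks the start of the sentence.
BIGRAMS = {
    ("<s>", "our"): 0.010, ("<s>", "hour"): 0.001,
    ("our", "last"): 0.020, ("hour", "last"): 0.001,
    ("last", "two"): 0.030, ("last", "too"): 0.001,
    ("two", "presenters"): 0.010, ("too", "presenters"): 0.001,
    ("one", "hour"): 0.040, ("one", "our"): 0.001,
    ("hour", "too"): 0.010, ("hour", "two"): 0.001,
    ("too", "long"): 0.050, ("two", "long"): 0.001,
}
SMOOTH = 0.0005

def score(sentence):
    """Probability of a word sequence under the bigram model."""
    prob, prev = 1.0, "<s>"
    for word in sentence:
        prob *= BIGRAMS.get((prev, word), SMOOTH)
        prev = word
    return prob

# Try every combination of the confusable words and keep the best.
best = max(product(*SOUNDS_LIKE), key=score)
print(" ".join(best))
# our last two presenters were one hour too long
```

Under these made-up numbers, ''one hour too long'' beats ''one our two long'' simply because ''one hour'' and ''too long'' are far more common word pairs -- which is all the statistics need to know.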
But don't toss your keyboard into the trash basket yet. The day is at least decades away when you'll be able to chat with a computer the way astronauts conversed with HAL in the movie 2001: A Space Odyssey. While today's best speech-recognition systems are fairly dependable -- some attain 95% accuracy -- they can still make alarming mistakes. For example, when Dr. Klainer of the Milford-Whitinsville hospital recently dictated a report on a patient with angina, his Kurzweil machine typed ''cancer'' on the screen. But it also typed ''angina'' as the next best choice; Klainer immediately corrected the misdiagnosis. Another Kurzweil system, when told at a recent demonstration ''Make your text bold,'' typed, ''Make your pets old.'' The young woman operating it explained apologetically that she was recovering from a cold and hadn't retrained the machine to take her stuffy nose into account.

Even if they work perfectly, computers that listen won't necessarily be welcome in every workplace. Users tend to speak loudly as they enunciate into the machines, addressing them like not-too-bright menial help. ''The prospect of an office full of people all babbling away at their PCs is not something I'd look forward to,'' a reader recently wrote to the New York Times. Michael Pique, a computer specialist at the Research Institute of Scripps Clinic in La Jolla, California, found out the hard way. He installed the SayIt program on his Sun workstation in December, only to find his office neighbors grumbling about the noise. Eventually technology should come to the rescue of Pique and other users: Researchers are working on ultrasensitive directional mikes, built into the rim around the screen, that can pick up the faintest whisper. But for now Pique has found his own way to reconcile civility and progress. He has rearranged his work schedule so that he and his computer do most of their talking at night, after everyone else has gone home.

[Chart not available. Table caption: ''Software to Make Your Machine Listen Up.'' Credit: FORTUNE]