Telecom - Student Papers
Voice Recognition Technology
By Natasha Korman 2004.
Voice activation technology has long been the vision of science fiction writers who dreamed of futuristic ideas that would make our lives easier. At one time, the ability to run our lives by talking to machines was at best the stuff of an episode of The Jetsons, where you could tell the lights to come on, ask the phone to dial someone or command the dishwasher to start. Today this idea is becoming a reality, yet voice activation technology has been slow to reach the U.S. market and is still considered clumsy, unreliable and not always user-friendly.
The Problem With Voice Technology Today
Voice activation and recognition technology enables users to command and control their telephones, mobile devices, and computers to perform basic functions hands-free. Voice technology is emerging, but it has not reached the levels of success that were predicted when it first came out. The problem with the voice technology that consumers have access to today is that it is too complex. The technology is not simple, it does not come in a standardized form, and it is being applied to hundreds of smaller products instead of being focused on a more mainstream market. Voice activation technology has the potential to be the next killer application, but first it must be mainstreamed and marketed to both consumer and business users. It must be applied to mobile devices, telephony, computers and the internet in such a way that it makes life simpler for the consumer and, as a result, becomes a necessity. For this technology to succeed, voice recognition and activation must be able to stand alone in the market without requiring the consumer to purchase expensive additional software and equipment. The product must also be compatible with the systems that most Americans already use, so that voice technology can integrate smoothly into consumers’ lives.
Application of Voice Technology
For voice technology to be incorporated into the daily life of consumers, it must be available in the technology that people use every day. For mobile devices, the technology would be marketed for all cell phones and wireless handheld devices. It would recognize, interpret, and respond to speech, but it would have to be small enough to fit within the device while still containing enough memory to work efficiently. This would allow consumers to conduct activities while driving, such as making hands-free phone calls, getting directions without looking at a screen, or even finding out the weather, traffic conditions and sports results, all from the convenience of their mobile device.
As applied to computers and the internet, voice activation technology needs to be in a format that is user-friendly enough to simplify the use of PCs, laptops or the internet. One of the main problems with the technology that exists right now is that it is often hard to “train” the device. Training current voice technology requires the user to speak into a microphone and repeat certain phrases so that the program learns to recognize a specific voice and can later respond to that user’s verbal commands. This process can be time-consuming and frustrating for consumers. Often the user must use specific key phrases to prompt the computer, and once training is completed, the program will recognize only that user’s voice, or will have to be trained all over again to learn a new user’s voice. In addition, these programs usually have a limited vocabulary, which limits what you can tell your machine to do. The new technology, the kind that could become a killer application, would need to be one standardized product, in the way that Microsoft Word is for word processing. This product would be applied to the majority of computer systems through a chip, download, or device that is either sold to the manufacturer to be installed in the computer, or purchased by the consumer and installed easily and quickly without requiring expensive additional equipment.
One problem with applying this technology to computers is that computers already offer a user-friendly interface through the combination of a keyboard and a mouse or touchpad. Voice activation technology as applied to computers would have to combine this old and successful interface with the new application. This means that voice activation has to be just as accurate and easy to use as the traditional way we use our computers, while at the same time offering a faster and simpler way to control our machines. This might include a type of pop-up or tree branch icon that appears on the screen. It would give the user options very similar to those reached with the mouse, for example, clicking the Start button, then choosing “Programs”, then “Microsoft Word”, then “New Document”. With voice activation, you would see these options on the screen and then prompt the computer by saying what you want it to do instead of clicking on it. Once the application has been opened, it would then be possible to use the program to dictate papers, find websites or compose emails.
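The on-screen option tree described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the menu contents mirror the example in the text, the names MENU, visible_options and navigate are hypothetical, and each "spoken phrase" is a plain string standing in for the output of a speech recognizer, which is not modeled here.

```python
# Hypothetical menu tree; the entries mirror the example in the text.
MENU = {
    "start": {
        "programs": {
            "microsoft word": {
                "new document": None,  # None marks an action with no submenu
            },
        },
    },
}


def visible_options(menu, path):
    """Return the options shown on screen after following `path`."""
    node = menu
    for step in path:
        node = node[step]
    return sorted(node) if node else []


def navigate(menu, spoken_phrases):
    """Follow recognized phrases through the menu, one level at a time.

    Each phrase stands in for the text output of a speech recognizer
    (an assumption; no audio is processed here). A phrase that matches
    no visible option is ignored, much as a misheard command would
    leave the screen unchanged.
    """
    path = []
    for phrase in spoken_phrases:
        command = phrase.strip().lower()
        if command in visible_options(menu, path):
            path.append(command)
    return path
```

Calling navigate(MENU, ["Start", "Programs", "Microsoft Word", "New Document"]) walks the same route a mouse user would click through. Restricting matches to the options currently visible is one plausible way to keep the vocabulary small at each step, which could make recognition more forgiving.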
This application has held the greatest promise for the large disabled community in the United States. Most of the advancements in this area have been designed for and used by people with disabilities. For people who cannot use their hands to type on a keyboard or to use a mouse, this type of application would give them the freedom to use a computer and search the web hands-free. Additionally, with a headset or microphone attached to the computer or other device, people with disabilities gain easier access to the outside world, because they can talk directly to their machines and direct them to complete tasks that otherwise could not be performed.
The Ideal Market
The customer profile for voice activation technology would probably include both men and women in the middle and upper income levels, most likely those with at least a college education. The most probable business markets would be companies that wish to cut long-term costs and increase both worker productivity and customer satisfaction. The market for individuals would most likely target busy professionals, business travelers and those who spend much of their time in cars. Additionally, as mentioned earlier, this technology has an ideal market within the disabled population.
Although there are many possible benefits from the application of voice activation technology, there are also some risks in pursuing this venture. One of these risks is cost. Although the technology exists and is slowly being introduced into many different products, developing a small, sleek, reliable and accurate device capable of implementing it may be expensive. Efficiency is another potential risk. If consumers or businesses reject the technology because it is inefficient, slow, inaccurate or time-consuming, the venture could fail. Additionally, this technology has the potential to displace manpower by replacing call center workers, court stenographers and those in similar professions. If the technology is used in telephony, customers can speak with companies through automated choices, avoiding both menu-by-number options and being passed from one customer service representative to another. Even though talking with live people will always be necessary in business, manpower displacement could be a detriment to implementing this technology.
Despite the possible problems with voice activation technology, there are many benefits that this new technology could bring to the U.S. market if it is implemented accurately, made user-friendly and aimed at the right market. Although this technology has been in the works for many years, it has failed to take off as the next killer application. With precise technology and applications in the devices people now rely on daily, voice technology has the potential to be a huge success, both monetarily and in the advancement of technology.