Author: Denise Bedell
Already recognized as a key element of any CRM strategy, speech recognition technology is set to grow in popularity as prices come down and products become more sophisticated.

Speech recognition software is best known for its presence in the call center space, driving advancements in customer relationship management (CRM). Such software has matured to the point where it is the norm rather than the exception in this realm.

But voice technology has a multitude of other uses: speaking text messages into a mobile phone, dictating medical reports, getting stock quotes and making trades over the telephone, and authenticating identity by voice.

In call center applications, the advantages are many, says Richard Rosinski, a vice president at VoiceGenie, a Toronto-based company that provides the backbone for voice applications for telecom and enterprise firms. “Generally when people are interacting with voice recognition technology, they have far shorter phone calls, which leads to savings just in holding time and telecommunications charges,” he explains.

Another saving is in headcount, adds Bruce Eidsvik, also at VoiceGenie: “An agent costs upwards of $30,000 a year plus benefits, whereas a speech recognition port costs around $3,000 a year. An average call center agent costs around $6 to $7 per call, and if you automate that call, you can bring it down to a dollar or less per call.”
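The per-call figures Eidsvik quotes can be turned into a rough annual estimate. The sketch below is illustrative only: the function name and the call volume of 100,000 are assumptions, not figures from VoiceGenie, and it assumes every call can be automated at the quoted upper bound of one dollar.

```python
# Per-call costs as quoted by Eidsvik; everything else is hypothetical.
AGENT_COST_PER_CALL = (6.00, 7.00)  # dollars, quoted range for a live agent
AUTOMATED_COST_PER_CALL = 1.00      # dollars, quoted ceiling ("a dollar or less")


def annual_savings(calls_per_year: int) -> tuple[float, float]:
    """Return the (low, high) yearly savings if every call is automated."""
    low = (AGENT_COST_PER_CALL[0] - AUTOMATED_COST_PER_CALL) * calls_per_year
    high = (AGENT_COST_PER_CALL[1] - AUTOMATED_COST_PER_CALL) * calls_per_year
    return low, high


if __name__ == "__main__":
    # 100,000 calls a year is an assumed volume chosen for illustration.
    low, high = annual_savings(100_000)
    print(f"Estimated savings: ${low:,.0f} to ${high:,.0f} per year")
```

In practice only a share of calls can be fully automated, so a real business case would scale the volume accordingly.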

As call center technology becomes more standardized and the market matures, interest is growing, Rosinski notes: “We have seen really strong pickup in the speech automation business over the past nine months to a year.” Total investment is expected to rise from $90 million in 2004 to $262 million in 2008, according to Datamonitor.

Dick Bucci, an associate with consulting firm The Pelorus Group, explains in a report that the contact center space, like other IT businesses, is being hit by slowdowns in investment spending. He adds, however, that few technologies match the power of voice technology to improve customer loyalty and satisfaction while reducing operating costs.

“The replacement cycle for the Y2K-induced spurt of products purchased in 1998-1999 will provide a welcome bump in market growth,” Bucci says. Driving this growth, he adds, are a wealth of new applications, new markets, a persuasive business case, modest penetration rates, and corporate compliance and liability concerns.

Technology Speeds Data Input

While the heart of the business is in call center technology, interest in other uses for speech software is exploding. One area that promises greater efficiency is that of automated stock quotes and trades. UK telecom firm BT, for example, offers a speech-driven, collaborative trading system in which trades can be entered, contacts dialed, emails sent through mobile phones or voice-enabled PDAs, and CRM systems updated.

The group went live with such a system for the New York Mercantile Exchange in February this year, rolling it out to 740 trading positions across the exchange. John Barbara, director of telecommunications at the New York Mercantile Exchange, said in a statement, “They worked with us in blocks over a series of weekends, avoiding any disruption to our mission-critical trading operations.”

Another growth area is document dictation. German group Philips Speech Recognition Systems, a subsidiary of global conglomerate Philips, is one of a number of companies offering software to handle document creation for case management systems. Marcel Wassink, CEO of Philips Speech Recognition Systems, says, “Whether it be dictating, brain dumping or what have you, our software offers complete hands-free document creation.” Proponents of the technology note that it can allow customers or company agents to input complex data in a short period of time with great accuracy. The information input is then integrated with enterprise systems—giving greater automation to the entire data management process.

Open Standards


Two developments are of critical interest both to the speech recognition market itself and to clients and potential clients. The first is more flexible voice user interfaces, the basic building block of a speech recognition system and the point where speech meets data.

Peter Mahoney, vice-president of worldwide marketing for the Speechworks division of Scansoft, explains: “Applications are becoming more flexible, and this is really what clients are looking for. Instead of going through a hierarchical menu, as was the case with the original speech-activated systems, these solutions can handle a much more conversational exchange. Users, be they customers or internal stakeholders, just provide the system with as much information as possible about the request, and the solution can pull out the appropriate information and fill in the blanks.”
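The "fill in the blanks" behavior Mahoney describes is often called slot filling. The following is a minimal sketch of the idea, not Scansoft's method: it assumes the caller's speech has already been transcribed to text, and the slot names, the patterns and the stock-order scenario are all hypothetical. Commercial systems rely on far more robust language understanding than hand-written patterns.

```python
import re

# Hypothetical slots for a spoken stock order; the patterns are
# illustrative stand-ins for a real language-understanding component.
SLOT_PATTERNS = {
    "action": re.compile(r"\b(buy|sell)\b", re.IGNORECASE),
    "quantity": re.compile(r"\b(\d+)\s+shares\b", re.IGNORECASE),
    "symbol": re.compile(r"\bof\s+([A-Z]{1,5})\b"),
}


def fill_slots(utterance: str) -> dict:
    """Extract whichever slots the caller supplied in a single utterance."""
    slots = {}
    for name, pattern in SLOT_PATTERNS.items():
        match = pattern.search(utterance)
        if match:
            slots[name] = match.group(1)
    return slots


def missing_slots(slots: dict) -> list:
    """List the slots the system would still need to prompt for."""
    return [name for name in SLOT_PATTERNS if name not in slots]


if __name__ == "__main__":
    full = fill_slots("I want to buy 100 shares of IBM")
    print(full, missing_slots(full))

    partial = fill_slots("sell some shares")
    print(partial, missing_slots(partial))
```

A caller who gives the whole request at once is never prompted again, while a caller who gives only part of it is asked just for what is missing, rather than being walked through a fixed menu.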

The other big development involves what happens to the information once it is received: how a voice recognition solution can connect to back-end systems within the organization to provide seamless processing of information across the group. This is made possible by the development and use of open standards, such as VoiceXML, that allow voice applications to interact with other data applications.

In this context, voice software becomes one block in the sequence of information provision, processing and results. It makes it possible for users to get real-time information from internal systems, from web applications, from any part of the digitized information chain. And this information will be the same as that viewed by all others in the chain.


The technology is still facing considerable resistance in some areas, particularly in bricks-and-mortar companies where the payoff may not be immediately obvious. Wassink says, “Many people have tried general packages, and, as they are not tuned to specific user-groups, they have been disappointed.”

But, Mahoney says, it is becoming more accessible for all companies. “In the past we have seen smaller companies shy away from implementing speech software,” he says. “Two things are changing that: First, there is the increased packaging of speech applications, which drives down the cost, where in the past only big companies could afford it. Second, firms are delivering increasingly sophisticated packaged applications.” For example, there are packaged applications in the contact center space for utilities, insurance providers and financial services companies, all of which can be delivered to smaller companies and tailored to some degree to their needs.

Eidsvik agrees: “When a company wants to deploy a solution, they now have a reasonable chance of not being the first in their industry. They can get applications that just need to be customized, which can be deployed much faster and at lower cost than previously.”

Another deterrent may be existing solutions. Dan Miller, senior analyst at Opus Research, says some companies are waiting to invest in this space until past product purchases reach the end of their life cycle: “For example, if you bought a service point in 2000 that you are depreciating over seven years, it is going to stay there for seven years.” Some resistance may also arise out of organizational issues. Says Miller, “You see resistance at a departmental level because people are loyal to technology they are using here and now.”

The Need to Show ROI

The incentives for investing in speech recognition technology are often built, at least in part, around intangibles, such as improving customer service and providing a uniform customer experience across several touch points, says Miller. “As well as showing the easily-quantifiable benefits—reduced headcount, shorter call times and lower telecommunications cost—you need to be able to quantify those intangible concepts, such as the value of there being greater efficiency throughout varied contact points and with internal systems and the value of better systems for service personnel,” he says. “You can make this into an ROI. The value of creating loyal customers and happy employees is clear,” he adds.

But the picture is becoming more complex. Investments in speech recognition will increasingly be justified within end-to-end IT infrastructure spend, from data to voice and back again, because savings are seen across a range of modalities with the addition of a voice user interface. You are no longer talking about just replacing a body in a call center, explains Miller. “You are talking about optimizing the entire client interface, eliminating redundant back-end systems and doing a better job of integrating your customer databases and transaction databases and really leveraging the investment that you have made already in your web-based self service,” he says.

The case for companies to acquire voice recognition technology is growing. “The sheer amount of applications is phenomenal,” says Wassink. One of the areas he points to for future application is the management of all correspondence, from email to voice to letter. This would save time, increase user efficiency and cut telecommunications and other costs.

Mahoney at Scansoft expects to see a spike in the area of advanced applications for phone carriers, particularly with wireless phones. “Soon we will have the ability to provide multimodal applications through a mobile phone. For example, we could provide automated voice applications along with a graphical display,” he says. “If you, say, call directory assistance, we can show you the number you are looking for, show you a map to the location and show ads for businesses in the area you are searching.”

Miller says one area he will be watching is that of voice biometrics and the introduction of conversational authentication. “There is greater awareness, and indeed risk, of identity theft and fraudulent use of financial instruments today. At present, voice is being underused as an identification modality, but I think this will change,” he says.

Companies are certainly becoming more accepting of voice recognition. Wassink adds: “There is a lot of interest for this type of technology, but it is like any early technology that one brings into the market. It starts with a few customers and then begins to radiate. At some point one hits the tornado.”
