Summary of How Interactive Voice Response (IVR) Works
Summary: Interactive Voice Response (IVR) systems automate telephone interactions using DTMF tones, speech recognition, and text-to-speech to route calls, provide information, and perform transactions. Modern IVRs often use VoiceXML and consist of telephone and TCP/IP networks, a VXML telephony server, web/application servers hosting VXML applications, and databases. Organizations can host systems in-house or subscribe to IVR-hosting services for customization and support. System success is measured by the percentage of callers requesting live operators.
Parts used in the IVR Project:
- Telephone network (PSTN or VoIP)
- TCP/IP network
- VXML telephony server
- Web/application server
- Databases
- Telephony board or telephony card
- IVR software
- Speech-recognition software
- Text-to-speech (TTS) software
- IVR-hosting service (optional)
It’s hard to think of a customer-oriented business that hasn’t made the switch from live operators to IVR. When you call your credit card company, you can use the IVR to pay your balance or report a fraudulent charge. Airlines use extensive IVRs to book reservations and check the real-time status of flights. Pharmacies use IVRs for refilling prescriptions. And just about everybody uses IVRs to route calls to separate extensions or to access the company phone directory.
Large and small businesses have adopted IVR technology because it saves money that would otherwise be spent on living, breathing (expensive) employees. An IVR system’s effectiveness is rated by the percentage of callers who ask to speak to a live operator. The lower the percentage, the more successful the system. Of course there are some IVR systems that never give you the option of speaking to a live operator. But even among IVR fans, that’s considered bad practice.
So how do these automated phone systems work? Are we actually talking to a robot or just a smart piece of software? Read on to learn more about the technology behind IVR systems.
IVR Systems
IVR systems are an example of computer telephony integration (CTI). The most common way for a phone to communicate with a computer is through the tones generated by each key on the telephone keypad. These are known as dual-tone multi-frequency (DTMF) signals.
Each key on a telephone keypad emits two simultaneous tones: one low-frequency and one high-frequency. The number one, for example, produces a 697-Hz tone and a 1209-Hz tone together, a pair that the public switched telephone network universally interprets as a “1.”
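To make the two-tone idea concrete, here is a minimal Python sketch of the standard keypad grid, in which each row shares a low frequency and each column shares a high one. The function names, the 8000-Hz sample rate and the 0.1-second duration are illustrative assumptions, and the grid leaves out the rarely used A-D column.

```python
import math

# Standard DTMF grid: each row shares a low tone, each column a high tone (Hz).
ROW_FREQS = (697, 770, 852, 941)
COL_FREQS = (1209, 1336, 1477)
KEYPAD = ("123", "456", "789", "*0#")


def dtmf_frequencies(key):
    """Return the (low, high) frequency pair in Hz for one keypad key."""
    if len(key) == 1:
        for row, keys in enumerate(KEYPAD):
            col = keys.find(key)
            if col != -1:
                return ROW_FREQS[row], COL_FREQS[col]
    raise ValueError(f"not a DTMF key: {key!r}")


def dtmf_samples(key, duration_s=0.1, sample_rate=8000):
    """Synthesize the key's tone as the sum of two equal-amplitude sine waves."""
    low, high = dtmf_frequencies(key)
    count = round(duration_s * sample_rate)
    return [
        0.5 * math.sin(2 * math.pi * low * n / sample_rate)
        + 0.5 * math.sin(2 * math.pi * high * n / sample_rate)
        for n in range(count)
    ]


print(dtmf_frequencies("1"))
```

Calling `dtmf_frequencies("1")` gives the 697-Hz and 1209-Hz pair described above.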
A computer needs special hardware called a telephony board or telephony card to understand the DTMF signals produced by a phone. A simple IVR system requires only a computer hooked up to a phone line through a telephony board and some inexpensive IVR software. The IVR software allows you to pre-record greetings and menu options that a caller can select using the telephone keypad.
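How might the decoding step look in software? One common technique is the Goertzel algorithm, which measures how much energy a block of audio samples carries at a single frequency. The sketch below is an illustration under assumptions, not a description of what any particular telephony board does: it assumes clean audio sampled at 8000 Hz, and the function names are hypothetical. A production detector would also check signal level and tone duration.

```python
import math

LOW_FREQS = (697, 770, 852, 941)
HIGH_FREQS = (1209, 1336, 1477)
KEYS = ("123", "456", "789", "*0#")


def goertzel_power(samples, freq_hz, sample_rate):
    """Estimate signal power near freq_hz using the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def detect_key(samples, sample_rate=8000):
    """Pick the strongest low tone and the strongest high tone, then map to a key."""
    low = max(LOW_FREQS, key=lambda f: goertzel_power(samples, f, sample_rate))
    high = max(HIGH_FREQS, key=lambda f: goertzel_power(samples, f, sample_rate))
    return KEYS[LOW_FREQS.index(low)][HIGH_FREQS.index(high)]


def two_tone(f_low, f_high, count=800, sample_rate=8000):
    """Build a clean two-tone test signal (a stand-in for real phone audio)."""
    return [
        math.sin(2 * math.pi * f_low * n / sample_rate)
        + math.sin(2 * math.pi * f_high * n / sample_rate)
        for n in range(count)
    ]


print(detect_key(two_tone(697, 1209)))
```

The design choice here is to test only the seven keypad frequencies rather than compute a full spectrum, which is why Goertzel is a popular fit for DTMF.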
More advanced IVR systems include speech-recognition software that allows a caller to communicate with a computer using simple voice commands. Speech-recognition software has become sophisticated enough to understand names and long strings of numbers, such as a credit card or flight number.
On the other end of the phone call, an organization can employ text-to-speech (TTS) software to fully automate its outgoing messages. Instead of recording all of the possible responses to a customer query, the computer can generate customized text, like account balances or flight times, and read it back to the customer using an automated voice.
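The "generate customized text" step can be as simple as filling in a sentence template before the text is handed to a TTS engine. Here is a small sketch; the function name and the wording of the sentence are hypothetical, and the hand-off to an actual speech synthesizer is left out because the article does not name one.

```python
def balance_prompt(balance_cents):
    """Turn an account balance (in cents) into a sentence for a TTS engine to read."""
    if balance_cents < 0:
        raise ValueError("this sketch handles only non-negative balances")
    dollars, cents = divmod(balance_cents, 100)
    dollar_word = "dollar" if dollars == 1 else "dollars"
    cent_word = "cent" if cents == 1 else "cents"
    return f"Your current balance is {dollars} {dollar_word} and {cents} {cent_word}."


print(balance_prompt(25075))
```

One template covers every possible balance, which is the advantage over pre-recording each response.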
Many of today’s most advanced IVR systems are based on a special markup language called Voice Extensible Markup Language (VoiceXML, or VXML). Here are the basic components of a VXML-based IVR system:
- Telephone network: Incoming and outgoing phone calls are routed through the regular Public Switched Telephone Network (PSTN) or over a VoIP network.
- TCP/IP network: A standard IP network, like the ones that provide Internet and intranet connectivity in an office.
- VXML telephony server: This special server sits between the phone network and the TCP/IP network. It serves as an interpreter, or gateway, so that callers can interface with the IVR software and access information on databases. The server also contains the software that controls functions like text-to-speech, speech recognition and DTMF recognition.
- Web/application server: This is where the IVR software applications live. There might be several different applications on the same server: one for customer service, one for outgoing sales calls, one for voice-to-text transcription. All of these applications are written in VXML. The Web/application server is connected to the VXML telephony server over the TCP/IP network.
- Databases: Databases contain real-time information that can be accessed by the IVR applications. If you call your credit card company and want to know your current balance, the IVR application retrieves the current balance total from a database. The same goes for flight arrival times, movie times, et cetera. One or more databases can be linked to the Web/application server over the TCP/IP network.
[source: VoiceXML Review]
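To see how the last two components might fit together, here is a rough Python sketch of an application server looking up a balance in a database and returning a VoiceXML document for the telephony server to play. Everything specific in it is an assumption made for illustration: the `accounts` table, the account number, the prompt wording and the function names. The elements used (`vxml`, `menu`, `choice`, `form`, `block`, `prompt`) come from the VoiceXML 2.0 vocabulary, and an in-memory SQLite database stands in for the real one.

```python
import sqlite3
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"


def lookup_balance_cents(conn, account_id):
    """Fetch the balance from the (hypothetical) accounts table."""
    row = conn.execute(
        "SELECT balance_cents FROM accounts WHERE account_id = ?", (account_id,)
    ).fetchone()
    if row is None:
        raise KeyError(account_id)
    return row[0]


def balance_vxml(conn, account_id):
    """Build a VoiceXML page: a one-option DTMF menu that leads to a balance prompt."""
    cents = lookup_balance_cents(conn, account_id)
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})

    # Menu: pressing 1 jumps to the form with id "balance" in this document.
    menu = ET.SubElement(vxml, "menu")
    ET.SubElement(menu, "prompt").text = "For your account balance, press 1."
    ET.SubElement(menu, "choice", {"dtmf": "1", "next": "#balance"}).text = "balance"

    # Form: the telephony server reads this prompt aloud using text-to-speech.
    form = ET.SubElement(vxml, "form", {"id": "balance"})
    block = ET.SubElement(form, "block")
    ET.SubElement(block, "prompt").text = (
        f"Your current balance is {cents // 100} dollars and {cents % 100} cents."
    )
    return ET.tostring(vxml, encoding="unicode")


# Demo data in an in-memory database (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (account_id TEXT PRIMARY KEY, balance_cents INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('12345', 25075)")
document = balance_vxml(conn, "12345")
print(document)
```

In a real deployment the telephony server would request this document over HTTP, much as a browser requests a web page; that transport layer is omitted here.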
A company or organization can choose to purchase all of this hardware and software and run it in-house, or it can subscribe to an IVR-hosting service. A hosting service charges a monthly fee to use its servers and IVR software. The hosting service helps the organization customize an IVR system that best fits its needs and provides technical support should anything go wrong.
For more detail: How Interactive Voice Response (IVR) Works
- How do IVR systems receive input from callers?
IVR systems receive input via telephone keypad tones called DTMF signals and increasingly via speech-recognition software for voice commands.
- Can IVR systems speak customized account information?
Yes; IVR systems can use text-to-speech software to generate and read customized information like balances or flight times.
- What hardware is needed for a computer to understand phone keypad tones?
A telephony board or telephony card is needed for a computer to interpret DTMF signals from a phone.
- What programming language do many advanced IVR systems use?
Many advanced IVR systems use VoiceXML (VXML) to build applications.
- Where do IVR applications run?
IVR applications run on a web/application server connected to the VXML telephony server over a TCP/IP network.
- How do IVR applications obtain real-time information like account balances?
IVR applications retrieve real-time information from databases linked to the web/application server over the TCP/IP network.
- What role does the VXML telephony server play?
The VXML telephony server acts as a gateway between the phone network and the TCP/IP network and controls functions like TTS, voice recognition, and DTMF recognition.
- Can organizations outsource their IVR system?
Yes; organizations can subscribe to IVR-hosting services that provide servers, software, customization, and technical support.

