Speech Recognition Chips Go Multilingual

Summary of Speech Recognition Chips Go Multilingual


Sensory, Inc. announced five new languages for its RSC-4x line of low-cost speech recognition chips: German, Spanish, Japanese, and Korean are available now, with Mandarin Chinese to follow. The RSC-4128 chip integrates a 16-bit ADC, DAC, digital filter unit, math unit, 4K RAM, and 128K ROM, so it can replace standard 8-bit microcontrollers. It supports speaker-independent recognition without user training, along with voice recording, speech compression and playback, and MIDI-like music synthesis. The Quick T2SI Toolkit enables rapid vocabulary creation by converting typed text directly into recognition sets, while Quick Synthesis tools simplify the creation of speech output files. Together these additions aim to open global consumer electronics markets to voice-enabled products at minimal incremental cost.

Parts used in the RSC-4x Speech Recognition Project:

  • RSC-4128 integrated circuit
  • 16-bit ADC
  • DAC (Digital-to-Analog Converter)
  • Digital filter unit
  • Math unit
  • 4K RAM
  • 128K ROM
  • Output amplification
  • Timers
  • Comparators
  • Quick T2SI Toolkit
  • RSC-4x Demo/Evaluation board
  • Quick T2SI IDE

SANTA CLARA, CA (PRWEB) July 30, 2004

Sensory, Inc., the world’s leading supplier of embedded speech technologies, today announced the availability of five new languages to be supported by Sensory’s RSC-4x line of low-cost speech recognition chips and associated tools. In addition to American English, speech developers can now create embedded applications in German, Spanish, Japanese, Korean, and soon in Mandarin Chinese.

RSC-4x Series Microcontrollers Supply General Control with Speech I/O

Sensory’s RSC-4128 integrated circuit allows manufacturers to replace existing 8-bit microcontrollers with a voice-enabled solution. Adding speech I/O is becoming increasingly popular in the next generation of user interfaces for consumer electronic products, allowing ease of use and functionality never before available. The RSC-4128 is actually a powerful general-purpose microcontroller inside a voice-recognition-system-on-chip (including 16-bit ADC, DAC, digital filter unit, math unit, 4K RAM, 128K ROM, output amplification, timers, comparators and more). A variety of technologies run on this chip, including speaker-independent recognition, speaker-dependent recognition, speaker verification, voice record, speech compression/playback and MIDI-like music synthesis. The RSC-4x ICs sell for under $2 in large volumes, allowing replacement of existing microcontrollers with minimal incremental cost.

New Languages and New Technologies Improve Accuracy Worldwide

Sensory’s new speaker independent (SI) technologies use state-of-the-art Hidden Markov Modeling combined with advanced neural networks to eliminate the need for user training. These recognition technologies are derived from a database of sampled speech and are based on the rules of phonology – the environment in which sounds occur in languages. This process significantly accelerates adding new languages to Sensory’s speaker independent technologies. Previously, this unique approach was only available on-chip in American English, but is now being released in German, Spanish (covering a mix of Latin American dialects), Japanese, and Korean, with Mandarin Chinese available in Q4. According to Todd Mozer, CEO of Sensory, “We have had substantial demand for low cost speech recognition chips for products to be sold in Asia and Europe. We are very excited to be announcing our first round of international language products as this will significantly expand our customer base and enable innovative consumer products to come to world markets.” Speech synthesis is now available in any language as well, so products can truly talk and hear around the world.
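
As a rough illustration of how Hidden Markov Models can pick a word without any user training, the sketch below scores a short sequence of quantized sound classes against two pre-built word models and returns the better match. Everything in it is an assumption made for illustration: the two-state models, the probabilities, the symbol alphabet, and the names `MODELS`, `forward_log_likelihood`, and `recognize` are invented here and are not Sensory's models or code. The neural network stage mentioned above is not shown.

```python
from math import log

# Toy word models (hypothetical numbers, not taken from any real recognizer).
# Each model has two left-to-right states and emits symbol 0 or 1,
# standing in for two coarse acoustic classes.
MODELS = {
    "yes": {
        "start": [1.0, 0.0],
        "trans": [[0.6, 0.4], [0.0, 1.0]],
        "emit": [[0.9, 0.1], [0.2, 0.8]],
    },
    "no": {
        "start": [1.0, 0.0],
        "trans": [[0.6, 0.4], [0.0, 1.0]],
        "emit": [[0.1, 0.9], [0.8, 0.2]],
    },
}


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) computed with the forward algorithm."""
    n_states = len(start)
    # alpha[s]: probability of the symbols seen so far, ending in state s.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][symbol]
            for s in range(n_states)
        ]
    return log(sum(alpha))


def recognize(obs, models):
    """Return the word whose model assigns the highest likelihood to obs."""
    return max(models, key=lambda word: forward_log_likelihood(obs, **models[word]))


if __name__ == "__main__":
    print(recognize([0, 0, 1, 1], MODELS))
```

Because the word models are prepared ahead of time from sampled speech rather than from the end user's voice, recognition of this kind needs no per-user enrollment step.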

Development Tools Enable Rapid International Vocabulary Development

Sensory’s Quick T2SI Toolkit (Text to Speaker Independent) complements the RSC-4x line as it allows the rapid creation of speaker independent sets by simply typing in the desired vocabulary words as text. The speaker independent recognition set of words or phrases can then be downloaded onto the included RSC-4x Demo/Evaluation board for rapid prototype creation and testing. According to Edgar Chau of Cyberworkshop, Ltd. in Hong Kong, “Other approaches for creating speaker independent recognition sets take months and don’t work very well. Sensory’s Quick T2SI approach takes just a few minutes, and consistently works great. The tool has empowered us to develop vocabulary sets with a quick turn around time, freeing us to focus on other design considerations.” A full suite of speech editing tools is integrated in the Quick T2SI IDE for enhanced ease of use. Quick Synthesis tools allow rapid development of speech output files by recording and compressing a file with the touch of a button, no longer requiring complex marking by linguists or vocabulary development fees. Purchasers of the Quick T2SI Toolkit have access to any language pack currently available as well as future language releases.
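
The release does not describe how Quick T2SI turns typed text into a recognition set. One plausible reading, given the phonology-based approach mentioned earlier, is that each typed word is mapped to a sequence of sound units. The sketch below shows that idea only; the lexicon, the phoneme labels, and the function `build_recognition_set` are hypothetical and do not reflect the toolkit's actual file formats or interfaces.

```python
# Hypothetical mini-lexicon: word -> phoneme sequence (ARPAbet-style labels).
# A real tool would use a full pronunciation dictionary or letter-to-sound rules.
LEXICON = {
    "lights": ["L", "AY", "T", "S"],
    "on": ["AA", "N"],
    "off": ["AO", "F"],
}


def build_recognition_set(phrases, lexicon):
    """Map each typed phrase to a flat list of phonemes for its words."""
    recognition_set = {}
    for phrase in phrases:
        phonemes = []
        for word in phrase.lower().split():
            if word not in lexicon:
                # Unknown words are reported rather than guessed at.
                raise KeyError(f"no pronunciation for {word!r}")
            phonemes.extend(lexicon[word])
        recognition_set[phrase] = phonemes
    return recognition_set


if __name__ == "__main__":
    for phrase, phonemes in build_recognition_set(["Lights on", "Lights off"], LEXICON).items():
        print(phrase, "->", " ".join(phonemes))
```

Under this assumption, adding a vocabulary item is a matter of typing a new phrase, which is consistent with the few-minutes turnaround described above.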

About Sensory, Inc.

Sensory, Inc., based in Santa Clara, CA, is the world leader in embedded speech technologies. Sensory offers a complete line of IC and software-only solutions for speech recognition, speech & music synthesis, speaker verification and other voice and audio technologies. Sensory’s customers are leaders in consumer electronics and include Avon, Fisher-Price, Hasbro, JVC, Kenwood, Matsushita, Mattel, MGA, Mitsubishi, Radica, Sega, Sharper Image, Sony, Tektronix, Toshiba, Uniden, and many others. Sensory’s Interactive Speech™ line of low-cost ICs includes the award-winning RSC Series (general-purpose microcontrollers for speech I/O), SC Series (music and speech synthesis) and the SVC biometric voice chips. Sensory’s software products are available on a range of hardware platforms from microcontrollers to DSPs. Sensory’s Internet address is http://www.sensoryinc.com

# # #









Quick Solutions to Questions related to RSC-4x Speech Recognition Project:

  • Which new languages are supported by Sensory's RSC-4x line?
    The new languages include German, Spanish, Japanese, Korean, and soon Mandarin Chinese.
  • What components are included inside the RSC-4128 chip?
    The chip includes a 16-bit ADC, DAC, digital filter unit, math unit, 4K RAM, 128K ROM, output amplification, timers, and comparators.
  • How does the Quick T2SI Toolkit create speaker independent sets?
    It allows users to rapidly create sets by simply typing desired vocabulary words as text.
  • Does the RSC-4128 require user training for speaker independent recognition?
    No, the technology uses Hidden Markov Modeling and neural networks to eliminate the need for user training.
  • What is the approximate cost of the RSC-4x ICs in large volumes?
    The chips sell for under $2 in large volumes.
  • Can products using this technology support speech synthesis in any language?
    Yes, speech synthesis is now available in any language to allow products to talk around the world.
  • What hardware is needed to test the speaker independent recognition sets created by the toolkit?
    The sets can be downloaded onto the included RSC-4x Demo/Evaluation board for rapid prototype creation.
  • How does the Quick Synthesis tool simplify speech output development?
    It allows users to record and compress files with the touch of a button without requiring complex marking by linguists.

About The Author

Ibrar Ayyub

I am an experienced technical writer holding a Master's degree in computer science from BZU Multan, Pakistan University. With a background spanning various industries, particularly in home automation and engineering, I have honed my skills in crafting clear and concise content. Proficient in leveraging infographics and diagrams, I strive to simplify complex concepts for readers. My strength lies in thorough research and presenting information in a structured and logical format.
