Text to Speech (TTS) and Speech Recognition (SR)

Many of our applications use TTS or SR through a general interface called SAPI (the Microsoft Speech API). At this time we support only SAPI 4.0, with plans to add SAPI 5.x support in the future. This means you must check that the speech engine you intend to use supports SAPI 4.0.

Text to Speech

This is the process of converting text into spoken words, either for use as a prompt in one of our applications or to be saved as an mp3 or wav file. The speech options you see in the Windows XP Control Panel are for SAPI 5.x engines only. By default, no SAPI 4.0 voices are installed. If you are using a third-party voice set, please confirm that it supports SAPI 4.0.

To check whether any SAPI 4.0 voices are installed, open the NCH Swift Sound application of your choice and select the Text to Speech option. It will list the names of the available voices.

If no voices are listed, please see www.nch.com.au/speech/index.html, which includes downloads for the free Microsoft Speech engines.

Speech Recognition

This is the process of converting a spoken message (a dictation, for example) into text that can be included in a document or email. Once again, we can only use engines that support the SAPI 4.0 interface, but many engines (including Dragon NaturallySpeaking and IBM ViaVoice) support both SAPI 4.0 and 5.x.

For reliable recognition, you must train your speech recognition engine extensively. Please refer to the engine's documentation for details.

You must also configure the NCH application to select the engine and the user. The user is the person who made the original dictation or audio file, not the person who is using the machine. This is especially important for Express Scribe, where you receive dictations from a variety of sources.