

Speech interface for HOAP2
Speech processing software
The first stage was to look for open source programs in the domain of speech recognition and speech synthesis.
Speech system
The second stage consisted of incorporating these programs into an application able to interact with humans, programs and machines. To do this, a client-server architecture was chosen, because it lets the speech system run autonomously and makes it easy to adapt to new speech processing applications.
As the architecture diagram shows, the two entities communicate over the LAN. The microphone and the loudspeaker are plugged into the server, which only handles raw strings. Strings recognised by Sphinx are sent to the client, and strings sent by the client are forwarded to Festival. The exceptions are three commands that the server executes directly: changing the speaker (voice) used by Festival, pausing recognition, and shutting down.
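The routing rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the command keywords (`VOICE`, `PAUSE`, `HALT`), the `ServerState` class and both handler functions are hypothetical names, the pause command is assumed to toggle recognition, and Sphinx and Festival are replaced by plain lists so the logic can be run on its own.

```python
from dataclasses import dataclass, field

# Hypothetical keywords; the real protocol's command strings are not given here.
CMD_VOICE = "VOICE"   # change the speaker (voice) used for synthesis
CMD_PAUSE = "PAUSE"   # pause or resume recognition (toggle is an assumption)
CMD_HALT = "HALT"     # shut the server down


@dataclass
class ServerState:
    voice: str = "default"
    recognition_paused: bool = False
    running: bool = True
    spoken: list = field(default_factory=list)          # stand-in for Festival
    sent_to_client: list = field(default_factory=list)  # stand-in for the socket


def handle_client_string(state: ServerState, text: str) -> None:
    """Route one raw string received from the client."""
    parts = text.strip().split(maxsplit=1)
    if not parts:
        return  # ignore empty messages
    keyword = parts[0]
    if keyword == CMD_VOICE and len(parts) == 2:
        state.voice = parts[1]
    elif keyword == CMD_PAUSE:
        state.recognition_paused = not state.recognition_paused
    elif keyword == CMD_HALT:
        state.running = False
    else:
        # Anything that is not a server command is text to be spoken.
        state.spoken.append((state.voice, text))


def handle_recognised_string(state: ServerState, text: str) -> None:
    """Forward a string from the recogniser to the client unless paused."""
    if not state.recognition_paused:
        state.sent_to_client.append(text)
```

In a real deployment the two lists would be replaced by calls into the synthesiser and by writes to the client socket; the branching itself would stay the same.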

Results
The whole system works correctly. The applications developed are simple to use, and they run reliably, quickly and with little resource usage. Festival met our expectations, but we have to be lenient with Sphinx. For good results, we have to speak distinctly, with an American accent, and above all keep the dictionary small: the smaller the vocabulary, the better the chances of a correct result. Male and female voices are recognised equally well. With an appropriate algorithm, we can handle sentences that were recognised incorrectly and still react the right way.
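The text does not say which algorithm was used to recover from recognition errors. One possible approach, shown here purely as a sketch, is to map each recognised sentence to the closest entry in a small list of known commands using approximate string matching from Python's standard `difflib` module. The command list, the `match_command` function and the 0.6 cutoff are all assumptions made for illustration.

```python
import difflib
from typing import Optional

# Hypothetical command set; a small vocabulary keeps the matching unambiguous.
KNOWN_COMMANDS = ["walk forward", "turn left", "turn right", "stop"]


def match_command(recognised: str, cutoff: float = 0.6) -> Optional[str]:
    """Return the known command closest to the recognised text, or None.

    The cutoff (0..1) is the minimum similarity ratio accepted; sentences
    that resemble no known command are rejected instead of guessed.
    """
    candidates = difflib.get_close_matches(
        recognised.strip().lower(), KNOWN_COMMANDS, n=1, cutoff=cutoff
    )
    return candidates[0] if candidates else None
```

With this sketch, a slightly misrecognised sentence such as "turn lift" still maps to "turn left", while input that resembles no command returns `None` so the application can ask the user to repeat.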
Videos
Interaction with robot HOAP2 (1:41): XviD, 23.0 MB; Cinepak, 45.5 MB
People involved in this project