Thursday, 16 June 2011

The Aussie accent is being captured and recorded all over the country this winter.

More than 30 scientists from 11 Australian universities, including UWA, will be recording the speech patterns of 1,000 adults from 17 different locations in all states and territories.

AusTalk is a major project involving The University of Western Australia, the University of Western Sydney and Macquarie University, together with the Australian Speech Science and Technology Association. It is funded by the Australian Research Council.

The benefits from AusTalk will flow to speech recognition systems and hearing science, including computer aids for hearing-impaired children.

The project is also an opportunity to create a national treasure.

The permanent record of Australian English will become a cultural resource, representing the regional and social diversity and linguistic variation of our language at the beginning of the 21st century, including Australian Aboriginal English.

Language preserved for posterity

Two of the chief investigators are UWA's Associate Professor Roberto Togneri from the School of Electrical, Electronic and Computer Engineering, and Winthrop Professor Mohammed Bennamoun, Head of the School of Computer Science and Software Engineering. Assistant Professor Celeste Rodriguez Louro from Linguistics (School of Humanities) is an associate investigator on the project.

Engineering Honours student Damien Pontifex is a research assistant and he will be supported by research engineer Dr Serajul Haque.

The UWA team is taking the AusTalk project to another dimension by adding a visual element, combining facial expression recognition with speech recognition.

"The visual component will be three-dimensional, so muscle and jaw movement can also be used, as well as facial expression and lip-reading, to take us to the next generation of human-computer interaction," Professor Togneri said.

He and Professor Bennamoun are experts in the fusion of speech processing and computer vision.

"Facial expression takes us beyond the AusTalk project," Professor Bennamoun said. "But it's the next logical step. It will provide extra security for voice-activated systems, such as internet banking."

Dr Haque said a visual dimension would provide better recognition of speech when conditions were adverse. "If background noise interfered with voice reception, the speech could be verified with lip-reading and facial expression," he said.

"Expect to see your next generation devices, such as smart phones and televisions, understand your instructions by voice and gesture rather than having to fiddle with a touchscreen, keyboard or remote," Professor Togneri said. "And they will be getting it right every time, and from everywhere."

He said some people criticised current speech recognition systems, such as automated telephone services. "But they are surprisingly good, as long as you use the words the system is expecting to hear, such as 'billing' or 'complaints'. You have to remember these systems are tuned by their creators for their specific purposes," he said.

Damien, who is supervised by Professors Togneri and Bennamoun, will be conducting the recording of volunteers' voices. Each of them will be recorded on three separate occasions, using both scripted and spontaneous speech. The team hopes to record 100 Western Australians' voices.

Collection of the data will start in the third quarter of 2011 and more participants from WA are needed. If you, your friends or your colleagues can help, please register your interest and find out more about the project by going to https://austalk.edu.au/get-involved-public.html

Later, the database will be expanded to include more age groups, including children, more accents and more ways of speaking.

In the meantime, the UWA team will push the boundaries with its ARC Discovery grant to work on 3D audio-visual speech recognition.

Published in UWA News, 13 June 2011
