Research network aims to enable efficient voice dialog systems for SMEs

Dietrich Klakow, Professor for Spoken Language Systems at Saarland University. Photo: Iris Maurer



Unlike large tech corporations, small and medium-sized companies usually have only a few customer dialogs to draw on when developing chatbots for customer conversations. A research network involving scientists from Saarland University wants to change that. The project partners aim to develop a voice dialog system that works with small amounts of data but ultimately performs just as well as a system from a large IT corporation. The project, “SLIK – Synthesis of Linguistic Korpus Data,” is funded by the German Federal Ministry of Education and Research (BMBF).

Technology is advancing by leaps and bounds. The largest IT companies, such as Google, Amazon and Microsoft, can use vast amounts of data to train their algorithms based on artificial intelligence (AI), for example in the field of speech technology. Anyone who uses these companies’ voice dialog systems, for example Amazon’s “Alexa,” often notices hardly any difference from a human conversation partner.

That is convenient for the users of these systems and, of course, for the tech giants. But what about the countless small and medium-sized enterprises (SMEs) that do not have this huge amount of training data at their disposal? They are at risk of falling behind in the competition with the big players, because they either have to buy in the know-how of the big providers at great expense or spend a lot of effort and money gathering training data and developing their own voice dialog system. When developing an AI-based voice dialog system, a simple principle still often applies: the more data, the better. Building a good chatbot, for example, requires a large amount of customer conversation data. Smaller companies would therefore have to set up call centers, record and transcribe large numbers of customer dialogs, and then use them to train an AI. In most cases, that is neither financially nor organizationally feasible.

This is where a research project involving Dietrich Klakow, Professor of Speech and Signal Processing at Saarland University, comes in. Together with the project partners Kauz GmbH, which is leading the project, and Aristech GmbH, two software companies specializing in speech technology, Dietrich Klakow wants to develop an application that makes it possible to build well-functioning dialog software from very little training data. There are a few things to keep in mind. “Such smaller companies often have a very narrow customer base,” says Dietrich Klakow. A mid-sized industrial company, for instance, has customer conversations with very specific vocabulary and content. Training an AI on these would require recording many dialogs, which is often not an option for the reasons mentioned.

The approach taken by the research trio, made up of the two companies and Dietrich Klakow’s chair, is therefore to combine the best of both worlds. “Kauz develops classic dialog systems, Aristech is strong in speech recognition, and my chair specializes in artificial intelligence for speech technology,” adds Dietrich Klakow. The partners’ goal is to make a virtue of necessity by turning a small amount of training data into a large one. “Data augmentation,” the artificial multiplication of data, is to draw both on the linguistic knowledge of Kauz and Aristech and on methods developed by Klakow’s chair. It is intended to create the foundation for building up large “corpora,” i.e., large pools of usable speech phrases. With these, an artificial intelligence can then learn just as well as one trained on the “naturally obtained” corpora that are available to large companies.
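To illustrate the general idea of data augmentation, here is a minimal Python sketch of one simple technique, template-based slot filling. It is not the method used in the SLIK project; the seed sentences, slot names, and the `augment` function are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical seed utterances with slot markers in curly braces.
# A real system would derive these from a handful of recorded dialogs.
SEED_TEMPLATES = [
    "I would like to {action} my {item}",
    "Can you help me {action} the {item}",
]

# Hypothetical domain vocabulary for each slot.
SLOT_VALUES = {
    "action": ["order", "return", "track"],
    "item": ["spare part", "invoice"],
}


def augment(templates, slot_values):
    """Expand every template with every combination of its slot values."""
    utterances = []
    slot_names = sorted(slot_values)
    for template in templates:
        # Only fill the slots that actually occur in this template.
        used = [name for name in slot_names if "{" + name + "}" in template]
        for combo in product(*(slot_values[name] for name in used)):
            utterances.append(template.format(**dict(zip(used, combo))))
    return utterances


corpus = augment(SEED_TEMPLATES, SLOT_VALUES)
print(len(corpus))  # 2 templates x 3 actions x 2 items = 12
```

Two seed sentences grow into twelve training utterances here. Approaches based on linguistic rules or learned models can produce far more varied output than this kind of simple substitution.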

The advantage of the classic method of developing “non-learning” software is that it can be tailored very closely to individual customer requirements. The downside: “Such dialog systems are not perfect in all situations,” explains Dietrich Klakow. “If we now link this classic approach with an AI-based system, we could ideally build a new dialog system from very little training data that can work well in many conversational situations that are relevant for SMEs.” The vision of the cooperation partners is ambitious. In the end, they hope to design a solid voice dialog system from very few sample dialogs (ten, twenty or thirty) so that customers at the “other end” of the dialog will not be able to tell the difference between it and a system from the tech giants.

Whether the team has succeeded will become clear in the spring of 2024. The “SLIK – Synthesis of Linguistic Korpus Data” project, which started this spring, runs until then. Of the 1.44 million euros in funding from the Federal Ministry of Education and Research, around 300,000 euros will go to Saarland University.

Further information: https://slik-projekt.de

Questions can be directed to:

Prof. Dr. Dietrich Klakow
Tel.: (0681) 30258122
Email: dietrich.klakow(at)lsv.uni-saarland.de

 

Editor (German):
Thorsten Mohr
Tel.: 0681 302-2648
Email: presse.mohr(at)uni-saarland.de

Saarland University
Press and public relations
Campus, Building A2 3
66123 Saarbrücken

Translation: Saarland Informatics Campus