Improving AI: Computer scientists detect weaknesses in machine learning algorithms


Computer scientists Jonas Fischer and Michael Hedderich developed software that can point out weaknesses in highly complex machine learning algorithms


Machine learning is the biggest revolution in computer science in decades. Thanks to learning algorithms, computers can perform sensational feats even on abstract tasks. But, like humans, computers make mistakes in the process – and understanding why a machine learning algorithm makes certain errors is one of the key challenges of modern computer science. This is where Michael Hedderich and Jonas Fischer come in with their research. They have developed software that can detect weaknesses in highly complex machine learning algorithms and thus help to correct them.

Thanks to machine learning algorithms, computers can perform astounding feats, even in domains that were previously considered exclusively human – such as language and the fine arts. These computational methods are based on so-called artificial neural networks. “These are networks of mathematical functions that weight an input based on certain customizable parameters and generate an output from it,” explains Michael Hedderich, a researcher at Saarland University in Germany and Cornell University in the United States. These functions, called neurons, are connected in series and trained with the help of data, so that computers can, for example, pick out the cats in millions of photos or produce deceptively realistic dialogs with people.
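To make the idea of a "network of mathematical functions" concrete, here is a minimal sketch in Python. It is an illustration only: the function names, the sigmoid activation and the hand-picked weights are assumptions made for this example, and in a real network the parameters would be learned from training data rather than written down.

```python
import math

def neuron(inputs, weights, bias):
    """One 'neuron': weight each input, add a bias, squash the sum to (0, 1)."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-total))  # sigmoid activation (assumed here)

def tiny_network(inputs):
    """Two neurons feeding a third, i.e. functions connected in series."""
    # Hypothetical parameters chosen by hand for illustration;
    # training would adjust these values using data.
    hidden = [
        neuron(inputs, [0.5, -0.25], 0.0),
        neuron(inputs, [-1.0, 1.0], 0.5),
    ]
    return neuron(hidden, [1.0, -1.0], 0.0)

print(tiny_network([1.0, 2.0]))
```

This toy network has nine parameters (six weights for the two hidden neurons, two weights for the output neuron, plus three biases, one of which is non-zero). The same principle, scaled up to billions of parameters, is what makes large models so hard to inspect by hand.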

“One of the most advanced and currently much-cited text synthesis algorithms in the world, OpenAI’s GPT-3, processes input using 175 billion parameters before a result is output. It’s almost impossible for a human to follow this and understand where errors are happening,” says Jonas Fischer, who is currently a postdoctoral researcher at Harvard University. The previous state of the art was to analyze the output of a machine learning algorithm for errors and list these errors one by one. It was then up to experts to find patterns in these error data sets, which could easily contain thousands of entries. “In our new ‘PyPremise’ software, we use data mining techniques to automatically search these error data sets for specific patterns and output them bundled together at the end as understandable ‘error categories’. So instead of enumerating each error individually, our software is able to summarize errors on a more abstract level and make statements like: ‘Your ML algorithm has problems with formulations containing the question “How much”. This can be seen from the incorrect outputs in cases X, Y and Z’,” explains Michael Hedderich.
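The general idea of summarizing individual errors into categories can be sketched in a few lines of Python. This is a deliberately simplified stand-in and not the method or the interface of PyPremise: the function name, the thresholds and the sample sentences are all hypothetical, and the sketch only looks at single words, flagging those that occur mostly in inputs the model got wrong.

```python
from collections import Counter

def error_patterns(samples, min_support=2, min_error_rate=0.8):
    """Find words that appear mostly in misclassified inputs.

    samples: list of (text, is_error) pairs, where is_error marks
    inputs on which the model produced a wrong output.
    Returns (word, error_rate, supporting_examples) tuples.
    """
    total = Counter()   # how often each word appears at all
    errors = Counter()  # how often it appears in an erroneous input
    examples = {}       # erroneous inputs that contain the word
    for text, is_error in samples:
        for token in set(text.lower().split()):
            total[token] += 1
            if is_error:
                errors[token] += 1
                examples.setdefault(token, []).append(text)

    patterns = []
    for token, count in total.items():
        rate = errors[token] / count
        # Keep words seen in enough errors and rarely in correct cases.
        if errors[token] >= min_support and rate >= min_error_rate:
            patterns.append((token, rate, examples[token]))
    return sorted(patterns, key=lambda p: (-p[1], p[0]))

# Hypothetical evaluation results of a question-answering model.
samples = [
    ("how much is the ticket", True),
    ("how much does it weigh", True),
    ("how tall is the tower", False),
    ("where is the station", False),
    ("who wrote the book", False),
]

for word, rate, cases in error_patterns(samples):
    print(f"Problems with inputs containing '{word}' "
          f"({rate:.0%} wrong), e.g. {cases}")
```

On this toy data the sketch reports only the word "much", together with the two inputs that support the finding. A real tool has to go further, for instance by combining several features into one pattern and by coping with thousands of entries and many properties per data point.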

The researchers from Saarbrücken tested their software on both synthetic and real-world data sets. In the process, they were able to show that their method scales to very large data sets with many different properties of the individual data points and delivers reliable results. “The information thus obtained about the weak points of a machine learning algorithm can then be used by operators to revise their training data, for example, and thus correct errors in the system,” explains Jonas Fischer. The software tool developed by the two computer scientists initially only applies to algorithms in the field of natural language processing. However, one of their goals is to expand the tool so that it can be applied to other domains as well.

Michael Hedderich is a computer scientist working at Cornell University as well as in the research group “Spoken Language Systems” of computational linguistics professor Dietrich Klakow at Saarland University. Jonas Fischer earned his doctorate from Saarland University last summer, after having pursued his Ph.D. at the Max Planck Institute for Informatics under the supervision of Professor Jilles Vreeken of the CISPA Helmholtz Center for Information Security. Fischer is now a postdoctoral researcher at Harvard University. The researchers first presented the scientific foundations of the software in July 2022 at the International Conference on Machine Learning (ICML), one of the world’s largest and most prestigious conferences in this field, where in general only about one-fifth of the scientific papers submitted are accepted for presentation.


Link to the freely available software PyPremise:


Original publication from July 2022:

“Label-Descriptive Patterns and Their Application to Characterizing Classification Errors”; Michael A. Hedderich, Jonas Fischer, Dietrich Klakow, Jilles Vreeken; Proceedings of the 39th International Conference on Machine Learning, PMLR 162:8691-8707, 2022.


Questions can be directed to:

Dr. Michael Hedderich
Universität des Saarlandes
Tel.: +16073272574


Dr. Jonas Fischer
Harvard University


Background Saarland Informatics Campus:
900 scientists (including 400 PhD students) and about 2500 students from more than 80 nations make the Saarland Informatics Campus (SIC) one of the leading locations for computer science in Germany and Europe. Four world-renowned research institutes, namely the German Research Center for Artificial Intelligence (DFKI), the Max Planck Institute for Informatics, the Max Planck Institute for Software Systems and the Center for Bioinformatics, together with Saarland University and its three departments and 24 degree programs, cover the entire spectrum of computer science.


Philipp Zapf-Schramm
Saarland Informatics Campus
Phone: +49 681 302-70741
