Robust Machine Learning based on Information Fusion – State-of-the-art Research at the Institute Industrial IT
Industrial sensor systems are inherently prone to uncertainties, vagueness, errors, and drift. There is always a certain degree of possibility that measurements do not tell the whole truth. For example, sensor readings may be affected by noise, altered by a sensor’s age, or compromised by environmental changes (such as temperature changes or wear and tear of the observed system). For machine learning algorithms, the susceptibility of sensors (or, more generally, information sources) to errors and drift poses a significant challenge. Most often, machine learners are trained on pre-recorded offline data, and the learned model is then applied to live streaming data. Especially in industrial applications, offline and live data may differ in detail due to this drift or erroneous sensor behaviour. At the Institute Industrial IT we research combinations of machine learning and information fusion techniques to increase the robustness of learned models in such situations.
Figure 1: Example of a multi-sensor system monitoring a printing process and a taxonomy of the main types of sensor errors.
Robust Machine Learning
For some modern machine learning algorithms there are approaches to mitigate the lack of robustness. For example, training neural networks with dropout increases the robustness of the trained model (besides reducing the risk of overfitting). Other learners, such as random forests, are inherently robust, at least to a certain degree. These existing approaches tackle the problem of robustness for one specific learner only. Information fusion offers a general approach which, in principle, can be used with any learner.
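To make the dropout idea concrete, the following is a minimal sketch of so-called inverted dropout on a list of activations. The function name `dropout`, the example values in `hidden`, and the rate of 0.5 are illustrative assumptions and are not taken from the work described here.

```python
import random

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: during training, zero each activation with
    probability `rate` and rescale the survivors by 1 / (1 - rate)."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        # At inference time the activations pass through unchanged.
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(42)                      # fixed seed for repeatability
hidden = [0.5, 1.2, -0.3, 0.8, 2.0, -1.1]    # hypothetical hidden-layer outputs

train_out = dropout(hidden, rate=0.5, rng=rng)                  # some units silenced
infer_out = dropout(hidden, rate=0.5, rng=rng, training=False)  # unchanged
```

Because the network cannot rely on any single unit being present during training, it tends to spread information across units, which is one intuition for the robustness gain mentioned above.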
The Benefit of Information Fusion
The main idea is to exploit the redundancy inherent to multi-sensor systems. Industrial sensor systems often provide at least partly redundant information, either by design or due to interrelations in the monitored process. In an orchestration step, the multi-sensor system is analysed for redundant information. Information from redundant sensors is then fused by fusion algorithms specifically designed to identify drifts or defects. In this way, defects are buffered by the fusion step. Only then is the fused information passed to the machine learning algorithm, which thus has a more stable foundation to work with. In [1] we have shown that, by fusing information, machine learners lose less performance in case of sensor defects.
Figure 2: Information Processing. In the sensor orchestration step, redundant sensors are identified. Defective or drifted sensor data is buffered in the fusion step. Finally, the machine learner trains as usual on the fused data, which is less affected by errors.
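The buffering effect of the fusion step can be illustrated with a very small sketch. It assumes that a group of redundant sensors has already been identified and uses the median as a simple stand-in fusion rule; this is not the fusion algorithm used in our work, and the sensor values, the function names, and the tolerance are hypothetical.

```python
import statistics

def fuse_redundant(readings):
    """Fuse simultaneous readings of redundant sensors with the median,
    which ignores a single outlying sensor in a group of three."""
    return statistics.median(readings)

def flag_suspects(readings, tolerance):
    """Return indices of sensors deviating from the fused value by more
    than `tolerance` (candidates for drift or defect)."""
    fused = fuse_redundant(readings)
    return [i for i, r in enumerate(readings) if abs(r - fused) > tolerance]

# Three hypothetical redundant temperature sensors; the third one drifts.
samples = [
    [20.1, 19.9, 20.0],
    [20.2, 20.0, 23.5],
    [20.0, 20.1, 27.0],
]

fused_series = [fuse_redundant(s) for s in samples]            # input for the learner
naive_series = [statistics.mean(s) for s in samples]           # plain averaging, for comparison
suspects = [flag_suspects(s, tolerance=1.0) for s in samples]  # drifted sensor indices
```

In this toy example the fused series stays close to 20 although one sensor drifts by several degrees, whereas the plain average is pulled upwards. The machine learner would be trained and applied on `fused_series` instead of the raw readings.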
A particular challenge in this fusion approach is to identify the redundancies between sensors correctly. Guyon et al. argue that correlation does not imply redundancy [2]; a simple correlation is not enough to quantify redundancy accurately. Furthermore, industrial machines tend to produce repetitive data, which very easily results in spurious correlations [3]. The presented fusion approach, however, relies on correctly identified redundancies: spurious correlations or redundancies would lead to a decrease in performance. Therefore, we present in [4] a redundancy metric based on possibility theory which is specifically designed both to work with uncertainty-prone sensors and to be cautious in light of spurious correlations.
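Why repetitive data invites spurious correlations can be sketched with two simulated signals. The example assumes two physically unrelated components that happen to share the same machine cycle in the recorded data; later, one of them runs at a different speed. Pearson correlation is used only for illustration, the possibility-theoretic redundancy metric mentioned above is not reproduced here, and all signal names and parameters are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def cyclic_signal(period, gain, offset, steps, rng):
    """Noisy sinusoid as a stand-in for a repetitive machine signal."""
    return [offset + gain * math.sin(2 * math.pi * k / period) + rng.gauss(0, 0.1)
            for k in range(steps)]

rng = random.Random(0)  # fixed seed for repeatability

# Recorded data: both components follow the same 20-step machine cycle.
motor_recorded = cyclic_signal(20, 1.0, 0.0, 200, rng)
fan_recorded = cyclic_signal(20, 2.0, 5.0, 200, rng)
r_recorded = pearson(motor_recorded, fan_recorded)  # close to 1

# Later operation: the second component now runs on a 25-step cycle.
motor_later = cyclic_signal(20, 1.0, 0.0, 200, rng)
fan_later = cyclic_signal(25, 2.0, 5.0, 200, rng)
r_later = pearson(motor_later, fan_later)           # close to 0
```

A correlation-based check on the recorded data alone would declare the two signals redundant, although neither can stand in for the other once operating conditions change. Fusing them would then degrade rather than stabilise the input of the learner.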
[1] Holst, Christoph-Alexander, and Volker Lohweg. "Feature fusion to increase the robustness of machine learners in industrial environments." at-Automatisierungstechnik 67.10 (2019): 853-865.
[2] Guyon, Isabelle, et al., eds. Feature extraction: foundations and applications. Vol. 207. Springer, 2008.
[3] Calude, Cristian S., and Giuseppe Longo. "The deluge of spurious correlations in big data." Foundations of science 22.3 (2017): 595-612.
[4] Holst, Christoph-Alexander, and Volker Lohweg. "A Redundancy Metric Set within Possibility Theory for Multi-Sensor Systems." Sensors 21.7 (2021): 2508.
Funding: This research was partly funded by the German Federal Ministry of Education and Research (BMBF) within the project ITS.ML, grant number 01IS18041D.
Christoph-Alexander Holst, M. Sc.
Research Group Manager
Tel.: +49 5261 702 - 5592
inIT - Institute Industrial IT
Technische Hochschule Ostwestfalen-Lippe
Campusallee 6, 32657 Lemgo