"Sparsity and Machine Learning - no contradiction"
Until the last decade, signal processing systems were designed using conventional model-based approaches, which rely on mathematical and physical models of real-world phenomena. Holistic modeling of real-world problems is usually too complex to handle analytically; therefore, the overall problem is typically divided into mathematically tractable subproblems.
Analogously, signal processing aims to decompose complex signals into elementary functions, such as sinusoids, that are easier to manipulate. Due to these simplifications, handcrafted mathematical models are valid only under certain assumptions, i.e., they may work perfectly in a simulated environment, but their performance degrades significantly under real-world conditions. These limitations can be addressed by the discipline of machine learning (ML), which builds mathematical models by recognizing patterns in real-world measurement data.
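As a minimal illustration of the decomposition idea, the following Python snippet breaks a synthetic signal into its sinusoidal components with the fast Fourier transform; the signal, sampling rate, and threshold are arbitrary placeholders and not related to the project's data.

    import numpy as np

    # Toy decomposition of a signal into sinusoids via the FFT;
    # frequencies and amplitudes below are arbitrary placeholders.
    fs = 1000
    t = np.arange(0, 1, 1 / fs)                       # 1 s sampled at 1 kHz
    x = 2 * np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 120 * t)

    spectrum = np.fft.rfft(x) / (len(x) / 2)          # amplitude per frequency bin
    freqs = np.fft.rfftfreq(len(x), d=1 / fs)
    peaks = freqs[np.abs(spectrum) > 0.1]             # the two sinusoidal components
    print(peaks)                                      # -> [ 50. 120.]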
The planned joint JKU-SAL embedded Signal Processing and Machine Learning (eSPML) Lab aims to bring these two research fields together in order to provide hybrid solutions that complement and/or replace model-based advanced signal processing algorithms. The first pre-project, introduced below, serves as a warm-up for both sides. Thanks to it, I could cooperate with researchers such as Venkata Pathuri Bhuvana, who also belongs to the conventional signal processing school. At the same time, it was a great opportunity to share our expertise with other SAL researchers such as Bernhard Lehner and Christian Huber, who represent the machine learning discipline in our Lab. This diversity of scientific backgrounds can trigger new ideas and solutions for existing and future problems.
The first pre-project is related to the design of sparsity-aware signal processing systems. The concept of sparsity has been studied for nearly a century, but it revealed its true potential with the advent of compressed sensing in 2006. The theory suggests that a sparse signal can be reconstructed from only a few measured values, i.e., from samples taken below the fundamental Nyquist rate. Another interpretation of sparsity is given by Occam's razor: "among competing representations that predict equally well, the one with the fewest components should be selected".
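As a toy illustration of this principle, the following sketch recovers a synthetic sparse signal from far fewer random linear measurements than its length, using an L1-regularized least-squares fit (scikit-learn's Lasso) as a stand-in for the reconstruction algorithms studied in compressed sensing; all dimensions and the regularization weight are hypothetical.

    import numpy as np
    from sklearn.linear_model import Lasso

    # Hypothetical compressed sensing example: recover a sparse signal
    # from far fewer measurements than its length.
    rng = np.random.default_rng(0)
    n, m, k = 256, 64, 5                       # signal length, measurements, nonzeros

    x_true = np.zeros(n)                       # sparse ground-truth signal
    support = rng.choice(n, k, replace=False)
    x_true[support] = rng.normal(size=k)

    A = rng.normal(size=(m, n)) / np.sqrt(m)   # random measurement matrix
    y = A @ x_true                             # m << n linear measurements

    # L1-regularized least squares promotes sparsity in the solution.
    lasso = Lasso(alpha=1e-3, fit_intercept=False, max_iter=10000)
    lasso.fit(A, y)
    x_hat = lasso.coef_

    print("relative recovery error:",
          np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))

The L1 penalty acts here as a convex surrogate for directly counting nonzero components, which is what makes such sub-Nyquist reconstruction computationally feasible.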
Implementing sparse signal estimators is a challenging task. Model-based approaches sometimes oversimplify real-world problems in exchange for mathematically tractable solutions. Another drawback is their considerably higher computational complexity, which usually prevents real-time applications. These algorithms also require an initial estimate of the maximum number of signal components, which is not known in advance. Machine learning approaches provide a good alternative for addressing these challenges by building data-driven mathematical models.
These algorithms use a two-stage methodology to explore patterns, such as sparsity, hidden in the given data. In the so-called training phase, the algorithms learn to recognize relevant patterns and identify trends in the training data set. This is followed by the prediction phase, in which the trained model is applied to new, unseen data that was not provided during training. This way, the main computational load is shifted to the offline training phase, which permits real-time applications in the prediction phase, even on embedded hardware. One of the main advantages of machine learning is that the internal structure of the data, such as sparsity, is explored and learned automatically by computer algorithms. However, achieving this goal requires a huge amount of data, which is sometimes difficult to collect.
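The sketch below illustrates this two-stage workflow with off-the-shelf tools: an expensive fit is performed once, offline, and the stored model is then applied to unseen inputs at low cost. The model type, the data, and the file name are placeholders, not the methods used in the project.

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from joblib import dump, load

    # Offline training phase: the computationally heavy part, run once.
    X_train = np.random.rand(1000, 16)             # placeholder features
    y_train = X_train @ np.random.rand(16)         # placeholder targets
    model = RandomForestRegressor(n_estimators=50).fit(X_train, y_train)
    dump(model, "model.joblib")                    # store the learned model

    # Prediction phase: cheap enough for real-time / embedded use.
    deployed = load("model.joblib")
    x_new = np.random.rand(1, 16)                  # a new, unseen measurement
    print(deployed.predict(x_new))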
Medical imaging is a typical example, where the internal structure of the human body has to be predicted in order to detect anomalies such as tumors. The training data may comprise MRI images or CT scans, while the desired output can be the position and size of a tumor. These are expensive imaging modalities, and medical experts are required to manually localize the tumors. Therefore, creating training examples for machine learning algorithms is limited by both financial and human resources. In order to overcome these problems in our project, we utilize the best properties of the model-based and the machine learning principles: we first reduce the amount of training data by applying model-based sparse estimation techniques, whose output is then further processed by machine learning methods.
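A hypothetical version of such a hybrid pipeline is sketched below: a model-based L1 estimator first compresses each raw measurement into a sparse set of coefficients, and a standard classifier is then trained on those compact features. The DCT dictionary, the classifier, and all data are illustrative placeholders rather than the project's actual algorithms.

    import numpy as np
    from scipy.fft import idct
    from sklearn.linear_model import Lasso
    from sklearn.svm import SVC

    n = 128
    D = idct(np.eye(n), norm="ortho")               # DCT synthesis dictionary (placeholder)

    def sparse_features(signal, alpha=1e-2):
        """Model-based step: sparse coefficients describing one measurement."""
        est = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000).fit(D, signal)
        return est.coef_                            # mostly zeros -> compact description

    rng = np.random.default_rng(1)
    signals = rng.normal(size=(40, n))              # placeholder raw measurements
    labels = rng.integers(0, 2, size=40)            # placeholder defect / no-defect labels

    X = np.array([sparse_features(s) for s in signals])
    clf = SVC().fit(X, labels)                      # machine learning step on sparse features
    print(clf.predict(X[:3]))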
In this project, we consider the problem of non-destructive testing (NDT) as a case study. Analyzing structural imperfections of materials, spare parts, or system components is important for preventing device malfunctions. This can be done by active thermography, which, besides non-destructive testing, has several other applications such as structural health monitoring, material characterization, and thermal imaging in medicine. In thermographic imaging, the specimen is heated up by flash lamps, lasers, etc., and the resulting temperature evolution is measured on its surface. The measured thermal pattern is used to reconstruct the heat distribution inside the material, which provides the main information for revealing the internal structure of the specimen and for detecting anomalies such as defects, cracks, or corrosion.
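For a flavor of the underlying physics, the sketch below evaluates the classical one-dimensional solution for the surface temperature of a defect-free slab after an instantaneous flash; the material parameters are arbitrary placeholders, and the project's actual reconstruction of internal defects is of course far more involved.

    import numpy as np

    # Textbook 1D forward model for flash thermography (not the project's
    # reconstruction algorithm): front-surface temperature rise of a
    # defect-free slab of thickness L after an instantaneous flash.
    def surface_temperature(t, L=2e-3, alpha=5e-6, Q=1000.0, rho_c=3.6e6, n_terms=50):
        """Temperature rise (K) at times t (s); all parameters are placeholders."""
        T = np.ones_like(t)
        for n in range(1, n_terms + 1):
            T += 2.0 * np.exp(-(n * np.pi / L) ** 2 * alpha * t)
        return Q / (rho_c * L) * T

    t = np.linspace(1e-3, 2.0, 500)        # measurement time grid after the flash
    print(surface_temperature(t)[:5])      # simulated cooling curve seen by the camera

Deviations of the measured cooling curves from such a defect-free model are exactly the kind of signature that the reconstruction step exploits to locate internal anomalies.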
The concept of sparsity applies very well to NDT, since the previously mentioned anomalies occur rarely inside the material. In most test cases there are only a few defects compared to the full volume, or no defects at all. The project is already at an advanced stage: we have high-quality results that will be presented at international conferences and will shortly be submitted to international journals. Since the combination of sparsity-aware signal processing and machine learning is quite new in thermographic imaging, the results can have a high impact on the previously mentioned fields of application and can open the way for innovative future technologies.