TY - JOUR
T1 - Constrained Tiny Machine Learning for predicting gas concentration with I4.0 low-cost sensors
AU - Adoui, Mohammed El
AU - Herpoel, Thomas
AU - Frénay, Benoît
N1 - Publisher Copyright:
© 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM.
PY - 2023
Y1 - 2023
N2 - Low-cost gas sensors (LCS) often produce inaccurate measurements due to varying environmental conditions that are not consistent with laboratory settings, leading to inadequate productivity levels compared to high-quality sensors. To address this issue, we propose the use of Machine Learning (ML) to predict accurate concentrations of pollutant gases acquired by LCS integrated into an embedded Internet of Things platform. However, a key challenge is to optimize an accurate ML design under low memory and computation power constraints of microcontrollers (MCUs) while maintaining accurate ML scores. After data analysis and pre-processing, we assess and analyze the performance of five ML algorithms to predict the concentration of pollutants gases from multiple specifications (weather, presence of other gases, etc.). To support the experiments, datasets from three sources are used: (1) VOCSens, (2) Belgian Interregional Environment Agency cell, and (3) Visual-Crossing. Once the best model was optimized and validated, multiple hard constraints were added to the selected ML structure to satisfy material and expert requirements. Trained models were ported to be implemented locally in a MCU after comparing several porting libraries. The assembled code obtained is evaluated based on two metrics: storage memory consumption and inference time, relative to the highest attainable capacities. The improved random forest is the best ML model for the used dataset with an R2 score meeting of 0.72 and Root Means Square Error of 0.0028 ppm. The best generated Tiny-ML model needs 3% of RAM and 98% of Flash storage. The empirical results prove that the developed ML algorithm applied to LCS provides high accuracy to predict pollutant gases. This algorithm can also be used to adjust the LCS systems to provide calibrated data in real time, even if the platform being used is not particularly advanced or powerful.
AB - Low-cost gas sensors (LCS) often produce inaccurate measurements due to varying environmental conditions that are not consistent with laboratory settings, leading to inadequate productivity levels compared to high-quality sensors. To address this issue, we propose the use of Machine Learning (ML) to predict accurate concentrations of pollutant gases acquired by LCS integrated into an embedded Internet of Things platform. However, a key challenge is to optimize an accurate ML design under low memory and computation power constraints of microcontrollers (MCUs) while maintaining accurate ML scores. After data analysis and pre-processing, we assess and analyze the performance of five ML algorithms to predict the concentration of pollutants gases from multiple specifications (weather, presence of other gases, etc.). To support the experiments, datasets from three sources are used: (1) VOCSens, (2) Belgian Interregional Environment Agency cell, and (3) Visual-Crossing. Once the best model was optimized and validated, multiple hard constraints were added to the selected ML structure to satisfy material and expert requirements. Trained models were ported to be implemented locally in a MCU after comparing several porting libraries. The assembled code obtained is evaluated based on two metrics: storage memory consumption and inference time, relative to the highest attainable capacities. The improved random forest is the best ML model for the used dataset with an R2 score meeting of 0.72 and Root Means Square Error of 0.0028 ppm. The best generated Tiny-ML model needs 3% of RAM and 98% of Flash storage. The empirical results prove that the developed ML algorithm applied to LCS provides high accuracy to predict pollutant gases. This algorithm can also be used to adjust the LCS systems to provide calibrated data in real time, even if the platform being used is not particularly advanced or powerful.
KW - Constrained machine learning
KW - TinyML
KW - calibration
KW - gas concentration
KW - low-cost sensors
KW - random forest
KW - regression
UR - http://www.scopus.com/inward/record.url?scp=85193499214&partnerID=8YFLogxK
U2 - 10.1145/3590956
DO - 10.1145/3590956
M3 - Article
SN - 1539-9087
VL - 23
JO - ACM Transactions on Embedded Computing Systems
JF - ACM Transactions on Embedded Computing Systems
IS - 3
M1 - 51
ER -