Publication:
Comparativa de modelos de random forest y redes neuronales aplicados al mantenimiento predictivo con valores ausentes y datos desbalanceados

Publication Date
2021-07
Abstract
This paper describes the workflow followed to solve a predictive maintenance problem: using machine learning techniques to predict whether a specific component of the Air Pressure System of a heavy truck is facing an imminent failure. The problem is modeled as a classification task, since the objective is to determine whether or not an unobserved instance represents a failure. Several classification algorithms are evaluated, and the paper investigates how to deal with an imbalanced dataset containing a large number of missing values. The approach consists of four steps: (i) the creation of three different datasets by applying various data processing techniques; (ii) the creation of several machine learning models; (iii) the tuning of their hyperparameters and of the probability threshold used for predictions; and (iv) the comparison of the models' results on the created datasets to determine the best solution. The results show that appropriate imputation of missing values and adjustment of the probability threshold are key factors in improving classifier performance.
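The abstract does not specify the exact techniques used at each step, so the following is only a minimal sketch in Python of the kind of pipeline it describes, built with scikit-learn and imbalanced-learn. The synthetic data, the choice of IterativeImputer for imputation, SMOTE for rebalancing, and F1 as the criterion for picking the probability threshold are all illustrative assumptions, not the authors' actual configuration:

import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score
from imblearn.over_sampling import SMOTE

# Hypothetical data standing in for the truck dataset: features with missing
# values and a heavily imbalanced target (~2% failures).
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 20))
X[rng.random(X.shape) < 0.1] = np.nan
y = (rng.random(2000) < 0.02).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# (i) Data processing: impute missing values with multivariate imputation
# and rebalance the training set by oversampling the minority class.
imputer = IterativeImputer(random_state=0)
X_train_imp = imputer.fit_transform(X_train)
X_test_imp = imputer.transform(X_test)
X_res, y_res = SMOTE(random_state=0).fit_resample(X_train_imp, y_train)

# (ii) Fit one of the compared models, here a random forest.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_res, y_res)

# (iii) Adjust the probability threshold instead of using the default 0.5:
# sweep candidate thresholds and keep the one with the best F1 score.
proba = clf.predict_proba(X_test_imp)[:, 1]
thresholds = np.linspace(0.05, 0.95, 19)
scores = [f1_score(y_test, (proba >= t).astype(int)) for t in thresholds]
best_t = thresholds[int(np.argmax(scores))]
print(f"best threshold = {best_t:.2f}, F1 = {max(scores):.3f}")

In practice the threshold would be chosen on a separate validation fold, and step (iv) would repeat this procedure for each model and dataset variant before comparing the resulting scores.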
Description
Grade: 9.5