Non-invasive diagnosis of human diseases by combining breath analysis and neural network modeling

Cancilla Buenache, Juan Carlos

Files

T38982.pdf (4.78 MB)

Publication Date

2017-06-27

Authors

Cancilla Buenache, Juan Carlos

Advisors (or tutors)

Torrecilla Velasco, José Santiago

Publisher

Universidad Complutense de Madrid

Abstract

It is well established that the moment at which a disease is detected or diagnosed directly affects its consequences for the patient, since early detection is generally linked to a more favorable outcome. This concept is the basis of the present research, whose main goal is the development of mathematical tools based on computational artificial intelligence that detect multiple diseases safely and non-invasively. To develop these tools, the research has focused on the breath analysis of patients with diverse diseases, using several analytical methodologies to extract the information contained in the samples, and multiple feature selection algorithms and neural networks for data analysis. Earlier work has shown a correlation between the molecular composition of breath and the clinical status of a human being, demonstrating the existence of volatile biomarkers whose presence or amount can aid in disease detection. During this research, two main types of analytical approaches were employed to study the gaseous samples: cross-reactive sensor arrays (based on organically functionalized silicon nanowire field-effect transistors (SiNW FETs) or gold nanoparticles (GNPs)) and proton transfer reaction-mass spectrometry (PTR-MS). The cross-reactive sensors analyze the breath sample as a whole, offering global, fingerprint-like information, whereas PTR-MS quantifies the volatile molecules present in the samples. All of the analytical equipment employed generates large amounts of data per sample, so a meticulous mathematical analysis is needed to interpret the results adequately. In this work, two fundamental types of mathematical tools were utilized.
First, a set of five filter-based feature selection algorithms (χ2 (chi2) score, Fisher’s discriminant ratio, Kruskal-Wallis test, Relief-F algorithm, and information gain test) was employed to reduce the independent variables in the large databases to those with the greatest discriminative power for a subsequent modeling task. Second, for the mathematical modeling itself, artificial neural networks (ANNs), algorithms categorized as computational artificial intelligence, were employed. These non-linear tools were used to find the relations between the independent and dependent variables of a system in order to perform estimations or classifications. The type of ANN used in this thesis is the one most commonly employed in research, the supervised multilayer perceptron (MLP), chosen for its proven ability to create reliable models for many different applications...
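As an illustration of what a filter-based feature selector does, the following is a minimal sketch of one of the five filters named above, Fisher's discriminant ratio, for a two-class problem. It is not the thesis's implementation: the sensor names, the readings, and the helper functions `fisher_ratio` and `rank_features` are hypothetical, and the two-class form (squared difference of class means divided by the sum of class variances) is assumed here for simplicity.

```python
from statistics import mean, pvariance


def fisher_ratio(values, labels):
    """Two-class Fisher discriminant ratio for a single feature.

    Computed as (mean0 - mean1)^2 / (var0 + var1); larger values mean
    the feature separates the two classes more clearly.
    """
    group0 = [v for v, y in zip(values, labels) if y == 0]
    group1 = [v for v, y in zip(values, labels) if y == 1]
    gap = (mean(group0) - mean(group1)) ** 2
    spread = pvariance(group0) + pvariance(group1)
    if spread == 0:
        # Degenerate case: no within-class variation at all.
        return float("inf") if gap > 0 else 0.0
    return gap / spread


def rank_features(rows, labels, names):
    """Score every column independently and sort from best to worst."""
    scored = [
        (name, fisher_ratio([row[j] for row in rows], labels))
        for j, name in enumerate(names)
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical readings: 6 breath samples x 3 sensor features.
# Labels: 0 = control, 1 = patient (made-up data for illustration only).
rows = [
    [1.0, 5.0, 0.2],
    [1.2, 4.8, 0.9],
    [0.9, 5.1, 0.4],
    [3.0, 5.0, 0.8],
    [3.1, 4.9, 0.3],
    [2.9, 5.2, 0.6],
]
labels = [0, 0, 0, 1, 1, 1]
names = ["sensor_0", "sensor_1", "sensor_2"]

ranked = rank_features(rows, labels, names)
scores = dict(ranked)
ranking = [name for name, _ in ranked]
print(ranking)
```

Because a filter scores each variable independently of any model, the top-ranked variables can be kept and the rest discarded before a classifier is trained.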
It is currently known that there is a direct relation between the moment at which a disease is detected or diagnosed and the consequences it will have for the patient, since early detection is generally linked to a more favorable course. This concept is the foundation of the present research, whose fundamental objective is the development of tools based on computational artificial intelligence that achieve, by safe and non-invasive means, the detection of various diseases. To obtain such systems, the studies focused on the analysis of breath samples from patients with various diseases, employing several techniques to extract information, and various variable selection algorithms and neural networks for the mathematical processing. In the past, it has been shown that there is a correlation between the molecular composition of breath and a person's clinical status, evidencing the existence of volatile biomarkers that can help detect diseases, either by their presence or by their amount. Over the course of this research, essentially two types of analytical techniques were employed to study the gaseous samples: cross-reactive sensor arrays (based on silicon nanowire field-effect transistors (SiNW FETs) or on gold nanoparticles (GNPs), both functionalized with organic chains) and proton transfer reaction-mass spectrometry (PTR-MS) equipment. The cross-reactive sensors analyze the breath as a whole, extracting information from the global sample, whereas PTR-MS quantifies the volatile molecules present in the analyzed samples. All of the techniques employed lead to the generation of large amounts of data per sample, so an exhaustive mathematical analysis is necessary to get the most out of the studies.
In this work, two main types of mathematical tools were employed. The first is a group of five variable selection algorithms, specifically variable filters (calculations based on the χ2 (chi2) statistic, Fisher's discriminant ratio, Kruskal-Wallis analysis, the Relief-F algorithm, and the information gain test), which were applied to the databases with large numbers of independent variables to locate those with the greatest importance or discriminative power for a subsequent mathematical modeling task. For that modeling, a type of algorithm classified within the field of computational artificial intelligence was employed: artificial neural networks (ANNs). These non-linear mathematical tools were used to locate the relations between the independent variables of a system and the dependent variables or parameters to be estimated or classified. The type of supervised ANN most widely used in research, the multilayer perceptron (MLP), was employed because of its proven ability to produce reliable models for numerous applications...
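To make the non-linear character of a multilayer perceptron concrete, here is a minimal sketch of the forward pass of a tiny MLP with one hidden layer. Everything in it is an assumption for illustration: the weights are set by hand rather than learned (the supervised training step is omitted), the XOR task stands in for a real classification problem, and the names `layer` and `mlp_forward` are hypothetical. It is not the architecture used in the thesis.

```python
import math


def sigmoid(z):
    """Logistic activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """One fully connected layer: weighted sum per neuron, then sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hand-picked (not trained) weights, assumed for illustration.
# Hidden neuron 0 approximates OR, hidden neuron 1 approximates NAND;
# the output neuron approximates AND of the two, which yields XOR.
HIDDEN_W = [[20.0, 20.0], [-20.0, -20.0]]
HIDDEN_B = [-10.0, 30.0]
OUT_W = [[20.0, 20.0]]
OUT_B = [-30.0]


def mlp_forward(x):
    """Propagate one input vector through the hidden and output layers."""
    hidden = layer(x, HIDDEN_W, HIDDEN_B)
    return layer(hidden, OUT_W, OUT_B)[0]


inputs = [[0, 0], [0, 1], [1, 0], [1, 1]]
predictions = [round(mlp_forward(x)) for x in inputs]
print(predictions)
```

XOR cannot be separated by any single linear boundary, so the hidden layer is what makes the classification possible. In supervised use, the weights would be fitted to labeled samples instead of being fixed by hand.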

Description

Unpublished thesis of the Universidad Complutense de Madrid, Facultad de Ciencias Químicas, defended on 09-09-2016
