Publication:
Predictive analysis to find germline genetic susceptibility associated with the tumoral immune infiltration in pancreatic cancer

Publication Date
2021-07
Abstract
The immune system plays an important role in the tumor microenvironment, where the interaction between tumor cells and immune cells affects tumor development. In pancreatic cancer in particular, characterization of the B and T cell repertoire has shown large heterogeneity among patients. Additionally, it was previously demonstrated that genetic susceptibility may explain around 40% of the differences in the immune system across individuals. The main objective of this project was therefore to predict tumoral immune infiltration in pancreatic cancer patients using germline genetic variants (SNPs). T and B cell receptors were extracted from RNAseq data from 120 individuals with pancreatic cancer, and their richness and diversity were assessed using Expression and Entropy measures. Four machine learning methods were then proposed (Elastic Net, Ridge Regression, Random Forest and Neural Network), chosen to handle the high dimensionality and multicollinearity present in high-throughput data. The performance of the four methods was assessed through the Pearson correlation between observed and predicted values. Predictions obtained by these methods were benchmarked across 10 testing subsets in three different scenarios. The Neural Network showed the highest and most consistent correlations between observed and predicted values and overcame the overfitting and over-specificity problems. Being able to predict immune infiltration from genetic variants will allow us to integrate and decipher new biological insights that are much needed in pancreatic cancer research.
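The evaluation scheme described in the abstract (a diversity measure as outcome, a penalized regression on SNPs, and Pearson correlation over 10 testing subsets) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the thesis code: the data are synthetic, the function names `shannon_entropy`, `ridge_fit` and `benchmark` are hypothetical, the penalty `alpha=10.0` and the 80/20 random splits are arbitrary choices, and only Ridge Regression is shown out of the four methods.

```python
import numpy as np

def shannon_entropy(clone_counts):
    """Shannon entropy (natural log) of a vector of clonotype counts."""
    counts = np.asarray(clone_counts, dtype=float)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log(p)).sum())

def ridge_fit(X, y, alpha):
    """Closed-form ridge regression on centered data; returns (coef, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # Dual form: solve an n x n system, which is cheaper when
    # the number of SNPs (p) far exceeds the number of samples (n).
    gram = Xc @ Xc.T
    dual = np.linalg.solve(gram + alpha * np.eye(len(y)), yc)
    coef = Xc.T @ dual
    return coef, y_mean - x_mean @ coef

def benchmark(X, y, alpha=10.0, n_splits=10, test_frac=0.2, seed=0):
    """Pearson correlation of observed vs. predicted values on each test subset."""
    rng = np.random.default_rng(seed)
    n_test = int(round(test_frac * len(y)))
    correlations = []
    for _ in range(n_splits):
        order = rng.permutation(len(y))
        test, train = order[:n_test], order[n_test:]
        coef, intercept = ridge_fit(X[train], y[train], alpha)
        predicted = X[test] @ coef + intercept
        correlations.append(np.corrcoef(y[test], predicted)[0, 1])
    return np.array(correlations)

# Synthetic stand-in data (assumption): 120 individuals, 500 SNPs coded 0/1/2,
# with 10 SNPs carrying a true effect on a simulated infiltration measure.
rng = np.random.default_rng(42)
n_samples, n_snps = 120, 500
X = rng.integers(0, 3, size=(n_samples, n_snps)).astype(float)
true_effects = np.zeros(n_snps)
true_effects[:10] = rng.normal(0.0, 1.0, 10)
y = X @ true_effects + rng.normal(0.0, 0.5, n_samples)

correlations = benchmark(X, y)
print("mean r:", correlations.mean(), "sd:", correlations.std())
```

Reporting both the mean and the spread of the correlations across the 10 subsets mirrors the abstract's emphasis on correlations that are high and also consistent. In a real analysis the outcome `y` would be an Expression or Entropy value per patient, for instance the output of `shannon_entropy` applied to that patient's clonotype counts.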