DEFENSE committee: LUKAS IOHAN DA CRUZ CARVALHO

A DOCTORAL DEFENSE committee has been registered by the program.
STUDENT : LUKAS IOHAN DA CRUZ CARVALHO
DATE: 26/02/2024
TIME: 09:00
LOCATION: Google Meet, meet.google.com/ihd-fkdq-zht
TITLE:

EVALUATION OF A NEW NEURONAL INDUCTION PROTOCOL USING SINGLE-CELL RNA-SEQUENCING AND MACHINE LEARNING


KEY WORDS:

Single-Cell RNA-Seq; iPSC-derived neurons; Machine learning; Regional identity


PAGES: 126
BIG AREA: Biological Sciences
AREA: General Biology
SUMMARY:

Cell type identification is a critical step in the computational analysis of scRNA-Seq experiments, involving the unsupervised grouping of cells based on gene expression profiles. Traditional methods relying on canonical gene markers exhibit limitations, such as sensitivity to variations and the absence of characteristic genes for certain cell types. To address these challenges, we propose a novel approach combining machine learning algorithms with feature selection. Our methodology involves selecting a training dataset suitable for building a model that generalizes to new data. We chose a comprehensive dataset encompassing the central and peripheral nervous system of mice at different developmental stages. Subsequently, feature selection was applied using the DUBStepR algorithm, which considers gene-gene correlations to identify optimal features for cell classification. The resulting dataset, composed of 28,795 cells and 16,960 genes, was used to train and evaluate models employing the k-Nearest Neighbors (kNN), Decision Tree (DT), Naive Bayes (NB), Support Vector Machine (SVM) and Multilayer Perceptron (MLP) algorithms. All models except NB achieved F1-scores exceeding 90%. Testing on a human brain scRNA-Seq dataset confirmed the robustness of the algorithms, with area under the curve (AUC) values indicating accurate cell classification. SVM and MLP were selected for further analysis due to their lower false positive and false negative rates. Comparisons with existing tools such as scAnnotatR and ACTINN highlight the versatility of our approach, particularly when dealing with diverse cell types. Next, we applied the SVM and MLP models to classify human-induced neurons (hiNs) generated in vitro using distinct protocols, achieving consistent results in identifying glutamatergic and GABAergic neurons.
We also attempted to classify hiNs according to cells of different brain regions, which revealed challenges in classifying GABAergic neurons by region, possibly due to a limited number of optimal features. Gene expression analysis and Gene Set Enrichment Analysis (GSEA) helped identify gene sets associated with the electrophysiological maturation of glutamatergic hiNs generated through an alternative protocol using ASCL1, compared with other protocols. Regulatory network analysis identified master transcription factors with higher activity specifically in this protocol. In conclusion, our integrated approach of feature selection and machine learning algorithms offers an alternative way of identifying cell groups based on gene expression profiles, enhancing the refinement of single-cell analysis in the context of differential gene expression, GSEA, and regulatory gene networks.
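
The classification-and-scoring step mentioned in the summary can be illustrated with a minimal sketch. This is not the author's pipeline: it uses a small hand-written kNN in pure Python instead of the tools used in the thesis, and the two-gene expression values, the cell labels assigned to them, and the function names `knn_predict` and `f1_per_class` are all hypothetical, chosen only to show how a cell is labeled by its nearest neighbors and how a per-class F1-score is computed from the predictions.

```python
from collections import Counter
from math import dist

def knn_predict(train_X, train_y, x, k=3):
    """Label cell x by majority vote among its k nearest training cells."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def f1_per_class(y_true, y_pred):
    """Return {label: F1}, where F1 = 2*TP / (2*TP + FP + FN)."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores

# Hypothetical toy data: each cell is (expression of gene A, expression of gene B).
train_X = [(5.0, 0.5), (4.5, 1.0), (5.5, 0.2),   # glutamatergic-like cells
           (0.5, 5.0), (1.0, 4.5), (0.2, 5.5)]   # GABAergic-like cells
train_y = ["glutamatergic"] * 3 + ["GABAergic"] * 3

# Hypothetical held-out cells; the last one sits between the two groups.
test_X = [(4.8, 0.8), (0.7, 4.9), (3.0, 2.9)]
test_y = ["glutamatergic", "GABAergic", "GABAergic"]

predictions = [knn_predict(train_X, train_y, x, k=3) for x in test_X]
scores = f1_per_class(test_y, predictions)

for label, score in scores.items():
    print(f"{label}: F1 = {score:.2f}")
```

In this toy case the intermediate cell is assigned to the wrong class, which lowers the F1-score of both classes. The same false-positive and false-negative counts are the ones the summary refers to when comparing the SVM and MLP models.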


COMMITTEE MEMBERS:
External to the Institution - CECÍLIA HEDIN-PEREIRA - Fiocruz - RJ
Chair - 1674643 - MARCOS ROMUALDO COSTA
External to the Institution - MYCHAEL VINÍCIUS DA COSTA LOURENÇO
Internal - 1507794 - RODRIGO JULIANI SIQUEIRA DALMOLIN
External to the Program - 2183828 - TARCISO ANDRE FERREIRA VELHO
Notice registered on: 15/02/2024 16:01