DEFENSE Committee: THAYNA NHAARA OLIVEIRA DAMASCENO

A MASTER'S DEFENSE committee has been registered by the program.
STUDENT: THAYNA NHAARA OLIVEIRA DAMASCENO
DATE: 18/12/2018
TIME: 08:30
LOCATION: BioME
TITLE:

All purpose word pairing tool: Easy interaction networks for clinical data.


KEYWORDS:

Text Mining. Bioinformatics. Biomedical Text Mining. Graphs. NICU.


PAGES: 33
MAJOR AREA: Biological Sciences
AREA: General Biology
ABSTRACT:

Big Data is a term used to characterize the growing volume of existing data on different topics, biomedical or otherwise. Given the enormous volume of biological and biomedical data generated daily, one of the main barriers is the analysis of these data. This motivates the development and use of computational tools that allow data to be analyzed through techniques such as Text Mining. Text Mining, a branch of Data Mining, can be defined as a method for extracting relevant information contained in text. To allow a different kind of analysis of data, clinical or not, a simple algorithm was developed that analyzes the data without requiring correlation with existing databases or the creation of new ones. From this algorithm, a web tool was developed so that anyone can access the algorithm (even without knowledge of computational techniques) and analyze their own data. The Integrate Paired Tool (IPT) algorithm was written in the R programming language and uses Data Mining and Text Mining techniques to analyze clinical data, although its analyses are not restricted to this type of data. IPT pairs terms by analyzing the frequency with which pairs of data occur in a user-supplied .csv file. The web tool itself was developed in JavaScript, HTML5, CSS and PHP. The algorithm reads the .csv file and traverses it, pairing its terms two by two until all columns have been paired, regardless of whether the columns differ in size or are incomplete. After all the groupings, a value is assigned to each grouped pair by adding up the occurrences of identical pairs, and another .csv file is generated containing the interactions found and their respective frequencies.
After the relations and their frequencies of occurrence have been computed, an interaction graph (generated in R) is shown on the web tool screen for the user to analyze, together with the .csv file containing all interactions and frequencies. The content of this graph and this table varies with the percentage that the user chooses in the IPT tool. The .csv file with interaction and frequency data can also be used in other network visualization tools, such as Gephi. For testing purposes, a neonatal dataset was used. The IPT worked well and met the objectives of the research. Future goals include hosting the tool on the page of the Graduate Program in Bioinformatics at UFRN, analyzing other data, and possibly integrating data pre-processing into the IPT itself.
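
The pairing step described in the abstract can be illustrated with a minimal sketch. This is not the IPT code, which was written in R and is not reproduced in this notice. It is a Python approximation under stated assumptions: terms are paired when they appear in the same row, the first row is a header, and the user-chosen percentage is read as a cutoff relative to the most frequent pair. The function names `pair_frequencies`, `keep_by_percent` and `to_csv`, and the output column names, are hypothetical.

```python
import csv
import io
from collections import Counter
from itertools import combinations


def pair_frequencies(csv_text):
    """Count how often each pair of terms occurs together in a row.

    Assumption: the first row is a header and is skipped.
    """
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    counts = Counter()
    for row in reader:
        # Drop empty cells so short or incomplete rows are still paired.
        terms = [cell.strip() for cell in row if cell.strip()]
        # Pair the remaining terms two by two, keeping column order.
        for a, b in combinations(terms, 2):
            counts[(a, b)] += 1
    return counts


def keep_by_percent(counts, min_percent):
    """Keep pairs whose frequency is at least min_percent of the maximum.

    Assumption: this is only one possible reading of the percentage option.
    """
    if not counts:
        return Counter()
    cutoff = max(counts.values()) * min_percent / 100
    return Counter({pair: n for pair, n in counts.items() if n >= cutoff})


def to_csv(counts):
    """Serialize the pairs as an edge list with hypothetical column names."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["source", "target", "frequency"])
    for (a, b), n in counts.most_common():
        writer.writerow([a, b, n])
    return out.getvalue()


if __name__ == "__main__":
    # Made-up toy values, not data from the study.
    sample = "c1,c2,c3\nA,X,K\nB,X,K\nA,,K\n"
    print(to_csv(keep_by_percent(pair_frequencies(sample), 50)))
```

An edge list with one row per pair and a frequency column is a common input shape for network visualization tools, which is why the sketch writes its result in that form.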


COMMITTEE MEMBERS:
Chair - 1893445 - EUZEBIO GUIMARAES BARBOSA
External to the Institution - GILDERLANIO SANTANA DE ARAÚJO - UFPA
External to the Program - 2432313 - RAND RANDALL MARTINS
Internal - 3063244 - TETSU SAKAMOTO
Notice registered on: 04/12/2018 17:52