QUALIFYING Examination Committee: MADEMERSON LEANDRO DA COSTA

A DOCTORAL QUALIFYING examination committee was registered by the program.
STUDENT: MADEMERSON LEANDRO DA COSTA
DATE: 11/01/2017
TIME: 09:00
LOCATION: Núcleo de Pesquisa e Inovação em Tecnologia da Informação - NPITI
TITLE:

Hierarchical Reinforcement Learning and Parallel Computing Applied to the K-Server Problem


KEYWORDS:

Metrical Task Systems, The K-Server Problem, Curse of Dimensionality, Hierarchical Reinforcement Learning, Q-Learning Algorithm, Parallel Computing.


PAGES: 98
BROAD AREA: Engineering
AREA: Electrical Engineering
ABSTRACT:

A metrical task system is an abstract model for a class of online optimization problems that includes paging, list accessing, and the k-server problem. Reinforcement learning has proved efficient for solving these problems, but its use is restricted to simple instances because of the curse of dimensionality inherent to the method. This work presents a reinforcement learning solution, based on hierarchical decomposition techniques and parallel computing, for optimization problems in metric spaces, with the aim of extending the method's applicability to larger and more complex problems. The storage structure that reinforcement learning uses to obtain the optimal policy grows with the number of states and actions, which in turn depends on the number n of nodes and the number k of servers, so its size grows exponentially. To circumvent this, the problem was modeled as a multi-step decision process in which the k-means algorithm was first used as a clustering method to decompose the problem into smaller subproblems. The Q-learning algorithm was then applied within each subgroup to obtain the best server displacement policy. In this step, parallel computing techniques were used so that the learning and storage processes of the subgroups run in parallel. The goal was to reduce both the dimension of the problem and the total execution time of the algorithm, making it possible to apply the proposed method to large instances. We will analyze the quality of the hierarchical solution compared with classical reinforcement learning, its possible limitations, and its parallel performance.
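
The growth argument in the abstract can be illustrated with a rough count. The sketch below is not taken from the thesis: it assumes a hypothetical encoding in which a state is the set of k occupied nodes plus the requested node, and an action is the choice of which server to move. The function names `q_table_entries` and `decomposed_entries` and the cluster sizes are illustrative only, and the count ignores whatever higher-level table the hierarchical method uses to route requests among clusters.

```python
from math import comb


def q_table_entries(n: int, k: int) -> int:
    """Entries in a flat Q-table under the ASSUMED encoding:
    state = (set of k occupied nodes, requested node), action = server to move."""
    return comb(n, k) * n * k


def decomposed_entries(cluster_sizes, servers_per_cluster) -> int:
    """Total entries when each cluster keeps its own independent Q-table."""
    return sum(
        q_table_entries(n_c, k_c)
        for n_c, k_c in zip(cluster_sizes, servers_per_cluster)
    )


# Hypothetical instance: 20 nodes and 4 servers, split into 4 clusters
# of 5 nodes with 1 server each (the split k-means might produce).
flat = q_table_entries(20, 4)
split = decomposed_entries([5, 5, 5, 5], [1, 1, 1, 1])
print(flat, split)
```

Under these assumptions the flat table is several orders of magnitude larger than the sum of the per-cluster tables, and the per-cluster tables are independent of one another, which is what makes learning them in parallel natural.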


COMMITTEE MEMBERS:
Chair - 347628 - ADRIAO DUARTE DORIA NETO
External to the Program - 350241 - JORGE DANTAS DE MELO
External to the Program - 1673543 - SAMUEL XAVIER DE SOUZA
External to the Institution - FRANCISCO CHAGAS DE LIMA JUNIOR - UERN
External to the Institution - JOAO PAULO QUEIROZ DOS SANTOS - IFRN
Notice registered on: 26/12/2016 15:30