Computer Science Thesis Proposal

— 3:30pm

Location:
Reddy Conference Room 4405 - Gates Hillman Centers

Speaker:
RUI MANUEL RAMOS TEIXEIRA Da SILVA , Ph.D. Student
/RUI%20MANUEL%20RAMOS%20TEIXEIRA%20Da%20SILVA

Extending MDPs for Planning Beyond Direct Control

Planning under uncertainty using Markov decision processes (MDPs) requires a model of the environment specifying the probabilistic effects of the actions that the agent is able of executing. The optimal policy of the agent is then computed from such model. As such, when planning, the agent assumes the environment may only change as the direct result of its actions. In this thesis we consider lifting that restriction and allow the agent to reason over additional configurations of the environment, which are reachable indirectly by soliciting the assistance of some external entity, possibly at a cost. We introduce Reachable MDPs, a new class of MDPs that accounts for these indirectly reachable environments, allowing the agent to explicitly solicit them when beneficial. Solving Reachable MDPs consists in the maximization of the expected value/cost trade-off over possible changes to the environment. We show that this optimization is a NP-Hard problem, and propose two approaches to solving it. In the first approach we assume the agent is allowed to request changes to the world during execution time, and formulate the optimization as a planning problem. In the second approach we assume the agent is only allowed to solicit a different environment before execution time, and formulate the problem as a joint optimization of the transition probabilities and optimal policy of the MDP. Furthermore, this thesis will extend the approaches proposed to support hierarchical structures of external control, where external control may be possible through different entities at different costs and yielding different outcomes.

Thesis Committee:
Manuela Veloso (Co-Chair)
Francisco S. Melo (Co-Chair) (Instituto Superior Técnico, Universidade de Lisboa)
Ariel Procaccia
Reid Simmons
Pedro Lima (nstituto Superior Técnico, Universidade de Lisboa)

Copy of Thesis Summary
 

For More Information:
deb@cs.cmu.edu