Predictive State Representations (PSRs) are an efficient method for modelling partially observable dynamical systems. They have shown advantages over the latent state-based approaches by using functions of a set of observable quantities, called tests, to represent model states. As a consequence, discovering the set of tests is one of the central problems in PSRs. Existing techniques either discover the tests through iterative methods, which can only solve toy problems, or avoid the complex discovery problem by maintaining a very large set of tests, which may be prohibitively expensive. In this paper, we propose a new approach to discovering tests by formulating it as one sequential decision making problem. To solve the decision making problem, we resort to the Monte-Carlo tree search (MCTS) algorithm that has shown significant advantage on solving complex search problems. We further develop the concept of model entropy for measuring the model accuracy as the evaluation function in MCTS. We conduct experiments on several domains including one extremely large domain and the experimental results show the effectiveness of our approach.
|Publication status||Published - 9 Jul 2016|