Tru-POMDP: Task Planning Under Uncertainty via Tree of Hypotheses and Open-Ended POMDPs

The architecture of Tru-POMDP. (a) Task Input: Human instruction and the observed scene graph. (b) Tree of Hypotheses: An LLM infers target objects, target areas, and initial locations, producing weighted particles. (c) Hybrid Belief Update: Bayesian filtering updates the belief using particle prediction and elimination, and augments the filtered belief with LLM particles. (d) Online POMDP Planning: Belief tree search computes the optimal action with the help of dynamic action branching and an LLM-written rollout policy.
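The hierarchical belief generation in panel (b) can be sketched as a three-level enumeration, where each LLM query expands one level of the tree and a particle's weight is the product of the branch probabilities along its path. The `query_llm` callable below is a hypothetical stand-in for the paper's structured LLM queries, not the actual interface.

```python
from dataclasses import dataclass

@dataclass
class Particle:
    """One hypothesis over the hidden state: the human's goal and object placement."""
    target_object: str
    target_area: str
    initial_location: str
    weight: float

def tree_of_hypotheses(query_llm, instruction, scene_graph):
    """Sketch of the Tree of Hypotheses (TOH).

    `query_llm(level, ...)` is a hypothetical callable that returns a list of
    (candidate, probability) pairs for one level of the tree.
    """
    particles = []
    # Level 1: which object does the instruction refer to?
    for obj, p_obj in query_llm("target_objects", instruction, scene_graph):
        # Level 2: where should that object end up?
        for area, p_area in query_llm("target_areas", instruction, scene_graph, obj):
            # Level 3: where might the object currently be (possibly hidden)?
            for loc, p_loc in query_llm("initial_locations", instruction, scene_graph, obj):
                particles.append(Particle(obj, area, loc, p_obj * p_area * p_loc))
    # Normalize so the weighted particle set forms a probability distribution.
    total = sum(p.weight for p in particles)
    for p in particles:
        p.weight /= total
    return particles
```

Factoring the queries hierarchically keeps each individual LLM call small while still covering the joint hypothesis space.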

Abstract

Task planning under uncertainty is essential for home-service robots operating in the real world. Tasks involve ambiguous human instructions, hidden or unknown object locations, and open-vocabulary object types, leading to significant open-ended uncertainty and a boundlessly large planning space. To address these challenges, we propose Tru-POMDP, a planner that combines structured belief generation using Large Language Models (LLMs) with principled POMDP planning. Tru-POMDP introduces a hierarchical Tree of Hypotheses (TOH), which systematically queries an LLM to construct high-quality particle beliefs over possible world states and human goals. We further formulate an open-ended POMDP model that enables rigorous Bayesian belief tracking and efficient belief-space planning over these LLM-generated hypotheses. Experiments on complex object rearrangement tasks across diverse kitchen environments show that Tru-POMDP significantly outperforms state-of-the-art LLM-based and LLM-tree-search hybrid planners, achieving higher success rates, substantially better plans, stronger robustness to ambiguity and occlusion, and greater planning efficiency.

Video Presentation

We propose Tru-POMDP, a new algorithm for task planning under uncertainty that tightly integrates commonsense reasoning by LLMs with explicit belief tracking and principled POMDP planning. Our main contributions include:

  • The first framework to integrate LLM-based reasoning with principled POMDP planning for household tasks.
  • A novel hybrid belief modeling approach that integrates LLM-generated hypotheses with principled Bayesian filtering.
  • A POMDP model for open-ended object rearrangement tasks and a practical belief-tree search planner for solving such tasks efficiently under large-scale uncertainty.
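The hybrid belief modeling above can be sketched as a particle filter whose survivors are mixed with fresh LLM-generated particles. The `transition`, `obs_likelihood`, and `llm_particles` callables and the `mix_ratio` parameter are illustrative assumptions, not the paper's exact formulation.

```python
def hybrid_belief_update(particles, action, observation,
                         transition, obs_likelihood, llm_particles,
                         mix_ratio=0.2):
    """Sketch of a hybrid Bayesian belief update over (state, weight) pairs.

    1. Predict: propagate each particle through the transition model.
    2. Eliminate/reweight: drop particles inconsistent with the observation
       and reweight the survivors by observation likelihood (Bayes' rule).
    3. Augment: mix in fresh LLM-generated particles so the belief can
       recover hypotheses that filtering alone would never reintroduce.
    """
    # 1. Predict the next state of every particle.
    predicted = [(transition(s, action), w) for s, w in particles]
    # 2. Bayesian reweighting; zero likelihood eliminates a particle.
    filtered = [(s, w * obs_likelihood(observation, s, action)) for s, w in predicted]
    filtered = [(s, w) for s, w in filtered if w > 0.0]
    total = sum(w for _, w in filtered)
    if total == 0.0:
        # Every hypothesis was refuted: rebuild the belief from the LLM alone.
        return llm_particles(observation)
    filtered = [(s, w / total) for s, w in filtered]
    # 3. Augment the filtered belief with LLM particles at a fixed mixing ratio.
    kept = [(s, w * (1.0 - mix_ratio)) for s, w in filtered]
    fresh = [(s, w * mix_ratio) for s, w in llm_particles(observation)]
    return kept + fresh
```

The augmentation step is what distinguishes this from a plain particle filter: even when filtering concentrates the belief, the LLM can reinject plausible hypotheses about goals and hidden object locations.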

Experiment

Planning results.

We compare Tru-POMDP against a set of strong baselines, including both pure LLM-based planners and tree search planners integrated with LLM reasoning.

  • ReAct: A closed-loop LLM-based planner that selects actions based on current observations and immediate feedback from the environment.
  • Reflexion: An extension of ReAct that adds a reflection module. Upon repeated failures, it analyzes the history and generates a revised plan. For fair comparison under online planning, we disable environment resets and trigger reflection after three consecutive failed actions.
  • ReAct* and Reflexion*: Prompt-augmented variants that provide the LLM with additional structured descriptions of the task domain, including object types, action semantics, and goal structures.
  • LLM-MCTS: A tree search method that generates a maximum-likelihood hypothesis of the task goal and hidden object placements using an LLM. It then performs Monte Carlo Tree Search (MCTS), repeatedly querying the LLM to guide action selection during simulation.
Performance comparison of Tru-POMDP and baselines. Each bar represents the average value with standard error (SE). In (c), the dashed line indicates the maximum allowed step number.

Tru-POMDP significantly outperforms all baselines, demonstrating its capability for planning under uncertainty.