Using Reinforcement Learning for Probabilistic Program Inference
Extended abstract for the Probabilistic Program Semantics Workshop associated with the Principles of Programming Languages (POPL) conference, Los Angeles, CA (January 2018).
Inference in probabilistic programming often involves choosing among different methods. For example, one could use different algorithms to compute a conditional probability, or one could sample variables in different orders. Researchers have taken a variety of approaches to handling this array of choices. Mansinghka [1] advocates meta-programming, in which a user guides the solution interactively. Alternatively, we [2] have presented an approach that decomposes inference problems into small subproblems and optimizes each separately. More broadly, the problem of optimizing inference falls within the area of programming by optimization [3].
In this abstract, we explore a novel use of reinforcement learning (RL) to optimize inference. Our approach automatically adjusts how inference is performed based on how well the available alternatives are performing. A given inference task might involve many choices, and each choice is optimized by a separate RL learner. In this way, we obtain a network of interacting learners for an inference problem. We first describe our general approach and then describe three particular kinds of strategies.
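The idea of one learner per choice can be illustrated with a minimal sketch. It is not the method of the abstract, which does not specify the RL algorithm here; it assumes a simple epsilon-greedy bandit at each choice point. The class ChoiceLearner, the function simulated_reward, the option names, and all reward values are hypothetical and stand in for actually running inference and scoring the result.

```python
import random


class ChoiceLearner:
    """Epsilon-greedy learner for a single inference choice (hypothetical)."""

    def __init__(self, options, epsilon=0.1, rng=None):
        self.options = list(options)
        self.epsilon = epsilon
        self.rng = rng or random.Random()
        self.counts = {o: 0 for o in self.options}
        self.values = {o: 0.0 for o in self.options}

    def choose(self):
        # Explore with probability epsilon, otherwise pick the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.options)
        return max(self.options, key=lambda o: self.values[o])

    def update(self, option, reward):
        # Incremental running mean of the rewards seen for this option.
        self.counts[option] += 1
        self.values[option] += (reward - self.values[option]) / self.counts[option]


def simulated_reward(algorithm, order, rng):
    """Made-up stand-in for running inference and measuring its quality."""
    base = {
        "variable_elimination": 0.6,
        "belief_propagation": 0.4,
        "importance_sampling": 0.2,
    }[algorithm]
    bonus = {"topological": 0.3, "reverse": 0.0}[order]
    return base + bonus + rng.gauss(0.0, 0.05)


def run(episodes=500, seed=0):
    rng = random.Random(seed)
    # One learner per choice; together they form a small network of learners.
    learners = {
        "algorithm": ChoiceLearner(
            ["variable_elimination", "belief_propagation", "importance_sampling"],
            rng=rng,
        ),
        "order": ChoiceLearner(["topological", "reverse"], rng=rng),
    }
    for _ in range(episodes):
        picks = {name: learner.choose() for name, learner in learners.items()}
        # The learners interact only through the shared reward signal.
        reward = simulated_reward(picks["algorithm"], picks["order"], rng)
        for name, learner in learners.items():
            learner.update(picks[name], reward)
    # Report each learner's currently preferred option.
    return {
        name: max(learner.options, key=lambda o: learner.values[o])
        for name, learner in learners.items()
    }


if __name__ == "__main__":
    print(run())
```

Each learner sees only its own choice and the shared reward, so the quality it estimates for an option depends on what the other learner is currently choosing. That coupling is what makes the learners interact rather than optimize in isolation.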
[1] A. Lu, “Venture: an extensible platform for probabilistic meta-programming,” MIT Master’s Thesis, 2016.
[2] A. Pfeffer, B. Ruttenberg, and W. Kretschmer, “Structured Factored Inference: A Framework for Automated Reasoning in Probabilistic Programming Languages,” arXiv:1606.03298 [cs], Jun. 2016.
[3] H. H. Hoos, “Programming by optimization,” Commun. ACM, vol. 55, no. 2, pp. 70–80, 2012.
For More Information
To learn more or request a copy of a paper (if available), contact Avi Pfeffer.
(Please include your name, address, organization, and the paper reference. Requests without this information will not be honored.)