Partially Observable Reinforcement Learning