Policy Evaluation in Batch Reinforcement Learning