Reward function design

shape