Next Steps¶
Training Implementation¶
We have implemented PPO/SAC following the OpenAI spinning 1 up examples. The code for our training scripts can be found here.
Note
Many algorithms such as SAC might not be suitable to train on real without modification or large compute. On policy training may cause issues for the robots real-time schedule. There are a few creative work arounds that may be interesting avenues for future research.
- 1
Joshua Achiam. Spinning up in deep reinforcement learning. 2018.