gym_os2r.tasks¶
gym_os2r.tasks.monopod¶
- class gym_os2r.tasks.monopod.MonopodTask(agent_rate, **kwargs)¶
Bases:
gym_ignition.base.task.Task
,abc.ABC
Monopod task defines the main task functionality for the monopod environment. Task must be wrapped in a runtime or randomizer class to use with igntion or the real robot.
- Required *kwargs
Task requires the kwargs; ‘task_mode’, ‘reward_class’, ‘reset_positions’.
- Type
dict
- task_mode¶
The defined monopod task. current default tasks, ‘free_hip’, ‘fixed_hip’, ‘fixed’, ‘old-free_hip’, ‘old-fixed_hip’, ‘old-fixed’.
- Type
str
- reward_class¶
Class defining the reward. Must have same functions as RewardBase.
- reset_positions¶
Reset locations of the task. currently supports, ‘stand’, ‘half_stand’, ‘ground’, ‘lay’, ‘float’.
- Type
[str]
- observation_index¶
dictionry with the joint_name_pos and joint_name_vel as keys with values corresponding to its index in the observation space.
- Type
dict
- calculate_reward(obs, action)¶
Calculates the reward given observation and action. The reward is calculated in a provided reward class defined in the tasks kwargs.
- Parameters
obs (np.array) – numpy array with the same size task dimensions as observation space.
Deque[np.array] (actions) – Deque of actions taken by the environment numpy array with the same size task dimensions as action space.
- Returns
True for done, False otherwise.
- Return type
(bool)
- create_spaces()¶
Constructs observtion and action spaces for monopod task. Spaces definition is defined in ../config/default/settings.yaml …
- Returns
action space. (ndarray): observation space.
- Return type
(ndarray)
- get_info()¶
Return the info dictionary. :rtype:
Dict
:returns: Adict
with extra information of the task.
- get_observation()¶
Returns the current observation state of the monopod.
- Returns
Array of joint positions and velocities.
- Return type
(ndarray)
- get_reward()¶
Returns the reward for the current monopod state.
- Returns
True for done, False otherwise.
- Return type
(bool)
- get_state_info(obs, actions)¶
Returns the reward and is_done for some observation and action space.
- Parameters
obs (np.array) – numpy array with the same size task dimensions as observation space.
Deque[np.array] (actions) – Deque of actions taken by the environment numpy array with the same size task dimensions as action space.
- Returns
Rewrd given the state. (bool): True for done, False otherwise.
- Return type
(Reward)
- is_done()¶
Checks if the current state of the robot is outside of the reset_space. logs the reason for the reset as a debug message.
- Returns
True for done, False otherwise.
- Return type
(bool)
- reset_task()¶
Resets the environment into default state. sets the scenario backend into force controller mode Sets the max generalized force for eachcontrolled joint.
- Return type
None
- set_action(action, store_action=True)¶
Set generalized force target for each controlled joint.
- Parameters
action (ndrray) – Generalized force target for each controlled joint.
store_action (bool) – True to store action taken in action history otherwise false to ignore.
- Returns
True if success otherwise false.
- Return type
(bool)
- Raises
(RuntimeError) – Failed to set joints torque target.
gym_os2r.tasks.monopod_no_norm¶
- class gym_os2r.tasks.monopod_no_norm.MonopodTask(agent_rate, **kwargs)¶
Bases:
gym_ignition.base.task.Task
,abc.ABC
Monopod task defines the main task functionality for the monopod environment. Task must be wrapped in a runtime or randomizer class to use with igntion or the real robot.
- Required *kwargs
Task requires the kwargs; ‘task_mode’, ‘reward_class’, ‘reset_positions’.
- Type
dict
- task_mode¶
The defined monopod task. current default tasks, ‘free_hip’, ‘fixed_hip’, ‘fixed’, ‘old-free_hip’, ‘old-fixed_hip’, ‘old-fixed’.
- Type
str
- reward_class¶
Class defining the reward. Must have same functions as RewardBase.
- reset_positions¶
Reset locations of the task. currently supports, ‘stand’, ‘half_stand’, ‘ground’, ‘lay’, ‘float’.
- Type
[str]
- observation_index¶
dictionry with the joint_name_pos and joint_name_vel as keys with values corresponding to its index in the observation space.
- Type
dict
- calculate_reward(obs, action)¶
Calculates the reward given observation and action. The reward is calculated in a provided reward class defined in the tasks kwargs.
- Parameters
obs (np.array) – numpy array with the same size task dimensions as observation space.
Deque[np.array] (actions) – Deque of actions taken by the environment numpy array with the same size task dimensions as action space.
- Returns
True for done, False otherwise.
- Return type
(bool)
- create_spaces()¶
Constructs observtion and action spaces for monopod task. Spaces definition is defined in ../config/default/settings.yaml …
- Returns
action space. (ndarray): observation space.
- Return type
(ndarray)
- get_info()¶
Return the info dictionary. :rtype:
Dict
:returns: Adict
with extra information of the task.
- get_observation()¶
Returns the current observation state of the monopod.
- Returns
Array of joint positions and velocities.
- Return type
(ndarray)
- get_reward()¶
Returns the reward for the current monopod state.
- Returns
True for done, False otherwise.
- Return type
(bool)
- get_state_info(obs, actions)¶
Returns the reward and is_done for some observation and action space.
- Parameters
obs (np.array) – numpy array with the same size task dimensions as observation space.
Deque[np.array] (actions) – Deque of actions taken by the environment numpy array with the same size task dimensions as action space.
- Returns
Rewrd given the state. (bool): True for done, False otherwise.
- Return type
(Reward)
- is_done()¶
Checks if the current state of the robot is outside of the reset_space. logs the reason for the reset as a debug message.
- Returns
True for done, False otherwise.
- Return type
(bool)
- reset_task()¶
Resets the environment into default state. sets the scenario backend into force controller mode Sets the max generalized force for eachcontrolled joint.
- Return type
None
- set_action(action, store_action=True)¶
Set generalized force target for each controlled joint.
- Parameters
action (ndrray) – Generalized force target for each controlled joint.
store_action (bool) – True to store action taken in action history otherwise false to ignore.
- Returns
True if success otherwise false.
- Return type
(bool)
- Raises
(RuntimeError) – Failed to set joints torque target.