gym_os2r.common.vec_env¶

class gym_os2r.common.vec_env.SubprocVecEnv(env_fns, start_method=None)¶

Bases: gym_os2r.common.vec_env.vec_env.VecEnv

Creates a multiprocess vectorized wrapper for multiple environments, distributing each environment to its own process, which can yield a significant speedup when the environment is computationally complex. For performance reasons, unless your environment is IO bound, the number of environments should not exceed the number of logical cores on your CPU.

Warning

Only the ‘forkserver’ and ‘spawn’ start methods are thread-safe, which matters when TensorFlow sessions or other non-thread-safe libraries are used in the parent process (see issue #217). Compared to ‘fork’, however, they incur a small start-up cost and restrict the use of global variables. With those methods, the calling code must be wrapped in an if __name__ == "__main__": block. For more information, see the multiprocessing documentation.

Parameters

env_fns ([callable]) – A list of functions that will create the environments (each callable returns a gym.Env instance when called).
start_method (str) – Method used to start the subprocesses. Must be one of the methods returned by multiprocessing.get_all_start_methods(). Defaults to ‘forkserver’ on available platforms, and ‘spawn’ otherwise.
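
The default described for start_method, together with the advice on logical cores, can be sketched with the standard library alone. This is an illustration and not the library's own code: pick_start_method and cap_num_envs are hypothetical helper names, and the fallback to the platform default is an assumption.

```python
import multiprocessing
import os


def pick_start_method(preferred=("forkserver", "spawn")):
    """Return the first preferred start method available on this platform.

    Hypothetical helper: mirrors the documented default of 'forkserver'
    where available and 'spawn' otherwise.
    """
    available = multiprocessing.get_all_start_methods()
    for method in preferred:
        if method in available:
            return method
    # Assumption: fall back to the platform default, which is listed first.
    return available[0]


def cap_num_envs(requested):
    """Limit the environment count to the number of logical cores."""
    cores = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, min(requested, cores))


if __name__ == "__main__":
    # The guard matters for 'forkserver' and 'spawn', which re-import
    # the main module in each child process.
    print(pick_start_method(), cap_num_envs(64))
```

The value returned by pick_start_method() is of the kind that would be passed as start_method, and cap_num_envs() bounds the length of env_fns for environments that are not IO bound.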

close()¶: Clean up the environment’s resources.

env_method(method_name, *method_args, indices=None, **method_kwargs)¶: Call instance methods of vectorized environments.

get_attr(attr_name, indices=None)¶

Return attribute from vectorized environment (see base class).

Returns: class attribute.
Return type: (dict)

get_state_info(states, actions)¶

Get the reward and done flag of the environments in the given states.

Parameters

states – the states to evaluate
actions ([int] or [float]) – the actions to apply in those states

Returns: reward, done
Return type: ([float], [bool])

get_state_info_async(states, actions)¶

get_state_info_wait()¶

reset()¶

Reset all the environments and return an array of observations, or a tuple of observation arrays. If step_async is still doing work, that work will be cancelled and step_wait() should not be called until step_async() is invoked again.

Returns: observation
Return type: ([int] or [float])

seed(seed=None)¶

Sets the random seeds for all environments, based on a given seed. Each individual environment will still get its own seed, by incrementing the given seed.

Parameters

seed (Optional[int]) – The random seed. May be None for completely random seeding.

Returns

A list containing the seeds for each individual env. Note that all list elements may be None if the env does not return anything when being seeded.

Return type

(List[Union[None, int]])
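
The incrementing scheme described above can be sketched as follows. per_env_seeds is a hypothetical helper, and the step of one per environment index is an assumption; the text only states that the given seed is incremented.

```python
from typing import List, Optional


def per_env_seeds(seed: Optional[int], num_envs: int) -> List[Optional[int]]:
    """Derive one seed per environment from a single base seed.

    Assumption: environment i receives seed + i. A base seed of None is
    passed through unchanged so that each env seeds itself randomly.
    """
    if seed is None:
        return [None] * num_envs
    return [seed + idx for idx in range(num_envs)]
```

Note that the list returned by seed() holds whatever each environment reports back, which may differ from the values sent to it.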

set_attr(attr_name, value, indices=None)¶: Set attribute inside vectorized environments (see base class).

step_async(actions)¶: Tell all the environments to start taking a step with the given actions. Call step_wait() to get the results of the step. You should not call this if a step_async run is already pending.

step_wait()¶

Wait for the step taken with step_async().

Returns: observation, reward, done, information
Return type: ([int] or [float], [float], [bool], dict)

gym_os2r.common.vec_env.subproc_vec_env¶

class gym_os2r.common.vec_env.subproc_vec_env.SubprocVecEnv(env_fns, start_method=None)¶

Bases: gym_os2r.common.vec_env.vec_env.VecEnv

Creates a multiprocess vectorized wrapper for multiple environments, distributing each environment to its own process, which can yield a significant speedup when the environment is computationally complex. For performance reasons, unless your environment is IO bound, the number of environments should not exceed the number of logical cores on your CPU.

Warning

Only the ‘forkserver’ and ‘spawn’ start methods are thread-safe, which matters when TensorFlow sessions or other non-thread-safe libraries are used in the parent process (see issue #217). Compared to ‘fork’, however, they incur a small start-up cost and restrict the use of global variables. With those methods, the calling code must be wrapped in an if __name__ == "__main__": block. For more information, see the multiprocessing documentation.

Parameters

env_fns ([callable]) – A list of functions that will create the environments (each callable returns a gym.Env instance when called).
start_method (str) – Method used to start the subprocesses. Must be one of the methods returned by multiprocessing.get_all_start_methods(). Defaults to ‘forkserver’ on available platforms, and ‘spawn’ otherwise.

close()¶: Clean up the environment’s resources.

env_method(method_name, *method_args, indices=None, **method_kwargs)¶: Call instance methods of vectorized environments.

get_attr(attr_name, indices=None)¶

Return attribute from vectorized environment (see base class).

Returns: class attribute.
Return type: (dict)

get_state_info(states, actions)¶

Get the reward and done flag of the environments in the given states.

Parameters

states – the states to evaluate
actions ([int] or [float]) – the actions to apply in those states

Returns: reward, done
Return type: ([float], [bool])

get_state_info_async(states, actions)¶

get_state_info_wait()¶

reset()¶

Reset all the environments and return an array of observations, or a tuple of observation arrays. If step_async is still doing work, that work will be cancelled and step_wait() should not be called until step_async() is invoked again.

Returns: observation
Return type: ([int] or [float])

seed(seed=None)¶

Sets the random seeds for all environments, based on a given seed. Each individual environment will still get its own seed, by incrementing the given seed.

Parameters

seed (Optional[int]) – The random seed. May be None for completely random seeding.

Returns

A list containing the seeds for each individual env. Note that all list elements may be None if the env does not return anything when being seeded.

Return type

(List[Union[None, int]])

set_attr(attr_name, value, indices=None)¶: Set attribute inside vectorized environments (see base class).

step_async(actions)¶: Tell all the environments to start taking a step with the given actions. Call step_wait() to get the results of the step. You should not call this if a step_async run is already pending.

step_wait()¶

Wait for the step taken with step_async().

Returns: observation, reward, done, information
Return type: ([int] or [float], [float], [bool], dict)

gym_os2r.common.vec_env.vec_env¶

exception gym_os2r.common.vec_env.vec_env.AlreadySteppingError¶

Bases: Exception

Raised when an asynchronous step is running while step_async() is called again.

class gym_os2r.common.vec_env.vec_env.CloudpickleWrapper(var)¶: Bases: object

exception gym_os2r.common.vec_env.vec_env.NotSteppingError¶

Bases: Exception

Raised when an asynchronous step is not running but step_wait() is called.

class gym_os2r.common.vec_env.vec_env.VecEnv(num_envs, observation_space, action_space)¶

Bases: abc.ABC

An abstract asynchronous, vectorized environment.

Parameters

num_envs (int) – the number of environments
observation_space (Gym Space) – the observation space
action_space (Gym Space) – the action space

abstract close()¶: Clean up the environment’s resources.

abstract env_method(method_name, *method_args, indices=None, **method_kwargs)¶

Call instance methods of vectorized environments.

Parameters

method_name (str) – The name of the environment method to invoke.
indices (list,int) – Indices of envs whose method to call
method_args (tuple) – Any positional arguments to provide in the call
method_kwargs (dict) – Any keyword arguments to provide in the call

Returns

List of items returned by the environment’s method call

Return type

(list)
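
Because indices accepts None, a single int, or a list, an implementation typically normalizes it to a list before dispatching the call. A minimal sketch of that normalization, using a hypothetical helper name and assuming that None selects every environment:

```python
def normalize_indices(indices, num_envs):
    """Convert the flexible ``indices`` argument into a list of ints.

    Hypothetical helper, not part of the documented API.
    Assumption: None means "all environments".
    """
    if indices is None:
        return list(range(num_envs))
    if isinstance(indices, int):
        return [indices]
    return list(indices)
```

With this convention, env_method("foo", indices=2) would target only the third environment, while omitting indices would target all of them.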

abstract get_attr(attr_name, indices=None)¶

Return attribute from vectorized environment.

Parameters

attr_name (str) – The name of the attribute whose value to return
indices (list,int) – Indices of envs to get attribute from

Returns

List of values of ‘attr_name’ in all environments

Return type

(list)

get_images()¶

Return RGB images from each environment

Return type: Sequence[ndarray]

getattr_depth_check(name, already_found)¶

Check if an attribute reference is being hidden in a recursive call to __getattr__

Parameters

name (str) – name of attribute to check for
already_found (bool) – whether this attribute has already been found in a wrapper

Returns

name of module whose attribute is being shadowed, if any.

Return type

(str or None)

metadata = {'render.modes': ['human', 'rgb_array']}¶

render(mode='human')¶

Gym environment rendering

Parameters: mode (str) – the rendering type

abstract reset()¶

Reset all the environments and return an array of observations, or a tuple of observation arrays. If step_async is still doing work, that work will be cancelled and step_wait() should not be called until step_async() is invoked again.

Returns: observation
Return type: ([int] or [float])

abstract seed(seed=None)¶

Sets the random seeds for all environments, based on a given seed. Each individual environment will still get its own seed, by incrementing the given seed.

Parameters

seed (Optional[int]) – The random seed. May be None for completely random seeding.

Returns

A list containing the seeds for each individual env. Note that all list elements may be None if the env does not return anything when being seeded.

Return type

(List[Union[None, int]])

abstract set_attr(attr_name, value, indices=None)¶

Set attribute inside vectorized environments.

Parameters

attr_name (str) – The name of attribute to assign new value
value (obj) – Value to assign to attr_name
indices (list,int) – Indices of envs to assign value

Returns

(NoneType)

step(actions)¶

Step the environments with the given actions

Parameters: actions ([int] or [float]) – the actions
Returns: observation, reward, done, information
Return type: ([int] or [float], [float], [bool], dict)

abstract step_async(actions)¶: Tell all the environments to start taking a step with the given actions. Call step_wait() to get the results of the step. You should not call this if a step_async run is already pending.

abstract step_wait()¶

Wait for the step taken with step_async().

Returns: observation, reward, done, information
Return type: ([int] or [float], [float], [bool], dict)
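
The step_async() / step_wait() contract, and how step() can be built from the two, may be easier to see in a toy in-process model. Everything below is an illustrative stand-in: ToyVecEnv is hypothetical, the two exception classes are local copies of the names documented above, and whether a concrete VecEnv actually raises them is an assumption.

```python
class AlreadySteppingError(Exception):
    """Local stand-in for the exception of the same name."""


class NotSteppingError(Exception):
    """Local stand-in for the exception of the same name."""


class ToyVecEnv:
    """Toy model of the asynchronous step protocol (no subprocesses)."""

    def __init__(self, num_envs):
        self.num_envs = num_envs
        self._pending = None  # actions of the step in flight, if any

    def step_async(self, actions):
        if self._pending is not None:
            raise AlreadySteppingError("a step is already pending")
        self._pending = list(actions)

    def step_wait(self):
        if self._pending is None:
            raise NotSteppingError("step_async() was not called first")
        actions, self._pending = self._pending, None
        # Dummy dynamics: the observation simply echoes the action.
        obs = [float(a) for a in actions]
        rewards = [1.0] * len(actions)
        dones = [False] * len(actions)
        infos = [{} for _ in actions]
        return obs, rewards, dones, infos

    def step(self, actions):
        # Synchronous convenience: start the step, then wait for it.
        self.step_async(actions)
        return self.step_wait()
```

Splitting the step in two lets a caller do other work between step_async() and step_wait() while the environments are computing.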

property unwrapped¶

class gym_os2r.common.vec_env.vec_env.VecEnvWrapper(venv, observation_space=None, action_space=None)¶

Bases: gym_os2r.common.vec_env.vec_env.VecEnv

Vectorized environment base class.

Parameters

venv (VecEnv) – the vectorized environment to wrap
observation_space (Gym Space) – the observation space (can be None to load from venv)
action_space (Gym Space) – the action space (can be None to load from venv)

close()¶: Clean up the environment’s resources.

env_method(method_name, *method_args, indices=None, **method_kwargs)¶

Call instance methods of vectorized environments.

Parameters

method_name (str) – The name of the environment method to invoke.
indices (list,int) – Indices of envs whose method to call
method_args (tuple) – Any positional arguments to provide in the call
method_kwargs (dict) – Any keyword arguments to provide in the call

Returns

List of items returned by the environment’s method call

Return type

(list)

get_attr(attr_name, indices=None)¶

Return attribute from vectorized environment.

Parameters

attr_name (str) – The name of the attribute whose value to return
indices (list,int) – Indices of envs to get attribute from

Returns

List of values of ‘attr_name’ in all environments

Return type

(list)

get_images()¶: Return RGB images from each environment

getattr_depth_check(name, already_found)¶

See base class.

Returns: name of module whose attribute is being shadowed, if any.
Return type: (str or None)

getattr_recursive(name)¶

Recursively check wrappers to find attribute.

Parameters: name (str) – name of attribute to look for
Returns: attribute
Return type: (object)
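
A recursive lookup through a chain of wrappers can be sketched as below. ToyBase and ToyWrapper are hypothetical classes, and storing the wrapped environment as self.venv is an assumption based on the constructor argument name; the real method may differ in detail.

```python
class ToyBase:
    """Innermost object in the chain; owns the attribute we look for."""

    def __init__(self):
        self.gravity = 9.81


class ToyWrapper:
    """Toy wrapper that resolves attributes on itself, then inwards."""

    def __init__(self, venv):
        self.venv = venv  # assumption: the wrapped env is kept as .venv

    def getattr_recursive(self, name):
        # Found on this wrapper (instance or class): return it directly.
        if name in self.__dict__ or hasattr(type(self), name):
            return getattr(self, name)
        inner = self.venv
        # Keep descending while the inner object is itself a wrapper.
        if hasattr(inner, "getattr_recursive"):
            return inner.getattr_recursive(name)
        # Innermost object: a missing name raises AttributeError here.
        return getattr(inner, name)
```

For example, ToyWrapper(ToyWrapper(ToyBase())).getattr_recursive("gravity") walks two wrapper levels before reading the attribute from the base object.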

render(mode='human')¶

Gym environment rendering

Parameters: mode (str) – the rendering type

abstract reset()¶

Reset all the environments and return an array of observations, or a tuple of observation arrays. If step_async is still doing work, that work will be cancelled and step_wait() should not be called until step_async() is invoked again.

Returns: observation
Return type: ([int] or [float])

seed(seed=None)¶

Sets the random seeds for all environments, based on a given seed. Each individual environment will still get its own seed, by incrementing the given seed.

Parameters

seed (Optional[int]) – The random seed. May be None for completely random seeding.

Returns

A list containing the seeds for each individual env. Note that all list elements may be None if the env does not return anything when being seeded.

Return type

(List[Union[None, int]])

set_attr(attr_name, value, indices=None)¶

Set attribute inside vectorized environments.

Parameters

attr_name (str) – The name of attribute to assign new value
value (obj) – Value to assign to attr_name
indices (list,int) – Indices of envs to assign value

Returns

(NoneType)

step_async(actions)¶: Tell all the environments to start taking a step with the given actions. Call step_wait() to get the results of the step. You should not call this if a step_async run is already pending.

abstract step_wait()¶

Wait for the step taken with step_async().

Returns: observation, reward, done, information
Return type: ([int] or [float], [float], [bool], dict)