gym_os2r.common.vec_env

class gym_os2r.common.vec_env.SubprocVecEnv(env_fns, start_method=None)

Bases: gym_os2r.common.vec_env.vec_env.VecEnv

Creates a multiprocess vectorized wrapper for multiple environments, distributing each environment to its own process, allowing a significant speedup when the environment is computationally complex. For performance reasons, if your environment is not IO-bound, the number of environments should not exceed the number of logical cores on your CPU.
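The guidance on the number of environments can be applied with a small helper. The following is a minimal sketch and not part of gym_os2r: the name cap_num_envs is hypothetical, and it assumes that os.cpu_count() reports the number of logical cores.

```python
import os


def cap_num_envs(requested):
    """Limit the number of environments to the logical core count.

    Hypothetical helper, not part of gym_os2r. Intended for CPU-bound
    environments; IO-bound environments may benefit from more processes.
    """
    if requested < 1:
        raise ValueError("requested must be at least 1")
    # os.cpu_count() can return None when the count cannot be determined.
    logical_cores = os.cpu_count() or 1
    return min(requested, logical_cores)
```

The result could then be used as the length of the env_fns list passed to SubprocVecEnv.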

Warning

Only the ‘forkserver’ and ‘spawn’ start methods are thread-safe, which matters when TensorFlow sessions or other non-thread-safe libraries are used in the parent process (see issue #217). However, compared to ‘fork’, they incur a small start-up cost and place restrictions on global variables. With those methods, users must wrap the code in an if __name__ == "__main__": block. For more information, see the multiprocessing documentation.

Parameters
  • env_fns ([callable]) – A list of functions that will create the environments (each callable returns a gym.Env instance when called).

  • start_method (str) – method used to start the subprocesses. Must be one of the methods returned by multiprocessing.get_all_start_methods(). Defaults to ‘forkserver’ on available platforms, and ‘spawn’ otherwise.
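The default selection described for start_method can be sketched with the standard library alone. This is an illustration under stated assumptions, not the library's actual code: resolve_start_method is a hypothetical name, and the only real API used is multiprocessing.get_all_start_methods().

```python
import multiprocessing


def resolve_start_method(start_method=None):
    """Pick a start method following the rule documented above.

    Hypothetical helper: prefers 'forkserver' when the platform offers it
    and falls back to 'spawn' otherwise.
    """
    available = multiprocessing.get_all_start_methods()
    if start_method is None:
        return "forkserver" if "forkserver" in available else "spawn"
    if start_method not in available:
        raise ValueError(
            "start_method must be one of {}, got {!r}".format(available, start_method)
        )
    return start_method
```

When the resolved method is ‘forkserver’ or ‘spawn’, the code that builds the vectorized environment belongs inside an if __name__ == "__main__": block, as the warning above states.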

close()

Clean up the environment’s resources.

env_method(method_name, *method_args, indices=None, **method_kwargs)

Call instance methods of vectorized environments.

get_attr(attr_name, indices=None)

Return attribute from vectorized environment (see base class).

Returns

class attribute.

Return type

(dict)

get_state_info(states, actions)

Get the reward and done flag of the environments in a given state.

Parameters
  • states – the states

  • actions ([int] or [float]) – the actions

Returns

reward, done

Return type

([float], [bool])

get_state_info_async(states, actions)

get_state_info_wait()

reset()

Reset all the environments and return an array of observations, or a tuple of observation arrays. If step_async is still doing work, that work will be cancelled and step_wait() should not be called until step_async() is invoked again.

Returns

observation

Return type

([int] or [float])

seed(seed=None)

Sets the random seeds for all environments, based on a given seed. Each individual environment will still get its own seed, by incrementing the given seed.

Parameters

seed (Optional[int]) – The random seed. May be None for completely random seeding.

Returns

A list containing the seeds for each individual env. Note that all list elements may be None if the env does not return anything when being seeded.

Return type

(List[Union[None, int]])
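The seeding rule can be illustrated with a short sketch. It assumes that the environment at index i receives seed + i, which is one plausible reading of "incrementing the given seed"; per_env_seeds is a hypothetical name and not part of gym_os2r.

```python
def per_env_seeds(seed, num_envs):
    """Derive one seed per environment from a single base seed.

    Assumption of this sketch: environment i is seeded with seed + i.
    A base seed of None is passed through so every env seeds itself randomly.
    """
    if seed is None:
        return [None] * num_envs
    return [seed + idx for idx in range(num_envs)]
```

Under this assumption, a base seed of 10 with three environments gives the seeds 10, 11, and 12, so the environments do not produce identical rollouts.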

set_attr(attr_name, value, indices=None)

Set attribute inside vectorized environments (see base class).

step_async(actions)

Tell all the environments to start taking a step with the given actions. Call step_wait() to get the results of the step. You should not call this if a step_async run is already pending.

step_wait()

Wait for the step taken with step_async().

Returns

observation, reward, done, information

Return type

([int] or [float], [float], [bool], dict)

gym_os2r.common.vec_env.subproc_vec_env

class gym_os2r.common.vec_env.subproc_vec_env.SubprocVecEnv(env_fns, start_method=None)

Bases: gym_os2r.common.vec_env.vec_env.VecEnv

Creates a multiprocess vectorized wrapper for multiple environments, distributing each environment to its own process, allowing a significant speedup when the environment is computationally complex. For performance reasons, if your environment is not IO-bound, the number of environments should not exceed the number of logical cores on your CPU.

Warning

Only the ‘forkserver’ and ‘spawn’ start methods are thread-safe, which matters when TensorFlow sessions or other non-thread-safe libraries are used in the parent process (see issue #217). However, compared to ‘fork’, they incur a small start-up cost and place restrictions on global variables. With those methods, users must wrap the code in an if __name__ == "__main__": block. For more information, see the multiprocessing documentation.

Parameters
  • env_fns ([callable]) – A list of functions that will create the environments (each callable returns a gym.Env instance when called).

  • start_method (str) – method used to start the subprocesses. Must be one of the methods returned by multiprocessing.get_all_start_methods(). Defaults to ‘forkserver’ on available platforms, and ‘spawn’ otherwise.

close()

Clean up the environment’s resources.

env_method(method_name, *method_args, indices=None, **method_kwargs)

Call instance methods of vectorized environments.

get_attr(attr_name, indices=None)

Return attribute from vectorized environment (see base class).

Returns

class attribute.

Return type

(dict)

get_state_info(states, actions)

Get the reward and done flag of the environments in a given state.

Parameters
  • states – the states

  • actions ([int] or [float]) – the actions

Returns

reward, done

Return type

([float], [bool])

get_state_info_async(states, actions)

get_state_info_wait()

reset()

Reset all the environments and return an array of observations, or a tuple of observation arrays. If step_async is still doing work, that work will be cancelled and step_wait() should not be called until step_async() is invoked again.

Returns

observation

Return type

([int] or [float])

seed(seed=None)

Sets the random seeds for all environments, based on a given seed. Each individual environment will still get its own seed, by incrementing the given seed.

Parameters

seed (Optional[int]) – The random seed. May be None for completely random seeding.

Returns

A list containing the seeds for each individual env. Note that all list elements may be None if the env does not return anything when being seeded.

Return type

(List[Union[None, int]])

set_attr(attr_name, value, indices=None)

Set attribute inside vectorized environments (see base class).

step_async(actions)

Tell all the environments to start taking a step with the given actions. Call step_wait() to get the results of the step. You should not call this if a step_async run is already pending.

step_wait()

Wait for the step taken with step_async().

Returns

observation, reward, done, information

Return type

([int] or [float], [float], [bool], dict)

gym_os2r.common.vec_env.vec_env

exception gym_os2r.common.vec_env.vec_env.AlreadySteppingError

Bases: Exception

Raised when an asynchronous step is running while step_async() is called again.

class gym_os2r.common.vec_env.vec_env.CloudpickleWrapper(var)

Bases: object

exception gym_os2r.common.vec_env.vec_env.NotSteppingError

Bases: Exception

Raised when an asynchronous step is not running but step_wait() is called.

class gym_os2r.common.vec_env.vec_env.VecEnv(num_envs, observation_space, action_space)

Bases: abc.ABC

An abstract asynchronous, vectorized environment.

Parameters
  • num_envs (int) – the number of environments

  • observation_space (Gym Space) – the observation space

  • action_space (Gym Space) – the action space

abstract close()

Clean up the environment’s resources.

abstract env_method(method_name, *method_args, indices=None, **method_kwargs)

Call instance methods of vectorized environments.

Parameters
  • method_name (str) – The name of the environment method to invoke.

  • indices (list,int) – Indices of envs whose method to call

  • method_args (tuple) – Any positional arguments to provide in the call

  • method_kwargs (dict) – Any keyword arguments to provide in the call

Returns

List of items returned by the environment’s method call

Return type

(list)
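How the indices argument selects environments can be sketched in-process. The names below (resolve_indices, call_env_method, ToyEnv) are hypothetical and not part of gym_os2r. The sketch assumes that indices=None addresses every environment and that a single int addresses one.

```python
def resolve_indices(indices, num_envs):
    """Normalize None, an int, or a list of ints to a list of indices."""
    if indices is None:
        return list(range(num_envs))  # assumption: None means all envs
    if isinstance(indices, int):
        return [indices]
    return list(indices)


class ToyEnv:
    """Stand-in environment with one method to call remotely."""

    def __init__(self, offset):
        self.offset = offset

    def shifted(self, value, scale=1):
        return self.offset + scale * value


def call_env_method(envs, method_name, *method_args, indices=None, **method_kwargs):
    """Call method_name on the selected envs and collect the results."""
    targets = resolve_indices(indices, len(envs))
    return [
        getattr(envs[i], method_name)(*method_args, **method_kwargs)
        for i in targets
    ]


envs = [ToyEnv(0), ToyEnv(10), ToyEnv(20)]
all_results = call_env_method(envs, "shifted", 2)
some_results = call_env_method(envs, "shifted", 2, indices=[0, 2], scale=3)
```

The returned list has one entry per selected environment, in the order of the resolved indices.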

abstract get_attr(attr_name, indices=None)

Return attribute from vectorized environment.

Parameters
  • attr_name (str) – The name of the attribute whose value to return

  • indices (list,int) – Indices of envs to get attribute from

Returns

List of values of ‘attr_name’ in all environments

Return type

(list)

get_images()

Return RGB images from each environment

Return type

Sequence[ndarray]

getattr_depth_check(name, already_found)

Check if an attribute reference is being hidden in a recursive call to __getattr__

Parameters
  • name (str) – name of attribute to check for

  • already_found (bool) – whether this attribute has already been found in a wrapper

Returns

name of module whose attribute is being shadowed, if any.

Return type

(str or None)

metadata = {'render.modes': ['human', 'rgb_array']}

render(mode='human')

Gym environment rendering

Parameters

mode (str) – the rendering type

abstract reset()

Reset all the environments and return an array of observations, or a tuple of observation arrays. If step_async is still doing work, that work will be cancelled and step_wait() should not be called until step_async() is invoked again.

Returns

observation

Return type

([int] or [float])

abstract seed(seed=None)

Sets the random seeds for all environments, based on a given seed. Each individual environment will still get its own seed, by incrementing the given seed.

Parameters

seed (Optional[int]) – The random seed. May be None for completely random seeding.

Returns

A list containing the seeds for each individual env. Note that all list elements may be None if the env does not return anything when being seeded.

Return type

(List[Union[None, int]])

abstract set_attr(attr_name, value, indices=None)

Set attribute inside vectorized environments.

Parameters
  • attr_name (str) – The name of attribute to assign new value

  • value (obj) – Value to assign to attr_name

  • indices (list,int) – Indices of envs to assign value

Returns

(NoneType)
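The relationship between set_attr and get_attr can be shown with plain objects. This is a sketch only: toy_get_attr, toy_set_attr, and ToyEnv are hypothetical stand-ins that operate on a list in one process, and indices=None is assumed to mean all environments.

```python
class ToyEnv:
    """Stand-in environment exposing a single attribute."""

    def __init__(self):
        self.gravity = 9.81


def _targets(envs, indices):
    # Assumption: None selects every env, an int selects exactly one.
    if indices is None:
        return range(len(envs))
    if isinstance(indices, int):
        return [indices]
    return indices


def toy_get_attr(envs, attr_name, indices=None):
    """Return the attribute value of each selected env as a list."""
    return [getattr(envs[i], attr_name) for i in _targets(envs, indices)]


def toy_set_attr(envs, attr_name, value, indices=None):
    """Assign value to attr_name on each selected env; returns None."""
    for i in _targets(envs, indices):
        setattr(envs[i], attr_name, value)


envs = [ToyEnv(), ToyEnv(), ToyEnv()]
toy_set_attr(envs, "gravity", 1.62, indices=[1])
values = toy_get_attr(envs, "gravity")
```

Only the environment at index 1 is changed; the other two keep their original value.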

step(actions)

Step the environments with the given actions.

Parameters

actions ([int] or [float]) – the actions

Returns

observation, reward, done, information

Return type

([int] or [float], [float], [bool], dict)

abstract step_async(actions)

Tell all the environments to start taking a step with the given actions. Call step_wait() to get the results of the step. You should not call this if a step_async run is already pending.

abstract step_wait()

Wait for the step taken with step_async().

Returns

observation, reward, done, information

Return type

([int] or [float], [float], [bool], dict)
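The calling protocol for step_async(), step_wait(), step(), and reset() can be modeled without subprocesses. The class below is a hypothetical in-process sketch, not the gym_os2r implementation: the two exception classes are local stand-ins for the ones documented above, step() is assumed to be step_async() followed by step_wait(), and the toy dynamics (state plus action) and the one-info-dict-per-environment layout are invented for illustration.

```python
class AlreadySteppingError(Exception):
    """Local stand-in: step_async() called while a step is pending."""


class NotSteppingError(Exception):
    """Local stand-in: step_wait() called with no pending step."""


class ToyVecEnv:
    """Models only the async call protocol, with trivial dynamics."""

    def __init__(self, num_envs):
        self.num_envs = num_envs
        self._pending_actions = None
        self._states = [0] * num_envs

    def reset(self):
        # Resetting cancels any pending step, as described for reset().
        self._pending_actions = None
        self._states = [0] * self.num_envs
        return list(self._states)

    def step_async(self, actions):
        if self._pending_actions is not None:
            raise AlreadySteppingError("a step is already pending")
        self._pending_actions = list(actions)

    def step_wait(self):
        if self._pending_actions is None:
            raise NotSteppingError("no step is pending")
        actions, self._pending_actions = self._pending_actions, None
        self._states = [s + a for s, a in zip(self._states, actions)]
        rewards = [float(a) for a in actions]
        dones = [s >= 3 for s in self._states]
        infos = [{} for _ in actions]  # assumption: one info dict per env
        return list(self._states), rewards, dones, infos

    def step(self, actions):
        # Assumed composition: start the step, then block for its result.
        self.step_async(actions)
        return self.step_wait()
```

Splitting the step in two lets the caller do other work between step_async() and step_wait() while the environments compute.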

property unwrapped

class gym_os2r.common.vec_env.vec_env.VecEnvWrapper(venv, observation_space=None, action_space=None)

Bases: gym_os2r.common.vec_env.vec_env.VecEnv

Vectorized environment base class.

Parameters
  • venv (VecEnv) – the vectorized environment to wrap

  • observation_space (Gym Space) – the observation space (can be None to load from venv)

  • action_space (Gym Space) – the action space (can be None to load from venv)

close()

Clean up the environment’s resources.

env_method(method_name, *method_args, indices=None, **method_kwargs)

Call instance methods of vectorized environments.

Parameters
  • method_name (str) – The name of the environment method to invoke.

  • indices (list,int) – Indices of envs whose method to call

  • method_args (tuple) – Any positional arguments to provide in the call

  • method_kwargs (dict) – Any keyword arguments to provide in the call

Returns

List of items returned by the environment’s method call

Return type

(list)

get_attr(attr_name, indices=None)

Return attribute from vectorized environment.

Parameters
  • attr_name (str) – The name of the attribute whose value to return

  • indices (list,int) – Indices of envs to get attribute from

Returns

List of values of ‘attr_name’ in all environments

Return type

(list)

get_images()

Return RGB images from each environment

getattr_depth_check(name, already_found)

See base class.

Returns

name of module whose attribute is being shadowed, if any.

Return type

(str or None)

getattr_recursive(name)

Recursively check wrappers to find attribute.

Parameters

name (str) – name of attribute to look for

Returns

attribute

Return type

(object)
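A recursive lookup through a chain of wrappers might look like the sketch below. It is an assumption about the general technique, not the gym_os2r source: ToyWrapper and BaseToyVecEnv are hypothetical classes, and "defined on this wrapper" is approximated by checking the instance's own attributes.

```python
class BaseToyVecEnv:
    """Innermost, unwrapped environment stand-in."""

    def __init__(self):
        self.num_envs = 4


class ToyWrapper:
    """Wrapper that stores extra attributes and delegates the rest."""

    def __init__(self, venv, **own_attributes):
        self.venv = venv
        for key, value in own_attributes.items():
            setattr(self, key, value)

    def getattr_recursive(self, name):
        # An attribute set on this wrapper wins over wrapped ones.
        if name in vars(self):
            return getattr(self, name)
        # Otherwise ask the next wrapper in the chain, if there is one.
        if hasattr(self.venv, "getattr_recursive"):
            return self.venv.getattr_recursive(name)
        # The child is unwrapped: look the attribute up directly.
        return getattr(self.venv, name)


inner = ToyWrapper(BaseToyVecEnv(), clip=5.0)
outer = ToyWrapper(inner, gamma=0.99)
```

Here outer.getattr_recursive("clip") is resolved by the inner wrapper and outer.getattr_recursive("num_envs") by the base environment; a name that exists nowhere in the chain raises AttributeError.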

render(mode='human')

Gym environment rendering

Parameters

mode (str) – the rendering type

abstract reset()

Reset all the environments and return an array of observations, or a tuple of observation arrays. If step_async is still doing work, that work will be cancelled and step_wait() should not be called until step_async() is invoked again.

Returns

observation

Return type

([int] or [float])

seed(seed=None)

Sets the random seeds for all environments, based on a given seed. Each individual environment will still get its own seed, by incrementing the given seed.

Parameters

seed (Optional[int]) – The random seed. May be None for completely random seeding.

Returns

A list containing the seeds for each individual env. Note that all list elements may be None if the env does not return anything when being seeded.

Return type

(List[Union[None, int]])

set_attr(attr_name, value, indices=None)

Set attribute inside vectorized environments.

Parameters
  • attr_name (str) – The name of attribute to assign new value

  • value (obj) – Value to assign to attr_name

  • indices (list,int) – Indices of envs to assign value

Returns

(NoneType)

step_async(actions)

Tell all the environments to start taking a step with the given actions. Call step_wait() to get the results of the step. You should not call this if a step_async run is already pending.

abstract step_wait()

Wait for the step taken with step_async().

Returns

observation, reward, done, information

Return type

([int] or [float], [float], [bool], dict)