DummyStructuredDistributedEnv

class maze.train.parallelization.distributed_env.dummy_distributed_env.DummyStructuredDistributedEnv(env_factories: List[Callable[[], Union[maze.core.env.structured_env.StructuredEnv, maze.core.env.structured_env_spaces_mixin.StructuredEnvSpacesMixin]]], logging_prefix: Optional[str] = None)

Creates a simple wrapper around multiple environments, calling each environment in sequence on the current Python process. This is useful for computationally simple environments such as CartPole-v1, where the overhead of multi-processing or multi-threading outweighs the environment computation time. It can also be used with RL methods that require a vectorized environment while you still want to train with a single environment.

Parameters

env_factories – A list of factory functions, each of which returns a MultiStepEnvironment instance when called.
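The pattern behind this wrapper can be illustrated with a minimal self-contained sketch: each factory is a zero-argument callable returning a fresh env instance, and the wrapper simply keeps the envs in a list and calls them in sequence on the current process. Note that `ToyEnv` and `SequentialEnvPool` below are illustrative stand-ins, not part of the Maze API.

```python
from typing import Any, Callable, List, Tuple


class ToyEnv:
    """Minimal environment with a scalar state (illustration only)."""

    def __init__(self) -> None:
        self.state = 0

    def reset(self) -> int:
        self.state = 0
        return self.state

    def step(self, action: int) -> Tuple[int, float, bool]:
        self.state += action
        done = self.state >= 3
        return self.state, float(action), done


class SequentialEnvPool:
    """Calls each wrapped env in sequence; no processes or threads."""

    def __init__(self, env_factories: List[Callable[[], Any]]) -> None:
        # Instantiate one env per factory; each call creates a fresh env.
        self.envs = [factory() for factory in env_factories]

    def reset(self) -> List[int]:
        return [env.reset() for env in self.envs]

    def step(self, actions: List[int]) -> List[Tuple[int, float, bool]]:
        # Step the envs one after another on the current Python process.
        return [env.step(a) for env, a in zip(self.envs, actions)]


pool = SequentialEnvPool([ToyEnv for _ in range(2)])
first_obs = pool.reset()
results = pool.step([1, 2])
```

Because everything runs sequentially in one process, there is no inter-process communication cost, which is exactly the trade-off the class description above motivates for cheap environments.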

property action_space

Implementation of the StructuredEnvSpacesMixin interface.

property action_spaces_dict

Return the action spaces dict of one of the distributed envs.

actor_id() → List[Tuple[Union[str, int], int]]

Return the actor id tuples of all envs in a list.

close() → None

BaseDistributedEnv implementation

get_stats(level: maze.core.log_stats.log_stats.LogStatsLevel = &lt;LogStatsLevel.EPOCH: 3&gt;) → maze.core.log_stats.log_stats.LogStatsAggregator

Returns the aggregator of the individual episode statistics emitted by the parallel envs.

Parameters

level – Must be set to LogStatsLevel.EPOCH; step and episode statistics are not propagated.

get_stats_value(event: Callable, level: maze.core.log_stats.log_stats.LogStatsLevel, name: Optional[str] = None) → Union[int, float, numpy.ndarray, dict]

Obtain a single value from the epoch statistics dict.

Parameters
  • event – The event interface method of the value in question.

  • level – Must be set to LogStatsLevel.EPOCH; step and episode statistics are not propagated.

  • name – The output_name of the statistics, in case it has been specified in maze.core.log_stats.event_decorators.define_epoch_stats().
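The get_stats()/get_stats_value() flow can be sketched with a minimal epoch-level aggregator that pools episode statistics from all envs and reduces them to a single value per event. `EpochAggregator` and its method names are hypothetical simplifications, not the Maze LogStatsAggregator API.

```python
from collections import defaultdict
from typing import Dict, List


class EpochAggregator:
    """Pools per-episode values and reduces them at epoch level."""

    def __init__(self) -> None:
        # event name -> list of per-episode values collected from all envs
        self._values: Dict[str, List[float]] = defaultdict(list)

    def receive(self, event: str, value: float) -> None:
        """Collect one episode statistic emitted by any of the envs."""
        self._values[event].append(value)

    def get_stats_value(self, event: str) -> float:
        """Reduce the pooled episode values to one epoch-level mean."""
        values = self._values[event]
        return sum(values) / len(values)


agg = EpochAggregator()
for episode_return in (1.0, 3.0):      # episode returns from env 0
    agg.receive("episode_reward", episode_return)
agg.receive("episode_reward", 5.0)     # episode return from env 1
mean_reward = agg.get_stats_value("episode_reward")
```

This mirrors why level must be LogStatsLevel.EPOCH above: only the epoch-level reduction over all pooled episodes is exposed, while individual step and episode values are consumed internally.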

is_actor_done() → numpy.ndarray

Return the done flags of all actors as a NumPy array.

property observation_space

Implementation of the StructuredEnvSpacesMixin interface.

property observation_spaces_dict

Return the observation spaces dict of one of the distributed envs.

reset() → Dict[str, numpy.ndarray]

BaseDistributedEnv implementation

seed(seed: int = None) → None

BaseDistributedEnv implementation

step(actions: List[Any]) → Tuple[Dict[str, numpy.ndarray], numpy.ndarray, numpy.ndarray, Iterable[Dict[Any, Any]]]

Step the environments with the given actions.

Parameters

actions – The list of actions for the respective envs.

Returns

observations, rewards, dones, information-dicts all in env-aggregated form.
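The "env-aggregated form" of the step() return can be sketched as follows, assuming dict observation spaces: per-env observations are stacked per key into one array with a leading env dimension, rewards and dones become 1-D arrays, and the info dicts remain a list. `aggregate_step` is a hypothetical helper for illustration, not part of Maze.

```python
from typing import Any, Dict, List, Tuple

import numpy as np


def aggregate_step(
    per_env: List[Tuple[Dict[str, np.ndarray], float, bool, Dict[Any, Any]]]
) -> Tuple[Dict[str, np.ndarray], np.ndarray, np.ndarray, List[Dict[Any, Any]]]:
    """Convert a list of per-env step results into env-aggregated form."""
    observations, rewards, dones, infos = zip(*per_env)
    # Stack each observation key across envs: shape becomes (n_envs, ...).
    stacked = {
        key: np.stack([obs[key] for obs in observations])
        for key in observations[0]
    }
    return stacked, np.asarray(rewards), np.asarray(dones), list(infos)


obs, rewards, dones, infos = aggregate_step(
    [
        ({"x": np.zeros(2)}, 1.0, False, {}),
        ({"x": np.ones(2)}, 2.0, True, {}),
    ]
)
```

Aggregating along a leading env axis like this is what lets downstream training code treat the pool of envs as one batched environment.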

write_epoch_stats()

Trigger the epoch statistics generation.