MazeEnv

class maze.core.env.maze_env.MazeEnv(*args, **kwds)

Base class for (gym-style) environments wrapping a core environment and defining state and execution interfaces. The aim of this class is to provide reusable functionality across different gym environments, such as the reset, step, and render functions.

Parameters
  • core_env – Core environment.

  • action_conversion_dict – A dictionary with policy names as keys and action conversion interface implementations as values.

  • observation_conversion_dict – A dictionary with policy names as keys and observation conversion interface implementations as values.
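
A minimal construction sketch; MyCoreEnv, MyActionConversion, and MyObservationConversion are hypothetical stand-ins for concrete CoreEnv and space-conversion implementations, and the dictionary keys are the policy (sub-step) IDs they apply to:

    from maze.core.env.maze_env import MazeEnv

    # MyCoreEnv / MyActionConversion / MyObservationConversion are
    # hypothetical placeholders, not part of the Maze API.
    env = MazeEnv(
        core_env=MyCoreEnv(),
        action_conversion_dict={0: MyActionConversion()},
        observation_conversion_dict={0: MyObservationConversion()},
    )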

property action_conversion

Return the action conversion mapping for the current policy.

action_conversion_dict

The action conversion mapping used by this env.

property action_space

Keep this env compatible with the gym interface by returning the action space of the current policy.

property action_spaces_dict

Policy action spaces as dict.

actor_id() → Tuple[Union[str, int], int]

Forward the call to self.core_env.

close() → None

Forward the call to self.core_env.

core_env

The wrapped CoreEnv.

get_env_time() → int

Return the ID of the current core env step as env time.

get_episode_id() → str

Return the ID of the current episode (the ID changes on env reset).

get_kpi_calculator() → Optional[maze.core.log_events.kpi_calculator.KpiCalculator]

Forward the call to self.core_env.

get_maze_action() → Any

Return last MazeAction object for trajectory recording.

get_maze_state() → Any

Return the current State object of the core env for trajectory recording.
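
Together, these two accessors support trajectory recording. A sketch (variable names are illustrative, assuming env is a MazeEnv instance):

    # After each step, fetch the raw MazeState and MazeAction behind
    # the last dict-space observation/action.
    action = env.action_space.sample()
    obs, reward, done, info = env.step(action)
    recorded_state = env.get_maze_state()    # current core-env state
    recorded_action = env.get_maze_action()  # last MazeAction taken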

get_observation_and_action_dicts(maze_state: Optional[Any], maze_action: Optional[Any], first_step_in_episode: bool) → Tuple[Optional[Dict[Union[int, str], Any]], Optional[Dict[Union[int, str], Any]]]

Convert MazeState and MazeAction back into observations and actions using the space conversion interfaces.

Parameters
  • maze_state – State of the environment

  • maze_action – MazeAction (the one following the state given as the first parameter)

  • first_step_in_episode – True if this is the first step in the episode.

Returns

observation and action dictionaries (keys are substep_ids)
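
Continuing the recording sketch above, a sketch of mapping a recorded state/action pair back into the dictionary spaces:

    # Convert a recorded MazeState/MazeAction pair back into the
    # observation/action dictionaries an agent would have seen.
    obs_dict, act_dict = env.get_observation_and_action_dicts(
        maze_state=recorded_state,
        maze_action=recorded_action,
        first_step_in_episode=True,  # True only for the episode's first step
    )
    # Both dicts are keyed by sub-step ID.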

get_renderer() → maze.core.rendering.renderer.Renderer

Return the renderer exposed by the underlying core env.

get_step_events() → Iterable[maze.core.events.event_record.EventRecord]

Forward the call to self.core_env.

is_actor_done() → bool

Forward the call to self.core_env.

maze_env

Direct access to the maze env (useful to bypass the wrapper hierarchy).

metadata

Present only for compatibility with gym.core.Env.

property observation_conversion

Return the state-to-observation mapping for the current policy.

observation_conversion_dict

The observation conversion mapping used by this env.

property observation_space

Keep this env compatible with the gym interface by returning the observation space of the current policy.

property observation_spaces_dict

Policy observation spaces as dict.
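
A sketch contrasting the per-policy space dicts with the flat gym-style properties (assuming env is a MazeEnv instance):

    # The dict properties expose one space per policy (sub-step) ...
    for policy_id, space in env.action_spaces_dict.items():
        print("action space of policy", policy_id, ":", space)
    for policy_id, space in env.observation_spaces_dict.items():
        print("observation space of policy", policy_id, ":", space)

    # ... while the plain gym-style properties return the spaces of the
    # currently active policy only.
    print(env.action_space)
    print(env.observation_space)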

reset() → Any

Reset the environment and return the initial observation.

Returns

the initial observation after resetting.

reward_range

A tuple (min reward, max reward), present for compatibility with gym.core.Env.

seed(seed: Any) → None

Forward the call to self.core_env.
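
A reproducibility sketch, assuming the wrapped core env accepts an integer seed:

    # Seed before reset so the initial state is deterministic.
    env.seed(1234)
    obs = env.reset()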

spec

Present only for compatibility with gym.core.Env.

step(action: Any) → Tuple[Any, float, bool, Dict[Any, Any]]

Take environment step (see CoreEnv.step for details).

Parameters

action – the action the agent wants to take.

Returns

observation, reward, done, info
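
Putting reset and step together, a standard gym-style rollout sketch with random actions:

    # Roll out one episode with random actions.
    obs = env.reset()
    done = False
    total_reward = 0.0
    while not done:
        action = env.action_space.sample()  # sample from the current policy's space
        obs, reward, done, info = env.step(action)
        total_reward += reward
    env.close()
    print("episode return:", total_reward)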