MazeEnv

class maze.core.env.maze_env.MazeEnv(*args, **kwds)

Base class for (gym-style) environments wrapping a core environment and defining state and execution interfaces. The aim of this class is to provide reusable functionality across different gym environments, such as the reset, step, and render functions.

Parameters
  • core_env – Core environment.

  • action_conversion_dict – A dictionary with policy names as keys and action conversion interface implementations as values.

  • observation_conversion_dict – A dictionary with policy names as keys and observation conversion interface implementations as values.
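
A minimal construction sketch; MyCoreEnv, MyActionConversion, and MyObservationConversion are hypothetical stand-ins for concrete CoreEnv and space-conversion implementations, and the dictionary keys are the policy (sub-step) IDs they apply to:

    from maze.core.env.maze_env import MazeEnv

    # MyCoreEnv / MyActionConversion / MyObservationConversion are
    # hypothetical placeholders, not part of the Maze API.
    env = MazeEnv(
        core_env=MyCoreEnv(),
        action_conversion_dict={0: MyActionConversion()},
        observation_conversion_dict={0: MyObservationConversion()},
    )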

property action_conversion

Return the action conversion mapping for the current policy.

action_conversion_dict

The action conversion mapping used by this env.

property action_space

Keep this env compatible with the gym interface by returning the action space of the current policy.

property action_spaces_dict

Policy action spaces as dict.

actor_id() → Tuple[Union[str, int], int]

Forward the call to self.core_env.

close() → None

Forward the call to self.core_env.

core_env

The wrapped CoreEnv.

get_env_time() → int

Return the ID of the current core env step as env time.

get_episode_id() → str

Return the ID of the current episode (the ID changes on env reset).

get_kpi_calculator() → Optional[maze.core.log_events.kpi_calculator.KpiCalculator]

Forward the call to self.core_env.

get_maze_action() → Any

Return last MazeAction object for trajectory recording.

get_maze_state() → Any

Return the current State object of the core env for trajectory recording.
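
Together, these two accessors support trajectory recording. A sketch (variable names are illustrative, assuming env is a MazeEnv instance):

    # After each step, fetch the raw MazeState and MazeAction behind
    # the last dict-space observation/action.
    action = env.action_space.sample()
    obs, reward, done, info = env.step(action)
    recorded_state = env.get_maze_state()    # current core-env state
    recorded_action = env.get_maze_action()  # last MazeAction taken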

get_observation_and_action_dicts(maze_state: Optional[Any], maze_action: Optional[Any], first_step_in_episode: bool) → Tuple[Optional[Dict[Union[int, str], Any]], Optional[Dict[Union[int, str], Any]]]

Convert MazeState and MazeAction back into observations and actions using the space conversion interfaces.

Parameters
  • maze_state – State of the environment

  • maze_action – MazeAction (the one following the state given as the first parameter)

  • first_step_in_episode – True if this is the first step in the episode.

Returns

observation and action dictionaries (keys are substep_ids)
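
Continuing the recording sketch above, a sketch of mapping a recorded state/action pair back into the dictionary spaces:

    # Convert a recorded MazeState/MazeAction pair back into the
    # observation/action dictionaries an agent would have seen.
    obs_dict, act_dict = env.get_observation_and_action_dicts(
        maze_state=recorded_state,
        maze_action=recorded_action,
        first_step_in_episode=True,  # True only for the episode's first step
    )
    # Both dicts are keyed by sub-step ID.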

get_renderer() → maze.core.rendering.renderer.Renderer

Return the renderer exposed by the underlying core env.

get_step_events() → Iterable[maze.core.events.event_record.EventRecord]

Forward the call to self.core_env.

is_actor_done() → bool

Forward the call to self.core_env.

maze_env

Direct access to the maze env (useful to bypass the wrapper hierarchy).

metadata

Present only for compatibility with gym.core.Env.

property observation_conversion

Return the state-to-observation mapping for the current policy.

observation_conversion_dict

The observation conversion mapping used by this env.

property observation_space

Keep this env compatible with the gym interface by returning the observation space of the current policy.

property observation_spaces_dict

Policy observation spaces as dict.
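
A sketch contrasting the per-policy space dicts with the flat gym-style properties (assuming env is a MazeEnv instance):

    # The dict properties expose one space per policy (sub-step) ...
    for policy_id, space in env.action_spaces_dict.items():
        print("action space of policy", policy_id, ":", space)
    for policy_id, space in env.observation_spaces_dict.items():
        print("observation space of policy", policy_id, ":", space)

    # ... while the plain gym-style properties return the spaces of the
    # currently active policy only.
    print(env.action_space)
    print(env.observation_space)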

reset() → Any

Reset the environment and return the initial observation.

Returns

the initial observation after resetting.

reward_range

A tuple (min reward, max reward), present for compatibility with gym.core.Env.

seed(seed: Any) → None

Forward the call to self.core_env.
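
A reproducibility sketch, assuming the wrapped core env accepts an integer seed:

    # Seed before reset so the initial state is deterministic.
    env.seed(1234)
    obs = env.reset()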

spec

Present only for compatibility with gym.core.Env.

step(action: Any) → Tuple[Any, float, bool, Dict[Any, Any]]

Take environment step (see CoreEnv.step for details).

Parameters

action – the action the agent wants to take.

Returns

observation, reward, done, info
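
Putting reset and step together, a standard gym-style rollout sketch with random actions:

    # Roll out one episode with random actions.
    obs = env.reset()
    done = False
    total_reward = 0.0
    while not done:
        action = env.action_space.sample()  # sample from the current policy's space
        obs, reward, done, info = env.step(action)
        total_reward += reward
    env.close()
    print("episode return:", total_reward)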