DefaultPolicy

class maze.core.agent.default_policy.DefaultPolicy(policies: Union[List[Union[None, str, Mapping[str, Any], Any]], Mapping[str, Union[None, str, Mapping[str, Any], Any]]])

Encapsulates one or more policies identified by policy IDs.

Parameters: policies – The policies to encapsulate, given either as a list or as a dict mapping policy IDs to the corresponding policies.

compute_action(observation: Dict[str, numpy.ndarray], maze_state: Optional[Any] = None, policy_id: Union[str, int] = None, deterministic: bool = False) → Dict[str, Union[int, numpy.ndarray]]: Implementation of the Policy interface.

compute_top_action_candidates(observation: Dict[str, numpy.ndarray], num_candidates: int, maze_state: Optional[Any] = None, policy_id: Union[str, int] = None, deterministic: bool = False) → Tuple[Sequence[Dict[str, Union[int, numpy.ndarray]]], Sequence[float]]: Implementation of the Policy interface.

needs_state() → bool: This policy does not require the state() object to compute the action.
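
The idea of one object holding several policies and selecting among them by policy ID can be sketched without the library. The example below is a minimal, self-contained illustration and not Maze's implementation: `PolicyContainerSketch` and `ConstantPolicy` are hypothetical names, keying list entries by their index is an assumption, and so is the rule that `policy_id` may be omitted only when a single policy is held. Only the method names and keyword arguments mirror the signatures documented above.

```python
from typing import Any, Dict, List, Mapping, Optional, Union

Observation = Dict[str, Any]
Action = Dict[str, Any]


class ConstantPolicy:
    """Hypothetical sub-policy that always returns the same action."""

    def __init__(self, action: Action) -> None:
        self._action = action

    def compute_action(self, observation: Observation, maze_state: Optional[Any] = None,
                       policy_id: Union[str, int, None] = None,
                       deterministic: bool = False) -> Action:
        # Return a copy so callers cannot mutate the stored action.
        return dict(self._action)


class PolicyContainerSketch:
    """Illustrative stand-in for a container of policies keyed by policy ID."""

    def __init__(self, policies: Union[List[Any], Mapping[str, Any]]) -> None:
        # Assumption: a list is keyed by position, a mapping keeps its own keys.
        if isinstance(policies, Mapping):
            self.policies = dict(policies)
        else:
            self.policies = dict(enumerate(policies))

    def needs_state(self) -> bool:
        # Mirrors the documented behaviour: no state object is required.
        return False

    def compute_action(self, observation: Observation, maze_state: Optional[Any] = None,
                       policy_id: Union[str, int, None] = None,
                       deterministic: bool = False) -> Action:
        if policy_id is None:
            # Assumption: the ID may be omitted only if exactly one policy is held.
            if len(self.policies) != 1:
                raise ValueError("policy_id is required when several policies are held")
            policy = next(iter(self.policies.values()))
        else:
            policy = self.policies[policy_id]
        # Delegate to the selected sub-policy with the same arguments.
        return policy.compute_action(observation, maze_state=maze_state,
                                     policy_id=policy_id, deterministic=deterministic)


container = PolicyContainerSketch({
    "left": ConstantPolicy({"move": 0}),
    "right": ConstantPolicy({"move": 1}),
})
action = container.compute_action({"position": [0.0, 1.0]}, policy_id="right")
print(action)  # {'move': 1}
```

In this sketch the container only routes the call; all decision logic stays in the sub-policies, which is why the container itself has no need for the state object.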