RandomPolicy
class maze.core.agent.random_policy.RandomPolicy(action_spaces_dict: Dict[Union[str, int], gym.spaces.Space])
Implements a random structured policy.
- Parameters
action_spaces_dict – The action_spaces dict from the env
compute_action(observation: Dict[str, numpy.ndarray], maze_state: Optional[Any], policy_id: Union[str, int] = None, deterministic: bool = False) → Dict[str, Union[int, numpy.ndarray]]
Query the policy that corresponds to the given ID for an action.
- Parameters
observation – Current observation of the environment
maze_state – Current state of the environment (will always be None as needs_state() returns False)
policy_id – ID of the policy to query (does not have to be provided if the policies dict contains only one policy)
deterministic – Specify if the action should be computed deterministically
- Returns
Next action to take
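
The following is a minimal, dependency-free sketch of how a random structured policy with this interface might behave. It is not the Maze implementation. `DiscreteSpace`, `DictSpace` and `SketchRandomPolicy` are hypothetical stand-ins introduced for illustration only (the real class takes `gym.spaces.Space` objects), and the sketch assumes that the `deterministic` flag and the observation are ignored, which the reference above does not state.

```python
import random
from typing import Any, Dict, Optional, Union


class DiscreteSpace:
    """Hypothetical stand-in for a discrete action space with values 0..n-1."""

    def __init__(self, n: int) -> None:
        self.n = n

    def sample(self) -> int:
        return random.randrange(self.n)


class DictSpace:
    """Hypothetical stand-in for a dict space made of named sub-spaces."""

    def __init__(self, spaces: Dict[str, DiscreteSpace]) -> None:
        self.spaces = spaces

    def sample(self) -> Dict[str, int]:
        # Sample every named sub-space independently.
        return {name: space.sample() for name, space in self.spaces.items()}


class SketchRandomPolicy:
    """Illustrative sketch only, not the Maze source code."""

    def __init__(self, action_spaces_dict: Dict[Union[str, int], DictSpace]) -> None:
        self.action_spaces_dict = action_spaces_dict

    def compute_action(
        self,
        observation: Dict[str, Any],
        maze_state: Optional[Any] = None,
        policy_id: Optional[Union[str, int]] = None,
        deterministic: bool = False,
    ) -> Dict[str, int]:
        # Assumption: observation, maze_state and deterministic are unused here.
        if policy_id is None:
            # The ID may only be omitted when there is exactly one policy.
            if len(self.action_spaces_dict) != 1:
                raise ValueError("policy_id is required when there are several policies")
            policy_id = next(iter(self.action_spaces_dict))
        return self.action_spaces_dict[policy_id].sample()


# Usage: a single policy (ID 0) with one discrete action named "move".
policy = SketchRandomPolicy({0: DictSpace({"move": DiscreteSpace(4)})})
action = policy.compute_action(observation={}, maze_state=None)
```

Because only one policy is registered in the usage lines, `policy_id` can be left out. With several entries in `action_spaces_dict`, the sketch requires an explicit ID.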