ActorAgent

class maze.train.parallelization.distributed_actors.actor.ActorAgent(env_factory: Callable, policy: maze.core.agent.torch_policy.TorchPolicy, n_rollout_steps: int)

Steps through a given environment and records rollouts. Designed to be used in distributed rollouts.

rollout() → maze.train.parallelization.distributed_actors.actor.AgentOutput_w_stats

Performs an agent rollout: samples actions and steps through the env for a maximum of n_rollout_steps. The rollout (observations, rewards, dones, infos, actions_taken, actions_logits) is returned.

Returns

The rollout (observations, rewards, dones, infos, actions_taken, actions_logits) as an ActorOutput named tuple.
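
The general shape of such a rollout loop can be sketched in plain Python. This is an illustrative sketch only, not Maze's implementation: ToyEnv, toy_policy, RolloutSketch and rollout_sketch are hypothetical names, the four-value step() return is an assumed Gym-style convention, and resetting the env when an episode ends mid-rollout is an assumption the description above does not confirm.

```python
import random
from collections import namedtuple

# Hypothetical container mirroring the fields listed above; not Maze's actual type.
RolloutSketch = namedtuple(
    "RolloutSketch",
    ["observations", "rewards", "dones", "infos", "actions_taken", "actions_logits"],
)


class ToyEnv:
    """Hypothetical stand-in env: a step counter whose episodes end after 3 steps."""

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        done = self.t >= 3
        return self.t, float(action), done, {"t": self.t}


def toy_policy(observation, rng):
    """Hypothetical policy: uniform logits over two actions, one action sampled."""
    logits = [0.0, 0.0]
    action = rng.choice([0, 1])
    return action, logits


def rollout_sketch(env_factory, policy, n_rollout_steps, seed=0):
    rng = random.Random(seed)
    env = env_factory()  # the env is built from the factory, as in the constructor signature
    obs = env.reset()
    record = RolloutSketch([], [], [], [], [], [])
    for _ in range(n_rollout_steps):
        action, logits = policy(obs, rng)
        next_obs, reward, done, info = env.step(action)
        # Store the observation the action was based on, plus the step results.
        record.observations.append(obs)
        record.rewards.append(reward)
        record.dones.append(done)
        record.infos.append(info)
        record.actions_taken.append(action)
        record.actions_logits.append(logits)
        # Assumption: start a new episode when the current one ends.
        obs = env.reset() if done else next_obs
    return record


result = rollout_sketch(ToyEnv, toy_policy, n_rollout_steps=5)
```

Every field of the returned tuple has exactly n_rollout_steps entries here, so a learner could stack rollouts from several actors without padding. Whether the real ActorAgent does the same depends on how it handles episode ends.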

update_policy(state_dict: Dict) → NoReturn

Update the policy with the given state dict.

Parameters

state_dict – State dict to load.
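
In a distributed setup, a method like this is typically how a learner pushes fresh weights to its actors. The sketch below illustrates that pattern in plain Python under stated assumptions: TinyPolicy and ActorSketch are hypothetical stand-ins, and the state_dict() / load_state_dict() pair only imitates the PyTorch naming convention with a plain dict instead of tensors. It does not show how TorchPolicy itself loads a state dict.

```python
import copy


class TinyPolicy:
    """Hypothetical stand-in for a policy; parameters are a plain dict of lists."""

    def __init__(self, weights):
        self.params = {"weights": list(weights)}

    def state_dict(self):
        # Return a copy so callers cannot mutate the policy through the snapshot.
        return copy.deepcopy(self.params)

    def load_state_dict(self, state_dict):
        # Reject snapshots whose keys do not match this policy's parameters.
        if state_dict.keys() != self.params.keys():
            raise KeyError("state dict keys do not match policy parameters")
        self.params = copy.deepcopy(state_dict)


class ActorSketch:
    """Hypothetical actor that only holds a policy and accepts weight updates."""

    def __init__(self, policy):
        self.policy = policy

    def update_policy(self, state_dict):
        self.policy.load_state_dict(state_dict)


learner_policy = TinyPolicy([0.5, -1.0])
actor = ActorSketch(TinyPolicy([0.0, 0.0]))

# The learner sends a snapshot of its weights; the actor loads it.
actor.update_policy(learner_policy.state_dict())

# Later changes on the learner side do not leak into the actor's copy.
learner_policy.params["weights"][0] = 9.0
```

Copying on both sides keeps the actor's weights fixed between explicit updates, which is the property a rollout worker needs while it is collecting data.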