BaseWorker

class maze.train.parallelization.base_worker.BaseWorker

This class holds a policy as well as an env. It steps through the env by producing actions from the policy and records the rollout so that it can be processed by the learner.

abstract rollout() → Union[Tuple[maze.train.parallelization.base_worker.BaseWorkerOutput, List], Tuple[numpy.ndarray, List]]

Interface to perform an agent rollout, that is, sample actions, step through the env for a maximum of n_rollout_steps, and collect data.

Returns: This rollout as an ActorOutput or array of ActorOutputs
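The rollout loop described above can be sketched in plain Python. This is a minimal illustration, not the Maze implementation: `SketchWorker`, `CountingEnv`, `ToyWorker`, the record layout, and the meaning of the second tuple element (here, returns of finished episodes) are all assumptions made for the example. The real `BaseWorker` returns a `BaseWorkerOutput` or a `numpy.ndarray` as shown in the signature.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, List, Tuple


class SketchWorker(ABC):
    """Hypothetical stand-in for BaseWorker (not the Maze class)."""

    @abstractmethod
    def rollout(self) -> Tuple[List[Dict[str, Any]], List[float]]:
        """Collect up to n_rollout_steps transitions."""


class CountingEnv:
    """Toy env (assumption): the observation is a step counter; an episode ends after 3 steps."""

    def reset(self) -> int:
        self.t = 0
        return self.t

    def step(self, action: int) -> Tuple[int, float, bool]:
        self.t += 1
        return self.t, float(action), self.t >= 3


class ToyWorker(SketchWorker):
    def __init__(self, env: CountingEnv, policy: Callable[[int], int], n_rollout_steps: int):
        self.env = env
        self.policy = policy
        self.n_rollout_steps = n_rollout_steps

    def rollout(self) -> Tuple[List[Dict[str, Any]], List[float]]:
        records: List[Dict[str, Any]] = []
        finished_returns: List[float] = []
        obs = self.env.reset()
        episode_return = 0.0
        for _ in range(self.n_rollout_steps):
            action = self.policy(obs)                       # sample an action from the policy
            next_obs, reward, done = self.env.step(action)  # step through the env
            records.append({"obs": obs, "action": action, "reward": reward, "done": done})
            episode_return += reward
            if done:
                # Episode finished: store its return and start a new episode.
                finished_returns.append(episode_return)
                episode_return = 0.0
                next_obs = self.env.reset()
            obs = next_obs
        return records, finished_returns


worker = ToyWorker(CountingEnv(), policy=lambda obs: 1, n_rollout_steps=5)
records, finished_returns = worker.rollout()
```

In this sketch the worker resets the env at the start of every rollout; a real worker may instead carry the env state over between rollouts.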

abstract update_policy(state_dict: Dict) → None

Update the policy with the given state dict.

Parameters: state_dict – State dict to load.
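As a rough illustration of how a worker might apply a state dict received from the learner, the sketch below uses a dictionary-backed toy policy. `ToyPolicy`, `PolicyHoldingWorker`, and the parameter names are hypothetical; the source does not say which framework the state dict comes from, so nothing here should be read as the Maze implementation.

```python
import copy
from typing import Dict


class ToyPolicy:
    """Hypothetical policy whose parameters live in a plain dict."""

    def __init__(self) -> None:
        self.params: Dict[str, float] = {"weight": 0.0, "bias": 0.0}

    def state_dict(self) -> Dict[str, float]:
        # Return a copy so callers cannot mutate the policy by accident.
        return copy.deepcopy(self.params)

    def load_state_dict(self, state_dict: Dict[str, float]) -> None:
        missing = set(self.params) - set(state_dict)
        if missing:
            raise KeyError(f"missing keys: {sorted(missing)}")
        # Copy only the known keys; extra keys in state_dict are ignored.
        self.params = {key: copy.deepcopy(state_dict[key]) for key in self.params}


class PolicyHoldingWorker:
    """Hypothetical worker that keeps its own copy of the policy."""

    def __init__(self, policy: ToyPolicy) -> None:
        self.policy = policy

    def update_policy(self, state_dict: Dict) -> None:
        # Replace the worker's parameters with the ones sent by the learner.
        self.policy.load_state_dict(state_dict)


learner_policy = ToyPolicy()
learner_policy.params["weight"] = 0.5  # pretend the learner took a training step

worker = PolicyHoldingWorker(ToyPolicy())
worker.update_policy(learner_policy.state_dict())
```

Copying the values on load keeps the worker's parameters independent of the learner's, so later learner updates only reach the worker through an explicit `update_policy` call.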