batch_outputs_time_major
-
class
maze.train.trainers.impala.impala_batching.batch_outputs_time_major(actor_outputs: List[maze.train.parallelization.distributed_actors.actor.AgentOutput], learner_device: str) Batch the collected actor outputs in time-major format.
- Parameters
actor_outputs – A list of actor outputs (e.g. rollouts consisting of observations, actions_taken, infos, action_logits, rewards and dones)
learner_device – the device (‘cpu’ or ‘cuda’) of the learner
- Returns
An ActorOutput named tuple in which the list of input rollouts has been batched along the second dimension.
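
To illustrate what "time major" means here, the following is a minimal pure-Python sketch, not Maze's implementation. It assumes every rollout has the same length T and stacks B rollouts so that each field is indexed as [time step][rollout], i.e. the rollouts end up in the second dimension. The names ToyOutput and batch_time_major are hypothetical stand-ins, only two of the fields are modelled, and moving data to learner_device is left out.

```python
from typing import List, NamedTuple


class ToyOutput(NamedTuple):
    """Hypothetical stand-in for an actor output (not Maze's AgentOutput)."""
    rewards: list  # per rollout: one entry per time step (length T)
    dones: list    # per rollout: one entry per time step (length T)


def batch_time_major(outputs: List[ToyOutput]) -> ToyOutput:
    """Stack B rollouts of length T so each field is indexed [t][b]."""
    lengths = {len(o.rewards) for o in outputs} | {len(o.dones) for o in outputs}
    if len(lengths) != 1:
        raise ValueError("all rollouts must have the same number of steps")
    # zip(*...) transposes [B][T] into [T][B]: time first, rollout second.
    rewards = [list(step) for step in zip(*(o.rewards for o in outputs))]
    dones = [list(step) for step in zip(*(o.dones for o in outputs))]
    return ToyOutput(rewards=rewards, dones=dones)


# Two rollouts (B = 2) of three steps each (T = 3).
rollout_a = ToyOutput(rewards=[1.0, 0.0, 2.0], dones=[False, False, True])
rollout_b = ToyOutput(rewards=[0.5, 0.5, 0.5], dones=[False, True, False])
batched = batch_time_major([rollout_a, rollout_b])
```

In this sketch batched.rewards[t][b] is the reward of rollout b at step t. A tensor-based version would typically stack along dimension 1 instead of transposing nested lists.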