Policies, Critics and Agents

This page contains the reference documentation for policies, critics and agents.

maze.core.agent

Policies:

`FlatPolicy`	Generic flat policy interface.
`Policy`	Structured policy class designed to work with structured environments.
`TorchPolicy`	Encapsulates multiple torch policies along with a distribution mapper for training and rollouts in structured environments.
`DefaultPolicy`	Encapsulates one or more policies identified by policy IDs.
`RandomPolicy`	Implements a random structured policy.
`DummyCartPolePolicy`	Dummy structured policy for the CartPole env.
`SerializedTorchPolicy`	Structured policy used for rollouts of trained models.
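
As a rough illustration of what "one or more policies identified by policy IDs" can mean in a structured environment, the sketch below routes each sub-step to its own sub-policy. It is plain Python and does not use the Maze API: `ToyStructuredPolicy`, `act`, `make_random_policy` and the integer policy IDs are hypothetical names chosen for this example only.

```python
import random
from typing import Callable, Dict, Hashable

# A sub-policy maps an observation dict to a discrete action index.
SubPolicy = Callable[[dict], int]


class ToyStructuredPolicy:
    """Hypothetical stand-in (not a Maze class): one sub-policy per policy ID."""

    def __init__(self, sub_policies: Dict[Hashable, SubPolicy]):
        self.sub_policies = sub_policies

    def act(self, observation: dict, policy_id: Hashable) -> int:
        # Look up the sub-policy responsible for the current sub-step.
        return self.sub_policies[policy_id](observation)


def make_random_policy(n_actions: int, seed: int) -> SubPolicy:
    """Build a sub-policy that samples uniformly from n_actions actions."""
    rng = random.Random(seed)
    return lambda observation: rng.randrange(n_actions)


# Two sub-steps with different action spaces (sizes are assumptions).
policy = ToyStructuredPolicy({
    0: make_random_policy(n_actions=2, seed=0),
    1: make_random_policy(n_actions=5, seed=1),
})
actions = [policy.act({"obs": None}, policy_id=k) for k in (0, 1)]
```

Keying sub-policies by ID keeps the caller's interface identical whether the environment has one sub-step or several.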

Critics:

`StateCritic`	Structured state critic class designed to work with structured environments.
`TorchStateCritic`	Encapsulates multiple torch state critics for training in structured environments.
`TorchSharedStateCritic`	One critic is shared across all sub-steps or actors (the default for standard gym-style environments).
`TorchStepStateCritic`	Each sub-step or actor gets its individual critic.
`TorchDeltaStateCritic`	First sub-step gets a regular critic, subsequent sub-steps predict a delta w.r.t.
`StateActionCritic`	Structured state action critic class designed to work with structured environments.
`TorchStateActionCritic`	Encapsulates multiple torch state action critics for training in structured environments.
`TorchSharedStateActionCritic`	One critic is shared across all sub-steps or actors (the default for standard gym-style environments).
`TorchStepStateActionCritic`	Each sub-step or actor gets its individual critic.
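
The difference between a shared critic and per-step critics can be sketched in a few lines of plain Python. This is not how the Maze classes are implemented: `ToySharedCritic`, `ToyStepCritic`, `predict_values` and `linear_value` are hypothetical names, and the way each toy critic consumes per-sub-step observations is an assumption made purely for illustration.

```python
from typing import Callable, Dict, List

# A value function maps a flat observation vector to a scalar value estimate.
ValueFn = Callable[[List[float]], float]


def linear_value(weights: List[float]) -> ValueFn:
    """Build a toy linear value function (hypothetical helper)."""
    def value(obs: List[float]) -> float:
        return sum(w * x for w, x in zip(weights, obs))
    return value


class ToySharedCritic:
    """One value function evaluates the observation of every sub-step."""

    def __init__(self, value_fn: ValueFn):
        self.value_fn = value_fn

    def predict_values(self, observations: Dict[int, List[float]]) -> Dict[int, float]:
        return {step: self.value_fn(obs) for step, obs in observations.items()}


class ToyStepCritic:
    """Each sub-step has its own, independently parameterised value function."""

    def __init__(self, value_fns: Dict[int, ValueFn]):
        self.value_fns = value_fns

    def predict_values(self, observations: Dict[int, List[float]]) -> Dict[int, float]:
        return {step: self.value_fns[step](obs) for step, obs in observations.items()}


# Observations for two sub-steps, keyed by sub-step ID (values are made up).
observations = {0: [1.0, 2.0], 1: [3.0, 4.0]}

shared = ToySharedCritic(linear_value([1.0, 1.0]))
per_step = ToyStepCritic({0: linear_value([1.0, 0.0]), 1: linear_value([0.0, 1.0])})

shared_values = shared.predict_values(observations)
step_values = per_step.predict_values(observations)
```

In this toy setup the shared critic reuses a single set of weights for both sub-steps, while the per-step variant lets each sub-step weigh its observation differently.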

Models:

`TorchModel`	Base class for any torch model.
`TorchActorCritic`	Encapsulates a structured torch policy and critic for training actor-critic algorithms in structured environments.