Policies, Critics and Agents

This page contains the reference documentation for policies, critics and agents.

maze.core.agent

Policies:

FlatPolicy

Generic flat policy interface.

Policy

Structured policy class designed to work with structured environments.

TorchPolicy

Encapsulates multiple torch policies along with a distribution mapper for training and rollouts in structured environments.

DefaultPolicy

Encapsulates one or more policies identified by policy IDs.

RandomPolicy

Implements a random structured policy.

DummyCartPolePolicy

Dummy structured policy for the CartPole env.

SerializedTorchPolicy

Structured policy used for rollouts of trained models.
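To make the structured-policy idea concrete, here is a minimal, self-contained sketch of a random structured policy that holds one action sampler per sub-step/policy ID, in the spirit of DefaultPolicy and RandomPolicy. The class name, constructor argument, and `compute_action` signature below are illustrative assumptions, not Maze's actual API.

```python
import random

class RandomStructuredPolicy:
    """Sketch: samples a random action for whichever sub-step is active.

    Not the Maze interface -- a conceptual stand-in for a structured
    policy keyed by policy IDs.
    """

    def __init__(self, action_choices_per_step):
        # Per-sub-step action sets, e.g. {0: ["left", "right"], 1: ["noop", "fire"]}
        self.action_choices_per_step = action_choices_per_step

    def compute_action(self, observation, policy_id):
        # The observation is ignored here; a random valid action is
        # returned for the given sub-step / policy ID.
        return random.choice(self.action_choices_per_step[policy_id])

policy = RandomStructuredPolicy({0: ["left", "right"], 1: ["noop", "fire"]})
action = policy.compute_action(observation=None, policy_id=0)
print(action in ["left", "right"])  # True
```

The same keyed-by-ID layout carries over to trained policies: a structured environment queries the policy with the ID of the currently active sub-step.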

Critics:

StateCritic

Structured state critic class designed to work with structured environments.

TorchStateCritic

Encapsulates multiple torch state critics for training in structured environments.

TorchSharedStateCritic

One critic is shared across all sub-steps or actors (the default for standard gym-style environments).

TorchStepStateCritic

Each sub-step or actor gets its own individual critic.

TorchDeltaStateCritic

The first sub-step gets a regular critic; subsequent sub-steps predict a delta w.r.t. the previous sub-step's value.

StateActionCritic

Structured state action critic class designed to work with structured environments.

TorchStateActionCritic

Encapsulates multiple torch state action critics for training in structured environments.

TorchSharedStateActionCritic

One critic is shared across all sub-steps or actors (the default for standard gym-style environments).

TorchStepStateActionCritic

Each sub-step or actor gets its own individual critic.
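The delta-critic variant above can be illustrated with a short sketch: the first sub-step predicts a full state value, and each later sub-step predicts only a delta that is added to the previous sub-step's value. The value functions below are stand-in callables, not Maze's torch modules, and the function name is a hypothetical.

```python
def delta_state_values(observations, value_fns):
    """Sketch of the delta-critic idea: V_0 = f_0(o_0), V_k = V_{k-1} + f_k(o_k)."""
    values = []
    for step, (obs, fn) in enumerate(zip(observations, value_fns)):
        if step == 0:
            values.append(fn(obs))               # regular critic for the first sub-step
        else:
            values.append(values[-1] + fn(obs))  # delta w.r.t. the previous value
    return values

# Example: the first critic predicts 1.0, later critics predict small deltas.
vals = delta_state_values(
    observations=[None, None, None],
    value_fns=[lambda o: 1.0, lambda o: 0.25, lambda o: -0.5],
)
print(vals)  # [1.0, 1.25, 0.75]
```

Predicting deltas keeps later sub-steps' targets small and centered, which can stabilize training when consecutive sub-steps see closely related states.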

Models:

TorchModel

Base class for any torch model.

TorchActorCritic

Encapsulates a structured torch policy and critic for training actor-critic algorithms in structured environments.
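The actor-critic pairing can be sketched as a small container that bundles a policy (actor) with a state critic, which is the role TorchActorCritic plays for training. The class, method names, and callables below are illustrative assumptions, not the actual Maze interface.

```python
class ActorCritic:
    """Sketch: bundles a policy (actor) and a state critic for joint training."""

    def __init__(self, policy_fn, critic_fn):
        self.policy_fn = policy_fn  # maps observation -> action
        self.critic_fn = critic_fn  # maps observation -> state-value estimate

    def step(self, observation):
        # During a rollout the actor picks the action while the critic
        # scores the state, e.g. for an advantage estimate.
        return self.policy_fn(observation), self.critic_fn(observation)

model = ActorCritic(policy_fn=lambda o: "noop", critic_fn=lambda o: 0.0)
action, value = model.step(observation={"pos": 0})
print(action, value)  # noop 0.0
```

In a structured setting the same bundle holds one policy and one critic per sub-step (or shared/delta variants, as listed above), so the trainer can update both from a single rollout.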