Policies, Critics and Agents
This page contains the reference documentation for policies, critics and agents.
maze.core.agent
Policies:
Generic flat policy interface.
Structured policy class designed to work with structured environments.
Encapsulates multiple torch policies along with a distribution mapper for training and rollouts in structured environments.
Encapsulates one or more policies identified by policy IDs.
Implements a random structured policy.
Dummy structured policy for the CartPole env.
Structured policy used for rollouts of trained models.
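The idea of encapsulating one or more policies identified by policy IDs can be illustrated with a minimal sketch in plain Python. This is not the maze.core.agent API: `PolicyRegistry`, `select_action` and `make_random_policy` are hypothetical names chosen for illustration, and observations are assumed to be plain dictionaries.

```python
import random
from typing import Any, Callable, Dict, Hashable, Sequence

# Hypothetical names throughout; this is not the maze.core.agent API.
Observation = Dict[str, Any]
PolicyFn = Callable[[Observation], Any]


class PolicyRegistry:
    """Holds one policy per policy ID and dispatches observations to it."""

    def __init__(self, policies: Dict[Hashable, PolicyFn]) -> None:
        if not policies:
            raise ValueError("at least one policy is required")
        self._policies = dict(policies)

    def select_action(self, observation: Observation, policy_id: Hashable) -> Any:
        """Look up the policy registered for the given ID and query it."""
        try:
            policy = self._policies[policy_id]
        except KeyError:
            raise KeyError(f"unknown policy ID: {policy_id!r}") from None
        return policy(observation)


def make_random_policy(actions: Sequence[Any], seed: int) -> PolicyFn:
    """Return a policy that ignores the observation and samples uniformly."""
    rng = random.Random(seed)
    return lambda observation: rng.choice(actions)


# Policy 0 acts randomly; policy 1 is a hand-written rule on an assumed "angle" key.
registry = PolicyRegistry({
    0: make_random_policy(["left", "right"], seed=0),
    1: lambda observation: "left" if observation["angle"] < 0 else "right",
})
```

In this sketch the caller passes the policy ID explicitly, so the same container can serve a single-policy environment (one ID) and a multi-step environment (one ID per sub-step) without changing its interface.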
Critics:
Structured state critic class designed to work with structured environments.
Encapsulates multiple torch state critics for training in structured environments.
One critic is shared across all sub-steps or actors (the default for standard gym-style environments).
Each sub-step or actor gets its own critic.
The first sub-step gets a regular critic; subsequent sub-steps predict a delta w.r.t.
Structured state-action critic class designed to work with structured environments.
Encapsulates multiple torch state-action critics for training in structured environments.
One critic is shared across all sub-steps or actors (the default for standard gym-style environments).
Each sub-step or actor gets its own critic.
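The difference between a critic shared across all sub-steps and one critic per sub-step can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the Maze critic classes: `shared_values` and `per_step_values` are hypothetical helpers, critics are modelled as plain callables, and observations are assumed to be lists of floats keyed by sub-step.

```python
from typing import Callable, Dict, Hashable, List

# Hypothetical names; a plain-Python illustration, not the Maze critic classes.
ValueFn = Callable[[List[float]], float]
Observations = Dict[Hashable, List[float]]


def shared_values(critic: ValueFn, observations: Observations) -> Dict[Hashable, float]:
    """Apply one critic to the observation of every sub-step."""
    return {step: critic(obs) for step, obs in observations.items()}


def per_step_values(critics: Dict[Hashable, ValueFn],
                    observations: Observations) -> Dict[Hashable, float]:
    """Apply a separate critic to each sub-step, matched by sub-step key."""
    missing = set(observations) - set(critics)
    if missing:
        raise KeyError(f"no critic for sub-steps: {sorted(missing, key=repr)}")
    return {step: critics[step](obs) for step, obs in observations.items()}


# Two sub-steps with toy observations; the critics are stand-in functions.
observations = {0: [1.0, 2.0], 1: [3.0]}
shared = shared_values(sum, observations)
individual = per_step_values({0: sum, 1: lambda obs: 10.0 * obs[0]}, observations)
```

The shared variant needs every sub-step's observation to be compatible with the single critic, whereas the per-step variant allows each sub-step its own input shape at the cost of one set of parameters per sub-step.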
Models:
Base class for any torch model.
Encapsulates a structured torch policy and critic for training actor-critic algorithms in structured environments.