StateCritic

class maze.core.agent.state_critic.StateCritic

Structured state critic class designed to work with structured environments (see StructuredEnv).

It encapsulates state critics and queries them for values according to the provided policy ID.

abstract predict_value(observation: Dict[str, numpy.ndarray], critic_id: Union[int, str]) → torch.Tensor

Query a critic that corresponds to the given ID for the state value.

Parameters

observation – Current observation of the environment
critic_id – The critic id to query

Returns

The value for this observation
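The lookup by critic ID can be pictured with a minimal, dependency-free sketch. It does not use Maze itself: `ToyStateCritic` and the per-critic callables are hypothetical, and plain Python lists and floats stand in for the `numpy.ndarray` observations and the `torch.Tensor` return value.

```python
from typing import Callable, Dict, List, Union

# Stand-ins for illustration only (assumption): lists replace numpy.ndarray,
# and float replaces torch.Tensor.
Observation = Dict[str, List[float]]
CriticID = Union[int, str]


class ToyStateCritic:
    """Hypothetical critic container that mirrors the predict_value signature."""

    def __init__(self, critics: Dict[CriticID, Callable[[Observation], float]]):
        self.critics = critics

    def predict_value(self, observation: Observation, critic_id: CriticID) -> float:
        # Select the critic registered under critic_id and evaluate it.
        return self.critics[critic_id](observation)


critic = ToyStateCritic({
    0: lambda obs: sum(obs["features"]),
    "second": lambda obs: max(obs["features"]),
})
value = critic.predict_value({"features": [1.0, 2.0, 3.0]}, critic_id=0)
```

A real subclass would hold one value network per critic ID and return a tensor; only the dispatch pattern is shown here.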

abstract predict_values(observations: Dict[Union[str, int], Dict[str, numpy.ndarray]]) → Tuple[Dict[Union[str, int], torch.Tensor], Dict[Union[str, int], torch.Tensor]]

Query the critics for the state values of all given observations.

Parameters

observations – Current observations of the environment, keyed by ID

Returns

Tuple containing the values and the detached values
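The shape of the return value, a tuple of two dictionaries keyed like the input, can be sketched as follows. This is a standalone illustration and not Maze code: the free function, the `critics` mapping, and the use of floats in place of `torch.Tensor` are all assumptions. It also assumes that the "detached values" are the same values cut off from the autograd graph, which the source does not spell out.

```python
from typing import Callable, Dict, List, Tuple, Union

# Stand-ins for illustration only (assumption): lists replace numpy.ndarray,
# and float replaces torch.Tensor.
Observation = Dict[str, List[float]]
CriticID = Union[int, str]
Values = Dict[CriticID, float]


def predict_values(
    critics: Dict[CriticID, Callable[[Observation], float]],
    observations: Dict[CriticID, Observation],
) -> Tuple[Values, Values]:
    """Hypothetical helper: evaluate one critic per observation entry."""
    values: Values = {}
    detached_values: Values = {}
    for critic_id, observation in observations.items():
        value = critics[critic_id](observation)
        values[critic_id] = value
        # Assumption: with real tensors this entry would be value.detach().
        # A float carries no gradient graph, so it is stored unchanged.
        detached_values[critic_id] = value
    return values, detached_values


critics = {0: lambda obs: sum(obs["x"]), 1: lambda obs: min(obs["x"])}
values, detached = predict_values(
    critics, {0: {"x": [1.0, 2.0]}, 1: {"x": [4.0, 5.0]}}
)
```

Both dictionaries use the same keys as `observations`, so callers can pair each value with the observation it was computed from.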