Module ilpyt.agents
An agent's role during learning is to coordinate the policy learning and
execution. Here, the policy refers to a function (in this case, a deep neural
network), which maps states to actions. An agent's key functions include a
`step` and `update` function.
To create a custom agent, see `BaseAgent` for more details.
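As a rough illustration of the `step`/`update` split described above, here is a minimal, self-contained sketch. It does not use ilpyt and does not reproduce the `BaseAgent` interface: the class name `TabularAgent`, its constructor arguments, and the `(state, action, reward)` batch format are all hypothetical, and a small preference table stands in for the deep neural network policy.

```python
import random
from typing import Dict, List, Sequence, Tuple


class TabularAgent:
    """Hypothetical toy agent for illustration only; not part of ilpyt."""

    def __init__(self, num_actions: int, lr: float = 0.1, seed: int = 0) -> None:
        self.num_actions = num_actions
        self.lr = lr
        self.rng = random.Random(seed)
        # The "policy" is a table of per-state action preferences,
        # standing in for the neural network mentioned in the text.
        self.preferences: Dict[int, List[float]] = {}

    def _prefs(self, state: int) -> List[float]:
        # Lazily create a zero-initialized preference row for unseen states.
        return self.preferences.setdefault(state, [0.0] * self.num_actions)

    def step(self, state: int) -> int:
        """Policy execution: map a state to an action (greedy, random ties)."""
        prefs = self._prefs(state)
        best = max(prefs)
        candidates = [a for a, p in enumerate(prefs) if p == best]
        return self.rng.choice(candidates)

    def update(self, batch: Sequence[Tuple[int, int, float]]) -> float:
        """Policy learning: move each preference toward the observed reward.

        Returns the mean absolute error over the batch.
        """
        total_error = 0.0
        for state, action, reward in batch:
            prefs = self._prefs(state)
            error = reward - prefs[action]
            prefs[action] += self.lr * error
            total_error += abs(error)
        return total_error / max(len(batch), 1)


# Usage: reward action 1 in state 0, then query the policy for that state.
agent = TabularAgent(num_actions=2)
agent.update([(0, 1, 1.0)])
print(agent.step(0))  # prints 1
```

The point of the sketch is only the division of labor: `step` runs the current policy, while `update` changes it from collected experience. The actual method signatures and required overrides are defined by `BaseAgent`.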
Sub-modules
ilpyt.agents.a2c_agent
-
An implementation of the agent from the Advantage Actor Critic (A2C) algorithm. This algorithm was described in the paper "Asynchronous Methods for …
ilpyt.agents.base_agent
-
BaseAgent is the abstract class for an agent. An agent's role during learning is to coordinate the policy learning and execution. Here, the policy …
ilpyt.agents.dqn_agent
-
An implementation of the agent from the Deep Q-Networks (DQN) algorithm. This algorithm was described in the paper "Human Level Control Through Deep …
ilpyt.agents.gail_agent
-
An implementation of the agent from the Generative Adversarial Imitation Learning (GAIL) algorithm. This algorithm was described in the paper …
ilpyt.agents.gcl_agent
-
The agent from the Guided Cost Learning (GCL) algorithm. This algorithm was described in the paper "Guided Cost Learning: Deep Inverse Optimal …
ilpyt.agents.heuristic_agent
-
Heuristic agents for various OpenAI Gym environments. The agent policies, in this case, are deterministic functions, and often handcrafted or found …
ilpyt.agents.imitation_agent
-
An implementation of a simple behavioral cloning (BC) agent, as in "An Autonomous Land Vehicle in a Neural Network" (ALVINN). The BC algorithm was …
ilpyt.agents.ppo_agent
-
An implementation of the agent from the Proximal Policy Optimization (PPO) algorithm. This algorithm was described in the paper "Proximal Policy …