Regimens
Regimens are training algorithms. At its core, a training algorithm is
responsible for setting up the environment and the agent and successively
calling step on each one, while passing actions and states back and forth
as appropriate. q2’s Regimen class implements this basic functionality and
provides hooks for you to customize any other behaviour as desired.
Run:
    q2 generate regimen my_regimen
to generate a new regimen from a template, then fill in the implementations of whichever event hooks you need.
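As a rough illustration of filling in event hooks, the sketch below overrides three of them. It does not import q2: the `Regimen` base class and the `run` driver are stand-ins written for this example, assuming only the hook names documented on this page. The loop inside q2 itself may be organised differently.

```python
from typing import List


class Regimen:
    """Stand-in for q2.regimens.Regimen (an assumption, not q2's code)."""

    def before_training(self): pass
    def after_training(self): pass
    def before_epoch(self, epoch: int): pass
    def after_epoch(self, epoch: int): pass
    def before_episode(self, episode: int): pass
    def after_episode(self, episode: int): pass
    def before_step(self, step): pass
    def after_step(self, step): pass


def run(regimen: Regimen, epochs: int, episodes: int, steps: int) -> None:
    """Hypothetical driver that fires the hooks in their nesting order."""
    regimen.before_training()
    for epoch in range(epochs):
        regimen.before_epoch(epoch)
        for episode in range(episodes):
            regimen.before_episode(episode)
            for step in range(steps):
                regimen.before_step(step)
                # A real regimen would step the agent and environment here.
                regimen.after_step(step)
            regimen.after_episode(episode)
        regimen.after_epoch(epoch)
    regimen.after_training()


class MyRegimen(Regimen):
    """Overrides only the hooks it needs; the rest stay as no-ops."""

    def __init__(self) -> None:
        self.events: List[str] = []

    def before_training(self):
        self.events.append("training started")

    def after_episode(self, episode: int):
        self.events.append(f"episode {episode} finished")

    def after_training(self):
        self.events.append("training finished")


regimen = MyRegimen()
run(regimen, epochs=1, episodes=2, steps=3)
```

Hooks that are not overridden simply do nothing, so a regimen only needs to implement the events it cares about.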
class q2.regimens.Regimen

    before_training()
    after_training()
        Called once at the start and end of training respectively.
    before_epoch(epoch: int)
    after_epoch(epoch: int)
        Called before and after each epoch.
    before_episode(episode: int)
    after_episode(episode: int)
        Called before and after each episode.
    before_step(step)
    after_step(step)
        Called before and after each step of the environment and agent.
    on_error(step, exception)
        Called when an exception occurs. If this method returns True, propagation of the exception is stopped, which can be useful when certain exceptions are expected to occur.
    plugins() → List[Plugin]
        A list of plugins to be used by your regimen.
    log(msg: str)
        Add a message to the logging output for the current timestep. This method is implemented by q2 and provided as a convenience; you should not override it with your own implementation.
    agent: Agent
    env: Environment
    sess: tf.Session
        The TensorFlow session.
    objective: Objective
    action_space
        The action_space of the environment.
    observation_space
        The observation_space of the environment.
    agent_constructor: Type[Agent]
        Callable that constructs a new agent.
    env_maker: Maker
        Callable that creates a new environment.
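The on_error contract described above can be sketched as follows. Everything here is a stand-in written for illustration: the base class, the `guarded_step` wrapper and the `flaky_step` function are hypothetical, and the sketch assumes that q2 treats any return value other than True as "keep propagating".

```python
class Regimen:
    """Stand-in base class (assumption): errors propagate by default."""

    def on_error(self, step, exception):
        return False


class TolerantRegimen(Regimen):
    """Swallows ValueError, which this example treats as an expected failure."""

    def __init__(self):
        self.skipped = []

    def on_error(self, step, exception):
        if isinstance(exception, ValueError):
            self.skipped.append(step)
            return True  # stop propagation of the expected exception
        return False  # anything else keeps propagating


def guarded_step(regimen, step, do_step):
    """Hypothetical wrapper showing how a training loop could consult on_error."""
    try:
        do_step(step)
    except Exception as exc:
        if not regimen.on_error(step, exc):
            raise


def flaky_step(step):
    """Hypothetical step function that fails on specific steps."""
    if step == 1:
        raise ValueError("expected, recoverable failure")
    if step == 3:
        raise RuntimeError("unexpected failure")


regimen = TolerantRegimen()
for step in range(3):
    guarded_step(regimen, step, flaky_step)
```

With this arrangement the ValueError raised at step 1 is recorded and training continues, while a RuntimeError at step 3 would still surface to the caller.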
Plugins
When implementing your own regimens, you might find that you want to re-use
the same morsels of useful behaviour in multiple different regimens. You can
achieve this by implementing a Plugin. For example, q2 comes with a
DisplayFramerate plugin that lets any regimen display a nice framerate
message in the logs without polluting the core logic of the regimen.
The interface of a plugin is identical to that of a Regimen except that
each method takes a Regimen as its first argument, which it can inspect,
interact with or modify. Note that the plugin event hooks are always called
before the regimen’s hooks, so the regimen always has final say over any
state before the next step is run. You should write your plugins to account
for the possibility that the regimen itself might change some state before
the next event happens.
You can use a plugin by adding it to the list returned by
regimen.plugins().
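A minimal sketch of the plugin mechanism, under stated assumptions: the `Plugin` and `Regimen` classes below are stand-ins rather than q2's own, `StepCounter` is a hypothetical plugin, and `fire_before_step` is an invented dispatcher that mimics the documented ordering (plugin hooks first, then the regimen's hook). Only before_step is shown; the other hooks would follow the same pattern.

```python
from typing import List


class Plugin:
    """Stand-in for q2's Plugin base class (assumption)."""

    def before_step(self, regimen, step): pass


class Regimen:
    """Stand-in for q2.regimens.Regimen (assumption)."""

    def plugins(self) -> List[Plugin]:
        return []

    def before_step(self, step): pass


class StepCounter(Plugin):
    """Hypothetical plugin: receives the regimen as its first argument."""

    def before_step(self, regimen, step):
        regimen.steps_seen += 1
        regimen.trace.append("plugin")


class MyRegimen(Regimen):
    def __init__(self):
        self.steps_seen = 0
        self.trace: List[str] = []

    def plugins(self) -> List[Plugin]:
        # Add the plugin to the list returned by plugins().
        return super().plugins() + [StepCounter()]

    def before_step(self, step):
        # Runs after the plugin hooks, so the regimen has the final say.
        self.trace.append("regimen")


def fire_before_step(regimen: Regimen, step) -> None:
    """Hypothetical dispatcher: plugin hooks first, then the regimen's own."""
    for plugin in regimen.plugins():
        plugin.before_step(regimen, step)
    regimen.before_step(step)


regimen = MyRegimen()
for step in range(2):
    fire_before_step(regimen, step)
```

Because the plugin keeps its state on the regimen it is handed, the same plugin class can be reused across regimens that expose the attributes it expects.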