Playing With Megastep

While the Minimal Environment tutorial is extremely detailed, it’s not particularly fun. This tutorial instead takes an existing interesting environment and asks you to adapt it in specific ways. In solving these tasks, you’ll learn how megastep is structured while hopefully keeping the problem-solving part of your brain from losing interest.

The best way to do these experiments is in a notebook of some sort. In a pinch, you can also use this Google Colab; its main problem is that the font rendering is a bit weird.

At the bottom of the page are some links to resources you might find helpful with these tasks.

Run the Demo

Install megastep and record an untrained demo agent:

from megastep.demo import *
env = explorer.Explorer(1)
agent = Agent(env).cuda()
demo(env=env, agent=agent, length=64)

Swap Out The Geometry

Copy and paste the code for the Minimal env and replace the default toy geometry with cubicasa geometry. Plot it to check that it works:

env = AlteredMinimal()
env.reset()
env.display()

Triangular Geometry

Read the toy geometry code and alter it to write your own triangular-box geometry. Try it out with display():

from megastep import geometry
geometry.display(tri)
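If you want to get the shape right before touching the toy geometry code, here is a minimal sketch of the wall layout for a triangle. It assumes each wall is a pair of (x, y) endpoints in metres; the function name triangle_walls and that layout are made up for illustration, so check the toy geometry source for the format megastep actually expects.

```python
import math

def triangle_walls(side=5.0, margin=1.0):
    """Return three wall segments forming an equilateral triangle.

    Each wall is ((x0, y0), (x1, y1)). The endpoint-pair layout is an
    assumption made for illustration, not megastep's confirmed format.
    """
    height = side * math.sqrt(3) / 2
    corners = [
        (margin, margin),                      # bottom-left
        (margin + side, margin),               # bottom-right
        (margin + side / 2, margin + height),  # apex
    ]
    # Join each corner to the next, wrapping around to close the shape.
    return [(corners[i], corners[(i + 1) % 3]) for i in range(3)]

walls = triangle_walls()
```

The margin keeps the walls away from the origin, which is handy if the surrounding code expects some free space around the geometry.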

Change the Lights

Copy and paste the code for the Minimal env, and replace the scenery() call with your own function. Have this function first call the original scene function, but then modify its baked lighting to be all 1s. Check that it works with display():

from megastep import scene
scene.display(bright)
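The pattern here is "call the original, then overwrite one field". A library-free sketch of that pattern follows. The plain dict, the field names and original_scene are all hypothetical stand-ins; the real scene object holds tensors and its fields may be named differently.

```python
def original_scene():
    # Hypothetical stand-in for the library's scene function: three wall
    # segments with uneven baked lighting values in [0, 1].
    return {"lights": [0.2, 0.7, 0.4], "textures": ["red", "green", "blue"]}

def bright_scene():
    # Build the scene as usual, then overwrite only the baked lighting.
    scene = original_scene()
    scene["lights"] = [1.0 for _ in scene["lights"]]
    return scene

lit = bright_scene()
```

Everything except the lighting passes through untouched, which is what lets you swap the wrapper in for the original call.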

Change the Colours

Copy and paste the code for the Minimal env, and replace the scenery() call with your own function. Have this function first call the original scene function, but then modify its textures to be all white. Check that it works with display():

from megastep import scene
scene.display(white)

Change the Observations

Copy and paste the code for the Minimal env, and replace the RGB observations with a Depth observation. Alter the plot_state method so you can see your change in action:

from megastep.demo import *
env = AlteredMinimal()
agent = Agent(env).cuda()
demo(env=env, agent=agent, length=64)

Custom Observations

Trickier. Write a module like Depth that returns a visualization of indices of the lines it’s looking at. You’ll want to read the Render documentation.

Check it works by copying and pasting the code for the Minimal env, then running it through the demo recorder:

from megastep.demo import *
env = AlteredMinimal()
agent = Agent(env).cuda()
demo(env=env, agent=agent, length=64)

To get this to work, you’ll need to update the plot_state method and the observation space too.

Change the Movement

Copy and paste the code for the Minimal env, and replace the jump-y simple-motion actions with MomentumMovement. Check that it works with the demo recorder:

from megastep.demo import *
env = AlteredMinimal()
agent = Agent(env).cuda()
demo(env=env, agent=agent, length=64)

Custom Movement

Trickier. Write a module like MomentumMovement that teleports the agent in a different direction depending on which action is chosen.

Check it works by copying and pasting the code for the Minimal env, then running it through the demo recorder:

from megastep.demo import *
env = AlteredMinimal()
agent = Agent(env).cuda()
demo(env=env, agent=agent, length=64)
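The core of a teleporting movement module is a lookup from action index to displacement. Here is a sketch of just that lookup in plain Python. The TELEPORTS table and the teleport function are hypothetical; a real module would do this on batched tensors and write the result back to the agents' positions.

```python
# Hypothetical action table: action index -> (dx, dy) jump in metres.
TELEPORTS = {
    0: (0.0, 0.0),   # stay put
    1: (1.0, 0.0),   # jump right
    2: (-1.0, 0.0),  # jump left
    3: (0.0, 1.0),   # jump up
    4: (0.0, -1.0),  # jump down
}

def teleport(positions, actions):
    """Move each agent by the jump its chosen action maps to."""
    moved = []
    for (x, y), action in zip(positions, actions):
        dx, dy = TELEPORTS[action]
        moved.append((x + dx, y + dy))
    return moved

new_positions = teleport([(0.0, 0.0), (2.0, 3.0)], [1, 4])
```

Note that this sketch ignores collisions entirely, so an agent can jump straight through a wall.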

Custom Spawning

Trickier. Write a module like RandomSpawns that spawns the even-numbered agents facing right and the odd-numbered agents facing left.

Check it works by copying and pasting the code for the Minimal env, increasing the number of agents, then resetting and displaying it:

from megastep.demo import *
env = AlteredMinimal()
env.reset()
env.display()
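The even/odd rule itself is a one-liner. This sketch assumes angles are in degrees with 0 meaning "facing right"; that convention and the name spawn_angles are assumptions, so check RandomSpawns for the units and layout megastep uses.

```python
def spawn_angles(n_agents, right=0.0, left=180.0):
    """Even-numbered agents face right, odd-numbered agents face left.

    Angles are in degrees, with 0 meaning "facing right"; that convention
    is an assumption for this sketch.
    """
    return [right if i % 2 == 0 else left for i in range(n_agents)]

angles = spawn_angles(4)
```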

Multiple Actions

Trickier. Copy and paste the code for the Minimal env, and add a second set of actions in parallel to the movement actions that cause the agent to respawn at a random location. Multiple actions are not used in either of the examples, so to point you in the right direction, you want your action space to look like this:

self.action_space = dotdict.dotdict(
    movement=self.movement.space,
    respawn=spaces.MultiDiscrete(self.n_agents, 2))

Check it works by running it through the demo recorder:

from megastep.demo import *
env = AlteredMinimal()
agent = Agent(env).cuda()
demo(env=env, agent=agent, length=64)
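With two action heads, the step logic has to decide how they combine. One possible design, sketched here in plain Python, is to let respawn override movement. The apply_actions function, the 1D positions and the step table are all hypothetical simplifications.

```python
import random

def apply_actions(positions, movement, respawn, rng, size=10.0):
    """Apply a movement action and a respawn action to every agent.

    movement: per-agent index into a hypothetical 1D step table.
    respawn: per-agent flag; 1 means "respawn somewhere random", 0 means keep.
    """
    steps = {0: 0.0, 1: 1.0, 2: -1.0}  # hypothetical movement table
    result = []
    for x, move, flag in zip(positions, movement, respawn):
        if flag == 1:
            # The respawn action overrides movement for this agent.
            result.append(rng.uniform(0.0, size))
        else:
            result.append(x + steps[move])
    return result

rng = random.Random(0)
updated = apply_actions([2.0, 5.0], movement=[1, 2], respawn=[0, 1], rng=rng)
```

Letting respawn win is only one choice; you could equally respawn first and then apply the movement.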

Custom Agents

Trickier. The default Agent in the demo module uses a neural net to control things. Rewrite it to go forward while it’s more than 2m away from the nearest object in its field of vision, and to go backward when it’s less than 1m away.

Test it out on the Explorer env, which has a depth observation by default. You may want to have a look through the source for the Depth module to understand how depth is perceived.

Check it works by running it through the demo recorder:

from megastep.demo import *
env = explorer.Explorer(1)
agent = Agent(env).cuda()
demo(env=env, agent=agent, length=64)
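The decision rule can be prototyped without any neural net. This sketch assumes the depth readings have already been converted to metres, which may not be how the Depth module reports them, and the action indices are hypothetical.

```python
FORWARD, BACKWARD, STAY = 0, 1, 2  # hypothetical action indices

def choose_action(depths, far=2.0, near=1.0):
    """Pick an action from one agent's depth readings, given in metres."""
    nearest = min(depths)
    if nearest > far:
        return FORWARD
    if nearest < near:
        return BACKWARD
    return STAY  # between the thresholds the task leaves behaviour open

actions = [choose_action(d) for d in ([3.0, 2.5, 4.0], [0.4, 3.0], [1.5, 2.2])]
```

Staying still between 1m and 2m is a choice made for this sketch; the task only pins down the behaviour outside that band.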

Watch It Learn

Trickiest. The default demo has a train() function that’ll train an exploration agent in about ten minutes on an RTX 2080 Ti. Check that it works. You can view how training is progressing by opening a second notebook and, after training has begun, running:

from rebar import plots
plots.view()

This is using my entirely undocumented rebar toolkit, which is why I’ve marked it as trickiest. You can watch demos as the agent learns by running:

from megastep.demo import *
demo(run=-1, length=64)

It’ll start off moving randomly, but as the reward improves it’ll get obviously better.

Resources

Here are some tools that might help with these tasks.

API reference

The API reference describes the details of megastep in one place. If you decide you want to alter the FOV of the agents for example, a good way to go about it would be to go to the API reference and Ctrl+F for ‘FOV’.

The API reference also links to the source of the code it’s documenting; if you don’t find the detail you want in the docs themselves, clicking through to the source code will often give you an answer.

Concepts

There are some ideas in megastep - like ‘agents’ - which turn up in too many places to document them again and again every time they’re used. Instead, there’s a Concepts page which gives a brief overview of each of these ideas.

FAQ

The FAQ tries to preempt some common questions. It remains to be seen how good a job I’ve done with it.

Tutorials

If you’re reading this you probably don’t want to read the more in-depth tutorials, but they may still be useful as something to Ctrl+F through when you’re after a specific bit of code.

IPython Help

You can follow any object in an IPython session with ? to get the docs for that object, or ?? to get the source code for that object.

?? will also give you the filepath of the code underlying the object, which is useful for the next bit.
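Outside IPython, the standard library's inspect module gets you roughly the same information. The sketch below uses json.dumps purely as a convenient target; swap in whichever megastep object you're curious about.

```python
import inspect
import json

# Roughly what `json.dumps?` shows: the docstring.
doc = inspect.getdoc(json.dumps)

# Roughly what `json.dumps??` shows: the source and the file it lives in.
source = inspect.getsource(json.dumps)
path = inspect.getsourcefile(json.dumps)
```

This only works for objects defined in Python source; functions implemented in C or CUDA extensions have no source for inspect to return.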

Library Breakpoints

If you’re curious how a library is doing a specific thing and just looking at the code isn’t helpful, you can use ?? to find the path to the code on your system and open that path in an editor. Then you can set a breakpoint anywhere you want! This can be done with breakpoint() in Python 3.7 and later, or import pdb; pdb.set_trace() in earlier versions.

If you’re using autoreload then next time you run the code, you’ll hit the breakpoint. If you’re not using autoreload, you’ll either have to use importlib to manually reload things, or just restart your IPython session.
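For the manual route, a minimal sketch of reloading a module with importlib looks like this. It uses the standard library's json module as a stand-in for whichever library file you edited.

```python
import importlib
import json

# After editing a library file on disk, re-execute the module in place.
# importlib.reload returns the same module object, now updated.
reloaded = importlib.reload(json)
```

Be aware that names you pulled in earlier with from module import name keep pointing at the old objects after a reload, so you may need to re-run those imports too. In IPython, autoreload is switched on with %load_ext autoreload followed by %autoreload 2.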

On top of pdb’s built-in capabilities, I’d also recommend having a look at extract.

Demos

The demo module has examples of two environments and an example of how to train them.

TODO-DOCS Add demos of all of these