I worked on this for my W22 Experimental Mobile Robotics Project. To get a clear idea of what this project is about, read my report.
In summary, we drop our agent in a grid and instruct it, in natural language, to go to one or more tiles selected by colour. To succeed, the agent must learn to reach the tile or tiles named in the instruction and then stop, all within the provided time constraints.
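To make the success condition concrete, here is a minimal sketch of one episode, not the project's actual environment. It assumes a single target colour, a step budget standing in for the time constraint, and a keyword match standing in for the learned language understanding. All names (`ColourGridTask`, `parse_target_colour`, the action strings) are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical action names; the project's real action space may differ.
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
STOP = "stop"


def parse_target_colour(instruction, colours):
    """Return the first known colour word in the instruction.

    A keyword match used only for illustration; the real agent has to
    learn this mapping from the natural-language instruction.
    """
    for word in instruction.lower().split():
        word = word.strip(".,!?")
        if word in colours:
            return word
    raise ValueError("no known colour in instruction")


@dataclass
class ColourGridTask:
    grid: list          # 2D list of colour names, one per tile
    start: tuple        # (row, col) where the agent is dropped
    target_colour: str  # colour named in the instruction
    max_steps: int      # assumed stand-in for the time constraint

    def run(self, actions):
        """Return True only if the agent stops on a target tile in time."""
        row, col = self.start
        for step, action in enumerate(actions):
            if step >= self.max_steps:
                return False  # ran out of time before stopping
            if action == STOP:
                return self.grid[row][col] == self.target_colour
            d_row, d_col = MOVES[action]
            new_row, new_col = row + d_row, col + d_col
            # Moves that would leave the grid keep the agent in place.
            if 0 <= new_row < len(self.grid) and 0 <= new_col < len(self.grid[0]):
                row, col = new_row, new_col
        return False  # the agent never issued a stop action


grid = [["red", "green"], ["blue", "yellow"]]
colours = {"red", "green", "blue", "yellow"}
target = parse_target_colour("Go to the yellow tile.", colours)
task = ColourGridTask(grid, start=(0, 0), target_colour=target, max_steps=5)
reached = task.run(["right", "down", "stop"])
```

Note that reaching the target tile is not enough in this sketch: an episode that passes over the right tile without stopping, or stops on the wrong colour, counts as a failure.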