update website
jakegrigsby committed Nov 5, 2023
1 parent 08b27a9 commit 4cc91be
Showing 15 changed files with 68 additions and 9 deletions.
2 changes: 1 addition & 1 deletion _site/feed.xml
@@ -1 +1 @@
<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.2.2">Jekyll</generator><link href="http://localhost:4000/feed.xml" rel="self" type="application/atom+xml" /><link href="http://localhost:4000/" rel="alternate" type="text/html" /><updated>2023-11-02T23:51:57-05:00</updated><id>http://localhost:4000/feed.xml</id><title type="html">AMAGO</title><subtitle>A simple and scalable agent for sequence-based RL</subtitle></feed>
<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.2.2">Jekyll</generator><link href="http://localhost:4000/feed.xml" rel="self" type="application/atom+xml" /><link href="http://localhost:4000/" rel="alternate" type="text/html" /><updated>2023-11-05T00:59:27-05:00</updated><id>http://localhost:4000/feed.xml</id><title type="html">AMAGO</title><subtitle>A simple and scalable agent for sequence-based RL</subtitle></feed>
37 changes: 33 additions & 4 deletions _site/index.html
@@ -239,20 +239,45 @@ <h4>
</tbody>
</table>

<table border="0" cellspacing="10" cellpadding="0" align="center">
<tbody>
<tr>
<td align="center" valign="middle">
</td>
<a href="./src/figure/combined_metaworld_throughput.png"><img src="./src/figure/combined_metaworld_throughput.png" style="width:100%;" /> </a>
</tr>
</tbody>
</table>




METAWORLD FIGURE HERE


<hr />

<h1 align="center">Adaptive Instruction-Following</h1>

<h4>
<b> An important benefit of off-policy learning is the ability to <i> relabel </i> rewards in hindsight </b>. AMAGO extends <a href="https://arxiv.org/abs/1707.01495">hindsight experience replay</a> to "instructions" or sequences of multiple goals. Relabeling instructions extends the diversity of our dataset and plays to the strengths of data-hungry Transformers while generating automatic exploration curricula for more complex objectives. The combination of AMAGO's relabeling, memory-based adaptation, and long-horizon learning update can be very effective in goal-conditioned generalization tasks.
<b> An important benefit of off-policy learning is the ability to <i> relabel </i> rewards in hindsight </b>. AMAGO extends <a href="https://arxiv.org/abs/1707.01495">hindsight experience replay</a> to "instructions" or sequences of multiple goals. Relabeling instructions extends the diversity of our dataset and plays to the strengths of data-hungry Transformers while generating automatic exploration curricula for more complex objectives. The combination of AMAGO's relabeling, memory-based adaptation, and long-horizon learning update can be very effective in goal-conditioned generalization tasks. We introduce several easily-simulated benchmarks to research this setting, which highlight the importance of AMAGO's technical details:

<br /><br />

As an example, we evaluate instruction-conditioned agents in the procedurally generated worlds of <a href="https://arxiv.org/abs/2109.06780">Crafter</a>. Instructions are strings from a closed vocabulary of Crafter's achievement system, with added goals for navigation and block placement.
<table border="0" cellspacing="10" cellpadding="0" align="center">
<tbody>
<tr>
<td align="center" valign="middle">
</td>
<a href="./src/figure/maze_results_from_ptt.png"><img src="./src/figure/maze_results_from_ppt.png" style="width:100%;" /> </a>
</tr>
</tbody>
</table>


<br /><br />

Finally, we evaluate AMAGO in the procedurally generated worlds of <a href="https://arxiv.org/abs/2109.06780">Crafter</a>. Instructions are strings from a closed vocabulary of Crafter's achievement system, with added goals for navigation and block placement. Our agents need to generalize over thousands of unique goals in previously unseen environments, creating a highly general instruction-following agent.

</h4>


@@ -268,7 +293,9 @@ <h4>


<h4>
Above, we use several single-task instructions to evaluate the exploration capabilities of various ablations. As tasks require more exploration and adaptation to new world layouts, AMAGO's memory and relabeling become essential to success. Multi-step goals require considerable generalization, and AMAGO qualitatively demonstrates a clear understanding of the instruction with sample videos below. <br /><br />
Above, we use several single-task instructions to evaluate the exploration capabilities of various ablations. As tasks require more exploration and adaptation to new world layouts, AMAGO's memory and relabeling become essential to success. Multi-step goals require considerable generalization, and AMAGO qualitatively demonstrates a clear understanding of the instruction. Sample videos are shown below; these tasks are prompted by the user at test-time, and each video represents just one of the thousands of instructions an agent was trained on.

<br /><br />
</h4>

<table border="0" cellspacing="10" cellpadding="0" align="center">
@@ -311,6 +338,8 @@ <h4>
</table>


<br /><br />
Check out our paper for more details and results!

<hr />

Binary file added _site/src/figure/maze_results_from_ppt.png
37 changes: 33 additions & 4 deletions index.markdown
@@ -250,20 +250,45 @@ AMAGO handles meta-learning as a simple extension of zero-shot generalization, a
</tbody>
</table>

<table border="0" cellspacing="10" cellpadding="0" align="center">
<tbody>
<tr>
<td align="center" valign="middle">
</td>
<a href="./src/figure/combined_metaworld_throughput.png"><img src="./src/figure/combined_metaworld_throughput.png" style="width:100%;"> </a>
</tr>
</tbody>
</table>




METAWORLD FIGURE HERE


<hr>

<h1 align="center">Adaptive Instruction-Following</h1>

<h4>
<b> An important benefit of off-policy learning is the ability to <i> relabel </i> rewards in hindsight </b>. AMAGO extends <a href="https://arxiv.org/abs/1707.01495">hindsight experience replay</a> to "instructions" or sequences of multiple goals. Relabeling instructions extends the diversity of our dataset and plays to the strengths of data-hungry Transformers while generating automatic exploration curricula for more complex objectives. The combination of AMAGO's relabeling, memory-based adaptation, and long-horizon learning update can be very effective in goal-conditioned generalization tasks.
<b> An important benefit of off-policy learning is the ability to <i> relabel </i> rewards in hindsight </b>. AMAGO extends <a href="https://arxiv.org/abs/1707.01495">hindsight experience replay</a> to "instructions" or sequences of multiple goals. Relabeling instructions extends the diversity of our dataset and plays to the strengths of data-hungry Transformers while generating automatic exploration curricula for more complex objectives. The combination of AMAGO's relabeling, memory-based adaptation, and long-horizon learning update can be very effective in goal-conditioned generalization tasks. We introduce several easily-simulated benchmarks to research this setting, which highlight the importance of AMAGO's technical details:

<br><br>

As an example, we evaluate instruction-conditioned agents in the procedurally generated worlds of <a href="https://arxiv.org/abs/2109.06780">Crafter</a>. Instructions are strings from a closed vocabulary of Crafter's achievement system, with added goals for navigation and block placement.
<table border="0" cellspacing="10" cellpadding="0" align="center">
<tbody>
<tr>
<td align="center" valign="middle">
</td>
<a href="./src/figure/maze_results_from_ptt.png"><img src="./src/figure/maze_results_from_ppt.png" style="width:100%;"> </a>
</tr>
</tbody>
</table>


<br><br>

Finally, we evaluate AMAGO in the procedurally generated worlds of <a href="https://arxiv.org/abs/2109.06780">Crafter</a>. Instructions are strings from a closed vocabulary of Crafter's achievement system, with added goals for navigation and block placement. Our agents need to generalize over thousands of unique goals in previously unseen environments, creating a highly general instruction-following agent.

</h4>


@@ -279,7 +304,9 @@ As an example, we evaluate instruction-conditioned agents in the procedurally ge


<h4>
Above, we use several single-task instructions to evaluate the exploration capabilities of various ablations. As tasks require more exploration and adaptation to new world layouts, AMAGO's memory and relabeling become essential to success. Multi-step goals require considerable generalization, and AMAGO qualitatively demonstrates a clear understanding of the instruction with sample videos below. <br><br>
Above, we use several single-task instructions to evaluate the exploration capabilities of various ablations. As tasks require more exploration and adaptation to new world layouts, AMAGO's memory and relabeling become essential to success. Multi-step goals require considerable generalization, and AMAGO qualitatively demonstrates a clear understanding of the instruction. Sample videos are shown below; these tasks are prompted by the user at test-time, and each video represents just one of the thousands of instructions an agent was trained on.

<br><br>
</h4>

<table border="0" cellspacing="10" cellpadding="0" align="center">
@@ -322,6 +349,8 @@ Above, we use several single-task instructions to evaluate the exploration capab
</table>


<br><br>
Check out our paper for more details and results!

<hr>

@@ -0,0 +1 @@
I"{"source"=>"/Users/jakegrigsby/amago/src/figure", "destination"=>"/Users/jakegrigsby/amago/src/figure/_site", "collections_dir"=>"", "cache_dir"=>".jekyll-cache", "plugins_dir"=>"_plugins", "layouts_dir"=>"_layouts", "data_dir"=>"_data", "includes_dir"=>"_includes", "collections"=>{"posts"=>{"output"=>true, "permalink"=>"/:categories/:year/:month/:day/:title:output_ext"}}, "safe"=>false, "include"=>[".htaccess"], "exclude"=>[".sass-cache", ".jekyll-cache", "gemfiles", "Gemfile", "Gemfile.lock", "node_modules", "vendor/bundle/", "vendor/cache/", "vendor/gems/", "vendor/ruby/"], "keep_files"=>[".git", ".svn"], "encoding"=>"utf-8", "markdown_ext"=>"markdown,mkdown,mkdn,mkd,md", "strict_front_matter"=>false, "show_drafts"=>nil, "limit_posts"=>0, "future"=>false, "unpublished"=>false, "whitelist"=>[], "plugins"=>[], "markdown"=>"kramdown", "highlighter"=>"rouge", "lsi"=>false, "excerpt_separator"=>"\n\n", "incremental"=>false, "detach"=>false, "port"=>"4000", "host"=>"127.0.0.1", "baseurl"=>nil, "show_dir_listing"=>false, "permalink"=>"date", "paginate_path"=>"/page:num", "timezone"=>nil, "quiet"=>false, "verbose"=>false, "defaults"=>[], "liquid"=>{"error_mode"=>"warn", "strict_filters"=>false, "strict_variables"=>false}, "kramdown"=>{"auto_ids"=>true, "toc_levels"=>[1, 2, 3, 4, 5, 6], "entity_output"=>"as_char", "smart_quotes"=>"lsquo,rsquo,ldquo,rdquo", "input"=>"GFM", "hard_wrap"=>false, "guess_lang"=>true, "footnote_nr"=>1, "show_warnings"=>false}, "livereload_port"=>35729, "serving"=>true, "watch"=>true, "url"=>"http://localhost:4000"}:ET
Binary file added src/figure/_site/amago_fish_icon.png
Binary file added src/figure/_site/case_studies_arxiv_v2.png
Binary file added src/figure/_site/crafter_condensed_results.png
Binary file added src/figure/_site/fig1_iclr_e_notation.png
Binary file added src/figure/_site/maze_results_from_ppt.png
Binary file added src/figure/combined_metaworld_throughput.png
Binary file added src/figure/maze_results_from_ppt.png
