update website

UT-Austin-RPL · Nov 5, 2023 · 4cc91be
1 parent 08b27a9
commit 4cc91be
Showing 15 changed files with 68 additions and 9 deletions.
diff --git a/_site/feed.xml b/_site/feed.xml
@@ -1 +1 @@
-<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.2.2">Jekyll</generator><link href="http://localhost:4000/feed.xml" rel="self" type="application/atom+xml" /><link href="http://localhost:4000/" rel="alternate" type="text/html" /><updated>2023-11-02T23:51:57-05:00</updated><id>http://localhost:4000/feed.xml</id><title type="html">AMAGO</title><subtitle>A simple and scalable agent for sequence-based RL</subtitle></feed>
+<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.2.2">Jekyll</generator><link href="http://localhost:4000/feed.xml" rel="self" type="application/atom+xml" /><link href="http://localhost:4000/" rel="alternate" type="text/html" /><updated>2023-11-05T00:59:27-05:00</updated><id>http://localhost:4000/feed.xml</id><title type="html">AMAGO</title><subtitle>A simple and scalable agent for sequence-based RL</subtitle></feed>
diff --git a/_site/index.html b/_site/index.html
@@ -239,20 +239,45 @@ <h4>
   </tbody>
 </table>
 
+<table border="0" cellspacing="10" cellpadding="0" align="center"> 
+  <tbody>
+   <tr>
+	<td align="center" valign="middle">
+	 <a href="./src/figure/combined_metaworld_throughput.png"><img src="./src/figure/combined_metaworld_throughput.png" style="width:100%;" /> </a>
+    </td>
+   </tr>
+  </tbody>
+</table>
+
+
+
 
-METAWORLD FIGURE HERE
 
 
 <hr />
 
 <h1 align="center">Adaptive Instruction-Following</h1>
 
 <h4>
-<b> An important benefit of off-policy learning is the ability to <i> relabel </i> rewards in hindsight </b>. AMAGO extends <a href="https://arxiv.org/abs/1707.01495">hindsight experience replay</a> to "instructions" or sequences of multiple goals. Relabeling instructions extends the diversity of our dataset and plays to the strengths of data-hungry Transformers while generating automatic exploration curricula for more complex objectives. The combination of AMAGO's relabeling, memory-based adaptation, and long-horizon learning update can be very effective in goal-conditioned generalization tasks. 
+<b>An important benefit of off-policy learning is the ability to <i>relabel</i> rewards in hindsight</b>. AMAGO extends <a href="https://arxiv.org/abs/1707.01495">hindsight experience replay</a> to "instructions", or sequences of multiple goals. Relabeling instructions increases the diversity of our dataset and plays to the strengths of data-hungry Transformers while generating automatic exploration curricula for more complex objectives. The combination of AMAGO's relabeling, memory-based adaptation, and long-horizon learning update can be very effective in goal-conditioned generalization tasks. We introduce several easily simulated benchmarks for studying this setting, which highlight the importance of AMAGO's technical details:
 
 <br /><br />
 
-As an example, we evaluate instruction-conditioned agents in the procedurally generated worlds of <a href="https://arxiv.org/abs/2109.06780">Crafter</a>. Instructions are strings from a closed vocabulary of Crafter's achievement system, with added goals for navigation and block placement.
+<table border="0" cellspacing="10" cellpadding="0" align="center"> 
+  <tbody>
+   <tr>
+	<td align="center" valign="middle">
+	 <a href="./src/figure/maze_results_from_ppt.png"><img src="./src/figure/maze_results_from_ppt.png" style="width:100%;" /> </a>
+    </td>
+   </tr>
+  </tbody>
+</table>
+
+
+<br /><br />
+
+Finally, we evaluate AMAGO in the procedurally generated worlds of <a href="https://arxiv.org/abs/2109.06780">Crafter</a>. Instructions are strings from a closed vocabulary of Crafter's achievement system, with added goals for navigation and block placement. Our agents need to generalize over thousands of unique goals in previously unseen environments, creating a highly general instruction-following agent.
+
 </h4>
 
 
@@ -268,7 +293,9 @@ <h4>
 
 
 <h4>
-Above, we use several single-task instructions to evaluate the exploration capabilities of various ablations. As tasks require more exploration and adaptation to new world layouts, AMAGO's memory and relabeling become essential to success. Multi-step goals require considerable generalization, and AMAGO qualitatively demonstrates a clear understanding of the instruction with sample videos below. <br /><br />
+Above, we use several single-task instructions to evaluate the exploration capabilities of various ablations. As tasks require more exploration and adaptation to new world layouts, AMAGO's memory and relabeling become essential to success. Multi-step goals require considerable generalization, and AMAGO qualitatively demonstrates a clear understanding of the instruction. Sample videos are shown below; these tasks are prompted by the user at test time, and each video represents just one of the thousands of instructions an agent was trained on.
+
+<br /><br />
 </h4>
 
 <table border="0" cellspacing="10" cellpadding="0" align="center"> 
@@ -311,6 +338,8 @@ <h4>
 </table>
 
 
+<br /><br />
+Check out our paper for more details and results!
 
 <hr />
 

diff --git a/_site/src/figure/combined_metaworld_throughput.png b/_site/src/figure/combined_metaworld_throughput.png
diff --git a/_site/src/figure/maze_results_from_ppt.png b/_site/src/figure/maze_results_from_ppt.png
diff --git a/index.markdown b/index.markdown
@@ -250,20 +250,45 @@ AMAGO handles meta-learning as a simple extension of zero-shot generalization, a
   </tbody>
 </table>
 
+<table border="0" cellspacing="10" cellpadding="0" align="center"> 
+  <tbody>
+   <tr>
+	<td align="center" valign="middle">
+	 <a href="./src/figure/combined_metaworld_throughput.png"><img src="./src/figure/combined_metaworld_throughput.png" style="width:100%;"> </a>
+    </td>
+   </tr>
+  </tbody>
+</table>
+
+
+
 
-METAWORLD FIGURE HERE
 
 
 <hr>
 
 <h1 align="center">Adaptive Instruction-Following</h1>
 
 <h4>
-<b> An important benefit of off-policy learning is the ability to <i> relabel </i> rewards in hindsight </b>. AMAGO extends <a href="https://arxiv.org/abs/1707.01495">hindsight experience replay</a> to "instructions" or sequences of multiple goals. Relabeling instructions extends the diversity of our dataset and plays to the strengths of data-hungry Transformers while generating automatic exploration curricula for more complex objectives. The combination of AMAGO's relabeling, memory-based adaptation, and long-horizon learning update can be very effective in goal-conditioned generalization tasks. 
+<b>An important benefit of off-policy learning is the ability to <i>relabel</i> rewards in hindsight</b>. AMAGO extends <a href="https://arxiv.org/abs/1707.01495">hindsight experience replay</a> to "instructions", or sequences of multiple goals. Relabeling instructions increases the diversity of our dataset and plays to the strengths of data-hungry Transformers while generating automatic exploration curricula for more complex objectives. The combination of AMAGO's relabeling, memory-based adaptation, and long-horizon learning update can be very effective in goal-conditioned generalization tasks. We introduce several easily simulated benchmarks for studying this setting, which highlight the importance of AMAGO's technical details:
 
 <br><br>
 
-As an example, we evaluate instruction-conditioned agents in the procedurally generated worlds of <a href="https://arxiv.org/abs/2109.06780">Crafter</a>. Instructions are strings from a closed vocabulary of Crafter's achievement system, with added goals for navigation and block placement.
+<table border="0" cellspacing="10" cellpadding="0" align="center"> 
+  <tbody>
+   <tr>
+	<td align="center" valign="middle">
+	 <a href="./src/figure/maze_results_from_ppt.png"><img src="./src/figure/maze_results_from_ppt.png" style="width:100%;"> </a>
+    </td>
+   </tr>
+  </tbody>
+</table>
+
+
+<br><br>
+
+Finally, we evaluate AMAGO in the procedurally generated worlds of <a href="https://arxiv.org/abs/2109.06780">Crafter</a>. Instructions are strings from a closed vocabulary of Crafter's achievement system, with added goals for navigation and block placement. Our agents need to generalize over thousands of unique goals in previously unseen environments, creating a highly general instruction-following agent.
+
 </h4>
 
 
@@ -279,7 +304,9 @@ As an example, we evaluate instruction-conditioned agents in the procedurally ge
 
 
 <h4>
-Above, we use several single-task instructions to evaluate the exploration capabilities of various ablations. As tasks require more exploration and adaptation to new world layouts, AMAGO's memory and relabeling become essential to success. Multi-step goals require considerable generalization, and AMAGO qualitatively demonstrates a clear understanding of the instruction with sample videos below. <br><br>
+Above, we use several single-task instructions to evaluate the exploration capabilities of various ablations. As tasks require more exploration and adaptation to new world layouts, AMAGO's memory and relabeling become essential to success. Multi-step goals require considerable generalization, and AMAGO qualitatively demonstrates a clear understanding of the instruction. Sample videos are shown below; these tasks are prompted by the user at test time, and each video represents just one of the thousands of instructions an agent was trained on.
+
+<br><br>
 </h4>
 
 <table border="0" cellspacing="10" cellpadding="0" align="center"> 
@@ -322,6 +349,8 @@ Above, we use several single-task instructions to evaluate the exploration capab
 </table>
 
 
+<br><br>
+Check out our paper for more details and results!
 
 <hr>
 

diff --git a/...yll/Cache/Jekyll--Cache/b7/9606fb3afea5bd1609ed40b622142f1c98125abcfe89a76a661b0e8e343910 b/...yll/Cache/Jekyll--Cache/b7/9606fb3afea5bd1609ed40b622142f1c98125abcfe89a76a661b0e8e343910
@@ -0,0 +1 @@
+I"{"source"=>"/Users/jakegrigsby/amago/src/figure", "destination"=>"/Users/jakegrigsby/amago/src/figure/_site", "collections_dir"=>"", "cache_dir"=>".jekyll-cache", "plugins_dir"=>"_plugins", "layouts_dir"=>"_layouts", "data_dir"=>"_data", "includes_dir"=>"_includes", "collections"=>{"posts"=>{"output"=>true, "permalink"=>"/:categories/:year/:month/:day/:title:output_ext"}}, "safe"=>false, "include"=>[".htaccess"], "exclude"=>[".sass-cache", ".jekyll-cache", "gemfiles", "Gemfile", "Gemfile.lock", "node_modules", "vendor/bundle/", "vendor/cache/", "vendor/gems/", "vendor/ruby/"], "keep_files"=>[".git", ".svn"], "encoding"=>"utf-8", "markdown_ext"=>"markdown,mkdown,mkdn,mkd,md", "strict_front_matter"=>false, "show_drafts"=>nil, "limit_posts"=>0, "future"=>false, "unpublished"=>false, "whitelist"=>[], "plugins"=>[], "markdown"=>"kramdown", "highlighter"=>"rouge", "lsi"=>false, "excerpt_separator"=>"\n\n", "incremental"=>false, "detach"=>false, "port"=>"4000", "host"=>"127.0.0.1", "baseurl"=>nil, "show_dir_listing"=>false, "permalink"=>"date", "paginate_path"=>"/page:num", "timezone"=>nil, "quiet"=>false, "verbose"=>false, "defaults"=>[], "liquid"=>{"error_mode"=>"warn", "strict_filters"=>false, "strict_variables"=>false}, "kramdown"=>{"auto_ids"=>true, "toc_levels"=>[1, 2, 3, 4, 5, 6], "entity_output"=>"as_char", "smart_quotes"=>"lsquo,rsquo,ldquo,rdquo", "input"=>"GFM", "hard_wrap"=>false, "guess_lang"=>true, "footnote_nr"=>1, "show_warnings"=>false}, "livereload_port"=>35729, "serving"=>true, "watch"=>true, "url"=>"http://localhost:4000"}:ET
diff --git a/src/figure/_site/amago_fish_icon.png b/src/figure/_site/amago_fish_icon.png
diff --git a/src/figure/_site/case_studies_arxiv_v2.png b/src/figure/_site/case_studies_arxiv_v2.png
diff --git a/src/figure/_site/combined_metaworld_throughput.png b/src/figure/_site/combined_metaworld_throughput.png
diff --git a/src/figure/_site/crafter_condensed_results.png b/src/figure/_site/crafter_condensed_results.png
diff --git a/src/figure/_site/fig1_iclr_e_notation.png b/src/figure/_site/fig1_iclr_e_notation.png
diff --git a/src/figure/_site/maze_results_from_ppt.png b/src/figure/_site/maze_results_from_ppt.png
diff --git a/src/figure/_site/popgym_summary_expanded_outliers.png b/src/figure/_site/popgym_summary_expanded_outliers.png
diff --git a/src/figure/combined_metaworld_throughput.png b/src/figure/combined_metaworld_throughput.png
diff --git a/src/figure/maze_results_from_ppt.png b/src/figure/maze_results_from_ppt.png