<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>CVPR'24 Tutorial on 3D/4D Generation and Modeling with Generative Priors</title>
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.2.0/css/bootstrap.min.css">
<link href="css/materialize.css" type="text/css" rel="stylesheet" media="screen,projection"/>
<link href='https://fonts.googleapis.com/css?family=Lato:400,700' rel='stylesheet' type='text/css'>
<link href="css/style.css" rel="stylesheet" type="text/css" />
</head>
<body id="page-top">
<div class="navbar-fixed" >
<nav class="teal lighten-2" role="navigation">
<div class="nav-wrapper" >
<ul class="center hide-on-med-and-down nav navbar-nav navbar-center">
<li><a class="page-scroll" href="#page-top" style="color:#CD853F;font-size:20px">Home</a></li>
<li><a class="page-scroll" href="#overview" style="color:#CD853F;font-size:20px">Overview</a></li>
<li><a class="page-scroll" href="#organizer" style="color:#CD853F;font-size:20px">Organizer</a></li>
<li><a class="page-scroll" href="#schedule" style="color:#CD853F;font-size:20px">Program</a></li>
<li><a class="page-scroll" href="#speaker" style="color:#CD853F;font-size:20px">Speaker</a></li>
</ul>
</div>
</nav>
</div>
<div class="container">
<table border="0" align="center">
<tr>
<td width="700" align="center" valign="middle"><h3>CVPR 2024 Tutorial on</h3>
<span class="title">3D/4D Generation and Modeling with Generative Priors</span></td>
</tr>
</table>
<h3 align="center"> <b>Date:</b> Tuesday, June 18th 8:30 a.m. PDT - noon PDT. </h3>
<h3 align="center"> <b>Location:</b> Summit 440-441</h3>
<!--h3 colspan="3" align="center"><br> Slides and recorded videos will be provided on this webpage.</h3-->
<!--h3 colspan="3" align="center"><br>The tutorial can be accessed at: <a a href=https://ohyay.co/s/cvpr-tutorial-on-unlocking-creativity> this URL </a>.
<br>
Anyone can join! </h3-->
<!-- <p><img src="figures/teaser.jpg" width="1000" align="middle" /></p> -->
</div>
<br />
<div class="container" id="recorded-video">
<h2>Recorded Video</h2>
<div>
<div class="text-center">
<div style="position:relative;padding-top:56.25%;">
<iframe style="position:absolute;top:0;left:0;width:100%;height:100%;"
src="https://www.youtube.com/embed/QA5vxU5KxUc?si=B3WwuzrulGemSVD4" title="YouTube video player" frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
allowfullscreen></iframe>
</div>
</div>
<p></p>
</div>
</div>
<br />
<div class="container" id="overview">
<h2>Overview</h2>
<div class="overview">
<br />
<div class="media-container">
<img src="media/GTR.gif" class="media-item" alt="GTR">
<img src="media/scenetex.gif" class="media-item" alt="scenetex">
<img src="media/4real.gif" class="media-item" alt="4real">
</div>
<p>In the ever-expanding metaverse, where the physical and digital worlds seamlessly merge, the ability to capture, represent, and analyze three-dimensional structures is crucial. Advancements in 3D and 4D generation technologies have transformed gaming, augmented reality (AR), and virtual reality (VR), offering unprecedented immersion and interaction. Bridging the gap between reality and virtuality, 3D modeling enables realistic simulations, immersive gaming experiences, and AR overlays. Adding the temporal dimension enhances these experiences further, enabling lifelike animations, object tracking, and understanding of complex spatiotemporal relationships, reshaping digital interactions in entertainment, education, and beyond.</p>
<p>Traditionally, 3D generation involved directly manipulating 3D data and attempting to recover 3D details from 2D data.
Recent breakthroughs in 2D diffusion models have significantly improved 3D generation.
Methods using 2D priors from diffusion models have emerged, enhancing the quality and diversity of 3D asset generation.
These methods range from inpainting-based approaches and optimization-based techniques such as Score Distillation Sampling (SDS) to recent feed-forward generation using multi-view images as an auxiliary medium.</p>
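<p>As a rough illustration of the optimization-based route, a single SDS update can be sketched as follows. This is a minimal PyTorch-style sketch, not the tutorial's reference code; <code>render</code>, <code>unet</code>, <code>text_emb</code>, and <code>alphas_cumprod</code> are hypothetical placeholders for a differentiable renderer, a frozen noise-prediction diffusion model, a text embedding, and a noise schedule.</p>
<pre><code>import torch

def sds_step(params, render, unet, text_emb, camera, alphas_cumprod, optimizer):
    # Render a view of the 3D representation; gradients flow through the renderer.
    x = render(params, camera)
    # Sample a diffusion timestep and noise, then forward-diffuse the render.
    t = torch.randint(20, 980, (1,), device=x.device)
    eps = torch.randn_like(x)
    a_t = alphas_cumprod[t].view(-1, 1, 1, 1)
    x_t = a_t.sqrt() * x + (1 - a_t).sqrt() * eps
    # Query the frozen 2D diffusion prior for its noise prediction.
    with torch.no_grad():
        eps_hat = unet(x_t, t, text_emb)
    # SDS gradient w.r.t. the rendered image: weighted noise residual,
    # with no backpropagation through the diffusion U-Net itself.
    grad = (1 - a_t) * (eps_hat - eps)
    # Surrogate loss whose gradient w.r.t. x equals grad; backward() then
    # chains through the renderer into the 3D parameters.
    loss = (grad.detach() * x).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
</code></pre>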
<p>On the other hand, challenges persist in extending 3D asset generation to scenes and in mitigating biases in 2D priors for realistic synthesis in real-world settings. Addressing these issues, our tutorial delves into 3D scene generation, exploring techniques for diverse scene scales, compositionality, and realism.
Finally, we also cover recent advancements in 4D generation using image and video models as priors, crucial for applications such as augmented reality.
Attendees will gain insights into various paradigms of 3D/4D generation, from training on 3D data to leveraging 2D diffusion model knowledge, resulting in a comprehensive understanding of contemporary 3D modeling approaches.</p>
<p>In conclusion, our tutorial provides a comprehensive exploration of 3D/4D generation and modeling, covering everything from fundamental techniques to cutting-edge advancements. By navigating the intricacies of scene-level generation and leveraging 2D priors for enhanced realism, attendees will emerge equipped with a nuanced understanding of the evolving landscape of 3D modeling in the metaverse era.</p>
</div>
</div>
<br />
<div class="container" id="organizer">
<h2>Organizers</h2>
<div>
<div class="instructor">
<a href="http://hsinyinglee.com/">
<div class="instructorphoto"><img src="figures/hsin.png"></div>
<div>Hsin-Ying Lee<br>Creative Vision, Snap Research</div>
</a>
</div>
<div class="instructor">
<a href="https://payeah.net/">
<div class="instructorphoto"><img src="figures/peiye.png"></div>
<div>Peiye Zhuang<br>Creative Vision, Snap Research</div>
</a>
</div>
<div class="instructor">
<a href="https://mightychaos.github.io/">
<div class="instructorphoto"><img src="figures/chaoyang.png"></div>
<div> Chaoyang Wang <br>Creative Vision, Snap Research</div>
</a>
</div>
</div>
<p></p>
</div>
<br />
<div class="container" id="schedule">
<h2>Program</h2>
<table class="program">
<tr>
<td width="70%">
<p style="font-size:20px"> <b>Introduction</b> </a> </p>
</td>
<td width="20%"><em>Hsin-Ying Lee</em></td>
<td width="10%"><b>08:30 - <br /> 08:40</b></td>
<td width="10%">
<a href="https://drive.google.com/file/d/1ba86ESYabbklphfIkkF5Izh8iz2Cwxrm/view?usp=sharing">PDF</a>
</td>
</tr>
<tr>
<td width="70%">
<p style="font-size:20px"> <b>3D Generation w/o Large-Scale 2D Priors</b> </a> </p>
Introducing conventional ways of
training 3D generation models using 2D and 3D data without large-scale image and video diffusion models.
</td>
<td width="20%"><em>Hsin-Ying Lee</em></td>
<td width="10%"><b>08:40 - <br /> 09:00</b></td>
<td width="10%">
<a href="https://drive.google.com/file/d/1ItJvdwAX6gryPAohSNc6cguUrrBbuo-o/view?usp=sharing">PDF</a>
</td>
</tr>
<tr>
<td>
<p style="font-size:20px"> <b>Bridging 2D and 3D: From Optimization to Feedforward </b> </p>
Introducing two ways of performing 3D generation with the help of large-scale 2D diffusion models:
optimization-based methods that distill knowledge via Score Distillation Sampling (SDS) and its variants, and feedforward methods that leverage multi-view image generation.
</td>
<td><em>Peiye Zhuang</em></td>
<td><b>09:10 - <br /> 10:00</b></td>
<td width="10%">
<a href="https://drive.google.com/file/d/10Y9Swd1muMocNgNX64ApH-ntFblj0XCc/view?usp=sharing">PDF</a>
</td>
</tr>
<tr>
<td>
<p style="font-size:20px"> <b>3D Scene Generation</b> </p>
Introducing the recent advances and challenges in 3D scene generation.
</td>
<td><em>Hsin-Ying Lee</em></td>
<td><b>10:10 - <br /> 10:40</b></td>
<td width="10%">
<a href="https://drive.google.com/file/d/1ZeR5yvU5s9HnYr_cDfx-CMiLZ4D3JLVO/view?usp=sharing">PDF</a>
</td>
</tr>
<tr>
<td>
<p style="font-size:20px"> <b>4D Generation and Reconstruction </b>
</p> Introducing recent advancements in 4D generation as well as generation via reconstruction.
</td>
<td><em>Chaoyang Wang</em></td>
<td><b>10:50 - <br /> 11:35</b></td>
<td width="10%">
<a href="https://drive.google.com/file/d/1vdr4fGamoQq-l6wByUV14cl5a3s-sNmP/view?usp=sharing">PDF</a>
</td>
</tr>
<tr>
<td>
<p style="font-size:20px"> <b>Closing Remarks</b></p>
</td>
<td><em>Hsin-Ying Lee</em></td>
<td><b>11:35 - <br /> 11:45</b></td>
</tr>
</table>
</div>
<br />
<div class="container" id='speaker'>
<h2>About the Speakers</h2>
<div class="schedule">
<p><b>Hsin-Ying Lee</b> is a Senior Research Scientist in the Creative Vision team at Snap Research.
His research focuses on content generation, specifically image/video/3D/4D generation and manipulation.
He has published 50+ papers in top conferences and journals.
Hsin-Ying received his Ph.D. from the University of California, Merced.
Before joining Snap Inc., Hsin-Ying interned at Google and Nvidia. </p>
<p><b>Peiye Zhuang </b> is a Research Scientist in the Creative Vision group at Snap Research.
Her research focuses on foundation generative models and various content creation applications,
including 2D/3D/video generation and editing. Before joining Snap, Peiye received her Ph.D. in Computer Science
from the University of Illinois at Urbana-Champaign (UIUC) in 2023. She also spent time at Stanford University and interned
at Apple, Google Brain, Facebook (now Meta), and Adobe.
</p>
<p><b>Chaoyang Wang </b> is a Research Scientist in the Creative Vision group at Snap Research.
His research focuses on 3D/4D reconstruction and its application for photo-realistic novel view synthesis and
content generation. He received his Ph.D. from the Robotics Institute at Carnegie Mellon University.
Before joining Snap Inc., Chaoyang interned at Nvidia, Adobe, Microsoft, and Argo AI.
</p>
</div>
</div>
<br />
<div class="containersmall">
<p>Please contact <a href="mailto:[email protected]">Hsin-Ying Lee</a> if you have any questions. The webpage template is courtesy of the awesome <a href="https://gkioxari.github.io/">Georgia</a>.</p>
</div>
<!--<p align="center" class="acknowledgement">Last updated: Jan. 6, 2017</p>-->
</body>
</html>