<!DOCTYPE html>
<html lang="en">
<head>
<!-- Required meta tags -->
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<!-- Bootstrap CSS -->
<link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css"
integrity="sha384-ggOyR0iXCbMQv3Xipma34MD+dH/1fQ784/j6cY/iJTQUOhcWr7x9JvoRxT2MZw1T" crossorigin="anonymous">
<title>3D Bird Reconstruction: A Dataset, Model, and Shape Recovery from a Single View</title>
<!-- Global site tag (gtag.js) - Google Analytics -->
<script async src="https://www.googletagmanager.com/gtag/js?id=UA-173639674-1"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());
gtag('config', 'UA-173639674-1');
</script>
</head>
<body class="container" style="max-width:920px">
<!-- Title -->
<div>
<div class='row mt-5 mb-1'>
<div class='col text-center'>
<p class="h2 font-weight-normal">3D Bird Reconstruction</p>
</div>
</div>
<div class='row mt-1 mb-5'>
<div class='col text-center'>
<p class="h3 font-weight-normal">A Dataset, Model, and Shape Recovery from a Single View</p>
</div>
</div>
<!-- authors -->
<div class='row text-center h5 font-weight-bold mb-4'>
<a class="col-md-3 col-xs-6" href="https://www.ocf.berkeley.edu/~badger/" target="_blank"><span>Marc Badger</span></a>
<a class="col-md-3 col-xs-6" href="https://yufu-wang.github.io" target="_blank"><span>Yufu Wang</span></a>
<a class="col-md-3 col-xs-6" href="https://www.seas.upenn.edu/~adarshm" target="_blank"><span>Adarsh Modh</span></a>
<a class="col-md-3 col-xs-6" href="https://aperkes.github.io/" target="_blank"><span>Ammon Perkes</span></a>
<a class="col-md-3 col-xs-6" href="https://www.seas.upenn.edu/~nkolot" target="_blank"><span>Nikos Kolotouros</span></a>
<a class="col-md-3 col-xs-6" href="http://pfrommer.us" target="_blank"><span>Bernd G. Pfrommer</span></a>
<a class="col-md-3 col-xs-6" href="https://web.sas.upenn.edu/marcschmidtlab/pages/people/" target="_blank"><span>Marc F. Schmidt</span></a>
<a class="col-md-3 col-xs-6" href="https://www.cis.upenn.edu/~kostas" target="_blank"><span>Kostas Daniilidis</span></a>
</div>
<!-- affiliations -->
<div class='row mt-1 mt-2' >
<div class='col text-center'>
<p class="h5 font-weight-light">
<a class="mr-4 ml-4" href="https://www.upenn.edu/" target="_blank"><span>University of Pennsylvania</span></a>
</p>
</div>
</div>
<div class='row mt-5'>
<table align=center width=90%>
<tr>
<td >
<center>
<video width=100% src="files/method.mp4" type="video/mp4" autoplay muted loop></video>
<!-- <img width=100% src="files/method.gif" type="image/gif"/> -->
</center>
</td>
</tr>
<tr>
<td width=80%>
<center>
<span style="font-size:14px"><i>We estimate the 3D pose and shape of birds from a single view. Given a detection and associated bounding box, we predict body keypoints and a mask. We then predict the parameters of an articulated avian mesh model, which provides a good initial estimate for optional further optimization.</i></span>
</center>
</td>
</tr>
</table>
</div>
<!-- Paper section -->
<div>
<hr>
<div class='row'>
<div class='col-md-3 col-sm-3 col-xs-12 text-center'>
<div class="row mt-4">
<a href="files/3d_birds_singleview.pdf" target="_blank" style="max-width:200px; margin-left:auto; margin-right:auto">
<img src="files/paper.png" alt="paper-snapshot" class="img-thumbnail" width="80%" style="box-shadow: 10px 10px 5px grey;">
</a>
</div>
<div class="row mt-4">
<div class="col">
<a class="h5" href="https://arxiv.org/abs/2008.06133" target="_blank" style="margin-right:10px">
<span>[arXiv]</span>
</a>
<a class="h5" href="files/3d_birds_singleview-supp.pdf" target="_blank" style="margin-right:10px">
<span>[Supplementary]</span>
</a>
<a class="h5" href="https://github.com/marcbadger/avian-mesh" target="_blank" style="margin-right:10px">
<span>[Code]</span>
</a>
<a class="h5" href="files/badger2020.bib" target="_blank">
<span>[Bibtex]</span>
</a>
</div>
</div>
</div>
<div class='col-md-9 col-sm-9 col-xs-12'>
<p class='h4 font-weight-bold '>Abstract</p>
<p>
Automated capture of animal pose is transforming how we study neuroscience and social behavior.
Movements carry important social cues, but current methods are not able to robustly estimate pose and shape of animals, particularly for social animals such as birds, which are often occluded by each other and objects in the environment.
To address this problem, we first introduce a model and multi-view optimization approach, which we use to capture the unique shape and pose space displayed by live birds.
We then introduce a pipeline and experiments for keypoint, mask, pose, and shape regression that recovers accurate avian postures from single views.
Finally, we provide extensive multi-view keypoint and mask annotations collected from a group of 15 social birds housed together in an outdoor aviary.
</p>
</div>
</div>
</div>
<!-- Overview -->
<div>
<hr>
<div class='row text-center'>
<div class='col'>
<p class='h2'>Overview</p>
</div>
</div>
<div class='row mt-3'>
<div class='col'>
<center>
<video controls width=80% poster="files/video_thumbnail.png" src="files/Badger_2897_short.mp4" type="video/mp4"></video>
</center>
</div>
</div>
</div>
<!-- Dataset and Approach -->
<div>
<hr>
<div class='row text-center'>
<div class='col'>
<p class='h2'>Dataset</p>
</div>
</div>
<div class='row mt-3'>
<table align=center width=99%>
<tr>
<td>
<center>
<video width=100% src="files/timelapse.mp4" type="video/mp4" autoplay muted loop></video>
</center>
</td>
</tr>
<tr>
<td width=80%>
<center>
<span style="font-size:14px"><i>Our dataset captures the social interactions of 15 cowbirds housed together in an outdoor aviary over the course of a three-month mating season.</i></span>
</center>
</td>
</tr>
</table>
</div>
<div class='row mt-5'>
<table align=center width=99%>
<tr>
<td>
<center>
<img width=100% src="files/dataset_w_mask.png" type="image/png"/>
</center>
</td>
</tr>
<tr>
<td width=80%>
<center>
<span style="font-size:14px"><i>We provide multi-view segmentation masks for over 6300 bird instances, keypoints for 1000 bird instances, an articulated 3D mesh model of a bird, and a full pipeline for recovering the shape and pose of birds from single views. See our <a href="https://github.com/marcbadger/avian-mesh">code</a> for details.</i></span>
</center>
</td>
</tr>
</table>
</div>
<div class='row mt-5'>
<table align=center width=99%>
<tr>
<td>
<div class = 'row'>
<div class='col-md-6 col-sm-6 col-xs-12 mt-1'>
<img width=100% src="files/mask_instance.png" type="image/png"/>
</div>
<div class='col-md-6 col-sm-6 col-xs-12 mt-1'>
<video width=100% src="files/mask_predictions.mp4" type="video/mp4" autoplay muted loop></video>
</div>
</div>
</td>
</tr>
<tr>
<td width=80%>
<center>
<span style="font-size:14px"><i>We detect bird instances using a Mask R-CNN pretrained on COCO instance segmentation, which we fine-tune for birds. Frame-by-frame predictions are stable across time, and we achieve excellent generalization to unseen days and across seasons.</i></span>
</center>
</td>
</tr>
</table>
</div>
<div class='row mt-5'>
<table align=center width=99%>
<tr>
<td>
<center>
<video width=100% src="files/multiview.mp4" type="video/mp4" autoplay muted loop></video>
</center>
</td>
</tr>
<tr>
<td width=80%>
<center>
<span style="font-size:14px"><i>We fit our avian mesh to the annotated multi-view dataset and extract distributions for shape and pose of birds in the aviary.</i></span>
</center>
</td>
</tr>
</table>
</div>
</div>
<!-- Results -->
<div>
<hr>
<div class='row text-center'>
<div class='col'>
<p class='h2'>Results</p>
</div>
</div>
<div class='text-left'>
<p>Our single-view pipeline (shown at the top) produces good qualitative fits across a variety of poses, including asymmetric, stretched, and puffed postures, and across a variety of viewpoints, including views from the front, back, sides, and below. Each panel shows the input image and the output mesh.</p>
</div>
<div class='row'>
<table align=center width=99%>
<tr>
<td>
<center>
<img width=100% src="files/pipeline_results.png" type="image/png"/>
</center>
</td>
</tr>
</table>
</div>
<div class='mt-5 text-left'>
<p>Our mesh, pose regression networks, and single-view optimization procedure generalize to similar bird species in CUB-200 using distributions of shape and pose extracted from our multi-view dataset.</p>
</div>
<div class="row mb-4">
<table align=center width=99%>
<tr>
<td>
<center>
<img width=100% src="files/cub200_row1.png" type="image/png"/>
</center>
</td>
</tr>
<tr>
<td>
<center>
<div class="row">
<div class="col-4">
<span style="font-size:14px"><i>Red-winged Blackbird</i></span>
</div>
<div class="col-4">
<span style="font-size:14px"><i>Painted Bunting</i></span>
</div>
<div class="col-4">
<span style="font-size:14px"><i>Rose-breasted Grosbeak</i></span>
</div>
</div>
</center>
</td>
</tr>
</table>
</div>
<div class="row mb-4">
<table align=center width=99%>
<tr>
<td>
<center>
<img width=100% src="files/cub200_row2.png" type="image/png"/>
</center>
</td>
</tr>
<tr>
<td>
<center>
<div class="row">
<div class="col-4">
<span style="font-size:14px"><i>European Goldfinch</i></span>
</div>
<div class="col-4">
<span style="font-size:14px"><i>Purple Finch</i></span>
</div>
<div class="col-4">
<span style="font-size:14px"><i>Yellow-headed Blackbird</i></span>
</div>
</div>
</center>
</td>
</tr>
</table>
</div>
</div>
<!-- Youtube Video -->
<div>
<hr>
<div class='row text-center'>
<div class='col'>
<p class='h2 mr-3'>Video</p>
</div>
</div>
<div class='row mt-3 text-center center-block' style=" margin-left:auto; margin-right:auto">
<div class='col ml-1 mr-1' style="position: relative; width: 100%;height: 0;padding-bottom: 56%;">
<iframe
src="https://www.youtube.com/embed/M_tVMaj33pg"
frameborder="0"
allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen
style="position: absolute;width: 100%;height: 100%; left: 0; top: 0;"
>
</iframe>
</div>
</div>
</div>
<!-- Ack -->
<div>
<hr>
<div class='row mb-5 text-center'>
<div class='col'>
<p class='h2'>Acknowledgements</p>
<div class='text-left'>
<p>
We thank the diligent annotators in the Schmidt Lab, Kenneth Chaney for compute resources, and Stephen Phillips for helpful discussions. We gratefully acknowledge support through the following grants: NSF-IOS-1557499, NSF-IIS-1703319, NSF MRI 1626008, NSF TRIPODS 1934960.
</p>
<p>
The design of this project page is based on the <a href="https://www.guandaoyang.com/PointFlow/" target="_blank">PointFlow project page</a>.
</p>
</div>
</div>
</div>
</div>
</div>
</body>
</html>