diff --git a/README.md b/README.md
new file mode 100644
index 0000000..716fdb6
--- /dev/null
+++ b/README.md
@@ -0,0 +1,24 @@
+# Occam's LGS: A Simple Approach for Language Gaussian Splatting
+
+[![arXiv](https://img.shields.io/badge/arXiv-xxxx.xxxxx-b31b1b.svg)]()
+[![Project Page](https://img.shields.io/badge/Project-Page-blue)]()
+
+This is the official implementation of "Occam's LGS: A simple approach for Language Gaussian Splatting".
+
+## Overview
+
+Occam's LGS is a simple, training-free approach for Language-guided 3D Gaussian Splatting that achieves state-of-the-art results with a 100x speed improvement. Our method:
+
+- 🎯 Lifts 2D language features to 3D Gaussian Splats without complex modules or training
+- 🚀 Provides 100x faster optimization compared to existing methods
+- 🧩 Works with any feature dimension without compression
+- 🎨 Enables easy scene manipulation and object insertion
+
+## Key Features
+
+- Training-free global optimization approach
+- Direct reasoning in language feature space
+- Support for arbitrary language feature dimensionality
+- Fast processing time (~15s runtime)
+- Compatible with SAM+CLIP features
+- Includes tools for object insertion and scene manipulation
\ No newline at end of file
diff --git a/docs/index.html b/docs/index.html
new file mode 100644
index 0000000..f1d9180
--- /dev/null
+++ b/docs/index.html
@@ -0,0 +1,523 @@
+
+
+
+ + + + +Teatime
+*Visualization: SAM+CLIP features, reduced to three dimensions via the LangSplat autoencoder for visualization purposes, then uplifted to 3D
+
+
+ In this work, we show that the sophisticated techniques for language-grounded 3D Gaussian Splatting are simply
+ unnecessary. Instead, we apply Occam's razor to the task at hand and perform weighted multi-view feature aggregation
+ using the weights derived from the standard rendering process, followed by a simple heuristic-based noisy Gaussian
+ filtration. Doing so offers us state-of-the-art results with a speed-up of two orders of magnitude. We showcase our
+ results in two commonly used benchmark datasets: LERF and 3D-OVS. Our simple approach allows us to perform reasoning directly
+ in the language features, without any compression whatsoever. Such modeling in turn offers easy scene manipulation, unlike the
+ existing methods -- which we illustrate using an application of object insertion in the scene. Furthermore, we provide a thorough
+ discussion regarding the significance of our contributions within the context of the current literature. Our source code will be made publicly available.
+
+
+ Overview: Occam's LGS consists of three stages: (1) forward rendering with 3D Gaussian Splatting to obtain the opacity α,
+ projected positions x_i' and pixels p_i; (2) weighted aggregation of multi-view semantic features
+ via alpha blending; and (3) filtering of invisible Gaussians.
+
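The aggregation and filtering stages described in the overview can be illustrated with a minimal NumPy sketch. It is not the released implementation. It assumes the forward rendering pass has already produced, for every (pixel, Gaussian) contribution across all views, the Gaussian index, its alpha-blending weight, and the 2D language feature at that pixel. The function name `aggregate_features`, its arguments, and the `min_weight` visibility threshold are hypothetical and chosen only for illustration.

```python
import numpy as np


def aggregate_features(gauss_ids, weights, pixel_feats, num_gaussians, min_weight=1e-6):
    """Weighted multi-view feature aggregation (illustrative sketch).

    gauss_ids:   (K,) int array, Gaussian index of each contribution
    weights:     (K,) float array, alpha-blending weight of each contribution
    pixel_feats: (K, D) float array, 2D feature at the contributing pixel
    """
    dim = pixel_feats.shape[1]
    feat_sum = np.zeros((num_gaussians, dim))
    weight_sum = np.zeros(num_gaussians)

    # Accumulate weighted features and total weight per Gaussian.
    # np.add.at handles repeated indices correctly (unbuffered addition).
    np.add.at(feat_sum, gauss_ids, weights[:, None] * pixel_feats)
    np.add.at(weight_sum, gauss_ids, weights)

    # Assumed heuristic: a Gaussian that received (almost) no weight in any
    # view is treated as invisible and keeps a zero feature.
    visible = weight_sum > min_weight
    feats = np.zeros_like(feat_sum)
    feats[visible] = feat_sum[visible] / weight_sum[visible, None]
    return feats, visible


# Toy example: three Gaussians, two feature channels, three contributions.
ids = np.array([0, 0, 1])
w = np.array([0.75, 0.25, 0.5])
f = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
feats, visible = aggregate_features(ids, w, f, num_gaussians=3)
```

In this toy input, Gaussian 2 receives no contribution, so it is flagged as not visible and could be dropped in a filtering step. No learned parameters are involved, which is consistent with the training-free description above.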
+
+ We compare our results with those of other methods.
+
+
+ We also visualize the relevancy maps for the 3D-OVS dataset.
+
+Room Scene
+ + +Bench Scene
+ + +Lawn Scene
+
+ We select and extract a "Porcelain hand" and "Waldo" (each represented by its Gaussians) from the Figurines scene of LERF.
+ By simply copying each object's Gaussians together with their parameters and semantic features, the new objects integrate seamlessly
+ into the Teatime scene while preserving their semantic features.
+
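The copy-based insertion described above can be sketched as a concatenation of per-Gaussian arrays. This is only an illustration under stated assumptions: each scene is stored as a dictionary of arrays with one row per Gaussian, and the keys `xyz`, `opacity`, and `language_feature`, the function `insert_object`, and the selection mask are all hypothetical. Any repositioning of the object in the target scene is omitted.

```python
import numpy as np


def insert_object(target, source, mask):
    """Append the source Gaussians selected by `mask` to the target scene.

    Every per-Gaussian attribute, including the language feature, is copied
    unchanged, so the inserted object keeps its semantics.
    """
    return {
        name: np.concatenate([target[name], source[name][mask]], axis=0)
        for name in target
    }


# Toy scenes: 2 Gaussians in the target, 3 in the source, 4-D features.
teatime = {
    "xyz": np.zeros((2, 3)),
    "opacity": np.ones((2, 1)),
    "language_feature": np.zeros((2, 4)),
}
figurines = {
    "xyz": np.ones((3, 3)),
    "opacity": np.full((3, 1), 0.5),
    "language_feature": np.arange(12.0).reshape(3, 4),
}
object_mask = np.array([False, True, True])  # Gaussians belonging to the object

merged = insert_object(teatime, figurines, object_mask)
```

Because the features are stored per Gaussian and are not compressed, the copied rows need no re-encoding or retraining in this sketch.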
+