Official Code Repository for AnyLoc

AnyLoc: Towards Universal Visual Place Recognition

License: BSD-3

Contents

The contents of this repository are as follows

S. No. | Item | Description
1 | demo | Contains standalone demo scripts (Quick start, Jupyter Notebook, and Gradio app) to run our AnyLoc-VLAD-DINOv2 method. Also contains guides for APIs. This folder is self-contained (doesn't use anything outside it).
2 | scripts | Contains all scripts for development. Use the -h option for argument information.
3 | configs.py | Global configurations for the repository
4 | utilities | Utility Classes & Functions (includes DINOv2 hooks & VLAD)
5 | conda-environment.yml | The conda environment (it could fail to install OpenAI CLIP as it includes a git+ URL). We suggest you use the setup_conda.sh script.
6 | requirements.txt | Requirements file for pip virtual environment. Probably out of date.
7 | custom_datasets | Custom dataloader implementations for VPR.
8 | examples | Miscellaneous example scripts
9 | MixVPR | Minimal MixVPR inference code
10 | clip_wrapper.py | A wrapper around two CLIP implementations (OpenAI and OpenCLIP).
11 | models_mae.py | MAE implementation
12 | dino_extractor.py | DINO (v1) feature extractor
13 | CONTRIBUTING.md | Note for contributors
14 | paper_utils | Paper scripts (formatting for figures, etc.)

Included Repositories

Includes the following repositories (currently not submodules) as subfolders.

Directory | Link | Cloned On | Description
dvgl-benchmark | gmberton/deep-visual-geo-localization-benchmark | 2023-02-12 | For benchmarking
datasets-vg | gmberton/datasets_vg | 2023-02-13 | For dataset download and formatting
CosPlace | gmberton/CosPlace | 2023-03-20 | Baseline Comparisons

Getting Started

Tip: You can explore the HuggingFace Space and the Colab notebooks (no GPU needed).

Clone this repository

git clone https://github.com/AnyLoc/AnyLoc.git
cd AnyLoc

Set up the conda environment

conda create -n anyloc python=3.8
conda activate anyloc
bash ./setup_conda.sh

You can also use an existing conda environment, say vl-vpr, by running

bash ./setup_conda.sh vl-vpr

Note the following:

  • All our public release files can be found here.
    • If the conda environment setup is taking time, you could just unzip conda-env.tar.gz (GB) in your ~/anaconda3/envs folder (but compatibility is not guaranteed).
  • The ./scripts folder is for validating our results and seeing the main scripts. Most applications are in the ./demo folder. See the list of demos before running anything.
  • If you're running something in the ./scripts folder, run it with pwd in this (repository) folder. For example, python scripts are run as python ./scripts/<script>.py and bash scripts are run as bash ./scripts/<script>.sh. For the demos and other baselines, you should cd into respective folders.
  • The utilities.py file is mainly for ./scripts files. All demos actually use the demo/utilities.py file (which is distilled and minimal). Using the latter should be enough to implement our SOTA method.

Using the SOTA: AnyLoc-VLAD-DINOv2

Using the APIs

Import the utilities

from utilities import DinoV2ExtractFeatures
from utilities import VLAD

DINOv2

The DINOv2 feature extractor can be used as follows

extractor = DinoV2ExtractFeatures("dinov2_vitg14", desc_layer,
        desc_facet, device=device)

Get the descriptors using

from torchvision import transforms as tvf

# Make the image patchable: crop height and width to multiples of the 14x14 patch size
c, h, w = img_pt.shape
h_new, w_new = (h // 14) * 14, (w // 14) * 14
img_pt = tvf.CenterCrop((h_new, w_new))(img_pt)[None, ...]
# Main extraction
ret = extractor(img_pt) # [1, num_patches, desc_dim]
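
The cropping arithmetic above can be checked without loading a model. The following is a minimal sketch in plain Python; the helper name patchable_size is hypothetical (it is not part of utilities.py), and it only assumes the 14-pixel patch size used in the snippet above.

```python
from typing import Tuple


def patchable_size(h: int, w: int, patch: int = 14) -> Tuple[int, int, int]:
    """Return (h_new, w_new, num_patches) after cropping to multiples of `patch`.

    Hypothetical helper for illustration; not part of the repository.
    """
    if h < patch or w < patch:
        raise ValueError("image is smaller than one patch")
    # Same floor-to-multiple arithmetic as the center crop above
    h_new, w_new = (h // patch) * patch, (w // patch) * patch
    num_patches = (h_new // patch) * (w_new // patch)
    return h_new, w_new, num_patches


if __name__ == "__main__":
    # A 480x640 image is cropped to 476x630, giving a 34x45 patch grid
    print(patchable_size(480, 640))
```

Provided the extractor returns one descriptor per patch, num_patches should match the second dimension of ret.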

VLAD

The VLAD aggregator can be loaded with vocabulary (cluster centers) from a c_centers.pt file.

# Main VLAD object
vlad = VLAD(num_c, desc_dim=None, cache_dir=os.path.dirname(c_centers_file))
vlad.fit(None)  # Load the vocabulary (and auto-detect `desc_dim`)
# Cluster centers have shape: [num_c, desc_dim]
#   - num_c: number of clusters
#   - desc_dim: descriptor dimension

If you have a database of descriptors that you want to fit the vocabulary to, use

import einops as ein
vlad.fit(ein.rearrange(full_db_vlad, "n k d -> (n k) d"))
# n: number of images
# k: number of patches/descriptors per image
# d: descriptor dimension
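
To make the "n k d -> (n k) d" pattern concrete, here is a small NumPy sketch of the same flattening. The function name flatten_descriptors is hypothetical, and NumPy stands in for the torch tensors used in the repository; for a contiguous array, a plain reshape gives the same row ordering as the rearrange pattern.

```python
import numpy as np


def flatten_descriptors(db: np.ndarray) -> np.ndarray:
    """Collapse [n, k, d] per-image descriptors into [(n * k), d].

    Hypothetical helper for illustration; not part of the repository.
    """
    n, k, d = db.shape
    # Rows are ordered image by image: all k descriptors of image 0 come first
    return db.reshape(n * k, d)


if __name__ == "__main__":
    n, k, d = 3, 4, 2  # images, descriptors per image, descriptor dimension
    full_db = np.arange(n * k * d, dtype=np.float32).reshape(n, k, d)
    flat = flatten_descriptors(full_db)
    print(flat.shape)
```

The clustering step then sees every local descriptor as an independent sample, regardless of which image it came from.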

To get the VLAD representations of multiple images, use

db_vlads: torch.Tensor = vlad.generate_multi(full_db)
# Shape of full_db: [n_db, n_d, d_dim]
#   - n_db: number of images in the database
#   - n_d: number of descriptors per image
#   - d_dim: descriptor dimension
# Shape of db_vlads: [n_db, num_c * d_dim]
#   - num_c: number of clusters (centers)
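
For intuition about where the [n_db, num_c * d_dim] shape comes from, the sketch below implements a textbook hard-assignment VLAD for a single image in NumPy. It is an illustration under stated assumptions, not the repository's VLAD class: the function name vlad_sketch is hypothetical, and the normalization steps (intra-normalization followed by a global L2 normalization) are a common choice that may differ from what utilities.py does.

```python
import numpy as np


def vlad_sketch(descs: np.ndarray, centers: np.ndarray) -> np.ndarray:
    """Aggregate local descriptors [n_d, d_dim] into one vector [num_c * d_dim].

    Textbook hard-assignment VLAD; the repository's VLAD class may differ.
    """
    num_c, d_dim = centers.shape
    # Squared distances between every descriptor and every center: [n_d, num_c]
    dists = ((descs[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    labels = dists.argmin(axis=1)  # nearest center per descriptor
    out = np.zeros((num_c, d_dim), dtype=np.float64)
    for c in range(num_c):
        members = descs[labels == c]
        if len(members) > 0:
            # Sum of residuals to the assigned center
            out[c] = (members - centers[c]).sum(axis=0)
    # Intra-normalization (per cluster), then global L2 normalization
    norms = np.linalg.norm(out, axis=1, keepdims=True)
    out = out / np.maximum(norms, 1e-12)
    flat = out.reshape(-1)
    return flat / max(np.linalg.norm(flat), 1e-12)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    v = vlad_sketch(rng.normal(size=(50, 8)), rng.normal(size=(4, 8)))
    print(v.shape)
```

Stacking one such vector per database image gives a matrix with one row per image and num_c * d_dim columns.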

DINOv1

This is present in dino_extractor.py (not a part of demo/utilities.py).

Initialize and use the extractor as follows

# Import it
import einops as ein
import torch.nn.functional as F
from dino_extractor import ViTExtractor
...

# Initialize it (layer and facet are passed later, when extracting descriptors)
extractor = ViTExtractor("dino_vits8", stride=4,
        device=device)
...

# Use it to extract patch descriptors
img = ein.rearrange(img, "c h w -> 1 c h w").to(device)
img = F.interpolate(img, (224, 298))    # For 4:3 images
desc = extractor.extract_descriptors(img,
        layer=11, facet="key") # [1, 1, num_descs, d_dim]
...
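
As a rough sanity check on num_descs, the sketch below computes the grid of overlapping patches for the settings used above (patch size 8, as implied by the dino_vits8 name, and stride=4). The helper name num_descriptors is hypothetical, and the formula assumes a plain sliding window with no padding, which may not match ViTExtractor exactly.

```python
from typing import Tuple


def num_descriptors(h: int, w: int, patch: int = 8, stride: int = 4) -> Tuple[int, int]:
    """Sliding-window grid size (rows, cols) for overlapping patches.

    Assumes no padding; hypothetical helper, not part of dino_extractor.py.
    """
    rows = 1 + (h - patch) // stride
    cols = 1 + (w - patch) // stride
    return rows, cols


if __name__ == "__main__":
    # The 224x298 input used above
    rows, cols = num_descriptors(224, 298)
    print(rows, cols, rows * cols)
```

Under these assumptions, a 224x298 input yields a 55x73 grid, so a smaller stride increases the number of descriptors (and the memory needed) quickly.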

Validating the Results

You don't need to read further unless you want to experimentally validate the full set of results (enjoy the demos instead). The following sections are for the curious minds who want to reproduce the results.

See CONTRIBUTING.md for the note to contributors.

All the runs were done on a machine with the following specifications:

  • CPU: Two Intel Xeon Gold 5317 CPUs (12C24T each)
  • CPU RAM: 256 GB
  • GPUs: Four NVIDIA RTX 3090 GPUs (24 GB, 10496 CUDA cores each)
  • Storage: 3.5 TB HDD on /scratch. The datasets alone take 32+ GB; leave additional space for other requirements (VLAD cluster centers, caching, models, etc.). We noticed that singularity (with SIF, cache, and tmp) used 90+ GB.
    • Driver Version (NVIDIA-SMI): 570.47.03
    • CUDA (SMI): 11.6

A single GPU is enough; however, some experiments (with large datasets) might need all of the CPU RAM (for efficient/fast nearest neighbor search). Ideally, a 16 GB GPU should also work.
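
To illustrate why nearest neighbor search is memory-hungry on large datasets, here is a minimal brute-force sketch in NumPy. It is not the search code used in ./scripts (which may rely on a dedicated library); the function name top_k_matches is hypothetical, and cosine similarity is assumed as the matching score.

```python
import numpy as np


def top_k_matches(db: np.ndarray, queries: np.ndarray, k: int = 5) -> np.ndarray:
    """Return indices [n_q, k] of the k most similar database rows per query.

    Brute-force cosine similarity; hypothetical helper for illustration.
    """
    db_n = db / np.maximum(np.linalg.norm(db, axis=1, keepdims=True), 1e-12)
    q_n = queries / np.maximum(np.linalg.norm(queries, axis=1, keepdims=True), 1e-12)
    sims = q_n @ db_n.T  # cosine similarities, [n_q, n_db]
    k = min(k, db.shape[0])
    # Sort descending by similarity and keep the first k columns
    return np.argsort(-sims, axis=1)[:, :k]


if __name__ == "__main__":
    db = np.eye(3)
    q = np.array([[0.9, 0.1, 0.0]])
    print(top_k_matches(db, q, k=2))
```

The full similarity matrix holds n_q * n_db values, and the descriptors themselves have num_c * d_dim entries each, so both grow quickly with the dataset size.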

Do the following

  1. Clone the repository and set up the NVIDIA NGC container (run everything inside it)
  2. Set up the datasets (download, format, and unzip them)
  3. Run the script you want to test from the scripts folder

Start by cloning/setting up the repository

cd ~/Documents
git clone https://github.com/AnyLoc/AnyLoc.git vl-vpr

NVIDIA NGC Singularity Container Setup

Despite using recommended practices of reproducibility (see function seed_everything in utilities.py) in PyTorch, we noticed minor changes across GPU types and CUDA versions. To mitigate this, we recommend using a singularity container.
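
For readers unfamiliar with this kind of seeding, the following is a generic sketch of what such a function typically does. It is not the seed_everything function from utilities.py (whose body is not shown here); the name seed_everything_sketch is hypothetical, and the PyTorch part runs only if PyTorch is installed.

```python
import random

import numpy as np


def seed_everything_sketch(seed: int = 42) -> None:
    """Seed the common random number generators (illustrative only)."""
    random.seed(seed)
    np.random.seed(seed)
    try:
        import torch  # optional: only seeded when PyTorch is installed
    except ImportError:
        return
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # silently ignored when CUDA is unavailable
    # Prefer deterministic cuDNN kernels over auto-tuned ones
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


if __name__ == "__main__":
    seed_everything_sketch(0)
    first = (random.random(), float(np.random.rand()))
    seed_everything_sketch(0)
    second = (random.random(), float(np.random.rand()))
    print(first == second)
```

Seeding like this makes runs repeatable on one machine, but it does not remove numerical differences between GPU types or CUDA versions, which is why a fixed container is recommended.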

Setting up the environment in a singularity container (in a SLURM environment)

TL;DR: Run the following (note that this is a different system from the one listed above). This was tested on CMU's Bridges-2 partition of PSC HPC. Don't use this setup if you want to replicate the tables in the paper exactly (the numbers come close, but are not identical).

salloc -p GPU-small -t 01:00:00 --ntasks-per-node=5 --gres=gpu:v100-32:1
cd /ocean/containers/ngc/pytorch/
singularity instance start --nv pytorch_22.12-py3.sif vlvpr
singularity run --nv instance://vlvpr
cd /ocean/projects/cis220039p/nkeetha/data/singularity/venv
source vlvpr/bin/activate
cd /ocean/projects/cis220039p/<path to vl-vpr scripts folder>

Main setup: For Singularity on IIITH's Ada HPC (Ubuntu 18.04), which is our main setup for validation. Use this if you want to replicate the tables in the paper (the hardware should be the same as listed before).

The script below assumes that this repository is cloned in ~/Documents/vl-vpr. That is, this README is at ~/Documents/vl-vpr/README.md.

# Load the module and configurations
module load u18/singularity-ce/3.9.6
mkdir -p /scratch/$USER/singularity && cd $_ && mkdir .cache .tmp venvs
export SINGULARITY_CACHEDIR=/scratch/$USER/singularity/.cache
export SINGULARITY_TMPDIR=/scratch/$USER/singularity/.tmp
# Ensure that the next command gives output "1" (or anything other than "0")
cat /proc/sys/kernel/unprivileged_userns_clone
# Setup the container (download the image if not there already) - (15 GB cache + 7.5 GB file)
singularity pull ngc_pytorch_22.12-py3 docker://nvcr.io/nvidia/pytorch:22.12-py3
# Test container through shell
singularity shell --nv ngc_pytorch_22.12-py3
# Start and run the container (mount the symlinked and scratch folders)
singularity instance start --mount "type=bind,source=/scratch/$USER,destination=/scratch/$USER" \
    --nv ngc_pytorch_22.12-py3 vl-vpr
singularity run --nv instance://vl-vpr
# Create virtual environment
cd ~/Documents/vl-vpr/
pip install virtualenv
cd venvs
virtualenv --system-site-packages vl-vpr
# Activate virtualenv and install all packages
cd ~/Documents/vl-vpr/
source ./venvs/vl-vpr/bin/activate
bash ./setup_virtualenv_ngc.sh
# Run anything you want (from here, but find the file in scripts)
cd ~/Documents/vl-vpr/
python ./scripts/<task name>.py <args>
# The baseline scripts should be run in their own folders. For example, to run CosPlace, do
cd ~/Documents/vl-vpr/
cd CosPlace
python ./<script>.py

Dataset Setup

Datasets Note: Some datasets are under review (as part of other works) and will be updated soon. See the Datasets-All folder in our public material (for .tar.gz files).

Set them up in a folder with sufficient space

mkdir -p /scratch/$USER/vl-vpr/datasets && cd $_

Download (and unzip) the datasets from here into this folder. Then link this folder into the repository (for easy access from this repository)

cd ~/Documents/vl-vpr/
cd ./datasets-vg
ln -s /scratch/$USER/vl-vpr/datasets datasets

After setting up all datasets, the folders should look like this (in the dataset folder). Run the following command to get the tree structure.

tree ./eiffel ./hawkins*/ ./laurel_caverns ./VPAir ./test_40_midref_rot*/ ./Oxford_Robotcar ./gardens ./17places ./baidu_datasets ./st_lucia ./pitts30k --filelimit=20 -h
  • The test_40_midref_rot0 is Nardo Air. This is also referred to as Tartan_GNSS_notrotated in our scripts.
  • The test_40_midref_rot90 is Nardo Air-R (rotated). This is also referred to as Tartan_GNSS_rotated in our scripts.
  • The hawkins_long_corridor is the Hawkins dataset (degraded environment).
  • The eiffel dataset is Mid-Atlantic Ridge (underwater dataset).

Output will be something like

./eiffel
├── [4.0K]  db_images [65 entries exceeds filelimit, not opening dir]
├── [2.2K]  eiffel_gt.npy
└── [4.0K]  q_images [101 entries exceeds filelimit, not opening dir]
./hawkins_long_corridor/
├── [4.0K]  db_images [127 entries exceeds filelimit, not opening dir]
├── [ 12K]  images [314 entries exceeds filelimit, not opening dir]
├── [ 17K]  pose_topic_list.npy
└── [4.0K]  q_images [118 entries exceeds filelimit, not opening dir]
./laurel_caverns
├── [4.0K]  db_images [141 entries exceeds filelimit, not opening dir]
├── [ 20K]  images [744 entries exceeds filelimit, not opening dir]
├── [ 41K]  pose_topic_list.npy
└── [4.0K]  q_images [112 entries exceeds filelimit, not opening dir]
./VPAir
├── [ 677]  camera_calibration.yaml
├── [420K]  distractors [10000 entries exceeds filelimit, not opening dir]
├── [4.0K]  distractors_temp
├── [ 321]  License.txt
├── [177K]  poses.csv
├── [ 72K]  queries [2706 entries exceeds filelimit, not opening dir]
├── [160K]  reference_views [2706 entries exceeds filelimit, not opening dir]
├── [ 96K]  reference_views_npy [2706 entries exceeds filelimit, not opening dir]
└── [ 82K]  vpair_gt.npy
./test_40_midref_rot0/
├── [ 46K]  gt_matches.csv
├── [2.8K]  network_config_dump.yaml
├── [5.3K]  query.csv
├── [4.0K]  query_images [71 entries exceeds filelimit, not opening dir]
├── [2.9K]  reference.csv
└── [4.0K]  reference_images [102 entries exceeds filelimit, not opening dir]
./test_40_midref_rot90/
├── [ 46K]  gt_matches.csv
├── [2.8K]  network_config_dump.yaml
├── [5.3K]  query.csv
├── [4.0K]  query_images [71 entries exceeds filelimit, not opening dir]
├── [2.9K]  reference.csv
└── [4.0K]  reference_images [102 entries exceeds filelimit, not opening dir]
./Oxford_Robotcar
├── [4.0K]  __MACOSX
│   └── [4.0K]  oxDataPart
├── [4.0K]  oxDataPart
│   ├── [4.0K]  1-m [191 entries exceeds filelimit, not opening dir]
│   ├── [ 24K]  1-m.npz
│   ├── [ 13K]  1-m.txt
│   ├── [4.0K]  1-s [191 entries exceeds filelimit, not opening dir]
│   ├── [ 24K]  1-s.npz
│   ├── [4.0K]  1-s-resized [191 entries exceeds filelimit, not opening dir]
│   ├── [ 13K]  1-s.txt
│   ├── [4.0K]  2-s [191 entries exceeds filelimit, not opening dir]
│   ├── [ 24K]  2-s.npz
│   ├── [4.0K]  2-s-resized [191 entries exceeds filelimit, not opening dir]
│   └── [ 13K]  2-s.txt
├── [ 15K]  oxdatapart.mat
└── [ 66M]  oxdatapart_seg.npz
./gardens
├── [4.0K]  day_left [200 entries exceeds filelimit, not opening dir]
├── [4.0K]  day_right [200 entries exceeds filelimit, not opening dir]
├── [3.6K]  gardens_gt.npy
└── [4.0K]  night_right [200 entries exceeds filelimit, not opening dir]
./17places
├── [ 14K]  ground_truth_new.npy
├── [ 13K]  my_ground_truth_new.npy
├── [ 12K]  query [406 entries exceeds filelimit, not opening dir]
├── [ 514]  ReadMe.txt
└── [ 12K]  ref [406 entries exceeds filelimit, not opening dir]
./baidu_datasets
├── [4.0G]  IDL_dataset_cvpr17_3852.zip
├── [387M]  mall.pcd
├── [108K]  query_gt [2292 entries exceeds filelimit, not opening dir]
├── [ 96K]  query_images_undistort [2292 entries exceeds filelimit, not opening dir]
├── [2.7K]  readme.txt
├── [ 44K]  training_gt [689 entries exceeds filelimit, not opening dir]
└── [ 36K]  training_images_undistort [689 entries exceeds filelimit, not opening dir]
./st_lucia
├── [4.0K]  images
│   └── [4.0K]  test
│       ├── [180K]  database [1549 entries exceeds filelimit, not opening dir]
│       └── [184K]  queries [1464 entries exceeds filelimit, not opening dir]
└── [695K]  map_st_lucia.png
./pitts30k
└── [4.0K]  images
    ├── [4.0K]  test
    │   ├── [1.2M]  database [10000 entries exceeds filelimit, not opening dir]
    │   ├── [5.9M]  database.npy
    │   ├── [864K]  queries [6816 entries exceeds filelimit, not opening dir]
    │   └── [4.0M]  queries.npy
    ├── [4.0K]  train
    │   ├── [1.3M]  database [10000 entries exceeds filelimit, not opening dir]
    │   ├── [5.9M]  database.npy
    │   ├── [948K]  queries [7416 entries exceeds filelimit, not opening dir]
    │   └── [4.4M]  queries.npy
    └── [4.0K]  val
        ├── [1.3M]  database [10000 entries exceeds filelimit, not opening dir]
        ├── [5.8M]  database.npy
        ├── [980K]  queries [7608 entries exceeds filelimit, not opening dir]
        └── [4.4M]  queries.npy

These directories go under the ./datasets-vg/datasets folder (you can store them in scratch and symlink them there). For example, the 17places dataset can be found under the ./datasets-vg/datasets/17places folder.
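
Before launching a long run, it can help to confirm that the expected folders are in place. The sketch below is a hypothetical helper (missing_datasets is not part of the repository); the folder names are copied from the tree listing above, and the default path assumes the symlink created in the datasets-vg folder earlier.

```python
import os
from typing import Iterable, List

# Folder names taken from the tree listing above
EXPECTED = ["eiffel", "hawkins_long_corridor", "laurel_caverns", "VPAir",
            "test_40_midref_rot0", "test_40_midref_rot90", "Oxford_Robotcar",
            "gardens", "17places", "baidu_datasets", "st_lucia", "pitts30k"]


def missing_datasets(root: str, expected: Iterable[str] = EXPECTED) -> List[str]:
    """Return the expected dataset folders that are absent under `root`.

    Symlinked folders count as present (os.path.isdir follows symlinks).
    """
    return [name for name in expected
            if not os.path.isdir(os.path.join(root, name))]


if __name__ == "__main__":
    # Path assumed from the symlink step; adjust it to your setup
    print(missing_datasets("./datasets-vg/datasets"))
```

An empty list means every folder from the listing was found; it does not check the contents of each folder.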

Note: We're in the process of releasing Nardo-Air (Tartan Air), Laurel Caverns, and Hawkins (part of SubT-MRS). Please stay tuned!

Cite Our Work

Thanks for using our work. You can cite it as:

@article{AnyLoc,
    author    = {Nikhil Keetha and Avneesh Mishra and Jay Karhade and Krishna Murthy Jatavallabhula and Sebastian Scherer and Madhava Krishna and Sourav Garg},
    title     = {AnyLoc: Towards Universal Visual Place Recognition},
    url       = {https://arxiv.org/abs/2308.00688},
    journal   = {arXiv},
    year      = {2023},
}
