diff --git a/README.md b/README.md
index a03f921..15c0f94 100644
--- a/README.md
+++ b/README.md
@@ -1 +1,18 @@
-# AVLink
+# AV-Link: Temporally-Aligned Diffusion Features for Cross-Modal Audio-Video Generation
+
+[![Project Page](https://img.shields.io/badge/Project-Page-green.svg)](https://snap-research.github.io/AVLink/)
+[![arXiv](https://img.shields.io/badge/arXiv-2311.18822-b31b1b)](#TODO)
+
+Abstract: We propose AV-Link, a unified framework for Video-to-Audio and Audio-to-Video generation that leverages the activations of frozen video and audio diffusion models for temporally-aligned cross-modal conditioning. The key to our framework is a Fusion Block that enables bidirectional information exchange between our backbone video and audio diffusion models through a temporally-aligned self-attention operation. Unlike prior work that uses feature extractors pretrained for other tasks for the conditioning signal, AV-Link can directly leverage features obtained by the complementary modality in a single framework, i.e., video features to generate audio, or audio features to generate video. We extensively evaluate our design choices and demonstrate the ability of our method to achieve synchronized and high-quality audiovisual content, showcasing its potential for applications in immersive media generation. For more details, please visit our project webpage or read our paper.
+
+## Issues
+If you have any questions about AV-Link, please open an issue in this GitHub repository or send your questions to `mh155@rice.edu`.
+
+## Project Page Template
+A template of our project page can be found in the `docs` directory.
+
+## Citation
+If you find this paper useful in your research, please consider citing:
+```
+TODO
+```
\ No newline at end of file