diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644
index 00000000..5106ab32
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,122 @@
+# Contributing to idc-index
+
+There are many ways to contribute to idc-index, with varying levels of effort.
+Do try to look through the [documentation][idc-index-docs] first if something
+is unclear, and let us know how we can do better.
+
+- Ask a question on the [IDC forum][idc-forum]
+- Use [idc-index issues][idc-index-issues] to submit a feature request or bug,
+  or add to the discussion on an existing issue
+- Submit a [Pull Request](https://github.com/ImagingDataCommons/idc-index/pulls)
+  to improve idc-index or its documentation
+
+We encourage a range of Pull Requests, from patches that include passing tests
+and documentation, all the way down to half-baked ideas that launch
+discussions.
+
+## The PR Process, GitHub Actions, and Related Gotchas
+
+### How to submit a PR?
+
+If you are new to idc-index development and you don't have push access to the
+repository, here are the steps:
+
+1. [Fork and clone](https://docs.github.com/get-started/quickstart/fork-a-repo)
+   the repository.
+2. Create a branch dedicated to the feature/bugfix you plan to implement (do
+   not use the `main` branch - this will complicate further development and
+   collaboration).
+3. [Push](https://docs.github.com/get-started/using-git/pushing-commits-to-a-remote-repository)
+   the branch to your GitHub fork.
+4. Create a
+   [Pull Request](https://github.com/ImagingDataCommons/idc-index/pulls).
+
+This corresponds to the `Fork & Pull Model` described in the
+[GitHub collaborative development](https://docs.github.com/pull-requests/collaborating-with-pull-requests/getting-started/about-collaborative-development-models)
+documentation. An example shell session covering these steps is included at
+the end of this guide.
+
+When submitting a PR, the developers following the project will be notified.
+That said, to engage specific developers, you can add a `Cc: @<username>`
+comment to notify them of your awesome contributions. Based on the comments
+posted by the reviewers, you may have to revisit your patches.
+
+### How to efficiently contribute?
+
+We encourage all developers to:
+
+- set up pre-commit hooks so that you can remedy various formatting and other
+  issues early, without waiting for the continuous integration (CI) checks to
+  complete: `pre-commit install`
+
+- add or update tests. You can see current tests
+  [here](https://github.com/ImagingDataCommons/idc-index/tree/main/tests). If
+  you contribute new functionality, adding test(s) covering it is mandatory!
+
+- run individual tests from the repository root using the following command:
+  `python -m unittest -vv tests.idcindex.TestIDCClient.<test_name>`
+
+### How to write commit messages?
+
+Write your commit messages using the standard prefixes for commit messages:
+
+- `BUG:` Fix for runtime crash or incorrect result
+- `COMP:` Compiler error or warning fix
+- `DOC:` Documentation change
+- `ENH:` New functionality
+- `PERF:` Performance improvement
+- `STYLE:` No logic impact (indentation, comments)
+- `WIP:` Work In Progress not ready for merge
+
+The body of the message should clearly describe the motivation of the commit
+(**what**, **why**, and **how**). To ease the task of reviewing commits, the
+message body should follow these guidelines:
+
+1. Leave a blank line between the subject and the body. This helps `git log`
+   and `git rebase` work nicely, and allows for smooth generation of release
+   notes.
+2. Try to keep the subject line below 72 characters, ideally 50.
+3. Capitalize the subject line.
+4. Do not end the subject line with a period.
+5. Use the imperative mood in the subject line (e.g.
+   `BUG: Fix spacing not being considered`).
+6. Wrap the body at 80 characters.
+7. Use semantic line feeds to separate different ideas, which improves
+   readability.
+8. Be concise, but honor the change: if significant alternative solutions were
+   available, explain why they were discarded.
+9. If the commit refers to a topic discussed on the [IDC forum][idc-forum], or
+   fixes a regression test, provide the link. If it fixes a compiler error,
+   provide a minimal verbatim message of the compiler error. If the commit
+   closes an issue, use the
+   [GitHub issue closing keywords](https://docs.github.com/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue).
+
+Keep in mind that significant time is invested in reviewing commits and
+_pull requests_, so following these guidelines will greatly help the people
+doing reviews.
+
+These guidelines are largely inspired by Chris Beams's
+[How to Write a Commit Message](https://chris.beams.io/posts/git-commit/)
+post.
+
+### How to integrate a PR?
+
+Getting your contributions integrated is relatively straightforward; here is
+the checklist:
+
+- All tests pass.
+- Consensus is reached. This usually means that at least two reviewers
+  approved the changes (or added a `LGTM` comment) and at least one business
+  day passed without anyone objecting. `LGTM` is an acronym for
+  _Looks Good to Me_.
+- To accommodate developers explicitly asking for more time to test the
+  proposed changes, integration time can be delayed by a few more days.
+- If you do NOT have push access, a core developer will integrate your PR. If
+  you would like to speed up the integration, do not hesitate to add a
+  reminder comment to the PR.
+
+### Automatic testing of pull requests
+
+Every pull request is tested automatically using GitHub Actions each time you
+push a commit to it. The GitHub UI will restrict users from merging pull
+requests until the CI build has returned a successful result indicating that
+all tests have passed.
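+
+### Example: a typical contribution session
+
+A minimal sketch of the workflow described above, from fork to push. The fork
+URL, branch name, test name, and commit message are placeholders - substitute
+your own:
+
+```bash
+# Clone your fork and create a topic branch (do not work on main)
+git clone https://github.com/<username>/idc-index.git
+cd idc-index
+git checkout -b my-bugfix-branch
+
+# Install the pre-commit hooks so formatting issues are caught locally
+pip install pre-commit
+pre-commit install
+
+# Make your changes, then run a single test from the repository root
+python -m unittest -vv tests.idcindex.TestIDCClient.test_get_collections
+
+# Commit using the standard prefixes, then push the branch to your fork
+git commit -am "BUG: Fix spacing not being considered"
+git push origin my-bugfix-branch
+```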
+
+[idc-forum]: https://discourse.canceridc.dev
+[idc-index-issues]: https://github.com/ImagingDataCommons/idc-index/issues
+[idc-index-docs]: https://idc-index.readthedocs.io/
diff --git a/idc_index/index.py b/idc_index/index.py
index 8a5e33f7..4089342c 100644
--- a/idc_index/index.py
+++ b/idc_index/index.py
@@ -21,6 +21,7 @@
 aws_endpoint_url = "https://s3.amazonaws.com"
 gcp_endpoint_url = "https://storage.googleapis.com"
+# GitHub serves release assets from /releases/download/<tag>/<asset>,
+# not from the API /releases/tags/<tag> URL (which returns JSON metadata)
+asset_endpoint_url = f"https://github.com/ImagingDataCommons/idc-index-data/releases/download/{idc_index_data.__version__}"
 
 logging.basicConfig(format="%(asctime)s - %(message)s", level=logging.INFO)
 logger = logging.getLogger(__name__)
@@ -67,7 +68,24 @@ def __init__(self):
         self.collection_summary = self.index.groupby("collection_id").agg(
             {"Modality": pd.Series.unique, "series_size_MB": "sum"}
         )
-        self.indices_overview = self.list_indices()
+
+        # Columns are index names, rows are attributes. The main index ships
+        # with idc-index-data; the others are fetched on demand (see fetch_index)
+        self.indices_overview = pd.DataFrame(
+            {
+                "index": {"description": None, "installed": True, "url": None},
+                "sm_index": {
+                    "description": None,
+                    "installed": False,
+                    "url": f"{asset_endpoint_url}/sm_index.parquet",
+                },
+                "sm_instance_index": {
+                    "description": None,
+                    "installed": False,
+                    "url": f"{asset_endpoint_url}/sm_instance_index.parquet",
+                },
+            }
+        )
 
         # Lookup s5cmd
         self.s5cmdPath = shutil.which("s5cmd")
@@ -172,33 +190,6 @@ def get_idc_version():
         idc_version = Version(idc_index_data.__version__).major
         return f"v{idc_version}"
 
-    @staticmethod
-    def _get_latest_idc_index_data_release_assets():
-        """
-        Retrieves a list of the latest idc-index-data release assets.
-
-        Returns:
-            release_assets (list): List of tuples (asset_name, asset_url).
-        """
-        release_assets = []
-        url = f"https://api.github.com/repos/ImagingDataCommons/idc-index-data/releases/tags/{idc_index_data.__version__}"
-        try:
-            response = requests.get(url, timeout=30)
-            if response.status_code == 200:
-                release_data = response.json()
-                assets = release_data.get("assets", [])
-                for asset in assets:
-                    release_assets.append(
-                        (asset["name"], asset["browser_download_url"])
-                    )
-            else:
-                logger.error(f"Failed to fetch releases: {response.status_code}")
-
-        except FileNotFoundError:
-            logger.error(f"Failed to fetch releases: {response.status_code}")
-
-        return release_assets
-
     def list_indices(self):
         """
         Lists all available indices including their installation status.
 
         Returns:
             indices_overview (pd.DataFrame): DataFrame containing information per index.
""" - if "indices_overview" not in locals(): - indices_overview = {} - # Find installed indices - for file in distribution("idc-index-data").files: - if str(file).endswith("index.parquet"): - index_name = os.path.splitext( - str(file).rsplit("/", maxsplit=1)[-1] - )[0] - - indices_overview[index_name] = { - "description": None, - "installed": True, - "local_path": os.path.join( - idc_index_data.IDC_INDEX_PARQUET_FILEPATH.parents[0], - f"{index_name}.parquet", - ), - } - - # Find available indices from idc-index-data - release_assets = self._get_latest_idc_index_data_release_assets() - for asset_name, asset_url in release_assets: - if asset_name.endswith(".parquet"): - asset_name = os.path.splitext(asset_name)[0] - if asset_name not in indices_overview: - indices_overview[asset_name] = { - "description": None, - "installed": False, - "url": asset_url, - } - - self.indices_overview = pd.DataFrame.from_dict( - indices_overview, orient="index" - ) - return self.indices_overview def fetch_index(self, index) -> None: @@ -251,14 +208,14 @@ def fetch_index(self, index) -> None: index (str): Name of the index to be downloaded. """ - if index not in self.indices_overview.index.tolist(): + if index not in self.indices_overview.keys(): logger.error(f"Index {index} is not available and can not be fetched.") - elif self.indices_overview.loc[index, "installed"]: + elif self.indices_overview[index]["installed"]: logger.warning( f"Index {index} already installed and will not be fetched again." ) else: - response = requests.get(self.indices_overview.loc[index, "url"], timeout=30) + response = requests.get(self.indices_overview[index]["url"], timeout=30) if response.status_code == 200: filepath = os.path.join( idc_index_data.IDC_INDEX_PARQUET_FILEPATH.parents[0], @@ -266,8 +223,7 @@ def fetch_index(self, index) -> None: ) with open(filepath, mode="wb") as file: file.write(response.content) - self.indices_overview.loc[index, "installed"] = True - self.indices_overview.loc[index, "local_path"] = filepath + self.indices_overview[index]["installed"] = True else: logger.error(f"Failed to fetch index: {response.status_code}") @@ -668,8 +624,8 @@ def _validate_update_manifest_and_get_download_size( # create a copy of the index index_df_copy = self.index - # Extract s3 url and crdc_instance_uuid from the manifest copy commands - # Next, extract crdc_instance_uuid from aws_series_url in the index and + # Extract s3 url and crdc_series_uuid from the manifest copy commands + # Next, extract crdc_series_uuid from aws_series_url in the index and # try to verify if every series in the manifest is present in the index # TODO: need to remove the assumption that manifest commands will have 'cp' @@ -697,8 +653,9 @@ def _validate_update_manifest_and_get_download_size( seriesInstanceuid, s3_url, series_size_MB, - index_crdc_series_uuid==manifest_crdc_series_uuid AS crdc_series_uuid_match, + index_crdc_series_uuid is not NULL as crdc_series_uuid_match, s3_url==series_aws_url AS s3_url_match, + manifest_temp.manifest_cp_cmd, CASE WHEN s3_url==series_aws_url THEN 'aws' ELSE @@ -717,19 +674,23 @@ def _validate_update_manifest_and_get_download_size( endpoint_to_use = None - if validate_manifest: - # Check if crdc_instance_uuid is found in the index - if not all(merged_df["crdc_series_uuid_match"]): - missing_manifest_cp_cmds = merged_df.loc[ - ~merged_df["crdc_series_uuid_match"], "manifest_cp_cmd" - ] - missing_manifest_cp_cmds_str = f"The following manifest copy commands do not have any associated series in the index: 
{missing_manifest_cp_cmds.tolist()}" - raise ValueError(missing_manifest_cp_cmds_str) + # Check if any crdc_series_uuid are not found in the index + if not all(merged_df["crdc_series_uuid_match"]): + missing_manifest_cp_cmds = merged_df.loc[ + ~merged_df["crdc_series_uuid_match"], "manifest_cp_cmd" + ] + logger.error( + "The following manifest copy commands are not recognized as referencing any associated series in the index.\n" + "This means either these commands are invalid, or they may correspond to files available in a release of IDC\n" + f"different from {self.get_idc_version()} used in this version of idc-index. The corresponding files will not be downloaded.\n" + ) + logger.error("\n" + "\n".join(missing_manifest_cp_cmds.tolist())) - # Check if there are more than one endpoints + if validate_manifest: + # Check if there is more than one endpoint if len(merged_df["endpoint"].unique()) > 1: raise ValueError( - "Either GCS bucket path is invalid or manifest has a mix of GCS and AWS urls. If so, please use urls from one provider only" + "Either GCS bucket path is invalid or manifest has a mix of GCS and AWS urls. " ) if ( diff --git a/pyproject.toml b/pyproject.toml index a4d8c825..92304920 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -125,6 +125,7 @@ disallow_incomplete_defs = true [tool.ruff] src = ["idc_index"] +extend-exclude = ["./CONTRIBUTING.md"] [tool.ruff.lint] extend-select = [ diff --git a/tests/idcindex.py b/tests/idcindex.py deleted file mode 100644 index c0806134..00000000 --- a/tests/idcindex.py +++ /dev/null @@ -1,476 +0,0 @@ -from __future__ import annotations - -import logging -import os -import tempfile -import unittest -from itertools import product -from pathlib import Path - -import pandas as pd -import pytest -from click.testing import CliRunner -from idc_index import IDCClient, cli - -# Run tests using the following command from the root of the repository: -# python -m unittest -vv tests/idcindex.py - -logging.basicConfig(level=logging.DEBUG) - - -@pytest.fixture(autouse=True) -def _change_test_dir(request, monkeypatch): - monkeypatch.chdir(request.fspath.dirname) - - -class TestIDCClient(unittest.TestCase): - def setUp(self): - self.client = IDCClient() - self.download_from_manifest = cli.download_from_manifest - self.download_from_selection = cli.download_from_selection - self.download = cli.download - - logger = logging.getLogger("idc_index") - logger.setLevel(logging.DEBUG) - - def test_get_collections(self): - collections = self.client.get_collections() - self.assertIsNotNone(collections) - - def test_get_idc_version(self): - idc_version = self.client.get_idc_version() - self.assertIsNotNone(idc_version) - self.assertTrue(idc_version.startswith("v")) - - def test_get_patients(self): - # Define the values for each optional parameter - output_format_values = ["list", "dict", "df"] - collection_id_values = [ - "htan_ohsu", - ["ct_phantom4radiomics", "cmb_gec"], - ] - - # Test each combination - for collection_id in collection_id_values: - for output_format in output_format_values: - patients = self.client.get_patients( - collection_id=collection_id, outputFormat=output_format - ) - - # Check if the output format matches the expected type - if output_format == "list": - self.assertIsInstance(patients, list) - self.assertTrue(bool(patients)) # Check that the list is not empty - elif output_format == "dict": - self.assertTrue( - isinstance(patients, dict) - or ( - isinstance(patients, list) - and all(isinstance(i, dict) for i in patients) - ) - ) # Check 
-                    self.assertTrue(
-                        bool(patients)
-                    )  # Check that the output is not empty
-                elif output_format == "df":
-                    self.assertIsInstance(patients, pd.DataFrame)
-                    self.assertFalse(
-                        patients.empty
-                    )  # Check that the DataFrame is not empty
-
-    def test_get_studies(self):
-        # Define the values for each optional parameter
-        output_format_values = ["list", "dict", "df"]
-        patient_id_values = ["PCAMPMRI-00001", ["PCAMPMRI-00001", "NoduleLayout_1"]]
-
-        # Test each combination
-        for patient_id in patient_id_values:
-            for output_format in output_format_values:
-                studies = self.client.get_dicom_studies(
-                    patientId=patient_id, outputFormat=output_format
-                )
-
-                # Check if the output format matches the expected type
-                if output_format == "list":
-                    self.assertIsInstance(studies, list)
-                    self.assertTrue(bool(studies))  # Check that the list is not empty
-                elif output_format == "dict":
-                    self.assertTrue(
-                        isinstance(studies, dict)
-                        or (
-                            isinstance(studies, list)
-                            and all(isinstance(i, dict) for i in studies)
-                        )
-                    )  # Check that the output is either a dictionary or a list of dictionaries
-                    self.assertTrue(bool(studies))  # Check that the output is not empty
-                elif output_format == "df":
-                    self.assertIsInstance(studies, pd.DataFrame)
-                    self.assertFalse(
-                        studies.empty
-                    )  # Check that the DataFrame is not empty
-
-    def test_get_series(self):
-        """
-        Query used for selecting the smallest series/studies:
-
-        SELECT
-            StudyInstanceUID,
-            ARRAY_AGG(DISTINCT(collection_id)) AS collection,
-            ARRAY_AGG(DISTINCT(series_aws_url)) AS aws_url,
-            ARRAY_AGG(DISTINCT(series_gcs_url)) AS gcs_url,
-            COUNT(DISTINCT(SOPInstanceUID)) AS num_instances,
-            SUM(instance_size) AS series_size
-        FROM
-            `bigquery-public-data.idc_current.dicom_all`
-        GROUP BY
-            StudyInstanceUID
-        HAVING
-            num_instances > 2
-        ORDER BY
-            series_size asc
-        LIMIT
-            10
-        """
-        # Define the values for each optional parameter
-        output_format_values = ["list", "dict", "df"]
-        study_instance_uid_values = [
-            "1.3.6.1.4.1.14519.5.2.1.6279.6001.175012972118199124641098335511",
-            [
-                "1.3.6.1.4.1.14519.5.2.1.1239.1759.691327824408089993476361149761",
-                "1.3.6.1.4.1.14519.5.2.1.1239.1759.272272273744698671736205545239",
-            ],
-        ]
-
-        # Test each combination
-        for study_instance_uid in study_instance_uid_values:
-            for output_format in output_format_values:
-                series = self.client.get_dicom_series(
-                    studyInstanceUID=study_instance_uid, outputFormat=output_format
-                )
-
-                # Check if the output format matches the expected type
-                if output_format == "list":
-                    self.assertIsInstance(series, list)
-                    self.assertTrue(bool(series))  # Check that the list is not empty
-                elif output_format == "dict":
-                    self.assertTrue(
-                        isinstance(series, dict)
-                        or (
-                            isinstance(series, list)
-                            and all(isinstance(i, dict) for i in series)
-                        )
-                    )  # Check that the output is either a dictionary or a list of dictionaries
-                elif output_format == "df":
-                    self.assertIsInstance(series, pd.DataFrame)
-                    self.assertFalse(
-                        series.empty
-                    )  # Check that the DataFrame is not empty
-
-    def test_download_dicom_series(self):
-        with tempfile.TemporaryDirectory() as temp_dir:
-            self.client.download_dicom_series(
-                seriesInstanceUID="1.3.6.1.4.1.14519.5.2.1.7695.1700.153974929648969296590126728101",
-                downloadDir=temp_dir,
-            )
-            self.assertEqual(sum([len(files) for r, d, files in os.walk(temp_dir)]), 3)
-
-    def test_download_with_template(self):
-        dirTemplateValues = [
-            None,
-            "%collection_id_%PatientID/%Modality-%StudyInstanceUID%SeriesInstanceUID",
-            "%collection_id%PatientID-%Modality_%StudyInstanceUID/%SeriesInstanceUID",
-            "%collection_id-%PatientID_%Modality/%StudyInstanceUID-%SeriesInstanceUID",
-            "%collection_id_%PatientID/%Modality/%StudyInstanceUID_%SeriesInstanceUID",
-        ]
-        for template in dirTemplateValues:
-            with tempfile.TemporaryDirectory() as temp_dir:
-                self.client.download_from_selection(
-                    downloadDir=temp_dir,
-                    studyInstanceUID="1.3.6.1.4.1.14519.5.2.1.7695.1700.114861588187429958687900856462",
-                    dirTemplate=template,
-                )
-                self.assertEqual(
-                    sum([len(files) for r, d, files in os.walk(temp_dir)]), 3
-                )
-
-    def test_download_from_selection(self):
-        # Define the values for each optional parameter
-        dry_run_values = [True, False]
-        quiet_values = [True, False]
-        show_progress_bar_values = [True, False]
-        use_s5cmd_sync_values = [True, False]
-
-        # Generate all combinations of optional parameters
-        combinations = product(
-            dry_run_values,
-            quiet_values,
-            show_progress_bar_values,
-            use_s5cmd_sync_values,
-        )
-
-        # Test each combination
-        for (
-            dry_run,
-            quiet,
-            show_progress_bar,
-            use_s5cmd_sync,
-        ) in combinations:
-            with tempfile.TemporaryDirectory() as temp_dir:
-                self.client.download_from_selection(
-                    downloadDir=temp_dir,
-                    dry_run=dry_run,
-                    patientId=None,
-                    studyInstanceUID="1.3.6.1.4.1.14519.5.2.1.7695.1700.114861588187429958687900856462",
-                    seriesInstanceUID=None,
-                    quiet=quiet,
-                    show_progress_bar=show_progress_bar,
-                    use_s5cmd_sync=use_s5cmd_sync,
-                )
-
-                if not dry_run:
-                    self.assertNotEqual(len(os.listdir(temp_dir)), 0)
-
-    def test_sql_queries(self):
-        df = self.client.sql_query("SELECT DISTINCT(collection_id) FROM index")
-
-        self.assertIsNotNone(df)
-
-    def test_download_from_aws_manifest(self):
-        # Define the values for each optional parameter
-        quiet_values = [True, False]
-        validate_manifest_values = [True, False]
-        show_progress_bar_values = [True, False]
-        use_s5cmd_sync_values = [True, False]
-        dirTemplateValues = [
-            None,
-            "%collection_id/%PatientID/%Modality/%StudyInstanceUID/%SeriesInstanceUID",
-            "%collection_id%PatientID%Modality%StudyInstanceUID%SeriesInstanceUID",
-        ]
-        # Generate all combinations of optional parameters
-        combinations = product(
-            quiet_values,
-            validate_manifest_values,
-            show_progress_bar_values,
-            use_s5cmd_sync_values,
-            dirTemplateValues,
-        )
-        # Test each combination
-        for (
-            quiet,
-            validate_manifest,
-            show_progress_bar,
-            use_s5cmd_sync,
-            dirTemplate,
-        ) in combinations:
-            with tempfile.TemporaryDirectory() as temp_dir:
-                self.client.download_from_manifest(
-                    manifestFile="./study_manifest_aws.s5cmd",
-                    downloadDir=temp_dir,
-                    quiet=quiet,
-                    validate_manifest=validate_manifest,
-                    show_progress_bar=show_progress_bar,
-                    use_s5cmd_sync=use_s5cmd_sync,
-                    dirTemplate=dirTemplate,
-                )
-
-                if sum([len(files) for _, _, files in os.walk(temp_dir)]) != 9:
-                    print(
-                        f"Failed for {quiet} {validate_manifest} {show_progress_bar} {use_s5cmd_sync} {dirTemplate}"
-                    )
-                    self.assertFalse(True)
-
-    def test_download_from_gcp_manifest(self):
-        # Define the values for each optional parameter
-        quiet_values = [True, False]
-        validate_manifest_values = [True, False]
-        show_progress_bar_values = [True, False]
-        use_s5cmd_sync_values = [True, False]
-        dirTemplateValues = [
-            None,
-            "%collection_id/%PatientID/%Modality/%StudyInstanceUID/%SeriesInstanceUID",
-            "%collection_id_%PatientID_%Modality_%StudyInstanceUID_%SeriesInstanceUID",
-        ]
-        # Generate all combinations of optional parameters
-        combinations = product(
-            quiet_values,
-            validate_manifest_values,
-            show_progress_bar_values,
-            use_s5cmd_sync_values,
-            dirTemplateValues,
-        )
-
-        # Test each combination
-        for (
-            quiet,
-            validate_manifest,
-            show_progress_bar,
-            use_s5cmd_sync,
-            dirTemplate,
-        ) in combinations:
-            with tempfile.TemporaryDirectory() as temp_dir:
-                self.client.download_from_manifest(
-                    manifestFile="./study_manifest_gcs.s5cmd",
-                    downloadDir=temp_dir,
-                    quiet=quiet,
-                    validate_manifest=validate_manifest,
-                    show_progress_bar=show_progress_bar,
-                    use_s5cmd_sync=use_s5cmd_sync,
-                    dirTemplate=dirTemplate,
-                )
-
-                self.assertEqual(
-                    sum([len(files) for r, d, files in os.walk(temp_dir)]), 9
-                )
-
-    def test_download_from_bogus_manifest(self):
-        # Define the values for each optional parameter
-        quiet_values = [True, False]
-        validate_manifest_values = [True, False]
-        show_progress_bar_values = [True, False]
-        use_s5cmd_sync_values = [True, False]
-
-        # Generate all combinations of optional parameters
-        combinations = product(
-            quiet_values,
-            validate_manifest_values,
-            show_progress_bar_values,
-            use_s5cmd_sync_values,
-        )
-
-        # Test each combination
-        for (
-            quiet,
-            validate_manifest,
-            show_progress_bar,
-            use_s5cmd_sync,
-        ) in combinations:
-            with tempfile.TemporaryDirectory() as temp_dir:
-                self.client.download_from_manifest(
-                    manifestFile="./study_manifest_bogus.s5cmd",
-                    downloadDir=temp_dir,
-                    quiet=quiet,
-                    validate_manifest=validate_manifest,
-                    show_progress_bar=show_progress_bar,
-                    use_s5cmd_sync=use_s5cmd_sync,
-                )
-
-                self.assertEqual(len(os.listdir(temp_dir)), 0)
-
-    """
-    disabling these tests due to a consistent server timeout issue
-    def test_citations(self):
-        citations = self.client.citations_from_selection(
-            collection_id="tcga_gbm",
-            citation_format=index.IDCClient.CITATION_FORMAT_APA,
-        )
-        self.assertIsNotNone(citations)
-
-        citations = self.client.citations_from_selection(
-            seriesInstanceUID="1.3.6.1.4.1.14519.5.2.1.7695.4164.588007658875211151397302775781",
-            citation_format=index.IDCClient.CITATION_FORMAT_BIBTEX,
-        )
-        self.assertIsNotNone(citations)
-
-        citations = self.client.citations_from_selection(
-            studyInstanceUID="1.2.840.113654.2.55.174144834924218414213677353968537663991",
-            citation_format=index.IDCClient.CITATION_FORMAT_BIBTEX,
-        )
-        self.assertIsNotNone(citations)
-
-        citations = self.client.citations_from_manifest("./study_manifest_aws.s5cmd")
-        self.assertIsNotNone(citations)
-    """
-
-    def test_cli_download_from_selection(self):
-        runner = CliRunner()
-        with tempfile.TemporaryDirectory() as temp_dir:
-            result = runner.invoke(
-                self.download_from_selection,
-                [
-                    "--download-dir",
-                    temp_dir,
-                    "--dry-run",
-                    False,
-                    "--quiet",
-                    True,
-                    "--show-progress-bar",
-                    True,
-                    "--use-s5cmd-sync",
-                    False,
-                    "--study-instance-uid",
-                    "1.3.6.1.4.1.14519.5.2.1.7695.1700.114861588187429958687900856462",
-                ],
-            )
-            assert len(os.listdir(temp_dir)) != 0
-
-    def test_cli_download_from_manifest(self):
-        runner = CliRunner()
-        with tempfile.TemporaryDirectory() as temp_dir:
-            result = runner.invoke(
-                self.download_from_manifest,
-                [
-                    "--manifest-file",
-                    "./study_manifest_aws.s5cmd",
-                    "--download-dir",
-                    temp_dir,
-                    "--quiet",
-                    True,
-                    "--show-progress-bar",
-                    True,
-                    "--use-s5cmd-sync",
-                    False,
-                ],
-            )
-            assert len(os.listdir(temp_dir)) != 0
-
-    def test_singleton_attribute(self):
-        # singleton, initialized on first use
-        i1 = IDCClient.client()
-        i2 = IDCClient.client()
-
-        # new instances created via constructor (through init)
-        i3 = IDCClient()
-        i4 = self.client
-
-        # all must be not none
-        assert i1 is not None
-        assert i2 is not None
-        assert i3 is not None
-        assert i4 is not None
-
-        # singletons must return the same instance
-        assert i1 == i2
-
-        # new instances must be different
-        assert i1 != i3
-        assert i1 != i4
-        assert i3 != i4
-
-        # all must be instances of IDCClient
-        assert isinstance(i1, IDCClient)
-        assert isinstance(i2, IDCClient)
-        assert isinstance(i3, IDCClient)
-        assert isinstance(i4, IDCClient)
-
-    def test_cli_download(self):
-        runner = CliRunner()
-        with runner.isolated_filesystem():
-            result = runner.invoke(
-                self.download,
-                ["1.3.6.1.4.1.14519.5.2.1.7695.1700.114861588187429958687900856462"],
-            )
-            assert len(os.listdir(Path.cwd())) != 0
-
-    def test_list_indices(self):
-        i = IDCClient()
-        assert not i.indices_overview.empty  # assert that df was created
-
-    def test_fetch_index(self):
-        i = IDCClient()
-        assert i.indices_overview["sm_index", "installed"] is False
-        i.fetch_index("sm_index")
-        assert i.indices_overview["sm_index", "installed"] is True
-
-
-if __name__ == "__main__":
-    unittest.main()
diff --git a/tests/prior_version_manifest.s5cmd b/tests/prior_version_manifest.s5cmd
new file mode 100644
index 00000000..1c91a450
--- /dev/null
+++ b/tests/prior_version_manifest.s5cmd
@@ -0,0 +1,5 @@
+cp s3://idc-open-data/040fd3e1-0088-4bfd-8439-55e3c5d80a56/* .
+cp s3://idc-open-data/04553d0f-1af9-414d-b631-cc31624aced5/* .
+cp s3://idc-open-data/068346bf-16ef-4e45-87bf-87feb576a21c/* .
+cp s3://idc-open-data/07908d47-5e85-45f3-9649-79c15f606f52/* .
+cp s3://idc-open-data/099d180f-1d79-402d-abad-bfd8e2736b04/* .
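
Each line of the new `tests/prior_version_manifest.s5cmd` fixture above is an
s5cmd `cp` command against the public `idc-open-data` bucket. A minimal sketch
of exercising it by hand (assumes `s5cmd` is installed; `--no-sign-request`
works because the bucket is open access):

```bash
# Run every cp command in the manifest against the AWS endpoint,
# downloading the referenced files into the current directory.
s5cmd --no-sign-request --endpoint-url https://s3.amazonaws.com \
  run tests/prior_version_manifest.s5cmd
```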