From dd5e66d834ddb7df6a7c57843ce5bbd71651e8ef Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Tue, 8 Nov 2022 10:44:04 -0800
Subject: [PATCH 01/36] Use sphinx variables from the environment if present

This is required for docs-ci to work properly.

It is already done in docs.nextstrain.org and ncov:

https://github.com/nextstrain/docs.nextstrain.org/blob/5b82daa7d6f9f784e93bdef967af101f91fd2c75/Makefile#L4-L7
https://github.com/nextstrain/ncov/blob/822a77ba070d1d37483d45c22a00b5f83d16e974/docs/Makefile#L4-L7
---
 docs/Makefile | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/docs/Makefile b/docs/Makefile
index 0316170ff..1518a91bb 100644
--- a/docs/Makefile
+++ b/docs/Makefile
@@ -1,9 +1,10 @@
 # Minimal makefile for Sphinx documentation
 #
 
-# You can set these variables from the command line.
-SPHINXOPTS    =
-SPHINXBUILD   = sphinx-build
+# You can set these variables from the command line, and also
+# from the environment for the first two.
+SPHINXOPTS    ?=
+SPHINXBUILD   ?= sphinx-build
 SOURCEDIR     = .
 BUILDDIR      = _build
 

From b62ecb5b9d3aed54409c9adef2fc1f1032142266 Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Tue, 8 Nov 2022 10:32:41 -0800
Subject: [PATCH 02/36] Update CI workflow to catch warnings in docs

This uses a reusable workflow in the nextstrain/.github repo.
---
 .github/workflows/ci.yaml | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/.github/workflows/ci.yaml b/.github/workflows/ci.yaml
index b7d54dae5..c36c7958d 100644
--- a/.github/workflows/ci.yaml
+++ b/.github/workflows/ci.yaml
@@ -93,3 +93,9 @@ jobs:
       - uses: codecov/codecov-action@v3
         with:
           fail_ci_if_error: true
+
+  build-docs:
+    uses: nextstrain/.github/.github/workflows/docs-ci.yaml@master
+    with:
+      docs-directory: docs/
+      pip-install-target: .[dev]

From 9227345bdb9f41251ae50edf37e26a94f6ca6ed4 Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Tue, 17 Jan 2023 16:13:24 -0800
Subject: [PATCH 03/36] docs/faq/metadata: Convert page to rST

rST is our standard doc format; Markdown is the legacy format.

Initial conversion performed with:

    pandoc -f markdown-smart -t rst-smart docs/faq/metadata.md > docs/faq/metadata.rst

and then I hand reviewed and made additional edits.

This fixes a broken link to https://docs.nextstrain.org/en/latest/guides/bioinformatics/lat_longs.html.
---
 docs/faq/metadata.md  |  59 -----------------------
 docs/faq/metadata.rst | 108 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 108 insertions(+), 59 deletions(-)
 delete mode 100644 docs/faq/metadata.md
 create mode 100644 docs/faq/metadata.rst

diff --git a/docs/faq/metadata.md b/docs/faq/metadata.md
deleted file mode 100644
index 9c455d288..000000000
--- a/docs/faq/metadata.md
+++ /dev/null
@@ -1,59 +0,0 @@
-# Preparing Your Metadata
-
-Analyses are vastly more interesting if the sequences or samples analyzed have rich 'meta data' wherever possible. This metadata could typically include collection dates, geographic location, symptoms of patients, host characteristics, etc.
-
-To make the most of augur's features, we recommend including sampling date and at least one type of geographic information if at all possible. However, you can also include things like symptoms, host, clinical outcome - and more!
-
-For augur to be able to parse this data, it needs to be formated consistently. Your data may have meta information coded into the sequence name (see example [below](#parsing-from-the-header)). If not, a very transparent way is to provide the meta data as a separate table in a tab- or comma-separated file.
-
-An example meta data file is shown here:
-
-```
-strain      accession   date        region          host
-1_0087_PF   KX447509    2013-12-XX  Oceania         Human
-1_0181_PF   KX447512    2013-12-XX  Oceania         Bat
-1_0199_PF   KX447519    2013-11-XX  Oceania         Human
-BRA/2016    KY785433    2016-04-08  South America   Cow
-BRA/2015    KY558989    2015-02-23  South America   Bat
-```
-
-### A note on Excel
-
-Because Excel will automatically change the date formatting, we recommend _not_ opening or preparing your meta data file in Excel. If the metadata is already in Excel, or you decide to prepare it in Excel, we recommend using another program to correct the dates afterwards (and don't open it in Excel again!).
-
-### Format
-
-**Strain names**
-
-You must have one column named `strain` or `name`. It contains your sequence names, and needs to match the identifiers of your sequences (in the Fasta or VCF file) _exactly_ and must not contain characters such as spaces, or `()[]{}|#><`.
-
-**Dates**
-
-Dates should be formated according as `YYYY-MM-DD`. You can specify unknown dates or month by replacing the respected values by `XX` (ex: `2013-01-XX` or `2011-XX-XX`) and completely unknown dates can be shown with `20XX-XX-XX` (which does not restrict the sequence to being in the 21st century - they could be earlier).
-
-**Geography**
-
-Geographic locations can be broken down, for example, into `region`, `country`, `division` or `city`. You can have as many levels of geographic information as you wish. For `region`, `country`, and some `division`s augur already knows many lat-long coordinates (see which ones it already knows by checking the list [here](https://github.com/nextstrain/augur/blob/master/augur/data/lat_longs.tsv)).
-
-It is important that these are spelled consistently.
-
-If you want to include locations where augur doesn't know the lat-long values, you can include them - see how [here](./lat_longs.html).
-
-### Consistancy and Style
-
-Check that your metadata is free from spelling mistakes and that values are consistant. Augur doesn't know that 'UK' and 'United Kingdom' or 'cat' and 'feline' are the same!
-
-Previously, auspice 'prettified' traits by capitalizing them automatically, and removing the underscores that separated two-word locations ('new_zealand' became 'New Zealand').
-
-Auspice will still do this if you are exporting 'V1' type JSON files (from augur v5 or augur v6 using `export v1`), but will not do this if you are using `export v2` ([read more](../releases/migrating-v5-v6.html#prettifying-metadata-fields)). Instead, you should update your metadata files so that traits look the same as you'd like them to display in Auspice (change 'new_zealand' to 'New Zealand' in your metadata, and in any additional latitude-longitude or coloring files you use).
-
-### Parsing from the header
-
-Sometimes, metadata can be coded into the Fasta header, like so:
-
-```
->1_0087_PF | KX447509 | 2013-12-XX | oceania
-ACTCGCTGCATCG...
-```
-
-Augur can parse meta data from Fasta headers using the `parse` function (see [here](/usage/cli/parse)), but you have to make sure that every sequence has the exact same meta data fields (even if empty), and that they are consistently delimited with `|`. Furthermore, none of the metadata fields can contain the character `|`.
diff --git a/docs/faq/metadata.rst b/docs/faq/metadata.rst
new file mode 100644
index 000000000..cbf54850b
--- /dev/null
+++ b/docs/faq/metadata.rst
@@ -0,0 +1,108 @@
+Preparing Your Metadata
+=======================
+
+Analyses are vastly more interesting if the sequences or samples
+analyzed have rich 'meta data' wherever possible. This metadata could
+typically include collection dates, geographic location, symptoms of
+patients, host characteristics, etc.
+
+To make the most of augur's features, we recommend including sampling
+date and at least one type of geographic information if at all possible.
+However, you can also include things like symptoms, host, clinical
+outcome - and more!
+
+For augur to be able to parse this data, it needs to be formated
+consistently. Your data may have meta information coded into the
+sequence name (see example :ref:`below<parsing-from-the-header>`). If
+not, a very transparent way is to provide the meta data as a separate
+table in a tab- or comma-separated file.
+
+An example meta data file is shown here:
+
+::
+
+   strain      accession   date        region          host
+   1_0087_PF   KX447509    2013-12-XX  Oceania         Human
+   1_0181_PF   KX447512    2013-12-XX  Oceania         Bat
+   1_0199_PF   KX447519    2013-11-XX  Oceania         Human
+   BRA/2016    KY785433    2016-04-08  South America   Cow
+   BRA/2015    KY558989    2015-02-23  South America   Bat
+
+A note on Excel
+~~~~~~~~~~~~~~~
+
+Because Excel will automatically change the date formatting, we
+recommend *not* opening or preparing your meta data file in Excel. If
+the metadata is already in Excel, or you decide to prepare it in Excel,
+we recommend using another program to correct the dates afterwards (and
+don't open it in Excel again!).
+
+Format
+~~~~~~
+
+**Strain names**
+
+You must have one column named ``strain`` or ``name``. It contains your
+sequence names, and needs to match the identifiers of your sequences (in
+the Fasta or VCF file) *exactly* and must not contain characters such as
+spaces, or ``()[]{}|#><``.
+
+**Dates**
+
+Dates should be formated according as ``YYYY-MM-DD``. You can specify
+unknown dates or month by replacing the respected values by ``XX`` (ex:
+``2013-01-XX`` or ``2011-XX-XX``) and completely unknown dates can be
+shown with ``20XX-XX-XX`` (which does not restrict the sequence to being
+in the 21st century - they could be earlier).
+
+**Geography**
+
+Geographic locations can be broken down, for example, into ``region``,
+``country``, ``division`` or ``city``. You can have as many levels of
+geographic information as you wish. For ``region``, ``country``, and
+some ``division``\ s augur already knows many lat-long coordinates (see
+which ones it already knows by checking the list
+`here <https://github.com/nextstrain/augur/blob/master/augur/data/lat_longs.tsv>`__).
+
+It is important that these are spelled consistently.
+
+If you want to include locations where augur doesn't know the lat-long
+values, you can include them - see how :doc:`here <docs.nextstrain.org:guides/bioinformatics/lat_longs>`.
+
+Consistancy and Style
+~~~~~~~~~~~~~~~~~~~~~
+
+Check that your metadata is free from spelling mistakes and that values
+are consistant. Augur doesn't know that 'UK' and 'United Kingdom' or
+'cat' and 'feline' are the same!
+
+Previously, auspice 'prettified' traits by capitalizing them
+automatically, and removing the underscores that separated two-word
+locations ('new_zealand' became 'New Zealand').
+
+Auspice will still do this if you are exporting 'V1' type JSON files
+(from augur v5 or augur v6 using ``export v1``), but will not do this if
+you are using ``export v2`` (`read
+more <../releases/migrating-v5-v6.html#prettifying-metadata-fields>`__).
+Instead, you should update your metadata files so that traits look the
+same as you'd like them to display in Auspice (change 'new_zealand' to
+'New Zealand' in your metadata, and in any additional latitude-longitude
+or coloring files you use).
+
+.. _parsing-from-the-header:
+
+Parsing from the header
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Sometimes, metadata can be coded into the Fasta header, like so:
+
+::
+
+   >1_0087_PF | KX447509 | 2013-12-XX | oceania
+   ACTCGCTGCATCG...
+
+Augur can parse meta data from Fasta headers using the ``parse``
+function (see :doc:`here </usage/cli/parse>`), but you have to make sure
+that every sequence has the exact same meta data fields (even if empty),
+and that they are consistently delimited with ``|``. Furthermore, none
+of the metadata fields can contain the character ``|``.

From ce920ad9ed3a3fe2992fd2e451d68b8486a870ab Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Tue, 8 Nov 2022 11:39:49 -0800
Subject: [PATCH 04/36] Update references to zika tutorial

The page was renamed in https://github.com/nextstrain/docs.nextstrain.org/commit/a3b38566b835e8f40955d35fb5e6e201a84c6174.
---
 docs/usage/cli/filter.rst | 4 ++--
 docs/usage/cli/index.rst  | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/usage/cli/filter.rst b/docs/usage/cli/filter.rst
index 36250aee5..5bd45b778 100644
--- a/docs/usage/cli/filter.rst
+++ b/docs/usage/cli/filter.rst
@@ -15,7 +15,7 @@ augur filter
 How we subsample sequences in the zika-tutoral
 ==============================================
 
-As an example, we'll look that the ``filter`` command in greater detail using material from the :doc:`Zika tutorial <docs.nextstrain.org:tutorials/zika>`.
+As an example, we'll look that the ``filter`` command in greater detail using material from the :doc:`Zika tutorial <docs.nextstrain.org:tutorials/creating-a-workflow>`.
 The filter command allows you to selected various subsets of your input data for different types of analysis.
 A simple example use of this command would be
 
@@ -45,7 +45,7 @@ To drop such strains, you can pass the name of this file to the augur filter com
              --output filtered.fasta
 
 (To improve legibility, we have wrapped the command across multiple lines.)
-If you run this command (you should be able to copy-paste this into your terminal) on the data provided in the :doc:`Zika tutorial <docs.nextstrain.org:tutorials/zika>`, you should see that one of the sequences in the data set was dropped since its name was in the ``dropped_strains.txt`` file.
+If you run this command (you should be able to copy-paste this into your terminal) on the data provided in the :doc:`Zika tutorial <docs.nextstrain.org:tutorials/creating-a-workflow>`, you should see that one of the sequences in the data set was dropped since its name was in the ``dropped_strains.txt`` file.
 
 Another common filtering operation is subsetting of data to a achieve a more even spatio-temporal distribution or to cut-down data set size to more manageable numbers.
 The filter command allows you to select a specific number of sequences from specific groups, for example one sequence per month from each country:
diff --git a/docs/usage/cli/index.rst b/docs/usage/cli/index.rst
index aea3f83e0..aa95d8f45 100644
--- a/docs/usage/cli/index.rst
+++ b/docs/usage/cli/index.rst
@@ -11,7 +11,7 @@ augur index
 Speed up filtering with a sequence index
 ========================================
 
-As we describe in :doc:`the zika tutorial <docs.nextstrain.org:tutorials/zika>`, augur index precalculates the composition of the sequences (e.g., numbers of nucleotides, gaps, invalid characters, and total sequence length) prior to filtering.
+As we describe in :doc:`the zika tutorial <docs.nextstrain.org:tutorials/creating-a-workflow>`, augur index precalculates the composition of the sequences (e.g., numbers of nucleotides, gaps, invalid characters, and total sequence length) prior to filtering.
 The resulting sequence index speeds up subsequent filter steps especially in more complex workflows.
 
 .. code-block:: bash

From 2c8fdaa18e68f41e319414f299fbd7414631170e Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Tue, 8 Nov 2022 13:30:23 -0800
Subject: [PATCH 05/36] Remove "Tests" section heading in docstring

This was causing a warning when building the docs:

    Tests
    -----
    None:8: CRITICAL: Unexpected section title.

Removing since docstrings in other files do not have this heading.
---
 augur/align.py | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/augur/align.py b/augur/align.py
index ebe038946..a95df36ba 100644
--- a/augur/align.py
+++ b/augur/align.py
@@ -280,9 +280,6 @@ def strip_non_reference(aln, reference, insertion_csv=None):
     list
         list of trimmed sequences, effectively a multiple alignment
 
-
-    Tests
-    -----
     >>> [s.name for s in strip_non_reference(read_alignment("tests/data/align/test_aligned_sequences.fasta"), "with_gaps")]
     Trimmed gaps in with_gaps from the alignment
     ['with_gaps', 'no_gaps', 'some_other_seq', '_R_crick_strand']

From 3aa9665c52860221834eb3f59841d8076bdb6f01 Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Tue, 8 Nov 2022 13:53:06 -0800
Subject: [PATCH 06/36] Use consistent table of contents

In CLI pages with multiple headings.
---
 docs/usage/cli/distance.rst | 5 +++++
 docs/usage/cli/filter.rst   | 3 ++-
 docs/usage/cli/index.rst    | 5 +++++
 docs/usage/cli/parse.rst    | 3 ++-
 docs/usage/cli/refine.rst   | 7 ++-----
 docs/usage/cli/traits.rst   | 5 ++++-
 6 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/docs/usage/cli/distance.rst b/docs/usage/cli/distance.rst
index 7ab45d20e..04a72dba0 100644
--- a/docs/usage/cli/distance.rst
+++ b/docs/usage/cli/distance.rst
@@ -2,6 +2,11 @@
 augur distance
 ==============
 
+.. contents:: Table of Contents
+   :local:
+
+----
+
 .. argparse::
     :module: augur
     :func: make_parser
diff --git a/docs/usage/cli/filter.rst b/docs/usage/cli/filter.rst
index 5bd45b778..85efd151d 100644
--- a/docs/usage/cli/filter.rst
+++ b/docs/usage/cli/filter.rst
@@ -2,7 +2,8 @@
 augur filter
 ============
 
-* `How we subsample sequences in the zika-tutoral <#how-we-subsample-sequences-in-the-zika-tutoral>`__
+.. contents:: Table of Contents
+   :local:
 
 ----
 
diff --git a/docs/usage/cli/index.rst b/docs/usage/cli/index.rst
index aa95d8f45..77a4c6b00 100644
--- a/docs/usage/cli/index.rst
+++ b/docs/usage/cli/index.rst
@@ -2,6 +2,11 @@
 augur index
 ============
 
+.. contents:: Table of Contents
+   :local:
+
+----
+
 .. argparse::
     :module: augur
     :func: make_parser
diff --git a/docs/usage/cli/parse.rst b/docs/usage/cli/parse.rst
index 283c7d94f..372a65b80 100644
--- a/docs/usage/cli/parse.rst
+++ b/docs/usage/cli/parse.rst
@@ -2,7 +2,8 @@
 augur parse
 ===========
 
-* `Example: how to parse metadata from fasta-headers <#example-how-to-parse-metadata-from-fasta-headers>`__
+.. contents:: Table of Contents
+   :local:
 
 ----
 
diff --git a/docs/usage/cli/refine.rst b/docs/usage/cli/refine.rst
index 6608565d4..1eb58a5a3 100644
--- a/docs/usage/cli/refine.rst
+++ b/docs/usage/cli/refine.rst
@@ -3,11 +3,8 @@ augur refine
 ===========================
 
 
-* `How we use refine in the zika tutorial <#how-we-use-refine-in-the-zika-tutorial>`__
-* `Specify the evolutionary rate <#specify-the-evolutionary-rate>`__
-* `Confidence intervals for divergence times <#confidence-intervals-for-divergence-times>`__
-* `Specifying the root of the tree <#specifying-the-root-of-the-tree>`__
-* `Polytomy resolution <#polytomy-resolution>`__
+.. contents:: Table of Contents
+   :local:
 
 ----
 
diff --git a/docs/usage/cli/traits.rst b/docs/usage/cli/traits.rst
index 772fc5592..c1034158d 100644
--- a/docs/usage/cli/traits.rst
+++ b/docs/usage/cli/traits.rst
@@ -2,7 +2,10 @@
 augur traits
 ============
 
-.. contents::
+.. contents:: Table of Contents
+   :local:
+
+----
 
 .. argparse::
     :module: augur

From f3314154330dc0a073362aef335d19062f900536 Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Tue, 8 Nov 2022 13:55:07 -0800
Subject: [PATCH 07/36] distance: Use JSON syntax highlighting

Replace the literal blocks (introduced by the :: markers) with
code-block directives for nice syntax highlighting.
---
 augur/distance.py | 24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/augur/distance.py b/augur/distance.py
index d41be670e..f344a2652 100644
--- a/augur/distance.py
+++ b/augur/distance.py
@@ -39,7 +39,9 @@
 The `default` key specifies the numeric (floating point) value to assign to all mismatches by default.
 The `map` key specifies a dictionary of weights to use for distance calculations.
 These weights are indexed hierarchically by gene name and one-based gene coordinate and are assigned in either a sequence-independent or sequence-dependent manner.
-The simplest possible distance map calculates Hamming distance between sequences without any site-specific weights, as shown below::
+The simplest possible distance map calculates Hamming distance between sequences without any site-specific weights, as shown below:
+
+.. code-block:: json
 
     {
         "name": "Hamming distance",
@@ -48,7 +50,9 @@
     }
 
 By default, distances are floating point values whose precision can be controlled with the `precision` key that defines the number of decimal places to retain for each distance.
-The following example shows how to specify a precision of two decimal places in the final output.::
+The following example shows how to specify a precision of two decimal places in the final output:
+
+.. code-block:: json
 
     {
         "name": "Hamming distance",
@@ -57,7 +61,9 @@
         "precision": 2
     }
 
-Distances can be reported as integer values by specifying an `output_type` as `integer` or `int` as follows.::
+Distances can be reported as integer values by specifying an `output_type` as `integer` or `int` as follows:
+
+.. code-block:: json
 
     {
         "name": "Hamming distance",
@@ -70,7 +76,9 @@
 value of the same type as the default value (integer or float). The following
 example is a distance map for antigenic amino acid substitutions near influenza
 A/H3N2 HA's receptor binding sites. This map calculates the Hamming distance
-between amino acid sequences only at seven positions in the HA1 gene::
+between amino acid sequences only at seven positions in the HA1 gene:
+
+.. code-block:: json
 
     {
         "name": "Koel epitope sites",
@@ -92,7 +100,9 @@
 where the `from` sequence in each pair is interpreted as the ancestral state and
 the `to` sequence as the derived state. The following example is a distance map
 that assigns asymmetric weights to specific amino acid substitutions at a
-specific position in the influenza gene HA1::
+specific position in the influenza gene HA1:
+
+.. code-block:: json
 
     {
         "default": 0.0,
@@ -119,7 +129,9 @@
 the JSON includes a `params` field that describes the mapping of attribute names
 to requested comparisons and distance maps and any date parameters specified by
 the user. The following example JSON shows a sample output when the distance
-command is run with multiple comparisons and distance maps::
+command is run with multiple comparisons and distance maps:
+
+.. code-block:: json
 
     {
         "params": {

From cbed8ef049fca5c08a00a813821fecb646e5c2d5 Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Tue, 17 Jan 2023 16:48:22 -0800
Subject: [PATCH 08/36] Fix augur.io docs warning
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Fixes this warning:

    augur/io/__init__.py:docstring of augur.io:1: WARNING: duplicate object description of augur.io, other instance in api/developer/augur.io, use :noindex: for one of them

While :noindex: for Sphinx in general disables cross-referencing¹ and it
is briefly mentioned in the "Options and advanced usage" section of
automodule², there aren't any references anyways (as demonstrated by a
lack of related warnings/errors).

However, it's still unclear to me exactly why this shows on augur.io
only. I assume it has something to do with the re-exporting of public
API functions, which is currently specifi to augur.io.

¹ https://www.sphinx-doc.org/en/master/usage/restructuredtext/domains.html#basic-markup
² https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html#directives
---
 docs/api/developer/augur.io.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/api/developer/augur.io.rst b/docs/api/developer/augur.io.rst
index 56c1737d8..13c044f60 100644
--- a/docs/api/developer/augur.io.rst
+++ b/docs/api/developer/augur.io.rst
@@ -5,6 +5,7 @@ augur.io
    :members:
    :undoc-members:
    :show-inheritance:
+   :noindex:
 
 .. toctree::
 

From 844c10c62af559ebb71d7bb4efbefcdb07d65c02 Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Tue, 17 Jan 2023 16:56:25 -0800
Subject: [PATCH 09/36] Replace section heading in docstring with bold text
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The section headings resulted in warnings:

    None:8: CRITICAL: Unexpected section title.

    Comparison methods
    ==================
    None:36: CRITICAL: Unexpected section title.

    Distance maps
    =============

They rendered fine in the Development API page¹ but did not render on
the CLI page².

This seems like a bug with sphinx-argparse³. For now, use bold text to
avoid the warnings and allow the text to render on the CLI page.

¹ https://docs.nextstrain.org/projects/augur/en/stable/api/developer/augur.distance.html
² https://docs.nextstrain.org/projects/augur/en/stable/usage/cli/distance.html
³ https://github.com/ashb/sphinx-argparse/issues/31
---
 augur/distance.py | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/augur/distance.py b/augur/distance.py
index f344a2652..7ee432528 100644
--- a/augur/distance.py
+++ b/augur/distance.py
@@ -4,8 +4,7 @@
 which sequences to compare) and a distance map (to determine the weight of a
 mismatch between any two sequences).
 
-Comparison methods
-==================
+**Comparison methods**
 
 Comparison methods include:
 
@@ -32,8 +31,7 @@
 parameters allow users to specify a fixed time interval for pairwise
 calculations, limiting the computationally complexity of the comparisons.
 
-Distance maps
-=============
+**Distance maps**
 
 Distance maps are defined in JSON format with two required top-level keys.
 The `default` key specifies the numeric (floating point) value to assign to all mismatches by default.

From a310587237c6fea2bd9433caf8dd386db6aecc4e Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Wed, 18 Jan 2023 12:11:08 -0800
Subject: [PATCH 10/36] docs: Build locally with nitpicky mode

This makes local builds show the same warnings in CI without erroring.
---
 docs/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/Makefile b/docs/Makefile
index 1518a91bb..21666bb84 100644
--- a/docs/Makefile
+++ b/docs/Makefile
@@ -3,7 +3,7 @@
 
 # You can set these variables from the command line, and also
 # from the environment for the first two.
-SPHINXOPTS    ?=
+SPHINXOPTS    ?= -n
 SPHINXBUILD   ?= sphinx-build
 SOURCEDIR     = .
 BUILDDIR      = _build

From 14bf62092a067eebf1b379835eb54fa7ee3c539e Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Wed, 18 Jan 2023 16:45:26 -0800
Subject: [PATCH 11/36] Set nitpick_ignore to suppress warnings of valid
 numpydoc

---
 docs/conf.py | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/docs/conf.py b/docs/conf.py
index e4f383ffe..ce5ef49d7 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -116,6 +116,16 @@ def prose_list(items):
     'css/custom.css',
 ]
 
+# -- Resolve build warnings --------------------------------------------------
+
+nitpick_ignore = [
+    # These are valid numpydoc keywords¹, but somehow they are not recognized by
+    # napoleon.
+    # ¹ https://numpydoc.readthedocs.io/en/v1.5.0/format.html#parameters
+    ('py:class', 'optional'),
+    ('py:class', 'iterable'),
+]
+
 # -- Cross-project references ------------------------------------------------
 
 intersphinx_mapping = {

From 97e082d01f800375f6bed6ea613d6f2bb872c965 Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Wed, 8 Mar 2023 16:32:38 -0800
Subject: [PATCH 12/36] fix: Add augur.types prefix for proper linking

Without this prefix, a warning appears:

    WARNING: py:attr reference target not found: ValidationMode.SKIP
---
 augur/util_support/node_data_reader.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/augur/util_support/node_data_reader.py b/augur/util_support/node_data_reader.py
index 191414318..4d365101e 100644
--- a/augur/util_support/node_data_reader.py
+++ b/augur/util_support/node_data_reader.py
@@ -16,7 +16,7 @@ class NodeDataReader:
 
     If a tree file is specified, it is used to verify the node names.
 
-    If validation_mode is set to :py:attr:`ValidationMode.SKIP`, Augur version of node data files is not checked.
+    If validation_mode is set to :py:attr:`augur.types.ValidationMode.SKIP`, Augur version of node data files is not checked.
     """
 
     def __init__(self, filenames, tree_file=None, validation_mode=ValidationMode.ERROR):

From eb65cbb4b7b165d8bb46be54fa015cc559c55809 Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Thu, 19 Jan 2023 10:38:47 -0800
Subject: [PATCH 13/36] fix: Use Python type hints in doc generation

The majority of the codebase uses numpydoc for type annotations.
However, there are some usages of the newer PEP 484 type hints.

For example, augur.dates.get_numerical_dates has `metadata:pd.DataFrame`
in the function signature.

Without sphinx-autodoc-typehints, this caused a warning:

    augur/augur/dates.py:docstring of augur.dates.get_numerical_dates:1: WARNING: py:class reference target not found: pandas.core.frame.DataFrame

This commit fixes that warning.
---
 docs/conf.py | 1 +
 setup.py     | 1 +
 2 files changed, 2 insertions(+)

diff --git a/docs/conf.py b/docs/conf.py
index ce5ef49d7..b6de5bf41 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -61,6 +61,7 @@ def prose_list(items):
     'sphinx.ext.autodoc',
     'sphinxarg.ext',
     'sphinx.ext.napoleon',
+    'sphinx_autodoc_typehints', # must come after napoleon https://github.com/tox-dev/sphinx-autodoc-typehints/blob/1.21.4/README.md#compatibility-with-sphinxextnapoleon
     'sphinx_markdown_tables',
     'sphinx.ext.intersphinx',
     'nextstrain.sphinx.theme',
diff --git a/setup.py b/setup.py
index 2ed0bdd1f..4f7f41431 100644
--- a/setup.py
+++ b/setup.py
@@ -82,6 +82,7 @@
             "sphinx-argparse >=0.2.5",
             "sphinx-markdown-tables >= 0.0.9",
             "sphinx-rtd-theme >=0.4.3",
+            "sphinx-autodoc-typehints >=1.21.4",
             "wheel >=0.32.3",
             "ipdb >=0.10.1"
         ]

From bf2d1330ebac6ace6f1159b3bc1523d93061d733 Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Thu, 19 Jan 2023 11:51:30 -0800
Subject: [PATCH 14/36] fix: Add external intersphinx mappings

This enables proper reference resolution during the docs build, and
provides useful links on the generated doc pages.
---
 docs/conf.py | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/docs/conf.py b/docs/conf.py
index b6de5bf41..ff7ea77ec 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -130,5 +130,10 @@ def prose_list(items):
 # -- Cross-project references ------------------------------------------------
 
 intersphinx_mapping = {
+    'Bio': ('https://biopython.org/docs/latest/api/', None),
     'docs.nextstrain.org': ('https://docs.nextstrain.org/en/latest/', None),
+    'python': ('https://docs.python.org/3', None),
+    'numpy': ('https://numpy.org/doc/stable', None),
+    'pandas': ('https://pandas.pydata.org/docs', None),
+    'treetime': ('https://treetime.readthedocs.io/en/stable/', None),
 }

From 862f7494147d8ea30c46c759af982a2e68fff080 Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Thu, 19 Jan 2023 12:21:01 -0800
Subject: [PATCH 15/36] fix: Add doc pages to resolve references

Without these pages, there would be warnings such as:

    WARNING: py:class reference target not found: AugurError
    WARNING: py:class reference target not found: DataErrorMethod

It's a good idea to have these doc pages regardless.
---
 docs/api/developer/augur.errors.rst | 7 +++++++
 docs/api/developer/augur.rst        | 2 ++
 docs/api/developer/augur.types.rst  | 7 +++++++
 3 files changed, 16 insertions(+)
 create mode 100644 docs/api/developer/augur.errors.rst
 create mode 100644 docs/api/developer/augur.types.rst

diff --git a/docs/api/developer/augur.errors.rst b/docs/api/developer/augur.errors.rst
new file mode 100644
index 000000000..b35076ca3
--- /dev/null
+++ b/docs/api/developer/augur.errors.rst
@@ -0,0 +1,7 @@
+augur.errors
+============
+
+.. automodule:: augur.errors
+   :members:
+   :undoc-members:
+   :show-inheritance:
diff --git a/docs/api/developer/augur.rst b/docs/api/developer/augur.rst
index 835535653..d7fc72359 100644
--- a/docs/api/developer/augur.rst
+++ b/docs/api/developer/augur.rst
@@ -13,6 +13,7 @@ augur
    augur.clades
    augur.dates
    augur.distance
+   augur.errors
    augur.export
    augur.export_v1
    augur.export_v2
@@ -35,6 +36,7 @@ augur
    augur.traits
    augur.translate
    augur.tree
+   augur.types
    augur.util_support
    augur.utils
    augur.validate
diff --git a/docs/api/developer/augur.types.rst b/docs/api/developer/augur.types.rst
new file mode 100644
index 000000000..d37872b71
--- /dev/null
+++ b/docs/api/developer/augur.types.rst
@@ -0,0 +1,7 @@
+augur.types
+===========
+
+.. automodule:: augur.types
+   :members:
+   :undoc-members:
+   :show-inheritance:

From 9b896c211378d0ac44847d879e86f200fe55972e Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Thu, 19 Jan 2023 12:44:10 -0800
Subject: [PATCH 16/36] fix: Ignore JSONEncoder/JSONDecodeError

---
 docs/conf.py | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/docs/conf.py b/docs/conf.py
index ff7ea77ec..875637583 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -125,6 +125,11 @@ def prose_list(items):
     # ¹ https://numpydoc.readthedocs.io/en/v1.5.0/format.html#parameters
     ('py:class', 'optional'),
     ('py:class', 'iterable'),
+
+     # Some references get translated to these, but somehow they can't get
+     # resolved by intersphinx for a proper link.
+     ("py:class", "json.decoder.JSONDecodeError"),
+     ("py:class", "json.encoder.JSONEncoder"),
 ]
 
 # -- Cross-project references ------------------------------------------------

From 445d73fa253340a81274cacddacc0f6fdafd317c Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Thu, 19 Jan 2023 12:02:15 -0800
Subject: [PATCH 17/36] fix: Correct type of pivots

np.isscalar will return true for a float, but that is not valid input
for np.linspace(num).

Instead, allow native or numpy integers.
---
 augur/frequency_estimators.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/augur/frequency_estimators.py b/augur/frequency_estimators.py
index 92206003b..5a05623f1 100644
--- a/augur/frequency_estimators.py
+++ b/augur/frequency_estimators.py
@@ -92,7 +92,7 @@ def make_pivots(pivots, tps):
 
     Parameters
     ----------
-    pivots : scalar or iterable
+    pivots : int or iterable
         either number of pivots (a scalar) or the actual pivots
         (will be cast to array and returned)
     tps : np.array
@@ -103,7 +103,7 @@ def make_pivots(pivots, tps):
     pivots : np.array
         array of pivot values
     '''
-    if np.isscalar(pivots):
+    if isinstance(pivots, int):
         dt = np.max(tps)-np.min(tps)
         return np.linspace(np.min(tps)-0.01*dt, np.max(tps)+0.01*dt, pivots)
     else:

From ee5890731365741f71209c1acfa0c66679abc624 Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Wed, 18 Jan 2023 12:09:47 -0800
Subject: [PATCH 18/36] fix numpydoc: Address obvious style syntax issues

---
 augur/align.py                | 11 +++++------
 augur/ancestral.py            |  2 +-
 augur/clades.py               |  4 ++--
 augur/frequency_estimators.py | 16 ++++++++--------
 augur/index.py                |  2 +-
 augur/io/sequences.py         |  2 +-
 augur/titer_model.py          |  4 ++--
 augur/traits.py               |  2 +-
 augur/utils.py                |  2 +-
 9 files changed, 22 insertions(+), 23 deletions(-)

diff --git a/augur/align.py b/augur/align.py
index a95df36ba..93745c418 100644
--- a/augur/align.py
+++ b/augur/align.py
@@ -67,7 +67,8 @@ def prepare(sequences, existing_aln_fname, output, ref_name, ref_seq_fname):
 
     Returns
     -------
-        tuple: The existing alignment filename, the new sequences filename, and the name of the reference sequence.
+    tuple of str
+        The existing alignment filename, the new sequences filename, and the name of the reference sequence.
     """
     seqs = read_sequences(*sequences)
     seqs_to_align_fname = output + ".to_align.fasta"
@@ -104,7 +105,7 @@ def run(args):
     '''
     Parameters
     ----------
-    args : namespace
+    args : argparse.Namespace
         arguments passed in via the command-line from augur
 
     Returns
@@ -152,6 +153,8 @@ def run(args):
 def postprocess(output_file, ref_name, keep_reference, fill_gaps):
     """Postprocessing of the combined alignment file.
 
+    The modified alignment is written directly to output_file.
+
     Parameters
     ----------
     output_file: str
@@ -162,10 +165,6 @@ def postprocess(output_file, ref_name, keep_reference, fill_gaps):
         If the reference was provided, whether it should be kept in the alignment
     fill_gaps: bool
         Replace all gaps in the alignment with "N" to indicate ambiguous sites.
-
-    Returns
-    -------
-        None - the modified alignment is written directly to output_file
     """
     # -- ref_name --
     # reads the new alignment
diff --git a/augur/ancestral.py b/augur/ancestral.py
index 97279cbd4..9d84c9b6e 100644
--- a/augur/ancestral.py
+++ b/augur/ancestral.py
@@ -28,7 +28,7 @@ def ancestral_sequence_inference(tree=None, aln=None, ref=None, infer_gtr=True,
 
     Parameters
     ----------
-    tree : Bio.Phylo tree or str
+    tree : Bio.Phylo or str
         tree or filename of tree
     aln : Bio.Align.MultipleSeqAlignment or str
         alignment or filename of alignment
diff --git a/augur/clades.py b/augur/clades.py
index b3155f438..66d188eff 100644
--- a/augur/clades.py
+++ b/augur/clades.py
@@ -126,7 +126,7 @@ def is_node_in_clade(clade_alleles, node, ref):
         list of clade defining alleles
     node : Phylo.Node
         node to check, assuming sequences (as mutations) are attached to node
-    ref : str/list
+    ref : str or list
         positions
 
     Returns
@@ -164,7 +164,7 @@ def assign_clades(clade_designations, all_muts, tree, ref=None):
         mutations in each node
     tree : Phylo.Tree
         phylogenetic tree to process
-    ref : str/list, optional
+    ref : str or list, optional
         reference sequence to look up state when not mutated
 
     Returns
diff --git a/augur/frequency_estimators.py b/augur/frequency_estimators.py
index 5a05623f1..db4886b64 100644
--- a/augur/frequency_estimators.py
+++ b/augur/frequency_estimators.py
@@ -124,7 +124,7 @@ def running_average(obs, ws):
 
     Parameters
     ----------
-    obs : list/np.array(bool)
+    obs : list or np.array(bool)
         observations
     ws : int
         window size as measured in number of consecutive points
@@ -200,11 +200,11 @@ def __init__(self, tps, obs, pivots, stiffness = 20.0,
 
         Parameters
         ----------
-        tps : list/np.array(float)
+        tps : list or np.array(float)
             array with numerical dates
-        obs : list/np.array(bool)
+        obs : list or np.array(bool)
             array with boolean observations
-        pivots : int/np.array(float)
+        pivots : int or np.array(float)
             either integer specifying the number of pivot values,
             or list of explicity pivots
         stiffness : float, optional
@@ -476,9 +476,9 @@ def __init__(self, tree, pivots, node_filter=None, min_clades=10, verbose=0, pc=
 
         Parameters
         ----------
-        tree : Bio.Phylo.calde
+        tree : Bio.Phylo
             Biopython tree
-        pivots : int/array
+        pivots : int or array
             number or list of pivots
         node_filter : callable, optional
             function that evaluates to true/false to filter nodes
@@ -676,7 +676,7 @@ def mutation_frequencies(self, min_freq=0.01, include_set=None, ignore_char=''):
         ----------
         min_freq : float, optional
             minimal all-time frequency for an aligment column to be considered
-        include_set : list/set, optional
+        include_set : list or set, optional
             set of alignment column that will be used regardless of variation
         ignore_char : str, optional
             ignore this character in an alignment column (missing data)
@@ -1133,7 +1133,7 @@ def estimate(self, tree):
             tree (Bio.Phylo): annotated tree whose nodes all have an `attr` attribute with at least  "num_date" key
 
         Returns:
-            frequencies (dict): node frequencies by clade
+            dict: node frequencies by clade
 
         """
         # Calculate pivots for the given tree.
diff --git a/augur/index.py b/augur/index.py
index e53401607..ae7715293 100644
--- a/augur/index.py
+++ b/augur/index.py
@@ -63,7 +63,7 @@ def index_sequence(sequence, values):
     sequence : Bio.SeqRecord.SeqRecord
         sequence record to index.
 
-    values : list of sets of str
+    values : list of set of str
         values to count; sets must be non-overlapping and contain only
         single-character, lowercase strings
 
diff --git a/augur/io/sequences.py b/augur/io/sequences.py
index 7569b05d9..ce0b6118e 100644
--- a/augur/io/sequences.py
+++ b/augur/io/sequences.py
@@ -44,7 +44,7 @@ def write_sequences(sequences, path_or_buffer, format="fasta"):
 
     Parameters
     ----------
-    sequences : iterable of Bio.SeqRecord.SeqRecord objects
+    sequences : iterable of Bio.SeqRecord.SeqRecord
         A list-like collection of sequences to write
 
     path_or_buffer : str or Path-like object or IO buffer
diff --git a/augur/titer_model.py b/augur/titer_model.py
index 8b60848da..d146892e1 100644
--- a/augur/titer_model.py
+++ b/augur/titer_model.py
@@ -32,7 +32,7 @@ def load_from_file(filenames, excluded_sources=None):
 
         Returns
         -------
-        tuple (dict, list, list)
+        tuple
             tuple of a dict of titer measurements, list of strains, list of sources
 
 
@@ -139,7 +139,7 @@ def count_strains(titers):
 
         Parameters
         ----------
-        titers : defaultdict
+        titers : collections.defaultdict
             titer measurements indexed by test, reference,
             and serum
 
diff --git a/augur/traits.py b/augur/traits.py
index d4879a666..24deba1b2 100644
--- a/augur/traits.py
+++ b/augur/traits.py
@@ -129,7 +129,7 @@ def run(args):
 
     Parameters
     ----------
-    args : namespace
+    args : argparse.Namespace
         command line arguments are parsed by argparse
     """
     tree_fname = args.tree
diff --git a/augur/utils.py b/augur/utils.py
index b5146c75d..b13ccbd12 100644
--- a/augur/utils.py
+++ b/augur/utils.py
@@ -543,7 +543,7 @@ def read_strains(*files, comment_char="#"):
 
     Parameters
     ----------
-    files : one or more str
+    files : iterable of str
         one or more names of text files with one strain name per line
 
     Returns

From 3071e378772e6c284456f6b57c9966bfd20a67b1 Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Wed, 18 Jan 2023 12:22:07 -0800
Subject: [PATCH 19/36] fix numpydoc: Add an extra line between numpydoc and
 doctest

Otherwise the doctest gets parsed as numpydoc.
---
 augur/align.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/augur/align.py b/augur/align.py
index 93745c418..860e105d8 100644
--- a/augur/align.py
+++ b/augur/align.py
@@ -279,6 +279,7 @@ def strip_non_reference(aln, reference, insertion_csv=None):
     list
         list of trimmed sequences, effectively a multiple alignment
 
+
     >>> [s.name for s in strip_non_reference(read_alignment("tests/data/align/test_aligned_sequences.fasta"), "with_gaps")]
     Trimmed gaps in with_gaps from the alignment
     ['with_gaps', 'no_gaps', 'some_other_seq', '_R_crick_strand']

From cc77b51a78cb3ede6244934b84b012542f45d226 Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Wed, 18 Jan 2023 12:31:23 -0800
Subject: [PATCH 20/36] fix numpydoc: Add Examples section for doctests
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Better than the previous commit since this section is made for doctest examples¹.

¹ https://numpydoc.readthedocs.io/en/v1.5.0/format.html#examples
---
 augur/align.py                        |  3 ++-
 augur/dates/__init__.py               |  2 ++
 augur/distance.py                     |  6 +++--
 augur/export_v2.py                    |  5 +++-
 augur/filter/include_exclude_rules.py | 39 ++++++++++++++++++---------
 augur/filter/io.py                    |  3 ++-
 augur/filter/subsample.py             | 13 ++++++++-
 augur/frequency_estimators.py         |  2 ++
 augur/index.py                        |  3 ++-
 augur/io/json.py                      |  8 ++++++
 augur/io/metadata.py                  |  2 ++
 augur/io/vcf.py                       |  2 ++
 augur/titer_model.py                  | 11 +++++---
 augur/translate.py                    |  2 ++
 augur/tree.py                         |  3 ++-
 augur/utils.py                        |  5 ++++
 augur/validate.py                     |  2 ++
 scripts/diff_trees.py                 |  2 ++
 18 files changed, 89 insertions(+), 24 deletions(-)

diff --git a/augur/align.py b/augur/align.py
index 860e105d8..40f1068e9 100644
--- a/augur/align.py
+++ b/augur/align.py
@@ -279,7 +279,8 @@ def strip_non_reference(aln, reference, insertion_csv=None):
     list
         list of trimmed sequences, effectively a multiple alignment
 
-
+    Examples
+    --------
     >>> [s.name for s in strip_non_reference(read_alignment("tests/data/align/test_aligned_sequences.fasta"), "with_gaps")]
     Trimmed gaps in with_gaps from the alignment
     ['with_gaps', 'no_gaps', 'some_other_seq', '_R_crick_strand']
diff --git a/augur/dates/__init__.py b/augur/dates/__init__.py
index c8969ba19..472d208be 100644
--- a/augur/dates/__init__.py
+++ b/augur/dates/__init__.py
@@ -25,6 +25,8 @@ def numeric_date(date):
     2. A string in the YYYY-MM-DD (ISO 8601) syntax
     3. A string representing a relative date (duration before datetime.date.today())
 
+    Examples
+    --------
     >>> numeric_date("2020.42")
     2020.42
     >>> numeric_date("2020-06-04")
diff --git a/augur/distance.py b/augur/distance.py
index 7ee432528..4520709b2 100644
--- a/augur/distance.py
+++ b/augur/distance.py
@@ -187,7 +187,8 @@ def read_distance_map(map_file):
     dict :
         Python representation of the distance map JSON
 
-
+    Examples
+    --------
     >>> sorted(read_distance_map("tests/data/distance_map_weight_per_site.json").items())
     [('default', 0), ('map', {'HA1': {144: 1}})]
     >>> sorted(read_distance_map("tests/data/distance_map_weight_per_site_and_sequence.json").items())
@@ -247,7 +248,8 @@ def get_distance_between_nodes(node_a_sequences, node_b_sequences, distance_map,
     float :
         distance between node sequences based on the given map
 
-
+    Examples
+    --------
     >>> node_a_sequences = {"gene": "ACTG"}
     >>> node_b_sequences = {"gene": "ACGG"}
     >>> distance_map = {"default": 0, "map": {}}
diff --git a/augur/export_v2.py b/augur/export_v2.py
index 19bb6933c..3f606c797 100644
--- a/augur/export_v2.py
+++ b/augur/export_v2.py
@@ -584,7 +584,8 @@ def set_data_provenance(data_json, config):
     config : dict
         config JSON with an expected ``data_provenance`` key
 
-
+    Examples
+    --------
     >>> config = {"data_provenance": [{"name": "GISAID"}, {"name": "INSDC"}]}
     >>> data_json = {"meta": {}}
     >>> set_data_provenance(data_json, config)
@@ -600,6 +601,8 @@ def counter_to_disambiguation_suffix(count):
     """Given a numeric count of author papers, return a distinct alphabetical
     disambiguation suffix.
 
+    Examples
+    --------
     >>> counter_to_disambiguation_suffix(0)
     'A'
     >>> counter_to_disambiguation_suffix(25)
diff --git a/augur/filter/include_exclude_rules.py b/augur/filter/include_exclude_rules.py
index 8024000e8..6252b4a36 100644
--- a/augur/filter/include_exclude_rules.py
+++ b/augur/filter/include_exclude_rules.py
@@ -27,7 +27,8 @@ def filter_by_exclude_all(metadata):
     set[str]:
         Empty set of strains
 
-
+    Examples
+    --------
     >>> metadata = pd.DataFrame([{"region": "Africa"}, {"region": "Europe"}], index=["strain1", "strain2"])
     >>> filter_by_exclude_all(metadata)
     set()
@@ -50,7 +51,8 @@ def filter_by_exclude(metadata, exclude_file):
     set[str]:
         Strains that pass the filter
 
-
+    Examples
+    --------
     >>> import os
     >>> from tempfile import NamedTemporaryFile
     >>> metadata = pd.DataFrame([{"region": "Africa"}, {"region": "Europe"}], index=["strain1", "strain2"])
@@ -82,7 +84,8 @@ def parse_filter_query(query):
     str :
         Value of column to query
 
-
+    Examples
+    --------
     >>> parse_filter_query("property=value")
     ('property', <built-in function eq>, 'value')
     >>> parse_filter_query("property!=value")
@@ -117,7 +120,8 @@ def filter_by_exclude_where(metadata, exclude_where):
     set[str]:
         Strains that pass the filter
 
-
+    Examples
+    --------
     >>> metadata = pd.DataFrame([{"region": "Africa"}, {"region": "Europe"}], index=["strain1", "strain2"])
     >>> filter_by_exclude_where(metadata, "region!=Europe")
     {'strain2'}
@@ -169,7 +173,8 @@ def filter_by_query(metadata, query):
     set[str]:
         Strains that pass the filter
 
-
+    Examples
+    --------
     >>> metadata = pd.DataFrame([{"region": "Africa"}, {"region": "Europe"}], index=["strain1", "strain2"])
     >>> filter_by_query(metadata, "region == 'Africa'")
     {'strain1'}
@@ -198,7 +203,8 @@ def filter_by_ambiguous_date(metadata, date_column="date", ambiguity="any"):
     set[str]:
         Strains that pass the filter
 
-
+    Examples
+    --------
     >>> metadata = pd.DataFrame([{"region": "Africa", "date": "2020-01-XX"}, {"region": "Europe", "date": "2020-01-02"}], index=["strain1", "strain2"])
     >>> filter_by_ambiguous_date(metadata)
     {'strain2'}
@@ -241,7 +247,8 @@ def filter_by_date(metadata, date_column="date", min_date=None, max_date=None):
     set[str]:
         Strains that pass the filter
 
-
+    Examples
+    --------
     >>> metadata = pd.DataFrame([{"region": "Africa", "date": "2020-01-01"}, {"region": "Europe", "date": "2020-01-02"}], index=["strain1", "strain2"])
     >>> filter_by_date(metadata, min_date=numeric_date("2020-01-02"))
     {'strain2'}
@@ -312,7 +319,8 @@ def filter_by_sequence_index(metadata, sequence_index):
     set[str]:
         Strains that pass the filter
 
-
+    Examples
+    --------
     >>> metadata = pd.DataFrame([{"region": "Africa", "date": "2020-01-01"}, {"region": "Europe", "date": "2020-01-02"}], index=["strain1", "strain2"])
     >>> sequence_index = pd.DataFrame([{"strain": "strain1", "ACGT": 28000}]).set_index("strain")
     >>> filter_by_sequence_index(metadata, sequence_index)
@@ -342,7 +350,8 @@ def filter_by_sequence_length(metadata, sequence_index, min_length=0):
     set[str]:
         Strains that pass the filter
 
-
+    Examples
+    --------
     >>> metadata = pd.DataFrame([{"region": "Africa", "date": "2020-01-01"}, {"region": "Europe", "date": "2020-01-02"}], index=["strain1", "strain2"])
     >>> sequence_index = pd.DataFrame([{"strain": "strain1", "A": 7000, "C": 7000, "G": 7000, "T": 7000}, {"strain": "strain2", "A": 6500, "C": 6500, "G": 6500, "T": 6500}]).set_index("strain")
     >>> filter_by_sequence_length(metadata, sequence_index, min_length=27000)
@@ -379,7 +388,8 @@ def filter_by_non_nucleotide(metadata, sequence_index):
     set[str]:
         Strains that pass the filter
 
-
+    Examples
+    --------
     >>> metadata = pd.DataFrame([{"region": "Africa", "date": "2020-01-01"}, {"region": "Europe", "date": "2020-01-02"}], index=["strain1", "strain2"])
     >>> sequence_index = pd.DataFrame([{"strain": "strain1", "invalid_nucleotides": 0}, {"strain": "strain2", "invalid_nucleotides": 1}]).set_index("strain")
     >>> filter_by_non_nucleotide(metadata, sequence_index)
@@ -410,7 +420,8 @@ def force_include_strains(metadata, include_file):
     set[str]:
         Strains that pass the filter
 
-
+    Examples
+    --------
     >>> import os
     >>> from tempfile import NamedTemporaryFile
     >>> metadata = pd.DataFrame([{"region": "Africa"}, {"region": "Europe"}], index=["strain1", "strain2"])
@@ -445,7 +456,8 @@ def force_include_where(metadata, include_where):
     set[str]:
         Strains that pass the filter
 
-
+    Examples
+    --------
     >>> metadata = pd.DataFrame([{"region": "Africa"}, {"region": "Europe"}], index=["strain1", "strain2"])
     >>> force_include_where(metadata, "region!=Europe")
     {'strain1'}
@@ -647,7 +659,8 @@ def apply_filters(metadata, exclude_by, include_by):
     For example, filter data by minimum date, but force the include of strains
     from Africa.
 
-
+    Examples
+    --------
     >>> metadata = pd.DataFrame([{"region": "Africa", "date": "2020-01-01"}, {"region": "Europe", "date": "2020-10-02"}, {"region": "North America", "date": "2020-01-01"}], index=["strain1", "strain2", "strain3"])
     >>> exclude_by = [(filter_by_date, {"min_date": numeric_date("2020-04-01")})]
     >>> include_by = [(force_include_where, {"include_where": "region=Africa"})]
diff --git a/augur/filter/io.py b/augur/filter/io.py
index a7dc11933..75d542212 100644
--- a/augur/filter/io.py
+++ b/augur/filter/io.py
@@ -41,7 +41,8 @@ def filter_kwargs_to_str(kwargs):
     str :
         String representation of the kwargs for reporting.
 
-
+    Examples
+    --------
     >>> from augur.dates import numeric_date
     >>> from augur.filter.include_exclude_rules import filter_by_sequence_length, filter_by_date
     >>> sequence_index = pd.DataFrame([{"strain": "strain1", "ACGT": 28000}, {"strain": "strain2", "ACGT": 26000}, {"strain": "strain3", "ACGT": 5000}]).set_index("strain")
diff --git a/augur/filter/subsample.py b/augur/filter/subsample.py
index 823d01388..7f60af69d 100644
--- a/augur/filter/subsample.py
+++ b/augur/filter/subsample.py
@@ -31,7 +31,8 @@ def get_groups_for_subsampling(strains, metadata, group_by=None):
     list :
         A list of dictionaries with strains that were skipped from grouping and the reason why (see also: `apply_filters` output).
 
-
+    Examples
+    --------
     >>> strains = ["strain1", "strain2"]
     >>> metadata = pd.DataFrame([{"strain": "strain1", "date": "2020-01-01", "region": "Africa"}, {"strain": "strain2", "date": "2020-02-01", "region": "Europe"}]).set_index("strain")
     >>> group_by = ["region"]
@@ -253,6 +254,9 @@ class PriorityQueue:
     """A priority queue implementation that automatically replaces lower priority
     items in the heap with incoming higher priority items.
 
+    Examples
+    --------
+
     Add a single record to a heap with a maximum of 2 records.
 
     >>> queue = PriorityQueue(max_size=2)
@@ -334,6 +338,9 @@ def create_queues_by_group(groups, max_size, max_attempts=100, random_seed=None)
     attempts to create queues for which the sum of their maximum sizes is
     greater than zero.
 
+    Examples
+    --------
+
     Create queues for two groups with a fixed maximum size.
 
     >>> groups = ("2015", "2016")
@@ -477,6 +484,8 @@ def _calculate_sequences_per_group(
         maximum number of sequences allowed per group to meet the required maximum total
         sequences allowed
 
+    Examples
+    --------
     >>> _calculate_sequences_per_group(4, [4, 2])
     2
     >>> _calculate_sequences_per_group(2, [4, 2])
@@ -532,6 +541,8 @@ def _calculate_fractional_sequences_per_group(
         fractional maximum number of sequences allowed per group to meet the
         required maximum total sequences allowed
 
+    Examples
+    --------
     >>> np.around(_calculate_fractional_sequences_per_group(4, [4, 2]), 4)
     1.9375
     >>> np.around(_calculate_fractional_sequences_per_group(2, [4, 2]), 4)
diff --git a/augur/frequency_estimators.py b/augur/frequency_estimators.py
index db4886b64..eccccf3d6 100644
--- a/augur/frequency_estimators.py
+++ b/augur/frequency_estimators.py
@@ -852,6 +852,8 @@ def timestamp_to_float(time):
     This is not entirely accurate as it doesn't account for months with different
     numbers of days, but should be close enough to be accurate for weekly pivots.
 
+    Examples
+    --------
     >>> import datetime
     >>> time = datetime.date(2010, 10, 1)
     >>> timestamp_to_float(time)
diff --git a/augur/index.py b/augur/index.py
index ae7715293..15a07ff7c 100644
--- a/augur/index.py
+++ b/augur/index.py
@@ -74,7 +74,8 @@ def index_sequence(sequence, values):
         for the given values, and a final column with the number of characters
         that didn't match any of those in the given values.
 
-
+    Examples
+    --------
     >>> other_IUPAC = {'r', 'y', 's', 'w', 'k', 'm', 'd', 'h', 'b', 'v'}
     >>> values = [{'a'},{'c'},{'g'},{'t'},{'n'}, other_IUPAC, {'-'}, {'?'}]
     >>> sequence_a = Bio.SeqRecord.SeqRecord(seq=Bio.Seq.Seq("ACTGN-?XWN"), id="seq_A")
diff --git a/augur/io/json.py b/augur/io/json.py
index af75a1796..2a4678ea2 100644
--- a/augur/io/json.py
+++ b/augur/io/json.py
@@ -118,6 +118,8 @@ class JSONDecodeError(json.JSONDecodeError):
     raised by :func:`load_json` and be caught by except blocks which catch the
     standard :class:`json.JSONDecodeError`.
 
+    Examples
+    --------
     >>> load_json('{foo: "bar"}')
     Traceback (most recent call last):
         ...
@@ -218,6 +220,8 @@ def shorten_left(text, length, placeholder):
     intended for shortening sentences and works at the word, not character,
     level.
 
+    Examples
+    --------
     >>> shorten_left("foobar", 6, "...")
     'foobar'
     >>> shorten_left("foobarbaz", 6, "...")
@@ -244,6 +248,8 @@ def contextualize_char(text, idx, context = 10):
     Avoids making a copy of *text* before snipping, in case *text* is very
     large.
 
+    Examples
+    --------
     >>> contextualize_char('hello world', 0, context = 4)
     '▸▸▸h◂◂◂ello…'
     >>> contextualize_char('hello world', 5, context = 3)
@@ -277,6 +283,8 @@ def mark_char(text, idx):
     """
     Prominently marks the *idx* char in *text*.
 
+    Examples
+    --------
     >>> mark_char('hello world', 0)
     '▸▸▸h◂◂◂ello world'
     >>> mark_char('hello world', 2)
diff --git a/augur/io/metadata.py b/augur/io/metadata.py
index c4ee27c62..a6e635a21 100644
--- a/augur/io/metadata.py
+++ b/augur/io/metadata.py
@@ -34,6 +34,8 @@ def read_metadata(metadata_file, id_columns=("strain", "name"), chunk_size=None)
     KeyError :
         When the metadata file does not have any valid index columns.
 
+    Examples
+    --------
 
     For standard use, request a metadata file and get a pandas DataFrame.
 
diff --git a/augur/io/vcf.py b/augur/io/vcf.py
index 9d65dda00..9808c5d38 100644
--- a/augur/io/vcf.py
+++ b/augur/io/vcf.py
@@ -10,6 +10,8 @@
 def is_vcf(filename):
     """Convenience method to check if a file is a vcf file.
 
+    Examples
+    --------
     >>> is_vcf(None)
     False
     >>> is_vcf("./foo")
diff --git a/augur/titer_model.py b/augur/titer_model.py
index d146892e1..9e3f84cac 100644
--- a/augur/titer_model.py
+++ b/augur/titer_model.py
@@ -35,7 +35,8 @@ def load_from_file(filenames, excluded_sources=None):
         tuple
             tuple of a dict of titer measurements, list of strains, list of sources
 
-
+        Examples
+        --------
         >>> measurements, strains, sources = TiterCollection.load_from_file("tests/data/titer_model/h3n2_titers_subset.tsv")
         >>> type(measurements)
         <class 'dict'>
@@ -148,7 +149,8 @@ def count_strains(titers):
         dict
             number of measurements per strain
 
-
+        Examples
+        --------
         >>> measurements, strains, sources = TiterCollection.load_from_file("tests/data/titer_model/h3n2_titers_subset.tsv")
         >>> titer_counts = TiterCollection.count_strains(measurements)
         >>> titer_counts["A/Acores/11/2013"]
@@ -184,7 +186,8 @@ def filter_strains(titers, strains):
             reduced dictionary of titer measurements containing only those were
             test and reference virus are part of the strain list
 
-
+        Examples
+        --------
         >>> measurements, strains, sources = TiterCollection.load_from_file("tests/data/titer_model/h3n2_titers_subset.tsv")
         >>> len(measurements)
         11
@@ -321,6 +324,8 @@ def strain_census(self, titers):
         make lists of reference viruses, test viruses and sera
         (there are often multiple sera per reference virus)
 
+        Examples
+        --------
         >>> measurements, strains, sources = TiterCollection.load_from_file("tests/data/titer_model/h3n2_titers_subset.tsv")
         >>> titers = TiterCollection(measurements)
         >>> sera, ref_strains, test_strains = titers.strain_census(measurements)
diff --git a/augur/translate.py b/augur/translate.py
index ca58fc75a..32c2598ad 100644
--- a/augur/translate.py
+++ b/augur/translate.py
@@ -33,6 +33,8 @@ def safe_translate(sequence, report_exceptions=False):
     Optionally, returns a tuple of the translated sequence and whether an
     exception was raised during initial translation.
 
+    Examples
+    --------
     >>> safe_translate("ATG")
     'M'
     >>> safe_translate("ATGGT-")
diff --git a/augur/tree.py b/augur/tree.py
index c6f60d8a8..63754cfb0 100644
--- a/augur/tree.py
+++ b/augur/tree.py
@@ -68,7 +68,8 @@ def check_conflicting_args(tree_builder_args, defaults):
     ConflictingArgumentsException
         When any user-provided arguments match those in the defaults.
 
-
+    Examples
+    --------
     >>> defaults = ("-ntmax", "-m", "-s")
     >>> check_conflicting_args("-czb -n 2", defaults)
     >>> check_conflicting_args("-czb -ntmax 2", defaults)
diff --git a/augur/utils.py b/augur/utils.py
index b13ccbd12..e17064b30 100644
--- a/augur/utils.py
+++ b/augur/utils.py
@@ -318,6 +318,8 @@ def get_parent_name_by_child_name_for_tree(tree):
 def annotate_parents_for_tree(tree):
     """Annotate each node in the given tree with its parent.
 
+    Examples
+    --------
     >>> import io
     >>> tree = Bio.Phylo.read(io.StringIO("(A, (B, C))"), "newick")
     >>> not any([hasattr(node, "parent") for node in tree.find_clades()])
@@ -343,6 +345,9 @@ def json_to_tree(json_dict, root=True, parent_cumulative_branch_length=None):
 
     Assigns links back to parent nodes for the root of the tree.
 
+    Examples
+    --------
+
     Test opening a JSON from augur export v1.
 
     >>> import json
diff --git a/augur/validate.py b/augur/validate.py
index cb9f9526f..4860eca3f 100644
--- a/augur/validate.py
+++ b/augur/validate.py
@@ -203,6 +203,8 @@ def get_unique_keys(list_of_dicts):
     """
     Returns a set of unique keys from a list of dicts
 
+    Examples
+    --------
     >>> list_of_dicts = [{"key1": "val1", "key2": "val2"}, {"key1": "val1", "key3": "val3"}]
     >>> sorted(get_unique_keys(list_of_dicts))
     ['key1', 'key2', 'key3']
diff --git a/scripts/diff_trees.py b/scripts/diff_trees.py
index 27fd48334..ed652f441 100644
--- a/scripts/diff_trees.py
+++ b/scripts/diff_trees.py
@@ -7,6 +7,8 @@ def clade_to_items(clade, attrs=("name", "branch_length")):
     """Recursively convert a clade of a tree to a list of nested lists according to
     the topology of the clade with the requested attributes per node.
 
+    Examples
+    --------
     >>> from io import StringIO
     >>> treedata = "(A, (B, C), (D, E))"
     >>> handle = StringIO(treedata)

From cc52176207bcc607c078ccabaed68d2af3f15d75 Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Wed, 18 Jan 2023 13:43:19 -0800
Subject: [PATCH 21/36] fix numpydoc: Use numpydoc type hints
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The bracket notation is suggested by PyCharm¹ but is not standard
numpydoc.

¹ https://www.jetbrains.com/help/pycharm/type-syntax-for-docstrings.html
---
 augur/align.py                        |  2 +-
 augur/filter/include_exclude_rules.py | 30 +++++++++++++--------------
 augur/filter/subsample.py             |  2 +-
 augur/io/metadata.py                  |  4 ++--
 augur/io/sequences.py                 |  2 +-
 augur/mask.py                         |  6 +++---
 augur/utils.py                        |  6 +++---
 7 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/augur/align.py b/augur/align.py
index 40f1068e9..edd76f8d8 100644
--- a/augur/align.py
+++ b/augur/align.py
@@ -54,7 +54,7 @@ def prepare(sequences, existing_aln_fname, output, ref_name, ref_seq_fname):
 
     Parameters
     ----------
-    sequences : list[str]
+    sequences : list of str
         List of paths to FASTA-formatted sequences to align.
     existing_aln_fname : str
         Path of an existing alignment to use, or None
diff --git a/augur/filter/include_exclude_rules.py b/augur/filter/include_exclude_rules.py
index 6252b4a36..6cfda62f5 100644
--- a/augur/filter/include_exclude_rules.py
+++ b/augur/filter/include_exclude_rules.py
@@ -24,7 +24,7 @@ def filter_by_exclude_all(metadata):
 
     Returns
     -------
-    set[str]:
+    set of str:
         Empty set of strains
 
     Examples
@@ -48,7 +48,7 @@ def filter_by_exclude(metadata, exclude_file):
 
     Returns
     -------
-    set[str]:
+    set of str:
         Strains that pass the filter
 
     Examples
@@ -117,7 +117,7 @@ def filter_by_exclude_where(metadata, exclude_where):
 
     Returns
     -------
-    set[str]:
+    set of str:
         Strains that pass the filter
 
     Examples
@@ -170,7 +170,7 @@ def filter_by_query(metadata, query):
 
     Returns
     -------
-    set[str]:
+    set of str:
         Strains that pass the filter
 
     Examples
@@ -200,7 +200,7 @@ def filter_by_ambiguous_date(metadata, date_column="date", ambiguity="any"):
 
     Returns
     -------
-    set[str]:
+    set of str:
         Strains that pass the filter
 
     Examples
@@ -244,7 +244,7 @@ def filter_by_date(metadata, date_column="date", min_date=None, max_date=None):
 
     Returns
     -------
-    set[str]:
+    set of str:
         Strains that pass the filter
 
     Examples
@@ -316,7 +316,7 @@ def filter_by_sequence_index(metadata, sequence_index):
 
     Returns
     -------
-    set[str]:
+    set of str:
         Strains that pass the filter
 
     Examples
@@ -347,7 +347,7 @@ def filter_by_sequence_length(metadata, sequence_index, min_length=0):
 
     Returns
     -------
-    set[str]:
+    set of str:
         Strains that pass the filter
 
     Examples
@@ -385,7 +385,7 @@ def filter_by_non_nucleotide(metadata, sequence_index):
 
     Returns
     -------
-    set[str]:
+    set of str:
         Strains that pass the filter
 
     Examples
@@ -417,7 +417,7 @@ def force_include_strains(metadata, include_file):
 
     Returns
     -------
-    set[str]:
+    set of str:
         Strains that pass the filter
 
     Examples
@@ -453,7 +453,7 @@ def force_include_where(metadata, include_where):
 
     Returns
     -------
-    set[str]:
+    set of str:
         Strains that pass the filter
 
     Examples
@@ -638,11 +638,11 @@ def apply_filters(metadata, exclude_by, include_by):
     ----------
     metadata : pandas.DataFrame
         Metadata to filter
-    exclude_by : list[tuple]
+    exclude_by : list of tuple
         A list of 2-element tuples with a callable to filter by in the first
         index and a dictionary of kwargs to pass to the function in the second
         index.
-    include_by : list[tuple]
+    include_by : list of tuple
         A list of 2-element tuples in the same format as the ``exclude_by``
         argument.
 
@@ -650,9 +650,9 @@ def apply_filters(metadata, exclude_by, include_by):
     -------
     set :
         Strains to keep (those that passed all filters)
-    list[dict] :
+    list of dict :
         Strains to exclude along with the function that filtered them and the arguments used to run the function.
-    list[dict] :
+    list of dict :
         Strains to force-include along with the function that filtered them and the arguments used to run the function.
 
 
diff --git a/augur/filter/subsample.py b/augur/filter/subsample.py
index 7f60af69d..104e8aad3 100644
--- a/augur/filter/subsample.py
+++ b/augur/filter/subsample.py
@@ -402,7 +402,7 @@ def calculate_sequences_per_group(target_max_value, group_sizes, allow_probabili
     target_max_value : int
         Maximum number of sequences to return by subsampling at some calculated
         number of sequences per group for the given counts per group.
-    group_sizes : list[int]
+    group_sizes : list of int
         A list with the number of sequences in each requested group.
     allow_probabilistic : bool
         Whether to allow probabilistic subsampling when the number of groups
diff --git a/augur/io/metadata.py b/augur/io/metadata.py
index a6e635a21..8691cbc88 100644
--- a/augur/io/metadata.py
+++ b/augur/io/metadata.py
@@ -20,7 +20,7 @@ def read_metadata(metadata_file, id_columns=("strain", "name"), chunk_size=None)
     ----------
     metadata_file : str
         Path to a metadata file to load.
-    id_columns : list[str]
+    id_columns : list of str
         List of possible id column names to check for, ordered by priority.
     chunk_size : int
         Size of chunks to stream from disk with an iterator instead of loading the entire input file into memory.
@@ -393,7 +393,7 @@ def write_records_to_tsv(records, output_file):
 
     Parameters
     ----------
-    records: iterator[dict]
+    records: iterable of dict
         Iterator that yields dict that contains sequences
 
     output_file: str
diff --git a/augur/io/sequences.py b/augur/io/sequences.py
index ce0b6118e..0abfe898f 100644
--- a/augur/io/sequences.py
+++ b/augur/io/sequences.py
@@ -82,7 +82,7 @@ def write_records_to_fasta(records, fasta, seq_id_field='strain', seq_field='seq
 
     Parameters
     ----------
-    records: iterator[dict]
+    records: iterable of dict
         Iterator that yields dict that contains sequences
 
     fasta: str
diff --git a/augur/mask.py b/augur/mask.py
index 54d3638e7..7a7d58571 100644
--- a/augur/mask.py
+++ b/augur/mask.py
@@ -35,7 +35,7 @@ def mask_vcf(mask_sites, in_file, out_file, cleanup=True):
 
     Parameters
     ----------
-    mask_sites: list[int]
+    mask_sites: list of int
         A list of site indexes to exclude from the vcf.
     in_file: str
         The path to the vcf file you wish to mask.
@@ -86,7 +86,7 @@ def mask_sequence(sequence, mask_sites, mask_from_beginning, mask_from_end, mask
     ----------
     sequence : Bio.SeqIO.SeqRecord
         A sequence to be masked
-    mask_sites: list[int]
+    mask_sites: list of int
         A list of site indexes to exclude from the FASTA.
     mask_from_beginning: int
         Number of sites to mask from the beginning of each sequence (default 0)
@@ -132,7 +132,7 @@ def mask_fasta(mask_sites, in_file, out_file, mask_from_beginning=0, mask_from_e
 
     Parameters
     ----------
-    mask_sites: list[int]
+    mask_sites: list of int
         A list of site indexes to exclude from the FASTA.
     in_file: str
         The path to the FASTA file you wish to mask.
diff --git a/augur/utils.py b/augur/utils.py
index e17064b30..1abf07463 100644
--- a/augur/utils.py
+++ b/augur/utils.py
@@ -461,7 +461,7 @@ def read_bed_file(bed_file):
 
     Returns
     -------
-    list[int]:
+    list of int:
         Sorted list of unique zero-indexed sites
     """
     mask_sites = []
@@ -492,7 +492,7 @@ def read_mask_file(mask_file):
 
     Returns
     -------
-    list[int]:
+    list of int:
         Sorted list of unique zero-indexed sites
     """
     mask_sites = []
@@ -518,7 +518,7 @@ def load_mask_sites(mask_file):
 
     Returns
     -------
-    list[int]
+    list of int
         Sorted list of unique zero-indexed sites
     """
     if mask_file.lower().endswith(".bed"):

From 9ac238626722ca83fc83c8143a9c7d13d4aa13ca Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Thu, 19 Jan 2023 12:12:38 -0800
Subject: [PATCH 22/36] fix numpydoc: Correct type of label

"label" is not a valid type; it should be "str".
---
 augur/sequence_traits.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/augur/sequence_traits.py b/augur/sequence_traits.py
index 87659b13e..573511b57 100644
--- a/augur/sequence_traits.py
+++ b/augur/sequence_traits.py
@@ -256,7 +256,7 @@ def attach_features(annotations, label, count):
     ----------
     annotations : dict
         annotations fo stgrains as globed together by `annotate_strains`
-    label : label
+    label : str
         label of the feature set as specified by as command line argument
     count : str
         if equal to traits, will count the number of distinct features that

From 3a9c6ff03502d6670e203b949330ceef027f71a8 Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Wed, 18 Jan 2023 17:10:40 -0800
Subject: [PATCH 23/36] fix numpydoc: Remove trailing colon from exception
 classes

These resulted in nitpick "reference target not found" warnings.
---
 augur/filter/subsample.py | 2 +-
 augur/io/metadata.py      | 4 ++--
 augur/io/sequences.py     | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/augur/filter/subsample.py b/augur/filter/subsample.py
index 104e8aad3..a79e9101d 100644
--- a/augur/filter/subsample.py
+++ b/augur/filter/subsample.py
@@ -410,7 +410,7 @@ def calculate_sequences_per_group(target_max_value, group_sizes, allow_probabili
 
     Raises
     ------
-    TooManyGroupsError :
+    TooManyGroupsError
         When there are more groups than sequences per group and probabilistic
         subsampling is not allowed.
 
diff --git a/augur/io/metadata.py b/augur/io/metadata.py
index 8691cbc88..839c03bfb 100644
--- a/augur/io/metadata.py
+++ b/augur/io/metadata.py
@@ -31,7 +31,7 @@ def read_metadata(metadata_file, id_columns=("strain", "name"), chunk_size=None)
 
     Raises
     ------
-    KeyError :
+    KeyError
         When the metadata file does not have any valid index columns.
 
     Examples
@@ -129,7 +129,7 @@ def read_table_to_dict(table, duplicate_reporting=DataErrorMethod.ERROR_FIRST, i
 
     Raises
     ------
-    AugurError:
+    AugurError
         Raised for any of the following reasons:
         1. There are parsing errors from the csv standard library
         2. The provided *id_column* does not exist in the *metadata*
diff --git a/augur/io/sequences.py b/augur/io/sequences.py
index 0abfe898f..d1e9ed399 100644
--- a/augur/io/sequences.py
+++ b/augur/io/sequences.py
@@ -101,7 +101,7 @@ def write_records_to_fasta(records, fasta, seq_id_field='strain', seq_field='seq
 
     Raises
     ------
-    AugurError:
+    AugurError
         When the sequence id field or sequence field does not exist in a record
     """
     with open_file(fasta, "w") as output_fasta:

From 2f87db7fb0b2a03398bfe08417e00a1c50c494ba Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Wed, 18 Jan 2023 16:55:08 -0800
Subject: [PATCH 24/36] fix numpydoc: Replace "Path-like" with os.PathLike

---
 augur/index.py        | 8 ++++----
 augur/io/file.py      | 2 +-
 augur/io/sequences.py | 4 ++--
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/augur/index.py b/augur/index.py
index 15a07ff7c..c8b2e6f93 100644
--- a/augur/index.py
+++ b/augur/index.py
@@ -26,9 +26,9 @@ def index_vcf(vcf_path, index_path):
 
     Parameters
     ----------
-    vcf_path : str or Path-like
+    vcf_path : str or `os.PathLike`
         path to a VCF file to index.
-    index_path : str or Path-like
+    index_path : str or `os.PathLike`
         path to a tab-delimited file containing the composition details for each
         sequence in the given input file.
 
@@ -154,10 +154,10 @@ def index_sequences(sequences_path, sequence_index_path):
 
     Parameters
     ----------
-    sequences_path : str or Path-like
+    sequences_path : str or `os.PathLike`
         path to a sequence file to index.
 
-    sequence_index_path : str or Path-like
+    sequence_index_path : str or `os.PathLike`
         path to a tab-delimited file containing the composition details for each
         sequence in the given input file.
 
diff --git a/augur/io/file.py b/augur/io/file.py
index c2c704acb..59e6302e9 100644
--- a/augur/io/file.py
+++ b/augur/io/file.py
@@ -10,7 +10,7 @@ def open_file(path_or_buffer, mode="r", **kwargs):
 
     Parameters
     ----------
-    path_or_buffer : str or Path-like or IO buffer
+    path_or_buffer : str or `os.PathLike` or IO buffer
         Name of the file to open or an existing IO buffer
 
     mode : str
diff --git a/augur/io/sequences.py b/augur/io/sequences.py
index d1e9ed399..6a2f817b6 100644
--- a/augur/io/sequences.py
+++ b/augur/io/sequences.py
@@ -12,7 +12,7 @@ def read_sequences(*paths, format="fasta"):
 
     Parameters
     ----------
-    paths : list of str or Path-like objects
+    paths : list of str or `os.PathLike`
         One or more paths to sequence files of any type supported by BioPython.
 
     format : str
@@ -47,7 +47,7 @@ def write_sequences(sequences, path_or_buffer, format="fasta"):
     sequences : iterable of Bio.SeqRecord.SeqRecord
         A list-like collection of sequences to write
 
-    path_or_buffer : str or Path-like object or IO buffer
+    path_or_buffer : str or `os.PathLike` or IO buffer
         A path to a file to write the given sequences in the given format.
 
     format : str

From 3ad5369b93e4416f0e9d30ffab8dad94a3453491 Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Wed, 18 Jan 2023 16:57:06 -0800
Subject: [PATCH 25/36] fix numpydoc: Replace "IO buffer" with io.StringIO

---
 augur/io/file.py      | 2 +-
 augur/io/sequences.py | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/augur/io/file.py b/augur/io/file.py
index 59e6302e9..146455081 100644
--- a/augur/io/file.py
+++ b/augur/io/file.py
@@ -10,7 +10,7 @@ def open_file(path_or_buffer, mode="r", **kwargs):
 
     Parameters
     ----------
-    path_or_buffer : str or `os.PathLike` or IO buffer
+    path_or_buffer : str or `os.PathLike` or `io.StringIO`
         Name of the file to open or an existing IO buffer
 
     mode : str
diff --git a/augur/io/sequences.py b/augur/io/sequences.py
index 6a2f817b6..3497dbff7 100644
--- a/augur/io/sequences.py
+++ b/augur/io/sequences.py
@@ -47,7 +47,7 @@ def write_sequences(sequences, path_or_buffer, format="fasta"):
     sequences : iterable of Bio.SeqRecord.SeqRecord
         A list-like collection of sequences to write
 
-    path_or_buffer : str or `os.PathLike` or IO buffer
+    path_or_buffer : str or `os.PathLike` or `io.StringIO`
         A path to a file to write the given sequences in the given format.
 
     format : str

From 5b379852a28f445a62f94c3cbd1abfe7aaa92320 Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Wed, 18 Jan 2023 17:06:13 -0800
Subject: [PATCH 26/36] fix numpydoc: Remove docstring from
 traits.register_parser

Other register_parser functions don't have a docstring, and this one is
out of date.
---
 augur/traits.py | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/augur/traits.py b/augur/traits.py
index 24deba1b2..8ff5fb376 100644
--- a/augur/traits.py
+++ b/augur/traits.py
@@ -98,13 +98,6 @@ def mugration_inference(tree=None, seq_meta=None, field='country', confidence=Tr
 
 
 def register_parser(parent_subparsers):
-    """Add subcommand specific arguments
-
-    Parameters
-    ----------
-    parser : argparse
-        subcommand argument parser
-    """
     parser = parent_subparsers.add_parser("traits", help=__doc__)
     parser.add_argument('--tree', '-t', required=True, help="tree to perform trait reconstruction on")
     parser.add_argument('--metadata', required=True, metavar="FILE", help="table with metadata, as CSV or TSV")

From 8865478bdbbc318428d6c3f1dc6490e0bd5a417c Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Wed, 18 Jan 2023 17:15:19 -0800
Subject: [PATCH 27/36] fix numpydoc: Remove "TYPE" placeholders

These were causing nitpick warnings:

    WARNING: py:class reference target not found: TYPE
---
 augur/frequency_estimators.py | 20 ++++++++++----------
 augur/titer_model.py          | 20 ++++++++++----------
 2 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/augur/frequency_estimators.py b/augur/frequency_estimators.py
index eccccf3d6..58550e7b4 100644
--- a/augur/frequency_estimators.py
+++ b/augur/frequency_estimators.py
@@ -329,25 +329,25 @@ class freq_est_clipped(object):
 
     Attributes
     ----------
-    dtps : TYPE
+    dtps
         Description
-    fe : TYPE
+    fe
         Description
-    good_pivots : TYPE
+    good_pivots
         Description
-    good_tps : TYPE
+    good_tps
         Description
-    obs : TYPE
+    obs
         Description
-    pivot_freq : TYPE
+    pivot_freq
         Description
-    pivot_lower_cutoff : TYPE
+    pivot_lower_cutoff
         Description
-    pivot_upper_cutoff : TYPE
+    pivot_upper_cutoff
         Description
-    pivots : TYPE
+    pivots
         Description
-    tps : TYPE
+    tps
         Description
     valid : bool
         Description
diff --git a/augur/titer_model.py b/augur/titer_model.py
index 9e3f84cac..524b6c94f 100644
--- a/augur/titer_model.py
+++ b/augur/titer_model.py
@@ -217,7 +217,7 @@ def __init__(self, titers, **kwargs):
 
         Parameters
         ----------
-        titers : TYPE
+        titers
             Description
         **kwargs
             Description
@@ -251,9 +251,9 @@ def normalize(self, ref, val):
 
         Parameters
         ----------
-        ref : TYPE
+        ref
             Description
-        val : TYPE
+        val
             Description
 
         Returns
@@ -338,7 +338,7 @@ def strain_census(self, titers):
 
         Parameters
         ----------
-        titers : TYPE
+        titers
             Description
 
         Returns
@@ -812,7 +812,7 @@ def cross_validate(self, n, **kwargs):
 
         Parameters
         ----------
-        n : TYPE
+        n
             Description
         **kwargs
             Description
@@ -848,9 +848,9 @@ def get_path_no_terminals(self, v1, v2):
 
         Parameters
         ----------
-        v1 : TYPE
+        v1
             Description
-        v2 : TYPE
+        v2
             Description
 
         Returns
@@ -1034,9 +1034,9 @@ def get_mutations(self, strain1, strain2):
 
         Parameters
         ----------
-        strain1 : TYPE
+        strain1
             Description
-        strain2 : TYPE
+        strain2
             Description
 
         Returns
@@ -1139,7 +1139,7 @@ def collapse_colinear_mutations(self, colin_thres):
 
         Parameters
         ----------
-        colin_thres : TYPE
+        colin_thres
             Description
         '''
         TT = self.design_matrix[:,:self.genetic_params].T

From a8b1c5fab9034d0f29bc91f8443b8f45d4ca1b34 Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Wed, 18 Jan 2023 17:19:53 -0800
Subject: [PATCH 28/36] fix numpydoc: Remove unused placeholders in titer_model

---
 augur/titer_model.py | 85 --------------------------------------------
 1 file changed, 85 deletions(-)

diff --git a/augur/titer_model.py b/augur/titer_model.py
index 524b6c94f..b162cde2a 100644
--- a/augur/titer_model.py
+++ b/augur/titer_model.py
@@ -218,9 +218,7 @@ def __init__(self, titers, **kwargs):
         Parameters
         ----------
         titers
-            Description
         **kwargs
-            Description
         """
         self.kwargs = kwargs
 
@@ -252,14 +250,7 @@ def normalize(self, ref, val):
         Parameters
         ----------
         ref
-            Description
         val
-            Description
-
-        Returns
-        -------
-        TYPE
-            Description
         '''
         consensus_func = np.mean
         return consensus_func(np.log2(self.autologous_titers[ref]['val'])) \
@@ -339,12 +330,6 @@ def strain_census(self, titers):
         Parameters
         ----------
         titers
-            Description
-
-        Returns
-        -------
-        TYPE
-            Description
         """
         sera = set()
         ref_strains = set()
@@ -464,15 +449,10 @@ def _train(self, method='nnl1reg',  lam_drop=1.0, lam_pot = 0.5, lam_avi = 3.0,
         Parameters
         ----------
         method : str, optional
-            Description
         lam_drop : float, optional
-            Description
         lam_pot : float, optional
-            Description
         lam_avi : float, optional
-            Description
         **kwargs
-            Description
         '''
         self.lam_pot = lam_pot
         self.lam_avi = lam_avi
@@ -515,18 +495,9 @@ def validate(self, plot=False, cutoff=0.0, validation_set = None, fname=None):
         Parameters
         ----------
         plot : bool, optional
-            Description
         cutoff : float, optional
-            Description
         validation_set : None, optional
-            Description
         fname : None, optional
-            Description
-
-        Returns
-        -------
-        TYPE
-            Description
         '''
         from scipy.stats import linregress, pearsonr
         if validation_set is None:
@@ -597,11 +568,6 @@ def compile_titers(self):
         during visualization, we need the average distance of a test virus from
         a reference virus across sera. hence the hierarchy [ref][test][serum]
         NOTE: this uses node.name instead of node.clade
-
-        Returns
-        -------
-        TYPE
-            Description
         '''
         def dstruct():
             return defaultdict(dict)
@@ -619,11 +585,6 @@ def compile_potencies(self):
         compile a json structure containing potencies for visualization
         we need rapid access to all sera for a given reference virus, hence
         the structure is organized by [ref][serum]
-
-        Returns
-        -------
-        TYPE
-            Description
         '''
         potency_json = defaultdict(dict)
         for (ref_clade, serum), val in self.serum_potency.items():
@@ -644,11 +605,6 @@ def compile_potencies(self):
     def compile_virus_effects(self):
         '''
         compile a json structure containing virus_effects for visualization
-
-        Returns
-        -------
-        TYPE
-            Description
         '''
         return {test_vir:np.round(val,TITER_ROUND) for test_vir, val in self.virus_effect.items()}
 
@@ -659,11 +615,6 @@ def compile_virus_effects(self):
     def fit_l1reg(self):
         '''
         regularize genetic parameters with an l1 norm regardless of sign
-
-        Returns
-        -------
-        TYPE
-            Description
         '''
         try:
             from cvxopt import matrix, solvers
@@ -723,11 +674,6 @@ def fit_nnl2reg(self):
 
     def fit_nnl1reg(self):
         '''l1 regularization of titer drops with non-negativity constraints
-
-        Returns
-        -------
-        TYPE
-            Description
         '''
         try:
             from cvxopt import matrix, solvers
@@ -813,14 +759,7 @@ def cross_validate(self, n, **kwargs):
         Parameters
         ----------
         n
-            Description
         **kwargs
-            Description
-
-        Returns
-        -------
-        TYPE
-            Description
         '''
 
         model_performance = []
@@ -849,14 +788,7 @@ def get_path_no_terminals(self, v1, v2):
         Parameters
         ----------
         v1
-            Description
         v2
-            Description
-
-        Returns
-        -------
-        TYPE
-            Description
         '''
         if v1 in self.strain_lookup and v2 in self.strain_lookup:
             p1 = [self.strain_lookup[v1]]
@@ -886,7 +818,6 @@ def find_titer_splits(self, criterium=None):
         Parameters
         ----------
         criterium : None, optional
-            Description
         '''
         if criterium is None:
             criterium = lambda x:True
@@ -1035,14 +966,7 @@ def get_mutations(self, strain1, strain2):
         Parameters
         ----------
         strain1
-            Description
         strain2
-            Description
-
-        Returns
-        -------
-        TYPE
-            Description
         '''
         if strain1 in self.sequences and strain2 in self.sequences:
             muts = []
@@ -1089,7 +1013,6 @@ def make_seqgraph(self, colin_thres = 5):
         Parameters
         ----------
         colin_thres : int, optional
-            Description
         '''
         seq_graph = []
         titer_dist = []
@@ -1140,7 +1063,6 @@ def collapse_colinear_mutations(self, colin_thres):
         Parameters
         ----------
         colin_thres
-            Description
         '''
         TT = self.design_matrix[:,:self.genetic_params].T
         mutation_clusters = []
@@ -1175,7 +1097,6 @@ def train(self,**kwargs):
         Parameters
         ----------
         **kwargs
-            Description
         '''
         self._train(**kwargs)
         for mi, mut in enumerate(self.relevant_muts):
@@ -1200,12 +1121,6 @@ def compile_substitution_effects(self, cutoff=1e-4):
         Parameters
         ----------
         cutoff : float, optional
-            Description
-
-        Returns
-        -------
-        TYPE
-            Description
         '''
         return {mut[0]+':'+mut[1]:np.round(val,int(-np.log10(cutoff)))
                 for mut, val in self.substitution_effect.items() if val>cutoff}

From cf1d935d689f6311b0f3acaa22e775758683ab05 Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Thu, 19 Jan 2023 13:24:01 -0800
Subject: [PATCH 29/36] fix numpydoc: Properly reference
 Bio.Align.MultipleSeqAlignment

---
 augur/align.py | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/augur/align.py b/augur/align.py
index edd76f8d8..c9b021d34 100644
--- a/augur/align.py
+++ b/augur/align.py
@@ -269,7 +269,7 @@ def strip_non_reference(aln, reference, insertion_csv=None):
 
     Parameters
     ----------
-    aln : MultipleSeqAlign
+    aln : Bio.Align.MultipleSeqAlignment
         Biopython Alignment
     reference : str
         name of reference sequence, assumed to be part of the alignment
@@ -382,7 +382,7 @@ def prettify_alignment(aln):
 
     Parameters
     ----------
-    aln : MultipleSeqAlign
+    aln : Bio.Align.MultipleSeqAlignment
         Biopython Alignment
     '''
     for seq in aln:
@@ -405,7 +405,7 @@ def make_gaps_ambiguous(aln):
 
     Parameters
     ----------
-    aln : MultipleSeqAlign
+    aln : Bio.Align.MultipleSeqAlignment
         Biopython Alignment
     '''
     for seq in aln:

From f6435c60f83564eba63a4f4bc9faa91ffc6e89ab Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Thu, 19 Jan 2023 13:03:14 -0800
Subject: [PATCH 30/36] fix numpydoc: Properly reference
 Bio.Phylo.BaseTree.Tree

Bio.Phylo is the module, not the tree class.

This could instead be something more specific like
Bio.Phylo.Newick.Tree, but I opted for the more generic type.
---
 augur/ancestral.py            | 2 +-
 augur/clades.py               | 2 +-
 augur/distance.py             | 6 +++---
 augur/frequency_estimators.py | 6 +++---
 augur/titer_model.py          | 4 ++--
 augur/traits.py               | 2 +-
 augur/utils.py                | 2 +-
 7 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/augur/ancestral.py b/augur/ancestral.py
index 9d84c9b6e..4ae8871b4 100644
--- a/augur/ancestral.py
+++ b/augur/ancestral.py
@@ -28,7 +28,7 @@ def ancestral_sequence_inference(tree=None, aln=None, ref=None, infer_gtr=True,
 
     Parameters
     ----------
-    tree : Bio.Phylo or str
+    tree : Bio.Phylo.BaseTree.Tree or str
         tree or filename of tree
     aln : Bio.Align.MultipleSeqAlignment or str
         alignment or filename of alignment
diff --git a/augur/clades.py b/augur/clades.py
index 66d188eff..e3f517726 100644
--- a/augur/clades.py
+++ b/augur/clades.py
@@ -162,7 +162,7 @@ def assign_clades(clade_designations, all_muts, tree, ref=None):
         clade definitions as :code:`{clade_name:[(gene, site, allele),...]}`
     all_muts : dict
         mutations in each node
-    tree : Phylo.Tree
+    tree : Bio.Phylo.BaseTree.Tree
         phylogenetic tree to process
     ref : str or list, optional
         reference sequence to look up state when not mutated
diff --git a/augur/distance.py b/augur/distance.py
index 4520709b2..f38241cce 100644
--- a/augur/distance.py
+++ b/augur/distance.py
@@ -477,7 +477,7 @@ def get_distances_to_root(tree, sequences_by_node_and_gene, distance_map):
 
     Parameters
     ----------
-    tree : Bio.Phylo
+    tree : Bio.Phylo.BaseTree.Tree
         a rooted tree whose node names match the given dictionary of sequences
         by node and gene
 
@@ -517,7 +517,7 @@ def get_distances_to_last_ancestor(tree, sequences_by_node_and_gene, distance_ma
 
     Parameters
     ----------
-    tree : Bio.Phylo
+    tree : Bio.Phylo.BaseTree.Tree
         a rooted tree whose node names match the given dictionary of sequences
         by node and gene
 
@@ -577,7 +577,7 @@ def get_distances_to_all_pairs(tree, sequences_by_node_and_gene, distance_map, e
 
     Parameters
     ----------
-    tree : Bio.Phylo
+    tree : Bio.Phylo.BaseTree.Tree
         a rooted tree whose node names match the given dictionary of sequences
         by node and gene
 
diff --git a/augur/frequency_estimators.py b/augur/frequency_estimators.py
index 58550e7b4..094bbab53 100644
--- a/augur/frequency_estimators.py
+++ b/augur/frequency_estimators.py
@@ -476,7 +476,7 @@ def __init__(self, tree, pivots, node_filter=None, min_clades=10, verbose=0, pc=
 
         Parameters
         ----------
-        tree : Bio.Phylo
+        tree : Bio.Phylo.BaseTree.Tree
             Biopython tree
         pivots : int or array
             number or list of pivots
@@ -1076,7 +1076,7 @@ def tip_passes_filters(self, tip):
         If no filters are defined, returns True.
 
         Args:
-            tip (Bio.Phylo): tip from a Bio.Phylo tree annotated with attributes in `tip.attr`
+            tip (Bio.Phylo.BaseTree.Tree): tip from a Bio.Phylo tree annotated with attributes in `tip.attr`
 
         Returns:
             bool: whether the given tip passes the defined filters or not
@@ -1132,7 +1132,7 @@ def estimate(self, tree):
         values in attribute defined by `self.weights_attribute`.
 
         Args:
-            tree (Bio.Phylo): annotated tree whose nodes all have an `attr` attribute with at least  "num_date" key
+            tree (Bio.Phylo.BaseTree.Tree): annotated tree whose nodes all have an `attr` attribute with at least  "num_date" key
 
         Returns:
             dict: node frequencies by clade
diff --git a/augur/titer_model.py b/augur/titer_model.py
index b162cde2a..f92f34858 100644
--- a/augur/titer_model.py
+++ b/augur/titer_model.py
@@ -1131,11 +1131,11 @@ def annotate_tree(self, tree):
 
         Parameters
         ----------
-        tree : Bio.Phylo
+        tree : Bio.Phylo.BaseTree.Tree
 
         Returns
         -------
-        Bio.Phylo
+        Bio.Phylo.BaseTree.Tree
             input tree instance with nodes annotated by per-branch and
             cumulative antigenic advance attributes `dTiterSub` and
             `cTiterSub`
diff --git a/augur/traits.py b/augur/traits.py
index 8ff5fb376..c9f812ff4 100644
--- a/augur/traits.py
+++ b/augur/traits.py
@@ -34,7 +34,7 @@ def mugration_inference(tree=None, seq_meta=None, field='country', confidence=Tr
 
     Returns
     -------
-    T : Phylo.Tree
+    T : Bio.Phylo.BaseTree.Tree
         Biophyton tree
     gtr : treetime.GTR
         GTR model
diff --git a/augur/utils.py b/augur/utils.py
index 1abf07463..c26a696e9 100644
--- a/augur/utils.py
+++ b/augur/utils.py
@@ -58,7 +58,7 @@ def read_tree(fname, min_terminals=3):
 
     Returns
     -------
-    Bio.Phylo :
+    Bio.Phylo.BaseTree.Tree :
         BioPython tree instance
 
     """

From 64d8bc5e2c21e3f97700e6cd799344bdc8ae1fd1 Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Thu, 19 Jan 2023 13:18:39 -0800
Subject: [PATCH 31/36] fix numpydoc: Properly reference
 Bio.Phylo.BaseTree.Clade

`Phylo.Node` is not a valid class. I believe the intention is
`Bio.Phylo.BaseTree.Clade`, since that is the type of
`Bio.Phylo.BaseTree.Tree.root`
---
 augur/clades.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/augur/clades.py b/augur/clades.py
index e3f517726..acfb4c301 100644
--- a/augur/clades.py
+++ b/augur/clades.py
@@ -124,7 +124,7 @@ def is_node_in_clade(clade_alleles, node, ref):
     ----------
     clade_alleles : list
         list of clade defining alleles
-    node : Phylo.Node
+    node : Bio.Phylo.BaseTree.Clade
         node to check, assuming sequences (as mutations) are attached to node
     ref : str or list
         positions

From 2c7c6edd7f58a210d5d9bf0e6c949d904c864872 Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Thu, 19 Jan 2023 13:05:15 -0800
Subject: [PATCH 32/36] fix numpydoc: Properly reference
 Bio.SeqRecord.SeqRecord

---
 augur/mask.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/augur/mask.py b/augur/mask.py
index 7a7d58571..0370ce121 100644
--- a/augur/mask.py
+++ b/augur/mask.py
@@ -84,7 +84,7 @@ def mask_sequence(sequence, mask_sites, mask_from_beginning, mask_from_end, mask
 
     Parameters
     ----------
-    sequence : Bio.SeqIO.SeqRecord
+    sequence : Bio.SeqRecord.SeqRecord
         A sequence to be masked
     mask_sites: list of int
         A list of site indexes to exclude from the FASTA.
@@ -97,7 +97,7 @@ def mask_sequence(sequence, mask_sites, mask_from_beginning, mask_from_end, mask
 
     Returns
     -------
-    Bio.SeqIO.SeqRecord
+    Bio.SeqRecord.SeqRecord
         Masked sequence in its original record object
 
     """

From 30781e17b6634bac4c422a5538460e58a6a1e669 Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Thu, 19 Jan 2023 12:54:22 -0800
Subject: [PATCH 33/36] fix numpydoc: Properly reference numpy.ndarray

---
 augur/frequency_estimators.py | 32 ++++++++++++++++----------------
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/augur/frequency_estimators.py b/augur/frequency_estimators.py
index 094bbab53..666fcaafe 100644
--- a/augur/frequency_estimators.py
+++ b/augur/frequency_estimators.py
@@ -46,7 +46,7 @@ def get_pivots(observations, pivot_interval, start_date=None, end_date=None, piv
 
     Returns
     -------
-    pivots : ndarray
+    pivots : numpy.ndarray
         floating point pivots spanning the given the dates
 
     """
@@ -95,12 +95,12 @@ def make_pivots(pivots, tps):
     pivots : int or iterable
         either number of pivots (a scalar) or the actual pivots
         (will be cast to array and returned)
-    tps : np.array
+    tps : numpy.ndarray
         observation time points. Will generate pivots spanning min/max
 
     Returns
     -------
-    pivots : np.array
+    pivots : numpy.ndarray
         array of pivot values
     '''
     if isinstance(pivots, int):
@@ -124,14 +124,14 @@ def running_average(obs, ws):
 
     Parameters
     ----------
-    obs : list or np.array(bool)
+    obs : list or numpy.ndarray(bool)
         observations
     ws : int
         window size as measured in number of consecutive points
 
     Returns
     -------
-    np.array(float)
+    numpy.ndarray(float)
         running average of the boolean observations
     '''
     ws=int(ws)
@@ -157,14 +157,14 @@ def fix_freq(freq, pc):
 
     Parameters
     ----------
-    freq : np.array
+    freq : numpy.ndarray
         frequency trajectory to be thresholded
     pc : float
         threshold value
 
     Returns
     -------
-    np.array
+    numpy.ndarray
         thresholded frequency trajectory
     '''
     freq[np.isnan(freq)]=pc
@@ -200,11 +200,11 @@ def __init__(self, tps, obs, pivots, stiffness = 20.0,
 
         Parameters
         ----------
-        tps : list or np.array(float)
+        tps : list or numpy.ndarray(float)
             array with numerical dates
-        obs : list or np.array(bool)
+        obs : list or numpy.ndarray(bool)
             array with boolean observations
-        pivots : int or np.array(float)
+        pivots : int or numpy.ndarray(float)
             either integer specifying the number of pivot values,
             or list of explicity pivots
         stiffness : float, optional
@@ -426,11 +426,11 @@ def __init__(self, tps, obs, pivots,  **kwargs):
 
         Parameters
         ----------
-        tps : np.array
+        tps : numpy.ndarray
             array of numerical dates
-        obs : np.array(bool)
+        obs : numpy.ndarray(bool)
             array of true/false observations
-        pivots : np.array
+        pivots : numpy.ndarray
             pivot values
         **kwargs
             Description
@@ -625,10 +625,10 @@ def __init__(self, aln, tps, pivots, **kwargs):
         ----------
         aln : Bio.Align.MultipleSeqAlignment
             alignment
-        tps : np.array(float)
+        tps : np.ndarray(float)
             Array of numerical dates, one for each sequence in the
             alignment in the SAME ORDER!
-        pivots : np.array(float)
+        pivots : np.ndarray(float)
             pivot values for which frequencies are estimated
         **kwargs
             Description
@@ -653,7 +653,7 @@ def estimate_genotype_frequency(self, gt):
 
         Returns
         -------
-        np.array
+        numpy.ndarray
             frequency trajectory
         '''
         match = []

From ef70f101db0ca9f65283c1e2b37deaeeea3e89d4 Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Thu, 19 Jan 2023 13:30:31 -0800
Subject: [PATCH 34/36] fix numpydoc: Properly reference treetime classes

---
 augur/ancestral.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/augur/ancestral.py b/augur/ancestral.py
index 4ae8871b4..bee763908 100644
--- a/augur/ancestral.py
+++ b/augur/ancestral.py
@@ -49,7 +49,7 @@ def ancestral_sequence_inference(tree=None, aln=None, ref=None, infer_gtr=True,
 
     Returns
     -------
-    TreeAnc
+    treetime.TreeAnc
         treetime.TreeAnc instance
     """
 
@@ -78,7 +78,7 @@ def collect_mutations_and_sequences(tt, infer_tips=False, full_sequences=False,
 
     Parameters
     ----------
-    tt : treetime
+    tt : treetime.TreeTime
         instance of treetime with valid ancestral reconstruction
     infer_tips : bool, optional
         if true, request the reconstructed tip sequences from treetime, otherwise retain input ambiguities

From 2ba5428ec9c18916bb9fa3ea51436adc3abb9e4d Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Thu, 19 Jan 2023 13:06:46 -0800
Subject: [PATCH 35/36] fix numpydoc: Update reference of pandas TextFileReader
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This class lives at pandas.io.parsers.TextFileReader, however, there are
no public docs for it and it isn't linked on pandas doc pages either¹.

Use single backticks to disable linking while denoting that it is a class².

¹ https://pandas.pydata.org/pandas-docs/version/1.4/reference/api/pandas.read_csv.html
² https://numpydoc.readthedocs.io/en/latest/format.html#common-rest-concepts
---
 augur/io/metadata.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/augur/io/metadata.py b/augur/io/metadata.py
index 839c03bfb..79feb723e 100644
--- a/augur/io/metadata.py
+++ b/augur/io/metadata.py
@@ -27,7 +27,7 @@ def read_metadata(metadata_file, id_columns=("strain", "name"), chunk_size=None)
 
     Returns
     -------
-    pandas.DataFrame or pandas.TextFileReader
+    pandas.DataFrame or `pandas.io.parsers.TextFileReader`
 
     Raises
     ------

From 8e3705dd45d70eb9dab0f84c8fff8b0e49041240 Mon Sep 17 00:00:00 2001
From: Victor Lin <13424970+victorlin@users.noreply.github.com>
Date: Wed, 8 Mar 2023 13:31:54 -0800
Subject: [PATCH 36/36] Update changelog

---
 CHANGES.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/CHANGES.md b/CHANGES.md
index 1c6155313..a31c60844 100644
--- a/CHANGES.md
+++ b/CHANGES.md
@@ -20,8 +20,10 @@
 * translate: Fix error handling when features cannot be read from reference sequence file. [#1168][] (@victorlin)
 * translate: Remove an unnecessary check which allowed for inaccurate error messages to be shown. [#1169][] (@victorlin)
 * frequencies: Previously, monthly pivot points calculated from the end of a month may have been shifted by 1-3 days. This is now fixed. [#1150][] (@victorlin)
+* docs: Fix minor formatting issues. [#1095][] (@victorlin)
 * Update development status on PyPI from "3 - Alpha" to "5 - Production/Stable". This should have been done since the beginning of this changelog, but now it is official. [#1160][] (@corneliusroemer)
 
+[#1095]: https://github.com/nextstrain/augur/pull/1095
 [#1150]: https://github.com/nextstrain/augur/pull/1150
 [#1160]: https://github.com/nextstrain/augur/pull/1160
 [#1168]: https://github.com/nextstrain/augur/pull/1168