diff --git a/.gitignore b/.gitignore index 58ac1e8..6b35cff 100644 --- a/.gitignore +++ b/.gitignore @@ -1,3 +1,7 @@ +# Mac +*.DS_Store + + # Files output by the tests 2dhb.* diff --git a/notebooks/2022_hackathon_slides.ipynb b/notebooks/2022_hackathon_slides.ipynb new file mode 100644 index 0000000..29ba2d4 --- /dev/null +++ b/notebooks/2022_hackathon_slides.ipynb @@ -0,0 +1,667 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "a900ac1e", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# CompSPI, GitHub and Coding Best Practices\n", + "\n", + "Nina Miolane, BioShape Lab, UC Santa Barbara\n", + "\n", + "
\"default\"/
\n" + ] + }, + { + "cell_type": "markdown", + "id": "2cf8066d", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# Outline\n", + "\n", + "1. Why CompSPI?\n", + "2. CompSPI Overview\n", + "3. Related Libraries\n", + "4. Today: Contributing" + ] + }, + { + "cell_type": "markdown", + "id": "33ea91a6", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# Outline\n", + "\n", + "1. **Why CompSPI?**\n", + "2. CompSPI Overview\n", + "3. Related Libraries\n", + "4. Today: Contributing" + ] + }, + { + "cell_type": "markdown", + "id": "0568a1bd", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# Why CompSPI?\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\"default\"/\n", + "\n", + "Three objectives:\n", + "1. **Teach/learn** \"hands-on\" cryo-EM computations\n", + "2. **Accelerate** developments in cryo-EM reconstruction\n", + "3. **Foster** reproducible research and dissemination\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "id": "6f08d5a7", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## 1. Teach/learn \"hands-on\" cryo-EM computations\n", + "\n", + "\n", + "- Reduce: entry barrier for newcomers\n", + "- Reduce: mentoring time for experts\n", + "- Couple: theory with implementation\n", + "\n", + "Interact with data, theory and methods directly through clean, unit-tested code." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "822e2680", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "There are 111768 atoms.\n", + "The coordinates of the first atom are: [29.107, -44.272, -156.816].\n" + ] + } + ], + "source": [ + "from ioSPI.ioSPI import atomic_models\n", + "\n", + "pdb_path = \"data/4v6x.pdb\" # 80S ribosome\n", + "cif_path = \"data/486d.cif\" # 70S ribosome\n", + "\n", + "atomic_model = atomic_models.read_atomic_model(pdb_path, assemble=False)\n", + "atoms = atomic_models.extract_gemmi_atoms(atomic_model)\n", + "atomic_coordinates = atomic_models.extract_atomic_parameter(atoms, \"cartesian_coordinates\")\n", + "\n", + "print(f\"There are {len(atoms)} atoms.\")\n", + "print(f\"The coordinates of the first atom are: {atomic_coordinates[0]}.\")" + ] + }, + { + "cell_type": "markdown", + "id": "6ffc18a0", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## 2. Accelerate Developments in CryoEM Reconstruction\n", + "\n", + "- Enable: fast prototyping (plug-and-play)\n", + "- Enable: large benchmarks\n", + "\n", + "\n", + "CryoEM methods use similar concepts -> `Python classes`:\n", + "- Molecular volume: _e.g._ 3D array, NeRF\n", + "- Heterogeneity: _e.g._ discrete, continuous\n", + "- Image formation model: _e.g._ projector, noise\n", + "\n", + "
\"default\"/
\n", + "
\"default\"/
\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "id": "97ec3e1f", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## 3. Foster reproducible research and dissemination\n", + "\n", + "- Ease: reproducibility of experiments _(User's perspective)_\n", + " - _e.g._ comparisons with baselines\n", + "- Increase: number of citations! _(Contributor's perspective)_\n", + "\n", + "\n", + "From: Josse, Holmes (2013)\n", + "\n", + "\"default\"/" + ] + }, + { + "cell_type": "markdown", + "id": "0c3a90b5", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# Why CompSPI?\n", + "\n", + "1. **Teach/learn** \"hands-on\" cryo-EM computations\n", + "2. **Accelerate** developments in cryo-EM reconstruction\n", + "3. **Foster** reproducible research and dissemination\n", + "\n", + "These objectives are not new!\n", + "\n", + "- Machine learning: democratized with libraries like `sklearn`\n", + "\n", + "Democratizing cryoEM/SPI methods requires a dedicated library." + ] + }, + { + "cell_type": "markdown", + "id": "1462e5ed", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# Outline\n", + "\n", + "1. Why CompSPI?\n", + "2. **CompSPI Overview**\n", + "3. Related Libraries\n", + "4. Today: Contributing" + ] + }, + { + "cell_type": "markdown", + "id": "f4931df3", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## CompSPI = ioSPI + simSPI + compSPI\n", + "\n", + "
\"default\"/
\n" + ] + }, + { + "cell_type": "markdown", + "id": "a399e856", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# ioSPI\n", + "\n", + "`ioSPI`: Methods and tools to read and write data in all formats.\n", + "https://github.com/compSPI/ioSPI\n", + "\n", + "- Organization: Files correspond to data types:\n", + " - `micrographs.py`\n", + "- Programming: With functions:\n", + " - `def read_micrograph():`\n", + "- API (Application Programming Interface): Standard for I/O\n", + " - `def read_*(path, ...):`\n", + " - `def write_*(path, data, ...):`" + ] + }, + { + "cell_type": "markdown", + "id": "3092a99f", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# simSPI (Forward Models)\n", + "\n", + "`simSPI`: Methods and tools to simulate SPI data \n", + "\n", + "https://github.com/compSPI/simSPI\n", + "\n", + "- Organization (!): Files correspond to image formation steps\n", + " - `projector.py`\n", + "- Programming: With classes:\n", + " - `class Projector(torch.nn.Module):` (differentiable)" + ] + }, + { + "cell_type": "markdown", + "id": "c4915623", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# compSPI (Inverse Models)\n", + "\n", + "`compSPI`: Next-generation reconstruction methods for SPI.\n", + "\n", + "https://github.com/compSPI/compSPI\n", + "\n", + "- Organization (!): TBD.\n", + "- Programming: With classes:\n", + " - `class CryoAI(torch.nn.Module):` (differentiable)" + ] + }, + { + "cell_type": "markdown", + "id": "8523e8d1", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# Outline\n", + "\n", + "1. Why CompSPI?\n", + "2. CompSPI Overview\n", + "3. **Related Libraries**\n", + "4. 
Today: Contributing" + ] + }, + { + "cell_type": "markdown", + "id": "c5c4c5ff", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Related Libraries\n", + "\n", + "By: Jed Yeo, Arjun Swani, Tyler Heim, Callum Hepworth.\n", + "\n", + "| Library | Description | io tools | sim tools | comp tools |\n", + "| --- | --- | --- | --- | --- |\n", + "| `Eman2` | CryoEM image processing | .hdf5 | yes | Fourier \\& Wiener Fourier |\n", + "| `Scipion2` | CryoEM image processing | many io tools | N/A | N/A |\n", + "| `Aspire` | CryoEM reconstructions | CLI, .mrc, .star | yes | Wiener Fourier |\n", + "| `PyTom` | Processing cryoET data | yes | yes | yes |\n", + "| `PEMDA` | EM model manipulation | yes | yes | N/A |" + ] + }, + { + "cell_type": "markdown", + "id": "66b9e2d7", + "metadata": {}, + "source": [ + "- io tools: many\n", + "- sim tools: many, although not differentiable\n", + "- comp tools (reconstruction): \"traditional\"\n", + "- engineering: not always tested, documented, or maintained" + ] + }, + { + "cell_type": "markdown", + "id": "875c30e7", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# Outline\n", + "\n", + "1. Why CompSPI?\n", + "2. CompSPI Overview\n", + "3. Related Libraries\n", + "4. **Today: Contributing**" + ] + }, + { + "cell_type": "markdown", + "id": "ba20f7d4", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# Your Goal\n", + "\n", + "Goal: Merge a [Pull Request (PR)](https://github.com/compSPI/ioSPI/pull/75) today _even if only 2 lines of code._\n", + "\n", + "Why:\n", + "- Coding as a group is different from coding alone\n", + "- Today's goal is to explain compSPI's collaborative workflow\n", + "\n", + "
\"default\"/
" + ] + }, + { + "cell_type": "markdown", + "id": "301c9866", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# Achieving Your Goal\n", + "\n", + "0. Join `compSPI` slack workspace\n", + "1. Assign yourself to a GitHub task on PIMS Hackathon GitHub project\n", + "2. Code a few lines on a new GitHub branch\n", + "3. Document your code with docstrings\n", + "4. Test with unit-tests\n", + "5. Lint to respect the coding style\n", + "6. Submit a PR on GitHub\n", + "7. Address code reviews\n", + "8. Merge your PR" + ] + }, + { + "cell_type": "markdown", + "id": "68c00aef", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## 1. Assign Yourself to a Task\n", + "\n", + "Where: https://github.com/orgs/compSPI/projects/3\n", + "\n", + "Why: \n", + "- Know who works on what\n", + "- Brainstorm before coding\n", + "\n", + "How: \n", + "- If you have code that you want to contribute: Create a task\n", + "- If you do not: Choose a task: learn `compSPI` while contributing\n", + "- Assign your GitHub handle (or comment on the task)\n", + " \n", + "https://github.com/orgs/compSPI/projects/3" + ] + }, + { + "cell_type": "markdown", + "id": "38df4bb1", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## 2. Code a Few Lines\n", + "\n", + "Where: Explained in task.\n", + "\n", + "Why: Only a few lines:\n", + "- to ease the work of the code reviewers (10 min rule),\n", + "- to merge a PR today.\n", + "\n", + "How: \n", + "- Clone the repository: `git clone ...`\n", + "- Create a new branch: `git checkout -b my-task`\n", + "- Add code: `git add my_file.py; git commit -m \"comments\"`" + ] + }, + { + "cell_type": "markdown", + "id": "cc6c59e1", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## 3. 
Document\n", + "\n", + "Where: Docstrings (NOT in comments)\n", + "\n", + "Why: \n", + "- Help users and contributors understand code\n", + "- Automatically generate the documentation website\n", + "\n", + "\n", + "How: \n", + "- Use the NumPy docstring style\n", + "- See [Example of docstring](https://github.com/compSPI/ioSPI/blob/master/ioSPI/atomic_models.py#L10) or [Anatomy of a Docstring](https://github.com/compspi/compspi/blob/master/docs/contributing.rst#the-anatomy-of-a-docstring)" + ] + }, + { + "cell_type": "markdown", + "id": "23d8528b", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## 4. Test with Unit-Tests\n", + "\n", + "What/Where: Unit-test: code testing a \"unit\" of software (naming is important)\n", + "- 1 file `my_module.py` = 1 test file `test_my_module.py` \n", + "- 1 function `def my_fun()` = 1 test `def test_my_fun()`\n", + "\n", + "Why: Ensure that your code runs and does not break the software\n", + "\n", + "How: \n", + "- [Example of tests](https://github.com/compSPI/ioSPI/blob/master/tests/test_atomic_models.py)\n", + "- Details are on the PR interface.\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "id": "9526b034", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "## Test: Run Tests\n", + "\n", + "- Locally\n", + "\n", + "`$ pip install -r dev-requirements.txt`\n", + "\n", + "`$ pytest tests/test_atomic_models.py -k test_my_function`\n", + "\n", + "`$ pytest tests/`\n", + "\n", + "- On GitHub (see later slide)\n", + "https://github.com/compSPI/ioSPI/pull/75\n" + ] + }, + { + "cell_type": "markdown", + "id": "27a2b899", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## 5. 
Lint to Respect Coding Style\n", + "\n", + "What/Where: Clean your code by running a \"linter\" from the command line.\n", + "\n", + "\n", + "Why: \n", + "- Harmonize coding styles so that it is faster to read someone else's code\n", + "- Save time: _e.g._ saveToFile? 
save_to_file? savetofile?\n", + "\n", + "\n", + "How: \n", + "- [PEP8](https://peps.python.org/pep-0008/)\n", + "- Details are on the PR interface." + ] + }, + { + "cell_type": "markdown", + "id": "7411d69d", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Demo.\n", + "`pip install -r dev-requirements.txt`\n", + "`pre-commit install`\n", + "\n", + "or:\n", + "\n", + "`$ pip install -r dev-requirements.txt`\n", + "\n", + "`$ black . --check`\n", + "\n", + "`$ isort --profile black --check .`\n", + "\n", + "`$ flake8 ioSPI tests`" + ] + }, + { + "cell_type": "markdown", + "id": "c7dfaae0", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## 6. Submit PR on GitHub\n", + "\n", + "What: PR = request to merge your code into compSPI\n", + "\n", + "Why: Before merging into compSPI, code is checked by bots and people:\n", + "- does it run?\n", + "- does it respect coding conventions?\n", + "\n", + "How: \n", + "- From your machine: `git push origin my-task`\n", + "- From GitHub: open PR" + ] + }, + { + "cell_type": "markdown", + "id": "0c60369e", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# Example: Put These Slides on GitHub?\n", + "\n", + "0. Join `compSPI` slack workspace\n", + "1. Assign yourself to a GitHub task on PIMS Hackathon GitHub project\n", + "2. Code a few lines on a new GitHub branch\n", + "3. Document your code with docstrings\n", + "4. Test with unit-tests\n", + "5. Lint to respect the coding style\n", + "6. Submit a PR on GitHub\n", + "7. Address code reviews\n", + "8. Merge your PR" + ] + }, + { + "cell_type": "markdown", + "id": "f5c8a689", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# Outline\n", + "\n", + "1. Why CompSPI?\n", + "2. CompSPI Overview\n", + "3. Related Libraries\n", + "4. Today: Contributing\n", + "\n", + "## Thank you for your attention." 
+ ] + }, + { + "cell_type": "markdown", + "id": "af8c3e4b", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "## GitHub\n", + "\n", + "PR = unit of code\n", + "Google standards: think about code reviewers\n", + "\n", + "`$ git push origin your-branch`\n", + "\n", + "
\"default\"/
" + ] + } + ], + "metadata": { + "celltoolbar": "Slideshow", + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.4" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/notebooks/figs/compSPI_citations.png b/notebooks/figs/compSPI_citations.png new file mode 100644 index 0000000..de2e6c2 Binary files /dev/null and b/notebooks/figs/compSPI_citations.png differ diff --git a/notebooks/figs/compSPI_gh.png b/notebooks/figs/compSPI_gh.png new file mode 100644 index 0000000..241e3e2 Binary files /dev/null and b/notebooks/figs/compSPI_gh.png differ diff --git a/notebooks/figs/compSPI_noise.png b/notebooks/figs/compSPI_noise.png new file mode 100644 index 0000000..7e811f4 Binary files /dev/null and b/notebooks/figs/compSPI_noise.png differ diff --git a/notebooks/figs/compSPI_overview.png b/notebooks/figs/compSPI_overview.png new file mode 100644 index 0000000..94eef93 Binary files /dev/null and b/notebooks/figs/compSPI_overview.png differ diff --git a/notebooks/figs/compSPI_pr.png b/notebooks/figs/compSPI_pr.png new file mode 100644 index 0000000..09ba473 Binary files /dev/null and b/notebooks/figs/compSPI_pr.png differ diff --git a/notebooks/figs/compSPI_projector.png b/notebooks/figs/compSPI_projector.png new file mode 100644 index 0000000..07d114e Binary files /dev/null and b/notebooks/figs/compSPI_projector.png differ diff --git a/notebooks/figs/compSPI_readme2.png b/notebooks/figs/compSPI_readme2.png new file mode 100644 index 0000000..f715384 Binary files /dev/null and b/notebooks/figs/compSPI_readme2.png differ diff --git a/notebooks/figs/compSPI_readme3.png b/notebooks/figs/compSPI_readme3.png new file mode 100644 index 0000000..91d4964 Binary files 
/dev/null and b/notebooks/figs/compSPI_readme3.png differ diff --git a/notebooks/figs/hackathon_gh_badges.png b/notebooks/figs/hackathon_gh_badges.png new file mode 100644 index 0000000..81b0273 Binary files /dev/null and b/notebooks/figs/hackathon_gh_badges.png differ