Introduction to Reproducible Scientific Computing in Python (2024)

Welcome to the "Introduction to Scientific Computing in Python" workshop! This workshop is designed for individuals from diverse backgrounds in research and academia, particularly those new to Python or looking to enhance their scientific computing skills. My goal is to equip you with the basics of Python programming and guide you through developing reproducible code for scientific publications.

Workshop Overview

This workshop is divided into two main parts:

  1. Tech Week: A series of lectures introducing the basics of Python and essential programming concepts. These will be updated over time to improve the content.
  2. Core Lecture: A detailed case study focusing on developing reproducible code, using object-oriented programming, version control with GitHub, and environment management.

The detailed schedule, readings, and lecture links are available in the Lectures and Readings Document.

Participant Breakdown for Reference

  • Graduate Students: 47.6%
  • PhD Holders: 40.6%
  • Undergraduate Students: 11.8%
  • Python Beginners: 75.3%

Textbook

We will be using "Introduction to Scientific Programming with Python" by Sundnes (2020) as our reference textbook. It is available here. If you have trouble accessing the book, please contact me for assistance.

Tech Week

Lecture Links and Topics

  • Lecture 1: Introduction to Tech Week and a glimpse into Python's capabilities.
  • Lecture 2: Installation of Python and JupyterLab through Anaconda.
  • Lecture 3: Introduction to JupyterLab and fundamental Python skills.
  • Lecture 4: Understanding control structures (if statements and for loops).
  • Lecture 5: Functions, docstrings, and their importance.
  • Lecture 6: Working with lists, dictionaries, and user input.
  • Lecture 7: An introduction to Python libraries.
  • Scenario 8: Simulating EEG data as an application example (no video posted; this is a code walkthrough that serves as a 'concept check').
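
To give a rough idea of what a simulated-EEG concept check could look like, here is a minimal sketch that uses only the standard library. It is not the workshop's actual walkthrough: the function name `simulate_eeg`, the 10 Hz oscillation, the sampling rate, and the noise level are all illustrative assumptions. The fixed seed is what makes the "random" signal identical from run to run.

```python
import math
import random


def simulate_eeg(duration_s=1.0, sampling_rate_hz=250, freq_hz=10.0,
                 noise_sd=0.5, seed=0):
    """Return (times, signal) for a noisy sine wave (hypothetical example).

    The sine wave stands in for a single oscillation and the Gaussian
    noise for everything else; real EEG is far more complex.
    """
    rng = random.Random(seed)  # local, seeded generator: same seed, same data
    n_samples = int(duration_s * sampling_rate_hz)
    times = [i / sampling_rate_hz for i in range(n_samples)]
    signal = [
        math.sin(2 * math.pi * freq_hz * t) + rng.gauss(0.0, noise_sd)
        for t in times
    ]
    return times, signal


times, signal = simulate_eeg()
print(f"{len(signal)} samples, first value {signal[0]:.3f}")
```

Calling `simulate_eeg(seed=1)` twice returns the same signal, while changing the seed changes the noise. Passing the seed as a parameter, rather than relying on hidden global state, keeps that behavior visible to the reader.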

Again, the detailed schedule, readings, and lecture links are available in the Lectures and Readings Document.

Core Lecture

This session will be a comprehensive walkthrough from initial "messy" code to polished, reproducible code suitable for publication. The case study is based on systems/cognitive neuroscience, but I hope to show that the principles apply broadly across scientific computing. I specifically focus on four core areas that put reproducible computing at risk. These are outlined in "Learning Objectives."

Learning Community

A Slack workspace has been set up for workshop participants to collaborate, share resources, and schedule meet-ups. Join the Slack workspace here. If the link has expired, you're welcome to reach out to me via email (see Contact) to get a new link. The workspace will be open until May 2024.

Workshop Access

All materials, including Slack discussions and lectures, will be accessible until May 1st, 2024. After this date, the materials will be updated based on feedback for future iterations of the workshop.

Learning Objectives

This workshop aims to equip participants with foundational skills in scientific computing with Python, focusing on developing reproducible, interpretable code and understanding the lifecycle of iterative code development. Below are the key learning objectives and what participants can expect to learn.

1. Reproducible Computing Principles

  • Reproducibility Spectrum: Understand that reproducibility exists on a spectrum and is often at odds with scientific incentives.
  • Universal Design: Learn a few rules for developing reproducible and interpretable code.

2. Iterative Code Development

  • Practical Approach: Master the practice of coding for the occasion, balancing the need for quick solutions and the foresight for future code utility.
  • Best Practice: Embrace test-driven development as a fundamental approach to ensuring code reliability and functionality, while understanding that it is often overkill and can hurt interpretability.

3. Understanding Common Threats to Reproducibility

  • Documentation and Loading Data: Tackle the initial "what now?" challenge after downloading data, emphasizing the importance of thorough documentation. Does the data run? How can I quickly start engaging with the data?
  • Environment Management: Identify the pitfalls of poor or nonexistent environment management strategies.
  • Black-Box Analyses: Recognize the risks associated with developing analyses that don't "explain" their inner workings.
  • Custom Figures: Discuss the implications of over-relying on dressing up figures, which may compromise reproducibility.
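
One low-effort guard against the environment pitfall listed above is to record what the code actually ran with. The following is a minimal sketch using only the standard library; the function name `describe_environment` and the package names passed to it are illustrative assumptions, not workshop material. A snapshot like this complements a proper environment file and does not replace one.

```python
import platform
import sys
from importlib import metadata


def describe_environment(packages):
    """Collect interpreter, OS, and package versions into a dict."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record the gap instead of failing, so the snapshot is complete.
            versions[name] = "not installed"
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "executable": sys.executable,
        "packages": versions,
    }


# The package names below are examples; list whatever your analysis imports.
for key, value in describe_environment(["numpy", "matplotlib"]).items():
    print(f"{key}: {value}")
```

Saving this output next to each set of results makes it possible to tell later which versions produced a given figure or table.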

4. Develop Skills to Combat Threats

  • Version Control: Utilize GitHub for version control to track and manage changes in the codebase, enhancing collaboration and code quality.
  • Generalized Functions: Learn to write generalized functions that increase code reuse and reduce errors.
  • Environment Management: Master techniques for effective environment management to ensure that code runs reliably across different systems.
  • Unit Tests and Pseudo-Unit Tests: Develop and apply unit tests and pseudo-unit tests to validate code functionality and prevent regressions.
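
To make the "generalized functions" and "unit tests" items concrete, here is a small hedged sketch. The function `zscore` and its tests are hypothetical examples chosen for illustration; they are not taken from the workshop's case study, and the plain `assert`-based checks may or may not match what the lectures call pseudo-unit tests.

```python
import statistics


def zscore(values):
    """Standardize any sequence of numbers to mean 0 and SD 1."""
    values = list(values)
    if len(values) < 2:
        raise ValueError("need at least two values to standardize")
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    if sd == 0:
        raise ValueError("values are constant; z-scores are undefined")
    return [(v - mean) / sd for v in values]


def test_zscore_centers_and_scales():
    result = zscore([2.0, 4.0, 6.0])
    assert abs(statistics.fmean(result)) < 1e-12
    assert abs(statistics.stdev(result) - 1.0) < 1e-12


def test_zscore_rejects_constant_input():
    try:
        zscore([3.0, 3.0, 3.0])
    except ValueError:
        return
    raise AssertionError("expected ValueError for constant input")


# Run the checks directly; a test runner such as pytest would also
# discover functions named test_*.
test_zscore_centers_and_scales()
test_zscore_rejects_constant_input()
print("all checks passed")
```

Because `zscore` accepts any sequence of numbers rather than one hard-coded dataset, the same function and the same tests can be reused across analyses.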

Contact

For any inquiries, installation issues, or additional support, feel free to reach out via email (pbloniasz [at] [bu] [.] [edu]) or through the Slack workspace.

Thank you for joining this workshop. Let's dive into the exciting world of scientific computing with Python!