Now that I've identified the gaps, I'll need to translate these into performance-based outcomes:
1. Incorporate the principles of defensive programming into my workflow: Defensive programming includes assertions, logging, and unit tests. This matters because error-free analysis builds trust with partners and opens up more opportunities for future projects.
2. Accelerate code speed: I'll break this down into a few components:
- Get better at translating logic into code
- Understand how different Python runtimes work under the hood
- Learn distributed computing
3. Decide on the business area of focus: This will require more thinking & experimentation.
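To make the first outcome concrete, here is a minimal sketch of how assertions, logging, and a unit test might fit together in one small piece of analysis code. The function `mean_order_value` and its inputs are hypothetical, made up purely for illustration.

```python
import logging
import unittest

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def mean_order_value(order_totals):
    """Return the average of a non-empty list of non-negative order totals.

    Hypothetical example function, not taken from any real project.
    """
    # Assertions: fail fast on inputs the analysis should never receive.
    assert len(order_totals) > 0, "order_totals must not be empty"
    assert all(t >= 0 for t in order_totals), "order totals must be non-negative"

    result = sum(order_totals) / len(order_totals)

    # Logging: leave a trail that helps when a number looks wrong later.
    logger.info("Computed mean over %d orders", len(order_totals))
    return result


class MeanOrderValueTest(unittest.TestCase):
    """Unit tests: pin down the expected behaviour so regressions get caught."""

    def test_typical_input(self):
        self.assertAlmostEqual(mean_order_value([10.0, 20.0, 30.0]), 20.0)

    def test_empty_input_is_rejected(self):
        with self.assertRaises(AssertionError):
            mean_order_value([])
```

One caveat with this sketch: Python strips `assert` statements when run with the `-O` flag, so checks that must always run are better written as explicit `if ...: raise ValueError(...)` guards.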
1. Clean, Pythonic Code - Incorporate PEP 8 standards into my coding [1 week] - Key here will be to remove bad habits from my programming and make my code legible and easy to understand.
Resources:
- Clean Python - Chapter 1
- Clean Code in Jupyter Notebooks
- Data Scientists, Your Variable Names Suck. Here's how to fix them
*Method of Learning - Read Clean Python + Active Recall on concepts by writing up my own example.
2. Writing Better Functions/Classes [1 week] - Improve the quality and effectiveness of the functions and classes I write.
Resources:
*Method of Learning - Read Clean Python + Active Recall on concepts by writing my own program.
3. Debugging/Testing in Python [1 week] - Key here will be to reduce the number of errors in my code.
4. Distributed Computing [1 month] - Understand how to use PySpark functionality.
- Build an end to end ML pipeline using PySpark or Spark
5. Decorators & Context Managers [1 week] - Improve functions by incorporating decorators into my workflow.
6. Generators & Iterators [1 week]
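As a reference point for items 5 and 6, here is a small sketch of the three constructs side by side: a decorator, a context manager, and a generator. The names `timed`, `tracked_step`, `batches`, and `total` are hypothetical helpers invented for this example. They only illustrate the patterns and are not a recommended design.

```python
import functools
import time
from contextlib import contextmanager


def timed(func):
    """Decorator: record how long the most recent call to func took."""
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            wrapper.last_elapsed = time.perf_counter() - start

    wrapper.last_elapsed = None
    return wrapper


@contextmanager
def tracked_step(name, log):
    """Context manager: append start/end markers to log, even on error."""
    log.append(f"start {name}")
    try:
        yield
    finally:
        log.append(f"end {name}")


def batches(items, size):
    """Generator: lazily yield lists of at most `size` items."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly shorter, batch
        yield batch


@timed
def total(values):
    """Sum the values; decorated so each call is timed."""
    return sum(values)
```

For example, `list(batches(range(5), 2))` evaluates to `[[0, 1], [2, 3], [4]]`, and wrapping a block in `with tracked_step("load", log):` leaves `["start load", "end load"]` in `log`.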
- How I plan to integrate these into my schedule