Coverage Report
File | Stmts | Miss | Cover | Missing |
---|---|---|---|---|
pintless | ||||
quantity.py | 150 | 51 | 66% | 33, 37, 53, 60, 71, 78, 90–98, 104, 128, 136, 154–167, 176, 180, 184, 229, 245, 254–272, 278, 281–285, 289, 292, 295, 298, 301, 305, 308, 311, 317 |
registry.py | 165 | 15 | 91% | 75, 94, 111, 124, 127, 143, 145, 171, 193, 273, 295, 299, 314, 320, 328 |
unit.py | 148 | 11 | 93% | 37, 46, 49, 52, 119, 127, 239, 255, 295, 309, 351 |
TOTAL | 468 | 77 | 84% | |
The unit library pint is fantastic. It removes a whole class of bugs from common data science workloads, and provides good tools for humans to process numbers. But it's very slow, and provides vast swaths of functionality that I don't need.
This library is like pint, but less: it lets you use units without pain, while aiming to be performant and small in memory. It's designed as a drop-in replacement for pint for those of us who don't use most of pint's features.
The choice of features is based on my personal experience in projects. From this I've developed some design principles, listed below, that are followed here to prevent feature creep and bloat.
Things this doesn't support:
- Units that don't scale from 0, e.g. degrees C and F
- LaTeX output
- Translation to other languages --- want different units? Use a different definition file
- Simplification of units: algebraic simplifications are necessary, but choosing 'sensible' units for humans is beyond the scope of this lib
- Scientific notation and other non-unit number representation problems. Pintless attempts to touch the values as little as possible.
- Numpy/pandas support
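To illustrate the first exclusion in the list above: a unit that scales from zero converts with a single multiplication, so conversions compose by multiplying factors. Degrees C and F also need an offset, which breaks that model. The sketch below is plain Python with hypothetical names (`METRES_PER`, `convert_length`, `celsius_to_fahrenheit`); it is not pintless code.

```python
# Illustrative only: plain Python, not the pintless API.

# Units that scale from zero need just one factor to reach a base unit.
METRES_PER = {"m": 1.0, "km": 1000.0, "cm": 0.01}  # hypothetical table


def convert_length(value, src, dst):
    """Convert between multiplicative units with a single ratio."""
    return value * METRES_PER[src] / METRES_PER[dst]


def celsius_to_fahrenheit(value):
    """Offset units need a scale and a shift, so no single factor works."""
    return value * 9 / 5 + 32


# Doubling commutes with a purely multiplicative conversion...
doubled_then_converted = convert_length(2 * 3.0, "km", "m")
converted_then_doubled = 2 * convert_length(3.0, "km", "m")

# ...but not with an offset conversion.
c_doubled_then_converted = celsius_to_fahrenheit(2 * 10.0)
c_converted_then_doubled = 2 * celsius_to_fahrenheit(10.0)
```

Because the two Celsius results differ, arithmetic on offset units needs special-case handling that a purely multiplicative library can skip.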
Design Principles
- Fast is more important than small or simple
- Precompute where possible
- Having principles usually leads you to design for them rather than reality: design based on benchmarks
- Don't incur performance costs for obscure units or use-cases: allow users to specify minimal sets of things for common workflows
- Don't incur performance costs for nicer APIs that are only useful in interactive workflows (e.g. string processing, output to notebooks, etc)
- Quantity and Unit classes are numbers and should be as transparent/minimal as possible, holding as few external references as possible, and should be serialisable with as little pain as possible
Roadmap
- Better test pack
- Better benchmarks, and performance testing in CI
- Compilation of parts of the library to C
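As a rough illustration of the "Precompute where possible" principle combined with a minimal unit set, the sketch below builds every pairwise conversion factor once, so the hot path is a single dictionary lookup and a multiplication. All names (`SECONDS_PER`, `FACTORS`, `convert`) are hypothetical and this is not how pintless is necessarily implemented; in keeping with the principles above, whether such a table actually helps should be decided by benchmarks.

```python
# Illustrative only: hypothetical names, not the pintless implementation.
from itertools import product

# Factor to a shared base unit (seconds) for a minimal set of units.
SECONDS_PER = {"s": 1.0, "min": 60.0, "h": 3600.0}

# Precompute every pairwise factor once, at import time.
FACTORS = {
    (src, dst): SECONDS_PER[src] / SECONDS_PER[dst]
    for src, dst in product(SECONDS_PER, repeat=2)
}


def convert(value, src, dst):
    """Hot path: one dictionary lookup and one multiplication."""
    return value * FACTORS[(src, dst)]


two_hours_in_minutes = convert(2.0, "h", "min")
```

The table grows with the square of the number of units, which is one reason a small, user-chosen unit set suits this approach.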
At the time of writing, pintless is roughly 20 times faster than pint. One of the roadmap actions above is to establish a much better benchmarking process for this figure, though, so take it with a grain of salt for now.
A small benchmark script, benchmark.py, is in the root of this repository. To run it (and visualise the output using snakeviz):
python -m cProfile -o both.prof benchmark.py
snakeviz both.prof
The benchmark is dead simple, and prioritises repeated simple actions (as per a lot of processing workloads).
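If snakeviz isn't available, a profile can also be summarised as text with the standard library's `pstats` module. The sketch below profiles a stand-in `workload` function, which is hypothetical and not the contents of benchmark.py, and prints the five most expensive calls by cumulative time.

```python
import cProfile
import io
import pstats


def workload(n=10_000):
    """Stand-in for repeated simple actions; hypothetical, not benchmark.py."""
    total = 0.0
    for i in range(n):
        total += i * 0.001  # e.g. scaling a value by a unit factor
    return total


# Profile only the workload call.
profiler = cProfile.Profile()
profiler.enable()
result = workload()
profiler.disable()

# Summarise the five most expensive calls by cumulative time.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

A file written by the cProfile command above can be loaded the same way by passing its path, e.g. `pstats.Stats("both.prof")`.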