Generic issue for keeping track of some TODOs in no specific order of importance
Important
I spent a lot of time getting boilerplate implementations to compile without warnings, which means some algorithms are not implemented entirely correctly and do not produce the expected results. To combat this, I will need to step through each module's components (classes and methods, functions, dependency chains, call trees, etc.) and modify what is needed
Overall optimization. So far the linear algebra module's matrix-matrix operations contain an optimized kernel for SSE ISAs. I want to add some optimizations in other modules of this project
For matrix and vector operations add optimized assembly kernels for AVX / AVX2 ISAs and verify compatibility with SSE2 ISAs
Make use of the matrix / vector optimizations in the machine learning module
Beyond matrix / vector operations target optimization in the entire linear algebra module
Explore how to optimize the number theory module as well and what this could possibly look like
For this, dig into each module and the individual units of classes that can be turned into standalone assembly kernel files. To get a soft estimate/boilerplate, disassemble naive programs compiled with different optimization flags (-O2, -O3, -fopt-info-vec-optimized, -mavx, -march=, etc.) and make these callable from the source C++ interface
Instead of creating assembly kernels for every function, be meticulous: identify the parts of the code where most time is spent and optimize those portions
Verify, clean up, and optimize the following (compare against Eigen, Armadillo, NumPy, SciPy, etc.?) using the above approach:
linalg/eigen.cpp
linalg/svd.cpp
linalg/tensor.cpp (TODO: this needs to be implemented correctly)
nt/factorization.cpp
nt/prime_gen.cpp
nt/prime_test.cpp
nt/random.cpp
calculus/differential.cpp
stats/cdfs.cpp
stats/pdfs.cpp
stats/resampling.cpp
More areas but the above will suffice for now....
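One way to get the soft estimate described above is to start from a naive loop, compile it with different flags, and time it before writing any assembly. The sketch below is only an illustration under that assumption: `naive_axpy` and `time_ms` are hypothetical names and are not part of the project.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical naive kernel: y[i] += a * x[i].
// A loop like this is a typical auto-vectorization candidate. With GCC,
// a command along the lines of
//   g++ -O3 -mavx -fopt-info-vec-optimized -S naive.cpp
// reports which loops were vectorized and emits assembly to inspect.
void naive_axpy(double a, const std::vector<double> &x,
                std::vector<double> &y) {
    assert(x.size() == y.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] += a * x[i];
    }
}

// Hypothetical helper: run `fn` `reps` times and return the average
// wall-clock time per run in milliseconds.
template <typename Fn>
double time_ms(Fn &&fn, int reps = 10) {
    assert(reps > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < reps; ++r) {
        fn();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / reps;
}
```

Timing the same call across builds with different flags gives a rough idea of which files in the list are worth a hand-written kernel. A profiler or a dedicated benchmarking library would give more reliable numbers than this kind of wall-clock average.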
Improving workflows
In general, get all workflows to pass. As of right now the README shows "no status" for workflow badges
EDIT: this was due to workflows being triggered from a 'master' file (.github/workflows/opengpmp.yml)
Edit the documentation building workflow so it does not run on every commit. Instead, it should be triggered when there are changes to the docs/ directory or to the source code in the include/ and modules directories
There may be other workflows that need to be edited so they only run on specific triggers
The tinygpmp implementation could ideally reuse the existing C++ code for ease of development, but most embedded platforms already use C, so a re-implementation may be worth it? I don't plan for a large number of users to consume this, so C++ will be ideal for touching once and never again.
The issue with much of the C++ implementation is its use of large integer sizes: 64-bit integers in many cases, as well as 32-bit. To target embedded platforms, this should be made dynamic somehow?
There is a similar issue with memory usage. The matrix operations, specifically DGEMM, allocate buffers in memory for matrices in advance. This assumes the machine has the resources to do so and is fairly lazy, but the speed of this implementation comes from that. To tackle this, memory should likely not be allocated up front and a different approach should be taken
Eventually machine learning libraries as well when we get there...
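The two embedded concerns above (integer width and up-front buffer allocation) could be explored together with something like the following. This is a minimal sketch, not the project's DGEMM: `tiny_sketch`, `tile_end`, `gemm_inplace`, and the default `BLOCK` size are all hypothetical. The element and index types are template parameters so a small target can pick narrower types, and the tiling is done purely with loop bounds on caller-owned storage, so nothing is allocated.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace tiny_sketch {

// End of the tile that starts at `start`, clamped to `limit`.
// Written so that `start + block` is only computed when it cannot
// exceed `limit`, which avoids wrap-around with narrow index types.
template <typename Index>
Index tile_end(Index start, Index limit, Index block) {
    return (limit - start < block) ? limit
                                   : static_cast<Index>(start + block);
}

// C (m x n) += A (m x k) * B (k x n), all row-major.
// No heap allocation and no packing buffers: tiles are just index ranges.
// BLOCK is a hypothetical tile size that would need tuning per target.
template <typename T, typename Index, Index BLOCK = 8>
void gemm_inplace(Index m, Index n, Index k,
                  const T *A, const T *B, T *C) {
    static_assert(BLOCK > 0, "tile size must be positive");
    assert(A != nullptr && B != nullptr && C != nullptr);
    for (Index ii = 0; ii < m;) {
        const Index i_end = tile_end(ii, m, BLOCK);
        for (Index pp = 0; pp < k;) {
            const Index p_end = tile_end(pp, k, BLOCK);
            for (Index jj = 0; jj < n;) {
                const Index j_end = tile_end(jj, n, BLOCK);
                for (Index i = ii; i < i_end; ++i) {
                    for (Index p = pp; p < p_end; ++p) {
                        const T a = A[static_cast<std::size_t>(i) * k + p];
                        for (Index j = jj; j < j_end; ++j) {
                            C[static_cast<std::size_t>(i) * n + j] +=
                                a * B[static_cast<std::size_t>(p) * n + j];
                        }
                    }
                }
                jj = j_end;
            }
            pp = p_end;
        }
        ii = i_end;
    }
}

} // namespace tiny_sketch
```

A call such as `tiny_sketch::gemm_inplace<float, std::uint16_t>(m, n, k, A, B, C)` would use 16-bit indices and single-precision elements. The trade-off is the one already noted: without packed buffers this is likely slower than the current DGEMM, so it would only make sense where memory is the tighter constraint.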
Code duplication and cleanup
Clean up misuses of OOP and classes in the project. Change classes that hold no state but have many methods to either not be classes or to use static methods that can be called without object instantiation
Remove duplicated code in test suite to start
Remove/refactor linalg matrix operations. As of right now there are a bunch of array and std::vector specific files for specific ISAs. This should be cleaned up along with a formal interface for the DGEMM implementation
Remember the calling tree for this: we want a main interface that calls the ASM kernels
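The calling tree described above (one main interface that picks an ISA-specific kernel) might look roughly like this. It is a sketch under several assumptions: all names in `sketch` are hypothetical, a dot product stands in for DGEMM to keep it short, and `dot_avx2` is only a placeholder that forwards to portable code where a real assembly kernel would be linked in. The runtime check uses the GCC/Clang builtin `__builtin_cpu_supports` and is compiled only on x86 targets.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {

// Common signature shared by every kernel behind the interface.
using dot_kernel = double (*)(const double *, const double *, std::size_t);

// Portable fallback kernel.
inline double dot_generic(const double *a, const double *b, std::size_t n) {
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}

// Placeholder for an AVX2 kernel. In a real build this would be an
// extern "C" symbol implemented in an assembly file; here it forwards
// to the portable code so the sketch stays runnable.
inline double dot_avx2(const double *a, const double *b, std::size_t n) {
    return dot_generic(a, b, n);
}

// Choose a kernel once, based on what the CPU reports at runtime.
inline dot_kernel select_dot_kernel() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    if (__builtin_cpu_supports("avx2")) {
        return dot_avx2;
    }
#endif
    return dot_generic;
}

// Main interface: callers never name an ISA-specific file directly.
inline double dot(const double *a, const double *b, std::size_t n) {
    assert(n == 0 || (a != nullptr && b != nullptr));
    static const dot_kernel kernel = select_dot_kernel();
    return kernel(a, b, n);
}

} // namespace sketch
```

With this shape, the array-specific and std::vector-specific files could collapse into thin wrappers that pass raw pointers to the single entry point, and adding a new ISA means adding one kernel plus one branch in the selector.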
Language bindings
pygpmp is currently broken and needs more tailoring. In addition to SWIG, make use of Boost.Python to tailor the bindings correctly. Ideally, as much of the wrapping and code generation as possible is left to SWIG, with Boost.Python used in places where more customization is needed.
Get each module working and wrapped correctly
Create and verify working samples for the wrapped code
gpmp.jl is in progress. I would like to use wrapit for automatic code generation for as much of the work as possible, with specific tailoring where needed.
Get each module working and wrapped correctly
Create and verify working samples for the wrapped code
These will be the only two languages I want to target for bindings
Add benchmarks, which could probably make use of google/benchmark