Porting Moment Computation to Rust and Caching CGR Matrix Computation. #26

YCC-ProjBackups · 2023-11-29T19:09:55Z

This is THE main PR that I was working on. I deleted the version control switch, so it only uses the most recent version of the code. Although I tried to rebase on the most recent main and my cleanup PR (to delete unused variables), I may have made a mistake. Please let me know if you see anything out of order.

Here is the list of main changes:

1. compute_moments is ported to Rust. As discussed with Rosy, the Rust version takes the inverse and determinant computed by numpy for numerical stability reasons. This makes it significantly slower than the pure Rust version (4 to 10 times slower), but its results deviate less from those of the original Python version (the maximum observed relative error was ~5e-9). Even so, the "slower" version is still 6 to 45 times faster than the original implementation on my local machine. I will look into the determinant computation and try to find the source of the error. I will also run scripts on CHTC to get more objective runtime comparisons.
2. ClebschGordanReal now uses the new CGRCacheList. It uses the CLOCK algorithm (originally used for page replacement) to store the CG matrices of the 5 "most recently used" l_max values [*]. This has the unfortunate(?) side effect of limiting l_max to 2^31 - 1, but that should not be a problem, as we expect l_max to be much smaller than that. On my local machine, it divides the number of calls to the constructor by the number of iterations run with the same l_max. For example, when I ran tests on my local machine that perform 5 iterations with an l_max of 10, the number of calls to the constructor went from 3355 to 671, a decrease by a factor of 5. As such, runtime also decreases by a factor equal to the number of iterations. Note that the cache currently has a capacity of 5. This is an arbitrary choice, and you can increase it later. However, doing so may increase memory usage, especially when the cache holds matrices for multiple large l_max values. In the future, I will try to run some tests on CHTC to determine the average runtime improvement from caching.
3. The looping structure inside the pairwise_ellip_expansion function has changed. Before, it looped through every possible pair of species and checked whether the pair was in neighbor_list.keys. I changed it so that it loops directly through the elements of neighbor_list.keys. As such, pairwise_ellip_expansion no longer requires species as one of its arguments.
4. I have added a new test_compute_moments.py to compare the moment generation between the Rust version and the Python version. There are two kinds of tests: tests with set parameters and a randomized test. As mentioned in (1), the maximum error observed so far was ~5e-9, so both of my tests use assert numpy.allclose(x, y, 1e-8) to test for closeness. I have run several iterations and it has not failed so far.
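
A note on the tolerance used in the new tests: the third positional argument of numpy.allclose is rtol, so the call above sets the relative tolerance to 1e-8 and leaves atol at its default of 1e-8. The snippet below is only a sketch with made-up numbers (the arrays are hypothetical, not output from compute_moments) showing that a relative error of 5e-9 passes this check while a larger one does not.

```python
import numpy as np

# Hypothetical reference values standing in for the Python moments.
reference = np.array([1.0, 250.0, 1.0e4])

# Simulate a relative error of 5e-9, the largest error reported so far.
perturbed = reference * (1.0 + 5e-9)

# The third positional argument is rtol; atol keeps its default (1e-8).
# allclose checks |x - y| <= atol + rtol * |y| elementwise.
within_tolerance = np.allclose(perturbed, reference, 1e-8)

# A relative error of 1e-6 falls well outside the tolerance.
too_far = np.allclose(reference * (1.0 + 1e-6), reference, 1e-8)

print(within_tolerance, too_far)
```

Because atol is added on top of the relative term, values close to zero are effectively compared with an absolute tolerance of 1e-8.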

I know it is a big PR, so let me know if you have any questions and/or concerns.

[*]: If you look up the CLOCK algorithm, you will see that it approximates the "least recently used" replacement policy. However, the approximation is not exact, and it may replace one of the more recently used entries.
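
As a rough illustration of the replacement policy described in this footnote, here is a minimal sketch of a CLOCK (second-chance) cache. The class name ClockCache, its methods, and the string values are hypothetical and do not mirror the actual CGRCacheList; the only detail borrowed from this PR is that a newly inserted entry starts with its flag set to 0.

```python
class ClockCache:
    """Fixed-capacity cache with CLOCK (second-chance) replacement."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._keys = []     # slot -> key
        self._values = []   # slot -> cached value
        self._flags = []    # slot -> reference flag (0 or 1)
        self._slot_of = {}  # key -> slot index
        self._hand = 0      # clock hand: next slot to examine

    def get(self, key):
        """Return the cached value, or None on a miss."""
        slot = self._slot_of.get(key)
        if slot is None:
            return None
        self._flags[slot] = 1  # mark the entry as recently used
        return self._values[slot]

    def insert(self, key, value):
        """Insert a value, evicting one entry if the cache is full."""
        if key in self._slot_of:
            slot = self._slot_of[key]
            self._values[slot] = value
            self._flags[slot] = 1
            return
        if len(self._keys) < self._capacity:
            self._slot_of[key] = len(self._keys)
            self._keys.append(key)
            self._values.append(value)
            self._flags.append(0)  # new entries start unreferenced
            return
        # Sweep the hand, clearing flags, until an unreferenced slot is found.
        while self._flags[self._hand] == 1:
            self._flags[self._hand] = 0
            self._hand = (self._hand + 1) % self._capacity
        victim = self._hand
        del self._slot_of[self._keys[victim]]
        self._keys[victim] = key
        self._values[victim] = value
        self._flags[victim] = 0
        self._slot_of[key] = victim
        self._hand = (victim + 1) % self._capacity


# Usage: keys play the role of l_max, values stand in for CG matrices.
cache = ClockCache(capacity=2)
cache.insert(3, "cg-for-lmax-3")
cache.insert(5, "cg-for-lmax-5")
cache.get(3)                      # marks l_max = 3 as recently used
cache.insert(7, "cg-for-lmax-7")  # evicts l_max = 5, keeps l_max = 3
```

Each slot carries only a single flag rather than a full access ordering, which is why the policy only approximates true LRU.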

Implemented Rust FFI for moments and caching for CGR.
"Committing Rust FFI-related files"
Changed name of csv file to something shorter.
Changed the file system so that the file can be run from the root with python3 -m tests.ex_input
Moved all configs to ex_inputs.py file and ran diff check on results from two languages.
Implemented max-16-iter rule check for time mode
Fixed a bug that occurred when code_timer in ex_input.py was set to None.
Deleted unnecessary folder, equistore.
Changed the script to perform comparison between different versions
Updating isort on __init__
Update tests.yml to include coverage (#8)
Update tests.yml
Update tests.yml
Update tests.yml
Update README.md
Update tests.yml (#9)
* Update tests.yml
* Update tests.yml
Update tests.yml (#9)
* Update tests.yml
* Update tests.yml
Adding new tests for EDP (#7)
* Adding new tests for EDP
* Making requisite changes for equistore compatibility
* improved code coverage to 99% by testing for 3 additional cases: show_progress=True, multiple frames, and matrix rotations
* pass the linter
---------
Co-authored-by: Arthur Lin <[email protected]>
5 incorporate normalization factors (#6)
* Added (and made default behavior) the ability to orthonormalize features that use the GTO basis.
* This involves normalizing the features properly, creating an overlap matrix with orthogonal GTOs, and orthogonalizing the features.
* Added relevant tests to test new orthonormality functionality
* Added a jupyter notebook displaying how Lowdin Orthonormalization works (on a small gto basis set).
---------
Co-authored-by: Rose K. Cersonsky <[email protected]>
Co-authored-by: Arthur Lin <[email protected]>
minor changes to accomodate new equistore api
Updating to be in line with new equistore formatting (#15)
Adding progress bar for sanity sake (#14)
added warning for passing in integer, and cast any int arguments to f… (#13)
* Raise an error when a float is passed for radial_gaussian_width, and added a unit test to ensure error is raised
---------
Co-authored-by: Arthur
Added code that allows multiple iterations per parameter set. Now uses tqdm to keep track of progress.
Removed all internal timing code. The version selection feature still remains for testing purposes.
Implemented v2, in which loop for pairwise_ellip_expansion changed (further testing needed). Also, minor changes to output file.
Cross-platform support? Needs further testing
Added mac move command
Fixed file extension error on Mac
Moved caching logic to CGR class itself, instead of passing it as a keyword argument.
Removed unnecessary keyword argument from single_pass function
minor syntax changes for 3.9
added test ellipsoid trimers for performance testing
Added subprocess shell command to automatically run makefile.
Running makefile automatically while pip install version 2.
Added ell-trimers.xyz file.
Cross platform support... hopefully. Version 3.
Added MANIFEST.in file
Changed l_cut to lcut in cg_combine.
Fixed the issue of v0's output differing from the original implementation.
Some minor cleanup before some more experimental changes
Some code cleanup, mostly on the Rust side.
Updated equistore to metatensor. Requires changing _classes.py of metatensor.
Updated metatensor and rascaline to latest version. Changed import statements, and no _classes.py modification necessary.
Removed unused (for now) Rust dependencies
Some basic modification for test.
Added code to test running time depending on frame numbers.
Added code to test running time depending on frame numbers.
Changed implementation to determine repetition number with argv
Changed CGRCacheList replacement technique from FIFO to Clock (and added documentation)
Fixed a detail in Clock algorithm (initial insertion should have replacement flag = 0)
Fixed uninitialized _keys in cyclic_list.py
Fixed uninitialized _keys in cyclic_list.py
Updated metatensor field names to the most recent changes
Removing equistore in favor of metatensor (#20)
* Removing equistore in favor of metatensor
* fixing metatensor.core
Adding subtract self to the neighborlist generator (#19)
Switching list comp (#24)
Switching the code to do the list comp once instead of within each loop
Fixed mismatch between Python and Rust compute_moment
Cleaned up unused imports and variables.
Fixed minor spelling error
Deleted unncessary space in monomial_iterator.py
Ran isort to satisfy the linter.
Changed Rust implementation such that it takes inverse and determinant from Numpy instead of calculating internally.
Deleted version control switches and ran isort to satisfy linter.
Ran isort again.
Fixed a bug that occurs when both error AND inverse are 0.
Copied main's contract_pairwise_feat code.
Adding ability to override number of radial bases
manual override n and added test cases
removed extraneous arg
Linter
Linter
Pointing to cutoff radius properly
I messed with the history so now I'm restoring it
pass linter
removed error check restricting independent specification of max_radial and radial_gaussian_width
added tests to ensure constant n functionality is correct
linter
num_radial -> max_radial in edp.py
removed changes pertaining to coupling sigma and n to keep PR clean
restored original compute_gaussian_parameters function header
incorporated necessary changes to make constant n work for both orthonormalization and moments generation
pass linter
removed deprecated comment
made definition of maxradial consistent with num_n and updated tests
Update lint.yml (#25)
add diff to linter so we can see how files are failing the linter.
Used Black formatter

arthur-lin1027 · 2023-12-08T00:26:01Z

Why are the tests failing exactly? It seems things aren't importing properly?

arthur-lin1027 · 2024-09-06T21:50:30Z

Closing this because PRs #46 and #37 implemented these changes.

YCC-ProjBackups requested review from arthur-lin1027 and rosecers November 29, 2023 19:09

rosecers force-pushed the ycc/pr/rust_and_cache branch from cc7d30c to 916b2e7 Compare December 5, 2023 23:51

Linting gods

886344e

rosecers force-pushed the ycc/pr/rust_and_cache branch 3 times, most recently from 1041c79 to 50c4da4 Compare December 12, 2023 20:01

rosecers added 2 commits December 12, 2023 21:05

caveman time

ff0a5e5

caveman time

ca316dd

rosecers force-pushed the ycc/pr/rust_and_cache branch 3 times, most recently from 691ec50 to ca316dd Compare December 12, 2023 20:36

arthur-lin1027 mentioned this pull request Sep 2, 2024

Rust moments #46

Merged

arthur-lin1027 closed this Sep 6, 2024
