WIP: Use HPC for CI #386
so that Github-Actions now should run again
- module load CuPy
- pwd
- ls -lah
- pip install -e .
Do you need to `pip install` in the CI? User installation with pip is persistent, afaik. Is the GitLab not associated with any user?
I would prefer to put the `module load` commands in the job scripts. But if you need to repeat the `pip install`s every time, it's better here.
I'm not sure whether we need `pip install` each time. The jobs run as a predefined user (the one who triggers the CI in GitLab). As mirroring is done with a personal access token, that user is impersonated and the CI will always run as this specific user. So if `pip install` is persistent, it will be available in all runs, as these runs are executed as the same user.
Regarding `module load`: we can put them in the script. I find it easier/nicer if the scripts are as short as possible, so I moved the `module load` into the YAML file. Furthermore, these steps (such as `module load` or `pip install`) are executed on a login node, while the content of the sh-file is executed on a compute node. In terms of quota, it is "cheaper" if we don't spend compute time on `module load` (although it does not take long).
If you want to, feel free to move the `pip install` and `module load` into the script. If you would like me to do that, feel free to ping me!
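The split described above could look roughly like the following GitLab CI fragment. This is only a sketch: the job name `benchmark`, the script name `run_benchmarks.sh`, and the use of `sbatch --wait` are assumptions for illustration and are not taken from this PR.

```yaml
# Hypothetical .gitlab-ci.yml fragment (job and script names are assumptions).
benchmark:
  before_script:
    # These commands run where the runner executes the job (the login node
    # in the setup discussed here), so they do not consume compute time.
    - module load CuPy
    - pip install -e .
  script:
    # The batch script itself runs on a compute node. --wait makes sbatch
    # block until the Slurm job has finished, so the CI job ends with it.
    - sbatch --wait run_benchmarks.sh
```

By default `sbatch` passes the submitting shell's environment on to the batch job, which is what would make modules loaded in `before_script` visible inside the script. If the cluster configures this differently, the `module load` lines would have to move into the sh-file after all.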
paths:
  - benchmarks
  - sbatch.err
  - sbatch.out
Could we also have a directory where all job scripts can post their output? It would be neat to allow multiple job scripts. Not really needed, though.
Yes, sure, we can have a directory for the outputs. Each job in GitLab is spawned in a separate directory, so multiple GitLab jobs won't put files in the same directory. If multiple Slurm tasks are spawned from a single GitLab job, having this directory might make sense.
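One possible shape for such a shared output directory is sketched below. The directory name `job_output`, the job name, and the script name are hypothetical; `%x` and `%j` are Slurm's filename patterns for the job name and job ID, so several Slurm jobs submitted from one GitLab job would not overwrite each other's logs.

```yaml
# Hypothetical sketch: collect the output of all Slurm jobs in one directory.
benchmark:
  script:
    - mkdir -p job_output
    # %x expands to the Slurm job name, %j to the Slurm job ID.
    - sbatch --wait --output=job_output/%x-%j.out --error=job_output/%x-%j.err run_benchmarks.sh
  artifacts:
    # Keep the logs even if the job fails, which is when they matter most.
    when: always
    paths:
      - job_output/
```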
For the job |
How do we continue? If I understand correctly, we need to add a personal access token from GitLab as a secret to this repository. Does it make sense that @pancetta sets up the GitLab repo now? He has to add the secret anyhow.
Yes, it is probably best if @pancetta does this. I can also create the repo (or use an existing one of mine), but pancetta needs to add the secret in the end anyway. As said, the steps are listed above. If questions come up, I am happy to help, though in person only next year, as I'm out of office for the rest of this year.
OK, the repository is |
Note that the gitlab repository is empty, so I could not allow force-push. I tried with wildcard |
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@               Coverage Diff                @@
##           create_gitlab_ci     #386   +/-   ##
=================================================
  Coverage             74.04%   74.04%
=================================================
  Files                   274      274
  Lines                 23153    23153
=================================================
  Hits                  17143    17143
  Misses                 6010     6010
This should also work in Pull-Requests
Corrected indentation of nested list
Commit cdb77d4 merged into Parallel-in-Time:create_gitlab_ci
commit cbaae05 Author: jakob-fritz <[email protected]> Date: Wed Apr 24 10:34:09 2024 +0200 Make create_gitlab_ci branch up-to-date before merging into master (Parallel-in-Time#418) * first working SDC version (M and Minv) * Update playground.py * cleaning up * Added some hyphens in plots (Parallel-in-Time#389) * Removed seperate file for GPU Dahlquist implementation (Parallel-in-Time#391) Co-authored-by: Thomas <[email protected]> * Review (Parallel-in-Time#388) * Bug is fixed and added new code * new code for the table * Edits in markdown file * some edits in test * Bugs fix * Codecov * I cleaned up my code and separated classes to make it easier to work with. It is not ready yet; if Codecov fails, I will include more tests. * forgot black * flake8 * bug fix * Edits codes according to the comments * Edited codes according to the comments in the GitHub * Defined new function in stability_simulation.py to check stability for given points and excluded codecov function that generates a table. * small edits for codecov * removed no cover * NCCL communicators (Parallel-in-Time#392) * Added wrapper for MPI communicator to use NCCL under the hood * Small fix * Moved NCCL communicator wrapper to helpers --------- Co-authored-by: Thomas <[email protected]> * Version bump for new release * proper readme and link * Started playground for machine learning generated initial guesses for (Parallel-in-Time#394) SDC * playing with FEniCS * blackening * Bug fix (Parallel-in-Time#395) * readme file changes * fixed bugs for stability plots and some edits in README file * some edits * typo in citation * Bump version * Bug fix (Parallel-in-Time#396) * Clear documentation and some edits in the code * forgot black * some changes * bump version * Cosmetic changes (Parallel-in-Time#398) * Parallel SDC (Reloaded) project (Parallel-in-Time#397) TL: Added efficient diagonal preconditioners and associated project. 
Coauthored by @caklovicka * Generic multi-component mesh (Parallel-in-Time#400) * Generic multicomponent mesh * new try * Added a test for MultiComponentMesh * Test that the type is conserved also after numpy operations * Added documentation for how to use `MultiComponentMesh` * Changed formatting of the documentation * Update ci_pipeline.yml * version freak show * version freak show II * version freak show III * version freak show IV * Update ci_pipeline.yml * version freak show V * 2D Brusselator problem (Parallel-in-Time#401) * Added 2D Brusselator problem from Hairer-Wanner II. Thanks @grosilho for the suggestion! * Added forgotten pytest marker * Fix brain afk error * Added work counter for right hand side evaluations * Removed file for running Brusselator from project * Retry at removing the file * I need to go to git school * Datatype `DAEMesh` for DAEs (Parallel-in-Time#384) * Added DAE mesh * Updated all DAE problems and the SDC-DAE sweeper * Updated playgrounds with new DAE datatype * Adapted tests * Minor changes * Black.. :o * Added DAEMesh only to semi-explicit DAEs + update for FI-SDC and ProblemDAE.py * Black :D * Removed unnecessary approx_solution hook + replaced by LogSolution hook * Update WSCC9 problem class * Removed unnecessary comments * Removed test_misc.py * Removed registering of newton_tol from child classes * Update test_problems.py * Rename error hook class for logging global error in differential variable(s) * Added MultiComponentMesh - @brownbaerchen + @tlunet + @pancetta Thank ugit add pySDC/implementations/datatype_classes/MultiComponentMesh.py * Updated stuff with new version of DAE data type * (Hopefully) faster test for WSCC9 * Test for DAEMesh * Renaming * ..for DAEMesh.py * Bug fix * Another bug fix.. * Preparation for PDAE stuff (?) 
* Changes + adapted first test for PDAE stuff * Commented out test_WSCC9_SDC_detection() - too long runtime * Minor changes for test_DAEMesh.py * Extended test for DAEMesh - credits for @brownbaerchen * Test for HookClass_DAE.py * Update for DAEMesh + tests * 🎉 - speed up test a bit (at least locally..) * Forgot to enable other tests again * Removed if-else-statements for mesh type * View for unknowns in implSysFlatten * Fix for RK sweeper - changed nodes in BackwardEuler class (Parallel-in-Time#403) * Made aborting the step at growing residual optional (Parallel-in-Time#405) * `pySDC`-build-in `LagrangeApproximation` class in `SwitchEstimator` (Parallel-in-Time#406) * SE now uses LagrangeApproximation class + removed Lagrange class in SE * Removed log message again (not corresponding to PR) * version bump * Added hook for logging to file (Parallel-in-Time#410) * Monodomain project (Parallel-in-Time#407) * addded some classes from oldexplicit_stabilized branch. Mainly, the problems description, datatype classes, explicit stabilized classes. Tested for IMEX on simple problems * added implicit,explicit,exponential integrator (in electrophysiology aka Rush-Larsen) * added exponential imex and mES, added parabolic_system in vec format * added new stabilized integrators using multirate, splitting and exponential approaches * before adding exponential_runge_kutta as underlying method, instead of the traditional collocation methods * added first order exponential runge kutta as underlying collocation method. To be generalized to higher order * generalized exponential runge kutta to higher order. 
Added exponential multirate stabilized method using exponential RK but must tbe checked properly * fixed a few things * optimized a few things * renamed project ExplicitStabilized to Monodomain * removed deprecated problems * fixed some renaming issues * did refactoring of code and put in Monodomain_NEW * removed old code and renamed new code * added finite difference discretization * added many things, cant remember * old convergence_controller * addded some classes from oldexplicit_stabilized branch. Mainly, the problems description, datatype classes, explicit stabilized classes. Tested for IMEX on simple problems * added implicit,explicit,exponential integrator (in electrophysiology aka Rush-Larsen) * added exponential imex and mES, added parabolic_system in vec format * added new stabilized integrators using multirate, splitting and exponential approaches * before adding exponential_runge_kutta as underlying method, instead of the traditional collocation methods * added first order exponential runge kutta as underlying collocation method. To be generalized to higher order * generalized exponential runge kutta to higher order. 
Added exponential multirate stabilized method using exponential RK but must tbe checked properly * fixed a few things * optimized a few things * renamed project ExplicitStabilized to Monodomain * removed deprecated problems * fixed some renaming issues * did refactoring of code and put in Monodomain_NEW * removed old code and renamed new code * added finite difference discretization * added many things, cant remember * added smooth TTP model for conv test, added DCT for 2D and 3D problems * added plot stuff and run scripts * fixed controller to original * removed explicit stabilized files * fixed other files * removed obsolete splittings from ionic models * removed old sbatch scripts * removed mass transfer and sweeper * fixed something * removed my base transfer * removed hook class pde * removed FD files * fixed some calls to FD stuff * removed FEM FEniCSx files * renamed FD_Vector to DCT_Vector * added hook for output and visualization script * removed plot scripts * removed run scripts, except convergence * removed convergence experiments script * fixed TestODE * added stability test in run_TestODE * added stability test in run_TestODE * added stability test in run_TestODE * removed obsolete stuff in TestODE * removed unneeded stuff from run_MonodomainODE * cleaned a bit run_MonodomainODE * removed utils/ * added few comments, cleaned a bit * removed schedule from workflow * restored tutorial step 7 A which I has modified time ago * run black on monodomain project * fixed a formatting thing * reformatted everything with black * Revert "revert formatted with black" This reverts commit 82c82e9. 
* added environment file for monodomain project, started to add stuff in workflow * added first test * added package tqdm to monodomain environment * added new TestODE using DCT_vectors instead of myfloat, moved phi_eval_lists from MonodomainODE to the sweeper * deleted old TestODE and myfloat stuff * renamed TestODEnew to TestODE * cleaned a bit * added stability, convergence and iterations tests. Changed a bit other scripts as needed * reactivated other tests in workflow * removed my tests temporarly * added monodomain marker to project pyproject.toml * changed files and function names for tests * fixed convergence test * made one test a bit shorter * added test for SDC on HH and fixed missing feature in SDC imex sweeper for monodomain * reformatted with correct black options * fixed a lint error * another lint error * adding tests with plot * modified convergence test * added test iterations in parallel * removed plot from tests * added plots without writing to file * added write to file * simplified plot * new plot * fixed plot in iterations parallel * added back all tests and plots * cleaned a bit * added README * fixed readme * modified comments in controllers * try to compute phi every step * removed my controllers, check u changed before comuting phis * enabled postprocessing in pipeline * added comments to data_type classes, removed unnecessary methods * added comments to hooks * added comments to the problem classes * added comments to the run scripts * added comments to sweepers and transfer classes * fixed the readme * decommented if in pipeline * removed recv_mprobe option * changed back some stuff outiside of monodomain project * same * again * fixed Thomas hints * removed old unneeded move coverage folders * fixed previously missed Thomas comments * begin change datatype * changed run_Monodomain * added prints * fixed prints * mod print * mod print * mod print * mod print * rading init val * rading init val * removed prints * removed prints * 
checking longer time * checking longer time * fixed call phi eval * trying 2D * trying 2D * new_data type passing tests * removed coverage folders * optmized phi eval lists * before changing phi type * changed eval phi lists * polished a bit * before switch indeces * reformatted phi computaiton to its traspose * before changing Q * optimized integral of exp terms * changed interfate to c++ code * moved definition of dtype u f * tests passed after code refactoring * Generic MPI FFT class (Parallel-in-Time#408) * Added generic MPIFFT problem class * Fixes * Generalized to `xp` in preparation for GPUs * Fixes * Ported Allen-Cahn to generic MPI FFT implementation * Ported Gray-Scott to generic MPI FFT (Parallel-in-Time#412) * Ported Gray-Scott to generic MPI FFT class * `np` -> `xp` * Reverted poor changes * Update README.md (Parallel-in-Time#413) Added the ExaOcean grant identified and the "Supported by the European Union - NextGenerationEU." clause that they would like us to display. * TIME-X Test Hackathon @ TUD: Test for `SwitchEstimator` (Parallel-in-Time#404) * Added piecewise linear interpolation to SwitchEstimator * Started with test for SwitchEstimator [WIP] * Test to proof sum_restarts when event occuring at boundary * Started with test to check adapt_interpolation_info [WIP] * Added test for SE.adapt_interpolation_info() * Update linear interpolation + logging + changing tolerances * Test for linear interpolation + update of other test * Correction for finite difference + adaption tolerance * Added test for DAE case for SE * Choice of FD seems to be important for performance of SE * Removed attributes from dummy probs (since the parent classes have it) * Test for dummy problems + using functions from battery_model.py * Moved standard params for test to function * Updated hardcoded solutions for battery models * Updated hardcoded solutions for DiscontinuousTestODE * Updated docu in SE for FDs * Lagrange Interpolation works better with baclward FD and 
alpha=0.9 * Added test for state function + global error * Updated LogEvent hooks * Updated hardcoded solutions again * Adapted test_problems.py * Minor changes * Updated tests * Speed-up test for buck converter * Black.. * Use msg about convergence info in Newton in SE * Moved dummy problem to file * Speed up loop using mask * Removed loop * SDC-DAE sweeper for semi-explicit DAEs (Parallel-in-Time#414) * Added SI-SDC-DAE sweeper * Starte with test for SemiImplicitDAE * Test for SI-SDC sweeper * Clean-up * Removed parameter from function * Removed test + changed range of loop in SI-sweeper
---------
Co-authored-by: Robert Speck
Co-authored-by: Thomas Baumann
Co-authored-by: Thomas
Co-authored-by: Ikrom Akramov
Co-authored-by: Thibaut Lunet
Co-authored-by: Lisa Wimmer
Co-authored-by: Giacomo Rosilho de Souza
Co-authored-by: Daniel Ruprecht

commit 24cdf05 Author: Jakob Fritz Date: Wed Apr 24 09:11:38 2024 +0200
    Split installation and running into two jobs
    As one of the two jobs often failed during installation, while the other one succeeded. So it might be a race condition. Therefore, splitting installation and usage into separate jobs
commit 488e7a4 Author: Jakob Fritz Date: Wed Apr 24 08:34:59 2024 +0200
    ci_pipeline.yml now more similar to upstream
commit 9ab9b63 Author: Jakob Fritz Date: Tue Apr 23 15:36:39 2024 +0200
    Reduced diff to master
commit cdb77d4 Author: jakob-fritz Date: Mon Apr 22 16:46:44 2024 +0200
    WIP: Use HPC for CI (Parallel-in-Time#386)
    Works on Parallel-in-Time#415
    Added sync with Gitlab, now also for pull requests
    ---------
    Co-authored-by: Robert Speck and Thomas Baumann
commit fb4b745 Author: Jakob Fritz Date: Mon Apr 22 16:02:49 2024 +0200
    Moved development of action into main branch and added version-tag
commit 7de7187 Author: Jakob Fritz Date: Mon Apr 22 11:19:53 2024 +0200
    Added triggers for workflows again
commit 5f45785 Author: Jakob Fritz Date: Thu Apr 18 14:22:13 2024 +0900
    Updated name of step, as merge is not ff-only anymore
commit 2e9930f Author: Jakob Fritz Date: Wed Apr 17 13:36:44 2024 +0900
    Wrong syntax for if else
commit e33a611 Author: Jakob Fritz Date: Wed Apr 17 13:33:40 2024 +0900
    Unshallow repo if needed
commit c7db47a Author: Jakob Fritz Date: Wed Apr 17 13:18:26 2024 +0900
    Add name and email for merge-commit
commit f34de9c Author: Jakob Fritz Date: Wed Apr 17 12:09:20 2024 +0900
    Also allow non-fast-forward merges
commit 82d9233 Author: Jakob Fritz Date: Wed Apr 17 11:49:56 2024 +0900
    Don't run mirror on push now (as gitlab-file is incorrect in this branch)
commit d1b7250 Author: Jakob Fritz Date: Wed Apr 17 11:49:02 2024 +0900
    Make unshallow before merging to properly compare history
commit 9cfeea3 Author: Jakob Fritz Date: Wed Apr 17 11:17:48 2024 +0900
    Changed way to use variables (set locally and later in github_env)
commit 6961ef3 Author: Jakob Fritz Date: Tue Apr 16 16:31:51 2024 +0900
    Reverted and changed way to store variable
commit d906604 Author: Jakob Fritz Date: Tue Apr 16 16:21:54 2024 +0900
    Redone storing of var again
commit faec097 Author: Jakob Fritz Date: Tue Apr 16 14:50:56 2024 +0900
    Corrected querying of a variable
commit cbf0b5d Author: Jakob Fritz Date: Tue Apr 16 13:57:31 2024 +0900
    Added more reporting for better debugging
commit efdaa05 Author: Jakob Fritz Date: Tue Apr 16 13:23:08 2024 +0900
    Don't run main CI during development
commit ccd646a Author: Jakob Fritz Date: Tue Apr 16 13:22:44 2024 +0900
    First fetch, to be able to checkout branch
commit 2712998 Author: Jakob Fritz Date: Tue Apr 16 12:15:13 2024 +0900
    Don't rerun CI on every push during this development
commit 8a316e2 Author: Jakob Fritz Date: Tue Apr 16 12:14:31 2024 +0900
    Moved the check of condition from shell to yaml
commit d347bd3 Author: Jakob Fritz Date: Tue Apr 16 11:52:02 2024 +0900
    Try to merge code (from PR) first
    So that merged state is tested in Gitlab-CI
commit bcd64a5 Author: Jakob Fritz Date: Mon Feb 5 11:25:43 2024 +0100
    Use specific version of github2lab action
commit 28472dc Author: Jakob Fritz Date: Mon Jan 29 15:10:38 2024 +0100
    Uses newer checkout-action to use new node-version (20)
    Version 16 is deprecated
commit fefe88b Author: Jakob Fritz Date: Mon Jan 29 14:53:18 2024 +0100
    Minor formatting updates in README to trigger CI
commit 3de1b56 Author: Jakob Fritz Date: Fri Jan 26 16:13:53 2024 +0100
    Formatted md-file to trigger CI
commit ef6a866 Author: Jakob Fritz Date: Thu Jan 18 15:48:25 2024 +0100
    Set sha for checkout properly
commit be3aef7 Author: Jakob Fritz Date: Mon Jan 15 16:14:41 2024 +0100
    Using default shallow checkout
    Otherwise, other own action complains
commit f38f0e5 Author: Jakob Fritz Date: Mon Jan 15 16:11:05 2024 +0100
    Updated ref to use lastest code from PR; not merge
    Previously, a version of the code was used that was how a merge could look like. Now, the code is used as it is in the PR
commit 249741b Author: Jakob Fritz Date: Mon Jan 15 08:47:45 2024 +0100
    Updated workflow for mirroring
commit d8604b7 Author: Jakob Fritz Date: Mon Jan 8 16:39:49 2024 +0100
    Try exapnding the predefined variable
commit c49accd Author: Jakob Fritz Date: Mon Jan 8 16:37:10 2024 +0100
    Another attempt to get the action to work
commit 5e0118a Author: Jakob Fritz Date: Mon Jan 8 16:28:51 2024 +0100
    Hopefully now, variable is expanded instead using the name
commit 832e7e5 Author: Jakob Fritz Date: Mon Jan 8 16:07:29 2024 +0100
    Exit instead of return needed
    Because exiting the shell instead of a function
commit 5a5de4a Author: Jakob Fritz Date: Mon Jan 8 12:00:55 2024 +0100
    First version of CI to mirror pull_requests to Gitlab
    If someone with write-permissions triggered the workflow