Releases: RRZE-HPC/likwid
likwid-5.0.2
Changelog for 5.0.2:
- Fix memory leak in calc_metric()
- New peakflops benchmarks in likwid-bench
- Fix for proper NUMA domain handling
- Improvements for perf_event backend
- Fix for perfctr and powermeter with perf_event backend
- Fix for likwid-mpirun for SLURM with cpusets
- Fix for likwid-setFrequencies in cpusets
- Update for POWER9 event list
- Updates for AMD Zen, Zen+ and Zen2 (events, groups)
- Fix for Intel Uncore events with same name for different devices
- Fix for file descriptor handling
- Fix for compilation with GCC10
- Remove sleep timer warning
- Update examples C-markerAPI and C-internalMarkerAPI
Note: If you want to use LIKWID 5.0.2 with Lua 5.1, please apply this patch
likwid-5.0.1
I'm happy to announce a new bugfix release of LIKWID 5.
- Some fixes for likwid-mpirun
- Fix for hybrid pinning with multiple hosts
- Fix for perf.groups without core-local events (switch to likwid-pin)
- Fix for command line parser
- Fix for mpiopts parameter
- Add UPMC as Uncore counter to splitUncoreEvents()
- Expand user-given input to abspath if possible
- Check for at least one executable in user-given command
- Add skip mask for SLURM + Intel OpenMP
- Check if user-given MPI type is available
- Fix for perf_event backend when used as root
- Include likwid-marker.h in likwid.h to not break old MarkerAPI code
- Enable build with ARM HPC compiler (ARMCLANG compiler setting)
- Fix creation of likwid-bench benchmarks on POWER platforms
- Fix for build system with NVIDIA_INTERFACE=true and BUILD_APPDAEMON=true
- Update for executable tester
- Update for MPI+X test (X: OpenMP or Pthreads)
Merry Christmas
likwid-5.0.0
New version LIKWID 5.0.0
Changelog:
- Support for ARM architectures. Special support for Marvell Thunder X2
- Support for IBM POWER architectures. Support for POWER8 and POWER9.
- Support for AMD Zen2 microarchitecture.
- Support for data fabric counters of AMD Zen microarchitecture
- Support for Nvidia GPU monitoring (with NvMarkerAPI)
- New clock frequency backend (with less overhead)
- Generation of benchmarks for likwid-bench on-the-fly from ptt files
- Switch back to C-based metric calculator (less overhead)
- Interface functions for performance groups, so you can create your own.
- Integration of GOTCHA for hooking into client application at runtime
- Thread-local initialization of streams for likwid-bench
- Enhanced support for SLURM with likwid-mpirun
- New MPI and Hybrid pinning features for likwid-mpirun
- Interface to enable the membind kernel memory policy
- JSON output filter file (use -o output.json)
- Update of internal HWLOC to 2.1.0
Note: The MarkerAPI Macros have been moved to a separate header "likwid-marker.h"
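The header change noted above mainly affects the include line of instrumented code. Below is a minimal sketch of how a source file might pull in the macros from "likwid-marker.h" while still compiling on systems without LIKWID. The empty fallback macros, the region name "sum", and the functions sum_region() and run_measurement() are assumptions made for illustration; they are not part of LIKWID.

```c
#include <stddef.h>

/* Since LIKWID 5.0.0 the MarkerAPI macros live in likwid-marker.h.
 * If LIKWID_PERFMON is not defined, fall back to empty macros so the
 * file still builds on machines without LIKWID installed. */
#ifdef LIKWID_PERFMON
#include <likwid-marker.h>
#else
#define LIKWID_MARKER_INIT
#define LIKWID_MARKER_THREADINIT
#define LIKWID_MARKER_REGISTER(regionTag)
#define LIKWID_MARKER_START(regionTag)
#define LIKWID_MARKER_STOP(regionTag)
#define LIKWID_MARKER_CLOSE
#endif

/* Hypothetical kernel, for illustration only: sums n doubles inside
 * a marker region named "sum". */
double sum_region(const double *data, size_t n)
{
    double total = 0.0;
    LIKWID_MARKER_START("sum");
    for (size_t i = 0; i < n; i++) {
        total += data[i];
    }
    LIKWID_MARKER_STOP("sum");
    return total;
}

/* Hypothetical driver: initializes the MarkerAPI, runs the kernel
 * once, and closes the MarkerAPI again. */
double run_measurement(const double *data, size_t n)
{
    double result;
    LIKWID_MARKER_INIT;
    result = sum_region(data, n);
    LIKWID_MARKER_CLOSE;
    return result;
}
```

A program would call run_measurement() from its main function. To get actual measurements, the file is typically compiled with -DLIKWID_PERFMON, linked against the LIKWID library, and run under likwid-perfctr with the -m switch.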
likwid-4.3.4
New bugfix release:
- Fix for detecting PCI devices if system can split up LLC and memory channels (Intel CoD or SNC)
- Don't pin accessDaemon to threads to avoid long access latencies due to busy hardware thread
- Fix for calculations in likwid-bench if streams are used for input and output
- Fix for LIKWID_MARKER_REGISTER with perf_event backend
- Support for Intel Atom (Tremont) (nothing new, same as Intel Atom (Goldmont Plus))
- Workaround for topology detection if LLC and memory channels are split up. Kernel does not detect it properly sometimes. (Intel CoD or SNC)
- Minor updates for build system
- Minor updates for documentation
Notice: If you want to compile likwid-4.3.4 with ACCESSMODE=perf_event, please apply the attached patch before compiling.
likwid-4.3.3
- Fixes for likwid-mpirun
- Fixes for events of Intel Skylake SP and Intel Broadwell
- Support for Intel CascadeLake X (only new eventlist, uses code from Intel Skylake SP)
- Fix for bitmask creation in Lua
- Event options for perf_event backend
- New assembly benchmarks in likwid-bench
- MarkerAPI: Function to reset regions
- Some new performance groups (DIVIDE and TMA)
- Fixes for AMD Zen performance groups
- Fix when using topology input file
- Minor bugfixes
likwid-4.3.2
- Fix in internal metric calculator
- Support for Intel Knights Mill (core, rapl, uncore)
- Intel Skylake X: Some fixes for events and perf. groups
- Set KMP_INIT_AT_FORK to bypass bug in Intel OpenMP memory allocator
- AMD Zen: Use RETIRED_INSTRUCTION instead of fixed-purpose counter for metric calculation
- All FLOPS_* groups now have vectorization ratio
- Fix for MarkerAPI with perf_event backend
- Fix for maximal/minimal uncore frequency
- Skip counters that are already in use, don't exit
- likwid-mpirun: minor fix when overloading a host
- Improved detection of PCI devices
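The vectorization ratio added to the FLOPS_* groups is, to my understanding, the percentage of packed (SIMD) floating-point instructions among all counted floating-point instructions. The sketch below shows that arithmetic only. The function name and its two parameters are hypothetical; the real performance groups compute the metric from architecture-specific event counts.

```c
/* Hypothetical helper: share of packed floating-point instructions
 * among all counted floating-point instructions, in percent. */
double vectorization_ratio(double scalar_instr, double packed_instr)
{
    double total = scalar_instr + packed_instr;
    if (total <= 0.0) {
        /* No floating-point instructions counted; avoid dividing by zero. */
        return 0.0;
    }
    return 100.0 * packed_instr / total;
}
```

With 25 scalar and 75 packed instructions the helper returns 75, meaning three quarters of the counted floating-point instructions were vectorized.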
likwid-4.3.1
Minor fixes for Intel Skylake and frequency module
likwid-4.3.0
- Support for Intel Skylake SP architecture (core, uncore, energy)
- Support for AMD Zen architecture (core, l2, energy)
- Support for Intel Goldmont Plus architecture
- Pinning strategy 'balanced'
- New Lua based calculator
- Support for Intel PState CPU frequency daemon
Minor:
- Fixed MCDRAM measurements on Intel Xeon Phi (KNL) with perf_event back end
Merry Christmas
likwid-4.2.1
- Fix for logical selection strings
- likwid-agent: general update
- likwid-mpirun: Improved SLURM support
- likwid-mpirun: Print metrics sorted as they are listed in perf. group
- likwid-perfctr: Print metrics/events as header in timeline mode. Redirect to file when -o switch is used
- likwid-setFrequencies: Command-line options to set min, max and current frequency
- Pinning-Library: Automatically detect and skip shepherd threads
- Intel Broadwell: Added support for E3 (like Desktop), Fix for L3 group
- Intel IvyBridge: Fix for PCU fixed-purpose counters
- Intel Skylake: Fix for events CYCLE_ACTIVITY, new event L2_LINES_OUT
- Intel Xeon Phi (KNL): Fix for overflow register, Update for ENERGY group. OFFCORE_RESPONSE events are now tile-specific
- Intel SandyBridge: Fix for L3CACHE group
- Event/Counter list contains only usable counters and events
- Fix and warning message for static library builds
likwid-4.2.0
- Support for Intel Xeon Phi (Knights Landing): Core, Uncore, RAPL
- Support for Uncore counters of some desktop chips (SandyBridge, IvyBridge, Haswell, Broadwell and Skylake)
- Basic support for Linux perf_event interface instead of native access. Currently only core-local counters work; Uncore is experimental
- Support to build against an existing Lua installation (5.1 - 5.3 tested)
- Support for CPU frequency manipulation, Lua interface updated
- Access module checks for LLNL's msr_safe kernel module
- Support for counter registers that are only available when HyperThreading is off
- Socket measurements can be used for all cores on the socket in metric formulas.
The LIKWID team wishes Merry Christmas to everyone.