AOCL 2.2 changes - Majorly include LAPACK 3.9.0 support #36

Open
wants to merge 1,204 commits into
base: master


rsanagap

@rsanagap rsanagap commented Jul 5, 2020

Field,

Please review and merge them.

Thanks

pradeeptrgit and others added 30 commits July 3, 2023 23:43
Most python scripts use EOL python2. This commit
updates the scripts to use python3

Change-Id: Iab35f8e21ac1aad918c796560ad3f7b5d826e893
When the user provides a path to an external OpenMP library, FLA_MULTITHREADING_MODEL wasn't being set on Linux and Windows

Change-Id: Idf5684d2cad065d0a6368c8f76c1ca964d9b0076
- Enabled CTest to run aoclflaprogress, the legacy test and the main test suite with a single command
- Added negative test cases.
- test/main/scripts/run_negative_test_cases.py has been removed
- Added an example showing how to add more such tests in the future
- Updated BUILD.md

Signed-off-by: bsinghpa<[email protected]>
AMD-Internal: CPUPL-3467
Change-Id: Ic975b4982abc72267a8c5d2a717f151f0b47d29a
{c,z}hegst and {c,z}hegs2 take the LAPACK path when the AMD optimised
path is enabled.

Signed-off-by: Jintu Das <[email protected]>
Change-Id: I45a918544187687cc0b8f18a2e955e8a4dca5f70
- If leading dimensions = -1, set them to the least valid value
unless the inputs come from the command line
- Removed invalid-input filtering in the test suite so that invalid inputs
pass through to the LAPACK API, which reports an invalid argument as the result
- Added negative test cases for all the leading dimensions based
on einfo
- Modified the einfo logic to return the test result as "FAIL" when einfo
is not provided along with invalid inputs
- Moved all the existing static global variables/file pointers in
all the files to lapack.c

Signed-off-by: dnikku <[email protected]>
AMD-Internal: CPUPL-3076
Change-Id: If09494376d23878768c2369e19767d8a909d23b9
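The leading-dimension handling described in this commit can be sketched in a few lines. This is a minimal Python illustration of the logic only, not the test suite's C code: the function names are hypothetical, and the rule `max(1, rows)` is the standard LAPACK requirement for a column-major leading dimension. The einfo check reflects one reading of the commit message (an invalid input passes only when the expected info code was supplied and matches).

```python
def least_valid_ld(rows: int) -> int:
    """Smallest leading dimension LAPACK accepts for a column-major
    matrix with `rows` rows."""
    return max(1, rows)


def resolve_ld(ld: int, rows: int, from_command_line: bool) -> int:
    """Replace the -1 placeholder with the least valid value, unless the
    value came from the command line. In that case it is passed through
    unchanged so the LAPACK routine itself can reject it."""
    if ld == -1 and not from_command_line:
        return least_valid_ld(rows)
    return ld


def negative_test_result(info: int, expected_info) -> str:
    """einfo-style verdict (assumed semantics): FAIL when no expected
    info code was given, otherwise PASS only on an exact match."""
    if expected_info is None:
        return "FAIL"
    return "PASS" if info == expected_info else "FAIL"


if __name__ == "__main__":
    print(resolve_ld(-1, 10, from_command_line=False))
    print(negative_test_result(info=-4, expected_info=-4))
```

Passing invalid values through untouched is what lets the negative tests exercise the argument checking inside the LAPACK routine rather than inside the test harness.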
By default, an external project's sources are updated whenever
CMake is rerun. We want to skip this update step for the AOCLUtils
library. It's sufficient to clone once based on the release tag and
use it.

Change-Id: I9f160fb413a2c55662a247cf8999c05ebc8c003d
Enabled OMP code for the following APIs -
	- {c,z}hetrd_hb2st
	- {s,d}sytrd_sb2st
	- iparam2stage.

Signed-off-by: Jintu Das <[email protected]>
AMD-Internal: CPUPL-3232
Change-Id: Iaff2cad7f6df86653f05f91025d75497831d95cd
   1. Implemented AVX512 code for 8 parallel inner-loop and outer-loop iterations at a time.
   2. Wrote a separate function, fla_lu_piv_small_d_update_tr_matrix_avx2, that uses AVX2 intrinsics to calculate scale factors and update the trailing matrix.
   3. Created separate threshold macros for AVX2 and AVX512: FLA_DGETRF_SMALL_AVX2_THRESH0 and FLA_DGETRF_SMALL_AVX512_THRESH0.
   4. Added configure-time AOCL FAMILY identification through a Python script, enabling the compiler flag accordingly.
   5. Created an object library for AVX512-specific code and enabled the AVX512 compiler flag.

Change-Id: I05de51326c4ec2ca5703c6f7e9cc965e7bc4f18f
1. Use a separate TLS structure tl_context to track number of threads.
   The global context is initialized and updated with libFLAME-specific
   threading, and tl_context is updated from this if needed. Otherwise
   tl_context is updated on every call from OpenMP runtime information.
   Hardware ISA information remains in global_context.

2. Check OpenMP active level against max active levels when setting
   number of threads for starting a new parallel region to ensure the
   correct number of threads is used when BLIS is called within nested
   OpenMP parallelism, or if the user changes the parallelism via
   OpenMP function calls.

3. To reduce redundant code, FLASH_get_num_threads() has been adapted
   to call fla_thread_get_num_threads()

AMD-Internal: [CPUPL-3558]
Change-Id: Idc5eb604fb42ab73cbc99148afeb7a870c98fc1a
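Point 2 of this commit, checking the OpenMP active level against the maximum number of active levels, can be illustrated with a small decision function. This is a hedged Python sketch of the rule only, not the libFLAME implementation: the function name and parameters are hypothetical, and the inputs stand in for the values an OpenMP runtime would report through `omp_get_active_level()` and `omp_get_max_active_levels()`.

```python
def threads_for_new_region(active_level: int,
                           max_active_levels: int,
                           requested_threads: int) -> int:
    """Pick the thread count for a new parallel region.

    Once the number of enclosing active parallel regions has reached the
    allowed maximum, a further nested region runs with a single thread,
    so requesting more would only oversubscribe the machine.
    """
    if active_level >= max_active_levels:
        return 1
    return max(1, requested_threads)


if __name__ == "__main__":
    # Called from serial code with nesting depth 1 allowed.
    print(threads_for_new_region(0, 1, 8))
    # Called from inside an already active parallel region.
    print(threads_for_new_region(1, 1, 8))
```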
Change-Id: I6441fd07b06b3907775509d8386d4451c42f4179
Previous Design:
- AOCL FLA progress supports only single-threaded applications.

New Design:
- AOCL FLA progress now supports multithreaded applications.

Fixed the compiler warning for aocl_fla_progress.

README is updated with details on testing the feature.

Signed-off-by: Parag <[email protected]>
AMD-Internal: CPUPL-3753
Change-Id: I09468c3201408316f415bc41b99350fa5b2b6c23
The fPIC flag was missing with GCC, which was causing compilation issues.
Added the fPIC flag.
Removed unwanted ctest print statements that were printed during configuration.

Signed-off-by: bsinghpa<[email protected]>
Change-Id: Ie66b16aa2fbb9b89089f6c8d92df86e02d4b6f04
Fixes compiler warnings related to ilp64 build.

Signed-off-by: Jintu Das <[email protected]>
Change-Id: I04fdfa71d21a960e4cb666ef65f4fa4e609c7b62
- Added Code Coverage feature for Libflame
- Added Code Coverage Flags to netlib test
- Added generate_code_coverage_html.sh to view the code coverage in HTML format
- Added CTEST for netlib-test gcc and aocc
- Updated BUILD.md

Signed-off-by: bsinghpa<[email protected]>
AMD-Internal: CPUPL-3685
Change-Id: Ib36cb999466521db442f3a5d116ae68533d554bc
Get the number of threads from the context and remove the
unnecessary OMP taskwait from {c,z}hetrd_hb2st.c and
{s,d}sytrd_sb2st.c

Signed-off-by: Jintu Das <[email protected]>
AMD-Internal: CPUPL-3232
Change-Id: I85b9c417a955c1ef9823694c2b192919779df03b
Modified {s,c,d,z}trsyl3 APIs to give consistent output in both
LP64 and ILP64 builds.

Signed-off-by: Jintu Das <[email protected]>
AMD-Internal: CPUPL-3712
Change-Id: I660ab9284e27d7487ad55773da2168ecca290314
To enable closer integration with AOCL-BLAS, a new configure
option, "ENABLE_AOCL_BLAS", is created to link with AOCL-BLAS at
build time. The location of AOCL-BLAS is to be set by the user, either
through an environment variable or the CMake option "AOCL_ROOT". This
location must have an "include" directory and a "lib" directory that
contain the necessary header files and the AOCL-BLAS binary respectively.
Currently, this feature is available for builds through CMake only. The
build documentation is updated with these details in BUILD.md.

AMD-Internal: CPUPL-3826

Change-Id: I5de5e8ac1491c9c2f8da6a1ff39a25b326357361
DORG2R functionality changed from the FLAME implementation
to F2C-converted Netlib reference code. This is done for
compatibility with the latest optimized DGEQRF, which follows
Netlib's methodology.

Signed-off-by: Vasanthakumar R <[email protected]>
AMD-Internal: SWLCSG-2364
Change-Id: I79ccd17404ef474967bc36e812caaa8625b72074
Add hidden Fortran string length argument to xerbla function and
calls. Netlib LAPACK custom test versions of xerbla no longer
need to have string equality test disabled.

AMD-Internal: [CPUPL-3013]
Change-Id: I8b3cabfe4fc72d329733bc2e19660fd89dab25f7
Added min, max and ladiv replacements.

Signed-off-by: Jintu Das <[email protected]>
AMD-Internal: CPUPL-3848
Change-Id: I498990414e4f4b650cac037fabab0a456a7d9600
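For context on why a dedicated ladiv routine exists at all: LAPACK's ladiv performs complex division in real arithmetic while avoiding unnecessary overflow. The Python sketch below contrasts the textbook formula with Smith's method. It is an illustration under stated assumptions, not the replacement code from this commit, and not the exact algorithm used by reference LAPACK (which is a more elaborate robust variant). Both function names are hypothetical.

```python
def naive_div(a, b, c, d):
    """(a + ib) / (c + id) by the textbook formula.
    The term c*c + d*d overflows for large |c| or |d|."""
    den = c * c + d * d
    return (a * c + b * d) / den, (b * c - a * d) / den


def smith_div(a, b, c, d):
    """(a + ib) / (c + id) by Smith's method: divide through by the
    larger of |c| and |d| first, so intermediate values stay in range."""
    if abs(c) >= abs(d):
        r = d / c
        den = c + d * r
        return (a + b * r) / den, (b - a * r) / den
    r = c / d
    den = d + c * r
    return (a * r + b) / den, (b * r - a) / den


if __name__ == "__main__":
    big = 1e200
    print(naive_div(big, big, big, big))   # intermediates overflow
    print(smith_div(big, big, big, big))   # exact quotient is 1 + 0i
```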
This commit fixes function pointer and un-used variable related
warnings.

Signed-off-by: Jintu Das <[email protected]>
AMD-Internal: CPUPL-3651
Change-Id: Id1aaad558f7f333590a9e22e075525e7cb0d0b24
A new script has been added that prints the failed routines
that were not caught by the netlib LAPACK test suite script.
This script also prints the total number of failed tests, as well
as the number of info errors and illegal value failures.

Signed-off-by: Jintu Das <[email protected]>
Change-Id: I7ab195600ac29087bfa33e36a5924ed8acae93a0
Added AVX context checking for following list of APIs:
- fla_dhrot3_avx2
- fla_drot_avx2
- fla_zrot_avx2
- fla_dgeqrf_small_avx2
- fla_sscal_ix1_avx2
- fla_sger_avx2

Signed-off-by: Jintu Das <[email protected]>
Change-Id: I59a7fd266cdbc6e76bdf74a4648dcb97142dc233
Inlined different functionalities of DGESVD. Used AVX2 DGEQRF code.
Further optimization done for N >> M and small N.
Added include paths for local header files in the CMake build system.
Resolved warnings in AVX2 files.

Signed-off-by: Vasanthakumar R <[email protected]>
AMD-Internal: CPUPL-3251
Change-Id: I47fb4eae8f830dbffa5f42473b45414043b32016
- added ctest for main_test micro/medium/short/long
- Updated BUILD.md for installing libflame library

Signed-off-by: bsinghpa<[email protected]>
AMD-Internal: CPUPL-3883
Change-Id: Iecc690096d064e9598dda34e6186481fe03b5710
Exclusion of x86-opt header files from FLAME.h: their inclusion was
causing errors while linking applications. The fix was to remove these
inclusions in FLA_lapack_var_prototypes.h and make related changes.

Signed-off-by: Vasanthakumar R <[email protected]>
Change-Id: Ie0a8a1b63ed1d8991d60a3a4fc54268c970cc910
Added support for testing auxiliary APIs.
Added ROT and LARTG API test code in the libflame main test suite.

Signed-off-by: Parag <[email protected]>
AMD-Internal: CPUPL-3759
Change-Id: Id0ecc92e297dae603b94952596ab7dc0a1cadfc3
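A test for LARTG and ROT typically checks the defining property of a Givens rotation: applying the generated (c, s) to the pair (f, g) must yield (r, 0). The Python sketch below shows that check. It is a simplified illustration, not the test code added here: `simple_lartg` ignores the scaling safeguards and sign conventions of the real LAPACK routine, and all names are hypothetical.

```python
import math


def simple_lartg(f, g):
    """Return (c, s, r) with c*f + s*g = r and -s*f + c*g = 0.
    Simplified: no overflow/underflow scaling as in LAPACK."""
    if g == 0.0:
        return 1.0, 0.0, f
    if f == 0.0:
        return 0.0, 1.0, g
    r = math.hypot(f, g)
    return f / r, g / r, r


def rot(x, y, c, s):
    """Apply a plane rotation to vectors x and y (as in BLAS ROT)."""
    new_x = [c * xi + s * yi for xi, yi in zip(x, y)]
    new_y = [c * yi - s * xi for xi, yi in zip(x, y)]
    return new_x, new_y


def check_lartg(f, g, tol=1e-12):
    """Validate that the rotation maps (f, g) to (r, 0) and is orthogonal."""
    c, s, r = simple_lartg(f, g)
    (rf,), (rg,) = rot([f], [g], c, s)
    scale = max(1.0, abs(r))
    return (abs(rf - r) <= tol * scale
            and abs(rg) <= tol * scale
            and abs(c * c + s * s - 1.0) <= tol)


if __name__ == "__main__":
    print(check_lartg(3.0, 4.0))
```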
Added new test API to verify LAPACK ORG2R API functionality

AMD-Internal: CPUPL-3861
Signed-off-by: dnikku <[email protected]>
Change-Id: I42f666fcfed1eac28fd0be6af534efdf9f5b1c21
Set the input.global.operations files to default values

AMD-Internal: CPUPL-3861
Signed-off-by: dnikku <[email protected]>
Change-Id: I2a73956d1ab8afc35f3857dce996c2e3103bd7e4
ksaithar and others added 30 commits August 5, 2024 00:10
…ues of GESV API

details: Added Overflow and Underflow test cases
AMD-Internal: [CPUPL-4738]
Signed-off-by: ksaithar <[email protected]>
Change-Id: I583f97a93e22235b32af92a1bedd7998eb1f1e22
…ues of GEEVX API

details: Added Overflow and Underflow test cases
AMD-Internal: [CPUPL-5384]
Signed-off-by: ksaithar <[email protected]>
Change-Id: I461de0d8ba4595b98d5f1afba52b48625bddbe96
1. Added matrix initialization and scaling for overflow/underflow tests.
2. Added test cases through ctest.

AMD-Internal: [CPUPL-4762]
Signed-off-by: sujithhp <[email protected]>
Change-Id: Iff4b0e4d292baedd721ec40813c890fbcced4a8e
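The idea behind matrix scaling for overflow/underflow tests can be sketched as follows: rescale a well-conditioned matrix so that its largest entry sits close to the edge of the floating-point range. This is a Python illustration only. The function name and the two target constants are assumptions chosen for the example, and the thresholds the test suite actually uses are not stated in the commit message.

```python
import sys


def scale_to_magnitude(matrix, target):
    """Return a copy of `matrix` rescaled so that its largest absolute
    entry equals `target`. A zero matrix is returned unchanged."""
    largest = max(abs(v) for row in matrix for v in row)
    if largest == 0.0:
        return [row[:] for row in matrix]
    factor = target / largest
    return [[v * factor for v in row] for row in matrix]


# Assumed targets: half the largest finite double, and twice the
# smallest normalized double.
NEAR_OVERFLOW = sys.float_info.max / 2.0
NEAR_UNDERFLOW = sys.float_info.min * 2.0

if __name__ == "__main__":
    a = [[1.0, -2.0], [0.5, 4.0]]
    print(scale_to_magnitude(a, NEAR_OVERFLOW))
    print(scale_to_magnitude(a, NEAR_UNDERFLOW))
```

A routine that squares or sums such entries without internal scaling will overflow or lose the small entries, which is what these test cases are meant to expose.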
…t suite.

Removed AOCL FLA Progress test suite as GBTRF, GBTRS are
implemented in main test suite.

Change-Id: I23b2faab80d93105beab7f1e3c1b69eda8b6a736
Added default path for the following functions:
	1. fla_drot
	2. fla_sger
	3. fla_dhrot3

Signed-off-by: Sridhar Govindaswamy <[email protected]>
Change-Id: Iff7a41a6b229a8cf1f1c30e1dc72898fad4151e3
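A default path matters because the AVX2 kernels can only run on hardware that supports them, so a caller needs a portable fallback when the context check fails. The Python sketch below shows the dispatch pattern using a rank-1 update (the operation GER performs) as the example. It illustrates the pattern only: the function names and the `cpu_has_avx2` flag are hypothetical, and the "AVX2" kernel is a stand-in that reuses the reference arithmetic.

```python
def ger_reference(alpha, x, y, a):
    """Rank-1 update A := A + alpha * x * y^T on a list-of-rows matrix.
    Portable default path."""
    return [[a[i][j] + alpha * x[i] * y[j] for j in range(len(y))]
            for i in range(len(x))]


def ger_avx2_stand_in(alpha, x, y, a):
    """Stand-in for a vectorized kernel. A real one would use SIMD
    intrinsics; here it simply reuses the reference arithmetic."""
    return ger_reference(alpha, x, y, a)


def ger(alpha, x, y, a, cpu_has_avx2):
    """Dispatch on the (hypothetical) context flag, falling back to the
    default path when AVX2 is not available."""
    kernel = ger_avx2_stand_in if cpu_has_avx2 else ger_reference
    return kernel(alpha, x, y, a)


if __name__ == "__main__":
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    print(ger(2.0, [1.0, 2.0], [3.0, 4.0], zeros, cpu_has_avx2=False))
```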
Testing GESDD API by generating the input matrix from known
singular values and validating the input and output
singular values

AMD-Internal: CPUPL-5333
Signed-off-by: dnikku <[email protected]>
Change-Id: Ie678f6974def0b12d4981a481e8fffdca4bc8bb0
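The validation strategy described here, building an input from known singular values and comparing them with the computed ones, can be sketched for the 2x2 case in plain Python. This is an illustration under stated assumptions, not the test suite's code: the input is formed as A = U * diag(s1, s2) * V^T with U and V plane rotations, and a closed-form 2x2 formula stands in for the GESDD call. All function names are hypothetical.

```python
import math


def rotation(theta):
    """2x2 orthogonal rotation matrix."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def matrix_from_singular_values(s1, s2, theta_u, theta_v):
    """A = U * diag(s1, s2) * V^T, so A has singular values s1 and s2."""
    sigma = [[s1, 0.0], [0.0, s2]]
    return matmul(matmul(rotation(theta_u), sigma),
                  transpose(rotation(theta_v)))


def singular_values_2x2(a):
    """Stand-in for the SVD routine: uses s1^2 + s2^2 = ||A||_F^2 and
    s1 * s2 = |det A| to recover the singular values in closed form."""
    fro2 = sum(v * v for row in a for v in row)
    det = abs(a[0][0] * a[1][1] - a[0][1] * a[1][0])
    disc = math.sqrt(max(0.0, fro2 * fro2 - 4.0 * det * det))
    return (math.sqrt((fro2 + disc) / 2.0),
            math.sqrt(max(0.0, (fro2 - disc) / 2.0)))


if __name__ == "__main__":
    a = matrix_from_singular_values(3.0, 1.0, 0.3, 1.1)
    print(singular_values_2x2(a))
```

Because the expected singular values are known by construction, the check does not depend on a second SVD implementation as a reference.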
Fixed the scaling factor for svd input matrix

AMD-Internal: CPUPL-5352
Signed-off-by: dnikku <[email protected]>
Change-Id: I0a7713c7f14d9c5a075603b45023d616f4a75144
…e path

AMD Internal : [CPUPL-5549]

Change-Id: I1597cfb4a10afdb779c2845638fec63a9ba78e6e
AMD Internal : [CPUPL-5549]

Change-Id: Iae6972dde9f9382fa792de860bfb672061b8db90
Enabled validation test-2 and test-3 for jobz = s

Signed-off-by: venkatesha <[email protected]>
Change-Id: I11fc29086a75b071e5fbd37337762a32da7198fb
AMD Internal : [CPUPL-5549]

Change-Id: I78a99fd7ec2f681e0420f86080c2f244d643a46c
Added LAPACKE interface testing support in the main test suite
for 4.2 APIs

AMD-Internal: CPUPL-4258
Signed-off-by: dnikku <[email protected]>
Change-Id: I6acbd421bde95044e498d150625606d23ca8acfc
…indows

AMD Internal : [CPUPL-5576]

Change-Id: If66ad8edbe3dd0b0df06f54f29f29c9a0ee4119f
Added LAPACKE prototypes and removed lapacke.h inclusion to fix make build
errors.
[Todo: Include lapacke header file and remove duplicate prototypes
 in main testsuite]

Signed-off-by: dnikku <[email protected]>
Change-Id: Idc923a4aa8618cab17f772910cdd826a7aa3532b
1. Added matrix initialization and scaling for overflow/underflow tests.
2. Added test cases through ctest.

AMD-Internal: [CPUPL-4761]
Signed-off-by: sujithhp <[email protected]>
Change-Id: I15dec63e003758fd3d29a24db6c73bdd6876ffff
details: Range for Eigen value selection extended to fix overflow failure
AMD-Internal: [CPUPL-5610]
Signed-off-by: ksaithar <[email protected]>
Change-Id: I330cb966652f051edaedbe911cc3a2315c4f66ae
…ss AOCL libraries

AMD Internal : [CPUPL-5607]

Change-Id: Id417a215a6fceb08594ca65c8364d7ee197db352
A few of the C++ template functions were defined twice in the libflame_interface
header file. The duplicate functions are removed.

AMD-Internal: CPUPL-5580
Change-Id: Ia3253cb0eca46aa8371fb0c5d9d2e6b526042a71
Fixed jobz=O validation error
Enabled jobz=O tests from config file for gesdd

Signed-off-by: Venkatesha <[email protected]>
AMD-Internal: [CPUPL-5557]
Change-Id: Ifd89981a4afd7f5fec9462c6d3c1e50bc0e46b56
Code change to fix an issue in input initialization for
gtsv testing.
File input reading changed to read the RHS matrix.

Signed-off-by: Vasanthakumar R <[email protected]>
AMD-Internal: CPUPL-5568
Change-Id: Id191f9e618ae50d5a8ca93cc3bbbb7b4f7701738
Fixed issue with python3 executable in windows.

Signed-off-by: Venkatesha <[email protected]>
AMD-Internal: [CPUPL-5638]
Change-Id: Ifb4386b3f0a82d9e5bd8f26ee7730c8ce4093077
- pkg-config metadata file generation on installation through cmake
- file generated at ${CMAKE_INSTALL_PREFIX}/share/pkgconfig
- Formatting fixes for BUILD.md with additional information about pkg-config

Signed-off-by: samahmad <[email protected]>
AMD-Internal: CPUPL-5624
Change-Id: I0d8728d9b91c2828da68559dee57876db424e3ca
Updated formulae format in doxygen comments

Signed-off-by: Venkatesha <[email protected]>
AMD-Internal: [CPUPL-5638]
Change-Id: Id01ab1993250704391b79e7305ce7528ea038ece
Changed thresholds for EV range selection used for
random matrix generation

Signed-off-by: Vasanth R <[email protected]>
AMD-Internal: [CPUPL-5693]

Change-Id: Ic71d4007aaf6d6132f4f4644d95306f1a8119930
Changed the lartg function declaration in the C++ header file so that the
first 2 parameters are pointers instead of values, matching the equivalent
LAPACK API.

AMD-Internal: CPUPL-5748
Change-Id: I4271792782b1a6788b3e1604679ead16b8c78e8c
- Removed quotes on paths in pkg-config template file

Signed-off-by: samahmad <[email protected]>
AMD-Internal: CPUPL-5624
Change-Id: I724f62b02ea2f52150c0313dc93c10fadf66eb49
Version of AOCL-LAPACK upgraded to 5.0.0 from 4.2.0
Signed-off-by: ksaithar <[email protected]>
Change-Id: If537ee27be337a69c56608a88a1f42e7031719de
Change-Id: I65b53cf57fb82528c7f581c50362b468c5aec00f
Change-Id: I74067c461c68cff55c6b5f02a9080e3df9352b9e
Change-Id: I5259e9b457dbdd10f6c447db70e9bdb1d1bdf4aa
6 participants