AOCL 2.2 changes - Majorly include LAPACK 3.9.0 support #36
Open: rsanagap wants to merge 1,204 commits into flame:master from amd:master
Conversation
Most Python scripts use end-of-life Python 2. This commit updates the scripts to use Python 3. Change-Id: Iab35f8e21ac1aad918c796560ad3f7b5d826e893
When the user provided a path to an external OpenMP library, FLA_MULTITHREADING_MODEL was not being set on Linux and Windows. Change-Id: Idf5684d2cad065d0a6368c8f76c1ca964d9b0076
- Enabled CTest to run aoclflaprogress, the legacy test and the main test suite with a single command
- Added a test for negative test cases
- Removed test/main/scripts/run_negative_test_cases.py
- Added an example showing how to add more such tests in the future
- Updated BUILD.md
Signed-off-by: bsinghpa<[email protected]> AMD-Internal: CPUPL-3467 Change-Id: Ic975b4982abc72267a8c5d2a717f151f0b47d29a
{c,z}hegst and {c,z}hegs2 take the LAPACK path when the AMD-optimized path is enabled. Signed-off-by: Jintu Das <[email protected]> Change-Id: I45a918544187687cc0b8f18a2e955e8a4dca5f70
- If leading dimensions = -1, set them to the least valid value unless the inputs are from the command line
- Removed invalid-input filtering in the test suite so that invalid inputs pass through to the LAPACK API and produce an invalid-argument result
- Added negative test cases for all the leading dimensions based on einfo
- Modified the einfo logic to return the test result as "FAIL" when einfo is not provided along with invalid inputs
- Moved all the existing static global variables/file pointers in all the files to lapack.c
Signed-off-by: dnikku <[email protected]> AMD-Internal: CPUPL-3076 Change-Id: If09494376d23878768c2369e19767d8a909d23b9
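The leading-dimension rule in this commit can be illustrated with a small sketch. The helper name and the `from_command_line` flag are hypothetical and are not taken from the test suite. It assumes the usual LAPACK convention that the least valid leading dimension of a column-major matrix with `rows` rows is `max(1, rows)`.

```python
def default_leading_dimension(ld, rows, from_command_line=False):
    """Return the leading dimension a test would pass on to LAPACK.

    Hypothetical helper: -1 is read as "not specified" only for
    config-file inputs. Command-line values are passed through
    unchanged so that LAPACK itself can report an invalid argument.
    """
    if ld == -1 and not from_command_line:
        # Least valid leading dimension for a matrix with `rows` rows.
        return max(1, rows)
    return ld


if __name__ == "__main__":
    print(default_leading_dimension(-1, 5))        # config-file default
    print(default_leading_dimension(-1, 5, True))  # left invalid on purpose
```

Keeping the command-line value untouched is what lets the negative test cases reach the LAPACK argument checks.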
By default, an external project's sources are updated whenever CMake is rerun. We want to skip this update step for the AOCLUtils library. It is sufficient to clone once based on the release tag and use it. Change-Id: I9f160fb413a2c55662a247cf8999c05ebc8c003d
Enabled OMP code for the following APIs:
- {c,z}hetrd_hb2st
- {s,d}sytrd_sb2st
- iparam2stage
Signed-off-by: Jintu Das <[email protected]> AMD-Internal: CPUPL-3232 Change-Id: Iaff2cad7f6df86653f05f91025d75497831d95cd
1. Implemented AVX512 code for 8 parallel inner loop and outer loop iterations at a time.
2. Wrote a separate function using AVX2 intrinsics to calculate scale factors and update the trailing matrix, fla_lu_piv_small_d_update_tr_matrix_avx2.
3. Created separate threshold macros for AVX2 and AVX512, FLA_DGETRF_SMALL_AVX2_THRESH0 and FLA_DGETRF_SMALL_AVX512_THRESH0.
4. Added configure-time AOCL FAMILY identification through a Python script, enabling compiler flags accordingly.
5. Created an object library for AVX512-specific code and enabled the AVX512 compiler flag.
Change-Id: I05de51326c4ec2ca5703c6f7e9cc965e7bc4f18f
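For readers unfamiliar with the terms "scale factors" and "trailing matrix update", the following is a plain scalar sketch of unblocked LU factorization with partial pivoting. It is only an illustration of the textbook algorithm, not the AVX2/AVX512 kernels from this commit. The function name is hypothetical.

```python
def lu_partial_pivot(a):
    """Factor a square list-of-lists matrix in place; return pivot rows.

    After the call, the strict lower triangle holds the multipliers
    (unit-diagonal L) and the upper triangle holds U.
    """
    n = len(a)
    piv = []
    for k in range(n):
        # Pivot search: row with the largest magnitude in column k.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        piv.append(p)
        a[k], a[p] = a[p], a[k]
        if a[k][k] == 0.0:
            continue  # exactly singular column; nothing to eliminate
        for i in range(k + 1, n):
            a[i][k] /= a[k][k]             # scale factor (multiplier)
            for j in range(k + 1, n):      # trailing-matrix update
                a[i][j] -= a[i][k] * a[k][j]
    return piv


if __name__ == "__main__":
    m = [[1.0, 2.0], [3.0, 4.0]]
    print(lu_partial_pivot(m), m)
```

The two inner loops are the part a vectorized kernel would process several elements at a time.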
1. Use a separate TLS structure tl_context to track number of threads. The global context is initialized and updated with libFLAME-specific threading, and tl_context is updated from this if needed. Otherwise tl_context is updated on every call from OpenMP runtime information. Hardware ISA information remains in global_context.
2. Check OpenMP active level against max active levels when setting number of threads for starting a new parallel region to ensure the correct number of threads is used when BLIS is called within nested OpenMP parallelism, or if the user changes the parallelism via OpenMP function calls.
3. To reduce redundant code, FLASH_get_num_threads() has been adapted to call fla_thread_get_num_threads().
AMD-Internal: [CPUPL-3558] Change-Id: Idc5eb604fb42ab73cbc99148afeb7a870c98fc1a
Change-Id: I6441fd07b06b3907775509d8386d4451c42f4179
Previous design: AOCL FLA progress supported only single-threaded applications. New design: AOCL FLA progress now supports multithreaded applications. Fixed the compiler warning for aocl_fla_progress. README is updated with details on testing the feature. Signed-off-by: Parag <[email protected]> AMD-Internal: CPUPL-3753 Change-Id: I09468c3201408316f415bc41b99350fa5b2b6c23
The fPIC flag was missing with GCC, which was causing compilation issues. Added the fPIC flag. Removed unwanted ctest print statements that were printed during configuration. Signed-off-by: bsinghpa<[email protected]> Change-Id: Ie66b16aa2fbb9b89089f6c8d92df86e02d4b6f04
Fixes compiler warnings related to ilp64 build. Signed-off-by: Jintu Das <[email protected]> Change-Id: I04fdfa71d21a960e4cb666ef65f4fa4e609c7b62
- Added Code Coverage feature for Libflame
- Added Code Coverage Flags to netlib test
- Added generate_code_coverage_html.sh to view the code coverage in HTML format
- Added CTEST for netlib-test gcc and aocc
- Updated BUILD.md
Signed-off-by: bsinghpa<[email protected]> AMD-Internal: CPUPL-3685 Change-Id: Ib36cb999466521db442f3a5d116ae68533d554bc
Get the number of threads from the context and remove the unnecessary OMP taskwait from {c,z}hetrd_hb2st.c and {s,d}sytrd_sb2st.c. Signed-off-by: Jintu Das <[email protected]> AMD-Internal: CPUPL-3232 Change-Id: I85b9c417a955c1ef9823694c2b192919779df03b
Modified {s,c,d,z}trsyl3 APIs to give consistent output both in LP64 and ILP64 build. Signed-off-by: Jintu Das <[email protected]> AMD-Internal: CPUPL-3712 Change-Id: I660ab9284e27d7487ad55773da2168ecca290314
To enable closer integration with AOCL-BLAS, a new configure option, "ENABLE_AOCL_BLAS", is created to link with AOCL-BLAS at build time. The location of AOCL-BLAS is to be set by the user either through an environment variable or the CMake option "AOCL_ROOT". This location must have an "include" directory and a "lib" directory that contain the necessary header files and the AOCL-BLAS binary respectively. Currently, this feature is available for builds through CMake only. The build documentation is updated with these details in BUILD.md. AMD-Internal: CPUPL-3826 Change-Id: I5de5e8ac1491c9c2f8da6a1ff39a25b326357361
DORG2R functionality changed from the FLAME implementation to F2C-converted Netlib reference code. This is done for compatibility with the latest optimized DGEQRF, which follows Netlib's methodology. Signed-off-by: Vasanthakumar R <[email protected]> AMD-Internal: SWLCSG-2364 Change-Id: I79ccd17404ef474967bc36e812caaa8625b72074
Add hidden Fortran string length argument to xerbla function and calls. Netlib LAPACK custom test versions of xerbla no longer need to have string equality test disabled. AMD-Internal: [CPUPL-3013] Change-Id: I8b3cabfe4fc72d329733bc2e19660fd89dab25f7
Added min, max and ladiv replacements. Signed-off-by: Jintu Das <[email protected]> AMD-Internal: CPUPL-3848 Change-Id: I498990414e4f4b650cac037fabab0a456a7d9600
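The commit does not describe what the replacements look like. For context, LAPACK's ladiv routines perform complex division in real arithmetic while avoiding unnecessary overflow. The sketch below uses Smith's classic method for that purpose; it is an illustration of the idea and is not claimed to match the libflame replacement.

```python
def ladiv(a, b, c, d):
    """Return (p, q) such that p + i*q = (a + i*b) / (c + i*d).

    Smith's method: divide through by the larger of |c| and |d| so the
    intermediate products stay in range where c*c + d*d would overflow.
    """
    if abs(d) < abs(c):
        e = d / c
        f = c + d * e
        p = (a + b * e) / f
        q = (b - a * e) / f
    else:
        e = c / d
        f = d + c * e
        p = (b + a * e) / f
        q = (-a + b * e) / f
    return p, q


if __name__ == "__main__":
    print(ladiv(1.0, 2.0, 3.0, 4.0))
    # The textbook formula would overflow on c*c + d*d for these inputs.
    print(ladiv(1e300, 1e300, 1e300, 1e300))
```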
This commit fixes warnings related to function pointers and unused variables. Signed-off-by: Jintu Das <[email protected]> AMD-Internal: CPUPL-3651 Change-Id: Id1aaad558f7f333590a9e22e075525e7cb0d0b24
A new script has been added that prints the failed routines that were not caught by the netlib LAPACK test suite script. This script also prints the total number of failed tests, as well as the number of info errors and illegal value failures. Signed-off-by: Jintu Das <[email protected]> Change-Id: I7ab195600ac29087bfa33e36a5924ed8acae93a0
Added AVX context checking for the following APIs:
- fla_dhrot3_avx2
- fla_drot_avx2
- fla_zrot_avx2
- fla_dgeqrf_small_avx2
- fla_sscal_ix1_avx2
- fla_sger_avx2
Signed-off-by: Jintu Das <[email protected]> Change-Id: I59a7fd266cdbc6e76bdf74a4648dcb97142dc233
Inlined different functionalities of DGESVD. Used the AVX2 DGEQRF code. Further optimization done for N >> M and small N. Added include paths for local header files in the CMake build system. Resolved warnings in AVX2 files. Signed-off-by: Vasanthakumar R <[email protected]> AMD-Internal: CPUPL-3251 Change-Id: I47fb4eae8f830dbffa5f42473b45414043b32016
- Added ctest for main_test micro/medium/short/long
- Updated BUILD.md for installing the libflame library
Signed-off-by: bsinghpa<[email protected]> AMD-Internal: CPUPL-3883 Change-Id: Iecc690096d064e9598dda34e6186481fe03b5710
Exclusion of the x86-opt header files from FLAME.h was causing errors while linking applications. The fix removes these inclusions from FLA_lapack_var_prototypes.h, along with related changes. Signed-off-by: Vasanthakumar R <[email protected]> Change-Id: Ie0a8a1b63ed1d8991d60a3a4fc54268c970cc910
Added support for testing auxiliary APIs. Added ROT and LARTG API test code in the libflame main test suite. Signed-off-by: Parag <[email protected]> AMD-Internal: CPUPL-3759 Change-Id: Id0ecc92e297dae603b94952596ab7dc0a1cadfc3
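As background for the two routines named here: LARTG generates a plane (Givens) rotation that zeroes the second component of a pair, and ROT applies such a rotation to paired vectors. The sketch below shows that relationship in plain Python. The sign convention is simplified (r is non-negative whenever both inputs are nonzero), so it should be read as an illustration and not as a reproduction of the LAPACK routines or of the test code added in this commit.

```python
import math


def lartg(f, g):
    """Return (c, s, r) with c*f + s*g = r and -s*f + c*g = 0."""
    if g == 0.0:
        return 1.0, 0.0, f
    if f == 0.0:
        return 0.0, 1.0, g
    r = math.hypot(f, g)
    return f / r, g / r, r


def rot(x, y, c, s):
    """Apply the plane rotation to each pair (x[i], y[i])."""
    new_x = [c * xi + s * yi for xi, yi in zip(x, y)]
    new_y = [c * yi - s * xi for xi, yi in zip(x, y)]
    return new_x, new_y


if __name__ == "__main__":
    c, s, r = lartg(3.0, 4.0)
    # Rotating the generating pair should give (r, 0) up to rounding.
    print(rot([3.0], [4.0], c, s), r)
```

A natural validation, in the spirit of a test suite, is to check that applying the rotation to the generating pair yields (r, 0) and that c*c + s*s is 1 up to rounding.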
Added new test API to verify LAPACK ORG2R API functionality AMD-Internal: CPUPL-3861 Signed-off-by: dnikku <[email protected]> Change-Id: I42f666fcfed1eac28fd0be6af534efdf9f5b1c21
Set the input.global.operations files to default values AMD-Internal: CPUPL-3861 Signed-off-by: dnikku <[email protected]> Change-Id: I2a73956d1ab8afc35f3857dce996c2e3103bd7e4
…ues of GESV API details: Added overflow and underflow test cases. AMD-Internal: [CPUPL-4738] Signed-off-by: ksaithar <[email protected]> Change-Id: I583f97a93e22235b32af92a1bedd7998eb1f1e22
…ues of GEEVX API details: Added overflow and underflow test cases. AMD-Internal: [CPUPL-5384] Signed-off-by: ksaithar <[email protected]> Change-Id: I461de0d8ba4595b98d5f1afba52b48625bddbe96
1. Added matrix initialization and scaling for overflow/underflow tests. 2. Added test cases through ctest. AMD-Internal: [CPUPL-4762] Signed-off-by: sujithhp <[email protected]> Change-Id: Iff4b0e4d292baedd721ec40813c890fbcced4a8e
…t suite. Removed the AOCL FLA Progress test suite, as GBTRF and GBTRS are implemented in the main test suite. Change-Id: I23b2faab80d93105beab7f1e3c1b69eda8b6a736
Added a default path for the following functions: 1. fla_drot 2. fla_sger 3. fla_dhrot3 Signed-off-by: Sridhar Govindaswamy <[email protected]> Change-Id: Iff7a41a6b229a8cf1f1c30e1dc72898fad4151e3
Test the GESDD API by generating the input matrix from known singular values and validating the output singular values against the input ones. AMD-Internal: CPUPL-5333 Signed-off-by: dnikku <[email protected]> Change-Id: Ie678f6974def0b12d4981a481e8fffdca4bc8bb0
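The validation idea in this commit can be sketched on a 2x2 case: build A = U * diag(sigma) * V^T from chosen singular values and orthogonal factors, then check that a singular value computation returns the same values. In the sketch below a closed-form 2x2 formula stands in for GESDD, and all function names are hypothetical; the actual test suite code is not shown in this pull request page.

```python
import math


def rotation(theta):
    """2x2 orthogonal (rotation) matrix."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def matrix_from_singular_values(sigma, theta_u, theta_v):
    """A = U * diag(sigma) * V^T with 2x2 rotations U and V."""
    u, v = rotation(theta_u), rotation(theta_v)
    d = [[sigma[0], 0.0], [0.0, sigma[1]]]
    return matmul(matmul(u, d), transpose(v))


def singular_values_2x2(a):
    """Singular values, largest first, from the eigenvalues of A^T A."""
    ata = matmul(transpose(a), a)
    trace = ata[0][0] + ata[1][1]
    det = ata[0][0] * ata[1][1] - ata[0][1] * ata[1][0]
    disc = math.sqrt(max(trace * trace / 4.0 - det, 0.0))
    return [math.sqrt(trace / 2.0 + disc),
            math.sqrt(max(trace / 2.0 - disc, 0.0))]


if __name__ == "__main__":
    a = matrix_from_singular_values([3.0, 1.0], 0.7, -1.2)
    print(singular_values_2x2(a))  # compare against the chosen [3.0, 1.0]
```

Because the expected singular values are known by construction, the check does not depend on a second reference implementation.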
Fixed the scaling factor for svd input matrix AMD-Internal: CPUPL-5352 Signed-off-by: dnikku <[email protected]> Change-Id: I0a7713c7f14d9c5a075603b45023d616f4a75144
…e path AMD Internal : [CPUPL-5549] Change-Id: I1597cfb4a10afdb779c2845638fec63a9ba78e6e
AMD Internal : [CPUPL-5549] Change-Id: Iae6972dde9f9382fa792de860bfb672061b8db90
Enabled validation test-2 and test-3 for jobz = s Signed-off-by: venkatesha <[email protected]> Change-Id: I11fc29086a75b071e5fbd37337762a32da7198fb
AMD Internal : [CPUPL-5549] Change-Id: I78a99fd7ec2f681e0420f86080c2f244d643a46c
Added LAPACKE interface testing support in the main test suite for 4.2 APIs AMD-Internal: CPUPL-4258 Signed-off-by: dnikku <[email protected]> Change-Id: I6acbd421bde95044e498d150625606d23ca8acfc
…indows AMD Internal : [CPUPL-5576] Change-Id: If66ad8edbe3dd0b0df06f54f29f29c9a0ee4119f
Added LAPACKE prototypes and removed the lapacke.h inclusion to fix make build errors. [Todo: Include the lapacke header file and remove duplicate prototypes in the main test suite] Signed-off-by: dnikku <[email protected]> Change-Id: Idc923a4aa8618cab17f772910cdd826a7aa3532b
1. Added matrix initialization and scaling for overflow/underflow tests. 2. Added test cases through ctest. AMD-Internal: [CPUPL-4761] Signed-off-by: sujithhp <[email protected]> Change-Id: I15dec63e003758fd3d29a24db6c73bdd6876ffff
details: Extended the range for eigenvalue selection to fix an overflow failure. AMD-Internal: [CPUPL-5610] Signed-off-by: ksaithar <[email protected]> Change-Id: I330cb966652f051edaedbe911cc3a2315c4f66ae
…ss AOCL libraries AMD Internal : [CPUPL-5607] Change-Id: Id417a215a6fceb08594ca65c8364d7ee197db352
A few of the C++ template functions were defined twice in the libflame_interface header file. The duplicate functions are removed. AMD-Internal: CPUPL-5580 Change-Id: Ia3253cb0eca46aa8371fb0c5d9d2e6b526042a71
Fixed the jobz=O validation error. Enabled jobz=O tests from the config file for gesdd. Signed-off-by: Venkatesha <[email protected]> AMD-Internal: [CPUPL-5557] Change-Id: Ifd89981a4afd7f5fec9462c6d3c1e50bc0e46b56
Fixed an issue in input initialization for gtsv testing. Changed file input reading to read the RHS matrix. Signed-off-by: Vasanthakumar R <[email protected]> AMD-Internal: CPUPL-5568 Change-Id: Id191f9e618ae50d5a8ca93cc3bbbb7b4f7701738
Fixed an issue with the python3 executable on Windows. Signed-off-by: Venkatesha <[email protected]> AMD-Internal: [CPUPL-5638] Change-Id: Ifb4386b3f0a82d9e5bd8f26ee7730c8ce4093077
- pkg-config metadata file generation on installation through cmake
- File generated at ${CMAKE_INSTALL_PREFIX}/share/pkgconfig
- Formatting fixes for BUILD.md with additional information about pkg-config
Signed-off-by: samahmad <[email protected]> AMD-Internal: CPUPL-5624 Change-Id: I0d8728d9b91c2828da68559dee57876db424e3ca
Updated formulae format in doxygen comments Signed-off-by: Venkatesha <[email protected]> AMD-Internal: [CPUPL-5638] Change-Id: Id01ab1993250704391b79e7305ce7528ea038ece
Changed thresholds for EV range selection used for random matrix generation. Signed-off-by: Vasanth R <[email protected]> AMD-Internal: [CPUPL-5693] Change-Id: Ic71d4007aaf6d6132f4f4644d95306f1a8119930
Changed the lartg function declaration in C++ header file to set first 2 parameters to pointers instead of values to match with LAPACK API equivalent. AMD-Internal: CPUPL-5748 Change-Id: I4271792782b1a6788b3e1604679ead16b8c78e8c
- Removed quotes on paths in pkg-config template file Signed-off-by: samahmad <[email protected]> AMD-Internal: CPUPL-5624 Change-Id: I724f62b02ea2f52150c0313dc93c10fadf66eb49
Version of AOCL-LAPACK upgraded to 5.0.0 from 4.2.0 Signed-off-by: ksaithar <[email protected]> Change-Id: If537ee27be337a69c56608a88a1f42e7031719de
Change-Id: I65b53cf57fb82528c7f581c50362b468c5aec00f
Change-Id: I74067c461c68cff55c6b5f02a9080e3df9352b9e
Change-Id: I5259e9b457dbdd10f6c447db70e9bdb1d1bdf4aa
Field,
Please review and merge them.
Thanks