[HW] Add support for vector permute instructions #180

Closed
M-Ijaz-10x wants to merge 8 commits from the vpermute_v1 branch

M-Ijaz-10x
Contributor

Add support for the vrgather, vrgatherei16, and vcompress vector permute instructions.
NOTE: This PR depends on PR #149 and must be merged after it.
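
For reviewers less familiar with these instructions, here is a minimal Python sketch of their architectural behavior as described in the RISC-V vector extension. It is a reference model only, not the RTL in this PR. The function names, the list-based register model, and the choice to leave tail elements undisturbed in `vcompress` are illustrative assumptions; masking and LMUL handling are omitted.

```python
from typing import List


def vrgather(vs2: List[int], idx: List[int], vlmax: int) -> List[int]:
    """vd[i] = vs2[idx[i]], or 0 when idx[i] is out of range (>= VLMAX)."""
    return [vs2[i] if i < vlmax else 0 for i in idx]


def vrgatherei16(vs2: List[int], idx: List[int], vlmax: int) -> List[int]:
    """Same gather, but index elements are 16 bits wide regardless of SEW.

    The 16-bit width is modeled here by masking each index (an assumption
    of this sketch; in hardware the index register simply has EEW=16).
    """
    return vrgather(vs2, [i & 0xFFFF for i in idx], vlmax)


def vcompress(vs2: List[int], mask: List[int], vd_old: List[int]) -> List[int]:
    """Pack the elements of vs2 whose mask bit is set into the start of vd.

    Remaining destination elements are left undisturbed in this sketch.
    """
    out = list(vd_old)
    j = 0
    for elem, bit in zip(vs2, mask):
        if bit:
            out[j] = elem
            j += 1
    return out


if __name__ == "__main__":
    print(vrgather([10, 20, 30, 40], [3, 0, 7, 1], vlmax=4))
    print(vcompress([1, 2, 3, 4], [1, 0, 1, 1], [9, 9, 9, 9]))
```

A model like this can serve as a golden reference when checking the vrgather, vrgatherei16, and vcompress tests listed in the changelog.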

Changelog

Fixed

  • N/A

Added

  • Support for vrgather, vrgatherei16 and vcompress vector permute instructions
  • Test for vrgatherei16 permute instruction

Changed

  • Tests for vrgather and vcompress permute instructions

Checklist

  • Automated tests pass
  • Changelog updated
  • Code style guideline is observed

Please check our contributing guidelines before opening a Pull Request.

@mp-17
Collaborator

mp-17 commented Nov 21, 2022

Amazing, great job @M-Ijaz-10x! I will review it today!

@M-Ijaz-10x
Contributor Author

@mp-17 I have rebased this branch onto the latest main. Can you please review the changes?

@mp-17
Collaborator

mp-17 commented Dec 8, 2022

Hello @M-Ijaz-10x, the backend run is almost finished! I will check again in 1h and then review it!

@mp-17
Collaborator

mp-17 commented Dec 8, 2022

The run is not over yet, but I am fairly sure it will finish overnight; the servers are busy. BTW, I see that one check is hanging with 2 lanes.
EDIT: actually, the run is pretty slow on the system only, and it's still synthesizing even though the server is not that busy, so I think there is something weird with the RTL. I will check tomorrow.

@mp-17
Collaborator

mp-17 commented Dec 12, 2022

Okay, there is definitely something wrong in the RTL, since the synthesis is still ongoing. I had a look at it, and I see why you used the mask unit instead of the slide unit (vrgather requires 3 elements per cycle from each lane, and vcompress needs bit-level deshuffling).
Anyway, I am not sure about some of the logic; do you have a schematic or a written summary of how you implemented the instructions? This would help. Unfortunately, the module is getting very large, and it needs some refactoring to keep the hardware under control.

M-Ijaz-10x force-pushed the vpermute_v1 branch 4 times, most recently from df69f5d to a4f58b4 (December 26, 2022 05:32)
M-Ijaz-10x force-pushed the vpermute_v1 branch 2 times, most recently from 7cea91b to 70cc157 (December 28, 2022 16:21)

@M-Ijaz-10x
Contributor Author

@mp-17 I have fixed the broken checks in RTL. Hopefully the CIs will be green this time!

@mp-17
Collaborator

mp-17 commented Jan 26, 2023

Hey @M-Ijaz-10x, thank you very much! I will start a CI run to see if there are still problems with 4 lanes.

@M-Ijaz-10x
Contributor Author

@mp-17 Are the checks clear?

@mp-17
Collaborator

mp-17 commented Dec 9, 2024

Closing this PR, as this was addressed by a larger refactoring of the MASKU, which would otherwise have become too large. Thanks a lot!

mp-17 closed this on Dec 9, 2024