benchmarking between AVX/SSE4.1 #62

Open
jg1uaa opened this issue Sep 3, 2023 · 4 comments


@jg1uaa
Contributor

jg1uaa commented Sep 3, 2023

SSE support has been renewed, so I ran a benchmark.

method:

$ cd LPCNet/build_dir/src
$ cat ../../wav/all.wav | ./lpcnet_enc -s > test.out
$ time cat test.out | ./lpcnet_dec -s > /dev/null

results:

| CPU | build | time (real) | Note |
|---|---|---|---|
| Intel Core i7-7700 | AVX | 7.132s | *1 |
| Intel Core i7-7700 | SSE4.1 | 8.941s | *1 |
| AMD A8-7600 | AVX | 15.146s | *1 |
| AMD A8-7600 | SSE4.1 | 16.453s | *1 |
| Intel Core i3-13100 | AVX | 3.730s | *2 |
| Intel Core i3-13100 | SSE4.1 | 4.870s | *2 |
| Intel Core i7-7700 | AVX | N/A | *3 |
| Intel Core i7-7700 | SSE4.1 | 29.428s | *3 |
| Intel Core i7-7700 | SSE4.1 | 10.858s | *4 |

(*1)Debian-12.1/x86_64, gcc-12.2.0
(*2)Ubuntu-22.04.3/x86_64 LTS on WSL2, gcc-11.4.0
(*3)Slackware-15.0/i686 on QEMU-7.2.4/KVM, gcc-11.2.0
(*4)Slackware-15.0/i686 on QEMU-7.2.4/KVM, clang-13.0.0

QEMU on Slackware did not support AVX instructions, so there is no AVX result for that environment.

conclusion:
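On x86_64, the SSE4.1 build is somewhat slower than AVX (about 9% to 31% in the runs above), but I think we can ignore this disadvantage.

On i686, the speed of the SSE4.1 build depends on the compiler (29.428s with gcc-11.2.0 versus 10.858s with clang-13.0.0).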

suggestion:
We can use SSE4.1 as the default on x86_64.
With a compiler that optimizes well, we will be able to do the same on i686.
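
For anyone who wants to check the comparison, here is a small sketch that turns the x86_64 timings from the results table into a relative slowdown. The helper name `slowdown_percent` is only illustrative and is not part of LPCNet; the numbers are copied from the table.

```python
# Wall-clock decode times in seconds, copied from the results table.
# Only the x86_64 rows are used, because the i686 rows have no AVX result.
timings = {
    "Intel Core i7-7700": {"AVX": 7.132, "SSE4.1": 8.941},
    "AMD A8-7600": {"AVX": 15.146, "SSE4.1": 16.453},
    "Intel Core i3-13100": {"AVX": 3.730, "SSE4.1": 4.870},
}


def slowdown_percent(avx_seconds, sse_seconds):
    """Extra time taken by the SSE4.1 build, as a percentage of the AVX time."""
    return (sse_seconds / avx_seconds - 1.0) * 100.0


for cpu, t in timings.items():
    pct = slowdown_percent(t["AVX"], t["SSE4.1"])
    print(f"{cpu}: SSE4.1 build is {pct:.1f}% slower than AVX")
```

Each figure comes from a single run per configuration, so treat the percentages as rough indications rather than precise measurements.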

@drowe67
Owner

drowe67 commented Sep 3, 2023

Thanks for your analysis @jg1uaa. Can you please tell me how you are using LPCNet? We have found that FreeDV 2020 is not very robust to HF channels, and is not used by many people.

So we are not actively developing LPCNet and FreeDV 2020 at this time.

@tmiw
Collaborator

tmiw commented Sep 3, 2023

In freedv-gui, we test for AVX and also time the decode of random audio to ensure reliable decoding (i.e. at least a bit faster than real time). One question I have is whether disabling AVX and compiling only SSE would open up the ability to use 2020 modes on any additional machines. In other words, outside of QEMU, are there machines that would be fast enough to decode 2020 with SSE alone that aren't capable of using AVX?

OTOH given @drowe67's comment above this question may be moot.
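
To find out which group a given Linux machine falls into, a rough sketch along these lines may help. It is not how freedv-gui performs its check; it assumes an x86 Linux system where `/proc/cpuinfo` has a `flags` line, and the function names are made up for illustration.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def simd_level(flags):
    """Classify a CPU by the best of the two instruction sets discussed here."""
    if "avx" in flags:
        return "AVX"
    if "sse4_1" in flags:
        return "SSE4.1"
    return "none"


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        # Not Linux, or /proc is unavailable: report "none" rather than guess.
        text = ""
    print(simd_level(parse_cpu_flags(text)))
```

A machine that reports `SSE4.1` here is one that could only run an SSE build. Whether it is also fast enough still has to be measured with a timed decode.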

@jg1uaa
Contributor Author

jg1uaa commented Sep 4, 2023

@drowe67 Sorry, I don't use FreeDV (in any mode) because I do not have a station license for it. In Japan, we have to submit an application to use a non-standard mode (for example FT8, FreeDV, SSTV, FAX and so on) and obtain a station license for it.

@tmiw all.wav is 49 seconds long. I think decoding should be stable if it takes less than 50% of the duration of the original audio, but I have no evidence for that.
I have a Pentium G4600 machine; it is 6th-Gen and has no AVX support, so SSE support is mandatory there. However, current 12th-Gen based Celeron/Pentium CPUs have AVX, so keeping SSE support off by default is also an option.
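
As a rough check of that 50% rule of thumb against the measured decode times, here is a small sketch. The 0.5 threshold is only the unverified guess from this comment, and `realtime_factor` is an illustrative name, not anything from LPCNet or freedv-gui.

```python
AUDIO_SECONDS = 49.0   # length of all.wav, as stated in this comment
MAX_FACTOR = 0.5       # assumed safety margin ("under 50%"), not a verified limit

# (CPU, build, environment note) -> decode wall-clock time in seconds, from the results table.
decode_seconds = {
    ("Intel Core i7-7700", "AVX", "*1"): 7.132,
    ("Intel Core i7-7700", "SSE4.1", "*1"): 8.941,
    ("AMD A8-7600", "AVX", "*1"): 15.146,
    ("AMD A8-7600", "SSE4.1", "*1"): 16.453,
    ("Intel Core i3-13100", "AVX", "*2"): 3.730,
    ("Intel Core i3-13100", "SSE4.1", "*2"): 4.870,
    ("Intel Core i7-7700", "SSE4.1", "*3"): 29.428,
    ("Intel Core i7-7700", "SSE4.1", "*4"): 10.858,
}


def realtime_factor(decode_s, audio_s=AUDIO_SECONDS):
    """Decode time as a fraction of the audio duration (below 1.0 is faster than real time)."""
    return decode_s / audio_s


for (cpu, build, note), seconds in decode_seconds.items():
    factor = realtime_factor(seconds)
    verdict = "within margin" if factor < MAX_FACTOR else "over margin"
    print(f"{cpu} {build} ({note}): {factor:.2f}x real time, {verdict}")
```

Under this assumption, only the i686 gcc build (*3) goes over the margin, at about 0.60 of real time; every other configuration stays below 0.35.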

@tmiw
Collaborator

tmiw commented Sep 4, 2023

Considering that we're probably going to deprecate this repo soon, I think AVX can be kept mandatory. Thanks for the testing, however!

@drowe67, good to close?
