Server Limitations? #30
Replies: 8 comments
-
Hi, we have 16 GB on the server, but the tool is also running spaCy, and ideally Harmony should not crash when multiple users are on it concurrently. Do you have a different LLM in mind? |
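For anyone sizing a candidate model against that limit, here is a minimal back-of-the-envelope sketch. It assumes the 16 GB refers to memory and only counts the model weights; the function names and the reserved amount for spaCy are hypothetical and not taken from the Harmony codebase. Real usage will be higher once activations and concurrent requests are included.

```python
def estimate_model_memory_gb(num_parameters: int, bytes_per_parameter: int = 4) -> float:
    """Rough lower bound on the memory needed just to hold the model weights.

    bytes_per_parameter=4 assumes 32-bit floats; use 2 for 16-bit weights.
    """
    return num_parameters * bytes_per_parameter / 1024**3


def fits_in_budget(
    num_parameters: int,
    budget_gb: float,
    reserved_gb: float = 0.0,
    bytes_per_parameter: int = 4,
) -> bool:
    """Check whether the weights fit in the budget after reserving memory
    for everything else on the server (e.g. spaCy). reserved_gb is a guess
    you would need to measure on the real machine."""
    needed = estimate_model_memory_gb(num_parameters, bytes_per_parameter)
    return needed <= budget_gb - reserved_gb


if __name__ == "__main__":
    # Hypothetical parameter count, for illustration only.
    candidate_parameters = 110_000_000
    print(f"{estimate_model_memory_gb(candidate_parameters):.2f} GB for weights alone")
    print(fits_in_budget(candidate_parameters, budget_gb=16, reserved_gb=4))
```

This is only a first filter: a model that passes it can still run out of memory under load, so it is worth measuring on the server itself.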
-
I want to test other open-source embedding options on Hugging Face but wanted to know if there were any space limitations first. I was also interested in different ways of parsing the questions, for example removing information from the questions that may not be relevant to the overall meaning, but I would have to take a closer look at the types of questions first. Is there a test set available that we can use for testing performance? |
-
Yes, there is. Please try the scripts in https://github.com/harmonydata/matching, which test different LLMs against a number of datasets. This notebook shows the results on those datasets: https://github.com/harmonydata/matching/blob/main/analyse_results.ipynb |
-
Hi, I tested the model in the final column, which seems to perform better than the model in production. I can make a pull request today if there are no other tests that need to be checked. |
-
Hi
Thanks so much! That's fantastic! Yes please feel free to make the PR but
first can you check that the API server runs locally for you with this
change and the unit tests pass? Thanks!
On Sun, 31 Mar 2024 at 05:16, sourface94 wrote:
Hi I tested the model in the final column which seems to perform better than the model in production. I can make a pull request today if there are no other tests that need to be checked.
image.png: https://github.com/harmonydata/harmony/assets/15061574/5531b2f9-2a6c-4c1f-ac0e-426200f27ca7
output.xlsx: https://github.com/harmonydata/harmony/files/14814108/output.xlsx
|
-
Hi, all tests passed apart from harmony/tests/test_match_mhc.py (line 53 at commit 9c4cdfc), where the embedding length is hardcoded. This length does not match the length of the embeddings for the model I tested. Is this OK? |
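One way to avoid this kind of failure when swapping models is to derive the expected length from the model at test time instead of hardcoding it. The sketch below is only an illustration of that idea: `toy_embed` is a hypothetical stand-in for whatever function produces the sentence embeddings, and none of these names come from the Harmony test suite.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
EmbedFn = Callable[[List[str]], Sequence[Vector]]


def embedding_dimension(embed: EmbedFn) -> int:
    """Return the vector length produced by `embed` by encoding one probe string."""
    vectors = embed(["probe"])
    return len(vectors[0])


def assert_consistent_dimension(vectors: Sequence[Vector], expected_dim: int) -> None:
    """Raise ValueError if any vector does not have the expected length.

    This is the check that fails when precomputed embeddings were generated
    with a different model from the one used at query time.
    """
    for index, vector in enumerate(vectors):
        if len(vector) != expected_dim:
            raise ValueError(
                f"vector {index} has length {len(vector)}, expected {expected_dim}"
            )


def toy_embed(texts: List[str]) -> List[List[float]]:
    """Hypothetical stand-in for a real embedding model: always returns 3 values."""
    return [[float(len(text)), 0.0, 1.0] for text in texts]


if __name__ == "__main__":
    dim = embedding_dimension(toy_embed)
    # Passes silently because both sides come from the same "model".
    assert_consistent_dimension(toy_embed(["How are you?", "Feeling low"]), dim)
    print("dimension:", dim)
```

Any precomputed embeddings still have to be regenerated with the new model; a check like this only makes the mismatch explicit rather than tied to one model's vector size.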
-
OK thanks. I guess we need to regenerate the Mental Health Catalogue
embeddings for the new LLM too.
That code is here: https://github.com/harmonydata/mentalhealthcatalogue_etl
but I appreciate it's not properly documented. Do you want to make your PR?
If you can see an easy way to fix the Mental Health Catalogue integration,
you could add it; if not, I can add it in (it will be next week as I'm not
working this week).
On Mon, 1 Apr 2024 at 00:45, sourface94 wrote:
Hi, all tests passed apart from TestMatchMhc due to the embeddings being a hardcoded length here:
https://github.com/harmonydata/harmony/blob/9c4cdfce74e5fb61be2f2c7a824aceafb864c2c4/tests/test_match_mhc.py#L53
This length does not match the length of the embeddings for the model I tested.
|
-
Great, I've made the request and will have a look at the repo you shared. |
-
Hi, I was wondering if there are any server limitations with regard to using other language models when getting the sentence embeddings?