Replies: 6 comments 11 replies
-
By "similar tools", I mean things like:
-
OK, now the question is clearer.
-
I understand the goal of the project.
-
Not impressive at all with the first example:
-
The bigger the model, the slower the embedding, and CPU instead of GPU is of course the worst of the four combinations (small/big model × CPU/GPU). The MTEB leaderboard is the way to go when choosing a performant model for your hardware. Most people should be able to get the NVIDIA SDK loaded into the Docker container, and I think most projects auto-install it if you don't have it. Ollama running everything on the GPU is going to be best, and most embedding models are small enough that this should be your best bet.

Ollama has quite a few nice embedding models in its library. The Nomic embedding worked quite well for me (they always do good stuff), but I found the Mixed Bread model slightly more performant. That matters when you are upserting hundreds of documents into the vector database for your RAG pipeline.

I'm not 100% sure what buffer size this project uses for the scraping delay, but there was a post on Reddit going around that piqued my interest :)
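For reference, here is a minimal sketch of that embed-then-upsert step using Ollama's embeddings endpoint. The model name, chunk texts, and local URL are just placeholder assumptions, not anything from this project; swap in whatever model scores well for you on MTEB:

```python
# Minimal sketch: embed scraped chunks with a local Ollama model before
# upserting them into a vector store. Assumes Ollama is running on
# localhost:11434 and the model has already been pulled
# (e.g. `ollama pull nomic-embed-text` or `ollama pull mxbai-embed-large`).
import requests

OLLAMA_URL = "http://localhost:11434/api/embeddings"
MODEL = "mxbai-embed-large"  # or "nomic-embed-text", etc.

def embed(text: str) -> list[float]:
    """Return the embedding vector for a single chunk of text."""
    resp = requests.post(OLLAMA_URL, json={"model": MODEL, "prompt": text})
    resp.raise_for_status()
    return resp.json()["embedding"]

# Hypothetical scraped chunks; in practice these come from the scraper.
chunks = [
    "First paragraph scraped from the target page...",
    "Second paragraph scraped from the target page...",
]
vectors = [embed(chunk) for chunk in chunks]
# `vectors` can now be upserted into whatever vector database the pipeline uses.
```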
-
Will this be helpful for websites that have enabled anti-scraping techniques?
-
Hello,
how does it compare to similar (i.e., with the same goal) tools in terms of
Good idea though ;)