I fine-tuned Evo with LoRA and then ran inference with the KV cache enabled. The outputs with the cache on are wrong: when I turn the cache off, the results are correct. What could be wrong with the cache?
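A useful sanity check here is the invariant the report relies on: for plain attention, decoding with a KV cache must reproduce the no-cache outputs exactly, since caching only avoids recomputing past K/V projections. The toy below (pure Python, hypothetical weights; not Evo's real code, whose StripedHyena blocks also cache convolutional/recurrent state) demonstrates that equivalence. If the real model diverges only with the cache on, one plausible failure mode is that the LoRA deltas are applied in the full forward pass but not in the cached decode path, so the two paths project K/V with different effective weights.

```python
import math

D = 4  # toy model/head dimension

def proj(x, w):
    # x: vector of length D, w: D x D matrix -> vector of length D
    return [sum(x[i] * w[i][j] for i in range(D)) for j in range(D)]

# Fixed toy projection matrices (stand-ins for possibly LoRA-modified weights).
Wq = [[0.1 * ((i + j) % 3 - 1) for j in range(D)] for i in range(D)]
Wk = [[0.2 * ((i * j) % 3 - 1) for j in range(D)] for i in range(D)]
Wv = [[0.1 * ((i + 2 * j) % 5 - 2) for j in range(D)] for i in range(D)]

def attend(q, ks, vs):
    # Scaled dot-product attention of one query over all cached keys/values.
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(D) for k in ks]
    m = max(scores)
    e = [math.exp(s - m) for s in scores]
    z = sum(e)
    w = [x / z for x in e]
    return [sum(w[t] * vs[t][j] for t in range(len(vs))) for j in range(D)]

def forward_no_cache(xs):
    # Cache off: recompute K/V for the whole prefix at every step.
    outs = []
    for t in range(len(xs)):
        ks = [proj(x, Wk) for x in xs[: t + 1]]
        vs = [proj(x, Wv) for x in xs[: t + 1]]
        outs.append(attend(proj(xs[t], Wq), ks, vs))
    return outs

def forward_with_cache(xs):
    # Cache on: append one K/V pair per step and reuse the cache.
    k_cache, v_cache, outs = [], [], []
    for x in xs:
        k_cache.append(proj(x, Wk))
        v_cache.append(proj(x, Wv))
        outs.append(attend(proj(x, Wq), k_cache, v_cache))
    return outs

xs = [[0.5, -1.0, 2.0, 0.1], [1.0, 0.0, -0.5, 0.3], [-0.2, 0.7, 0.4, -1.1]]
a, b = forward_no_cache(xs), forward_with_cache(xs)
max_diff = max(abs(u - v) for ra, rb in zip(a, b) for u, v in zip(ra, rb))
print(max_diff < 1e-12)  # the two paths should agree to numerical precision
```

Running the same kind of side-by-side comparison on the fine-tuned model (same prompt, greedy decoding, cache on vs off, after merging the LoRA weights into the base model) would help isolate whether the divergence comes from the adapter handling or from the cache implementation itself.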
I needed to generate embeddings for 127,906 sequences ranging from 1 to 40 kb, which took 168 hours. Can you suggest ways to speed this up, such as model distillation?
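Independent of distillation, one common throughput win for embedding extraction over sequences of very different lengths (1 to 40 kb here) is length bucketing: sort sequences by length and batch similar lengths together under a token budget, so short sequences are not padded out to 40 kb. The sketch below is a generic, hypothetical helper (`bucket_by_length` and its `max_tokens_per_batch` parameter are not part of the Evo API) showing the idea in pure Python.

```python
def bucket_by_length(seq_lengths, max_tokens_per_batch):
    """Group sequence indices into batches of similar length.

    Each batch costs roughly max(length in batch) * batch size tokens once
    padded; we keep that product under max_tokens_per_batch. A single
    sequence longer than the budget still gets its own batch.
    """
    order = sorted(range(len(seq_lengths)), key=lambda i: seq_lengths[i])
    batches, cur, cur_max = [], [], 0
    for i in order:
        new_max = max(cur_max, seq_lengths[i])
        if cur and new_max * (len(cur) + 1) > max_tokens_per_batch:
            batches.append(cur)          # close the current batch
            cur, cur_max = [], 0
            new_max = seq_lengths[i]
        cur.append(i)
        cur_max = new_max
    if cur:
        batches.append(cur)
    return batches

# Example: short sequences share a batch, the two ~40 kb ones run alone.
lengths = [1000, 40000, 1200, 39000, 5000, 4800]
batches = bucket_by_length(lengths, max_tokens_per_batch=50000)
print(batches)  # -> [[0, 2, 5, 4], [3], [1]]
```

Combined with half-precision inference and truncating to the layer whose embeddings you actually use, this kind of batching often recovers a large fraction of the time lost to padding, without retraining anything.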