
[Model Request] 3 bit, 7B Luna-AI-Llama2-Uncensored-GGML #633

Closed
artemgordinskiy opened this issue Aug 1, 2023 · 9 comments

Comments


artemgordinskiy commented Aug 1, 2023

⚙️ Request New Models

Additional context

Hi,
I've tried running this model on my iPhone 13 Pro Max, but it crashes instantly:
https://huggingface.co/mlc-ai/mlc-chat-llama2-7b-chat-uncensored-q4f16_1

However, this one runs really well:
https://huggingface.co/mlc-ai/mlc-chat-Llama-2-7b-chat-hf-q3f16_1

I think 7b-chat-uncensored would work as well if it were quantized at 3 bits instead of 4. Would you be able to add it to your HuggingFace space?
I'm also open to doing this myself, but the last time I tried, quantization did not work on my M1 MacBook using llama.cpp.

@CharlieFRuan
Contributor

Hi @artemgordinskiy, thanks for reaching out! Could you try following this tutorial to compile it?

@acalatrava
Contributor

I compiled it for you, just add this URL to download the model and it should work:
https://huggingface.co/acalatrava/mlc-chat-luna-ai-llama2-7b-chat-uncensored-q3f16_1

#692

@artemgordinskiy
Author

@acalatrava Thank you!
Unfortunately, it crashes instantly on load for me.
I don't know how to debug this, since I don't see any error message. I'm guessing it's an out-of-memory crash, but the "censored" model ran fine, so I'm not sure what it is...

@acalatrava
Contributor

My bad... The problem is that the public MLC-LLM iPhone app does not ship the compiled library for this model, so it crashes on load when it can't find one. Every time you create a new model, you have to package the app together with that model's library. If you have a Mac, you should be able to build and install the app yourself by following the instructions in the docs.

@acalatrava
Contributor

Now that I think about it more, it might work by modifying the mlc-chat-config.json file, so I did that and uploaded it to Hugging Face. @artemgordinskiy, please try removing the model and downloading it again; it may work now.

Unfortunately, I cannot test it myself, since I only have an iPhone 12...
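For context, the workaround above likely amounts to editing the model's mlc-chat-config.json so that it points at a model library the public app already bundles (the stock Llama-2 q3f16_1 one). A minimal sketch, assuming the config schema from the MLC-LLM docs of that period; the exact field values here are illustrative, not taken from the actual uploaded file:

```json
{
  "model_lib": "Llama-2-7b-chat-hf-q3f16_1",
  "local_id": "mlc-chat-luna-ai-llama2-7b-chat-uncensored-q3f16_1",
  "conv_template": "llama-2",
  "temperature": 0.7,
  "top_p": 0.95
}
```

The app looks up the compiled library by the `model_lib` name, so reusing a name the app already ships avoids the missing-library crash, as long as the substituted model was compiled with the same architecture and quantization.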

@artemgordinskiy
Author

@acalatrava This worked, thank you! 🙌

@acalatrava
Contributor

Great! It seems I should read the docs too, since it's explained here: https://mlc.ai/mlc-llm/docs/get_started/mlc_chat_config.html#configure-mlc-chat-json 😅

@dylanbeadle

Great model! Thank you!

@Re4mer

Re4mer commented Oct 18, 2023

Hi @acalatrava,
Could you please make a q4f16_1 version of luna-ai-llama2-7b-chat-uncensored?
I want to use it on an Android phone, and the Android app does not support 3-bit quantization, unlike the iOS app.
