Choose from ('weather' or 'other topic')
It seems date processing will not work (because it requires date math), e.g. today is 30/01/2000, next week is 07/02/2000, but the model does not get this right (a deterministic workaround is sketched after the list below).
- Using the sharded Wizard-Vicuna checkpoint "hiepnh/Wizard-Vicuna-7B-Uncensored-HF-sharded" (LLaMA 1) works better.
- Asking "when": "today" works.
- Testing "how many days from today": returns -1 (yesterday).
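Since the model cannot do the date arithmetic itself, one workaround (not in the notes above, just a sketch) is to resolve relative dates in Python and put the concrete date into the prompt. The helper below is hypothetical:

from datetime import date, timedelta

def resolve_relative_date(today, phrase):
    # Map a few relative phrases to concrete dates; extend as needed.
    offsets = {"yesterday": -1, "today": 0, "tomorrow": 1, "next week": 7}
    return today + timedelta(days=offsets.get(phrase, 0))

print(resolve_relative_date(date(2000, 1, 30), "next week"))  # -> 2000-02-06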
Merging the LoRA needs ~20GB of memory even for a 7B-parameter LLaMA -> do the merge on Kaggle ("merge lora 2" notebook), combining the base model and the LoRA.
{ "instruction":"Sebutkan kata yang berhubungan dengan cuaca", "input":"", "response":"hujan, angin, petir, berawan, lembab, panas, terik" }
"Apakah besok hujan" ("Will it rain tomorrow") works, but the context is "besok" (tomorrow).
TrainOutput(global_step=20, training_loss=1.0630455672740937, metrics={'train_runtime': 111.1406, 'train_samples_per_second': 0.72, 'train_steps_per_second': 0.18, 'total_flos': 19714939895808.0, 'train_loss': 1.0630455672740937, 'epoch': 0.8})
!cd /kaggle/working && ls
Tried TheBloke/wizardLM-7B-HF; it still cannot be loaded, not enough memory because the checkpoint is not sharded? It should be sharded (pytorch_model-0001-of-xxxx.bin; decapoda ships ~405MB chunks). It does not work on Kaggle either, because the 30GB of RAM is not available when a GPU is enabled.
- The model needs to be sharded first, otherwise it crashes when loading into memory:
!git clone https://github.com/oobabooga/text-generation-webui
%cd text-generation-webui
!pip install -r requirements.txt
model_id = "TheBloke/wizardLM-7B-HF" dest = "./dest/{}".format(model.replace("/","_")) #May need to edit based on where you're storing your models shard_size = "1000MB"
from transformers import LlamaTokenizer, LlamaForCausalLM
tokenizer = LlamaTokenizer.from_pretrained(model_id)
model = LlamaForCausalLM.from_pretrained(model_id, load_in_8bit=True, device_map='auto')
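The notes stop before the actual re-sharding step; presumably the intent is the standard save_pretrained call with max_shard_size. A sketch, assuming the model is reloaded in fp16 (8-bit weights did not serialize cleanly in the transformers versions of that era):

import torch
model = LlamaForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, low_cpu_mem_usage=True)
# Write the weights back out in ~1000MB shards so low-memory machines can load them.
model.save_pretrained(dest, max_shard_size=shard_size)
tokenizer.save_pretrained(dest)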
import shutil
shutil.rmtree('/content/testqlora/outputs')
!pip install -r requirements.txt -q -U
Using QLoRA with test data:
git add --all
git commit -m "commit1"
git push
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig, LlamaTokenizer
model_id = "decapoda-research/llama-7b-hf" bnb_config = BitsAndBytesConfig( load_in_4bit=True, bnb_4bit_use_double_quant=True, bnb_4bit_quant_type="nf4", bnb_4bit_compute_dtype=torch.bfloat16 )
tokenizer = LlamaTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map={"": 0})
from peft import LoraConfig, get_peft_model
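print_trainable_parameters is used below but never defined in these notes; the usual helper from the PEFT/QLoRA example notebooks looks like this:

def print_trainable_parameters(model):
    # Report how many parameters the LoRA adapter actually trains vs. the full model.
    trainable, total = 0, 0
    for _, param in model.named_parameters():
        total += param.numel()
        if param.requires_grad:
            trainable += param.numel()
    print(f"trainable params: {trainable} || all params: {total} || trainable%: {100 * trainable / total:.2f}")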
config = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
print_trainable_parameters(model)
!git clone https://github.com/x4080/testqlora.git
import os
os.chdir('/content/testqlora')
!ls
import transformers
tokenizer.add_special_tokens({'pad_token': '[PAD]'})
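The data variable used by the trainer below is never constructed in these notes; a minimal sketch, assuming the instruction/input/response JSON format shown earlier and a hypothetical file name data.json inside the cloned repo:

from datasets import load_dataset

def to_prompt(example):
    # Alpaca-style template; the exact template used in the notes is not shown, this is an assumption.
    return (f"### Instruction:\n{example['instruction']}\n\n"
            f"### Input:\n{example['input']}\n\n"
            f"### Response:\n{example['response']}")

data = load_dataset("json", data_files="data.json")  # hypothetical file name
data = data.map(lambda ex: tokenizer(to_prompt(ex)), remove_columns=data["train"].column_names)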
trainer = transformers.Trainer(
    model=model,
    train_dataset=data["train"],
    args=transformers.TrainingArguments(
        per_device_train_batch_size=1,
        gradient_accumulation_steps=4,
        warmup_steps=2,
        max_steps=20,
        learning_rate=2e-4,
        fp16=True,
        logging_steps=1,
        output_dir="outputs",
        optim="paged_adamw_8bit",
    ),
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
model.config.use_cache = False  # silence the warnings. Please re-enable for inference!
trainer.train()
model.config.use_cache = True  # re-enable the cache for inference, per the note above
text = """HUMAN: Hello
MESSAGE: Welcome to pizza john
ORDER DETAILS: {}
RELEVANCY: unknown
ORDER CONFIRMED: no

HUMAN: Who is the president of the US?
MESSAGE: I'm sorry, but I only process pizza orders.
ORDER DETAILS: {}
RELEVANCY: No
ORDER CONFIRMED: No

HUMAN: can i order some pizza please
"""
device = "cuda:0"
inputs = tokenizer(text, return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
import locale
locale.getpreferredencoding = lambda: "UTF-8"
!python -m pip install huggingface_hub
!huggingface-cli login --token #
model.push_to_hub("notzero/testlora2", use_auth_token=True)
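To reuse the pushed adapter later without retraining, it can be attached to the quantized base model again with peft (a sketch reusing model_id and bnb_config from above):

from peft import PeftModel

# Load the 4-bit base model again, then attach the LoRA weights from the Hub.
base = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map={"": 0})
model = PeftModel.from_pretrained(base, "notzero/testlora2")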
Then the LoRA needs to be merged with the base model (use Kaggle, which has 30GB of RAM, but only when no GPU is attached).
!pip install transformers
!pip install peft
!pip install sentencepiece
!git clone https://github.com/ggerganov/llama.cpp
#!git clone https://github.com/project-baize/baize-chatbot
!git clone https://github.com/x4080/testqlora.git
#!python /content/baize-chatbot/merge_lora.py
#--base decapoda-research/llama-7b-hf
#--target ~/model_weights/baize-7b
#--lora notzero/testlora2
!python ./testqlora/merge_lora.py \
  --base decapoda-research/llama-7b-hf \
  --target /kaggle/working/model_weights/mergeqlora \
  --lora notzero/testlora2
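merge_lora.py lives in the cloned testqlora repo (the commented-out lines above point at the equivalent baize-chatbot script). Conceptually it should do roughly the following, which is where the ~20GB of RAM goes; this is a sketch using peft's merge_and_unload, not the script's actual code:

import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import PeftModel

# Load the fp16 base model on CPU, attach the LoRA adapter, fold its weights in, and save the merged model.
base = LlamaForCausalLM.from_pretrained("decapoda-research/llama-7b-hf", torch_dtype=torch.float16, low_cpu_mem_usage=True)
tokenizer = LlamaTokenizer.from_pretrained("decapoda-research/llama-7b-hf")
merged = PeftModel.from_pretrained(base, "notzero/testlora2").merge_and_unload()
merged.save_pretrained("/kaggle/working/model_weights/mergeqlora")
tokenizer.save_pretrained("/kaggle/working/model_weights/mergeqlora")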
!cd /kaggle/working/llama.cpp && mkdir qmodel && mv /root/model_weights/mergeqlora /kaggle/working/llama.cpp/qmodel/temp
#!mv /kaggle/working/llama.cpp/qmodel/7B/tokenizer.model /kaggle/working/llama.cpp/qmodel/
!python -m pip install huggingface_hub
!huggingface-cli login --token ###
from huggingface_hub import HfApi
api = HfApi()
api.upload_folder(
    folder_path="/kaggle/working/llama.cpp/qmodel/temp",
    repo_id="notzero/modelcombined",
    repo_type="dataset",
)
!mkdir /root/modelcombined
!cd /root/modelcombined && wget https://huggingface.co/datasets/notzero/modelcombined/resolve/main/pytorch_model-00001-of-00002.bin
!cd /root/modelcombined && wget https://huggingface.co/datasets/notzero/modelcombined/resolve/main/pytorch_model-00002-of-00002.bin
!cd /root/modelcombined && wget https://huggingface.co/datasets/notzero/modelcombined/resolve/main/pytorch_model.bin.index.json
!cd /root/modelcombined && wget https://huggingface.co/datasets/notzero/modelcombined/resolve/main/tokenizer.model
!cd /root/modelcombined && wget https://huggingface.co/datasets/notzero/modelcombined/resolve/main/tokenizer_config.json
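Instead of individual wget calls, huggingface_hub can pull the whole dataset repo at once (snapshot_download with local_dir needs a reasonably recent huggingface_hub; verify against the installed version):

from huggingface_hub import snapshot_download

# Fetch every file from the dataset repo into /root/modelcombined.
snapshot_download(repo_id="notzero/modelcombined", repo_type="dataset", local_dir="/root/modelcombined")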
!cd llama.cpp && make
!cd /kaggle/working/llama.cpp && python convert.py /root/modelcombined
!python -m pip install huggingface_hub
!huggingface-cli login --token #
from huggingface_hub import HfApi
api = HfApi()
api.upload_file(
    path_or_fileobj="/root/modelcombined/ggml-model-f16.bin",
    path_in_repo="ggml-model-f16.bin",
    repo_id="notzero/modelcombined",
    repo_type="dataset",
)
!cd /root/modelcombined && rm pytorch_model-00001-of-00002.bin
!cd /root/modelcombined && rm pytorch_model-00002-of-00002.bin
!cd llama.cpp && ./quantize /root/modelcombined/ggml-model-f16.bin /root/modelcombined/ggml-model-q4_0.bin q4_0
from huggingface_hub import HfApi
api = HfApi()
api.upload_file(
    path_or_fileobj="/root/modelcombined/ggml-model-q4_0.bin",
    path_in_repo="ggml-model-q4_0.bin",
    repo_id="notzero/modelcombined",
    repo_type="dataset",
)
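As a final sanity check, the quantized model can be run directly with llama.cpp (the main binary and flags below match llama.cpp builds from that period; the prompt is just an example):

!cd llama.cpp && ./main -m /root/modelcombined/ggml-model-q4_0.bin -p "Apakah besok hujan?" -n 64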