Huge memory demand of recover_model_weights.py? #73

mensch72 · 2023-09-13T09:54:42Z

[ continuation of #70 ]

When running recover_model_weights.py --alpaca-farm-model-name sft10k, the memory use of the Python process grows past 30 GB, at which point my system kills it for running out of memory. Is this expected behavior?

Log:

Downloading sft10k
Downloading (…)lve/main/config.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 610/610 [00:00<00:00, 1.03MB/s]
Downloading (…)model.bin.index.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 26.8k/26.8k [00:00<00:00, 26.8MB/s]
Downloading (…)l-00001-of-00003.bin: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9.88G/9.88G [04:26<00:00, 37.1MB/s]
Downloading (…)l-00002-of-00003.bin: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9.89G/9.89G [05:15<00:00, 31.3MB/s]
Downloading (…)l-00003-of-00003.bin: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7.18G/7.18G [03:13<00:00, 37.1MB/s]
Downloading shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [13:00<00:00, 260.29s/it]
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [07:13<00:00, 144.42s/it]
Downloading (…)neration_config.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 132/132 [00:00<00:00, 153kB/s]
Downloading (…)okenizer_config.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 726/726 [00:00<00:00, 4.54MB/s]
Downloading tokenizer.model: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 500k/500k [00:00<00:00, 1.35MB/s]
Downloading (…)/main/tokenizer.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.84M/1.84M [00:00<00:00, 3.92MB/s]
Downloading (…)in/added_tokens.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 21.0/21.0 [00:00<00:00, 62.1kB/s]
Downloading (…)cial_tokens_map.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 435/435 [00:00<00:00, 2.81MB/s]
WARNING:root:Your base LLaMA checkpoint is converted with transformers==4.27.0.dev0, but transformers>=4.29.2 is expected. This may produce a corrupted checkpoint and lead to unexpected behavior. Please regenerate your base LLaMA checkpoint with transformers>=4.29.2.
Loading checkpoint shards:  12%|█████████████████                                                                                                                            | 4/33 [00:16<01:59,  4.13s/it]Killed
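
For a rough sanity check, the shard sizes printed in the log can be added up. The sketch below only does that arithmetic. It assumes, without having checked the script, that the downloaded weights stay in memory at their on-disk size while the second set of checkpoint shards (the 33-shard load that gets killed) is read in, and it uses a hypothetical placeholder for the size of that second model.

```python
# Shard sizes in GB, read off the "Downloading ... .bin" lines in the log above.
downloaded_shards_gb = [9.88, 9.89, 7.18]


def total_gb(shards_gb):
    """Sum the on-disk sizes of a list of checkpoint shards."""
    return sum(shards_gb)


def estimate_peak_gb(first_model_gb, second_model_gb):
    """Rough peak memory if both models are resident at once (an assumption)."""
    return first_model_gb + second_model_gb


downloaded_gb = total_gb(downloaded_shards_gb)

# Hypothetical placeholder: assume the second checkpoint takes about as much
# memory as the downloaded one. The log does not state its size.
assumed_second_model_gb = downloaded_gb

peak_gb = estimate_peak_gb(downloaded_gb, assumed_second_model_gb)

print(f"downloaded weights: {downloaded_gb:.2f} GB")
print(f"estimated peak with both models loaded: {peak_gb:.2f} GB")
```

The three downloaded shards alone add up to about 26.95 GB, so under these assumptions crossing 30 GB shortly after the second load begins would not be surprising.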


lolipopshock pushed a commit to lolipopshock/alpaca_farm that referenced this issue Sep 24, 2023

[ENH] make. it easier to cache to a DB (tatsu-lab#73)

0eb723b

* [GITIGNORE] rm jsons * load_dotenv * load_dotenv * [ENH] make evaluator easier to enherit * nit * changes from PR * typo
