Could you give me a detailed guide on how to run llm-foundry on AMD MI250? I read through the two blog posts about AMD, but I still couldn't work out the steps from them. Any version of the code is fine. Thank you!
Hi @Alice1069, the FlashAttention ROCm port is likely fairly old by now, so the easiest thing to do would be to disable FlashAttention. The other thing to try would be to manually comment out all the rotary embedding codepaths and not use RoPE.
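For reference, a minimal sketch of that first suggestion, assuming the stock MPT pretraining yaml exposes the attention implementation under model.attn_config.attn_impl (key names can differ between llm-foundry versions), is to override it to the plain torch implementation on the command line:

composer train/train.py train/yamls/pretrain/mpt-1b.yaml model.attn_config.attn_impl=torch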
Goal: run llm-foundry on an AMD 4x MI250 machine.
Steps to reproduce the behavior:
follow latest instructions from: https://github.com/ROCm/flash-attention/tree/flash_attention_for_rocm
start from docker image: rocm/pytorch:rocm5.7_ubuntu22.04_py3.10_pytorch_2.0.1
export GPU_ARCHS="gfx90a"
export PYTHON_SITE_PACKAGES=$(python -c 'import site; print(site.getsitepackages()[0])')
patch "${PYTHON_SITE_PACKAGES}/torch/utils/hipify/hipify_python.py" hipify_patch.patch
pip install .
verified by PYTHONPATH=$PWD python benchmarks/benchmark_flash_attention.py
"pip list" shows "flash-attn 2.0.4"
get llm-foundry v0.7 code
modify setup.py
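The step above doesn't record exactly what was changed in setup.py; presumably the CUDA-specific flash-attn/triton pins are dropped so that pip does not overwrite the ROCm build installed earlier. A sketch of the install, assuming the v0.7.0 tag and a plain editable install without GPU extras:

git clone -b v0.7.0 https://github.com/mosaicml/llm-foundry.git
cd llm-foundry
pip install -e .    # base dependencies only; skipping the [gpu] extra keeps the ROCm flash-attn build
cd scripts          # the data prep and training commands below are run from here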
python data_prep/convert_dataset_hf.py \
  --dataset c4 --data_subset en \
  --out_root my-copy-c4 --splits train_small val_small \
  --concat_tokens 2048 --tokenizer EleutherAI/gpt-neox-20b --eos_text '<|endoftext|>'
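If the conversion succeeds, each split directory should contain MDS shards and an index.json (a quick, assumed check, not part of the original steps):

ls my-copy-c4/train_small    # expect index.json plus shard.*.mds files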
composer train/train.py train/yamls/pretrain/mpt-1b.yaml \
  data_local=my-copy-c4 \
  train_loader.dataset.split=train_small \
  eval_loader.dataset.split=val_small \
  max_duration=10ba \
  eval_interval=0 \
  loss_fn=torch_crossentropy \
  save_folder=mpt-1b
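On a 4x MI250 box (8 visible GCDs) the composer launcher normally auto-detects the accelerators, but the process count can also be pinned explicitly with the launcher's -n/--nproc flag; the command below is the same run as above with the count fixed at 8, purely as an illustration:

composer -n 8 train/train.py train/yamls/pretrain/mpt-1b.yaml \
  data_local=my-copy-c4 \
  train_loader.dataset.split=train_small \
  eval_loader.dataset.split=val_small \
  max_duration=10ba \
  eval_interval=0 \
  loss_fn=torch_crossentropy \
  save_folder=mpt-1b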