When running the incremental decoding C++ interface, a segmentation fault occurs in the `compile_inference` method of the inference manager: the `FFModel` pointer it is called on is null (`this=0x0`).

To reproduce the bug:

1. Download the PEFT model from Hugging Face with `download_peft_model.py`, for example: `python inference/utils/download_peft_model.py --base_model_name JackFram/llama-160m goliaro/llama-160m-lora-full --refresh-cache`
2. Run the incremental decoding binary: `./inference/incr_decoding/incr_decoding -ll:gpu 1 -ll:cpu 4 -ll:fsize 8192 -ll:zsize 12000 -ll:util 4 -llm-model JackFram/llama-160m -prompt ../inference/prompt/peft.json -peft-model goliaro/llama-160m-lora-full --use-full-precision --inference-debugging --fusion -enable-peft`

Error Backtrace from gdb:
#0 0x0000155553abd9d6 in FlexFlow::FFModel::compile_inference (this=0x0) at /home/ubuntu/FlexFlow/src/runtime/inference_manager.cc:611
#1  0x0000155553ab909c in FlexFlow::InferenceManager::compile_model_and_allocate_buffer (this=0x154fa020b250, model=0x154f941e3d60)
    at /home/ubuntu/FlexFlow/src/runtime/inference_manager.cc:61
#2  0x0000155553c1a262 in FlexFlow::RequestManager::serve_incr_decoding (this=0x154f94204f10, llm=0x154f941e3d60) at /home/ubuntu/FlexFlow/src/runtime/request_manager.cc:2504
#3  0x0000155553c1a046 in FlexFlow::RequestManager::background_serving_task (task=0x154f94f05880, regions=..., ctx=0x154f982055f0, runtime=0x555556942710)
    at /home/ubuntu/FlexFlow/src/runtime/request_manager.cc:2471
#4  0x0000155553ba5d88 in Legion::LegionTaskWrapper::legion_task_wrapper<&FlexFlow::RequestManager::background_serving_task> (args=0x154f94f06950, arglen=8, userdata=0x0,
    userlen=0, p=...) at /home/ubuntu/FlexFlow/deps/legion/runtime/legion/legion.inl:21196
#5  0x000015554bc53cd0 in Realm::LocalTaskProcessor::execute_task (this=0x555556647e00, func_id=18, task_args=...)
    at /home/ubuntu/FlexFlow/deps/legion/runtime/realm/proc_impl.cc:1176
#6  0x000015554bcd033a in Realm::Task::execute_on_processor (this=0x154f94f067d0, p=...) at /home/ubuntu/FlexFlow/deps/legion/runtime/realm/tasks.cc:326
#7  0x000015554bcd56b8 in Realm::UserThreadTaskScheduler::execute_task (this=0x5555568b30b0, task=0x154f94f067d0) at /home/ubuntu/FlexFlow/deps/legion/runtime/realm/tasks.cc:1687
#8  0x000015554bcd3451 in Realm::ThreadedTaskScheduler::scheduler_loop (this=0x5555568b30b0) at /home/ubuntu/FlexFlow/deps/legion/runtime/realm/tasks.cc:1160
#9  0x000015554bcdba90 in Realm::Thread::thread_entry_wrapper<Realm::ThreadedTaskScheduler, &Realm::ThreadedTaskScheduler::scheduler_loop> (obj=0x5555568b30b0)
    at /home/ubuntu/FlexFlow/deps/legion/runtime/realm/threads.inl:97
#10 0x000015554bcea4ce in Realm::UserThread::uthread_entry () at /home/ubuntu/FlexFlow/deps/legion/runtime/realm/threads.cc:1405
#11 0x000015554a6ef4e0 in ?? () at ../sysdeps/unix/sysv/linux/x86_64/__start_context.S:91 from /lib/x86_64-linux-gnu/libc.so.6
#12 0x0000000000000000 in ?? ()
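Frame `#0` shows `this=0x0`, i.e. `compile_inference` is invoked through a null `FFModel*` handed down from `compile_model_and_allocate_buffer`. The sketch below uses hypothetical stand-in types (`Model`, `Manager`, `try_compile` — not the actual FlexFlow API) to illustrate the failure pattern and the kind of guard that would turn the segfault into a diagnosable error:

```cpp
#include <cstdio>

// Hypothetical stand-ins for FlexFlow's FFModel / InferenceManager;
// only the null-`this` pattern from frame #0 is illustrated here.
struct Model {
  int num_layers = 0;
  void compile_inference() {
    // If the caller passed a null Model*, reading a member here
    // dereferences a null `this` — the same class of crash seen at
    // inference_manager.cc:611 in the backtrace above.
    std::printf("compiling model with %d layers\n", num_layers);
  }
};

struct Manager {
  // Checks the pointer before use and reports failure instead of
  // crashing, so a mis-wired model shows up as a clear error.
  bool try_compile(Model *model) {
    if (model == nullptr) {
      std::fprintf(stderr, "error: null model passed to compile\n");
      return false;
    }
    model->compile_inference();
    return true;
  }
};
```

A guard of this shape near the top of `compile_model_and_allocate_buffer` (or an assertion on the model pointer) would pinpoint where the null model originates rather than faulting deep inside `compile_inference`.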