[Bug] Still Experiencing 'Error: Using LLVM 19.1.3 with -mcpu=apple-latest is not valid in -mtriple=arm64-apple-macos, using default -mcpu=generic' #3053
Comments
Totally forgot: I'm also having problems with the tokenizer library. I don't think they're related, but I could be totally wrong. I'm 95% sure this is built with each new iteration: /Users/zack/.home/gitrepos/LLMLife/target/aarch64-apple-darwin. But according to ccmake's generated settings, it is expecting it at:

So I have just copied and pasted the aforementioned target/aarch64-apple-darwin folder to where it is expected, and then everything is g2g to build.

Just did a new build with the latest available Relax + MLC-LLM, with no changed settings outside LLVM/Metal/Install_Dev/OpenMP/SPM_Use_Shared/SPM_Use_TMALLOC; nothing changed :/.

Actually, it looks like maybe the tokenizer library has been the problem all along. I just did a more thorough job with the install; when I removed MLC_LLM/TVM from Poetry, it seems to have failed to add MLC back on the re-add:

```
- Installing mlc-llm (0.18.dev74+gd23d6f51 /Users/zack/.home/gitrepos/LLMLife/frontend/mlc-llm/python): Failed

ChefBuildError

Backend subprocess exited when trying to invoke get_requires_for_build_wheel

Traceback (most recent call last):
```

And now I'm noticing that it did not create a libmlc_llm.dylib, at least this time.

I'll be looking into how to generally retarget Cargo (I guess within the ccmake instructions/settings), but if someone would be kind enough to let me know how I can redirect Tokenizers to find it in the current location, that'd be great.
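For reference, the copy-paste workaround above amounts to something like the following sketch; the destination path is a hypothetical placeholder, since the real one is whatever location the generated ccmake settings actually report:

```python
import shutil
from pathlib import Path

# Copy the Cargo build output for the tokenizers library to the location
# the generated CMake settings expect. EXPECTED is a hypothetical
# placeholder; substitute the path ccmake actually reports.
BUILT = Path("/Users/zack/.home/gitrepos/LLMLife/target/aarch64-apple-darwin")
EXPECTED = Path("/path/ccmake/expects/aarch64-apple-darwin")  # hypothetical
shutil.copytree(BUILT, EXPECTED, dirs_exist_ok=True)
```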
Okay, figured out the Tokenizers issue. But the OP's error still persists. I'm putting the computer's architecture as arm64, if that matters. I think I'd already mentioned it, but I also updated the target OS from 14.4 to 15.0 to properly match my system's Xcode version. Also, I'm using /blahblah/tvm as my TVM source (when it asks when you do …).
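For anyone who wants to reproduce the warning in isolation: as far as I can tell, constructing an LLVM target with the same triple/CPU pair from Python goes through the same llvm_instance.cc path and emits the same message. A minimal sketch, assuming a TVM build linked against LLVM 19.x:

```python
import tvm

# Same triple/cpu pair as in the warning above. With LLVM 19.x, TVM's
# target construction rejects -mcpu=apple-latest for
# -mtriple=arm64-apple-macos and falls back to -mcpu=generic.
target = tvm.target.Target("llvm -mtriple=arm64-apple-macos -mcpu=apple-latest")
print(target)
```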
🐛 Bug
To Reproduce
Steps to reproduce the behavior:
I used a few custom options, if that matters. I believe they were:

- CoreML (On)
- Use_Metal (On)
- Use_LLVM (the custom value dictated in the docs, llvm-config blah blah blah)
- MSGPACK CXX20 (On)
- MSG_Use_Boost (On)
- SPM_Use_Shared (On)
- SPM_Use_TMALLOC (On)
- TVM Debug with ABI (On)
- TVM_Log_Before_Throw (On)
- Use_BLAS (apple)
- Use_BNNS (On)
- Install_Dev (On)
- Summary (On)
- Hide_Symbols (On)

Not sure if there are any TVM-specific ones; I was just going down the list for my MLC build.
```
[2024-12-01 07:50:07] INFO auto_device.py:88: Not found device: cuda:0
[2024-12-01 07:50:08] INFO auto_device.py:88: Not found device: rocm:0
[2024-12-01 07:50:09] INFO auto_device.py:79: Found device: metal:0
[2024-12-01 07:50:11] INFO auto_device.py:88: Not found device: vulkan:0
[2024-12-01 07:50:12] INFO auto_device.py:88: Not found device: opencl:0
[2024-12-01 07:50:12] INFO auto_device.py:35: Using device: metal:0
[2024-12-01 07:50:12] INFO engine_base.py:143: Using library model: /Users/zack/.home/local/models/Llama_q3/mlc.dylib
[07:50:13] /Users/zack/.home/gitrepos/LLMLife/frontend/mlc-llm/cpp/serve/config.cc:688: Under mode "local", max batch size will be set to 4, max KV cache token capacity will be set to 8192, prefill chunk size will be set to 2048.
[07:50:13] /Users/zack/.home/gitrepos/LLMLife/frontend/mlc-llm/cpp/serve/config.cc:688: Under mode "interactive", max batch size will be set to 1, max KV cache token capacity will be set to 32768, prefill chunk size will be set to 2048.
[07:50:13] /Users/zack/.home/gitrepos/LLMLife/frontend/mlc-llm/cpp/serve/config.cc:688: Under mode "server", max batch size will be set to 80, max KV cache token capacity will be set to 32768, prefill chunk size will be set to 2048.
[07:50:13] /Users/zack/.home/gitrepos/LLMLife/frontend/mlc-llm/cpp/serve/config.cc:769: The actual engine mode is "interactive". So max batch size is 1, max KV cache token capacity is 32768, prefill chunk size is 2048.
[07:50:13] /Users/zack/.home/gitrepos/LLMLife/frontend/mlc-llm/cpp/serve/config.cc:774: Estimated total single GPU memory usage: 41979.309 MB (Parameters: 30304.259 MB. KVCache: 10361.055 MB. Temporary buffer: 1313.996 MB). The actual usage might be slightly larger than the estimated number.
[07:50:32] /Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/runtime/relax_vm/paged_kv_cache.cc:2666: Check failed: (args.size() == 22 || args.size() == 23) is false: Invalid number of KV cache constructor args.
```
Stack trace:

```
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/Users/zack/.home/local/mise/installs/python/3.11.9/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/Users/zack/.home/local/mise/installs/python/3.11.9/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/zack/.home/gitrepos/LLMLife/backend/tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 245, in __call__
    raise_last_ffi_error()
  File "/Users/zack/.home/gitrepos/LLMLife/backend/tvm/python/tvm/_ffi/base.py", line 481, in raise_last_ffi_error
    raise py_err
tvm._ffi.base.TVMError: Traceback (most recent call last):
  File "/Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/runtime/relax_vm/paged_kv_cache.cc", line 2666
TVMError: Check failed: (args.size() == 22 || args.size() == 23) is false: Invalid number of KV cache constructor args.
```
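From what I can tell, this arity check failing usually means the compiled model library (the mlc.dylib above) and the TVM runtime loading it come from different revisions. A small diagnostic sketch (not an official MLC tool) to see which runtime actually gets picked up:

```python
import tvm

# Print the commit of the TVM runtime that is actually imported, to compare
# against the revision the model library was compiled with. If they differ,
# the KV cache constructor arity can disagree, producing the
# "Invalid number of KV cache constructor args" check failure.
info = tvm.support.libinfo()
print("runtime commit:", info["GIT_COMMIT_HASH"])
print("metal device present:", tvm.metal(0).exist)
```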
Expected behavior
Well, I expected that with my limited number of customized build options I wouldn't run into trouble, but considering the other person's bug report was closed, I imagine it may be related to that. While I'm here, I may as well also ask what the status of Use_MPS is. I didn't use it, as it caused problems in the past, and it sounded like it was being phased out anyway.
Environment

- How you installed MLC-LLM/TVM-Unity (conda, source): Source (Poetry)
- Python version (e.g. 3.10): 3.11.9
- TVM Unity Hash Tag (`python -c "import tvm; print('\n'.join(f'{k}: {v}' for k, v in tvm.support.libinfo().items()))"`, applicable if you compile models):

```
[07:53:41] /Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/llvm_instance.cc:226: Error: Using LLVM 19.1.3 with -mcpu=apple-latest is not valid in -mtriple=arm64-apple-macos, using default -mcpu=generic
[07:53:41] /Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/llvm_instance.cc:226: Error: Using LLVM 19.1.3 with -mcpu=apple-latest is not valid in -mtriple=arm64-apple-macos, using default -mcpu=generic
[07:53:41] /Users/zack/.home/gitrepos/LLMLife/backend/tvm/src/target/llvm/llvm_instance.cc:226: Error: Using LLVM 19.1.3 with -mcpu=apple-latest is not valid in -mtriple=arm64-apple-macos, using default -mcpu=generic
USE_NVTX: OFF
USE_GTEST: AUTO
SUMMARIZE: OFF
TVM_DEBUG_WITH_ABI_CHANGE: OFF
USE_IOS_RPC: OFF
USE_MSC: OFF
USE_ETHOSU: OFF
CUDA_VERSION: NOT-FOUND
USE_LIBBACKTRACE: AUTO
DLPACK_PATH: 3rdparty/dlpack/include
USE_TENSORRT_CODEGEN: OFF
USE_OPENCL_EXTN_QCOM: NOT-FOUND
USE_TARGET_ONNX: OFF
USE_AOT_EXECUTOR: ON
BUILD_DUMMY_LIBTVM: OFF
USE_CUDNN: OFF
USE_TENSORRT_RUNTIME: OFF
USE_ARM_COMPUTE_LIB_GRAPH_EXECUTOR: OFF
USE_THRUST: OFF
USE_CCACHE: AUTO
USE_ARM_COMPUTE_LIB: OFF
USE_CPP_RTVM: OFF
USE_OPENCL_GTEST: /path/to/opencl/gtest
TVM_LOG_BEFORE_THROW: ON
USE_MKL: OFF
USE_PT_TVMDSOOP: OFF
MLIR_VERSION: NOT-FOUND
USE_CLML: OFF
USE_STACKVM_RUNTIME: OFF
USE_GRAPH_EXECUTOR_CUDA_GRAPH: OFF
ROCM_PATH: /opt/rocm
USE_DNNL: OFF
USE_MSCCL: OFF
USE_NNAPI_RUNTIME: OFF
USE_VITIS_AI: OFF
USE_MLIR: OFF
USE_RCCL: OFF
USE_LLVM: llvm-config --ignore-libllvm --link-static
USE_VERILATOR: OFF
USE_TF_TVMDSOOP: OFF
USE_THREADS: ON
USE_MSVC_MT: OFF
BACKTRACE_ON_SEGFAULT: OFF
USE_GRAPH_EXECUTOR: ON
USE_NCCL: OFF
USE_ROCBLAS: OFF
GIT_COMMIT_HASH: e6b2a55d1e1668d889ce69efa3921bc73dcb8b8a
USE_VULKAN: OFF
USE_RUST_EXT: OFF
USE_CUTLASS: OFF
USE_CPP_RPC: OFF
USE_HEXAGON: OFF
USE_CUSTOM_LOGGING: OFF
USE_UMA: OFF
USE_FALLBACK_STL_MAP: OFF
USE_SORT: ON
USE_RTTI: ON
GIT_COMMIT_TIME: 2024-11-20 23:38:22 -0500
USE_HIPBLAS: OFF
USE_HEXAGON_SDK: /path/to/sdk
USE_BLAS: none
USE_ETHOSN: OFF
USE_LIBTORCH: OFF
USE_RANDOM: ON
USE_CUDA: OFF
USE_COREML: ON
USE_AMX: OFF
BUILD_STATIC_RUNTIME: OFF
USE_CMSISNN: OFF
USE_KHRONOS_SPIRV: OFF
USE_CLML_GRAPH_EXECUTOR: OFF
USE_TFLITE: OFF
USE_HEXAGON_GTEST: /path/to/hexagon/gtest
PICOJSON_PATH: 3rdparty/picojson
USE_OPENCL_ENABLE_HOST_PTR: OFF
INSTALL_DEV: OFF
USE_PROFILER: ON
USE_NNPACK: OFF
LLVM_VERSION: 19.1.3
USE_MRVL: OFF
USE_OPENCL: OFF
COMPILER_RT_PATH: 3rdparty/compiler-rt
USE_NNAPI_CODEGEN: OFF
RANG_PATH: 3rdparty/rang/include
USE_SPIRV_KHR_INTEGER_DOT_PRODUCT: OFF
USE_OPENMP: none
USE_BNNS: OFF
USE_FLASHINFER: OFF
USE_CUBLAS: OFF
USE_METAL: ON
USE_MICRO_STANDALONE_RUNTIME: OFF
USE_HEXAGON_EXTERNAL_LIBS: OFF
USE_ALTERNATIVE_LINKER: AUTO
USE_BYODT_POSIT: OFF
USE_NVSHMEM: OFF
USE_HEXAGON_RPC: OFF
USE_MICRO: OFF
DMLC_PATH: 3rdparty/dmlc-core/include
INDEX_DEFAULT_I64: ON
USE_RELAY_DEBUG: OFF
USE_RPC: ON
USE_TENSORFLOW_PATH: none
TVM_CLML_VERSION:
USE_MIOPEN: OFF
USE_ROCM: OFF
USE_PAPI: OFF
USE_CURAND: OFF
TVM_CXX_COMPILER_PATH: /opt/homebrew/opt/llvm/bin/clang++
HIDE_PRIVATE_SYMBOLS: OFF
```
- Any other relevant information: Well, it's odd, because it would appear that my options above, which I set with `ccmake ..` before running `cmake --build && nproc -j10`, aren't being honored (see the sketch below). I also forgot that I do set Use_OPENMP (On), but I don't know whether that one needs to be set to a file path in order to work.
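Since the dump above contradicts several of the ccmake settings (USE_BNNS, USE_BLAS, INSTALL_DEV, HIDE_PRIVATE_SYMBOLS), here's a small sketch that makes the drift explicit; the "expected" values are just the ones I listed under To Reproduce:

```python
import tvm

# Compare what was set in ccmake against what the built runtime reports.
expected = {
    "USE_BNNS": "ON",
    "USE_BLAS": "apple",
    "INSTALL_DEV": "ON",
    "HIDE_PRIVATE_SYMBOLS": "ON",
}
info = tvm.support.libinfo()
for key, want in expected.items():
    got = info.get(key, "<missing>")
    status = "ok" if got == want else f"MISMATCH (ccmake: {want}, runtime: {got})"
    print(f"{key}: {status}")
```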
I last compiled this on Friday, using the latest builds available at that time (with `git pull --recurse-submodules` to boot).