bloat

refactor(core): embed llama.cpp's server binary directly for LLM inference #363

Job	Run time
cargo_bloat	4m 2s
Total	4m 2s
