Issues: flexflow/flexflow-serve
Issues list
Unable to use ./inference/incr_decoding/incr_decoding on inference branch
#1
opened Nov 26, 2024 by
hugolatendresse
cuIpcGetMemHandle triggered CUDA out of memory when I use flexflow on one gpu
#75
opened Sep 11, 2024 by
sjtu-zwh
Flexflow attempts to register duplicate sharding functors
bug
Something isn't working
#17
opened Apr 20, 2024 by
rohany
Can we modify the "expansion configuration" parameters, such as "token tree width" and "token tree depth"?
question
Further information is requested
#53
opened Mar 27, 2024 by
hky011011
use the cuda graph for specinfer and it's blocked
bug
Something isn't working
#46
opened Feb 27, 2024 by
lambda7xx
Using Algorithm 2 in SpecInfer paper, I get wrong outputs.
bug
Something isn't working
#10
opened Feb 19, 2024 by
dutsc
falcon-40b model loading does not work -- file_loader expects incorrect model structure
enhancement
New feature or request
#16
opened Jan 9, 2024 by
ktorkkola