Pull requests: microsoft/onnxruntime
Pull requests list
Implementation of flash attention for native webgpu ep
#22932
opened Nov 24, 2024 by
sushraja-msft
Bump onnx from 1.16.1 to 1.17.0 in /onnxruntime/python/tools/transformers/models/phi2
dependencies
Pull requests that update a dependency file
python
Pull requests that update Python code
#22928
opened Nov 22, 2024 by
dependabot (bot)
[TensorRT EP] Use TRT/CUDA/ORT version from runtime instead of build time to generate hash value
#22921
opened Nov 21, 2024 by
chilo-ms
[js/webgpu] support FlashAttention-2 for attention operator
ep:WebGPU
ort-web webgpu provider
#22915
opened Nov 21, 2024 by
xhcao
[QNN EP] [DRAFT] Support Conv float weight/bias.
#22906
opened Nov 20, 2024 by
adrianlizarraga
Draft
Override android qnn sdk version with pipeline param
#22895
opened Nov 19, 2024 by
sheetalarkadam
[WebNN] Support negative steps for slice
ep:WebNN
WebNN execution provider
#22871
opened Nov 18, 2024 by
shiyi9801
Refactor emulator start and stop functions for clarity and efficiency
platform:mobile
issues related to ONNX Runtime mobile; typically submitted using template
#22861
opened Nov 16, 2024 by
jchen351
Keep the model metadata on the generated EP context model (use bridge api)
#22860
opened Nov 15, 2024 by
chilo-ms
[TensorRT EP] Fix wrong input order when generating IndexedSubGraph
#22857
opened Nov 15, 2024 by
chilo-ms
Enable QNN HTP spill fill buffer setting to save RAM usage.
ep:QNN
issues related to QNN execution provider
#22853
opened Nov 15, 2024 by
HectorSVC