@FMInference

Foundation Model Inference

Inference Systems for Foundation Models

Pinned

  1. FlexLLMGen Public archive

    Running large language models on a single GPU for throughput-oriented scenarios.

    Python · 9.2k stars · 553 forks

Repositories

  • FlexLLMGen Public archive

    Running large language models on a single GPU for throughput-oriented scenarios.

    Python · 9,242 stars · Apache-2.0 · 553 forks · 52 open issues (3 need help) · 6 pull requests · Updated Oct 28, 2024
  • H2O Public

    [NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.

    Python · 407 stars · 48 forks · 31 open issues · 1 pull request · Updated Aug 1, 2024
  • DejaVu Public
    Python · 302 stars · 39 forks · 26 open issues · 1 pull request · Updated Apr 2, 2024

People

This organization has no public members.

Top languages

Python