RepoGuard
Updated 3 min ago
kvcache.ai

Organization

Public GitHub footprint of kvcache.ai

@kvcache-ai

KVCache.AI is a joint research project between MADSys and top industry collaborators, focusing on efficient LLM serving.

15

Public repositories

22,973

Total stars

1,146

Followers

Top languages

Python 5, Cuda 2, C++ 1, Go 1, JavaScript 1

Public repositories

ktransformers

17,272

A Flexible Framework for Experiencing Heterogeneous LLM Inference/Fine-tune Optimizations

Python
Updated 13 Jun 2026

Mooncake

5,567

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++
Updated 13 Jun 2026

TrEnv-X

84

No description provided for this repository.

Go
Updated 1 Jun 2026

vllm

15

A high-throughput and memory-efficient inference and serving engine for LLMs

Python
Updated 26 May 2026

kvcache-blog

11

No description provided for this repository.

JavaScript
Updated 12 Jun 2026

sglang

11

SGLang is a fast serving framework for large language models and vision language models.

Python
Updated 5 Jun 2026

custom_flashinfer

7

FlashInfer: Kernel Library for LLM Serving

Cuda
Updated 1 Mar 2026

DeepEP_fault_tolerance

3

DeepEP: an efficient expert-parallel communication library that supports fault tolerance

Cuda
Updated 10 Mar 2026

sglang_awq

2

SGLang is a fast serving framework for large language models and vision language models.

Python
Updated 2 Mar 2026

accelerate

1

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support

Unknown language
Updated 13 Apr 2026

Model-Optimizer

0

A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM, TensorRT, vLLM, etc. to optimize inference speed.

Unknown language
Updated 6 May 2026

evalscope

0

A streamlined and customizable framework for efficient large model (LLM, VLM, AIGC) evaluation and performance benchmarking.

Python
Updated 10 Apr 2026

transformers

0

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Unknown language
Updated 7 Apr 2026

gpustack

0

GPU cluster manager for optimized AI model deployment

Unknown language
Updated 8 Dec 2025

sglang-npu

0

SGLang is a fast serving framework for large language models and vision language models.

Unknown language
Updated 12 Aug 2025
