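vllm-project is an active GitHub organization focused on inference and serving for large language models. Its public repositories span several programming languages, including Python, C++, Rust, and Go, and include widely used projects such as vllm, vllm-omni, and aibrix.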
A high-throughput and memory-efficient inference and serving engine for LLMs
A framework for efficient model inference with omni-modality models
Cost-efficient and pluggable Infrastructure components for GenAI inference
System Level Intelligent Router for Mixture-of-Models at Cloud, Data Center and Edge
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization
Community maintained hardware plugin for vLLM on Ascend
Community maintained hardware plugin for vLLM on Apple Silicon
Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs
Common recipes to run vLLM
A unified library for building, evaluating, and storing speculative decoding algorithms for LLM inference in vLLM
TPU inference for vLLM, with unified JAX and PyTorch support.
A safetensors extension to efficiently store sparse quantized tensors on disk
A high-performance and light-weight router for vLLM large scale deployment
An LLM post-training framework with vLLM for RL Scaling
Fast and memory-efficient exact attention
Agent skills for vLLM
No description provided for this repository.
vLLM Daily Summarization of Merged PRs
The vLLM XPU kernels for Intel GPU
No description provided for this repository.
This repo hosts code for vLLM CI & Performance Benchmark infrastructure.
Community maintained hardware plugin for vLLM on Intel Gaudi
Stateful API logic for agentic applications using vLLM
Community maintained hardware plugin for vLLM on AWS Neuron
vLLM plugin for block-based diffusion language model (dLLM) support
Manages vllm-nccl dependency
No description provided for this repository.
vLLM Model plugin for the encoder-decoder BART model
No description provided for this repository.
vLLM Logo Assets
No description provided for this repository.
vLLM Quantization plugin for GGUF
Performance benchmark & accuracy evaluation for vLLM
No description provided for this repository.
Performance dashboard for vLLM
vLLM Quantization plugin for bitsandbytes
No description provided for this repository.
No description provided for this repository.
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
No description provided for this repository.
Standalone fork of llm-multimodal from SMG
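The projects that vllm-project hosts on GitHub center on large language models: chiefly an inference and serving engine and frameworks for efficient model inference, alongside supporting pieces such as routers, hardware plugins, and compression libraries. Together they target a range of deployment scenarios.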
vllm-project's repositories are written mainly in Python, C++, Rust, and Go, a mix that reflects the organization's focus on efficient and scalable systems.
Yes, all of vllm-project's repositories are public. Anyone can read, audit, and contribute to them, which supports the wider open-source community.
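Because the repositories are public, a listing like the one above can be rebuilt from GitHub's REST API. The following is a minimal sketch, not the method this page uses: it assumes the `GET /orgs/{org}/repos` endpoint with its `per_page` and `page` parameters and the `name`, `description`, and `language` fields of each returned repository object. The helper names `fetch_public_repos`, `count_languages`, and `missing_descriptions` are hypothetical and introduced only for illustration.

```python
import json
import urllib.request
from collections import Counter

# GitHub REST API endpoint for an organization's repositories (assumed here;
# unauthenticated requests see public repositories only and are rate-limited).
API_URL = "https://api.github.com/orgs/{org}/repos?per_page=100&page={page}"


def fetch_public_repos(org):
    """Fetch an organization's public repositories, following pagination."""
    repos, page = [], 1
    while True:
        request = urllib.request.Request(
            API_URL.format(org=org, page=page),
            headers={"Accept": "application/vnd.github+json"},
        )
        with urllib.request.urlopen(request) as response:
            batch = json.load(response)
        if not batch:  # an empty page means there is nothing left to fetch
            return repos
        repos.extend(batch)
        page += 1


def count_languages(repos):
    """Count repositories per primary language ('unknown' when none is set)."""
    return Counter(repo.get("language") or "unknown" for repo in repos)


def missing_descriptions(repos):
    """Return the names of repositories that have no description."""
    return [repo["name"] for repo in repos if not repo.get("description")]


# Example (requires network access):
#   repos = fetch_public_repos("vllm-project")
#   print(count_languages(repos).most_common())
#   print(missing_descriptions(repos))
```

The two summary helpers work on plain dictionaries, so they can be tried on saved API output without making a network request.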