NVIDIA Corporation's public GitHub footprint

nvidia-docker

★17,557

Build and run Docker containers leveraging NVIDIA GPUs

open-gpu-kernel-modules

★17,084

NVIDIA Linux open GPU kernel module source

Megatron-LM

★16,688

Ongoing research training transformer models at scale

DeepLearningExamples

★14,818

State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

TensorRT-LLM

★13,861

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.

TensorRT

★13,065

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

FastPhotoStyle

★11,177

Style transfer, deep learning, feature transform

cosmos

★10,107

NVIDIA Cosmos is an open platform of world models, datasets, and tools that enables developers to build Physical AI for robots, autonomous vehicles, smart infrastructure, and more.

personaplex

★9,999

PersonaPlex code.

cutlass

★9,891

CUDA Templates and Python DSLs for High-Performance Linear Algebra

cuda-samples

★9,292

Samples for CUDA developers that demonstrate features in the CUDA Toolkit

vid2vid

★8,705

PyTorch implementation of our method for high-resolution (e.g. 2048x1024) photorealistic video-to-video translation.

garak

★8,095

The LLM vulnerability scanner

Isaac-GR00T

★7,333

NVIDIA Isaac GR00T N1.7 - A Foundation Model for Generalist Robots.

OpenShell

★7,073

OpenShell is the safe, private runtime for autonomous AI agents.

Rust

warp

★6,754

A Python framework for GPU-accelerated simulation, robotics, and machine learning.

FasterTransformer

★6,420

Transformer related optimization, including BERT, GPT

thrust

★5,003

[ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl

nccl

★4,808

Optimized primitives for collective multi-GPU communication

DIGITS

★4,179

Deep Learning GPU Training System

HTML

SkillSpector

★3,962

Security scanner for AI agent skills. Detect vulnerabilities, malicious patterns, and security risks.

k8s-device-plugin

★3,789

NVIDIA device plugin for Kubernetes

TransformerEngine

★3,390

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on Hopper, Ada and Blackwell GPUs, to provide better performance with lower memory utilization in both training and inference.

NeMo-Retriever

★2,937

NeMo Retriever Library is a scalable, performance-oriented document content and metadata extraction microservice. NeMo Retriever Library uses specialized NVIDIA NIM microservices to find, contextualize, and extract text, tables, charts and images that you can use in downstream generative applications.

MinkowskiEngine

★2,935

Minkowski Engine is an auto-diff neural network library for high-dimensional sparse tensors

physicsnemo

★2,927

Open-source deep-learning framework for building, training, and fine-tuning deep learning models using state-of-the-art Physics-ML methods

Model-Optimizer

★2,920

A unified library of SOTA model optimization techniques like quantization, distillation, pruning, neural architecture search, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM, TensorRT, vLLM, etc. to optimize inference speed.

gpu-operator

★2,738

NVIDIA GPU Operator creates, configures, and manages GPUs in Kubernetes

NeMo-Agent-Toolkit

★2,407

The NVIDIA NeMo Agent toolkit is an open-source library for efficiently connecting and optimizing teams of AI agents.

cccl

★2,378

CUDA Core Compute Libraries

cutile-python

★2,069

cuTile is a programming model for writing parallel kernels for NVIDIA GPUs

dcgm-exporter

★1,763

NVIDIA GPU metrics exporter for Prometheus leveraging DCGM

Cosmos-Tokenizer

★1,725

A suite of image and video neural tokenizers

trt-samples-for-hackathon-cn

★1,661

Simple samples for TensorRT programming

nccl-tests

★1,555

NCCL Tests

Cuda

gdrcopy

★1,383

A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology

skills

★1,232

AI agent skills published by NVIDIA

cuda-quantum

★1,068

C++ and Python support for the CUDA Quantum programming model for heterogeneous quantum-classical workflows

jetson-gpio

★1,064

A Python library that enables the use of Jetson's GPIOs

earth2studio

★982

Open-source deep-learning framework for exploring, building and deploying AI weather/climate workflows.

spark-rapids

★979

Spark RAPIDS plugin - accelerate Apache Spark with GPUs

Scala

dgx-spark-playbooks

★949

Collection of step-by-step playbooks for setting up AI/ML workloads on NVIDIA DGX Spark devices with Blackwell architecture.

cuopt

★944

GPU accelerated decision optimization

Cuda

NVFlare

★938

NVIDIA Federated Learning Application Runtime Environment

DreamDojo

★932

Official Codebase for "DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos" (ICML 2026)

cudnn-frontend

★845

cuDNN Frontend is NVIDIA's modern, open-source entry point to the cuDNN library and a growing collection of high-performance open-source kernels.

DCGM

★742

NVIDIA Data Center GPU Manager (DCGM) is a project for gathering telemetry and measuring the health of NVIDIA GPUs

MDL-SDK

★527

NVIDIA Material Definition Language SDK

NeMo-Framework-Launcher

★521

Provides end-to-end model development pipelines for LLMs and Multimodal models that can be launched on-prem or cloud-native.

NeMo-text-processing

★474

NeMo text processing for ASR and TTS

soma-retargeter

★418

SOMA BVH to humanoid robot motion retargeting library built with Newton and NVIDIA Warp

aerial-cuda-accelerated-ran

★375

An SDK (Software Development Kit) for building commercial-grade, AI-native, 3GPP, and O-RAN compliant 5G/6G gNB software on NVIDIA-accelerated computing platforms.

OptiX_Apps

★351

Advanced Samples for the NVIDIA OptiX 7 Ray Tracing SDK

aicr

★329

Tooling for optimized, validated, and reproducible GPU-accelerated AI runtime in Kubernetes

Audio2Face-3D

★318

Repo collection for NVIDIA Audio2Face-3D models and tools

flashdreams

★315

High-performance inference and serving library for interactive autoregressive video and world models

Audio2Face-3D-Samples

★306

A service to convert audio to facial blendshapes for lipsyncing and facial performances.

VisRTX

★277

NVIDIA OptiX based implementation of ANARI

nim-anywhere

★242

Accelerate your Gen AI with NVIDIA NIM and NVIDIA AI Workbench

IsaacTeleop

★240

The unified framework for sim & real robot teleoperation

cosmos-framework

★232

Our inference and training framework to run on the Cosmos Models

SOL-ExecBench

★227

A benchmark of real-world DL kernel problems

Falcor

★223

Real-time rendering research framework

asset-harvester

★209

Asset Harvester: Extracting 3D Assets from Autonomous Driving Logs for Simulation

cosmos-curator

★207

Cosmos Curator is a powerful video curation system that processes, analyzes, and organizes video content using advanced AI models and distributed computing.

infra-controller

★193

NVIDIA Infra Controller - Hardware Lifecycle Management and multitenant networking

Rust

Audio2Face-3D-SDK

★191

High-performance C++/CUDA SDK for running Audio2Emotion and Audio2Face inference with integrated post-processing.

nsight-training

★182

Training material for Nsight developer tools

OSMO

★175

The developer-first platform for scaling complex Physical AI workloads across heterogeneous compute—unifying training GPUs, simulation clusters, and edge devices in a simple YAML

TypeScript

ncore

★168

Data representations, APIs, and tools for high quality AV and robotics applications

nvcf

★164

Platform for deploying and routing GPU-accelerated inference, streaming, and batch workloads at scale.

OpenShell-Community

★158

OpenShell is the safe, private runtime for autonomous AI agents.

Dockerfile

go-dcgm

★154

Golang bindings for NVIDIA Data Center GPU Manager (DCGM)

optix-toolkit

★138

Set of utilities supporting workflows common in GPU raytracing applications

ais-k8s

★132

Kubernetes Operator, Helm Charts, Ansible Playbooks, and utility scripts for large-scale AIStore deployments on Kubernetes.

cudaqx

★103

Accelerated libraries for quantum-classical computing built on CUDA-Q.

NV-Kernels

★102

Ubuntu kernels which are optimized for NVIDIA server systems

l2fwd-nv

★91

l2fwd-nv provides an example of how to leverage your DPDK network application with the NVIDIA GPUDirect RDMA technology.

doca-platform

★88

DOCA Platform manages provisioning and service orchestration for BlueField DPUs

gpu-admin-tools

★81

GPU Admin Tools. Includes Confidential Computing controls for H100, and other functionality

multi-storage-client

★74

Unified high-performance Python client for object and file stores.

NeMo-Relay

★58

Multi-language agent runtime for execution scope management, lifecycle events, and middleware on tool and LLM calls.

Rust

spark-rapids-jni

★57

RAPIDS Accelerator JNI For Apache Spark

Cuda

nodewright

★57

A Kubernetes Operator to manage Node OS customizations.

k8s-driver-manager

★53

The NVIDIA Driver Manager is a Kubernetes component that assists in seamless upgrades of the NVIDIA driver on each node of the cluster.

spark-rapids-benchmarks

★47

Spark RAPIDS Benchmarks – benchmark sets and utilities for the RAPIDS Accelerator for Apache Spark

sop-monitoring-blueprints

★37

Industrial SOP Monitoring Blueprints for Training & Inference

OWL

★35

The OptiX Wrappers Library

numba-cuda-mlir

★32

repo for Numba-CUDA-MLIR

harmonizer

★30

Harmonizer is an online generative enhancement framework that transforms renderings from imperfect scenes into temporally consistent outputs while improving their realism.