Official implementation of the paper "Koala-36M: A Large-scale Video Dataset Improving Consistency between Fine-grained Conditions and Video Content".

Python

Updated Jun 8, 2026

I2V-Adapter

★233

I2V-Adapter: A General Image-to-Video Adapter for Diffusion Models

Python

Updated May 4, 2026

MemFlow

★209

Official Implementation of "MemFlow: Flowing Adaptive Memory for Consistent and Efficient Long Video Narratives"

Python

Updated Jun 10, 2026

X-Dub

★200

Try X-Dub to sync any character in a video with any audio you like | Official repository for "From Inpainting to Editing: Unlocking Robust Mask-Free Visual Dubbing via Generative Bootstrapping"

Python

Updated Jun 11, 2026

DiffMoE

★177

[Arxiv 2025] Official PyTorch implementation of DiffMoE, TC-DiT, EC-DiT and Dense DiT

Python

Updated Jun 11, 2026

StyleMaster

★174

[CVPR'25] StyleMaster: Stylize Your Video with Artistic Generation and Translation

Jupyter Notebook

Updated Jun 7, 2026

ComfyUI-KLingAI-API

★172

No description provided for this repository.

Python

Updated May 25, 2026

MultiShotMaster

★163

CVPR 2026 | Official Implementation of "MultiShotMaster: A Controllable Multi-Shot Video Generation Framework"

Python

Updated May 29, 2026

CamCloneMaster

★158

[SIGGRAPH Asia'25] Enabling Reference-based Camera Control via Context without Explicit 3D Estimation

Python

Updated May 27, 2026

SVG-T2I

★152

[Arxiv 2025] Official PyTorch Implementation of "SVG-T2I: Scaling up Text-to-Image Latent Diffusion Model Without Variational Autoencoder".

Python

Updated May 21, 2026

ShotStream

★150

ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling

Python

Updated Jun 10, 2026

VANS

★119

[CVPR 2026] Video-as-Answer: Predict and Generate Next Video Event with Joint-GRPO

Python

Updated Jun 4, 2026

HumanAesExpert

★118

Official implementation of "HumanAesExpert: Advancing a Multi-Modality Foundation Model for Human Image Aesthetic Assessment"

Python

Updated Jun 1, 2026

StereoPilot

★115

The official implementation of StereoPilot

Python

Updated Jun 12, 2026

RoboMaster

★107

[ICLR’26] Learning Video Generation for Robotic Manipulation with Collaborative Trajectory Control

Python

Updated May 16, 2026

Uniaa

★94

Unified Multi-modal IAA Baseline and Benchmark

Python

Updated Mar 9, 2026

VideoCanvas

★70

Official Code of "VideoCanvas: Unified Video Completion from Arbitrary Spatiotemporal Patches via In-Context Conditioning"

Unknown language

Updated May 2, 2026

MODA

★69

[ICML 2025 Spotlight] MODA: MOdular Duplex Attention for Multimodal Perception, Cognition, and Emotion Understanding

Python

Updated Jun 11, 2026

AvatarForcing

★68

Official PyTorch implementation of AvatarForcing: One-Step Streaming Talking Avatars via Local-Future Sliding-Window Denoising

Python

Updated Jun 10, 2026

VMoBA

★65

Official implementation of paper "VMoBA: Mixture-of-Block Attention for Video Diffusion Models"

Python

Updated Apr 2, 2026

SPF-Portrait

★63

Official implementation of "SPF-Portrait: Towards Pure Portrait Customization with Semantic Pollution-Free Fine-tuning"

Unknown language

Updated Apr 24, 2026

PhysMaster

★57

Official repository of PhysMaster: Mastering Physical Representation for Video Generation via Reinforcement Learning

Unknown language

Updated May 8, 2026

T2I-CoReBench

★53

[ICLR'26] Easier Painting Than Thinking: Can Text-to-Image Models Set the Stage, but Not Direct the Play?

Python

Updated May 8, 2026

Alchemist

★39

No description provided for this repository.

Python

Updated May 12, 2026

diffusing-right-space

★31

Metric implementation and raw data of "Diffusing in the Right Space: A Systematic Study of Latent Diffusability"

Python

Updated Jun 13, 2026

DecMem

★19

DecMem: Towards Minute-Long Consistent World Generation with Decoupled Memory

Python

Updated Jun 8, 2026

VidEmo

★15

[NeurIPS'25] VidEmo: Affective-Tree Reasoning for Emotion-Centric Video Foundation Models

Python

Updated Apr 30, 2026

SegTune

★14

[ACL'26 Oral] Official implementation of "SegTune: Structured and Fine-Grained Control for Song Generation".

Python

Updated Jun 12, 2026

VQRAE

★12

VQRAE: Representation Quantization Autoencoders for Multimodal Understanding, Generation and Reconstruction

Python

Updated May 28, 2026

VFRTok

★11

Official implementation of NeurIPS'25 paper "VFRTok: Variable Frame Rates Video Tokenizer with Duration-Proportional Information Assumption"

Python

Updated Jun 8, 2026

IMBA-Loss

★11

[ICCV 2025] Official Implementation of the Paper "Imbalance in Balance: Online Concept Balancing in Generation Models".

Python

Updated Jan 16, 2026

DVIS_Plus

★10

Decoupled Video Instance Segmentation Framework, an improved version of DVIS

Python

Updated May 17, 2026

TexEditor

★7

TexEditor: Structure-Preserving Text-Driven Texture Editing

Jupyter Notebook

Updated Jun 2, 2026

DVIS

★7

Decoupled Video Instance Segmentation Framework

Python

Updated May 17, 2026

ScalingCache

★5

[ICLR 2026] ScalingCache: Extreme Acceleration of DiTs through Difference Scaling and Dynamic Interval Caching

Python

Updated Apr 28, 2026

kling-waic-express

★4

This is the program supporting KlingAI Express at WAIC 2025.

Kotlin

Updated Jun 12, 2026

SocioEmoDialog

★4

Scripts for processing and evaluating SocioEmoDialog datasets. It includes the core processing scripts, evaluation metrics, and additional documentation.

Python

Updated Jan 16, 2026