TencentARC's public presence on GitHub spans a wide range of repositories focused on artificial intelligence and machine learning. With Python, Jupyter Notebook, and JavaScript as primary languages, notable projects include GFPGAN, PhotoMaker, and InstantMesh, which address face restoration, image generation, and 3D modeling, respectively.
GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.
PhotoMaker [CVPR 2024]
InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models
T2I-Adapter
[SIGGRAPH 2026] Pixal3D: Pixel-Aligned 3D Generation from Images
[ECCV 2024] The official implementation of paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion"
Official Code for MotionCtrl [SIGGRAPH 2024]
SEED-Voken: A Series of Powerful Visual Tokenizers
SEED-Story: Multimodal Long Story Generation with Large Language Model
[ICCV 2023] Consistent Image Synthesis and Editing
[SIGGRAPH2025] Official repo for paper "Any-length Video Inpainting and Editing with Plug-and-Play Context Control"
[under review] The official implementation of paper "BrushEdit: All-In-One Image Inpainting and Editing"
[ICLR 2026] Streamlining Cartoon Production with Generative Post-Keyframing
[ACL 2024] Progressive LLaMA with Block Expansion.
A framework to convert any 2D videos to immersive stereoscopic 3D
The official implementation of paper "ColorFlow: Retrieval-Augmented Image Sequence Colorization".
[ICCV 2025] GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors
NeurIPS 2023, Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models
[ICLR 2026] Official Repo for Rolling Forcing: Autoregressive Long Video Diffusion in Real Time
VerseCrafter: Dynamic Realistic Video World Model with 4D Geometric Control
Official code of SmartEdit [CVPR-2024 Highlight]
Codes for "AnimeSR: Learning Real-World Super-Resolution Models for Animation Videos"
ECCV 2022, Oral, VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder
[ICCV 2025] AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction
[CVPR 2025] Official code of "DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation"
AudioStory: Generating Long-Form Narrative Audio with Large Language Models
No description provided for this repository.
[ICCV 2025] FreeSplatter: Pose-free Gaussian Splatting for Sparse-view 3D Reconstruction
UMT is a unified and flexible framework which can handle different input modality combinations, and output video moment retrieval and/or highlight detection results.
Structured Video Comprehension of Real-World Shorts
TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation
Track4World: Feedforward World-centric Dense 3D Tracking of All Pixels
[CVPR 2024] ViT-Lens: Towards Omni-modal Representations
Codes for "Metric Learning based Interactive Modulation for Real-World Super-Resolution"
[ICCV2025 Oral] Latent Motion Token as the Bridging Language for Learning Robot Manipulation from Videos
[CVPR 2026 Highlight🔥] MotionCrafter: Dense Geometry and Motion Reconstruction with a 4D VAE
[ICLR'26] IC-Custom: Diverse Image Customization via In-Context Learning
[ICLR 2026] GenCompositor: Generative Video Compositing with Diffusion Transformer
[ECCV 2024🔥] Official implementation of the paper "ST-LLM: Large Language Models Are Effective Temporal Learners"
[CVPR 2026] TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs
Official codes for DeSRA (ICML 2023)
Official code for "Bridging Video-text Retrieval with Multiple Choice Questions", CVPR 2022 (Oral).
[NeurIPS2025] The official implementation of MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO
Code release of our paper "DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation".
[CVPR 2025] Boosting Generative Novel View Synthesis with Sparse and Unposed Images
[CVPR 2026] Spatio-Temporal Autoregressive 4K 360° Video Generation from Perspective Video
NeurIPS 2021, Spotlight, Finding Discriminative Filters for Specific Degradations in Blind Super-Resolution
No description provided for this repository.
NeRF and its extensions, all in one.
No description provided for this repository.
mllm-npu: training multimodal large language models on Ascend NPUs
Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning?
Diffusion Powers Video Tokenizer for Comprehension and Generation (CVPR 2025)
[ACL2026 Findings] GRPO-CARE: Consistency-Aware Reinforcement Learning for Multimodal Reasoning
SurfelNeRF: Neural Surfel Radiance Fields for Online Photorealistic Reconstruction of Indoor Scenes
Codes for "RepSR: Training Efficient VGG-style Super-Resolution Networks with Structural Re-Parameterization and Batch Normalization"
No description provided for this repository.
HOSNeRF: Dynamic Human-Object-Scene Neural Radiance Fields from a Single Video
Codes for "Mitigating Artifacts in Real-World Video Super-Resolution Models"
Official code for "What Makes for Good Visual Tokenizers for Large Language Models?".
Official codes for ConMIM (ICLR 2023)
Turning to Video for Transcript Sorting
Official code for "Binary embedding based retrieval at Tencent"
Structuring Hour-Long Videos into Navigable Chapters and Hierarchical Summaries
No description provided for this repository.
[IJCAI 2023] official implementation of the paper SGAT4PASS: Spherical Geometry-Aware Transformer for PAnoramic Semantic Segmentation
Official code for "pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation", ICML 2023.
BTS: A Bi-lingual Benchmark for Text Segmentation in the Wild
Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)
Codes for "Accelerating the Training of Video Super-Resolution"
Official code for "Dynamic Token Normalization Improves Vision Transformer", ICLR 2022.
OpenCompatible provides a standard compatible training benchmark, covering practical training scenarios.
No description provided for this repository.
[SIGGRAPH ASIA'25] BlobCtrl: Taming Controllable Blob for Element-level Image Editing
No description provided for this repository.
OmniScript: Towards Audio-Visual Script Generation for Long-Form Cinematic Video
Official code for the paper, "TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter".
Common template for PyTorch projects. Easy to extend and modify for new projects.
The code repo for the ACM MM paper: TransFusion: Multi-Modal Fusion for Video Tag Inference via Translation-based Knowledge Embedding.
Interactive visualization of 3D and 2D components.
No description provided for this repository.
No description provided for this repository.
Sculpt4D: Generating 4D Shapes via Sparse-Attention Diffusion Transformers. CVPR 2026
vllm for ARC-Hunyuan-Video-7B
No description provided for this repository.
A library for optical flow forward warping with backpropagation, using PyTorch.
TencentARC develops a variety of projects related to artificial intelligence and machine learning. Notable repositories include GFPGAN for face restoration and PhotoMaker, which focuses on image generation.
The main languages used in TencentARC's repositories include Python, Jupyter Notebook, and JavaScript. They underpin the development of its projects.
Yes, all of TencentARC's repositories are public. This lets other developers access, study, and contribute to its projects, promoting transparency and collaboration in the open-source community.