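TencentARC maintains a range of public repositories on GitHub; the main programming languages include Python, Jupyter Notebook, JavaScript, and HTML. Notable projects from the organization include GFPGAN, PhotoMaker, and InstantMesh, with a focus on developing practical algorithms and image restoration techniques.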
GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.
PhotoMaker [CVPR 2024]
InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models
T2I-Adapter
[SIGGRAPH 2026] Pixal3D: Pixel-Aligned 3D Generation from Images
[ECCV 2024] The official implementation of paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion"
Official Code for MotionCtrl [SIGGRAPH 2024]
SEED-Voken: A Series of Powerful Visual Tokenizers
SEED-Story: Multimodal Long Story Generation with Large Language Model
[ICCV 2023] Consistent Image Synthesis and Editing
[SIGGRAPH2025] Official repo for paper "Any-length Video Inpainting and Editing with Plug-and-Play Context Control"
[under review] The official implementation of paper "BrushEdit: All-In-One Image Inpainting and Editing"
[ICLR 2026] Streamlining Cartoon Production with Generative Post-Keyframing
[ACL 2024] Progressive LLaMA with Block Expansion.
A framework to convert any 2D videos to immersive stereoscopic 3D
The official implementation of paper "ColorFlow: Retrieval-Augmented Image Sequence Colorization".
[ICCV 2025] GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors
NeurIPS 2023, Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models
[ICLR 2026] Official Repo for Rolling Forcing: Autoregressive Long Video Diffusion in Real Time
VerseCrafter: Dynamic Realistic Video World Model with 4D Geometric Control
Official code of SmartEdit [CVPR-2024 Highlight]
Codes for "AnimeSR: Learning Real-World Super-Resolution Models for Animation Videos"
ECCV 2022, Oral, VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder
[ICCV 2025] AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction
[CVPR 2025] Official code of "DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation"
AudioStory: Generating Long-Form Narrative Audio with Large Language Models
No description is provided for this repository.
[ICCV 2025] FreeSplatter: Pose-free Gaussian Splatting for Sparse-view 3D Reconstruction
UMT is a unified and flexible framework which can handle different input modality combinations, and output video moment retrieval and/or highlight detection results.
Structured Video Comprehension of Real-World Shorts
TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation
Track4World: Feedforward World-centric Dense 3D Tracking of All Pixels
[CVPR 2024] ViT-Lens: Towards Omni-modal Representations
Codes for "Metric Learning based Interactive Modulation for Real-World Super-Resolution"
[ICCV2025 Oral] Latent Motion Token as the Bridging Language for Learning Robot Manipulation from Videos
[CVPR 2026 Highlight🔥] MotionCrafter: Dense Geometry and Motion Reconstruction with a 4D VAE
[ICLR'26] IC-Custom: Diverse Image Customization via In-Context Learning
[ICLR 2026] GenCompositor: Generative Video Compositing with Diffusion Transformer
[ECCV 2024🔥] Official implementation of the paper "ST-LLM: Large Language Models Are Effective Temporal Learners"
[CVPR 2026] TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs
Official codes for DeSRA (ICML 2023)
Official code for "Bridging Video-text Retrieval with Multiple Choice Questions", CVPR 2022 (Oral).
[NeurIPS2025] The official implementation of MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO
Code release of our paper "DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation".
[CVPR 2025] Boosting Generative Novel View Synthesis with Sparse and Unposed Images
[CVPR 2026] Spatio-Temporal Autoregressive 4K 360° Video Generation from Perspective Video
NeurIPS 2021, Spotlight, Finding Discriminative Filters for Specific Degradations in Blind Super-Resolution
No description is provided for this repository.
NeRF and its extensions, all in one
No description is provided for this repository.
mllm-npu: training multimodal large language models on Ascend NPUs
Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning?
Diffusion Powers Video Tokenizer for Comprehension and Generation (CVPR 2025)
[ACL2026 Findings] GRPO-CARE: Consistency-Aware Reinforcement Learning for Multimodal Reasoning
SurfelNeRF: Neural Surfel Radiance Fields for Online Photorealistic Reconstruction of Indoor Scenes
Codes for "RepSR: Training Efficient VGG-style Super-Resolution Networks with Structural Re-Parameterization and Batch Normalization"
No description is provided for this repository.
HOSNeRF: Dynamic Human-Object-Scene Neural Radiance Fields from a Single Video
Codes for "Mitigating Artifacts in Real-World Video Super-Resolution Models"
Official code for "What Makes for Good Visual Tokenizers for Large Language Models?".
Official codes for ConMIM (ICLR 2023)
Turning to Video for Transcript Sorting
Official code for "Binary embedding based retrieval at Tencent"
Structuring Hour-Long Videos into Navigable Chapters and Hierarchical Summaries
No description is provided for this repository.
[IJCAI 2023] official implementation of the paper SGAT4PASS: Spherical Geometry-Aware Transformer for PAnoramic Semantic Segmentation
Official code for "pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation", ICML 2023.
BTS: A Bi-lingual Benchmark for Text Segmentation in the Wild
Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)
Codes for "Accelerating the Training of Video Super-Resolution"
Official code for "Dynamic Token Normalization Improves Vision Transformer", ICLR 2022.
OpenCompatible provides a standard compatible training benchmark, covering practical training scenarios.
No description is provided for this repository.
[SIGGRAPH ASIA'25] BlobCtrl: Taming Controllable Blob for Element-level Image Editing
No description is provided for this repository.
OmniScript: Towards Audio-Visual Script Generation for Long-Form Cinematic Video
Official code for the paper, "TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter".
Common template for PyTorch projects. Easy to extend and modify for new projects.
The code repo for the ACM MM paper: TransFusion: Multi-Modal Fusion for Video Tag Inference via Translation-based Knowledge Embedding.
Interactive visualization of 3D and 2D components.
No description is provided for this repository.
No description is provided for this repository.
Sculpt4D: Generating 4D Shapes via Sparse-Attention Diffusion Transformers. CVPR 2026
vllm for ARC-Hunyuan-Video-7B
No description is provided for this repository.
A PyTorch library for optical-flow forward warping with backpropagation support.
TencentARC develops several projects related to image restoration and 3D mesh generation. Its major repositories GFPGAN and InstantMesh are examples.
TencentARC develops its projects primarily in languages such as Python, Jupyter Notebook, JavaScript, and HTML.
Yes, all of TencentARC's repositories are publicly accessible, and anyone can view their code and projects.