TencentARC, the organization of ARC Lab at Tencent PCG, maintains a substantial public presence on GitHub with a wide range of repositories. The main languages used are Python, Jupyter Notebook, and JavaScript, and notable projects such as GFPGAN and PhotoMaker reflect its focus on practical algorithms for image restoration and other advanced applications.
GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.
PhotoMaker [CVPR 2024]
InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models
T2I-Adapter
[SIGGRAPH 2026] Pixal3D: Pixel-Aligned 3D Generation from Images
[ECCV 2024] The official implementation of paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion"
Official Code for MotionCtrl [SIGGRAPH 2024]
SEED-Voken: A Series of Powerful Visual Tokenizers
SEED-Story: Multimodal Long Story Generation with Large Language Model
[ICCV 2023] Consistent Image Synthesis and Editing
[SIGGRAPH2025] Official repo for paper "Any-length Video Inpainting and Editing with Plug-and-Play Context Control"
[under review] The official implementation of paper "BrushEdit: All-In-One Image Inpainting and Editing"
[ICLR 2026] Streamlining Cartoon Production with Generative Post-Keyframing
[ACL 2024] Progressive LLaMA with Block Expansion.
A framework to convert any 2D videos to immersive stereoscopic 3D
The official implementation of paper "ColorFlow: Retrieval-Augmented Image Sequence Colorization".
[ICCV 2025] GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors
NeurIPS 2023, Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models
[ICLR 2026] Official Repo for Rolling Forcing: Autoregressive Long Video Diffusion in Real Time
VerseCrafter: Dynamic Realistic Video World Model with 4D Geometric Control
Official code of SmartEdit [CVPR-2024 Highlight]
Codes for "AnimeSR: Learning Real-World Super-Resolution Models for Animation Videos"
ECCV 2022, Oral, VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder
[ICCV 2025] AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction
[CVPR 2025] Official code of "DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation"
AudioStory: Generating Long-Form Narrative Audio with Large Language Models
No description provided for this repository.
[ICCV 2025] FreeSplatter: Pose-free Gaussian Splatting for Sparse-view 3D Reconstruction
UMT is a unified and flexible framework which can handle different input modality combinations, and output video moment retrieval and/or highlight detection results.
Structured Video Comprehension of Real-World Shorts
TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation
Track4World: Feedforward World-centric Dense 3D Tracking of All Pixels
[CVPR 2024] ViT-Lens: Towards Omni-modal Representations
Codes for "Metric Learning based Interactive Modulation for Real-World Super-Resolution"
[ICCV2025 Oral] Latent Motion Token as the Bridging Language for Learning Robot Manipulation from Videos
[CVPR 2026 Highlight🔥] MotionCrafter: Dense Geometry and Motion Reconstruction with a 4D VAE
[ICLR'26] IC-Custom: Diverse Image Customization via In-Context Learning
[ICLR 2026] GenCompositor: Generative Video Compositing with Diffusion Transformer
[ECCV 2024🔥] Official implementation of the paper "ST-LLM: Large Language Models Are Effective Temporal Learners"
[CVPR 2026] TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs
Official codes for DeSRA (ICML 2023)
Official code for "Bridging Video-text Retrieval with Multiple Choice Questions", CVPR 2022 (Oral).
[NeurIPS2025] The official implementation of MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO
Code release of our paper "DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation".
[CVPR 2025] Boosting Generative Novel View Synthesis with Sparse and Unposed Images
[CVPR 2026] Spatio-Temporal Autoregressive 4K 360° Video Generation from Perspective Video
NeurIPS 2021, Spotlight, Finding Discriminative Filters for Specific Degradations in Blind Super-Resolution
No description provided for this repository.
Nerf and extensions in all
No description provided for this repository.
mllm-npu: training multimodal large language models on Ascend NPUs
Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning?
Diffusion Powers Video Tokenizer for Comprehension and Generation (CVPR 2025)
[ACL2026 Findings] GRPO-CARE: Consistency-Aware Reinforcement Learning for Multimodal Reasoning
SurfelNeRF: Neural Surfel Radiance Fields for Online Photorealistic Reconstruction of Indoor Scenes
Codes for "RepSR: Training Efficient VGG-style Super-Resolution Networks with Structural Re-Parameterization and Batch Normalization"
No description provided for this repository.
HOSNeRF: Dynamic Human-Object-Scene Neural Radiance Fields from a Single Video
Codes for "Mitigating Artifacts in Real-World Video Super-Resolution Models"
Official code for "What Makes for Good Visual Tokenizers for Large Language Models?".
Official codes for ConMIM (ICLR 2023)
Turning to Video for Transcript Sorting
Official code for "Binary embedding based retrieval at Tencent"
Structuring Hour-Long Videos into Navigable Chapters and Hierarchical Summaries
No description provided for this repository.
[IJCAI 2023] official implementation of the paper SGAT4PASS: Spherical Geometry-Aware Transformer for PAnoramic Semantic Segmentation
Official code for "pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation", ICML 2023.
BTS: A Bi-lingual Benchmark for Text Segmentation in the Wild
Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)
Codes for "Accelerating the Training of Video Super-Resolution"
Official code for "Dynamic Token Normalization Improves Vision Transformer", ICLR 2022.
OpenCompatible provides a standard compatible training benchmark, covering practical training scenarios.
No description provided for this repository.
[SIGGRAPH ASIA'25] BlobCtrl: Taming Controllable Blob for Element-level Image Editing
No description provided for this repository.
OmniScript: Towards Audio-Visual Script Generation for Long-Form Cinematic Video
Official code for the paper, "TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter".
Common template for PyTorch projects. Easy to extend and modify for new projects.
The code repo for the ACM MM paper: TransFusion: Multi-Modal Fusion for Video Tag Inference via Translation-based Knowledge Embedding.
Visualization of 3d and 2d components interactively.
No description provided for this repository.
No description provided for this repository.
[CVPR 2026] Sculpt4D: Generating 4D Shapes via Sparse-Attention Diffusion Transformers
vllm for ARC-Hunyuan-Video-7B
No description provided for this repository.
A library for optical flow forward warping with backpropagation, using PyTorch.
TencentARC develops a variety of projects, notably in computer vision and image generation. Repositories such as GFPGAN and PhotoMaker are examples of its work on image restoration and image processing algorithms.
TencentARC primarily uses Python, Jupyter Notebook, and JavaScript for its projects. These languages are suited to its research work and to its applications in artificial intelligence and image processing.
All of the TencentARC repositories listed here are public. This lets the community browse, use, and contribute to the projects, which encourages collaboration and innovation.