TencentARC has a significant public presence on GitHub, with repositories focused on developing algorithms and models for image processing and computer vision. Languages such as Python, Jupyter Notebook, and JavaScript feature prominently in its best-known projects, including GFPGAN and PhotoMaker.
GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.
PhotoMaker [CVPR 2024]
InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models
T2I-Adapter
[SIGGRAPH 2026] Pixal3D: Pixel-Aligned 3D Generation from Images
[ECCV 2024] The official implementation of paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion"
Official Code for MotionCtrl [SIGGRAPH 2024]
SEED-Voken: A Series of Powerful Visual Tokenizers
SEED-Story: Multimodal Long Story Generation with Large Language Model
[ICCV 2023] Consistent Image Synthesis and Editing
[SIGGRAPH2025] Official repo for paper "Any-length Video Inpainting and Editing with Plug-and-Play Context Control"
[under review] The official implementation of paper "BrushEdit: All-In-One Image Inpainting and Editing"
[ICLR 2026] Streamlining Cartoon Production with Generative Post-Keyframing
[ACL 2024] Progressive LLaMA with Block Expansion.
A framework to convert any 2D videos to immersive stereoscopic 3D
The official implementation of paper "ColorFlow: Retrieval-Augmented Image Sequence Colorization".
[ICCV 2025] GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors
NeurIPS 2023, Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models
[ICLR 2026] Official Repo for Rolling Forcing: Autoregressive Long Video Diffusion in Real Time
VerseCrafter: Dynamic Realistic Video World Model with 4D Geometric Control
Official code of SmartEdit [CVPR-2024 Highlight]
Codes for "AnimeSR: Learning Real-World Super-Resolution Models for Animation Videos"
ECCV 2022, Oral, VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder
[ICCV 2025] AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction
[CVPR 2025] Official code of "DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation"
AudioStory: Generating Long-Form Narrative Audio with Large Language Models
No description provided for this repository.
[ICCV 2025] FreeSplatter: Pose-free Gaussian Splatting for Sparse-view 3D Reconstruction
UMT is a unified and flexible framework which can handle different input modality combinations, and output video moment retrieval and/or highlight detection results.
Structured Video Comprehension of Real-World Shorts
TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation
Track4World: Feedforward World-centric Dense 3D Tracking of All Pixels
[CVPR 2024] ViT-Lens: Towards Omni-modal Representations
Codes for "Metric Learning based Interactive Modulation for Real-World Super-Resolution"
[ICCV2025 Oral] Latent Motion Token as the Bridging Language for Learning Robot Manipulation from Videos
[CVPR 2026 Highlight🔥] MotionCrafter: Dense Geometry and Motion Reconstruction with a 4D VAE
[ICLR'26] IC-Custom: Diverse Image Customization via In-Context Learning
[ICLR 2026] GenCompositor: Generative Video Compositing with Diffusion Transformer
[ECCV 2024🔥] Official implementation of the paper "ST-LLM: Large Language Models Are Effective Temporal Learners"
[CVPR 2026] TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs
Official codes for DeSRA (ICML 2023)
Official code for "Bridging Video-text Retrieval with Multiple Choice Questions", CVPR 2022 (Oral).
[NeurIPS2025] The official implementation of MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO
Code release of our paper "DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation".
[CVPR 2025] Boosting Generative Novel View Synthesis with Sparse and Unposed Images
[CVPR 2026] Spatio-Temporal Autoregressive 4K 360° Video Generation from Perspective Video
NeurIPS 2021, Spotlight, Finding Discriminative Filters for Specific Degradations in Blind Super-Resolution
No description provided for this repository.
Nerf and extensions in all
No description provided for this repository.
mllm-npu: training multimodal large language models on Ascend NPUs
Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning?
Diffusion Powers Video Tokenizer for Comprehension and Generation (CVPR 2025)
[ACL2026 Findings] GRPO-CARE: Consistency-Aware Reinforcement Learning for Multimodal Reasoning
SurfelNeRF: Neural Surfel Radiance Fields for Online Photorealistic Reconstruction of Indoor Scenes
Codes for "RepSR: Training Efficient VGG-style Super-Resolution Networks with Structural Re-Parameterization and Batch Normalization"
No description provided for this repository.
HOSNeRF: Dynamic Human-Object-Scene Neural Radiance Fields from a Single Video
Codes for "Mitigating Artifacts in Real-World Video Super-Resolution Models"
Official code for "What Makes for Good Visual Tokenizers for Large Language Models?".
Official codes for ConMIM (ICLR 2023)
Turning to Video for Transcript Sorting
Official code for "Binary embedding based retrieval at Tencent"
Structuring Hour-Long Videos into Navigable Chapters and Hierarchical Summaries
No description provided for this repository.
[IJCAI 2023] official implementation of the paper SGAT4PASS: Spherical Geometry-Aware Transformer for PAnoramic Semantic Segmentation
Official code for "pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation", ICML 2023.
BTS: A Bi-lingual Benchmark for Text Segmentation in the Wild
Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)
Codes for "Accelerating the Training of Video Super-Resolution"
Official code for "Dynamic Token Normalization Improves Vision Transformer", ICLR 2022.
OpenCompatible provides a standard compatible training benchmark, covering practical training scenarios.
No description provided for this repository.
[SIGGRAPH ASIA'25] BlobCtrl: Taming Controllable Blob for Element-level Image Editing
No description provided for this repository.
OmniScript: Towards Audio-Visual Script Generation for Long-Form Cinematic Video
Official code for the paper, "TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter".
Common template for PyTorch projects. Easy to extend and modify for new projects.
The code repo for the ACM MM paper: TransFusion: Multi-Modal Fusion for Video Tag Inference via Translation-based Knowledge Embedding.
Visualization of 3d and 2d components interactively.
No description provided for this repository.
No description provided for this repository.
Sculpt4D: Generating 4D Shapes via Sparse-Attention Diffusion Transformers. CVPR 2026
vllm for ARC-Hunyuan-Video-7B
No description provided for this repository.
A forward-warping library for optical flow with backpropagation support, built on PyTorch.
TencentARC builds projects related to image processing and computer vision. Its well-known repositories include GFPGAN and PhotoMaker, which develop algorithms for face restoration and image generation, respectively.
TencentARC works mainly in Python and Jupyter Notebook. Some of its projects also use JavaScript and HTML.
All of TencentARC's repositories are public, which gives developers and researchers open access to view, use, and contribute to the projects on GitHub.