TencentARC is a GitHub organization that works mainly in Python, Jupyter Notebook, and JavaScript. Its notable projects include GFPGAN, a face restoration algorithm, and PhotoMaker, presented at CVPR 2024; together they show a broad range of applications in image generation and manipulation.
GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.
PhotoMaker [CVPR 2024]
InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models
T2I-Adapter
[SIGGRAPH 2026] Pixal3D: Pixel-Aligned 3D Generation from Images
[ECCV 2024] The official implementation of paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion"
Official Code for MotionCtrl [SIGGRAPH 2024]
SEED-Voken: A Series of Powerful Visual Tokenizers
SEED-Story: Multimodal Long Story Generation with Large Language Model
[ICCV 2023] Consistent Image Synthesis and Editing
[SIGGRAPH2025] Official repo for paper "Any-length Video Inpainting and Editing with Plug-and-Play Context Control"
[under review] The official implementation of paper "BrushEdit: All-In-One Image Inpainting and Editing"
[ICLR 2026] Streamlining Cartoon Production with Generative Post-Keyframing
[ACL 2024] Progressive LLaMA with Block Expansion.
A framework to convert any 2D videos to immersive stereoscopic 3D
The official implementation of paper "ColorFlow: Retrieval-Augmented Image Sequence Colorization".
[ICCV 2025] GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors
NeurIPS 2023, Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models
[ICLR 2026] Official Repo for Rolling Forcing: Autoregressive Long Video Diffusion in Real Time
VerseCrafter: Dynamic Realistic Video World Model with 4D Geometric Control
Official code of SmartEdit [CVPR-2024 Highlight]
Codes for "AnimeSR: Learning Real-World Super-Resolution Models for Animation Videos"
ECCV 2022, Oral, VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder
[ICCV 2025] AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction
[CVPR 2025] Official code of "DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation"
AudioStory: Generating Long-Form Narrative Audio with Large Language Models
No description provided for this repository.
[ICCV 2025] FreeSplatter: Pose-free Gaussian Splatting for Sparse-view 3D Reconstruction
UMT is a unified and flexible framework which can handle different input modality combinations, and output video moment retrieval and/or highlight detection results.
Structured Video Comprehension of Real-World Shorts
TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation
Track4World: Feedforward World-centric Dense 3D Tracking of All Pixels
[CVPR 2024] ViT-Lens: Towards Omni-modal Representations
Codes for "Metric Learning based Interactive Modulation for Real-World Super-Resolution"
[ICCV2025 Oral] Latent Motion Token as the Bridging Language for Learning Robot Manipulation from Videos
[CVPR 2026 Highlight🔥] MotionCrafter: Dense Geometry and Motion Reconstruction with a 4D VAE
[ICLR'26] IC-Custom: Diverse Image Customization via In-Context Learning
[ICLR 2026] GenCompositor: Generative Video Compositing with Diffusion Transformer
[ECCV 2024🔥] Official implementation of the paper "ST-LLM: Large Language Models Are Effective Temporal Learners"
[CVPR 2026] TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs
Official codes for DeSRA (ICML 2023)
Official code for "Bridging Video-text Retrieval with Multiple Choice Questions", CVPR 2022 (Oral).
[NeurIPS2025] The official implementation of MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO
Code release of our paper "DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation".
[CVPR 2025] Boosting Generative Novel View Synthesis with Sparse and Unposed Images
[CVPR 2026] Spatio-Temporal Autoregressive 4K 360° Video Generation from Perspective Video
NeurIPS 2021, Spotlight, Finding Discriminative Filters for Specific Degradations in Blind Super-Resolution
No description provided for this repository.
Nerf and extensions in all
No description provided for this repository.
mllm-npu: training multimodal large language models on Ascend NPUs
Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning?
Diffusion Powers Video Tokenizer for Comprehension and Generation (CVPR 2025)
[ACL2026 Findings] GRPO-CARE: Consistency-Aware Reinforcement Learning for Multimodal Reasoning
SurfelNeRF: Neural Surfel Radiance Fields for Online Photorealistic Reconstruction of Indoor Scenes
Codes for "RepSR: Training Efficient VGG-style Super-Resolution Networks with Structural Re-Parameterization and Batch Normalization"
No description provided for this repository.
HOSNeRF: Dynamic Human-Object-Scene Neural Radiance Fields from a Single Video
Codes for "Mitigating Artifacts in Real-World Video Super-Resolution Models"
Official code for "What Makes for Good Visual Tokenizers for Large Language Models?".
Official codes for ConMIM (ICLR 2023)
Turning to Video for Transcript Sorting
Official code for "Binary embedding based retrieval at Tencent"
Structuring Hour-Long Videos into Navigable Chapters and Hierarchical Summaries
No description provided for this repository.
[IJCAI 2023] official implementation of the paper SGAT4PASS: Spherical Geometry-Aware Transformer for PAnoramic Semantic Segmentation
Official code for "pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation", ICML 2023.
BTS: A Bi-lingual Benchmark for Text Segmentation in the Wild
Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)
Codes for "Accelerating the Training of Video Super-Resolution"
Official code for "Dynamic Token Normalization Improves Vision Transformer", ICLR 2022.
OpenCompatible provides a standard compatible training benchmark, covering practical training scenarios.
No description provided for this repository.
[SIGGRAPH ASIA'25] BlobCtrl: Taming Controllable Blob for Element-level Image Editing
No description provided for this repository.
OmniScript: Towards Audio-Visual Script Generation for Long-Form Cinematic Video
Official code for the paper, "TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter".
Common template for PyTorch projects. Easy to extend and modify for new projects.
The code repo for the ACM MM paper: TransFusion: Multi-Modal Fusion for Video Tag Inference via Translation-based Knowledge Embedding.
Visualization of 3d and 2d components interactively.
No description provided for this repository.
No description provided for this repository.
Sculpt4D: Generating 4D Shapes via Sparse-Attention Diffusion Transformers. CVPR 2026
vllm for ARC-Hunyuan-Video-7B
No description provided for this repository.
A library for optical flow forward warping with backpropagation, using PyTorch.
TencentARC develops a variety of projects, mainly in image generation and manipulation. Its main repositories include GFPGAN and PhotoMaker, which reflect its research in this area.
TencentARC mainly uses Python, Jupyter Notebook, and JavaScript in its projects. These languages are central to its artificial intelligence and machine learning applications.
Yes, all TencentARC repositories are public. Anyone can explore the projects and contribute to their development, which increases transparency and collaboration in the developer community.