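IDEA-Research has a substantial public presence on GitHub and focuses on research in the digital economy. The organization's main programming languages are Python, Jupyter Notebook, and TypeScript. Its best-known projects, such as Grounded-Segment-Anything and GroundingDINO, are widely used in academia and industry and reflect its technical strength in visual recognition and object detection.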
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2
[ICLR 2023] Official implementation of the paper "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"
"Effective Whole-body Pose Estimation with Two-stages Distillation" (ICCV 2023, CV4Metaverse Workshop)
[ECCV 2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy
detrex is a research platform for DETR-based object detection, segmentation, pose estimation and other visual recognition tasks.
[CVPR 2023] Official implementation of the paper "Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation"
[CVPR 2026] Detect Anything via Next Point Prediction
A collection of papers on transformers for detection and segmentation. Awesome Detection Transformer for Computer Vision (CV)
DINO-X: The World's Top-Performing Vision Model for Open-World Object Detection and Understanding
Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series
[NeurIPS 2023] Official implementation of the paper "Motion-X: A Large-scale 3D Expressive Whole-body Human Motion Dataset"
[ECCV 2024] Official implementation of the paper "X-Pose: Detecting Any Keypoints"
[CVPR 2023] Official implementation of the paper "One-Stage 3D Whole-Body Mesh Recovery with Component Aware Transformer"
[ICCV 2023] Official implementation of the paper "A Simple Framework for Open-Vocabulary Segmentation and Detection"
[CVPR 2022 Oral] Official implementation of DN-DETR
[ICLR 2022] Official implementation of the paper "DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR"
[Arxiv 2024] MotionLLM: Understanding Human Behaviors from Human Motions and Videos
[ICML 2024] 🍅HumanTOMATO: Text-aligned Whole-body Motion Generation
[ICCV 2023] Official implementation of the paper "HumanSD: A Native Skeleton-Guided Diffusion Model for Human Image Generation"
[CVPR 2023] Official implementation of the paper "Human-Art: A Versatile Human-Centric Dataset Bridging Natural and Artificial Scenes"
[ECCV 2024 & NeurIPS 2024 & ICLR 2026] Official implementation of the paper TAPTR & TAPTRv2 & TAPTRv3
The Go-To Choice for CV Data Visualization, Annotation, and Model Analysis.
[ICCV 2023] Official implementation of the paper "Detection Transformer with Stable Matching"
Code for ChatRex: Taming Multimodal LLM for Joint Perception and Understanding
[CVPR 2023] Official implementation of the paper "Lite DETR: An Interleaved Multi-Scale Encoder for Efficient DETR"
[NeurIPS 2023] Official implementation of the paper "DreamWaltz: Make a Scene with Complex 3D Animatable Avatars".
[ICLR 2023] Official implementation of the paper "Explicit Box Detection Unifies End-to-End Multi-Person Pose Estimation"
[ICCV 2023] Official implementation of the paper "DFA3D: 3D Deformable Attention For 2D-to-3D Feature Lifting"
[ICCV 2025] Referring to any person or object given a natural language description. Code base for RexSeek and HumanRef Benchmark
[ICLR 2026] Rex-Thinker: Grounded Object Referring via Chain-of-Thought Reasoning
[CVPR 2023] Official implementation of the paper: MP-Former: Mask-Piloted Transformer for Image Segmentation
[CVPR 2026] Implementation of paper "SceneMaker: Open-set 3D Scene Generation with Decoupled De-occlusion and Pose Estimation Model"
Official DINO-X Model Context Protocol (MCP) server that empowers LLMs with real-world visual perception through image object detection, localization, and captioning APIs.
[ICCV 2023] Official implementation of the paper "Neural Interactive Keypoint Detection"
Official implementation of the paper "Boosting Human-Object Interaction Detection with Text-to-Image Diffusion Model"
[AAAI 2023] DQ-DETR: Dual Query Detection Transformer for Phrase Extraction and Grounding
Official PyTorch implementation of the paper "DisCo-CLIP: A Distributed Contrastive Loss for Memory Efficient CLIP Training".
Related code, checkpoints and project page for V-Reflection
[AAAI 2026] Official implementation of the paper "SegDINO3D: 3D Instance Segmentation Empowered by Both Image-Level and Object-Level 2D Features"
No description provided for this repository.
[ICLR 2024] Official implementation of the paper "Toss: High-quality text-guided novel view synthesis from a single image"
Implementation and checkpoints of Imagen, Google's text-to-image synthesis neural network, in PyTorch
[Arxiv 2024] MotionCLR: Motion Generation and Training-free Editing via Understanding Attention Mechanisms
Official implementation of the paper "SegVGGT: Joint 3D Reconstruction and Instance Segmentation from Multi-View Images"
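No description provided for this repository.
No description provided for this repository.
No description provided for this repository.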
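IDEA-Research mainly builds projects related to the digital economy and computer vision, with notable contributions in object detection and segmentation. Projects such as Grounded-Segment-Anything and GroundingDINO are widely used and reflect its technical depth in the field.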
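IDEA-Research's GitHub projects are written mainly in Python, Jupyter Notebook, and TypeScript. These languages support a varied set of research tools and applications, particularly for data processing and machine learning.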
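All of IDEA-Research's repositories are public, so other researchers and developers can freely access its code and resources, which supports knowledge sharing and technical exchange.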