RepoGuard
Tongyi Lab, Alibaba Group

Organization

Public GitHub footprint of Tongyi Lab, Alibaba Group

@Alibaba-NLP

Our team at Tongyi Lab is dedicated to pioneering advancements in AI search technologies.

China

43

Public repositories

25,446

Total stars

1,656

Followers

Alibaba-NLP, part of Tongyi Lab at Alibaba Group, is actively contributing to the open-source community on GitHub. The organization focuses on AI search technologies, with primary repositories developed in Python, including notable projects like DeepResearch and ZeroSearch, which address advanced research and search capabilities in AI.

Top languages

Python 34

Public repositories

DeepResearch

19,374

Tongyi Deep Research, the Leading Open-source Deep Research Agent

Python
Updated Jun 13, 2026

ZeroSearch

1,291

ZeroSearch: Incentivize the Search Capability of LLMs without Searching

Python
Updated Jun 13, 2026

VRAG

947

Multimodal Retrieval-augmented Generation Framework Built by Tongyi Lab, Alibaba Group.

Python
Updated Jun 12, 2026

ViDoRAG

664

[EMNLP 2025] ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents

Python
Updated Jun 11, 2026

OmniSearch

430

Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent

Python
Updated Jun 11, 2026

ACE

313

[ACL-IJCNLP 2021] Automated Concatenation of Embeddings for Structured Prediction

Python
Updated Jun 1, 2026

CHRONOS

300

Repo for NAACL 2025 Paper "Unfolding the Headline: Iterative Self-Questioning for News Retrieval and Timeline Summarization"

Python
Updated Jun 12, 2026

EcomGPT

275

An Instruction-tuned Large Language Model for E-commerce

Python
Updated Jun 12, 2026

qqr

254

qqr is an RL training framework for open-ended agents.

Python
Updated Jun 10, 2026

HiAGM

230

Hierarchy-Aware Global Model for Hierarchical Text Classification

Python
Updated Jun 1, 2026

SeqGPT

227

SeqGPT: An Out-of-the-box Large Language Model for Open Domain Sequence Understanding

Python
Updated Jun 1, 2026

Multi-CPR

206

[SIGIR 2022] Multi-CPR: A Multi Domain Chinese Dataset for Passage Retrieval

Python
Updated Jun 1, 2026

KB-NER

186

Winning system (DAMO-NLP) of the SemEval 2022 MultiCoNER shared task in 10 out of 13 tracks.

Python
Updated May 22, 2026

MaskSearch

155

Repo for "MaskSearch: A Universal Pre-Training Framework to Enhance Agentic Search Capability"

Python
Updated Jun 6, 2026

CLNER

93

[ACL-IJCNLP 2021] Improving Named Entity Recognition by External Context Retrieving and Cooperative Learning

Python
Updated May 19, 2026

MultilangStructureKD

74

[ACL 2020] Structure-Level Knowledge Distillation For Multilingual Sequence Labeling

Python
Updated Jun 1, 2026

E2Rank

57

E2Rank: Your Text Embedding can Also be an Effective and Efficient Listwise Reranker

Python
Updated Jun 10, 2026

LaRA

51

The code for LaRA Benchmark

Python
Updated Jun 8, 2026

CoFE-RAG

45

No description provided for this repository.

Python
Updated Jun 7, 2026

RankingGPT

35

Code for the paper "RankingGPT: Empowering Large Language Models in Text Ranking with Progressive Enhancement"

Python
Updated Apr 9, 2026

ProtoRE

32

Code for 'Prototypical Representation Learning for Relation Extraction'.

Python
Updated Jun 1, 2026

MuVER

32

[EMNLP 2021] MuVER: Improving First-Stage Entity Retrieval with Multi-View Entity Representations

Python
Updated Apr 9, 2026

AISHELL-NER

25

[ICASSP 2022] AISHELL-NER: Named Entity Recognition from Chinese Speech

Unknown Language
Updated Jan 4, 2026

DAAT-CWS

23

Coupling Distant Annotation and Adversarial Training for Cross-Domain Chinese Word Segmentation

Python
Updated Jun 1, 2026

MANNER

20

[ACL 2023] MANNER: A Variational Memory-Augmented Model for Cross Domain Few-Shot Named Entity Recognition

Python
Updated Jun 1, 2026

HLATR

20

Hybrid List Aware Transformer Reranking

Unknown Language
Updated Apr 9, 2026

AIN

20

Code for our EMNLP 2020 Paper "AIN: Fast and Accurate Sequence Labeling with Approximate Inference Network"

Python
Updated Apr 9, 2026

CDQA

18

CDQA: Chinese Dynamic Question Answering Benchmark

Python
Updated Apr 9, 2026

EBM-Net

14

Codes for the EMNLP'2020 paper "Predicting Clinical Trial Results by Implicit Evidence Integration".

Python
Updated Nov 27, 2024

StructuralKD

11

[ACL-IJCNLP 2021] Structural Knowledge Distillation: Tractably Distilling Information for Structured Predictor

Python
Updated Jun 1, 2026

WebDetective

7

A new evaluation paradigm for deep search that identifies specific LLM failure sources, introduces challenging hint-free datasets with holistic evaluation, and offers a strong baseline incorporating memory and verification.

Python
Updated Jun 1, 2026

Vec-RA-ODQA

6

Source code of the paper "Improving Retrieval Augmented Open-Domain Question-Answering with Vectorized Contexts"

Python
Updated Jun 1, 2026

IBKD

3

This is the official repository for the IBKD knowledge distillation method, as described in the paper .

Python
Updated Jun 1, 2026

MarCo-Dialog

3

No description provided for this repository.

Python
Updated Mar 17, 2022

VLLM-KB

2

[EMNLP 2025] Code for "Detecting Knowledge Boundary of Vision Large Language Models by Sampling-Based Inference"

Python
Updated Apr 9, 2026

Key-Point-Analysis

1

No description provided for this repository.

Python
Updated Aug 29, 2024

Gumbel-CRF

1

Implementation of NeurIPS 20 paper: Latent Template Induction with Gumbel-CRFs

Unknown Language
Updated Mar 24, 2024

Partially-Observed-TreeCRFs

1

Implementation of AAAI 21 paper: Nested Named Entity Recognition with Partially Observed TreeCRFs

Unknown Language
Updated Feb 28, 2023

hilichurl

0

No description provided for this repository.

Unknown Language
Updated Jan 13, 2026

Triaffine-nested-ner

0

[ACL 2022 Findings] Fusing Heterogeneous Factors with Triaffine Mechanism for Nested Named Entity Recognition

Unknown Language
Updated May 1, 2022

ICD-MSMN

0

[ACL 2022] Code Synonyms Do Matter: Multiple Synonyms Matching Network for Automatic ICD Coding

Unknown Language
Updated Apr 29, 2022

Alibaba-TREC-PM

0

Codes and data for Alibaba's winning systems at the TREC Precision Medicine Track 2020.

Unknown Language
Updated Aug 28, 2021

PoincareProbe

0

Implementation of ICLR 21 paper: Probing BERT in Hyperbolic Spaces

Unknown Language
Updated Apr 7, 2021

Frequently asked questions

What does Alibaba-NLP build on GitHub?

Alibaba-NLP builds various tools and frameworks focused on AI search technologies. Key repositories include DeepResearch, which is an open-source deep research agent, and ZeroSearch, aimed at enhancing the search capabilities of large language models.

Which programming languages does Alibaba-NLP use?

Alibaba-NLP primarily uses Python. Of its 43 public repositories, 34 list Python as their language; the remaining 9 have no detected language.

Are Alibaba-NLP's repositories public?

Yes, Alibaba-NLP's repositories are public on GitHub. This openness allows collaboration and engagement with the broader development community, fostering advancements in AI search technologies and other related fields.
