Updated 10 h ago

Organization

THUNLP's public GitHub footprint

@thunlp

Natural Language Processing Lab at Tsinghua University

FIT Building, Tsinghua University, Beijing

269

Public repositories

83,916

Total stars

3,402

Followers

THUNLP is the Natural Language Processing Lab at Tsinghua University in Beijing, and it maintains a range of public repositories on GitHub. Its main programming languages are Python, C++, TeX, Java, JavaScript, and HTML, and its notable projects include GNNPapers and WantWords.

Top languages

Python 74 · C++ 4 · TeX 3 · Java 2 · JavaScript 1 · HTML 1 · C 1 · TypeScript 1

Public repositories

GNNPapers

★16,792

Must-read papers on graph neural networks (GNN)

Unknown language

Updated June 13, 2026

WantWords

★7,109

An open-source online reverse dictionary.

JavaScript

Updated June 12, 2026

OpenPrompt

★4,877

An Open-Source Framework for Prompt-Learning.

Python

Updated June 11, 2026

OpenNRE

★4,466

An Open-Source Package for Neural Relation Extraction (NRE)

Python

Updated June 10, 2026

PromptPapers

★4,315

Must-read papers on prompt-based tuning for pre-trained language models.

Unknown language

Updated June 7, 2026

OpenKE

★4,040

An Open-Source Package for Knowledge Embedding (KE)

Python

Updated June 11, 2026

PLMpapers

★3,362

Must-read Papers on pre-trained language models.

Unknown language

Updated June 8, 2026

UltraChat

★2,864

Large-scale, Informative, and Diverse Multi-round Chat Data (and Models)

Python

Updated June 13, 2026

NRLPapers

★2,517

Must-read papers on network representation learning (NRL) / network embedding (NE)

TeX

Updated June 10, 2026

THULAC-Python

★2,087

An Efficient Lexical Analyzer for Chinese

Python

Updated June 9, 2026

OpenNE

★1,705

An Open-Source Package for Network Embedding (NE)

Python

Updated May 26, 2026

TAADpapers

★1,574

Must-read Papers on Textual Adversarial Attack and Defense

Python

Updated May 20, 2026

KRLPapers

★1,525

Must-read papers on knowledge representation learning (KRL) / knowledge embedding (KE)

TeX

Updated May 21, 2026

KB2E

★1,423

Knowledge Graph Embeddings including TransE, TransH, TransR and PTransE

C++

Updated May 30, 2026

ERNIE

★1,420

Source code and dataset for ACL 2019 paper "ERNIE: Enhanced Language Representation with Informative Entities"

Python

Updated May 26, 2026

THUOCL

★1,079

THUOCL (THU Open Chinese Lexicon), a Chinese lexicon

Unknown language

Updated June 13, 2026

OpenDelta

★1,045

A plug-and-play library for parameter-efficient-tuning (Delta Tuning)

Python

Updated May 26, 2026

NREPapers

★1,030

Must-read papers on neural relation extraction (NRE)

TeX

Updated June 2, 2026

OpenCLaP

★984

Open Chinese Language Pre-trained Model Zoo

Unknown language

Updated May 8, 2026

ToolLearningPapers

★922

No description provided for this repository.

Unknown language

Updated June 2, 2026

WebCPM

★911

Official codes for ACL 2023 paper "WebCPM: Interactive Web Search for Chinese Long-form Question Answering"

HTML

Updated May 31, 2026

RCPapers

★889

Must-read papers on Machine Reading Comprehension

Unknown language

Updated May 26, 2026

LLMxMapReduce

★875

No description provided for this repository.

Python

Updated June 9, 2026

THULAC

★832

An Efficient Lexical Analyzer for Chinese

C++

Updated June 1, 2026

Chinese_Rumor_Dataset

★782

Chinese rumor data

Unknown language

Updated June 1, 2026

OpenAttack

★777

An Open-Source Package for Textual Adversarial Attack.

Python

Updated June 8, 2026

FewRel

★746

A Large-Scale Few-Shot Relation Extraction Dataset

Python

Updated May 26, 2026

OPD

★654

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Python

Updated June 13, 2026

DocRED

★652

Dataset and codes for ACL 2019 DocRED: A Large-Scale Document-Level Relation Extraction Dataset.

Python

Updated June 2, 2026

OpenHowNet

★637

Core Data of HowNet and OpenHowNet Python API

Python

Updated May 29, 2026

ProactiveAgent

★609

An LLM-based agent that predicts its tasks proactively.

Python

Updated June 12, 2026

TensorFlow-TransX

★513

An implementation of TransE and its extended models for Knowledge Representation Learning on TensorFlow

Python

Updated May 6, 2026

CAIL

★509

Chinese AI & Law Challenge

Unknown language

Updated June 9, 2026

LegalPapers

★498

Must-read Papers on Legal Intelligence

Unknown language

Updated May 25, 2026

BERT-KPE

★447

No description provided for this repository.

Python

Updated May 24, 2026

OpenMatch

★442

An Open-Source Package for Information Retrieval.

Python

Updated May 24, 2026

LLaVA-UHD

★424

LLaVA-UHD v3: Progressive Visual Compression for Efficient Native-Resolution Encoding in MLLMs

Python

Updated June 11, 2026

Fast-TransX

★405

An Efficient implementation of TransE and its extended models for Knowledge Representation Learning

C++

Updated June 6, 2026

InfLLM

★404

The code of our paper "InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory"

Python

Updated June 9, 2026

Few-NERD

★400

Code and data of ACL 2021 paper "Few-NERD: A Few-shot Named Entity Recognition Dataset"

Python

Updated May 26, 2026

TensorFlow-Summarization

★386

No description provided for this repository.

Python

Updated June 12, 2026

BMCourse

★371

The repo for Tsinghua summer course: Interdisciplinary Seminar on Big Models

Python

Updated May 20, 2026

LEGENT

★341

Open Platform for Embodied Agents

Python

Updated June 2, 2026

THULAC-Java

★339

An Efficient Lexical Analyzer for Chinese

Java

Updated June 6, 2026

ChatEval

★335

Codes for our paper "ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate"

Python

Updated June 5, 2026

NSC

★287

Neural Sentiment Classification

Python

Updated May 8, 2026

DeltaPapers

★284

Must-read Papers of Parameter-Efficient Tuning (Delta Tuning) Methods on Pre-trained Models.

Unknown language

Updated June 9, 2026

JustRL

★276

[ICLR 2026 Blogpost Track Poster] JustRL: Scaling a 1.5B LLM with a Simple RL Recipe

Python

Updated June 11, 2026

PL-Marker

★272

Source code for "Packed Levitated Marker for Entity and Relation Extraction"

Python

Updated June 10, 2026

OpenBackdoor

★209

An open-source toolkit for textual backdoor attack and defense (NeurIPS 2022 D&B, Spotlight)

Python

Updated May 18, 2026

SE-WRL

★196

Improved Word Representation Learning with Sememes

Updated May 8, 2026

LegalPLMs

★194

Source code and checkpoints for legal pre-trained language models.

Python

Updated June 5, 2026

Auto_CLIWC

★168

Code for Chinese LIWC Lexicon Expansion via Hierarchical Classification of Word Embeddings with Sememe Attention (AAAI18)

Python

Updated May 26, 2026

DeepNote

★134

No description provided for this repository.

Python

Updated June 7, 2026

TritonBench

★133

TritonBench: Benchmarking Large Language Model Capabilities for Generating Triton Operators

Python

Updated June 8, 2026

attribute_charge

★132

The source code of our COLING'18 paper "Few-Shot Charge Prediction with Discriminative Legal Attributes".

Python

Updated May 1, 2026

LEVEN

★123

Source code and dataset for ACL2022 Findings Paper "LEVEN: A Large-Scale Chinese Legal Event Detection dataset"

Python

Updated June 12, 2026

Ouroboros

★117

Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main)

Python

Updated May 26, 2026

MatPlotAgent

★115

No description provided for this repository.

Python

Updated June 8, 2026

MultiRD

★110

Code and data of the AAAI-20 paper "Multi-channel Reverse Dictionary Model"

Python

Updated May 6, 2026

GEAR

★100

Source code for ACL 2019 paper "GEAR: Graph-based Evidence Aggregating and Reasoning for Fact Verification"

Python

Updated May 24, 2026

TopJudge

★100

No description provided for this repository.

Python

Updated April 30, 2026

Prompt-Transferability

★99

On Transferability of Prompt Tuning for Natural Language Processing

Python

Updated May 26, 2026

KV-PLM

★89

Source code for "A Deep-learning System Bridging Molecule Structure and Biomedical Text with Comprehension Comparable to Human Professionals"

Python

Updated June 3, 2026

DebugBench

★86

The repository for the paper "DebugBench: Evaluating Debugging Capability of Large Language Models".

Python

Updated May 21, 2026

ChartCoder

★79

[ACL'25 Main] ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation

Python

Updated April 28, 2026

Advbench

★77

Code and data of the EMNLP 2022 paper "Why Should Adversarial Perturbations be Imperceptible? Rethink the Research Paradigm in Adversarial NLP".

Python

Updated May 6, 2026

NeuIRPapers

★74

Must-read Papers on Neural Information Retrieval

Unknown language

Updated May 29, 2026

MMDW

★73

Max-margin DeepWalk

Java

Updated May 6, 2026

Optima

★72

Code for paper "Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System"

Python

Updated May 8, 2026

KARL

★68

KARL: Knowledge-Aware Reasoning and Reinforcement Learning for Knowledge-Intensive Visual Grounding

Python

Updated May 14, 2026

CorefBERT

★67

Source code for EMNLP 2020 paper "Coreferential Reasoning Learning for Language Representation"

Python

Updated May 24, 2026

H-Neurons

★66

The official implementation of the paper: H-Neurons: On the Existence, Impact, and Origin of Hallucination-Associated Neurons in LLMs

Python

Updated June 9, 2026

Adaptive-Note

★60

No description provided for this repository.

Python

Updated June 7, 2026

Delta-CoMe

★59

Delta-CoMe can achieve near-lossless 1-bit compression. The paper has been accepted by NeurIPS 2024.

Python

Updated May 2, 2026

EmbodiedEval

★58

Evaluate Multimodal LLMs as Embodied Agents

Python

Updated June 11, 2026

FR-Spec

★55

[ACL 2025 main] FR-Spec: Frequency-Ranked Speculative Sampling

C++

Updated May 29, 2026

duplex-model

★46

No description provided for this repository.

TypeScript

Updated June 2, 2026

HiddenKiller

★45

Code and data of the ACL-IJCNLP 2021 paper "Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger"

Python

Updated May 7, 2026

SubCharTokenization

★45

No description provided for this repository.

Python

Updated April 30, 2026

VERNet

★42

Source codes of Neural Quality Estimation with Multiple Hypotheses for Grammatical Error Correction

Python

Updated May 29, 2026

EmbodiedAIxLLMPapers

★38

Papers on integrating large language models with embodied AI

Unknown language

Updated May 2, 2026

Seq1F1B

★37

Sequence-level 1F1B schedule for LLMs.

Python

Updated April 23, 2026

hybrid-linear-attention

★36

Code and models for the paper: Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts

Python

Updated May 20, 2026

SparsingLaw

★32

The open-source materials for paper "Sparsing Law: Towards Large Language Models with Greater Activation Sparsity".

Python

Updated June 9, 2026

explore-and-evaluate

★31

Code for EMNLP2020 paper "Exploring and Evaluating Attributes, Values, and Structures for Entity Alignment".

Python

Updated May 3, 2026

CokeBERT

★30

CokeBERT: Contextual Knowledge Selection and Embedding towards Enhanced Pre-Trained Language Models

Python

Updated May 24, 2026

Model_Emotion

★27

Neuron Activation

Python

Updated May 1, 2026

LoRAFlow

★25

ACL 2024: LoRA-Flow Dynamic LoRA Fusion for Large Language Models in Generative Tasks

Python

Updated May 12, 2026

VisualDS

★24

No description provided for this repository.

Python

Updated April 28, 2026

KG-Infused-RAG

★23

Official implementation for the paper "KG-Infused RAG: Augmenting Corpus-Based RAG with External Knowledge Graphs"

Python

Updated June 1, 2026

SchemaReinforcementLearning

★23

Learning to Generate STRUCTURED Output with Schema Reinforcement Learning

Python

Updated April 28, 2026

NOSA

★17

The official implementation of NOSA

Python

Updated June 11, 2026

hyperbolic_llm

★13

No description provided for this repository.

Python

Updated June 9, 2026

ClueAnchor

★12

[EMNLP 2025 Findings] ClueAnchor: Clue-Anchored Knowledge Reasoning Exploration and Optimization for Retrieval-Augmented Generation

Python

Updated June 6, 2026

Chujian

★12

A large-scale dataset of Chu bamboo slip scripts and a multi-granularity tokenizer for ancient Chinese scripts

Python

Updated May 27, 2026

SMP

★8

Single-Shot Meta-Pruning (SMP) for attention heads of Transformers

Python

Updated April 29, 2026

DECO

★2

Source code for paper "DECO: Sparse Mixture-of-Experts with Dense-Comparable Performance on End-Side Devices".

Python

Updated May 23, 2026

CPMobius

★1

No description provided for this repository.

Python

Updated May 14, 2026

LexRel

★1

No description provided for this repository.

Python

Updated May 7, 2026

Frequently asked questions

What does thunlp develop on GitHub?

thunlp develops a variety of projects related to natural language processing. Notable ones include GNNPapers and WantWords, both of which focus on research and practical use.

Which programming languages does thunlp use?

thunlp's repositories are written mainly in Python, with others in C++, TeX, Java, JavaScript, and HTML. These languages suit a range of research projects.

Are thunlp's repositories public?

Yes. All of the thunlp repositories listed here are public, so researchers and developers can easily access and contribute to them.
