YAYI 2 is the new generation of open-source large language models developed by 中科闻歌 (Wenge Technology), pretrained on more than 2 trillion tokens of high-quality, multilingual corpora. (Repo for YaYi 2 Chinese LLMs)
Foundation Architecture for (M)LLMs
A curated list of pretrained sentence and word embedding models
An optimized deep prompt tuning strategy comparable to fine-tuning across scales and tasks
A plug-and-play library for parameter-efficient-tuning (Delta Tuning)
Summarization Papers
Chinese legal LLaMA (LLaMA for the Chinese legal domain)
word2vec, sentence2vec, machine reading comprehension, dialog systems, text classification, pretrained language models (e.g., XLNet, BERT, ELMo, GPT), sequence labeling, information retrieval, information extraction (e.g., entity, relation, and event extraction), knowledge graphs, text generation, network embedding
Code associated with the Don't Stop Pretraining ACL 2020 paper
Live Training for Open-source Big Models
Papers and Datasets on Instruction Tuning and Following. ✨✨✨
ACL'2023: DiffusionBERT: Improving Generative Masked Language Models with Diffusion Models
MWPToolkit is an open-source framework for math word problem (MWP) solvers.
[NeurIPS 2023] Code for the paper "Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias".
Worth-reading papers and related resources on attention mechanisms, Transformers, and pretrained language models (PLMs) such as BERT.
EMNLP'23 survey: a curated collection of papers and resources on refreshing large language models (LLMs) without expensive retraining.
[NeurIPS 2021] COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining
BERT4ETH: A Pre-trained Transformer for Ethereum Fraud Detection (WWW23)
On Transferability of Prompt Tuning for Natural Language Processing
Bamboo-7B Large Language Model
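
The deep prompt tuning entry above (comparable to fine-tuning across scales and tasks) trains only a small set of continuous prompt vectors while the backbone model stays frozen. As a rough illustration of the underlying idea, here is a minimal PyTorch sketch of the shallow soft-prompt variant, which prepends trainable vectors to the frozen model's input embeddings; the deep variant additionally injects prompts at every transformer layer. `SoftPromptModel` and all names below are illustrative assumptions, not code from any of the linked repos.

```python
import torch
import torch.nn as nn

class SoftPromptModel(nn.Module):
    """Freeze a backbone LM; train only a prepended soft prompt (illustrative)."""

    def __init__(self, base_model, num_prompt_tokens=20):
        super().__init__()
        self.base_model = base_model
        for p in self.base_model.parameters():
            p.requires_grad = False  # freeze every backbone weight

        hidden = base_model.get_input_embeddings().embedding_dim
        # The only trainable parameters: continuous prompt vectors.
        self.prompt = nn.Parameter(torch.randn(num_prompt_tokens, hidden) * 0.02)

    def forward(self, input_ids, attention_mask):
        embeds = self.base_model.get_input_embeddings()(input_ids)
        batch = embeds.size(0)

        # Prepend the (batch-expanded) soft prompt to the token embeddings.
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        inputs_embeds = torch.cat([prompt, embeds], dim=1)

        # Extend the attention mask to cover the prompt positions.
        prompt_mask = torch.ones(
            batch, self.prompt.size(0),
            dtype=attention_mask.dtype, device=attention_mask.device,
        )
        attention_mask = torch.cat([prompt_mask, attention_mask], dim=1)

        return self.base_model(inputs_embeds=inputs_embeds,
                               attention_mask=attention_mask)
```

At training time only `model.prompt` receives gradients, so the optimizer is built over a handful of vectors rather than the full model, which is what makes this family of methods parameter-efficient.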