LLM-Performance-Improvement-Paper

Overview

A paper list organized around the main paradigms for improving the performance of large language models (LLMs).

Paper List

[1] Scaling Laws

  • Scaling Laws for Neural Language Models
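
For orientation, scaling-law papers fit language-model loss as a power law in parameter count N and training tokens D. A minimal sketch of the Chinchilla-style parametric form (the constants E, A, B, α, β are fitted per training setup and are not reproduced here):

```latex
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
```

Here E is the irreducible loss, and the two power-law terms shrink as parameters or data grow, which is what makes compute-optimal trade-offs between N and D predictable.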

[2] Diversity & Quality

  • LIMA: Less Is More for Alignment
  • Textbooks Are All You Need

[3] Filter Model

  • Textbooks Are All You Need
  • GPT-3: Language Models are Few-Shot Learners
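
One way to read the filter-model paradigm (used in Textbooks Are All You Need, where GPT-4 provides the quality labels): train a small classifier on a labeled sample, then keep only the documents it scores highly. A minimal scikit-learn sketch; the corpus, labels, and threshold below are illustrative, not the papers' actual pipeline.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy corpus; in phi-1 the quality labels come from a GPT-4 annotator.
docs = [
    "def mean(xs): return sum(xs) / len(xs)",       # clean, instructive
    "tmp = 3  # TODO fix ???",                       # low-value snippet
    "Binary search halves the interval each step.",  # clear exposition
    "asdf qwer zxcv uiop",                           # noise
]
labels = [1, 0, 1, 0]  # 1 = keep, 0 = filter out

vectorizer = TfidfVectorizer()
classifier = LogisticRegression().fit(vectorizer.fit_transform(docs), labels)

candidates = ["Sorting halves the list, then merges the runs.", "qwer zxcv asdf"]
scores = classifier.predict_proba(vectorizer.transform(candidates))[:, 1]
kept = [doc for doc, s in zip(candidates, scores) if s > 0.5]
print(kept)  # only documents the classifier rates as high quality survive
```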

[4][5] Teacher Model & Zero-Shot Labelling

  • Textbooks Are All You Need
  • GPT Self-Supervision for a Better Data Annotator
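
The pattern these papers share: prompt a strong teacher model to label data zero-shot, then train a smaller student on the resulting silver labels. A minimal sketch; call_teacher is a hypothetical placeholder for a real LLM API call, and the sentiment task is only an example.

```python
LABELS = ("positive", "negative")

def call_teacher(prompt: str) -> str:
    """Hypothetical placeholder for a teacher-model API call; a real
    pipeline would send the prompt to the LLM and return its completion."""
    return "positive"

def zero_shot_label(text: str) -> str:
    # Zero-shot: the prompt states the task; no labeled examples are shown.
    prompt = (
        f"Classify the sentiment of this review as {LABELS[0]} or {LABELS[1]}.\n"
        f"Review: {text}\nSentiment:"
    )
    answer = call_teacher(prompt).strip().lower()
    return answer if answer in LABELS else "unknown"  # reject malformed output

# Teacher-labeled ("silver") pairs would then train a smaller student model.
silver = [(t, zero_shot_label(t)) for t in ["Great phone, arrived fast."]]
print(silver)
```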

[6] Reward Model

  • Let's Verify Step by Step
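
For context, outcome reward models in this line of work are commonly trained on preference pairs with a Bradley-Terry objective (Let's Verify Step by Step extends the idea to per-step process supervision). A minimal PyTorch sketch, assuming the scalar rewards have already been produced by the model's reward head:

```python
import torch
import torch.nn.functional as F

def pairwise_rm_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    # Bradley-Terry objective: maximize P(chosen beats rejected)
    # = sigmoid(r_chosen - r_rejected), i.e. minimize its negative log.
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# Toy scalar rewards for two (chosen, rejected) response pairs.
loss = pairwise_rm_loss(torch.tensor([1.2, 0.3]), torch.tensor([0.4, 0.9]))
print(loss.item())
```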

[7] RLHF or DPO

  • Training language models to follow instructions with human feedback
  • Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  • Scaling Laws for Reward Model Overoptimization
  • AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback
  • RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment
  • RRHF: Rank Responses to Align Language Models with Human Feedback without tears
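
A minimal PyTorch sketch of the DPO objective from the paper listed above, assuming each argument is a per-response log-probability already summed over tokens under the policy or the frozen reference model:

```python
import torch
import torch.nn.functional as F

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO objective: each tensor holds per-response log-probabilities,
    summed over tokens; beta scales the implicit KL penalty."""
    logits = beta * ((pi_chosen - pi_rejected) - (ref_chosen - ref_rejected))
    # Train the policy to prefer the chosen response more strongly
    # than the frozen reference model does.
    return -F.logsigmoid(logits).mean()

loss = dpo_loss(torch.tensor([-4.0]), torch.tensor([-6.0]),
                torch.tensor([-5.0]), torch.tensor([-5.5]))
print(loss.item())
```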

[8] Instruction Understanding & Prompt-Answer Matching

  • Training language models to follow instructions with human feedback
  • LIMA: Less Is More for Alignment
  • The Flan Collection: Designing Data and Methods for Effective Instruction Tuning
  • Scaling Instruction-Finetuned Language Models

[9] Self-Instruct from GPT

  • SELF-INSTRUCT: Aligning Language Models with Self-Generated Instructions
  • WizardLM: Empowering Large Language Models to Follow Complex Instructions
  • WizardCoder: Empowering Code Large Language Models with Evol-Instruct
  • Think Outside the Code: Brainstorming Boosts Large Language Models in Code Generation
  • Alpaca: A Strong, Replicable Instruction-Following Model
  • How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources
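
The recipe these papers share: bootstrap new instructions from a seed pool with an LLM, filter near-duplicates, and fine-tune on the result. A dependency-free sketch; generate_instructions is a hypothetical placeholder for the LLM call, and SequenceMatcher stands in for the ROUGE-L filter SELF-INSTRUCT actually uses.

```python
import random
from difflib import SequenceMatcher

def generate_instructions(prompt: str) -> list[str]:
    """Hypothetical placeholder for an LLM completion; SELF-INSTRUCT
    prompts GPT-3 with sampled seed tasks and parses new instructions."""
    return ["Summarize the following article in one sentence."]

def too_similar(candidate: str, pool: list[str], threshold: float = 0.7) -> bool:
    # SELF-INSTRUCT drops candidates whose ROUGE-L overlap with the pool
    # is too high; SequenceMatcher approximates that filter here.
    return any(SequenceMatcher(None, candidate, kept).ratio() > threshold
               for kept in pool)

pool = [
    "Translate the sentence into French.",
    "List three causes of the French Revolution.",
]
for _ in range(3):  # each round tries to grow the instruction pool
    examples = random.sample(pool, k=min(2, len(pool)))
    prompt = "Come up with a new task:\n" + "\n".join(examples)
    for candidate in generate_instructions(prompt):
        if not too_similar(candidate, pool):
            pool.append(candidate)
print(pool)  # the grown pool would be turned into instruction-tuning data
```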

[10] Principle-Driven Alignment

  • Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision
