The aim is to eliminate the mechanical tone characteristic of machine-generated writing, and thereby to stand in for LLMs at the hundred-billion-parameter scale.


tardigrade2017/faker_writer


Faker Writer

A lightweight LLM for fine-tuning ultra-large LLMs.

Why did we choose to develop this project?

While ultra-large LLMs such as ChatGPT, Tongyi Qianwen, Wenxin Yiyuan, Kimi, and Baichuan excel at following complex instructions and grasping the nuances of human language, their output often carries a recognizable signature, arising in part from alignment training, that marks the text as machine-generated rather than human-written. Yet in applications such as creative writing, casual conversation, and role-play, there is a pressing need for these models to produce language that is not only natural but convincingly indistinguishable from human authorship.

Fine-tuning techniques for large models do offer a way to meet diverse, specialized requirements, but they come with substantial drawbacks: high cost, technical complexity, and the risk of catastrophic forgetting, in which the model loses previously learned knowledge during customization. The financial outlay and engineering burden involved put fine-tuning largely out of reach for small companies, individual developers, and non-profit organizations.

Therefore, developing a lightweight model capable of fine-tuning ChatGPT is of particular importance.

Data

The dataset comprises hundreds of thousands of entries collected from the Chinese Internet. We have successfully used this data to correct Wenxin Yiyuan's linguistic patterns through fine-tuning.
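The repository does not document the entry format, but for a corpus like this a natural shape is a JSONL file pairing a stiff, machine-sounding draft with a human-style rewrite. As a minimal sketch (the file layout and the field names `draft` and `rewrite` are assumptions, not documented here), loading such pairs for supervised fine-tuning might look like:

```python
import io
import json

# Hypothetical JSONL format: one record per line, pairing a
# machine-sounding draft with its natural human-style rewrite.
# The field names "draft" and "rewrite" are illustrative assumptions.
SAMPLE_JSONL = """\
{"draft": "It is of paramount importance to note that rain may occur.", "rewrite": "Heads up, it might rain."}
{"draft": "I would be delighted to assist you with this endeavor.", "rewrite": "Happy to help."}
"""

def load_pairs(fp):
    """Parse JSONL lines into (draft, rewrite) tuples, skipping blanks."""
    pairs = []
    for line in fp:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        pairs.append((record["draft"], record["rewrite"]))
    return pairs

pairs = load_pairs(io.StringIO(SAMPLE_JSONL))
```

Each `(draft, rewrite)` pair could then be serialized into whatever prompt/completion format the target model's fine-tuning pipeline expects.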

example
