Please @showgood880702, I need this dataset as a text2text .json file to understand the structure correctly.

>  {> 
> "input": "###Instruction: ....\n\n###human: ....\n\n###chatbot: ....\n\n###human: ....\n\n###chatbot: ....\n\n###human: .....\n\n###chatbot:",> 
> > 
> "output": ".....###"> 
> }> 
> > 
> Thank you very much for the explanation. > 
> I am still a little confused about the training data structure for a chatbot. For example, here I have a multi-round conversation used as training data. Should I feed it to the model as I showed before, with the end_mark and the end? > 
> > 
>  {"input": "###Instruction: ....\n\n###human: ....\n\n###:chatbot:", "output": ".....###"}> 
>  {"input": "###Instruction: ....\n\n###human: ....\n\n###:chatbot:", "output": ".....###"}> 
>  {"input": "###Instruction: ....\n\n###human: ....\n\n###:chatbot:", "output": ".....###"}> 
> > 
> or should I split them as pairs of <input> and <output> as different instances, and start with the instruction? 

 _Originally posted by @showgood880702 in [#357](https://github.com/OptimalScale/LMFlow/issues/357#issuecomment-1537383616)_
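
To make the question concrete, here is a minimal sketch of how one multi-round conversation could be turned into instances where each `input` carries the full history, as in the first quoted example. The helper name `build_instances`, the sample turns, and the single space placed after `###chatbot:` in the history are my own assumptions. The outer `{"type": "text2text", "instances": [...]}` wrapper is how I understand LMFlow's text2text layout, so please check it against the LMFlow dataset documentation before relying on it.

```python
import json


def build_instances(instruction, turns, end_mark="###"):
    """Build one {"input", "output"} instance per chatbot reply.

    `turns` is a list of (human, chatbot) pairs. Each input holds the
    instruction plus every earlier turn, and stops at "###chatbot:".
    The function name and arguments are hypothetical, not an LMFlow API.
    """
    instances = []
    history = f"###Instruction: {instruction}"
    for human, bot in turns:
        prompt = f"{history}\n\n###human: {human}\n\n###chatbot:"
        # The output is only the reply to predict, closed by the end mark.
        instances.append({"input": prompt, "output": f"{bot}{end_mark}"})
        # Append the reply so the next round sees it as context.
        history = f"{prompt} {bot}"
    return instances


# Made-up sample conversation, used only to show the resulting shape.
turns = [("Hi", "Hello!"), ("How are you?", "Fine.")]

# Assumed wrapper layout for a text2text dataset file.
dataset = {
    "type": "text2text",
    "instances": build_instances("Be helpful.", turns),
}

# Serialized form, ready to be written to a .json file.
dataset_json = json.dumps(dataset, indent=2, ensure_ascii=False)
```

With this layout a conversation of N rounds gives N instances, and the instruction appears once at the start of every `input`. Whether this or the split-pairs variant trains better is exactly the open question in the quoted comment, so the sketch does not settle it.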


Please @showgood880702, I need this dataset as a text2text .json file to understand the structure correctly #941
