SENTENCE RECONSTRUCTION

The purpose of this project is to take as input a sequence of words corresponding to a random permutation of a given English sentence, and to reconstruct the original sentence.
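As an illustration of the task, a training pair can be built by shuffling the words of a sentence. This is a minimal sketch, not the notebook's actual preprocessing: the helper name `make_example`, the whitespace tokenization, and the sample sentence are all assumptions made for the example.

```python
import random

def make_example(sentence, rng):
    """Return (shuffled_words, original_words) for one sentence.

    Whitespace tokenization is an assumption; the project may tokenize differently.
    """
    original = sentence.split()
    shuffled = original[:]      # copy, so the target order is preserved
    rng.shuffle(shuffled)       # random permutation used as the model input
    return shuffled, original

rng = random.Random(0)          # fixed seed only to make the sketch repeatable
shuffled, original = make_example("dogs are loyal animals", rng)
```

The model receives `shuffled` and is trained to produce `original`; both lists always contain exactly the same words.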

The output can be produced either in a single shot or through an iterative (autoregressive) loop that generates a single token at a time.
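The autoregressive option can be sketched as a greedy loop that appends one token per step until an end token is chosen. Everything model-related here is hypothetical: `score_next` stands in for a trained network, and `toy_score_next` with its hand-written bigram table exists only so the loop can be run.

```python
START, END = "<start>", "<end>"

def greedy_decode(shuffled, score_next, max_len=32):
    """Generate one token at a time, always taking the highest-scoring token."""
    prefix = [START]
    while len(prefix) < max_len:
        scores = score_next(shuffled, prefix)   # dict: token -> score
        token = max(scores, key=scores.get)     # greedy choice, no beam search
        if token == END:
            break
        prefix.append(token)
    return prefix[1:]                           # drop the start token

def toy_score_next(shuffled, prefix):
    """Hypothetical stand-in for a trained model: a fixed bigram table."""
    bigrams = {
        (START, "dogs"): 1.0,
        ("dogs", "chase"): 1.0,
        ("chase", "cats"): 1.0,
        ("cats", END): 1.0,
    }
    candidates = set(shuffled) | {END}
    return {tok: bigrams.get((prefix[-1], tok), 0.0) for tok in candidates}

decoded = greedy_decode(["cats", "dogs", "chase"], toy_score_next)
print(decoded)  # ['dogs', 'chase', 'cats']
```

In a single-shot setup the loop disappears and the model emits all output positions in one forward pass.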

Constraints:

No pretrained model can be used.
The neural network models should have fewer than 20M parameters.
No postprocessing should be done (e.g. no beam search).
You cannot use additional training data.
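To get a feel for the 20M-parameter limit, a rough budget check can be done by hand before building a model. The layer sizes below (a 10,000-token vocabulary, 128-dimensional embeddings, one output projection) are hypothetical and are not taken from the notebook; special tokens are ignored.

```python
PARAM_BUDGET = 20_000_000

def embedding_params(vocab_size, dim):
    """An embedding table stores one vector of size `dim` per token."""
    return vocab_size * dim

def dense_params(n_in, n_out):
    """A dense layer has a weight matrix plus one bias per output unit."""
    return n_in * n_out + n_out

vocab_size, dim = 10_000, 128                   # assumed sizes, for illustration
total = embedding_params(vocab_size, dim) + dense_params(dim, vocab_size)
within_budget = total < PARAM_BUDGET
print(total, within_budget)  # 2570000 True
```

With these assumed sizes the embedding and output layers alone use about 2.6M parameters, which leaves most of the budget for the layers in between.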

Dataset

The dataset is composed of sentences taken from the generics_kb dataset on Hugging Face. We restricted the vocabulary to the 10K most frequent words, and only took sentences that use this vocabulary exclusively.
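The vocabulary restriction described above could be implemented along these lines. This is a sketch under assumptions: whitespace tokenization, the helper name `restrict_to_vocab`, and the three-sentence toy corpus are illustrative, and the notebook's actual filtering may differ.

```python
from collections import Counter

def restrict_to_vocab(sentences, vocab_size):
    """Keep the `vocab_size` most frequent words and the sentences using only those."""
    counts = Counter(w for s in sentences for w in s.split())
    vocab = {w for w, _ in counts.most_common(vocab_size)}
    kept = [s for s in sentences if all(w in vocab for w in s.split())]
    return vocab, kept

# Toy corpus; the project would use the generics_kb sentences with vocab_size=10_000.
corpus = ["dogs chase cats", "cats chase mice", "dogs bark"]
vocab, kept = restrict_to_vocab(corpus, vocab_size=3)
print(kept)  # ['dogs chase cats']
```

Sentences containing even one out-of-vocabulary word are dropped entirely, so no unknown-word token is needed for the remaining data.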

!pip install datasets
from datasets import load_dataset
ds = load_dataset('generics_kb', trust_remote_code=True)['train']

ireneburri/SentenceReconstruction_DeepLearning
