Text
CTRL: A Conditional Transformer Language Model for Controllable Generation [code] (Sept 2019)
Large transformer language model conditioned on control codes that specify the domain, style, and task-specific behavior of the generated text.
Extreme Language Model Compression with Optimal Subwords and Shared Projections (Sept 2019)
A much smaller BERT obtained with teacher-student (distillation) training, combined with a reduced subword vocabulary and shared projection matrices.
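A minimal sketch of the generic teacher-student distillation loss behind this kind of compression, in PyTorch; the vocabulary size, temperature, and helper name are illustrative, and the paper's subword-reduction and shared-projection components are not shown.

```python
import torch
import torch.nn.functional as F

# Generic knowledge-distillation loss (sketch): the student is trained to match
# the teacher's softened output distribution. Names and sizes are illustrative.
def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    t = temperature
    soft_teacher = F.softmax(teacher_logits / t, dim=-1)
    log_student = F.log_softmax(student_logits / t, dim=-1)
    return F.kl_div(log_student, soft_teacher, reduction="batchmean") * t * t

teacher_logits = torch.randn(4, 30522)                   # e.g. BERT-sized vocabulary
student_logits = torch.randn(4, 30522, requires_grad=True)
loss = distillation_loss(student_logits, teacher_logits)
loss.backward()
```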
Large Memory Layers with Product Keys [code] (Jul 2019)
A key-value memory layer that can scale to very large sizes while keeping exact search on the key space.
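A minimal sketch of the product-key lookup idea, in PyTorch; all names and sizes are illustrative, not the authors' code. Splitting the query and taking top-k against two small sub-key sets yields the exact top-k over the full Cartesian key space at a fraction of the cost.

```python
import torch

d, n_sub, k = 8, 16, 4                     # half-dim, sub-keys per half, top-k
sub_keys1 = torch.randn(n_sub, d)          # first half of the product key set
sub_keys2 = torch.randn(n_sub, d)          # second half
values = torch.randn(n_sub * n_sub, 32)    # one value per (i, j) key pair

q = torch.randn(2 * d)
q1, q2 = q[:d], q[d:]

s1, i1 = (sub_keys1 @ q1).topk(k)          # best sub-keys for each query half
s2, i2 = (sub_keys2 @ q2).topk(k)

# Scores of the k*k candidate pairs; their top-k equals the exact top-k
# over all n_sub^2 product keys, since score(i, j) = s1[i] + s2[j].
pair_scores = (s1[:, None] + s2[None, :]).flatten()
pair_index = (i1[:, None] * n_sub + i2[None, :]).flatten()
best_scores, best = pair_scores.topk(k)

weights = torch.softmax(best_scores, dim=0)
output = weights @ values[pair_index[best]]  # sparse weighted sum of k memory values
```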
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context [code] (Jun 2019)
Transformer architecture that adds segment-level recurrence, reusing the hidden states of the previous segment so the model is not limited to a fixed-length context.
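A rough sketch of the segment-level recurrence idea, assuming a generic PyTorch attention layer; the actual model also uses relative positional encodings and caches per-layer hidden states, which are omitted here.

```python
import torch

# Hidden states from the previous segment are cached, detached from the graph,
# and prepended to the current segment so attention can look past the boundary.
def attend_with_memory(layer, h_curr, h_mem):
    h_ext = torch.cat([h_mem.detach(), h_curr], dim=1)   # [batch, mem_len + seg_len, d]
    out, _ = layer(h_curr, h_ext, h_ext, need_weights=False)
    return out

d_model, seg_len, mem_len, batch = 64, 16, 16, 2
attn = torch.nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)

memory = torch.zeros(batch, mem_len, d_model)
for segment in torch.randn(3, batch, seg_len, d_model):   # stream of segments
    h = attend_with_memory(attn, segment, memory)
    memory = segment[:, -mem_len:]                        # cache for the next segment
```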
Language Models are Unsupervised Multitask Learners [code] (Feb 2019)
GPT-2, OpenAI's large-scale transformer language model for text generation.
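For reference, sampling from the released GPT-2 weights through the Hugging Face transformers wrapper (not OpenAI's original code); the model size, prompt, and decoding settings are arbitrary.

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

input_ids = tokenizer.encode("The paper shows that", return_tensors="pt")
output = model.generate(input_ids, max_length=40, do_sample=True, top_k=40)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```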
A Hierarchical Multi-task Approach for Learning Embeddings from Semantic Tasks [code] (Nov 2018)
Trains a single network on several semantic tasks, with lower layers supervised on simpler tasks and deeper layers on harder ones; that is why it is hierarchical.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding [code] (Oct 2018)
Method for pre-training language representations: a general-purpose "language understanding" model is trained on a large text corpus and then fine-tuned on the downstream NLP tasks we care about.
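A minimal sketch of the pre-train-then-fine-tune pattern via the Hugging Face transformers wrapper (not the original BERT release); the checkpoint name, task, and label count are illustrative.

```python
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# The pre-trained encoder provides general-purpose representations; the small
# classification head (and optionally the encoder) is trained on the downstream task.
batch = tokenizer(["a downstream example"], return_tensors="pt")
labels = torch.tensor([1])
loss = model(**batch, labels=labels).loss
loss.backward()
```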
Attention Is All You Need (Dec 2017)
Translation of sequences based solely on attention mechanisms, dispensing with recurrence and convolutions entirely.
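The core scaled dot-product attention from the paper, written out as a small PyTorch function; tensor shapes are illustrative.

```python
import math
import torch

# Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
def attention(q, k, v):
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    return torch.softmax(scores, dim=-1) @ v

q = torch.randn(2, 5, 64)    # [batch, query positions, d_k]
k = torch.randn(2, 7, 64)    # [batch, key positions, d_k]
v = torch.randn(2, 7, 64)
out = attention(q, k, v)     # [2, 5, 64]
```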
Plan, Attend, Generate: Planning for Sequence-to-Sequence Models (Nov 2017)
Sequence-to-sequence model that plans ahead when computing its alignments between input and output sequences, constructing a matrix of proposed future alignments and a commitment vector.
A Deep Reinforced Model for Abstractive Summarization (May 2017)
Model with a novel intra-attention that attends over the input and continuously generated output separately, and a new training method that combines standard supervised word prediction and reinforcement learning.
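A sketch of the mixed objective described in the paper, combining the maximum-likelihood loss with a self-critical reinforcement term; the helper name, reward values, and mixing weight here are illustrative.

```python
import torch

# Mixed objective L = gamma * L_rl + (1 - gamma) * L_ml: standard supervised word
# prediction plus a policy-gradient term that rewards the sampled summary relative
# to a greedy baseline. The gamma value below is only a placeholder.
def mixed_loss(ml_loss, sample_log_probs, sample_reward, baseline_reward, gamma=0.99):
    rl_loss = -(sample_reward - baseline_reward) * sample_log_probs.sum()
    return gamma * rl_loss + (1 - gamma) * ml_loss

ml_loss = torch.tensor(2.3, requires_grad=True)          # cross-entropy on the reference
sample_log_probs = torch.randn(12, requires_grad=True)   # log p of the sampled tokens
loss = mixed_loss(ml_loss, sample_log_probs, sample_reward=0.41, baseline_reward=0.38)
loss.backward()
```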
- Check in which category the paper fits
- Check in which subcategory the paper fits (create a new one if needed)
- Add the title, a link, the month and year it was published, a link to the code if it exists, and the contribution of the paper. Papers should be sorted most recent first within each category. Example:
Title of the paper [code] (Jun 2018)
A couple of lines describing the main contribution of the paper. Do not copy the abstract or write more than 2 lines in order to keep the wiki tidy.
Title of the paper (Jan 2018)
A couple of lines describing the main contribution of the paper. Do not copy the abstract or write more than 2 lines in order to keep the wiki tidy.