Rvbens/Chatbot-en-Espanol


Chatbot en Español

Introduction

Conversational agent in Spanish, built with deep learning and a dataset of movie subtitles. If you want to go directly to the transformer version, you can do it here.

Installation

Clone the repository and install:

pip install .

You can alternatively install from pip, which doesn't download the exploratory notebooks:

pip install spanish_chatbot

For a quickstart:

from spanish_chatbot import TransformerChatbot
chatbot = TransformerChatbot(load_quant=True, use_cuda=False) # load the pre-trained model
chatbot.evaluateOneInput('Hola')                              # one input, one output
chatbot.evaluateCycle()                                       # cycle of inputs and outputs

Model description

  • Seq2seq. For a detailed explanation in Spanish, see this blog post. Features:

  • Transformer. Features:

    • Weight tying
    • Beam search
    • Quantization: PyTorch Dynamic Quantization. Model size reduced to 41% of the original, with a 2x inference speedup. Backends supported:
      • x86 CPUs with AVX2 support or higher (without AVX2 some operations have inefficient implementations)
      • ARM CPUs (typically found in mobile/embedded devices)
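Of the transformer features listed above, beam search is the easiest to illustrate in isolation. The following is a minimal sketch in plain Python, not the code used in this repository: `beam_search`, `toy_next_log_probs` and `TOY_MODEL` are hypothetical names, and the toy table stands in for the decoder's next-token distribution. It shows why keeping several hypotheses can beat greedy decoding.

```python
import math
from typing import Callable, Dict, List, Tuple

Hypothesis = Tuple[List[str], float]  # (tokens, total log-probability)


def beam_search(
    next_log_probs: Callable[[List[str]], Dict[str, float]],
    start_token: str = "<sos>",
    end_token: str = "<eos>",
    beam_width: int = 3,
    max_len: int = 10,
) -> List[Hypothesis]:
    """Return hypotheses sorted by total log-probability, best first."""
    beams: List[Hypothesis] = [([start_token], 0.0)]
    finished: List[Hypothesis] = []
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates: List[Hypothesis] = []
        for tokens, score in beams:
            for token, logp in next_log_probs(tokens).items():
                candidates.append((tokens + [token], score + logp))
        # Keep only the `beam_width` best expansions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_width]:
            if tokens[-1] == end_token:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    finished.extend(beams)  # hypotheses cut off at max_len
    finished.sort(key=lambda c: c[1], reverse=True)
    return finished


# Hypothetical toy distribution standing in for a trained decoder.
TOY_MODEL = {
    ("<sos>",): {"hola": 0.6, "buenas": 0.4},
    ("<sos>", "hola"): {"<eos>": 0.55, "amigo": 0.45},
    ("<sos>", "buenas"): {"tardes": 0.9, "<eos>": 0.1},
}


def toy_next_log_probs(tokens: List[str]) -> Dict[str, float]:
    # Unknown prefixes simply end the sentence.
    probs = TOY_MODEL.get(tuple(tokens), {"<eos>": 1.0})
    return {tok: math.log(p) for tok, p in probs.items()}


greedy_tokens, greedy_score = beam_search(toy_next_log_probs, beam_width=1)[0]
beam_tokens, beam_score = beam_search(toy_next_log_probs, beam_width=2)[0]
print(greedy_tokens, round(math.exp(greedy_score), 2))
print(beam_tokens, round(math.exp(beam_score), 2))
```

With a beam width of 1 the search commits to the locally best first word and ends with a less probable sentence than the one found with a width of 2. The sketch omits length normalization, which real decoders often add so that short answers are not favored.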

Instructions

  • For training:
    1. Download the dataset from here (2 GB) and put it in /data
    2. Generate data with python pre_processing.py. Arguments:
      • --lines: number of lines from the original dataset to be processed. Default: 500_00
      • --max_len: max length of the sentence. Default: 40
      • --min_count: minimum number of occurrences for a word to be kept in the vocabulary. Default: 10
    3. Run the training notebook to train and evaluate the model
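To make the three arguments of step 2 concrete, here is a minimal sketch of how they could interact. It is not the repository's `pre_processing.py`: `parse_args` and `filter_corpus` are hypothetical helpers, whitespace tokenization is an assumption, and so is the choice to drop sentences containing rare words (the real script may handle them differently). `--lines` is left without a default here because the sketch does not try to reproduce the script's own default.

```python
import argparse
from collections import Counter
from itertools import islice
from typing import Iterable, List, Optional, Set, Tuple


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Pre-processing sketch (hypothetical)")
    parser.add_argument("--lines", type=int, required=True,
                        help="number of lines of the original dataset to process")
    parser.add_argument("--max_len", type=int, default=40,
                        help="maximum sentence length, in tokens")
    parser.add_argument("--min_count", type=int, default=10,
                        help="minimum occurrences for a word to stay in the vocabulary")
    return parser.parse_args(argv)


def filter_corpus(
    raw_lines: Iterable[str], lines: int, max_len: int, min_count: int
) -> Tuple[List[List[str]], Set[str]]:
    # Read only the first `lines` lines and tokenize on whitespace (assumption).
    sentences = [line.lower().split() for line in islice(raw_lines, lines)]
    # Drop empty sentences and those longer than `max_len` tokens.
    sentences = [s for s in sentences if 0 < len(s) <= max_len]
    # Count words and keep those seen at least `min_count` times.
    counts = Counter(word for s in sentences for word in s)
    vocab = {word for word, n in counts.items() if n >= min_count}
    # Assumption: sentences containing an out-of-vocabulary word are discarded.
    kept = [s for s in sentences if all(word in vocab for word in s)]
    return kept, vocab


# Small demonstration with an explicit argument list instead of sys.argv.
args = parse_args(["--lines", "3", "--max_len", "2", "--min_count", "2"])
raw = ["hola amigo", "hola mundo", "hola", "adios amigo mio querido"]
kept, vocab = filter_corpus(raw, args.lines, args.max_len, args.min_count)
print(kept, vocab)
```

In this toy run only the first three lines are read, "hola" is the only word seen at least twice, and the single sentence made up entirely of vocabulary words survives.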

For a detailed explanation of the processing see the notebook.

Credits

About

Chatbot in Spanish using different models: a Seq2Seq model with Luong attention and a transformer
