CTranslate is a C++ implementation of OpenNMT's `translate.lua` script with no LuaTorch dependencies. It facilitates the use of OpenNMT models in existing products and on various platforms, using Eigen as a backend.
CTranslate provides optimized CPU translation and can optionally offload matrix multiplication to a CUDA-compatible device using cuBLAS. It only supports OpenNMT models released with the `release_model.lua` script.
The project has optional dependencies:

- CUDA for matrix multiplication offloading on a GPU
- Intel® MKL for an alternative BLAS backend
CMake and a compiler that supports the C++11 standard are required to compile the project.
```
git submodule update --init
mkdir build
cd build
cmake ..
make
```
This will produce the dynamic library `libonmt.so` (or `.dylib` on macOS, `.dll` on Windows) and the translation client `cli/translate`.
CTranslate also bundles OpenNMT's Tokenizer, which provides the tokenization tools `lib/tokenizer/cli/tokenize` and `lib/tokenizer/cli/detokenize`.
- To give hints about the Eigen location, use the `-DEIGEN_ROOT=<path to Eigen library>` option.
- To compile only the library, use the `-DLIB_ONLY=ON` flag.
- To disable OpenMP, use the `-DWITH_OPENMP=OFF` flag.
- Unless you are cross-compiling for a different architecture, add `-DCMAKE_CXX_FLAGS="-march=native"` to the `cmake` command above to optimize for speed.
- Consider installing Intel® MKL when you are targeting Intel®-powered platforms. If found, the project will automatically link against it.
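For example, a speed-optimized build that also tells CMake where to find Eigen might combine these options as follows (the Eigen path is a placeholder for your own installation):

```shell
# Run from the build/ directory created above.
cmake -DEIGEN_ROOT=/path/to/eigen -DCMAKE_CXX_FLAGS="-march=native" ..
make
```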
See `--help` on the clients to discover available options and usage. They have the same interface as their Lua counterparts.
This project is also a convenient way to load OpenNMT models and translate texts in existing software.
Here is a very simple example:
```cpp
#include <iostream>

#include <onmt/onmt.h>

int main()
{
  // Create a new Translator object.
  auto translator = onmt::TranslatorFactory::build("enfr_model_release.t7");

  // Translate a tokenized sentence.
  std::cout << translator->translate("Hello world !") << std::endl;

  return 0;
}
```
For a more advanced usage, see:
- `include/onmt/TranslatorFactory.h` to instantiate a new translator
- `include/onmt/ITranslator.h` (the `Translator` interface) to translate sequences or batches of sequences
- `include/onmt/TranslationResult.h` to retrieve results and attention vectors
- `include/onmt/Threads.h` to programmatically control the number of threads to use
Also see the headers available in the Tokenizer that are accessible when linking against CTranslate.
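Building on the simple example above, a common pattern is to translate input line by line, as the bundled client does. The sketch below is self-contained for illustration: `translate` here is a hypothetical placeholder standing in for a call to the translator built by `onmt::TranslatorFactory::build` (check the headers listed above for the exact API).

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical placeholder: in real code this would call the translate()
// method of the object returned by onmt::TranslatorFactory::build().
std::string translate(const std::string& line)
{
  return "<translated> " + line;
}

// Translate an input stream line by line, writing one translation per line.
void translate_stream(std::istream& in, std::ostream& out)
{
  std::string line;
  while (std::getline(in, line))
    out << translate(line) << '\n';
}
```

Feeding `std::cin` and `std::cout` to `translate_stream` reproduces the client's filter-style behavior, which makes it easy to plug the library into a shell pipeline.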
Some model configurations are currently unsupported:
- GRU
- deep bidirectional encoder
- pyramidal deep bidirectional encoder
- concat variant of global attention
- bridges other than copy
Additionally, CTranslate lacks some advanced features of `translate.lua`:
- gold data score
- best N hypotheses
- hypotheses filtering
- beam search normalization