
@AVBelyy AVBelyy commented Apr 22, 2022

Abstract

This PR adds the ability to run SAFRAN inference in batched mode from Python 3.x. It is based on the code in Main.cpp, with the initialization and inference parts decoupled so that a simple Python API can be built on top of them using SWIG. Currently, the Python API only supports predicting tails with the "applymax" action (this is what my application required), but adding head prediction and other actions should be straightforward.

Example usage

import safran

api = safran.SAFRAN(path_to_train, path_to_rules)
result = api.query([['head_1', 'rel_1', 'tail_1'],
                    ...,
                    ['head_n', 'rel_n', 'tail_n']], k=100)

where ['head_i', 'rel_i', 'tail_i'] are RDF triples, in SAFRAN train/validation/test file format, and result has the following format:

result = {
    ('head_1', 'rel_1'): [('tail_11', confidence_11, rule_id_11), ..., ('tail_1k', confidence_1k, rule_id_1k)],
    ...
    ('head_n', 'rel_n'): [('tail_n1', confidence_n1, rule_id_n1), ..., ('tail_nk', confidence_nk, rule_id_nk)],
}

where rule_id_ij is the ID (0-indexed line number in the path_to_rules file) of the rule that predicted the tail_ij entity with confidence confidence_ij.
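
To make the result format concrete, here is a minimal sketch of how a caller might consume it. It does not import safran; instead it works on a hand-written dictionary in the format described above. The helper names (best_prediction, hits_at_k) and the sample entities are hypothetical and not part of this PR. Since the description does not state whether each list is ordered by confidence, the helpers sort explicitly. The hits@k computed here is the plain, unfiltered variant.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Prediction = Tuple[str, float, int]  # (tail, confidence, rule_id)
Result = Dict[Tuple[str, str], List[Prediction]]


def best_prediction(result: Result, head: str, rel: str) -> Optional[Prediction]:
    """Return the highest-confidence prediction for (head, rel), or None."""
    predictions = result.get((head, rel), [])
    if not predictions:
        return None
    return max(predictions, key=lambda p: p[1])


def hits_at_k(result: Result, triples: Sequence[Sequence[str]], k: int) -> float:
    """Fraction of triples whose true tail is among the k most confident tails."""
    if not triples:
        return 0.0
    hits = 0
    for head, rel, tail in triples:
        predictions = result.get((head, rel), [])
        # Sort explicitly: the ordering of the returned list is an assumption.
        ranked = sorted(predictions, key=lambda p: p[1], reverse=True)[:k]
        if tail in [t for t, _conf, _rule_id in ranked]:
            hits += 1
    return hits / len(triples)


# Hand-written sample data in the documented format (hypothetical entities).
sample_result: Result = {
    ("alice", "knows"): [("bob", 0.9, 3), ("carol", 0.7, 12)],
    ("bob", "lives_in"): [("paris", 0.4, 5)],
}
sample_triples = [["alice", "knows", "carol"], ["bob", "lives_in", "berlin"]]

top = best_prediction(sample_result, "alice", "knows")
score_at_1 = hits_at_k(sample_result, sample_triples, k=1)
score_at_2 = hits_at_k(sample_result, sample_triples, k=2)
```

With a real model, sample_result would be replaced by the dictionary returned from api.query, and the rule_id in each tuple can be used to look up the corresponding line in the rules file.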

Installation

Install SAFRAN first: reference. Then, depending on your OS:

Ubuntu

sudo apt-get install python3 python3-dev swig
cd /path/to/safran/repo
cd python_bindings
python3 setup.py install

macOS (Homebrew)

brew install python3 swig
cd /path/to/safran/repo
cd python_bindings
python3 setup.py install

On macOS, you need a C++ compiler that supports OpenMP via the -fopenmp flag. The default clang does not currently seem to support it, so you need a workaround, such as using llvm or gcc from brew (src). Your C++ compiler also needs C++17 support to compile SAFRAN and the Python bindings.

Changelog

A few things were changed in the SAFRAN library itself to allow smoother interoperation between the wrapper and the C++ code, namely:

  • Explanation is now an abstract class, which decouples the SQLite-specific implementation from the Explanation interface. There are two implementations of Explanation: SQLiteExplanation (used in Main.cpp) and InMemoryExplanation (used in the Python wrapper).
  • RuleApplication now supports "hot-swapping" TesttripleReaders via RuleApplication::updateTTR, so that it can be run in batched mode.
  • TesttripleReader now separates initialization from reading a file, which allows test triples to be read from multiple sources (e.g. from memory).
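
On the Python side, these changes suggest that one loaded model can serve several query calls in a row. The following is a minimal sketch of splitting a large list of triples into batches and merging the results, assuming that api.query can be called repeatedly on the same object and that the predictions for a (head, rel) pair do not depend on which batch it appears in. The helpers chunked and query_in_batches, and the stand-in fake_query, are hypothetical and not part of this PR.

```python
from typing import Callable, Dict, Iterator, List, Sequence, Tuple

Triple = Sequence[str]
Result = Dict[Tuple[str, str], List[Tuple[str, float, int]]]


def chunked(items: Sequence, size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def query_in_batches(query: Callable[..., Result],
                     triples: Sequence[Triple],
                     batch_size: int,
                     k: int = 100) -> Result:
    """Call `query` once per batch and merge the per-batch dictionaries."""
    merged: Result = {}
    for batch in chunked(triples, batch_size):
        # A (head, rel) key seen in a later batch overwrites the earlier entry.
        merged.update(query(list(batch), k=k))
    return merged


def fake_query(batch: List[Triple], k: int = 100) -> Result:
    """Stand-in for api.query: one dummy prediction per (head, rel) pair."""
    return {(h, r): [("tail_for_" + h, 1.0, 0)][:k] for h, r, _t in batch}


triples = [["h1", "r", "t1"], ["h2", "r", "t2"], ["h3", "r", "t3"]]
merged = query_in_batches(fake_query, triples, batch_size=2)
```

With the real bindings, fake_query would be replaced by api.query; the batch size is then a trade-off between per-call overhead and memory use.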
