Proposal: Simplify Rust Bindings #39

@mvisani

Hi @xrl,

I’ve realized that we don’t need to hand-write a wrapper for every member function of every class, as we’re currently doing. If build.rs adds the RDKit include directory to the C++ build, CXX can resolve not just the functions in our wrappers but also those declared in RDKit’s own headers. We can then declare RDKit functions directly in the cxx::bridge, which should speed up development significantly.
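
A rough sketch of the build.rs side, assuming RDKit is installed under a single prefix with headers in `<prefix>/include/rdkit` and libraries in `<prefix>/lib` (a common layout, but not verified for every platform). The function names here are illustrative assumptions and are not taken from the example repository.

```rust
use std::path::{Path, PathBuf};

/// Header search path to hand to the C++ compiler so that
/// `#include <GraphMol/Atom.h>` resolves.
/// ASSUMPTION: headers live under `<prefix>/include/rdkit`.
fn rdkit_include_dir(prefix: &Path) -> PathBuf {
    prefix.join("include").join("rdkit")
}

/// Lines a build script would print so that Cargo links the RDKit libraries.
/// ASSUMPTION: the libraries live under `<prefix>/lib`.
fn link_directives(prefix: &Path, libs: &[&str]) -> Vec<String> {
    // First tell Cargo where to search, then name each library to link.
    let mut out = vec![format!(
        "cargo:rustc-link-search=native={}",
        prefix.join("lib").display()
    )];
    for lib in libs {
        out.push(format!("cargo:rustc-link-lib={lib}"));
    }
    out
}
```

In an actual build.rs, the include path would be passed to the C++ build (for example via `cxx_build::bridge("src/lib.rs").include(...)`), and each directive would be printed with `println!`. Which RDKit libraries to list (for instance `RDKitGraphMol` and `RDKitRDGeneral`) depends on which headers the bridge uses.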

I’ve created a small example repository to demonstrate how straightforward this is.

Highlights of the Current Wrapper

wrapper.h:

#pragma once
#include "rust/cxx.h"
#include <GraphMol/Atom.h>
#include <memory>

namespace RDKit {
std::shared_ptr<Atom> make_shared(std::unique_ptr<Atom> atom);
std::unique_ptr<Atom> newAtom();
std::unique_ptr<Atom> newAtomFromAtomicNum(int atomicNum);
std::unique_ptr<Atom> newAtomFromSymbol(const std::string &symbol);
std::unique_ptr<Atom> newAtomFromOther(const Atom &other);
rust::String getSymbolAsString(const Atom &atom);
bool MatchRust(const Atom &atom, std::unique_ptr<Atom> other);
int calcExplicitValence(Atom &atom, bool strict = true);
int calcImplicitValence(Atom &atom, bool strict = true);
} // namespace RDKit

wrapper.cc:

#include "rdkit-rust-ffi/include/wrapper.h"

namespace RDKit {
std::shared_ptr<Atom> make_shared(std::unique_ptr<Atom> atom) { return std::shared_ptr<Atom>(atom.release()); }
std::unique_ptr<Atom> newAtom() { return std::unique_ptr<Atom>(new Atom()); }
std::unique_ptr<Atom> newAtomFromAtomicNum(int atomicNum) { return std::unique_ptr<Atom>(new Atom(atomicNum)); }
// Additional wrapper functions...
}

Using RDKit's Built-in Functions

In the example repository, lib.rs declares more functions than the wrapper defines, yet all of them are callable. For example, declaring a function with a self: &Atom receiver lets us call RDKit member functions directly, as in atom.getTotalValence(). This removes the need for forwarding boilerplate like:

pub fn get_is_aromatic(&self) -> bool {
    ro_mol_ffi::get_is_aromatic(self.ptr.as_ref())
}

pub fn get_atomic_num(&self) -> i32 {
    ro_mol_ffi::get_atomic_num(self.ptr.as_ref())
}

This simplifies usage in the rdkit crate and makes the code more maintainable.
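
For illustration, a bridge along these lines might look as follows. This is a sketch, not the code from the example repository: it needs the cxx crate and the RDKit headers to build, so it does not compile on its own. It also assumes `getAtomicNum` and `getIsAromatic` are const member functions returning `int` and `bool` in the installed RDKit version. CXX checks the declared signatures against the C++ headers at compile time, so a mismatch shows up as a build error.

```rust
#[cxx::bridge(namespace = "RDKit")]
mod ffi {
    unsafe extern "C++" {
        // Our wrapper header, which itself includes <GraphMol/Atom.h>.
        include!("rdkit-rust-ffi/include/wrapper.h");

        // Opaque C++ type; Rust only ever sees it behind a pointer or reference.
        type Atom;

        // Constructors cannot be bound directly, so this still comes from wrapper.h.
        fn newAtomFromAtomicNum(atomicNum: i32) -> UniquePtr<Atom>;

        // RDKit's own member functions, declared with a `self` receiver.
        // No hand-written C++ forwarding function is needed for these.
        // ASSUMPTION: signatures match the installed RDKit headers.
        fn getAtomicNum(self: &Atom) -> i32;
        fn getIsAromatic(self: &Atom) -> bool;
    }
}

/// Example call site: methods are invoked on the atom itself.
pub fn carbon_atomic_num() -> i32 {
    let atom = ffi::newAtomFromAtomicNum(6);
    atom.getAtomicNum()
}
```

Free functions such as constructors and helpers returning rust::String still need a wrapper, so the wrapper files shrink rather than disappear.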

Next Steps

We have two options:

  1. Refactor the existing repo to follow this approach. While it may take some time initially, it will speed up development and reduce potential errors.
  2. Continue building on the example repo I’ve created and start a new crate.

What are your thoughts?

Best regards,
Marco

P.S. I’ve added an explanation on how to download and compile RDKit and link the C++ library to our project. This should also add Windows support (although I haven’t tested it yet).
