First off, thank you for considering contributing to xeus-ocaml! This project is a community effort, and we welcome any form of contribution, from bug reports and documentation improvements to new features.
This document provides a detailed guide for developers who want to understand the project's internals and contribute to its development. If you have questions, please feel free to open an issue on our GitHub issue tracker.
For a detailed reference of the project's C++ and OCaml APIs, please see our hosted documentation, which is automatically generated from the source code comments.
View the full API Documentation
xeus-ocaml is a hybrid C++/OCaml kernel designed to run entirely in the browser using WebAssembly (WASM). This architecture provides a fast, serverless Jupyter experience by executing all components—the Jupyter protocol handler and the OCaml language engine—within the same browser execution context.
- C++ Kernel Core (xocaml.wasm): The kernel's foundation is a C++ application built with xeus-lite, a lightweight version of the xeus library specifically for WASM. This C++ layer is compiled to a WebAssembly module (xocaml.wasm). Its primary responsibility is to handle the Jupyter Messaging Protocol, acting as the bridge between the Jupyter frontend (like JupyterLab or Notebook) and the OCaml backend.
- OCaml Backend (xocaml.js): All the OCaml logic, including the toplevel (REPL), Merlin for code intelligence, and helper libraries, is compiled into a single JavaScript file (xocaml.js) using js_of_ocaml. This script exposes a clean JavaScript API that the C++ core can call into.
- Direct Communication Bridge: The C++ (WASM) and OCaml (JS) components communicate directly and efficiently within the browser's main thread:
  - The C++ code uses Emscripten's emscripten::val API to make direct, type-safe calls to the JavaScript functions exported by the OCaml backend. This is the primary way C++ gives commands to OCaml.
  - For asynchronous operations (like code execution), the C++ side passes C++ callback functions (bound via EMSCRIPTEN_BINDINGS) to the OCaml/JS side. The OCaml code, using its Lwt library for concurrency, performs the long-running task and invokes the C++ callback with the result when finished.
- JupyterLab Mime Renderer Extension: A small TypeScript-based extension, located in the extension/ directory, enhances the frontend by adding support for custom MIME types, such as rendering Graphviz DOT strings into SVG images.
This in-process model avoids the complexity and latency of Web Workers, enabling near-instantaneous communication for features like code completion.
The C++ source code is organized into a clear, modular structure.
- include/xinterpreter.hpp, src/xinterpreter.cpp: This is the core of the kernel. The interpreter class inherits from xeus::xinterpreter and implements the main handlers for Jupyter messages (execute_request_impl, complete_request_impl, etc.). It manages the lifecycle of asynchronous execution requests.
- include/xocaml_engine.hpp, src/xocaml_engine.cpp: This is the crucial C++-to-JavaScript bridge. It abstracts away the Emscripten binding details, providing clean functions like call_merlin_sync and call_toplevel_async that the rest of the C++ code can use without directly touching emscripten::val.
- include/xcompletion.hpp, src/xcompletion.cpp: Contains the logic specifically for handling complete_request messages. It constructs the appropriate JSON request for Merlin, calls the OCaml engine, and formats the response into a valid Jupyter complete_reply.
- include/xinspection.hpp, src/xinspection.cpp: Similar to completion, this file handles inspect_request messages, calling Merlin for type and documentation information and formatting it for display in tooltips.
- src/main_emscripten_kernel.cpp: The main entry point for the WebAssembly build. It uses EMSCRIPTEN_BINDINGS to export the xeus_ocaml::interpreter to JavaScript, making it accessible to the xeus-lite frontend loader.
This is the kernel's core feature: running OCaml code.
- Logic Flow:
  - A user runs a cell. The Jupyter frontend sends an execute_request message.
  - xinterpreter.cpp: The execute_request_impl method is called. It creates a unique ID for the request and stores the reply callback.
  - xocaml_engine.cpp: It calls call_toplevel_async, passing the code and a C++ callback function that is bound to the request ID.
  - ocaml/src/xocaml/xocaml.ml: The exported processToplevelAction JavaScript function receives the call. It invokes Xtoplevel.eval.
  - ocaml/src/xtoplevel/xtoplevel.ml: The eval function is the heart of the OCaml REPL. It uses js_of_ocaml-toplevel to parse and execute the code phrase by phrase. It captures all outputs (stdout, stderr, the final value, and any rich display data) into a structured list.
  - The result list is returned asynchronously via an Lwt promise. When it resolves, the JavaScript callback provided by C++ is invoked.
  - src/xinterpreter.cpp: The handle_eval_callback C++ function is triggered. It parses the JSON result, publishes the various outputs (stdout, results, display data) back to the frontend, and sends the final execute_reply to signal completion.
- Key Files: src/xinterpreter.cpp, ocaml/src/xtoplevel/xtoplevel.ml, ocaml/src/xocaml/xocaml.ml.
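The request-ID bookkeeping in the flow above can be sketched in plain C++. This is a minimal illustration under stated assumptions, not the code in src/xinterpreter.cpp: the pending_requests class and its add/resolve methods are hypothetical names, and the result is passed as an opaque JSON string rather than parsed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical registry: maps a request ID to the reply callback that was
// stored when the execute request arrived.
class pending_requests
{
public:
    using reply_callback = std::function<void(const std::string& result_json)>;

    // Store the callback and return the unique ID that the asynchronous
    // side must hand back when it finishes.
    std::uint64_t add(reply_callback cb)
    {
        assert(cb && "a reply callback is required");
        const std::uint64_t id = m_next_id++;
        m_callbacks.emplace(id, std::move(cb));
        return id;
    }

    // Invoke and forget the callback for `id`. Returns false when the ID is
    // unknown, for example if the same result is delivered twice.
    bool resolve(std::uint64_t id, const std::string& result_json)
    {
        auto it = m_callbacks.find(id);
        if (it == m_callbacks.end())
        {
            return false;
        }
        // Move the callback out before erasing, so that it may register
        // new requests while it runs.
        reply_callback cb = std::move(it->second);
        m_callbacks.erase(it);
        cb(result_json);
        return true;
    }

    std::size_t size() const { return m_callbacks.size(); }

private:
    std::uint64_t m_next_id = 1;
    std::unordered_map<std::uint64_t, reply_callback> m_callbacks;
};
```

In this model, the execute handler would call add with a lambda that publishes outputs and sends the reply, and the function playing the role of handle_eval_callback would call resolve with the ID it receives from the JavaScript side. Keying callbacks by ID lets several executions be in flight without their replies being mixed up.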
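This feature provides IDE-like assistance: code completion and type/documentation inspection. To function, Merlin needs access to compiled interface files (.cmi) and to the typed-tree files produced for implementations (.cmt) and interfaces (.cmti). We load these files using a hybrid of static embedding and dynamic fetching (see Standard Library Loading Strategy below).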
- Logic Flow:
  - A user presses Tab (completion) or Shift+Tab (inspection).
  - xinterpreter.cpp: The complete_request_impl or inspect_request_impl method is called, delegating to xcompletion.cpp or xinspection.cpp.
  - xcompletion.cpp / xinspection.cpp: The handler builds a JSON request that matches the OCaml Protocol.t definition.
  - xocaml_engine.cpp: It calls call_merlin_sync. This is a synchronous call that blocks until the JavaScript function returns.
  - ocaml/src/xocaml/xocaml.ml: The processMerlinAction function receives the request and calls Xmerlin.process_merlin_action.
  - ocaml/src/xmerlin/xmerlin.ml: This module uses the merlin-lib library to process the request against the current source code buffer.
  - The result is converted to JSON and returned synchronously all the way back to C++, where it's formatted into a Jupyter reply.
- Standard Library Loading Strategy:
  - Static (Core Requirement): The single most important file, stdlib.cmi, is embedded directly into the main xocaml.js bundle at compile time using ppx_blob. This is critical because the OCaml toplevel requires it to initialize its environment (Compmisc.initial_env()). Loading it statically guarantees the kernel can always start correctly.
  - Dynamic (On Startup): The rest of the standard library's artifacts (all other .cmi, .cmt, and .cmti files) are fetched asynchronously from the server when the kernel first starts. This keeps the initial bundle size small while ensuring full standard library support for completion and documentation is available shortly after launch.
- Key Files: src/xcompletion.cpp, src/xinspection.cpp, ocaml/src/xmerlin/xmerlin.ml, ocaml/src/xlibloader/xlibloader.ml, ocaml/src/xlibloader/static/, ocaml/src/xlibloader/dynamic/.
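The last step of the completion flow, formatting candidates into a Jupyter reply, might look roughly like the sketch below. A complete_reply carries matches, cursor_start, cursor_end, metadata and status. The helper make_complete_reply is hypothetical and is not taken from src/xcompletion.cpp. It assumes ASCII input, so byte offsets equal cursor positions, and it assumes the replaced range is the identifier fragment immediately before the cursor. The real handler may instead take the range from Merlin's response.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Fields of a Jupyter complete_reply (metadata omitted for brevity).
struct complete_reply
{
    std::vector<std::string> matches;
    std::size_t cursor_start;
    std::size_t cursor_end;
    std::string status;
};

// Characters allowed inside an OCaml identifier after its first character.
inline bool is_ident_char(char c)
{
    return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_' || c == '\'';
}

// Hypothetical helper: the matches replace the identifier fragment that ends
// at the cursor. A '.' stops the scan, so in "List.ma" only "ma" is replaced.
complete_reply make_complete_reply(const std::string& code,
                                   std::size_t cursor_pos,
                                   std::vector<std::string> candidates)
{
    assert(cursor_pos <= code.size());
    std::size_t start = cursor_pos;
    while (start > 0 && is_ident_char(code[start - 1]))
    {
        --start;
    }
    return complete_reply{std::move(candidates), start, cursor_pos, "ok"};
}
```

For example, make_complete_reply("List.ma", 7, {"map", "mapi"}) yields a reply whose range covers only the fragment after the dot, so the frontend replaces "ma" with the selected match and leaves the module path untouched.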
- Logic Flow:
  - During kernel initialization, the C++ interpreter::configure_impl triggers the OCaml setup. After the OCaml setup completes, it calls ocaml_engine::mount_fs.
  - ocaml/src/xfs/xfs.ml: The mount_drive function is called. It uses js_of_ocaml's FFI to access Emscripten's global Module.FS object. It creates and registers a new device that maps OCaml Sys calls (like open, read, readdir) to corresponding FS calls (FS.open, FS.read, FS.readdir).
  - The kernel's current working directory is changed to the root of this new device (/drive/).
  - When a user runs OCaml code like open_in "file.txt", the js_of_ocaml runtime intercepts the Sys call and routes it through the device implementation in xfs.ml, which in turn manipulates the in-memory Emscripten filesystem.
- Key Files: ocaml/src/xfs/xfs.ml, src/xinterpreter.cpp, src/xocaml_engine.cpp.
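The device idea can be modeled in a few lines of C++. This is only a conceptual sketch: the real device lives in OCaml (xfs.ml) and forwards to Emscripten's Module.FS, whereas here a std::map stands in for the in-memory filesystem. The names memory_fs, device, make_memory_device and mount_table are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Stand-in for the in-memory filesystem: path -> file contents.
using memory_fs = std::map<std::string, std::string>;

// A device is a set of callbacks; each one forwards a runtime file
// operation to the backing filesystem.
struct device
{
    std::function<std::optional<std::string>(const std::string&)> read;
    std::function<void(const std::string&, const std::string&)> write;
};

// Build a device whose callbacks operate on `fs`. The map is captured by
// reference, so it must outlive the device.
device make_memory_device(memory_fs& fs)
{
    device d;
    d.read = [&fs](const std::string& p) -> std::optional<std::string> {
        const auto it = fs.find(p);
        if (it == fs.end())
        {
            return std::nullopt;
        }
        return it->second;
    };
    d.write = [&fs](const std::string& p, const std::string& data) { fs[p] = data; };
    return d;
}

// Routes paths to the single mounted device. The mount root doubles as the
// current working directory, so relative paths resolve inside the device.
class mount_table
{
public:
    void mount(std::string root, device dev)
    {
        assert(!root.empty() && root.back() == '/');
        m_root = std::move(root);
        m_device = std::move(dev);
    }

    std::optional<std::string> read(const std::string& path) const
    {
        const auto rel = relative(path);
        if (!rel || !m_device.read)
        {
            return std::nullopt;
        }
        return m_device.read(*rel);
    }

    bool write(const std::string& path, const std::string& data) const
    {
        const auto rel = relative(path);
        if (!rel || !m_device.write)
        {
            return false;
        }
        m_device.write(*rel, data);
        return true;
    }

private:
    // Strip the mount root from absolute paths; reject paths outside it.
    std::optional<std::string> relative(const std::string& path) const
    {
        if (path.empty())
        {
            return std::nullopt;
        }
        if (path.front() != '/')
        {
            return path;
        }
        if (path.compare(0, m_root.size(), m_root) == 0)
        {
            return path.substr(m_root.size());
        }
        return std::nullopt;
    }

    std::string m_root;
    device m_device;
};
```

After mounting a device at "/drive/", a write to "file.txt" and a read of "/drive/file.txt" reach the same entry in the backing map, which mirrors how a relative path in user code ends up in the mounted drive.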
The kernel supports loading third-party libraries through an automated build-time and run-time process.
- Build-Time (xbundle tool):
  - A developer adds a library name (e.g., ocamlgraph) to ocaml/src/xbundle/libs.txt.
  - During the dune build process, our custom xbundle tool is executed.
  - For each library in libs.txt, xbundle uses ocamlfind to resolve its entire dependency tree.
  - It then compiles all required OCaml modules into a single JavaScript bundle (ocamlgraph.js).
  - Crucially, it also finds and collects all associated Merlin artifacts (.cmi, .cmt, .cmti) for the entire dependency tree.
  - Finally, it generates a metadata module (external_libs.ml) that maps the library name to its JS bundle and list of artifact files.
- Run-Time (in the Notebook):
  - A user executes a cell with #require "ocamlgraph";;.
  - ocaml/src/xtoplevel/xtoplevel.ml: The eval function's parser detects the #require directive and calls Xlibloader.load_on_demand.
  - ocaml/src/xlibloader/xlibloader.ml: This function looks up "ocamlgraph" in the External_libs metadata generated at build time.
  - It asynchronously fetches ocamlgraph.js and executes it using Js.Unsafe.eval_string. This loads the library's code into the js_of_ocaml runtime.
  - It then asynchronously fetches all the artifact files associated with ocamlgraph and writes them to the virtual filesystem (e.g., /static/cmis/graph.cmi, /static/cmis/dot.cmti, etc.).
  - Finally, it calls Topdirs.dir_directory to tell the toplevel to rescan its paths, making the new modules available for use and visible to Merlin.
- Key Files: ocaml/src/xtoplevel/xtoplevel.ml, ocaml/src/xlibloader/xlibloader.ml, ocaml/src/xbundle/xbundle.ml (the CLI tool), ocaml/src/xbundle/libs.txt (the library list).
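The link between the build-time metadata and the run-time lookup can be illustrated with a small C++ model. The actual metadata is the generated OCaml module external_libs.ml, whose exact shape is not shown here. The types library_info and load_plan and the function plan_require are hypothetical, and the target directory is passed in as a parameter rather than assumed.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <vector>

// What the build step records per library: its JS bundle and the Merlin
// artifact files collected for its dependency tree.
struct library_info
{
    std::string js_bundle;
    std::vector<std::string> artifacts;
};

using library_index = std::map<std::string, library_info>;

// What the run-time loader needs to do for one #require directive.
struct load_plan
{
    std::string bundle;                        // fetched and evaluated first
    std::vector<std::string> artifact_targets; // then written to the virtual FS
};

// Look up `name` in the index. An unknown library yields std::nullopt, so
// the caller can report an error instead of fetching a missing bundle.
std::optional<load_plan> plan_require(const library_index& index,
                                      const std::string& name,
                                      const std::string& artifact_dir)
{
    assert(!artifact_dir.empty());
    const auto it = index.find(name);
    if (it == index.end())
    {
        return std::nullopt;
    }
    load_plan plan;
    plan.bundle = it->second.js_bundle;
    for (const std::string& file : it->second.artifacts)
    {
        plan.artifact_targets.push_back(artifact_dir + "/" + file);
    }
    return plan;
}
```

With an index entry for "ocamlgraph" listing graph.cmi and dot.cmti, calling plan_require with "/static/cmis" produces the bundle name plus the two target paths used as examples above. Because the index is generated at build time, only libraries listed in libs.txt can be resolved this way.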
- Logic Flow:
  - ocaml/src/xtoplevel/xtoplevel.ml: During setup, the code open Xlib;; is executed, making all its functions available globally.
  - ocaml/src/xlib/xlib.ml: This module defines functions like output_html. Each function creates a Protocol.DisplayData value and adds it to a global, mutable list named extra_outputs.
  - ocaml/src/xtoplevel/xtoplevel.ml: After each phrase is executed, the eval function calls Xlib.get_and_clear_outputs() to drain this list.
  - The retrieved display data objects are included in the list of outputs sent back to the C++ side, which then publishes them as display_data messages.
- Key Files: ocaml/src/xlib/xlib.ml, ocaml/src/xtoplevel/xtoplevel.ml.
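The queue-and-drain pattern described above is implemented in OCaml, but it can be modeled in C++ to show why draining after each phrase keeps outputs attached to the code that produced them. In this sketch the output_queue class is hypothetical and uses an explicit object instead of a global list. The two helper functions borrow the names output_html and output_dot, and text/html is the standard MIME type assumed for HTML output.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// One rich output: a MIME type plus its payload.
struct display_data
{
    std::string mime_type;
    std::string data;
};

// Model of the mutable output list. Display helpers append to it, and the
// evaluator drains it after each phrase.
class output_queue
{
public:
    void push(display_data item)
    {
        assert(!item.mime_type.empty());
        m_items.push_back(std::move(item));
    }

    // Return everything queued so far and leave the queue empty, so that
    // outputs from one phrase are never reported again with the next.
    std::vector<display_data> get_and_clear()
    {
        return std::exchange(m_items, {});
    }

private:
    std::vector<display_data> m_items;
};

// Helpers in the spirit of the display functions: each one only queues data.
void output_html(output_queue& queue, std::string html)
{
    queue.push(display_data{"text/html", std::move(html)});
}

void output_dot(output_queue& queue, std::string dot)
{
    queue.push(display_data{"application/vnd.graphviz.dot", std::move(dot)});
}
```

Calling both helpers and then get_and_clear returns the two items in insertion order, and a second call returns an empty vector. Separating "queue" from "publish" means user-facing functions never need to know how or when messages are sent to the frontend.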
This feature enables the rendering of Graphviz DOT language strings into SVG images, powered by a custom JupyterLab MIME renderer extension.
- Logic Flow:
  - A user calls the output_dot function from the Xlib module with a string containing DOT syntax.
  - ocaml/src/xlib/xlib.ml: The output_dot function creates a DisplayData object with the custom MIME type application/vnd.graphviz.dot and adds it to the output queue.
  - The C++ kernel publishes this display_data message to the frontend.
  - Frontend Extension: The JupyterLab frontend finds the custom MIME renderer extension located in the extension/ directory, which has registered itself to handle this specific MIME type.
  - extension/src/index.ts: The TypeScript code for the extension receives the DOT string. It calls the @viz-js/viz library, which is a WebAssembly port of the Graphviz layout engine.
  - The viz.js library parses the DOT string and generates a complete SVG element.
  - The extension then appends this SVG element directly into the cell's output area, displaying the rendered graph.
- Key Files: ocaml/src/xlib/xlib.ml, extension/src/index.ts, extension/package.json.
This project uses pixi to manage all dependencies and build tasks, providing a consistent environment for both OCaml and C++/WASM development.
- Install pixi by following the official installation guide.
- An internet connection is required for the initial setup to download dependencies.
The entire build process is orchestrated by rattler-build via the recipe/recipe.yaml file. The pixi run build-kernel command is the main entry point.
The build happens in two main phases within the recipe:
- Phase 1: Build OCaml to JavaScript
  - An opam switch is initialized, and all OCaml dependencies from dune-project are installed.
  - dune build is executed. This compiles all OCaml source code in ocaml/src/ into various artifacts, most importantly the final JavaScript bundle: _build/default/src/xocaml/xocaml.bc.js.
  - This phase also builds the xbundle utility and uses it to read ocaml/src/xbundle/libs.txt, automatically packaging the specified third-party libraries (e.g., ocamlgraph) and their artifacts into JavaScript bundles.
  - The build artifacts are cached in a dune_cache directory to speed up subsequent builds.
- Phase 2: Build C++ Kernel to WebAssembly
  - cmake is configured for an emscripten-wasm32 target.
  - The C++ source code in src/ is compiled into object files.
  - Finally, the C++ objects are linked together. Critically, the xocaml.bc.js file from Phase 1 is included in this linking step via the --pre-js flag. This bundles the OCaml backend directly with the WASM module's JavaScript loader.
  - The final outputs (xocaml.wasm, xocaml.js, and all static assets) are packaged into a .conda file in the output/ directory.
To build and run a local JupyterLite instance for testing:
- Build the Frontend Extension: pixi run -e extension build-extension
- Build the Kernel Package: pixi run build-kernel
- Install the Kernel for JupyterLite: pixi run install-kernel
- Serve JupyterLite: pixi run serve-jupyterlite
You can now access the local JupyterLite instance in your browser, typically at http://localhost:8000.
The project includes a Jest test suite for the JavaScript API exported by the OCaml code. These tests verify the core functionality of both the toplevel and Merlin in isolation.
- Location: ocaml/tests/
- Setup: ocaml/tests/jest.setup.js is a crucial file. It loads the compiled xocaml.bc.js, exposes its API to the global scope for tests to use, and mocks browser APIs like XMLHttpRequest to allow fetching of dynamic Merlin files from the local disk during tests.
- Running Tests:
  - First, ensure the OCaml backend is built: pixi run -e ocaml build.
  - Then, run the Jest suite: pixi run -e test test.
We follow the standard GitHub flow for contributions:
- Fork the repository.
- Create a new branch for your feature or bug fix.
- Make your changes and commit them with clear, descriptive messages.
- Push your branch to your fork.
- Open a Pull Request against the main branch of the davy39/xeus-ocaml repository.
We will review your PR as soon as possible. Thank you for your contribution!