Skip to content

Problem: Cannot Import Mathlib Data Using jixia + LeanSearch (version mismatch? internal name errors, zero-length JSONs) #16

@CobyHE

Description

@CobyHE

Hi, I am trying to deploy a local LeanSearch so I can take Lean compiler errors and query relevant Lean syntax information. However, I cannot get the python -m database jixia importer to work, even after aligning versions.


Environment

  • Python 3.11
  • Lean toolchains tested: 4.24.0, 4.24.0-rc1, 4.26.0-rc1, 4.26.0-rc2
  • mathlib4 tested: v4.26.0-rc2, v4.24.0-rc1
  • jixia tested: PyPI latest + tag matching 4.24.0
  • LeanSearch latest

Problem summary

1. jixia produces many empty JSON files

Running:

jixia /path/to/mathlib4

creates many .jixia/*.json files with 0 bytes.

Example:

Algebra_Group.json (empty)
Data_List_Basic.json (empty)

This seems wrong and later causes missing declarations.


2. LeanSearch import crashes or produces incomplete DB

Crash example:

IndexError: list index out of range in is_internal(name)

name can be an empty list — which should never happen.

After patching this, import finishes but DB is obviously incomplete:

module:      7355
symbol:      6006
declaration:  885   ← clearly far too small

Mathlib entries:

Mathlib declarations: only 209 (!!)


3. Even strict version alignment does not work

I tried:

  • Lean 4.24.0
  • mathlib4 v4.24.0-rc1
  • jixia 4.24.0 branch

Still see errors like:

Unknown constant `Substring.Raw`
Unknown constant `String.Pos.Raw`
Unknown identifier `Std.Stream`

and modules in batteries / proofwidgets / qq fail to build.

Seems like jixia + these mathlib versions are not compatible.


What I need

1. A known-working combination of versions

Could you provide the exact versions of:

  • Lean
  • mathlib4
  • jixia
  • LeanSearch

that are known to work together with the jixia backend?

2. Should jixia generate empty JSON files?

Is this a version mismatch or a bug?

3. Is the jixia backend still expected to work?

Or should I use doc-gen instead?


Thanks! A minimal reproducible version matrix would help a lot.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions