
SCHISM stands for Semantic Classification of High-resolution Imaging for Scanned Materials. This framework provides tools for the semantic segmentation of CT scanner images of rocks, but it is also applicable to any kind of image as long as semantic segmentation is required. The framework supports both training and inference workflows.

⚙️ Installation

Clone this repository to your local machine: git clone git@github.com:FloFive/SCHISM.git
Navigate to the cloned directory: cd <some path>/SCHISM
Install the library (Python 3.12.10 or later is required): pip install -e .

❓ How to use

SCHISM offers three main functionalities: Preprocessing, Training and Inference.

General steps

Organise your data in the required structure (see Data Preparation).
Set up an INI configuration file (see INI File Setup).
Run the main script: python schism.py
Navigate through the command-line menu:
- Option 1- Preprocessing: Prepare your data by adjusting image brightness and contrast, computing the dataset-specific mean and standard deviation for improved normalisation during training, and/or reformatting your segmentation masks to match the input format required by SCHISM.
- Option 2- Training: Train a new model.
- Option 3- Inference: Make predictions using a trained model.

Preprocessing workflow

Three options are available:

Auto brightness/contrast adjustment: Automatically adjust the brightness and contrast of your images. This process rescales pixel values based on the histogram minimum and maximum (hmin/hmax). The original images folder is renamed to raw_images, and the adjusted images are saved in a newly created images folder. The function is inspired by Fiji (Schindelin et al., 2012). Two modes are available:
- ref image: Use one chosen image to set hmin/hmax for all images (consistent contrast).
- per image: Compute hmin/hmax separately for each image (max local contrast, less consistency).
JSON generation: Compute the mean and standard deviation from part or all of your dataset. The results will be saved as a JSON file in your dataset folder.
Normalisation: Process your data to produce SCHISM-compatible segmentation masks. The original masks folder will be renamed to raw_masks, and the new, normalised masks will be saved in a newly created masks folder.
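The two brightness/contrast modes described above can be sketched as follows. This is a minimal illustration, not SCHISM's actual implementation: the function names are hypothetical, and choosing hmin/hmax from percentile cut-offs of the histogram is an assumption made for the example.

```python
import numpy as np


def histogram_limits(image, low_pct=0.5, high_pct=99.5):
    """Pick hmin/hmax from the image histogram.

    The percentile cut-offs are an assumption for this sketch.
    """
    hmin, hmax = np.percentile(image, [low_pct, high_pct])
    return float(hmin), float(hmax)


def rescale(image, hmin, hmax, out_max=255):
    """Linearly map [hmin, hmax] to [0, out_max], clipping values outside the range."""
    if hmax <= hmin:
        # Flat image: nothing to stretch.
        return np.zeros_like(image, dtype=np.uint8)
    scaled = (image.astype(np.float64) - hmin) / (hmax - hmin)
    return (np.clip(scaled, 0.0, 1.0) * out_max).round().astype(np.uint8)


def adjust_with_reference(images, reference):
    """'ref image' mode: one hmin/hmax pair, taken from a chosen image, for all images."""
    hmin, hmax = histogram_limits(reference)
    return [rescale(img, hmin, hmax) for img in images]


def adjust_per_image(images):
    """'per image' mode: each image gets its own hmin/hmax."""
    return [rescale(img, *histogram_limits(img)) for img in images]
```

With a shared reference, the same grey level maps to the same output value in every image, which is what keeps contrast consistent across a stack. The per-image mode stretches each image fully, at the cost of that consistency.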

⚠️ Input data must follow the format described in the Data preparation section of the documentation.

Training workflow

Prepare the dataset: Ensure the dataset is organised according to the required directory structure (presented below).
Create an INI file: Define training parameters such as learning rate, batch size, and model architecture in the INI file (presented below).
Run the training command: Launch the training process, then select the training option and specify:
- The dataset directory: Contains one or more datasets. The ordering and sorting of the data are explained later in this README.
- The output folder: This is the workspace where all generated results are stored. After training, it will include the model weights, along with other relevant outputs. Each file within output/ is described in detail later in this README.
- The path to the INI file (see the .ini configuration file section).

Inference workflow

To make predictions:

Use trained weights: Ensure the trained model weights are saved from the training phase.
Prepare the dataset for prediction: Ensure your data is structured in the format required by SCHISM for inference. See the Data preparation section for details.
Run the inference command: Launch the prediction process, then select the inference option and specify:
- The folder containing trained weights.
- The dataset for prediction.

Predictions on the user's data will be saved in a directory named after the metric used during inference (e.g., preds_X, where X is the name of the selected evaluation metric).

📜 .ini configuration file

Below is an example of an .ini configuration file. For detailed explanations of the network settings and the full INI specification, see the INI file documentation. You can set the parameters manually, or use our LLM-powered solution to automatically generate .ini files.

[Model]
n_block=4
channels=8
num_classes=3
model_type=UnetSegmentor
k_size=3
activation=leakyrelu
 
[Optimizer]
optimizer=Adam
lr=0.01

[Scheduler]
scheduler=ConstantLR

[Loss]
loss=CrossEntropyLoss
ignore_background=True
weights=True

[Training]
batch_size=4
val_split=0.8
epochs=50
metrics=Jaccard, ConfusionMatrix
 
[Data]
crop_size=128
img_res=560
num_samples=7000
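A file in this format can be read with Python's standard configparser module. The sketch below parses a shortened copy of the example above and converts a few values to typed settings. The load_settings helper and the returned dictionary are hypothetical; how SCHISM itself loads and validates the file may differ.

```python
import configparser

INI_TEXT = """
[Model]
num_classes=3
model_type=UnetSegmentor

[Optimizer]
optimizer=Adam
lr=0.01

[Loss]
loss=CrossEntropyLoss
ignore_background=True

[Training]
batch_size=4
val_split=0.8
metrics=Jaccard, ConfusionMatrix
"""


def load_settings(text):
    """Parse INI text and return a few typed values (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "model_type": parser.get("Model", "model_type"),
        "num_classes": parser.getint("Model", "num_classes"),
        "lr": parser.getfloat("Optimizer", "lr"),
        "ignore_background": parser.getboolean("Loss", "ignore_background"),
        "batch_size": parser.getint("Training", "batch_size"),
        "val_split": parser.getfloat("Training", "val_split"),
        # Comma-separated list, with surrounding whitespace removed.
        "metrics": [m.strip() for m in parser.get("Training", "metrics").split(",")],
    }


settings = load_settings(INI_TEXT)
print(settings)
```

configparser ignores whitespace around the `=` sign, so `lr=0.01` and `lr = 0.01` are read identically.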

👾 Data preparation

The data should be organised as follows:

data/  <--- Select this folder for normalisation, training, or inference
├── dataset_1/
│   ├── images/       # Grayscale TIFF images (e.g., image0000.tif, image0001.tif, ...)
│   ├── masks/        # Corresponding TIFF masks (e.g., mask0000.tif for image0000.tif)
│   ├── raw_images/   # Optional: original, untreated images (renamed after auto brightness/contrast adjustment)
│   └── raw_masks/    # Optional: original, unnormalised masks (renamed after normalisation)
├── dataset_2/
│   ├── images/
│   └── masks/
├── ...
├── dataset_n/
│   ├── images/
│   └── masks/
└── data_stats.json   # Optional, generated during JSON creation
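Before launching a run, it can help to check that a folder matches the tree above. The following sketch is not part of SCHISM: check_layout and its require_masks parameter are hypothetical, and it assumes the images and masks use the .tif extension and that masks may be absent when the data is used only for inference.

```python
from pathlib import Path


def check_layout(data_dir, require_masks=True):
    """Return a list of problems found under data_dir (empty list = layout looks fine)."""
    problems = []
    datasets = sorted(p for p in Path(data_dir).iterdir() if p.is_dir())
    if not datasets:
        return ["no dataset folders found"]
    for dataset in datasets:
        images_dir = dataset / "images"
        masks_dir = dataset / "masks"
        if not images_dir.is_dir():
            problems.append(f"{dataset.name}: missing images/ folder")
            continue
        images = sorted(images_dir.glob("*.tif"))
        if not images:
            problems.append(f"{dataset.name}: no .tif files in images/")
        if not require_masks:
            continue
        if not masks_dir.is_dir():
            problems.append(f"{dataset.name}: missing masks/ folder")
            continue
        masks = sorted(masks_dir.glob("*.tif"))
        if len(masks) != len(images):
            problems.append(
                f"{dataset.name}: {len(images)} images but {len(masks)} masks"
            )
    return problems
```

Calling `check_layout("data")` on a well-formed folder returns an empty list; any other result names the dataset subfolder that needs attention.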

Directory descriptions

images/: Contains the grayscale TIFF input images, sequentially named for logical ordering.
masks/: Contains segmentation masks in SCHISM-compatible format (after normalisation, or provided by the user).
raw_images/: Backup of original images before auto brightness/contrast adjustment.
raw_masks/: Backup of original masks before normalisation.
data_stats.json: (Optional) Automatically generated during JSON creation. Stores mean and standard deviation values per dataset.
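As an illustration of the per-dataset statistics mentioned above, the sketch below computes a mean and standard deviation for each dataset and serialises them as JSON. The key names ("mean", "std"), the use of the population standard deviation, and the toy pixel values are assumptions for the example. The actual schema of data_stats.json is not specified here.

```python
import json
import statistics


def dataset_stats(pixels_by_dataset):
    """Map each dataset name to the mean and population standard deviation of its pixels."""
    return {
        name: {
            "mean": statistics.fmean(values),
            "std": statistics.pstdev(values),
        }
        for name, values in pixels_by_dataset.items()
    }


# Toy pixel values standing in for the flattened images of each dataset.
stats = dataset_stats({
    "dataset_1": [0.0, 0.5, 1.0],
    "dataset_2": [0.2, 0.2, 0.2],
})
print(json.dumps(stats, indent=2))
```

Keeping one entry per dataset lets each dataset be normalised with its own statistics rather than with a single global pair.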

💾 Training Output Files

Upon completing a training session, several files will be generated in the output folder:

data_stats.json: The standard deviation and mean values used to normalise the images.
hyperparameters.ini: A copy of the INI file used for the training session.
learning_curves.png: Displays the loss and metrics values as a function of the epochs.
model_best_{metric(s)}.pth: Contains the best model weights based on each metric specified in the INI file.
model_best_loss.pth: Contains the best model weights based on the loss value.
test/train/val_indices.txt: Saves the indices of images and masks used for training, validation, and testing. These indices are formatted as [dataset subfolder][image or mask number in the folder]. For example, if you have 5,000 image/mask pairs, but num_samples is set to 3,000 and val_split is 0.8, then 2,400 indices will be recorded in train_indices.txt, 600 in val_indices.txt, and the remaining 2,000 in test_indices.txt.
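The arithmetic in the example above can be reproduced with a few lines of Python. The split_sizes helper is hypothetical, and it reads val_split as the fraction of the sampled pairs kept for training, which is what the 2,400/600 figures imply. The rounding rule and the cap at the number of available pairs are assumptions.

```python
def split_sizes(total_pairs, num_samples, val_split):
    """Return (train, val, test) counts for a dataset of total_pairs image/mask pairs."""
    used = min(num_samples, total_pairs)  # assumption: never sample more than exists
    train = round(used * val_split)       # assumption: round to the nearest integer
    val = used - train
    test = total_pairs - used             # pairs left out of the sampled subset
    return train, val, test


print(split_sizes(5000, 3000, 0.8))  # (2400, 600, 2000)
```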

🐞 Debugging

A constant named DEBUG_MODE is defined in tools/constants.py.

If DEBUG_MODE = True, SCHISM will display the full Python trace when an error occurs.
If DEBUG_MODE = False, only a concise error message is shown.

This allows switching between developer-friendly debugging and cleaner end-user output.
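The general pattern behind such a flag can be sketched as follows. This is an illustration only, not the code in tools/constants.py or SCHISM's error handling: report_error is a hypothetical helper, and the local DEBUG_MODE is a stand-in for the real constant.

```python
import traceback

DEBUG_MODE = False  # stand-in for the constant defined in tools/constants.py


def report_error(exc, debug=DEBUG_MODE):
    """Format an exception as a full traceback (debug) or a one-line message."""
    if debug:
        return "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        )
    return f"Error: {exc}"


try:
    int("not a number")
except ValueError as err:
    print(report_error(err, debug=False))  # concise, end-user message
    print(report_error(err, debug=True))   # full Python traceback
```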

❤️‍🔥 Contributions

Contributions are welcome! Please fork the repository and submit a pull request.

😞 Found a bug?

If you encounter a bug or have an issue running the code, please open an issue. If you have any questions or need further assistance, don't hesitate to contact Florent Brondolo (florent.brondolo@akkodis.com) or Samuel Beaussant (samuel.beaussant@akkodis.com).

📚 Citation / Bibtex

If you use our solution or find our work helpful, please consider citing it as follows:

@misc{schism2025,
  title       = {SCHISM: Semantic Classification of High-resolution Imaging for Scanned Materials},
  author      = {Florent Brondolo and Samuel Beaussant and Mehdi Mankaï and Saïd Ezzedine and Ozan Yazar and Pierre Fancelli},
  year        = {2025},
  howpublished= {\url{https://github.com/FloFive/SCHISM}},
  note        = {GitHub repository}
}
