This repository was archived by the owner on Jan 30, 2026. It is now read-only.

Support binary format discovery #69

@pleonex

Implement an API for auto-discovery of converters that are compatible with a binary format. The use case is to open a file (BinaryFormat / IBinary) and recognize its underlying format by finding a converter that can handle it. For instance, this would let us auto-detect palettes, images, script files and more.

The proposal is to create something like this:

public interface IBinaryConverterDiscovery
{
    // Analyze a format (e.g. by reading the first bytes or checking some fields)
    // and return a converter that can be used, or null if it's not compatible.
    IConverter Analyze(IBinary binary);
}

I don't like IConverter as the return type because it means the discovery has to create the converter instance. Sometimes the converter requires initialization or parameters from the user, and sometimes the analyzer itself will fill in those parameters. Another option is to return the Type of the converter to use.

It should also be easy to get the converter type from the return value, so the user can be given the choice before converting (for instance, if several IConverterDiscovery implementations report that they can convert the format).
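As a rough sketch of the "return the Type" option, assuming names that are not part of Yarhl: `IBinaryTypeDiscoverySketch`, `MagicDiscoverySketch` and `DiscoverySketch.FindCandidates` are hypothetical, and a plain `Stream` stands in for `IBinary` so the example compiles on its own. It only illustrates how several candidates could be collected and offered to the user.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical variant of the proposed interface: it reports the converter
// Type instead of an instance. Stream stands in for IBinary here.
public interface IBinaryTypeDiscoverySketch
{
    // Returns the converter type, or null if the data is not recognized.
    Type? Analyze(Stream stream);
}

// Example analyzer (assumption): matches a fixed magic stamp at offset 0.
public sealed class MagicDiscoverySketch : IBinaryTypeDiscoverySketch
{
    private readonly byte[] magic;
    private readonly Type converterType;

    public MagicDiscoverySketch(byte[] magic, Type converterType)
    {
        this.magic = magic;
        this.converterType = converterType;
    }

    public Type? Analyze(Stream stream)
    {
        stream.Position = 0; // requires a seekable stream
        foreach (byte expected in magic) {
            if (stream.ReadByte() != expected) {
                return null;
            }
        }

        return converterType;
    }
}

public static class DiscoverySketch
{
    // Collects every reported type so the caller can let the user choose.
    public static List<Type> FindCandidates(
        IEnumerable<IBinaryTypeDiscoverySketch> discoveries, Stream stream)
    {
        var candidates = new List<Type>();
        foreach (var discovery in discoveries) {
            Type? type = discovery.Analyze(stream);
            if (type is not null && !candidates.Contains(type)) {
                candidates.Add(type);
            }
        }

        return candidates;
    }
}
```

Because nothing is instantiated during discovery, the caller is free to ask the user for parameters before creating the chosen converter.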

It's worth remembering how this was implemented in the first version of Yarhl (LibGame at that time): a FormatValidation abstract class with three methods to implement:

protected abstract ValidationResult TestByTags(IDictionary<string, object> tags);
protected abstract ValidationResult TestByData(DataStream stream);
protected abstract ValidationResult TestByRegexp(string filepath, string filename);

This made it possible to inspect not only the format but also the file path inside a container (like a ROM file) or tags given to the file by another process (like an ID from ROM unpacking). At the end, the results were aggregated and a final score was given for the file.
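The aggregation rule of the old FormatValidation class is not described here, so the following is only a guess at how such a score could be computed. The `ValidationResult` values, the veto rule, the `PaletteValidationSketch` example and its "PLT0" magic are all assumptions, and `Stream` replaces `DataStream` so the sketch is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical values; the original ValidationResult is not shown in this issue.
public enum ValidationResult { Invalid = -1, Unknown = 0, Plausible = 1, Certain = 2 }

public abstract class FormatValidationSketch
{
    protected abstract ValidationResult TestByTags(IDictionary<string, object> tags);
    protected abstract ValidationResult TestByData(Stream stream);
    protected abstract ValidationResult TestByRegexp(string filepath, string filename);

    // Assumed aggregation: any Invalid result vetoes the format (-1);
    // otherwise the numeric values of the three results are summed.
    public int Score(IDictionary<string, object> tags, Stream stream, string filepath, string filename)
    {
        var results = new[] {
            TestByTags(tags),
            TestByData(stream),
            TestByRegexp(filepath, filename),
        };

        int total = 0;
        foreach (ValidationResult result in results) {
            if (result == ValidationResult.Invalid) {
                return -1;
            }

            total += (int)result;
        }

        return total;
    }
}

// Example validator for an invented palette format.
public sealed class PaletteValidationSketch : FormatValidationSketch
{
    protected override ValidationResult TestByTags(IDictionary<string, object> tags) =>
        tags.TryGetValue("id", out object? id) && id is int n && n == 7
            ? ValidationResult.Certain
            : ValidationResult.Unknown;

    protected override ValidationResult TestByData(Stream stream)
    {
        var magic = new byte[4];
        stream.Position = 0;
        if (stream.Read(magic, 0, magic.Length) < magic.Length) {
            return ValidationResult.Unknown; // too short to decide
        }

        return Encoding.ASCII.GetString(magic) == "PLT0"
            ? ValidationResult.Certain
            : ValidationResult.Invalid;
    }

    protected override ValidationResult TestByRegexp(string filepath, string filename) =>
        Regex.IsMatch(filename, @"\.plt$", RegexOptions.IgnoreCase)
            ? ValidationResult.Plausible
            : ValidationResult.Unknown;
}
```

A veto keeps a wrong magic stamp from being outvoted by a matching file name; a weighted sum would be an equally valid choice.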

To handle the situation of inspecting Nodes we could implement another interface:

public interface INodeFormatDiscovery
{
    // Analyze a Node (e.g. the full path, the parent node) and return
    // the converter that can transform the instance.
    IConverter Analyze(Node node);
}

Usually, discovery methods based on Node information (like an ID or the path in a ROM) are more accurate and faster than inspecting the Format.

There could be a DiscoveryManager that handles the full process: checking first with one interface and then with the other, keeping a cache, and more.
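One possible shape for such a manager, as a minimal sketch under several assumptions: `NodeSketch` is a hypothetical stand-in for Yarhl's Node, the checks are plain delegates instead of the proposed interfaces, and the cache is keyed by node path. It runs the cheaper node-based checks first and only falls back to reading the data when none of them matches.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical stand-in for Yarhl's Node: just a path and its data.
public sealed record NodeSketch(string Path, Stream Data);

public sealed class DiscoveryManagerSketch
{
    private readonly List<Func<NodeSketch, Type?>> nodeChecks = new();
    private readonly List<Func<Stream, Type?>> binaryChecks = new();
    private readonly Dictionary<string, Type?> cache = new();

    public void AddNodeCheck(Func<NodeSketch, Type?> check) => nodeChecks.Add(check);

    public void AddBinaryCheck(Func<Stream, Type?> check) => binaryChecks.Add(check);

    // Returns the converter type for the node, or null if nothing matched.
    public Type? Discover(NodeSketch node)
    {
        // Assumption: the path identifies the node well enough to cache on.
        if (cache.TryGetValue(node.Path, out Type? cached)) {
            return cached;
        }

        Type? found = null;

        // Node information first: usually faster and more accurate.
        foreach (var check in nodeChecks) {
            found = check(node);
            if (found is not null) {
                break;
            }
        }

        // Fall back to inspecting the binary data.
        if (found is null) {
            foreach (var check in binaryChecks) {
                node.Data.Position = 0;
                found = check(node.Data);
                if (found is not null) {
                    break;
                }
            }
        }

        cache[node.Path] = found; // negative results are cached too
        return found;
    }
}
```

Returning the first match keeps the sketch short; a real manager could instead collect all matches and rank them.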

Update:
We should output the converter type and, optionally, its parameters as well. We should pass context such as: the name, the fourCC or a 16-byte header, and a read-only tree of nodes (to analyze tags up to the parent or to search for nearby files / paths).
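To make the update more concrete, here is a hedged sketch of what the input context and the output could look like. Every name (`DiscoveryContextSketch`, `DiscoveryOutputSketch`, `DiscoveryContextFactory`) is hypothetical, the read-only node tree is reduced to a parent path for brevity, and `Stream` again stands in for the binary format.

```csharp
#nullable enable
using System;
using System.IO;
using System.Text;

// Hypothetical input: name, up to 16 header bytes and (simplified) parent path.
public sealed record DiscoveryContextSketch(string Name, byte[] Header, string? ParentPath = null);

// Hypothetical output: converter type plus optional parameters for it.
public sealed record DiscoveryOutputSketch(Type ConverterType, object? Parameters = null);

public static class DiscoveryContextFactory
{
    public const int HeaderLength = 16;

    // Reads up to 16 bytes from the start and restores the stream position.
    public static DiscoveryContextSketch FromStream(string name, Stream stream, string? parentPath = null)
    {
        long original = stream.Position;
        try {
            stream.Position = 0;
            var buffer = new byte[HeaderLength];
            int total = 0;
            while (total < buffer.Length) {
                int read = stream.Read(buffer, total, buffer.Length - total);
                if (read == 0) {
                    break; // end of data
                }

                total += read;
            }

            Array.Resize(ref buffer, total);
            return new DiscoveryContextSketch(name, buffer, parentPath);
        } finally {
            stream.Position = original;
        }
    }

    // First four header bytes as ASCII, or an empty string if there are fewer.
    public static string FourCC(this DiscoveryContextSketch context) =>
        context.Header.Length >= 4
            ? Encoding.ASCII.GetString(context.Header, 0, 4)
            : string.Empty;
}
```

Reading the header once up front means each analyzer can work from the context without touching the stream, which keeps the checks cheap.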
