Introduce Parser interface. #13

@OmarMGaber

Description

Introduce a Parser interface to support different file types such as .txt, .html, .pdf, etc. This makes the engine extensible and ready for multi-format parsing.


Why

  • Currently, the engine only supports parsing plain .txt files.
  • Introducing a common interface allows us to plug in support for other formats cleanly.

What to Add

  • Define a Parser interface with a method like Parse(path string) ([]string, error) that returns the indexable words extracted from a file.
  • Refactor the existing .txt parsing logic to implement this interface.
  • Update the engine to use the interface instead of hardcoding .txt logic.
  • Introduce a parser factory that selects a parser based on the file extension or file headers.
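
As a rough illustration of the items above, here is a minimal sketch of a .txt parser that implements the interface, plus an extension-based factory. The names TxtParser, registry, and GetParser are hypothetical and not taken from the existing codebase. Splitting on whitespace is only an assumption about what the current .txt logic does. Unlike the usage snippet in this issue, this GetParser also returns an error so that unsupported extensions can be reported.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// Parser extracts indexable words from the file at path.
type Parser interface {
	Parse(path string) ([]string, error)
}

// TxtParser is a hypothetical plain-text implementation of Parser.
type TxtParser struct{}

// Parse reads the whole file and splits its content on whitespace.
// (Assumption: the real .txt logic may tokenize differently.)
func (TxtParser) Parse(path string) ([]string, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, fmt.Errorf("read %s: %w", path, err)
	}
	return strings.Fields(string(data)), nil
}

// registry maps lowercase file extensions to parsers (assumed layout).
var registry = map[string]Parser{
	".txt": TxtParser{},
}

// GetParser selects a parser from the file extension of path.
func GetParser(path string) (Parser, error) {
	ext := strings.ToLower(filepath.Ext(path))
	p, ok := registry[ext]
	if !ok {
		return nil, fmt.Errorf("no parser registered for %q", ext)
	}
	return p, nil
}

func main() {
	p, err := GetParser("notes.TXT")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("selected %T\n", p)
}
```

With a registry like this, adding .html or .pdf support would mean adding one more map entry, without touching the engine.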

Example

type Parser interface {
    Parse(path string) ([]string, error)
}

// Usage
parser := parserFactory.GetParser("file.html")
words, err := parser.Parse("file.html")
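
The factory item above also mentions selecting by file headers. One possible approach, sketched here as an assumption and not as a decided design, is to sniff the first bytes of the file with the standard library's http.DetectContentType and map the result to a format key. The function name formatFromHeader and the returned keys are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// formatFromHeader maps a file's leading bytes to a hypothetical format key.
// It returns "" when the detected content type has no known parser.
func formatFromHeader(head []byte) string {
	// DetectContentType inspects at most the first 512 bytes.
	ct := http.DetectContentType(head)
	switch {
	case strings.HasPrefix(ct, "text/html"):
		return ".html"
	case strings.HasPrefix(ct, "application/pdf"):
		return ".pdf"
	case strings.HasPrefix(ct, "text/plain"):
		return ".txt"
	default:
		return ""
	}
}

func main() {
	fmt.Println(formatFromHeader([]byte("%PDF-1.7\n")))
	fmt.Println(formatFromHeader([]byte("<html><body>hi</body></html>")))
	fmt.Println(formatFromHeader([]byte("plain words only")))
}
```

A factory could fall back to this check when the extension is missing or unknown.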

Metadata

Labels

core: Core feature must be implemented
