
🚀 Feature Request: Extract JSON from noisy or wrapped AI model output #11

@ErfanBahramali

Hi,
First, thank you for this great library; it’s very helpful.

I’d like to suggest an enhancement based on common challenges when working with AI models, especially LLMs that aren’t always very consistent.


✏️ Problem

When using LLMs to generate JSON, the output often includes extra text before or after the actual JSON.
For example:

Hello! Here is your result:
{"key":"value"}

Or sometimes the output is formatted as code blocks:

```json
[{ "key": "value" }]
```

In such cases, calling parse directly fails, since the string isn’t valid JSON.
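To make the failure concrete, here is a minimal sketch that uses the built-in `JSON.parse` as a stand-in for a strict parser (an assumption; the library's own `parse` is not called here). The helper name `tryParse` is hypothetical.

```typescript
// Hypothetical helper: returns the parsed value, or null when the input is not valid JSON.
// (A literal JSON `null` input would also yield null; that is acceptable for this demo.)
function tryParse(text: string): unknown {
  try {
    return JSON.parse(text);
  } catch {
    return null;
  }
}

const withPrefix = 'Hello! Here is your result:\n{"key":"value"}';
const withFence = '```json\n[{ "key": "value" }]\n```';

console.log(tryParse(withPrefix)); // null: the greeting makes the whole string invalid
console.log(tryParse(withFence)); // null: the backtick fence is not JSON
console.log(tryParse('{"key":"value"}')); // { key: 'value' }
```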


💡 Proposed solution

Add an option (like other enums) that allows the parser to:

  • Detect and extract the first JSON snippet from the text, ignoring non‑JSON prefixes or suffixes.
  • Optionally specify whether to search for an object ({...}) or an array ([...]) depending on what the user expects.
  • Optionally fix slightly malformed JSON, e.g. removing extra characters between braces or brackets.

Example utility function to extract JSON from text:

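export function extractJsonFromText(text: string): unknown[] {
  // Match candidate snippets: an opening brace or bracket, JSON tokens or quoted strings, then a closing one.
  const matches = text.match(/[{\[]{1}([,:{}\[\]0-9.\-+Eaeflnr-u \n\r\t]|".*?")+[}\]]{1}/gis);
  // match() returns null when nothing is found, so fall back to an empty list.
  return (matches ?? []).map((m) => JSON.parse(m)).flat();
}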
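A regular expression cannot reliably track nesting, so an alternative is a small scanner that looks for the first balanced span and lets `JSON.parse` validate it. The sketch below is only one possible approach, not the library's behavior. The names `JsonKind`, `findFirstJson`, and `findMatchingClose` are hypothetical, and the `kind` parameter illustrates the "object or array" choice proposed above.

```typescript
// Hypothetical names: nothing here is part of any existing library.
type JsonKind = "object" | "array" | "any";

// Return the first balanced {...} or [...] span that JSON.parse accepts,
// or undefined when no such span exists.
function findFirstJson(text: string, kind: JsonKind = "any"): unknown {
  const openers = kind === "object" ? "{" : kind === "array" ? "[" : "{[";
  for (let start = 0; start < text.length; start++) {
    if (!openers.includes(text.charAt(start))) continue;
    const end = findMatchingClose(text, start);
    if (end === -1) continue;
    try {
      return JSON.parse(text.slice(start, end + 1));
    } catch {
      // Balanced but not valid JSON (e.g. "{not json}"); keep scanning.
    }
  }
  return undefined;
}

// Return the index of the bracket that closes the one at `start`, or -1.
// Brackets inside double-quoted strings are ignored; backslash escapes are honored.
function findMatchingClose(text: string, start: number): number {
  let depth = 0;
  let inString = false;
  for (let i = start; i < text.length; i++) {
    const ch = text.charAt(i);
    if (inString) {
      if (ch === "\\") i++; // skip the escaped character
      else if (ch === '"') inString = false;
    } else if (ch === '"') {
      inString = true;
    } else if (ch === "{" || ch === "[") {
      depth++;
    } else if (ch === "}" || ch === "]") {
      depth--;
      if (depth === 0) return i;
    }
  }
  return -1;
}

console.log(findFirstJson('Hello! Here is your result:\n{"key":"value"}')); // { key: 'value' }
```

Because the scanner only counts depth and leaves validation to `JSON.parse`, mismatched pairs such as `{]` are rejected at the parse step and scanning continues from the next opener.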

✅ Example usage

import { parse } from "partial-json";

const result = parse('```{"key":"value"}```');
// Ideally: result => { key: 'value' }
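Until something like this exists in the library, the fenced case can be handled by a thin wrapper that strips the fence before parsing. This is a sketch under assumptions: `stripCodeFence` and `parseFenced` are hypothetical names, and `JSON.parse` is used as the default parser so the example runs on its own; the library's `parse` could be passed in its place.

```typescript
// Hypothetical helper: remove one surrounding Markdown code fence, if present.
// A language tag (e.g. "json") is only recognized when it is followed by a newline.
function stripCodeFence(text: string): string {
  const fence = /^\s*```(?:[\w-]*\r?\n)?([\s\S]*?)\s*```\s*$/;
  const match = fence.exec(text);
  return match?.[1] ?? text;
}

// Hypothetical wrapper: unwrap the fence, then hand the rest to any parser.
// JSON.parse is the default here; a partial-JSON parser could be supplied instead.
function parseFenced(
  text: string,
  parser: (s: string) => unknown = JSON.parse,
): unknown {
  return parser(stripCodeFence(text));
}

console.log(parseFenced('```{"key":"value"}```')); // { key: 'value' }
```

Text without a fence is passed through unchanged, so the wrapper is safe to apply to every model response.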

Looking forward to your feedback! 🌱

Metadata

Labels

wontfix: This will not be worked on
