
My experiment with langextract - a Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.


suddeb/langextract


LangExtract

Introduction

LangExtract is a Python library that uses LLMs to extract structured information from unstructured text documents based on user-defined instructions. It processes materials such as clinical notes or reports, identifying and organizing key details while ensuring the extracted data corresponds to the source text.
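The idea of keeping extracted data tied to the source text can be illustrated with a small standalone sketch. This is not LangExtract's API or its internal method; the names `GroundedExtraction` and `ground` are hypothetical, and the candidate list stands in for whatever an LLM would return. The sketch only shows the general principle: an extraction counts as grounded when its text can be located verbatim in the source, and its character offsets are recorded.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class GroundedExtraction:
    """Hypothetical record: one extracted item plus where it sits in the source."""
    extraction_class: str          # e.g. "medication"
    extraction_text: str           # text the model claims to have extracted
    start: Optional[int] = None    # character offset in the source, None if not found
    end: Optional[int] = None

    @property
    def grounded(self) -> bool:
        return self.start is not None


def ground(source: str, candidates: List[Tuple[str, str]]) -> List[GroundedExtraction]:
    """Attach character offsets to each (class, text) pair that occurs verbatim in source."""
    results = []
    cursor = 0
    for extraction_class, text in candidates:
        idx = -1
        if text:
            # Search forward from the previous match first so repeated phrases
            # map in reading order, then fall back to the whole source.
            idx = source.find(text, cursor)
            if idx == -1:
                idx = source.find(text)
        if idx == -1:
            # Not present verbatim: keep the item but leave it ungrounded.
            results.append(GroundedExtraction(extraction_class, text))
        else:
            results.append(GroundedExtraction(extraction_class, text, idx, idx + len(text)))
            cursor = idx + len(text)
    return results


if __name__ == "__main__":
    note = "Take Amoxicillin 500 mg twice daily for 7 days."
    # Stand-in for model output; "Ibuprofen" is deliberately absent from the note.
    items = [("medication", "Amoxicillin"), ("dosage", "500 mg"), ("medication", "Ibuprofen")]
    for item in ground(note, items):
        print(item)
```

Items that cannot be located are kept but flagged as ungrounded, which makes it easy to spot text the model may have invented.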

This GitHub repo stores my experiments in extracting structured information from different unstructured sources.

Code

  • experiment-1.py - Python program that uses langextract to process simple unstructured text and produce structured information.
  • experiment-2.py - Python program that uses langextract to process a medical prescription (unstructured text) and produce structured data.
  • alice_in_wonderland.py - Python program that uses langextract to process a short story sourced online (unstructured text) and produce useful information (structured data).
  • romeo_juliet.py - Python program that uses langextract to process a short story sourced online (unstructured text) and produce useful information (structured data).
  • spiderman.py - Python program that uses langextract to process a short story sourced online (unstructured text) and produce useful information (structured data).

Screenshots

Experiment 1: [screenshot]

Experiment 2: [screenshot]

Alice In Wonderland: [screenshot]

Romeo Juliet: [screenshot]

Spiderman: [screenshot]

Helper Program

The Python program below copies the contents of a wiki page and writes them to a .txt file.

wiki-to-text.py
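The contents of wiki-to-text.py are not shown here, so the following is only a rough sketch of how such a helper could work using the Python standard library. The names `html_to_text` and `save_page_as_text`, the User-Agent string, and the choice to skip script and style blocks are all assumptions, not the repo's actual code.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class _TextCollector(HTMLParser):
    """Collects visible text, ignoring the contents of <script> and <style>."""
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html: str) -> str:
    """Return the text fragments of an HTML document, one per line."""
    parser = _TextCollector()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def save_page_as_text(url: str, out_path: str) -> None:
    """Download a page (hypothetical helper) and write its text to out_path."""
    request = Request(url, headers={"User-Agent": "wiki-to-text-sketch/0.1"})
    with urlopen(request) as response:
        html = response.read().decode("utf-8", errors="replace")
    with open(out_path, "w", encoding="utf-8") as handle:
        handle.write(html_to_text(html))
```

Calling `save_page_as_text(url, "story.txt")` with a page URL of your choice would produce a text file that the experiment scripts can read. Note that this simple approach puts each text fragment on its own line, so inline markup such as bold text splits a sentence across lines.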

Authors
