LangExtract is a Python library that uses LLMs to extract structured information from unstructured text documents based on user-defined instructions. It processes materials such as clinical notes or reports, identifying and organizing key details while ensuring the extracted data corresponds to the source text.
This GitHub repo collects my experiments in extracting structured information from different unstructured sources.
- experiment-1.py - Python program that uses LangExtract to process simple unstructured text and return structured information.
- experiment-2.py - Python program that uses LangExtract to process a medical prescription (unstructured text) and return structured data.
- alice_in_wonderland.py - Python program that uses LangExtract to process a short story found online (unstructured text) and return useful information (structured data).
- romeo_juliet.py - Python program that uses LangExtract to process a short story found online (unstructured text) and return useful information (structured data).
- spiderman.py - Python program that uses LangExtract to process a short story found online (unstructured text) and return useful information (structured data).
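For orientation, here is a minimal sketch of the general shape a prescription-style experiment might take. It is not a copy of any script in this repo: the prompt wording, the example sentence, the helper names `build_prompt` and `run_extraction`, and the `gemini-2.5-flash` model ID are all assumptions chosen for illustration. It relies on the `lx.extract`, `lx.data.ExampleData`, and `lx.data.Extraction` calls shown in the LangExtract README, and assumes an API key is already configured (for example via the `LANGEXTRACT_API_KEY` environment variable).

```python
import textwrap


def build_prompt() -> str:
    """Return the extraction instructions (hypothetical wording)."""
    return textwrap.dedent("""\
        Extract medications with their dosage and frequency.
        Use the exact wording from the text; do not paraphrase.""")


def run_extraction(text: str, model_id: str = "gemini-2.5-flash"):
    """Run LangExtract on `text` with one hand-written few-shot example."""
    # Imported lazily so this module still loads when langextract is absent.
    import langextract as lx

    # One illustrative example showing the model what to extract.
    examples = [
        lx.data.ExampleData(
            text="Take Amoxicillin 500 mg three times daily.",
            extractions=[
                lx.data.Extraction(
                    extraction_class="medication",
                    extraction_text="Amoxicillin",
                    attributes={
                        "dosage": "500 mg",
                        "frequency": "three times daily",
                    },
                ),
            ],
        )
    ]
    return lx.extract(
        text_or_documents=text,
        prompt_description=build_prompt(),
        examples=examples,
        model_id=model_id,  # assumed model; swap in whichever you use
    )


# Example usage (needs langextract installed and an API key):
# result = run_extraction("Patient was prescribed Ibuprofen 200 mg twice a day.")
# for item in result.extractions:
#     print(item.extraction_class, item.extraction_text, item.attributes)
```

Keeping the prompt in its own function makes it easy to vary the instructions between experiments while reusing the same extraction call.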
The Python program below copies the contents of a Wiki page into a txt file.
wiki-to-text.py
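The idea behind such a helper can be sketched with the standard library alone. This is an illustrative sketch, not the contents of wiki-to-text.py: the function names `html_to_text` and `save_page_as_text`, the User-Agent string, and the choice of `html.parser` over a third-party scraper are all assumptions. The approach is deliberately crude: it keeps every visible text node on its own line and drops only `script` and `style` contents.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class _TextExtractor(HTMLParser):
    """Collect visible text nodes, skipping script and style contents."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML string, one text node per line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


def save_page_as_text(url: str, out_path: str) -> None:
    """Download `url` and write its visible text to `out_path`."""
    # Hypothetical User-Agent; some sites reject requests without one.
    request = Request(url, headers={"User-Agent": "wiki-to-text-sketch/0.1"})
    with urlopen(request) as response:
        html = response.read().decode("utf-8", errors="replace")
    with open(out_path, "w", encoding="utf-8") as handle:
        handle.write(html_to_text(html))
```

The resulting txt file can then be read and passed to a LangExtract script as its input text. Navigation menus and footers are not filtered out here, so the output may need trimming before extraction.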




