After testing it on my own data, my impressions are mixed but mostly positive. Some responses are very good, but there are occasional unexplained artifacts, and sometimes the model can't answer a question even though the answer is present in the sources it prints.
Pros:
- Open source
- Minimal setup required
- Runs on CPU
- Privacy and security
- Supports a wide range of document types
- Doesn't require internet connection
Cons:
- Speed (on my local machine, responses take anywhere from 20 to 60 seconds)
- The base model distorts unfamiliar terms (names, surnames, specialized terms)
- Performs poorly if the prompt is not in the form of a question
For now, my conclusion is that it's not a replacement for ChatGPT and its counterparts, although there is room to improve both speed and accuracy.
Repo: https://github.com/imartinez/privateGPT
Built with LangChain, LlamaIndex, GPT4All, LlamaCpp, Chroma and SentenceTransformers.
In order to set your environment up to run the code here, first install all requirements:
pip3 install -r requirements.txt
Download the LLM model and place it in the llm_models directory:
- LLM: defaults to ggml-gpt4all-j-v1.3-groovy.bin. If you prefer a different GPT4All-J compatible model, just download it and reference it in your .env file.
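As a rough illustration of how a script could pick up the model reference from a .env file, here is a minimal sketch using only the standard library. The key name MODEL_PATH and the helper parse_env are assumptions made for this example; check the repo's own example .env for the real variable names.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    settings: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


if __name__ == "__main__":
    # MODEL_PATH is a hypothetical key name used only for illustration.
    example = "MODEL_PATH=llm_models/ggml-gpt4all-j-v1.3-groovy.bin\n"
    print(parse_env(example)["MODEL_PATH"])
```

Swapping in a different GPT4All-J compatible model would then only mean changing that one value in the file.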
Note: because of the way LangChain loads the SentenceTransformers embeddings, the first time you run the script it will require an internet connection to download the embeddings model itself.
This repo uses a State of the Union transcript as an example.
Run the following command to ingest all the data.
python ingest.py
Output should look like this:
Creating new vectorstore
Loading documents from source_documents
Loading new documents: 100%|██████████████████████| 1/1 [00:01<00:00, 1.73s/it]
Loaded 1 new documents from source_documents
Split into 90 chunks of text (max. 500 tokens each)
Creating embeddings. May take some minutes...
Using embedded DuckDB with persistence: data will be stored in: db
Ingestion complete! You can now run privateGPT.py to query your documents
Ingestion creates a db folder containing the local vectorstore. It takes 20-30 seconds per document, depending on the size of the document.
You can ingest as many documents as you want, and all will be accumulated in the local embeddings database.
If you want to start from an empty database, delete the db folder.
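To make the "Split into 90 chunks of text (max. 500 tokens each)" step more concrete, here is a simplified sketch of chunking. It is not the repo's actual splitter: it treats whitespace-separated words as tokens, and the function name split_into_chunks and the overlap of 50 are assumptions chosen for illustration.

```python
def split_into_chunks(text: str, max_tokens: int = 500, overlap: int = 50) -> list[str]:
    """Split text into chunks of at most max_tokens words, with some overlap."""
    if max_tokens <= 0 or not 0 <= overlap < max_tokens:
        raise ValueError("need max_tokens > 0 and 0 <= overlap < max_tokens")
    words = text.split()
    step = max_tokens - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break  # the last window already reached the end of the text
    return chunks


if __name__ == "__main__":
    sample = " ".join(f"w{i}" for i in range(1200))
    print(len(split_into_chunks(sample)))
```

The overlap keeps a sentence that straddles a chunk boundary visible in both neighbouring chunks, which tends to help retrieval later on.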
In order to ask a question, run a command like:
python privateGPT.py
Then wait for the script to prompt for your input:
> Enter a query:
Type your question and hit enter. You'll need to wait 20-30 seconds (depending on your machine) while the LLM consumes the prompt and prepares the answer. Once done, it prints the answer and the 4 sources it used as context from your documents. You can then ask another question without re-running the script; just wait for the prompt again.
Note: you could turn off your internet connection, and the script inference would still work. No data gets out of your local environment.
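The "4 sources used as context" come from a similarity search over the stored embeddings. The project delegates this to its vectorstore, so the following is only a minimal sketch of the idea, assuming embeddings are plain lists of floats and that cosine similarity is the ranking metric; the function names are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vec: list[float], chunk_vecs: list[list[float]], k: int = 4) -> list[int]:
    """Return the indices of the k chunks most similar to the query, best first."""
    scored = [(cosine_similarity(query_vec, vec), idx) for idx, vec in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]
```

The selected chunks would then be placed into the prompt alongside the question, which is why an answer can only draw on what those few chunks contain.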
Type exit to finish the script.
The script also supports optional command-line arguments to modify its behavior. You can see a full list of these arguments by running the command python privateGPT.py --help in your terminal.
To use this software, you must have Python 3.10 or later installed. Earlier versions of Python will not compile.
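If you want to verify the Python requirement before installing anything, a small check along these lines works. The helper name python_is_supported is made up for this example; only sys.version_info comes from the standard library.

```python
import sys

MIN_PYTHON = (3, 10)  # minimum version stated in the requirements above


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON) -> bool:
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if not python_is_supported():
        found = ".".join(str(part) for part in sys.version_info[:3])
        sys.exit(f"Python 3.10 or later is required, found {found}")
    print("Python version OK")
```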
If you encounter an error while building a wheel during the pip install process, you may need to install a C++ compiler on your computer.
To install a C++ compiler on Windows 10/11, follow these steps:
- Install Visual Studio 2022.
- Make sure the following components are selected:
  - Universal Windows Platform development
  - C++ CMake tools for Windows
- Download the MinGW installer from the MinGW website.
- Run the installer and select the gcc component.