diff --git a/.env.example b/.env.example
index 77bb563..778ad55 100644
--- a/.env.example
+++ b/.env.example
@@ -1,6 +1,6 @@
 OPENAI_API_KEY=
 
-# Update these with your Supabase details from your project settings > API
+# Update these with your Supabase details from your project settings > API and dashboard settings
 PINECONE_API_KEY=
 PINECONE_ENVIRONMENT=
-
+PINECONE_INDEX_NAME=
diff --git a/README.md b/README.md
index e4d7969..93442d5 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
-# GPT-4 & LangChain - Create a ChatGPT Chatbot for Your PDF Docs
+# GPT-4 & LangChain - Create a ChatGPT Chatbot for Your PDF Files
 
-Use the new GPT-4 api to build a chatGPT chatbot for Large PDF docs (56 pages used in this example).
+Use the new GPT-4 api to build a chatGPT chatbot for multiple Large PDF files.
 
 Tech stack used includes LangChain, Pinecone, Typescript, Openai, and Next.js. LangChain is a framework that makes it easier to build scalable AI/LLM apps and chatbots. Pinecone is a vectorstore for storing embeddings and your PDF in text to later retrieve similar docs.
 
@@ -37,28 +37,30 @@ OPENAI_API_KEY=
 
 PINECONE_API_KEY=
 PINECONE_ENVIRONMENT=
+PINECONE_INDEX_NAME=
+
 ```
 
 - Visit [openai](https://help.openai.com/en/articles/4936850-where-do-i-find-my-secret-api-key) to retrieve API keys and insert into your `.env` file.
-- Visit [pinecone](https://pinecone.io/) to create and retrieve your API keys.
+- Visit [pinecone](https://pinecone.io/) to create and retrieve your API keys, and also retrieve your environment and index name from the dashboard.
 
-4. In the `config` folder, replace the `PINECONE_INDEX_NAME` and `PINECONE_NAME_SPACE` with your own details from your pinecone dashboard.
+4. In the `config` folder, replace the `PINECONE_NAME_SPACE` with a `namespace` where you'd like to store your embeddings on Pinecone when you run `pnpm run ingest`. This namespace will later be used for queries and retrieval.
 
-5. In `utils/makechain.ts` chain change the `QA_PROMPT` for your own usecase. Change `modelName` in `new OpenAIChat` to a different api model if you don't have access to `gpt-4`. See [the OpenAI docs](https://platform.openai.com/docs/models/model-endpoint-compatibility) for a list of supported `modelName`s. For example you could use `gpt-3.5-turbo` if you do not have access to `gpt-4`, yet.
+5. In `utils/makechain.ts` chain change the `QA_PROMPT` for your own usecase. Change `modelName` in `new OpenAIChat` to `gpt-3.5-turbo`, if you don't have access to `gpt-4`. Please verify outside this repo that you have access to `gpt-4`, otherwise the application will not work with it.
 
-## Convert your PDF to embeddings
+## Convert your PDF files to embeddings
 
-1. In `docs` folder replace the pdf with your own pdf doc.
+**This repo can load multiple PDF files**
 
-2. In `scripts/ingest-data.ts` replace `filePath` with `docs/{yourdocname}.pdf`
+1. Inside `docs` folder, add your pdf files or folders that contain pdf files.
 
-3. Run the script `npm run ingest` to 'ingest' and embed your docs
+2. Run the script `npm run ingest` to 'ingest' and embed your docs. If you run into errors troubleshoot below.
 
-4. Check Pinecone dashboard to verify your namespace and vectors have been added.
+3. Check Pinecone dashboard to verify your namespace and vectors have been added.
 
 ## Run the app
 
-Once you've verified that the embeddings and content have been successfully added to your Pinecone, you can run the app `npm run dev` to launch the local dev environment and then type a question in the chat interface.
+Once you've verified that the embeddings and content have been successfully added to your Pinecone, you can run the app `pnpm run dev` to launch the local dev environment, and then type a question in the chat interface.
 
 ## Troubleshooting
 
@@ -68,17 +70,18 @@ In general, keep an eye out in the `issues` and `discussions` section of this re
 
 - Make sure you're running the latest Node version. Run `node -v`
 - Make sure you're using the same versions of LangChain and Pinecone as this repo.
-- Check that you've created an `.env` file that contains your valid (and working) API keys.
+- Check that you've created an `.env` file that contains your valid (and working) API keys, environment and index name.
 - If you change `modelName` in `OpenAIChat` note that the correct name of the alternative model is `gpt-3.5-turbo`
-- Pinecone indexes of users on the Starter(free) plan are deleted after 7 days of inactivity. To prevent this, send an API request to Pinecone to reset the counter.
+- Make sure you have access to `gpt-4` if you decide to use. Test your openAI keys outside the repo and make sure it works and that you have enough API credits.
+- Your pdf file is corrupted and cannot be parsed.
 
 **Pinecone errors**
 
-- Make sure your pinecone dashboard `environment` and `index` matches the one in your `config` folder.
+- Make sure your pinecone dashboard `environment` and `index` matches the one in the `pinecone.ts` and `.env` files.
 - Check that you've set the vector dimensions to `1536`.
-- Switch your Environment in pinecone to `us-east1-gcp` if the other environment is causing issues.
-
-If you're stuck after trying all these steps, delete `node_modules`, restart your computer, then `pnpm install` again.
+- Make sure your pinecone namespace is in lowercase.
+- Pinecone indexes of users on the Starter(free) plan are deleted after 7 days of inactivity. To prevent this, send an API request to Pinecone to reset the counter.
+- Retry from scratch with a new Pinecone index and cloned repo.
 
 ## Credit
 
diff --git a/components/layout.tsx b/components/layout.tsx
index 4481b4d..5e3d207 100644
--- a/components/layout.tsx
+++ b/components/layout.tsx
@@ -14,7 +14,7 @@ export default function Layout({ children }: LayoutProps) {
-
{error}
+