13 changes: 13 additions & 0 deletions Rohit C/Internship request letter.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
I am writing to express my interest in the LLM internship opportunity at Hardpoint Consulting, announced in connection with the International Hackathon 2024. With a strong academic background and a passion for AI, LLMs and web development, I am eager to contribute to your esteemed organization and further develop my skills in any area required.

I have been honing my knowledge of using LLMs integrated with Python frameworks. Through my projects and hackathons, I have developed a keen understanding of LLM training, which I am eager to apply in a professional setting.

During my internship, I am keen to immerse myself in any task assigned, with the goal of gaining practical experience and contributing to the success of your team. I am particularly drawn to the vision of this organization and am excited about this opportunity.

In the short term, my one-year goal is to gain as many new experiences as possible. Over the next three years, I aim to work on integrating AI and LLMs into people's day-to-day lives. Looking further ahead, my five-year goal is to lead a team of passionate developers to explore every corner of AI and to develop products that compete with even today's tech giants.

Thank you for considering my application. I am excited about the opportunity to contribute to Hardpoint Consulting and look forward to the possibility of discussing how my skills and experience align with your internship program.

Sincerely,

Rohit C.
9 changes: 9 additions & 0 deletions Rohit C/XTRKT/.env
@@ -0,0 +1,9 @@
SECRET_KEY = #Your Django secret key
ENGINE = #Your database engine
NAME = #Your database name
USER = #Your database username
PASSWORD = #Your database password
HOST = #Your database host name
PORT = #Your database port number
API_SECRET = #Your convertapi secret key
API_KEY = #Your gemini api secret key
72 changes: 72 additions & 0 deletions Rohit C/XTRKT/README.md
@@ -0,0 +1,72 @@
# XTRKT

## Work Flow

The landing page of the website is shown below. It gives a brief description of the project and lets the user upload a folder containing the dataset to be processed.

<img src="images/home.png" alt="Home Page" width="500" height="250">

The upload button lets the user choose the PDF containing the data, as shown below.

<img src="images/upload.png" alt="Upload" width="500" height="250">

The dataset is processed inside the `upload()` function in the `views.py` file within the `task` folder of the project.
The images are parsed one by one and passed to the OCR module of the project; the text it extracts is then sent to the Gemini API via prompts.
The LLM returns the data to be generated from the content provided.
These values are then stored in the Output table of the database.
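
The flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in `views.py`: `OutputRow`, `build_prompt` and `process_images` are hypothetical names, the prompt wording is invented, and the OCR and LLM steps are passed in as plain callables so that the sketch runs without Tesseract or a Gemini API key.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class OutputRow:
    """Hypothetical stand-in for one row of the Output table."""
    source: str
    extracted_text: str
    llm_response: str


def build_prompt(text: str) -> str:
    # Hypothetical prompt wording; the project's real prompts may differ.
    return "Extract the key fields from the following text:\n" + text


def process_images(
    image_paths: Iterable[str],
    ocr: Callable[[str], str],
    llm: Callable[[str], str],
) -> List[OutputRow]:
    """Run OCR on each image, prompt the LLM with the text, collect rows."""
    rows = []
    for path in image_paths:
        text = ocr(path).strip()
        if not text:
            continue  # skip pages where OCR found no text
        rows.append(OutputRow(path, text, llm(build_prompt(text))))
    return rows


# Usage with stub callables standing in for the OCR module and the Gemini API.
fake_ocr = {"page1.png": "Invoice 42 ", "page2.png": "  "}.get
fake_llm = lambda prompt: "LLM:" + prompt.splitlines()[-1]
rows = process_images(["page1.png", "page2.png"], fake_ocr, fake_llm)
```

Injecting the two steps as callables keeps the loop testable; in the actual project they would wrap the Tesseract and Gemini calls, and each row would be saved through the Django model behind the Output table.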

<img src="images/database.png" alt="Database" width="500" height="250">

This data is then passed to the frontend, which displays the output in the specified format and lets users edit it.

<img src="images/extract.png" alt="Extract" width="500" height="250">

Users can now either trash the extracts or go on to enter their research.

<img src="images/research.png" alt="Research" width="500" height="250">

Users then get an additional view of the extracted data along with its relation to their research.

<img src="images/additional.png" alt="Additional view" width="500" height="250">

Users can now choose to save the data permanently or trash it entirely.

<img src="images/store.png" alt="Store" width="500" height="250">

The user can also create an Excel sheet of the displayed output at the click of a button.

<img src="images/create.png" alt="Create" width="500" height="250">

The Excel sheet is created in the `excel` folder within the `media` directory of the project.
The user can then download the Excel file to their local system.

<img src="images/download.png" alt="Download" width="500" height="250">

The downloaded .xlsx file can be opened in any application that supports Excel spreadsheets.

<img src="images/excel.png" alt="Excel" width="500" height="250">

Users can also view all the data saved so far by clicking the 'View saved data' button on the home page.

<img src="images/saved.png" alt="Saved data" width="500" height="250">

Thus XTRKT combines AI, prompt engineering, web development and database management to extract structured results from a variety of datasets.

## Technology Stack
- The project is built using the Django Model-View-Template (MVT) architecture for full-stack web development.
- The Tesseract module is used to detect text in images.
- The Google Gemini 1.0 Pro API is used for prompting.
- A local MySQL database is used to store the data extracted through prompts.
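
As a rough illustration of how the database variables listed in the `.env` file (`ENGINE`, `NAME`, `USER`, `PASSWORD`, `HOST`, `PORT`) could feed Django's `DATABASES` setting, here is a minimal sketch. The helper `database_settings`, the fallback values and the example credentials are assumptions made for illustration; the project's actual `settings.py` may load these values differently.

```python
import os
from typing import Mapping


def database_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Build a Django-style DATABASES dict from environment variables."""
    return {
        "default": {
            # Assumed fallback: Django's built-in MySQL backend.
            "ENGINE": env.get("ENGINE", "django.db.backends.mysql"),
            "NAME": env["NAME"],          # required, raises KeyError if missing
            "USER": env["USER"],
            "PASSWORD": env["PASSWORD"],
            "HOST": env.get("HOST", "localhost"),  # assumed fallback
            "PORT": env.get("PORT", "3306"),       # default MySQL port
        }
    }


# Example with placeholder values instead of real credentials.
example_env = {"NAME": "xtrkt", "USER": "xtrkt_user", "PASSWORD": "change-me"}
DATABASES = database_settings(example_env)
```

Note that `USER` is usually already set by the shell on Unix-like systems, so it is worth checking that the value from `.env` is the one that actually reaches the settings.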

## Where to Look

- Refer to the 'views.py' file in the task folder for the well-documented backend code.
- Refer to the 'templates' folder to see the HTML pages developed for this project.
- Refer to the 'media' folder to view a sample dataset and the Excel sheet generated for it.
- Refer to the '.env' file for the list of environment variables and API keys required to run this code.
- Install the dependencies listed in requirements.txt using the command 'pip install -r requirements.txt'.
- Run the project using the command 'py manage.py runserver'.

Regards,

Rohit C.