rachanavarsha/Text-to-SQL

Text-to-SQL Pipeline and Conversational AI Chatbot at OnTrac

This repository contains the work I completed during my internship at OnTrac, where I worked with the Data Engineering team to build a Text-to-SQL pipeline and a conversational AI chatbot using Vertex AI. The project focused on translating natural language questions into optimized SQL statements so that users can query and interact with data without writing SQL by hand.

Project Overview

During this project, I contributed to a Text-to-SQL solution designed to streamline data querying for the data analytics team. The pipeline lets users interact with the database in natural language and automatically translates their questions into efficient SQL statements, which made data more accessible and easier to use within the organization.

Key Contributions

Data Dictionary Development: Collaborated with the Data Engineering team to create a comprehensive data dictionary that served as a foundation for the Text-to-SQL pipeline.

Text-to-SQL Pipeline: Designed and developed an NLP-powered pipeline that translates natural language queries into SQL commands, enabling automated query generation.

Conversational AI Chatbot: Implemented a chatbot using Vertex AI to facilitate interaction with the database via natural language, improving user experience and accessibility.

ETL Pipeline Implementation: Built end-to-end ETL (Extract, Transform, Load) pipelines to move data from the landing zone to the hub, ensuring data quality, high availability, and minimal disruptions.

Deployment on GCP: Deployed the Text-to-SQL solution on Google Cloud Platform (GCP) for enhanced scalability, performance optimization, and seamless integration with cloud-based data services.
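
The contributions above can be illustrated with a minimal sketch of how a data dictionary might feed a Text-to-SQL prompt. This is not the code used at OnTrac: the `DATA_DICTIONARY` contents, the `shipments` table, and the function names are hypothetical, and the model call is a plain callable (`generate`) standing in for whatever Vertex AI client the real pipeline used. The sketch also includes a conservative read-only check before a generated statement is accepted.

```python
from typing import Callable, Dict

# Hypothetical data dictionary: table name -> {column name: description}.
DATA_DICTIONARY: Dict[str, Dict[str, str]] = {
    "shipments": {
        "shipment_id": "Unique identifier of a shipment",
        "status": "Current delivery status",
        "delivered_at": "Delivery timestamp, NULL if not delivered",
    },
}


def build_prompt(question: str, dictionary: Dict[str, Dict[str, str]]) -> str:
    """Render the data dictionary and the user question into one prompt."""
    lines = ["Translate the question into a single SQL SELECT statement.", "Schema:"]
    for table, columns in dictionary.items():
        lines.append(f"Table {table}:")
        for column, description in columns.items():
            lines.append(f"  - {column}: {description}")
    lines.append(f"Question: {question}")
    lines.append("SQL:")
    return "\n".join(lines)


def is_read_only(sql: str) -> bool:
    """Accept only a single statement whose first keyword is SELECT.

    Deliberately conservative: any semicolon inside the statement
    (even in a string literal) causes rejection.
    """
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        return False
    parts = statement.split()
    return bool(parts) and parts[0].lower() == "select"


def text_to_sql(
    question: str,
    generate: Callable[[str], str],
    dictionary: Dict[str, Dict[str, str]] = DATA_DICTIONARY,
) -> str:
    """Build the prompt, call the model, and validate the returned SQL."""
    sql = generate(build_prompt(question, dictionary)).strip()
    if not is_read_only(sql):
        raise ValueError("Generated statement is not a single SELECT")
    return sql


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call; always returns the same query."""
    return "SELECT COUNT(*) FROM shipments WHERE status = 'delivered';"


if __name__ == "__main__":
    print(text_to_sql("How many shipments were delivered?", fake_model))
```

Passing the model as a callable keeps the prompt construction and validation testable without cloud credentials; in a deployed version, `generate` would wrap the actual model client.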

Technologies Used

Google Cloud Platform (GCP): For deploying the chatbot and data pipelines.

Vertex AI: Used for building and training NLP models to generate SQL queries from natural language.

Python: Primary language for implementing ETL processes and data handling.

Machine Learning Frameworks: Leveraged NLP techniques to enhance natural language understanding.

Data Processing Libraries: Utilized pandas and other Python libraries for data manipulation and analysis.
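
As a rough illustration of the Python ETL work mentioned above (moving data from a landing zone to a hub), here is a small extract-transform-load sketch. Everything in it is an assumption made for the example: the sample records, the `shipments` schema, and the use of an in-memory SQLite database as a stand-in for the hub. It uses only the standard library to stay dependency-free; the same cleaning steps could be written with pandas.

```python
import sqlite3
from typing import Dict, Iterable, List, Tuple

RawRow = Dict[str, str]


def extract() -> List[RawRow]:
    """Return hypothetical landing-zone records (normally read from files)."""
    return [
        {"shipment_id": "1001", "status": " Delivered "},
        {"shipment_id": "1002", "status": "in_transit"},
        {"shipment_id": "", "status": "delivered"},      # missing key: rejected
        {"shipment_id": "1001", "status": "delivered"},  # duplicate key: skipped
    ]


def transform(rows: Iterable[RawRow]) -> List[Tuple[int, str]]:
    """Drop rows without a numeric key, de-duplicate, and normalize status."""
    seen = set()
    clean: List[Tuple[int, str]] = []
    for row in rows:
        raw_id = row.get("shipment_id", "").strip()
        if not raw_id.isdigit():
            continue
        shipment_id = int(raw_id)
        if shipment_id in seen:
            continue  # keep the first occurrence of each key
        seen.add(shipment_id)
        clean.append((shipment_id, row.get("status", "").strip().lower()))
    return clean


def load(conn: sqlite3.Connection, rows: List[Tuple[int, str]]) -> int:
    """Write the rows to the hub table and return the resulting row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS shipments ("
        "shipment_id INTEGER PRIMARY KEY, status TEXT NOT NULL)"
    )
    with conn:  # commit on success, roll back if the insert fails
        conn.executemany("INSERT OR REPLACE INTO shipments VALUES (?, ?)", rows)
    return conn.execute("SELECT COUNT(*) FROM shipments").fetchone()[0]


conn = sqlite3.connect(":memory:")  # stand-in for the hub database
loaded = load(conn, transform(extract()))
print(f"Loaded {loaded} rows into the hub")
```

Keeping extract, transform, and load as separate functions makes each data quality rule easy to test in isolation before the pipeline is pointed at real storage.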

Acknowledgments

I would like to thank the Data Engineering team at OnTrac for their guidance and support during the development of this project.
