MindY is a concept and exploration project that examines how people search for and engage with mindfulness and relaxation content. It uses natural language processing and information retrieval techniques as a lens to explore personalization, language, and user intent, rather than to deliver a production-ready wellness tool.
This project integrates YouTube metadata retrieval and natural language processing (NLP) with TF-IDF vectorization and cosine similarity to rank and recommend relevant relaxation and mindfulness content.
MindY was developed as part of a learning journey focused on human-centered questions around content discovery. Its purpose is to explore how people search for and engage with mindfulness and relaxation content, rather than to deliver a production-ready tool.
The project focuses on experimentation, reflection, and design considerations, with attention to language, intent, and ethical use of technology in wellbeing contexts.
- Project Overview
- Key Features
- How It Works
- Project Structure
- Requirements
- Usage
- Why This System?
- Next Steps
MindY consists of two core functionalities:
-
π§ AI-Powered Chatbot for Personalized Techniques
- Uses GPT-4 to generate actionable wellness techniques based on user queries.
- Allows users to select wellness categories for more refined recommendations.
-
π₯ Intelligent YouTube Video Search
- Retrieves relevant wellness videos from YouTube API.
- Ranks videos by semantic similarity using TF-IDF & cosine similarity.
- Ensures users get high-quality, meaningful video recommendations.
- Uses GPT-4 to suggest specific techniques tailored to user queries.
- Example: A query like "How can I reduce stress?" may generate:
- Guided Breathing Exercise
- Mindfulness Meditation for Anxiety Relief
- Queries YouTube API and ranks videos by similarity to the user's query.
- Prevents generic recommendations by matching video descriptions to techniques.
- Users can select predefined wellness categories, such as:
- π§ Mindfulness & Meditation
- π¨ Breathing Exercises
- π€Έ Somatic Practices
-
π¬ User Input
- The app collects a query (e.g., "How do I sleep better?").
- If a category is selected, the response is focused on that domain.
-
π§ AI Response Generation
- GPT-4 generates a personalized list of techniques.
- Extracts keywords and themes from the response.
-
π YouTube Search & Ranking
- The system queries YouTube API for videos using refined keywords.
- Uses TF-IDF & cosine similarity to re-rank videos based on relevance.
-
π Results Display
- The app presents AI-generated techniques alongside top-ranked videos.
app.pyβ Main Streamlit app (chatbot & video recommendations)requirements/β Contains dependencies for different functionalities:mindy_app.txtβ Dependencies for Streamlit appgenai.txtβ Dependencies for query simulation & clusteringnlp.txtβ Dependencies for data analysis
simulate_queries/β Query simulations & GPT data extraction- Contains the following key files:
app_logic.py: Implements core logic for GPT-4 recommendations, YouTube video fetching, ranking, and query processing.simulate_queries.py: Automates the simulation of user queries usingapp_logic.pyto extract GPT-recommended techniques and descriptions. Outputs results tosimulation_results.csv.
- Contains the following key files:
data_analysis/β NLP analysis & clustering scripts:- Contains scripts for text preprocessing, clustering, and visualization of results.
- Includes
simulated_results.csv, which holds the output of simulated user queries for analysis.
MindyPresentation.pdfβ Project presentationREADME.mdβ Project documentation
-
API Keys:
- YouTube API Key (store your Key in a
.envfile asYOUTUBE_API_KEY). - OpenAI API Key (store your Key in the
.envfile asOPENAI_API_KEY).
- YouTube API Key (store your Key in a
-
Dependencies:
- Python libraries:
streamlit,openai,scikit-learn,google-api-python-client,python-dotenv.
- Python libraries:
-
Install Dependencies:
- To keep dependencies isolated and avoid conflicts, create separate virtual environments for each functional area.
- Create a virtual environment for the Streamlit app:
python3 -m venv mindy_app_env source mindy_app_env/bin/activate # On Windows, use mindy_app_env\Scripts\activate
- Install the dependencies:
pip install -r requirements/mindy_requirements.txt
- Create a virtual environment for query simulation and clustering:
python3 -m venv genai_env source genai_env/bin/activate # On Windows, use genai_env\Scripts\activate
- Install the dependencies:
pip install -r requirements/genai.txt
- Create a virtual environment for data analysis:
python3 -m venv nlp_env source nlp_env/bin/activate # On Windows, use nlp_env\Scripts\activate
- Install the dependencies:
pip install -r requirements/nlp.txt
- Activate the virtual environment for the MindY App:
source mindy_app.env/bin/activate # On Windows: mindy_app.env\Scripts\activate- Launch the chatbot and video recommender:
streamlit run app.py- Enter a wellness-related query (e.g., "How can I improve my focus?").
- Optionally, select a category for more refined suggestions.
- The app displays customized techniques & top-ranked YouTube videos.
This system improves recommendations by:
- π― Understanding query intent (semantic search, not just keywords).
- π οΈ Filtering & ranking YouTube videos for better quality.
- π‘ Providing AI-powered guidance, not just links.
- Leverage clustered techniques to provide more cohesive recommendations.
- Map user queries to cluster themes for more personalized outputs.
- Improve text preprocessing by refining techniques like removing stop words and applying lemmatization.
- Shift from a narrow focus on Mindfulness and Relaxation to a broader range of well-being areas.
- Expand prompts to include diverse domains like emotional management and cognitive behavioral therapy (CBT).
- Introduce new categories (e.g., journaling prompts) to generate more varied and meaningful recommendations.
- Simulate queries using a generalized system, testing a free model like DeepSeek.
- Repeat clustering processes and analyze how the new implementations influence results.