Climate change is driving rising anxiety, yet we lack clear insight into how it appears in everyday language and have few tools for early detection. By analyzing linguistic patterns with NLP/LLM methods, ClimateLens aims to identify climate anxiety early, reveal how it manifests among youth, and provide a reusable, scalable detection model with an interactive platform for applying and visualizing results. The goal is to enable timely support, strengthen resilience, and turn climate-related fears into constructive engagement.
The production app is deployed on HuggingFace Spaces using Streamlit. All visualizations and explanations are present in the app.
- Data Collection – tools for gathering and cleaning social media datasets.
- NLP Models – topic modeling and classification for detecting climate-related emotions.
- Visualization – interactive graphics and dashboards.
- WebApp – HuggingFace Space using Streamlit.
```
# Cohere
COHERE_API_KEY=your_cohere_key

# Directories
DATA_DIR=your_data_directory_here
CODE_DIR=your_code_directory_here
```
In addition, topic_modeling.py and emotion_classification.py both require manual entries in the .env file.
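For reference, here is a minimal stdlib-only sketch of how those .env entries could be read at startup. The actual scripts may instead rely on a package such as python-dotenv; `load_env` is a hypothetical helper shown purely for illustration.

```python
# Hypothetical helper: parse KEY=value lines from a .env file into
# os.environ, skipping blank lines and '#' comments. Stdlib-only sketch;
# the real scripts may use python-dotenv instead.
import os

def load_env(path=".env"):
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                # setdefault so real environment variables win over .env
                os.environ.setdefault(key.strip(), value.strip())
```

Scripts can then read `os.environ["COHERE_API_KEY"]`, `DATA_DIR`, and `CODE_DIR` as usual.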
```
ClimateLens/
├── azureml/ # Azure Machine Learning job + environment setup
│ ├── AML_job.py # Defines AML job configuration and execution
│ ├── environment.yml # Conda environment used for AML compute
│ ├── run_scripts.sh # Shell script for running AML jobs end-to-end
│ └── test_run_scripts.sh # Test script to validate AML job execution
│
├── data/ # Sample input datasets
│ ├── climate_twitter_sample.csv # Example climate-related Twitter posts
│ ├── filtered_anticonsumption_comments.csv # Cleaned Reddit/Twitter anti-consumption data
│ └── README.md # Notes describing sample data contents/format
│
├── src/
│ ├── LDA/ # Baseline LDA topic modeling implementation
│ │ └── ... # (LDA model scripts, topic extraction helpers, etc.)
│ ├── utils/ # Helper files used throughout the process (some are optional)
│ │ └── ...
│ ├── data_preprocessing.py # Cleans raw social media text, normalizes fields, removes noise
│ ├── dynamic_topic_modeling.py # Implements dynamic/temporal topic modeling (e.g., DTM/BERT-based)
│ ├── emotion_classification.py # Emotion classifier pipeline (e.g., emotion embeddings + model)
│ ├── emotion_visualizations.ipynb # Notebook for plotting emotion trends and visual insights
│ ├── reddit_data_filtering.py # Filtering + preprocessing logic specialized for Reddit datasets
│ ├── topic_modeling.py # Main topic modeling pipeline (BERTopic, LDA, clustering, etc.)
│ ├── twitter_data_cleaner.py # Specialized cleaning for Twitter text (URLs, mentions, tokens)
│ └── README.md # Explanation of source code structure & how to run modules
│
├── .gitignore
├── LICENSE
├── Makefile # Automation commands (e.g., setup, run, clean)
├── pyproject.toml # Build system + project metadata (modern Python packaging)
├── README.md # Main project documentation
├── requirements.txt # Python dependencies (runtime)
└── setup.cfg # Linting, formatting, and packaging configuration
```
ClimateLens supports cloud execution using Azure Machine Learning (AzureML).
All code and data should already live inside your AzureML Workspace; the jobs simply run the pipeline on a compute cluster without needing a web connection (AzureML compute instances are VMs, but running without a web connection requires submitting a job rather than working in a Jupyter notebook). Note that you must keep AML_job.py in the root directory, outside of the azureml folder, for everything to work as is.
- AzureML mounts your existing workspace code and data
- A job runs your scripts in sequence using run_scripts.sh
- No local uploads or .env access are required
- Logs stream back to your terminal
run_scripts.sh defines the order of your pipeline steps, and AML_job.py submits the job to AzureML.
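As an illustration, an AzureML v2 command-job spec equivalent to what AML_job.py submits programmatically might look roughly like the following. The compute cluster, environment name, and paths here are placeholders, not the project's actual values.

```yaml
# Hypothetical AzureML v2 command-job spec; names are placeholders.
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
display_name: climatelens-pipeline
command: bash azureml/run_scripts.sh
code: .
environment: azureml:climatelens-env@latest   # built from environment.yml
compute: azureml:cpu-cluster                  # your compute cluster name
```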
This is an organization-only project for now, but efforts are underway to make it fully open source.
This project is licensed under the MIT License. See the LICENSE file for details.