Skip to content

BayAreaMetro/mtc_pba50_engagement_analysis

Repository files navigation

PBA2050+ Public Engagement Analysis Framework

Last Updated: 08/02/2024

Purpose

Plan Bay Area (PBA) is the region’s visionary long-range plan, including 35 strategies spread across the elements of transportation, housing, the economy and the environment that collectively seek to make the Bay Area more equitable for all residents and more resilient in the face of unexpected challenges. Over the next two and a half years, the plan will be updated in consultation with a wide range of partners, including federal, state, regional, county, local and Tribal governments, as well as community organizations, other stakeholders and the public. This project aims to facilitate a more informed and responsive approach to public feedback by developing and implementing a framework using embeddings and Large Language Model (LLM) prompt engineering techniques.

The 5 main components of this pipeline:

  • Component 1. Data Ingestion: Collecting and formatting public engagement comment data for analysis.
  • Component 2. Topic/Subtopic Analysis: Utilizing LLMs in the classification of public comments into main and subtopics.
  • Component 3. Theme Analysis using Embeddings: Extracting embeddings from comments to cluster into groups with alike meaning, accompanied by automatically generated topic name recommendations.
  • Component 4. Final Topic/Subtopic/Theme Assignment: A manual process of reviewing and finalizing the categories of topics given the sugesstions the earlier methods.
  • Component 5. Present Results: Compiling and presenting the analysis methodologies as user-friendly tools for future public engagement initiatives.

Contents (process references below):

  • .vscode: Configuration settings for Visual Studio Code.
  • Analysis Modules:
    • Data Preparation(Component 1): Processes leverages OpenAI to generate synthetic comments based on NextGen comments.
    • Theme Analysis(Component 3): Processes leveraging embeddings in the grouping and naming of public comments.
    • OpenAI Topic Tagging(Component 2): Leveraging OpenAI LLM's to automatically tag public comments.
    • Prompt Setup(Components 2-4): Materials to assist in creating prompts for new uses of these tools.
  • Demonstrations: Quick coding examples.
  • configs(Components 2-4): Process configuration files (e.g. lists of tags).
  • utilities(All Components): Utility functions supporting the processes.
  • .env-example: Example definition of the OpenAI API key (filename must be changed to .env in practice).
  • PBA50 Comment Processing Pipeline.ipynb: Notebook containing example functionality for all processes included in this repo.
  • environment.yml: Conda environment configuration.

Getting Started

After cloning the repository create and activate your virtual environment by running the following (more comprehensive directions in the project design document.

$ conda env create -f environment.yml

then

$ conda activate comment-analysis-env

Then configure a .env file like .env-example containing your OpenAI API key and you should be all set to run the processes!

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published