A human-in-the-loop AI assistant for user interview transcription and insight analysis.
User Research Helper is an AI-augmented tool that streamlines the process of analyzing user research interviews. It combines automated audio transcription (via AssemblyAI) with OpenAI-powered analysis to generate organized, question-based insights. By blending automated data processing with manual oversight, User Research Helper allows UX researchers and analysts to focus on deeper insights rather than tedious transcription and data wrangling.
- Save Time: Automated transcription and analysis reduce manual overhead.
- Stay Flexible: Human-in-the-loop oversight ensures you can refine results or override AI suggestions.
- Centralize Insights: Organized Excel sheets and Word reports keep all findings in one place.
- Scalable: Easily handle multiple interviews without losing structure or clarity.
- **Automated Audio Transcription**: Quickly convert interview recordings to text using the AssemblyAI API.
- **Question-Based Analysis**: Automatically map interview responses to a predefined set of questions for organized insights.
- **Segment-Based Insights**: Define user segments (e.g., demographics, behavior groups) and tag each interview's responses accordingly.
- **Excel Report Generation**: Generate structured Excel files that summarize findings per question, per interview.
- **Quote Extraction**: Automatically pinpoint and extract key quotes for easy reference in the analysis report.
- **Cross-Interview Insights**: Combine data from multiple interviews to uncover broader trends, patterns, and outliers.
- **Multi-Language Support**: Process interviews in various languages without sacrificing structure or clarity.
To get started with User Research Helper, you will need the following:
- **OpenAI API Key**: obtain one from the OpenAI platform
- **AssemblyAI API Key**: obtain one from the AssemblyAI dashboard
You have two main options to use this tool:
- Colab Option: Use Google Colab to run the tool without any local installations (Google account needed).
- Python Option: Run the tool locally on your machine.
If you prefer to use this tool on Google Colab, follow these steps:
1. **Requirements**: A Google account with access to Google Drive.
2. **Prepare Your Data**: Organize your data as explained in the Data Setup section and add it to your Google Drive.
3. **Open the Colab Notebook**: Access the Colab Notebook.
4. **Follow the Instructions**: Execute the notebook cells step by step, as per the provided instructions, to set up and run the tool.
5. **Share Your Findings**: Use the generated Word document located at `analysis/results_with_quotes.docx` on your Google Drive to share your insights.
If you are comfortable running Python on your local machine, follow these steps:
1. **Clone the Repository**:

   ```bash
   git clone https://github.com/nagoli/user-research-helper.git
   cd user-research-helper
   ```

2. **Install Dependencies**:

   ```bash
   pip install -e .
   ```

3. **Configure API Keys**: Copy the example environment file and set your API keys:

   ```bash
   cp .envExample .env
   ```

   Open the `.env` file in a text editor and add your `OPENAI_API_KEY` and `ASSEMBLYAI_API_KEY`:

   ```
   OPENAI_API_KEY=your-openai-api-key
   ASSEMBLYAI_API_KEY=your-assemblyai-api-key
   ```

4. **Prepare Your Data**: Organize your data as explained in the Data Setup section.

5. **Process Transcripts**:

   ```bash
   python process_transcripts.py your/project/folder
   ```

6. **Define Segments**: Open and edit `analysis/transcript_analysis_report.xlsx` to define segments for each interview.

7. **Process Analysis**:

   ```bash
   python process_analysis.py your/project/folder
   ```

8. **Share Your Findings**: Use the generated Word document located at `analysis/results_with_quotes.docx` to share your insights.
Here is an overview of how User Research Helper operates:
1. **Input**
   - Audio recordings of interviews
   - A set of predefined interview questions
2. **Process**
   - Transcribes each audio file automatically
   - Maps responses to corresponding interview questions
   - Provides an Excel report with initial analysis per question, per interview
3. **Human Review**
   - Lets you manually segment or categorize interviews in the Excel file
   - Offers the flexibility to refine automated analyses
4. **Output**
   - Generates a Word report that summarizes cross-interview findings by question
   - Integrates direct quotes from transcripts
   - Enables a high-level, human-vetted view of key insights
This Usage guide explains how to organize your data, run each processing step, and refine interview segments. It also describes the intermediate files that the tool generates to avoid unnecessary re-computation. If you ever need to re-run a specific analysis step, simply delete the relevant intermediate files before running the script again.
Create or select a project folder with the following structure:
```
your/project/folder/
│
├── audios/          # Place your interview audio files here
├── config.json      # Configuration file for the analysis
└── questions.txt    # List of interview questions
```
- You can copy or reference the `data_skeleton/` folder included in this repository.
- It contains a ready-to-use `config.json` that shows which settings you can define for your project, such as special instructions for the language model (LLM).
- Feel free to customize `config.json` for flags like `"do_transcribe_audio": true/false` or other advanced settings related to LLM instructions.

- `questions.txt`: One interview question per line.
- `config.json`: Stores project configuration (e.g., transcription toggle, advanced LLM parameters).
- `audios/`: Folder containing your audio files to be transcribed.

> **Tip**: The `demo/` folder offers a complete walk-through with sample audio files, questions, and configuration. Use it as a reference to get started quickly.
The demo includes a sample of 5 interviews exploring users’ habits while drinking coffee.
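As a concrete illustration in the demo's coffee theme, a minimal `questions.txt` and `config.json` might look like this. The question wording here is hypothetical, and only `do_transcribe_audio` is a flag documented above; see `data_skeleton/config.json` for the full set of supported settings.

`questions.txt`:

```
How often do you drink coffee?
Where do you usually drink coffee?
What do you like most about the experience?
```

`config.json`:

```json
{
  "do_transcribe_audio": true
}
```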
Once your data folder is prepared:
```bash
python process_transcripts.py your/project/folder
```

This script performs three main tasks and writes intermediate files to help avoid re-processing:
- Reads `questions.txt` and transcribes every audio file in `audios/`
- Saves raw transcripts in `transcripts/raw/`

In the demo folder (5 sample audios), you'll see 5 raw transcripts.
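The transcription step can be pictured roughly as follows. This is a sketch assuming the `assemblyai` SDK's `Transcriber` interface; the `raw_transcript_path` helper and exact file naming are illustrative, not the tool's actual internals:

```python
import os

def raw_transcript_path(audio_file: str, out_dir: str = "transcripts/raw") -> str:
    """Map an audio filename to its raw-transcript path (illustrative naming)."""
    base = os.path.splitext(os.path.basename(audio_file))[0]
    return os.path.join(out_dir, base + ".txt")

def transcribe_new_audios(project: str) -> None:
    """Transcribe every audio file that does not already have a raw transcript."""
    import assemblyai as aai  # imported lazily; requires ASSEMBLYAI_API_KEY to be set
    aai.settings.api_key = os.environ["ASSEMBLYAI_API_KEY"]
    transcriber = aai.Transcriber()
    audio_dir = os.path.join(project, "audios")
    for name in sorted(os.listdir(audio_dir)):
        out_path = os.path.join(project, raw_transcript_path(name))
        if os.path.exists(out_path):
            continue  # existing transcripts are kept, matching the tool's behavior
        transcript = transcriber.transcribe(os.path.join(audio_dir, name))
        os.makedirs(os.path.dirname(out_path), exist_ok=True)
        with open(out_path, "w", encoding="utf-8") as f:
            f.write(transcript.text)
```

The skip-if-exists check mirrors how the tool avoids re-transcribing interviews when you add a new audio file and rerun the script.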
- Organizes raw transcripts by your predefined questions
- Saves structured transcripts in `transcripts/structured/`

Example from demo: each transcript is now sectioned by question.
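Conceptually, the structuring step asks the LLM to group transcript passages under each question. Here is a hedged sketch using the OpenAI chat completions API; the prompt wording, model name, and helper functions are assumptions for illustration, not the tool's actual prompts:

```python
def build_prompt(questions: list[str], transcript: str) -> str:
    """Assemble a grouping prompt (illustrative wording)."""
    numbered = "\n".join(f"{i}. {q}" for i, q in enumerate(questions, start=1))
    return (
        "Group the interview transcript below under the question each passage answers.\n\n"
        f"Questions:\n{numbered}\n\n"
        f"Transcript:\n{transcript}"
    )

def structure_transcript(questions: list[str], transcript: str, model: str = "gpt-4o-mini") -> str:
    from openai import OpenAI  # imported lazily; reads OPENAI_API_KEY from the environment
    client = OpenAI()
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": build_prompt(questions, transcript)}],
    )
    return response.choices[0].message.content
```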
- Generates two Excel files in the `transcripts/` directory: `transcript_analysis_report.xlsx` and `transcript_analysis_report_quotes.xlsx`
- These files list interview responses and notable quotes per question, per interview
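The report layout can be reproduced in miniature with `openpyxl`. The layout below (one row per interview, an empty Segment column to be filled by hand, one column per question) is an assumption for illustration; the tool's real files may differ:

```python
from openpyxl import Workbook

def write_analysis_report(path: str, questions: list[str], answers: dict) -> None:
    """Write a report sheet; `answers` maps interview -> {question: text}.

    Illustrative layout only, not the tool's actual schema.
    """
    wb = Workbook()
    ws = wb.active
    ws.title = "Analysis"
    ws.append(["Interview", "Segment"] + questions)  # header row; Segment filled manually later
    for interview in sorted(answers):
        ws.append([interview, ""] + [answers[interview].get(q, "") for q in questions])
    wb.save(path)
```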
**Note on Intermediate Files**

- If you add a new interview (audio file) later, simply place it in `audios/` and rerun the script.
- Existing transcripts are not overwritten unless you manually remove them. This design helps you avoid re-transcribing interviews every time.
After the initial analysis, the generated Excel files in the transcripts/ folder will be automatically copied into the analysis/ folder. These are the files you will modify to add segments or adjust AI-generated insights.
- Open `transcript_analysis_report.xlsx` located in the `analysis/` folder
- Under the Segment column, assign one or two segments (e.g., "Beginner", "Expert") to each interview. Separate multiple segments with commas
- Update or refine any AI-generated text if needed
- Repeat the same segment assignments in `transcript_analysis_report_quotes.xlsx` to keep quotes aligned
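Because segments are comma-separated free text, any downstream processing has to split and trim them before grouping interviews. A tiny illustrative helper (not part of the tool's API):

```python
def parse_segments(cell: str) -> list[str]:
    """Split a Segment cell like 'Beginner, Expert' into clean labels."""
    if not cell:
        return []
    return [label.strip() for label in cell.split(",") if label.strip()]

def group_by_segment(rows: dict) -> dict:
    """rows: interview -> Segment cell text; returns segment -> list of interviews."""
    groups: dict = {}
    for interview, cell in rows.items():
        for segment in parse_segments(cell):
            groups.setdefault(segment, []).append(interview)
    return groups
```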
In the demo project, segments have been added to both files.
- The tool automatically generates fresh reports in the `transcripts/` folder if you re-run `process_transcripts.py`
- These new reports do not overwrite your `analysis/` folder files by default. This safeguard keeps your manual modifications safe
- If you want to incorporate newly generated data from `transcripts/` into `analysis/`, you'll need to manually overwrite the existing analysis files in `analysis/` and re-add segments or edits as needed
Next, refine your insights further:
```bash
python process_analysis.py your/project/folder
```

This script uses intermediate files to avoid repeating costly computations:
- Reads your edited `transcript_analysis_report.xlsx` from `analysis/`
- Produces detailed, segment-focused Excel sheets in `analysis/segments/`

The demo shows segment analyses with key observations per group.
- Examines the segment-based output to identify patterns across multiple interviews
- Saves consolidated data in `analysis/results_report.xlsx`
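The cross-interview pass can be thought of as pivoting per-interview answers into per-question views. A pure-Python sketch of that idea (the real step also involves LLM summarization, which is omitted here):

```python
from collections import defaultdict

def pivot_by_question(per_interview: dict) -> dict:
    """per_interview: interview -> {question: answer}; returns question -> {interview: answer}."""
    by_question: dict = defaultdict(dict)
    for interview, answers in per_interview.items():
        for question, answer in answers.items():
            by_question[question][interview] = answer
    return dict(by_question)
```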
- Combines quotes from `transcript_analysis_report_quotes.xlsx` with the aggregated data in `results_report.xlsx`
- Outputs a final `analysis_report.docx` in the `analysis/` folder
- Each stage above saves its results in distinct directories (e.g., `transcripts/raw`, `analysis/segments`).
- This design ensures you don’t need to re-run the entire workflow every time.

To re-run a specific step:

- Delete (or rename) the relevant intermediate files or folders. For instance, remove a specific raw transcript to re-transcribe an audio file.
- Then re-run the associated script (`process_transcripts.py` or `process_analysis.py`).
- The tool will regenerate only what’s missing, preventing unnecessary overhead.
- Python 3.9+
- Required API keys:
- AssemblyAI API key (for transcription)
- OpenAI API key (for analysis)
- **Clone the Repository**

  ```bash
  git clone https://github.com/nagoli/user-research-helper.git
  cd user-research-helper
  ```

- **Install Dependencies**

  ```bash
  pip install -e .
  ```

- **Set up Environment Variables**

  ```bash
  cp .envExample .env
  ```

  Open the `.env` file and add your keys:

  ```
  OPENAI_API_KEY=<your-openai-api-key>
  ASSEMBLYAI_API_KEY=<your-assemblyai-api-key>
  ```

- src/ - Main source code directory
- demo/ - Example project with sample files and configuration
- data_skeleton/ - Template directory structure for new projects
- assets/ - Documentation images and resources
- `pandas` - For data processing
- `outlines` - For structured text processing
- `openai` - For AI-powered analysis
- `assemblyai` - For audio transcription
- `python-dotenv` - For environment variable management
- `transformers` - For text processing
- `openpyxl` - For Excel report generation
- `python-docx` - For Word document handling
This project is licensed under the GNU General Public License v3.0 (GPL-3.0).
What this means:
- You can freely use, modify, and distribute this software.
- Any modifications or software including this code must also be released under the GPL-3.0.
- You must disclose the source code when distributing the software.
- Changes made to the code must be documented.
For more details, see the GNU GPL v3.0 license terms.
We welcome contributions from the community!
- Contributing: Feel free to fork the repository, make your changes, and submit a pull request. For major changes, open an issue to discuss them first.
- Acknowledgments: This project was partially funded by Access42, a major supporter of web accessibility in France. We sincerely thank them for their support and trust in the development of this software.