Feature: LLM-based intelligent forum post selection for NotebookLM export #4

@DarrenZal

Description

Problem

Currently, the NotebookLM export includes only the latest post from each forum thread. This works well for some use cases (e.g., weekly meeting updates where each post is independent), but fails to capture important context when posts are part of an ongoing conversation.

For example:

  • Works well: Weekly meeting thread where each post is a standalone update
  • Missing context: Question/answer threads where the latest post references previous posts

Proposed Solution

Add an LLM-based intelligent post selection step before including forum content in the NotebookLM export.

Workflow

  1. Fetch last X posts from the thread (e.g., last 5-10 posts)
  2. Pass to LLM with a prompt like:
    Given these forum posts from a thread, determine which posts should be included 
    in a weekly digest to provide complete context for understanding the latest post.
    
    Consider:
    - Is the latest post responding to a previous post?
    - Are there unresolved questions or ongoing discussions?
    - Which posts provide essential context?
    
    Return: List of post IDs to include
    
  3. Include selected posts in the NotebookLM export with proper ordering (a sketch of this workflow follows below)
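
A minimal sketch of steps 1-3, assuming a generic call_llm callable and a JSON response format; the helper name select_relevant_posts, the post dict shape, and the response parsing are assumptions, not existing code:

    import json

    SELECTION_PROMPT = """Given these forum posts from a thread, determine which posts
    should be included in a weekly digest to provide complete context for
    understanding the latest post.

    Consider:
    - Is the latest post responding to a previous post?
    - Are there unresolved questions or ongoing discussions?
    - Which posts provide essential context?

    Return a JSON list of post IDs to include, e.g. ["123", "456"].

    Posts:
    {posts}
    """

    def select_relevant_posts(posts, call_llm, max_posts=10):
        """Ask the LLM which of the last `max_posts` posts to keep."""
        recent = posts[-max_posts:]
        formatted = "\n\n".join(
            f"[{p['id']}] {p['content']['text']}" for p in recent
        )
        response = call_llm(SELECTION_PROMPT.format(posts=formatted))
        try:
            selected_ids = set(json.loads(response))
        except (json.JSONDecodeError, TypeError):
            # Unusable LLM output: fall back to the last 3 posts.
            return recent[-3:]
        # Preserve the thread's original ordering.
        return [p for p in recent if p["id"] in selected_ids]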

Implementation Location

  • File: src/content/weekly_curator_llm.py
  • Method: export_notebooklm_enhanced (around lines 1780-1827)
  • Current behavior: Includes all posts from thread_groups[url]
  • Proposed behavior: Filter posts with the LLM before including them (see the sketch below)
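
A rough sketch of how the filter could be wired in; thread_groups[url] and content.text come from this issue, while build_forum_sections, the sections list, and the call_llm argument are hypothetical stand-ins for whatever export_notebooklm_enhanced actually uses:

    def build_forum_sections(thread_groups, call_llm):
        """Collect forum text for the export, keeping only LLM-selected posts."""
        sections = []
        for url, posts in thread_groups.items():
            # Previously: include every post in thread_groups[url].
            # Proposed: keep only the subset chosen by the LLM
            # (select_relevant_posts from the workflow sketch above).
            selected = select_relevant_posts(posts, call_llm)
            for post in selected:
                sections.append(post["content"]["text"])
        return sections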

Example Scenarios

Scenario 1: Standalone weekly update

  • Input: Last 5 posts from weekly meeting thread
  • LLM decision: Include only the latest post (independent update)
  • Result: Clean digest with just the current week's notes

Scenario 2: Ongoing discussion

  • Input: Last 5 posts with questions and answers
  • LLM decision: Include last 3 posts (original question + 2 follow-up answers)
  • Result: Complete context for understanding the discussion

Scenario 3: Announcement with clarifications

  • Input: Original announcement + 4 clarification posts
  • LLM decision: Include original announcement + last 2 clarifications
  • Result: Announcement with essential context

Benefits

  • Better context: Users get complete understanding of forum discussions
  • Reduced noise: Skip irrelevant historical posts
  • Flexible: Adapts to different thread types automatically
  • Smart: Uses LLM reasoning instead of rigid rules

Technical Considerations

  • Cost: Additional LLM API call per forum thread (minimal for weekly digest)
  • Performance: Parallel processing of threads to minimize latency
  • Fallback: If the LLM call fails, include the last N posts (e.g., 3) by default
  • Caching: Consider caching LLM decisions for recently analyzed threads (both ideas are sketched below)
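
A sketch of the fallback and caching ideas, assuming an in-memory cache keyed by thread URL plus latest post ID and reusing the hypothetical select_relevant_posts helper from the workflow sketch:

    import logging

    _selection_cache = {}

    def select_with_fallback(url, posts, call_llm, fallback_n=3):
        """LLM selection with graceful degradation and per-thread caching."""
        cache_key = (url, posts[-1]["id"] if posts else None)
        if cache_key in _selection_cache:
            return _selection_cache[cache_key]
        try:
            selected = select_relevant_posts(posts, call_llm)
        except Exception as exc:
            # Any LLM failure degrades to the last N posts.
            logging.warning("LLM post selection failed for %s: %s", url, exc)
            selected = posts[-fallback_n:]
        _selection_cache[cache_key] = selected
        return selected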

Related Code

  • Forum content extraction: weekly_curator_llm.py:1780-1827
  • Thread groups structure: The digest contains thread_groups[url] = [list of post items]
  • Content format: Each post has a content.text field (illustrated below)
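
For illustration only, the shape implied by the description above; the URL, post IDs, and text are made up:

    thread_groups = {
        "https://forum.example.org/t/weekly-meeting/42": [
            {"id": "101", "content": {"text": "Original question about ..."}},
            {"id": "102", "content": {"text": "Follow-up answer ..."}},
            {"id": "103", "content": {"text": "Latest post referencing the question ..."}},
        ],
    }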

Acceptance Criteria

  • LLM analyzes the last 5-10 posts from each forum thread
  • LLM selects relevant posts based on context and dependencies
  • Selected posts are included in the NotebookLM export with proper ordering
  • Fallback behavior applies if LLM analysis fails
  • Logging shows which posts were selected and why (example below)
  • Performance impact is minimal (< 5 seconds per digest generation)
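
One possible shape for the selection log entry; the reason argument assumes the prompt is extended to also return a short rationale:

    import logging

    def log_selection(url, posts, selected, reason):
        """Record which posts were kept for a thread and why."""
        logging.info(
            "Thread %s: kept %s of %d posts (reason: %s)",
            url,
            [p["id"] for p in selected],
            len(posts),
            reason,
        )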
