Voice entries by travisdock · Pull Request #239 · travisdock/sum

travisdock · 2025-12-02T12:55:38Z

No description provided.

Copilot

Pull request overview

This PR introduces voice entry functionality, allowing users to create financial entries by recording their voice. The feature uses Google Gemini AI via the RubyLLM gem to transcribe and extract structured transaction data from audio recordings.

Key Changes:

Adds voice recording capability with AI-powered transaction extraction using Google Gemini
Updates Ruby version from 3.4.4 to 3.4.7 across all configuration files
Implements rate limiting (5 requests/minute) and feature flag for voice entries

Reviewed changes

Copilot reviewed 14 out of 15 changed files in this pull request and generated 19 comments.

Show a summary per file

File	Description
db/migrate/20251202132403_add_voice_entries_enabled_to_users.rb	Adds `voice_entries_enabled` boolean flag to users table
db/schema.rb	Updates schema version to 8.1 and adds voice_entries_enabled column (plus alphabetical reordering)
config/routes.rb	Adds voice_entries resource endpoint (create only)
config/initializers/ruby_llm.rb	Configures RubyLLM with Gemini API key from Rails credentials
config/credentials.yml.enc	Updates encrypted credentials (likely adding Gemini API key)
app/controllers/voice_entries_controller.rb	New controller handling voice entry creation with validation and rate limiting
app/services/voice_entry_processor.rb	New service processing audio files via Gemini AI to extract transaction data
app/javascript/controllers/voice_recorder_controller.js	New Stimulus controller managing voice recording UI and form population
app/views/entries/new.html.haml	Conditionally displays voice recorder interface for enabled users
app/views/entries/_voice_recorder.html.haml	Voice recorder button partial
app/views/entries/_form.html.haml	Adds data targets to form fields for voice recorder integration
Gemfile	Updates Ruby version to 3.4.7 and adds ruby_llm gem
Gemfile.lock	Reflects ruby_llm dependencies and Ruby version update
Dockerfile	Updates base image to Ruby 3.4.7
Dockerfile.prod	Updates base image to Ruby 3.4.7


Copilot · 2025-12-02T13:47:57Z

app/services/voice_entry_processor.rb

+      - date: ONLY if a date is explicitly mentioned. Parse dates like "October 15th" as "2025-10-15", "yesterday", "last Tuesday", etc. If NO date is mentioned, set to null.
+      - notes: Any additional context or description
+
+      USER'S AVAILABLE CATEGORIES: #{categories_list}
+
+      Return ONLY a JSON object like:
+      {"amount": 45, "category_name": "Groceries", "date": "2025-10-15", "notes": "groceries at Trader Joe's"}


[nitpick] The hardcoded year "2025" in the example date format may cause confusion when this code runs in different years. Consider using a more generic example or dynamically generating the current year:

    current_year = Date.today.year
    <<~PROMPT
      ...
      - date: ONLY if a date is explicitly mentioned. Parse dates like "October 15th" as "#{current_year}-10-15", "yesterday", "last Tuesday", etc. If NO date is mentioned, set to null.
      ...
      {"amount": 45, "category_name": "Groceries", "date": "#{current_year}-10-15", "notes": "groceries at Trader Joe's"}
      ...
    PROMPT

Suggested change

-      - date: ONLY if a date is explicitly mentioned. Parse dates like "October 15th" as "2025-10-15", "yesterday", "last Tuesday", etc. If NO date is mentioned, set to null.
+      - date: ONLY if a date is explicitly mentioned. Parse dates like "October 15th" as "#{Date.today.year}-10-15", "yesterday", "last Tuesday", etc. If NO date is mentioned, set to null.
       - notes: Any additional context or description

       USER'S AVAILABLE CATEGORIES: #{categories_list}

       Return ONLY a JSON object like:
-      {"amount": 45, "category_name": "Groceries", "date": "2025-10-15", "notes": "groceries at Trader Joe's"}
+      {"amount": 45, "category_name": "Groceries", "date": "#{Date.today.year}-10-15", "notes": "groceries at Trader Joe's"}
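As a rough illustration of the suggestion, the date line of the prompt could be built from an injected date so the example year never goes stale. This is only a sketch: the helper name `date_instruction` and the extra "Today's date" line are assumptions, not part of the PR. The extra line is included because relative phrases such as "yesterday" can only be resolved if the model is told the current date.

```ruby
require "date"

# Hypothetical helper: builds the date-related lines of the prompt.
# `today` is passed in so the output is deterministic in tests.
def date_instruction(today: Date.today)
  # Example date in the same year as `today`, formatted as YYYY-MM-DD.
  example = Date.new(today.year, 10, 15).iso8601

  <<~PROMPT
    - date: ONLY if a date is explicitly mentioned. Parse dates like "October 15th" as "#{example}". If NO date is mentioned, set to null.
    Today's date is #{today.iso8601}.
  PROMPT
end

puts date_instruction(today: Date.new(2026, 1, 3))
```

In a Rails app, `Time.zone.today` may be a better default than `Date.today` when the server and the user are in different time zones.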

db/schema.rb

Copilot · 2025-12-02T13:47:57Z

config/initializers/ruby_llm.rb

+  # Then add:
+  #   google:
+  #     gemini_api_key: YOUR_API_KEY_HERE
+  config.gemini_api_key = Rails.application.credentials.dig(:google, :gemini_api_key)


The initializer will fail silently if the Gemini API key is not configured, which could lead to runtime errors when voice entries are used. Consider adding a warning or error if the API key is missing:

    api_key = Rails.application.credentials.dig(:google, :gemini_api_key)
    Rails.logger.warn "Gemini API key not configured" if api_key.blank?
    config.gemini_api_key = api_key

Suggested change

-  config.gemini_api_key = Rails.application.credentials.dig(:google, :gemini_api_key)
+  api_key = Rails.application.credentials.dig(:google, :gemini_api_key)
+  Rails.logger.warn "Gemini API key not configured" if api_key.blank?
+  config.gemini_api_key = api_key
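The check above can be exercised outside Rails with a small sketch. Here a nested Hash stands in for `Rails.application.credentials`, a standard-library `Logger` stands in for `Rails.logger`, and `fetch_gemini_key` is a hypothetical name; none of these come from the PR.

```ruby
require "logger"

# Hypothetical helper: looks up the key and warns when it is missing.
# `credentials` is a nested Hash standing in for Rails credentials.
def fetch_gemini_key(credentials, logger:)
  key = credentials.dig(:google, :gemini_api_key)

  # Plain-Ruby approximation of ActiveSupport's `blank?` for nil or a String.
  if key.nil? || key.strip.empty?
    logger.warn("Gemini API key not configured")
    return nil
  end

  key
end

logger = Logger.new($stdout)
fetch_gemini_key({ google: { gemini_api_key: "example-key" } }, logger: logger)
fetch_gemini_key({}, logger: logger) # logs a warning and returns nil
```

A warning keeps the app booting in environments that never use voice entries; raising instead would surface the misconfiguration earlier, at the cost of blocking startup.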

Copilot · 2025-12-02T13:47:58Z

app/controllers/voice_entries_controller.rb

+class VoiceEntriesController < ApplicationController
+  # Rate limiting: 5 requests per minute per user
+  before_action :check_voice_entries_access
+  before_action :check_rate_limit
+
+  def create
+    validate_audio!
+
+    processor = VoiceEntryProcessor.new(
+      user: current_user,
+      audio_file: params[:audio_file].tempfile
+    )
+
+    result = processor.process
+
+    if result[:success]
+      render json: result
+    else
+      render json: result, status: :unprocessable_entity
+    end
+  rescue => e
+    Rails.logger.error "Voice entry creation failed: #{e.message}"
+    render json: {
+      success: false,
+      error: e.message,
+      transcription: nil
+    }, status: :unprocessable_entity
+  end
+
+  private
+
+  def check_voice_entries_access
+    unless current_user.voice_entries_enabled?
+      render json: {
+        success: false,
+        error: 'Voice entries feature is not enabled for your account.',
+        transcription: nil
+      }, status: :forbidden
+    end
+  end
+
+  def validate_audio!
+    audio = params[:audio_file]
+
+    raise 'No audio file provided' unless audio.present?
+
+    # Check content type
+    unless audio.content_type&.start_with?('audio/')
+      raise 'Invalid file type. Must be audio file.'
+    end
+
+    # Check file size (max 5MB)
+    if audio.size > 5.megabytes
+      raise 'Audio file too large. Maximum size is 5MB.'
+    end
+  end
+
+  def check_rate_limit
+    cache_key = "voice_entries:#{current_user.id}:#{Time.current.to_i / 60}"
+    count = Rails.cache.read(cache_key) || 0
+
+    if count >= 5
+      render json: {
+        success: false,
+        error: 'Too many requests. Please wait a moment.',
+        transcription: nil
+      }, status: :too_many_requests
+      return
+    end
+
+    Rails.cache.write(cache_key, count + 1, expires_in: 1.minute)
+  end
+end


There's no test coverage for the new VoiceEntriesController. Given that the repository has comprehensive test coverage for other controllers (e.g., spec/controllers/entries_spec.rb), tests should be added to cover:

Authentication and authorization checks

Rate limiting behavior

Audio file validation

Success and error scenarios

Voice entries feature flag check
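To make the rate limiting behavior easier to reason about when writing those tests, the fixed-window logic of `check_rate_limit` can be isolated in plain Ruby. This is a sketch under assumptions: `FixedWindowLimiter` is a hypothetical class, and an in-memory Hash (with no expiry) stands in for `Rails.cache`.

```ruby
# Hypothetical, in-memory version of the controller's fixed-window counter.
class FixedWindowLimiter
  def initialize(limit:, window_seconds:, store: {})
    @limit = limit
    @window_seconds = window_seconds
    @store = store # stand-in for Rails.cache; entries never expire here
  end

  # Counts the request and returns true if the user is under the limit
  # for the window containing `now`; returns false otherwise.
  def allow?(user_id, now: Time.now)
    key = "voice_entries:#{user_id}:#{now.to_i / @window_seconds}"
    count = @store.fetch(key, 0)
    return false if count >= @limit

    @store[key] = count + 1
    true
  end
end

limiter = FixedWindowLimiter.new(limit: 5, window_seconds: 60)
window_start = Time.at(120)

p 6.times.map { limiter.allow?(42, now: window_start) } # five allowed, sixth rejected
p limiter.allow?(42, now: window_start + 60)            # a new window starts a new count
```

Two properties are worth covering in tests: a user can send up to twice the limit across a window boundary, and in the controller the separate read and write are not atomic, so concurrent requests could both pass. An atomic counter such as `Rails.cache.increment` is one possible way to tighten the second point.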

Copilot · 2025-12-02T13:47:58Z

app/services/voice_entry_processor.rb

+class VoiceEntryProcessor
+  attr_reader :user, :audio_file
+
+  def initialize(user:, audio_file:)
+    @user = user
+    @audio_file = audio_file
+  end
+
+  def process
+    # Create a chat instance with Gemini
+    chat = RubyLLM.chat(model: 'gemini-2.5-flash-lite')
+
+    # Ask the question with the audio file
+    response = chat.ask(build_system_prompt, with: audio_file.path)
+
+    # Extract text from RubyLLM::Message object
+    response_text = if response.respond_to?(:content)
+      response.content
+    elsif response.respond_to?(:text)
+      response.text
+    else
+      response.to_s
+    end
+
+    # Extract JSON from response (in case there's extra text)
+    json_match = response_text.match(/\{.*\}/m)
+    raise 'No JSON found in response' unless json_match
+
+    data = JSON.parse(json_match[0])
+
+    {
+      success: true,
+      data: {
+        amount: data['amount'],
+        category_name: data['category_name'],
+        date: data['date'] || Date.today.to_s,
+        notes: data['notes']
+      },
+      transcription: response_text
+    }
+  rescue => e
+    Rails.logger.error "Voice entry processing failed: #{e.message}"
+    Rails.logger.error e.backtrace.join("\n")
+
+    {
+      success: false,
+      error: "Failed to process voice recording: #{e.message}",
+      transcription: nil
+    }
+  end
+
+  private
+
+  def build_system_prompt
+    categories_list = user.categories.pluck(:name).join(', ')
+
+    <<~PROMPT
+      You are a financial entry assistant. Extract transaction information from speech and return as JSON.
+
+      Extract:
+      - amount: Dollar amount as a number (e.g., 45 or 45.50)
+      - category_name: Category that best matches the description from the available categories
+      - date: ONLY if a date is explicitly mentioned. Parse dates like "October 15th" as "2025-10-15", "yesterday", "last Tuesday", etc. If NO date is mentioned, set to null.
+      - notes: Any additional context or description
+
+      USER'S AVAILABLE CATEGORIES: #{categories_list}
+
+      Return ONLY a JSON object like:
+      {"amount": 45, "category_name": "Groceries", "date": "2025-10-15", "notes": "groceries at Trader Joe's"}
+
+      IMPORTANT: Only include a date if the user explicitly mentions one. If no date is mentioned, use null for the date field.
+
+      If you cannot extract amount or category, set them to null.
+    PROMPT
+  end
+end


There's no test coverage for the VoiceEntryProcessor service. Given that the repository has comprehensive test coverage for controllers, tests should be added to cover:

Successful audio processing and JSON extraction

Error handling when JSON is not found in the response

Error handling for API failures

Date parsing logic (explicit dates vs. null)

Category matching from user's available categories
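The JSON extraction and date fallback can be tested without calling the API if they are pulled into a pure function. The sketch below mirrors the logic in `process`; the name `extract_entry` and the injected `today` parameter are assumptions made for testability, not code from the PR.

```ruby
require "json"
require "date"

# Hypothetical pure-function version of the extraction step in
# VoiceEntryProcessor#process. Raises if no JSON object is found.
def extract_entry(response_text, today: Date.today)
  # Greedy match from the first "{" to the last "}" across newlines.
  json_match = response_text.match(/\{.*\}/m)
  raise "No JSON found in response" unless json_match

  data = JSON.parse(json_match[0])

  {
    amount: data["amount"],
    category_name: data["category_name"],
    date: data["date"] || today.to_s, # null date falls back to today
    notes: data["notes"]
  }
end

reply = <<~TEXT
  Here is the entry:
  {"amount": 45, "category_name": "Groceries", "date": null, "notes": "groceries at Trader Joe's"}
TEXT

p extract_entry(reply, today: Date.new(2025, 12, 2))
```

Because the regular expression is greedy, a reply containing two separate JSON objects would be captured as one span and fail to parse, which may be a case worth adding to the error-handling tests.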

Copilot · 2025-12-02T13:48:01Z

app/views/entries/new.html.haml

+    %div{ style: "margin-top: 1.5rem; padding-top: 1.5rem; border-top: 1px solid var(--pico-muted-border-color, #e0e0e0);" }
+      %p{ style: "text-align: center; color: var(--pico-muted-color, #666); font-size: 0.9rem;" } Or enter manually:


[nitpick] Inline styles in the view reduce maintainability and make it harder to enforce consistent styling. Consider moving these styles to a CSS class:

    %div.voice-recorder-divider
      %p.voice-recorder-divider-text Or enter manually:

And define in CSS:

    .voice-recorder-divider {
      margin-top: 1.5rem;
      padding-top: 1.5rem;
      border-top: 1px solid var(--pico-muted-border-color, #e0e0e0);
    }

    .voice-recorder-divider-text {
      text-align: center;
      color: var(--pico-muted-color, #666);
      font-size: 0.9rem;
    }

Suggested change

-    %div{ style: "margin-top: 1.5rem; padding-top: 1.5rem; border-top: 1px solid var(--pico-muted-border-color, #e0e0e0);" }
-      %p{ style: "text-align: center; color: var(--pico-muted-color, #666); font-size: 0.9rem;" } Or enter manually:
+    %div.voice-recorder-divider
+      %p.voice-recorder-divider-text Or enter manually:

Copilot · 2025-12-02T13:48:01Z

app/controllers/voice_entries_controller.rb

+    processor = VoiceEntryProcessor.new(
+      user: current_user,
+      audio_file: params[:audio_file].tempfile


The tempfile from the uploaded audio could be deleted before the VoiceEntryProcessor completes if the request processing is slow or if garbage collection occurs. Consider reading the file content into memory first or ensuring the tempfile persists for the duration of processing:

    processor = VoiceEntryProcessor.new(
      user: current_user,
      audio_file: params[:audio_file].tempfile.tap(&:rewind)
    )

Alternatively, handle the file content directly rather than passing the path.

Suggested change

-    processor = VoiceEntryProcessor.new(
-      user: current_user,
-      audio_file: params[:audio_file].tempfile
+    # Read the uploaded audio file into memory to avoid tempfile deletion issues
+    require 'stringio'
+    audio_io = StringIO.new(params[:audio_file].read)
+    audio_io.set_encoding(params[:audio_file].tempfile.external_encoding) if params[:audio_file].tempfile.respond_to?(:external_encoding)
+    processor = VoiceEntryProcessor.new(
+      user: current_user,
+      audio_file: audio_io
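One caveat with the in-memory approach is that `VoiceEntryProcessor#process` calls `audio_file.path`, and a `StringIO` has no `#path`. The sketch below contrasts the two options using only the standard library; `copy_to_owned_tempfile` is a hypothetical helper, and the `.webm` suffix is an assumed upload format.

```ruby
require "tempfile"
require "stringio"

# Hypothetical helper: copies any readable IO into a Tempfile that the
# caller owns, so a path-based API keeps working for as long as needed.
def copy_to_owned_tempfile(io, suffix: ".webm")
  copy = Tempfile.new(["voice_entry", suffix])
  copy.binmode
  io.rewind
  IO.copy_stream(io, copy)
  copy.flush # make sure the bytes are on disk before the path is used
  copy
end

# Stand-in for the uploaded file.
upload = Tempfile.new(["recording", ".webm"])
upload.binmode
upload.write("fake audio bytes")

# Option A: in-memory copy. Safe from tempfile cleanup, but has no path.
upload.rewind
in_memory = StringIO.new(upload.read)
puts in_memory.respond_to?(:path)

# Option B: owned copy on disk, usable wherever a file path is expected.
owned = copy_to_owned_tempfile(upload)
puts File.exist?(owned.path)

upload.close!
owned.close!
```

If the in-memory option were adopted, the processor would also need to accept an IO rather than a path, so the two changes would have to land together.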

Copilot · 2025-12-02T13:48:02Z

app/services/voice_entry_processor.rb

@@ -0,0 +1,76 @@
+class VoiceEntryProcessor


[nitpick] The VoiceEntryProcessor class lacks documentation. Add a class-level comment explaining its purpose, parameters, and return value structure:

    # Processes audio files containing voice descriptions of financial entries
    # and extracts structured transaction data using Google Gemini AI.
    #
    # @example
    #   processor = VoiceEntryProcessor.new(user: current_user, audio_file: tempfile)
    #   result = processor.process
    #   # => { success: true, data: { amount: 45.0, category_name: "Groceries", ... }, transcription: "..." }
    #
    class VoiceEntryProcessor

Copilot · 2025-12-02T13:48:02Z

db/migrate/20251202132403_add_voice_entries_enabled_to_users.rb

@@ -0,0 +1,5 @@
+class AddVoiceEntriesEnabledToUsers < ActiveRecord::Migration[8.1]


The migration class uses ActiveRecord::Migration[8.1], but the Rails version in the Gemfile is ~> 8.0. Ensure this migration version is compatible with Rails 8.0, or update to ActiveRecord::Migration[8.0] to match the Rails version being used.

Suggested change

-class AddVoiceEntriesEnabledToUsers < ActiveRecord::Migration[8.1]
+class AddVoiceEntriesEnabledToUsers < ActiveRecord::Migration[8.0]

Copilot · 2025-12-02T13:48:02Z

app/javascript/controllers/voice_recorder_controller.js

+
+  populateForm(data) {
+    // Populate amount field
+    if (data.amount && this.hasAmountFieldTarget) {


The populateForm method relies on a truthy check, if (data.amount && ...), which does not distinguish between null, zero, and missing values. When data.amount is null (as documented in the prompt at line 73 of voice_entry_processor.rb), skipping the field is the intended behavior, but the same check also skips a legitimate amount of 0. A better approach is to handle these cases explicitly. Consider:

    if (data.amount !== null && data.amount !== undefined && this.hasAmountFieldTarget) {
      this.amountFieldTarget.value = data.amount
    }

This ensures zero values are properly populated while null values are not.

Suggested change

-    if (data.amount && this.hasAmountFieldTarget) {
+    if (data.amount !== null && data.amount !== undefined && this.hasAmountFieldTarget) {

travisdock self-assigned this Dec 2, 2025

travisdock requested a review from Copilot December 2, 2025 13:39

Copilot started reviewing on behalf of travisdock December 2, 2025 13:39

Copilot finished reviewing on behalf of travisdock December 2, 2025 13:43

Copilot AI reviewed Dec 2, 2025


travisdock added 4 commits December 2, 2025 06:58

Voice entries

13e9c81

Improve routes, add feature flag to users, remove frontend messaging

1512d6e

Use newer model

da86196

Dynamic current year

abd9fd2

travisdock force-pushed the voice branch from 2c0832d to abd9fd2 December 2, 2025 13:59

Update audio prompt

99b5ea5

travisdock force-pushed the voice branch from bdccdbe to 99b5ea5 December 3, 2025 12:37

travisdock merged commit 14e91c4 into main Dec 3, 2025

travisdock deleted the voice branch December 3, 2025 12:37

travisdock merged 5 commits into main from voice
