Voice entries #239

Merged
travisdock merged 5 commits into main from voice
Dec 3, 2025

Conversation

@travisdock
Owner

No description provided.

Copilot AI left a comment

Pull request overview

This PR introduces voice entry functionality, allowing users to create financial entries by recording their voice. The feature uses Google Gemini AI via the RubyLLM gem to transcribe and extract structured transaction data from audio recordings.

Key Changes:

  • Adds voice recording capability with AI-powered transaction extraction using Google Gemini
  • Updates Ruby version from 3.4.4 to 3.4.7 across all configuration files
  • Implements rate limiting (5 requests/minute) and feature flag for voice entries

Reviewed changes

Copilot reviewed 14 out of 15 changed files in this pull request and generated 19 comments.

| File | Description |
| --- | --- |
| db/migrate/20251202132403_add_voice_entries_enabled_to_users.rb | Adds voice_entries_enabled boolean flag to users table |
| db/schema.rb | Updates schema version to 8.1 and adds voice_entries_enabled column (plus alphabetical reordering) |
| config/routes.rb | Adds voice_entries resource endpoint (create only) |
| config/initializers/ruby_llm.rb | Configures RubyLLM with Gemini API key from Rails credentials |
| config/credentials.yml.enc | Updates encrypted credentials (likely adding Gemini API key) |
| app/controllers/voice_entries_controller.rb | New controller handling voice entry creation with validation and rate limiting |
| app/services/voice_entry_processor.rb | New service processing audio files via Gemini AI to extract transaction data |
| app/javascript/controllers/voice_recorder_controller.js | New Stimulus controller managing voice recording UI and form population |
| app/views/entries/new.html.haml | Conditionally displays voice recorder interface for enabled users |
| app/views/entries/_voice_recorder.html.haml | Voice recorder button partial |
| app/views/entries/_form.html.haml | Adds data targets to form fields for voice recorder integration |
| Gemfile | Updates Ruby version to 3.4.7 and adds ruby_llm gem |
| Gemfile.lock | Reflects ruby_llm dependencies and Ruby version update |
| Dockerfile | Updates base image to Ruby 3.4.7 |
| Dockerfile.prod | Updates base image to Ruby 3.4.7 |

Comment on lines 63 to 69
- date: ONLY if a date is explicitly mentioned. Parse dates like "October 15th" as "2025-10-15", "yesterday", "last Tuesday", etc. If NO date is mentioned, set to null.
- notes: Any additional context or description

USER'S AVAILABLE CATEGORIES: #{categories_list}

Return ONLY a JSON object like:
{"amount": 45, "category_name": "Groceries", "date": "2025-10-15", "notes": "groceries at Trader Joe's"}

Copilot AI Dec 2, 2025

[nitpick] The hardcoded year "2025" in the example date format may cause confusion when this code runs in different years. Consider using a more generic example or dynamically generating the current year:

current_year = Date.today.year
<<~PROMPT
  ...
  - date: ONLY if a date is explicitly mentioned. Parse dates like "October 15th" as "#{current_year}-10-15", "yesterday", "last Tuesday", etc. If NO date is mentioned, set to null.
  ...
  {"amount": 45, "category_name": "Groceries", "date": "#{current_year}-10-15", "notes": "groceries at Trader Joe's"}
  ...
PROMPT
Suggested change
- - date: ONLY if a date is explicitly mentioned. Parse dates like "October 15th" as "2025-10-15", "yesterday", "last Tuesday", etc. If NO date is mentioned, set to null.
- - notes: Any additional context or description
- USER'S AVAILABLE CATEGORIES: #{categories_list}
- Return ONLY a JSON object like:
- {"amount": 45, "category_name": "Groceries", "date": "2025-10-15", "notes": "groceries at Trader Joe's"}
+ - date: ONLY if a date is explicitly mentioned. Parse dates like "October 15th" as "#{Date.today.year}-10-15", "yesterday", "last Tuesday", etc. If NO date is mentioned, set to null.
+ - notes: Any additional context or description
+ USER'S AVAILABLE CATEGORIES: #{categories_list}
+ Return ONLY a JSON object like:
+ {"amount": 45, "category_name": "Groceries", "date": "#{Date.today.year}-10-15", "notes": "groceries at Trader Joe's"}

Copilot uses AI. Check for mistakes.
# Then add:
# google:
# gemini_api_key: YOUR_API_KEY_HERE
config.gemini_api_key = Rails.application.credentials.dig(:google, :gemini_api_key)

Copilot AI Dec 2, 2025

The initializer will fail silently if the Gemini API key is not configured, which could lead to runtime errors when voice entries are used. Consider adding a warning or error if the API key is missing:

api_key = Rails.application.credentials.dig(:google, :gemini_api_key)
Rails.logger.warn "Gemini API key not configured" if api_key.blank?
config.gemini_api_key = api_key
Suggested change
- config.gemini_api_key = Rails.application.credentials.dig(:google, :gemini_api_key)
+ api_key = Rails.application.credentials.dig(:google, :gemini_api_key)
+ Rails.logger.warn "Gemini API key not configured" if api_key.blank?
+ config.gemini_api_key = api_key
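
A warning is one option; raising in environments where the feature must work is another. The sketch below is a hedged illustration of both, written as plain Ruby so it can be exercised without Rails. The helper name `gemini_api_key_from` and its `strict:` flag are hypothetical and do not appear in the PR; the only assumption about `credentials` is that it responds to `dig`, and the only assumption about `logger` is that it responds to `warn`.

```ruby
# Hypothetical helper (not part of the PR): looks up the nested key and
# either warns or raises when it is missing or blank.
def gemini_api_key_from(credentials, logger:, strict: false)
  key = credentials.dig(:google, :gemini_api_key)
  return key unless key.nil? || key.strip.empty?

  message = 'Gemini API key not configured'
  # In strict mode, fail at boot instead of at the first voice entry.
  raise KeyError, message if strict

  logger.warn(message)
  nil
end
```

In an initializer this could be called as `gemini_api_key_from(Rails.application.credentials, logger: Rails.logger)`, with `strict:` chosen per environment.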

Comment on lines +1 to +73
class VoiceEntriesController < ApplicationController
  # Rate limiting: 5 requests per minute per user
  before_action :check_voice_entries_access
  before_action :check_rate_limit

  def create
    validate_audio!

    processor = VoiceEntryProcessor.new(
      user: current_user,
      audio_file: params[:audio_file].tempfile
    )

    result = processor.process

    if result[:success]
      render json: result
    else
      render json: result, status: :unprocessable_entity
    end
  rescue => e
    Rails.logger.error "Voice entry creation failed: #{e.message}"
    render json: {
      success: false,
      error: e.message,
      transcription: nil
    }, status: :unprocessable_entity
  end

  private

  def check_voice_entries_access
    unless current_user.voice_entries_enabled?
      render json: {
        success: false,
        error: 'Voice entries feature is not enabled for your account.',
        transcription: nil
      }, status: :forbidden
    end
  end

  def validate_audio!
    audio = params[:audio_file]

    raise 'No audio file provided' unless audio.present?

    # Check content type
    unless audio.content_type&.start_with?('audio/')
      raise 'Invalid file type. Must be audio file.'
    end

    # Check file size (max 5MB)
    if audio.size > 5.megabytes
      raise 'Audio file too large. Maximum size is 5MB.'
    end
  end

  def check_rate_limit
    cache_key = "voice_entries:#{current_user.id}:#{Time.current.to_i / 60}"
    count = Rails.cache.read(cache_key) || 0

    if count >= 5
      render json: {
        success: false,
        error: 'Too many requests. Please wait a moment.',
        transcription: nil
      }, status: :too_many_requests
      return
    end

    Rails.cache.write(cache_key, count + 1, expires_in: 1.minute)
  end
end

Copilot AI Dec 2, 2025

There's no test coverage for the new VoiceEntriesController. Given that the repository has comprehensive test coverage for other controllers (e.g., spec/controllers/entries_spec.rb), tests should be added to cover:

  • Authentication and authorization checks
  • Rate limiting behavior
  • Audio file validation
  • Success and error scenarios
  • Voice entries feature flag check
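
The rate-limiting case is easier to test when the window logic can be exercised without Rails.cache or a real clock. The following is a minimal sketch under stated assumptions, not the PR's implementation: `FixedWindowLimiter` is a hypothetical class that mirrors the controller's per-user, per-minute window key (`Time.to_i / 60`) and its limit of 5, keeps counts in memory, and takes the current time as a parameter. Unlike the controller's separate cache read and write, it performs the check and the increment under one lock.

```ruby
# Hypothetical in-memory fixed-window limiter (illustration only).
# Counts are never expired here; a real store would need eviction.
class FixedWindowLimiter
  def initialize(limit:, window_seconds:)
    @limit = limit
    @window_seconds = window_seconds
    @counts = Hash.new(0)
    @mutex = Mutex.new
  end

  # Returns true if the request is allowed, false once the limit has been
  # reached for the window that contains `now`.
  def allow?(user_id, now: Time.now)
    key = [user_id, now.to_i / @window_seconds]
    @mutex.synchronize do
      return false if @counts[key] >= @limit

      @counts[key] += 1
      true
    end
  end
end

limiter = FixedWindowLimiter.new(limit: 5, window_seconds: 60)
window_start = Time.at(1_700_000_040) # falls exactly on a 60-second boundary
results = Array.new(6) { limiter.allow?(1, now: window_start) }
```

Here `results` holds five allowed requests followed by one rejection; a request from another user, or from the same user 60 seconds later, starts a fresh count.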

Comment on lines 1 to 76
class VoiceEntryProcessor
  attr_reader :user, :audio_file

  def initialize(user:, audio_file:)
    @user = user
    @audio_file = audio_file
  end

  def process
    # Create a chat instance with Gemini
    chat = RubyLLM.chat(model: 'gemini-2.5-flash-lite')

    # Ask the question with the audio file
    response = chat.ask(build_system_prompt, with: audio_file.path)

    # Extract text from RubyLLM::Message object
    response_text = if response.respond_to?(:content)
      response.content
    elsif response.respond_to?(:text)
      response.text
    else
      response.to_s
    end

    # Extract JSON from response (in case there's extra text)
    json_match = response_text.match(/\{.*\}/m)
    raise 'No JSON found in response' unless json_match

    data = JSON.parse(json_match[0])

    {
      success: true,
      data: {
        amount: data['amount'],
        category_name: data['category_name'],
        date: data['date'] || Date.today.to_s,
        notes: data['notes']
      },
      transcription: response_text
    }
  rescue => e
    Rails.logger.error "Voice entry processing failed: #{e.message}"
    Rails.logger.error e.backtrace.join("\n")

    {
      success: false,
      error: "Failed to process voice recording: #{e.message}",
      transcription: nil
    }
  end

  private

  def build_system_prompt
    categories_list = user.categories.pluck(:name).join(', ')

    <<~PROMPT
      You are a financial entry assistant. Extract transaction information from speech and return as JSON.

      Extract:
      - amount: Dollar amount as a number (e.g., 45 or 45.50)
      - category_name: Category that best matches the description from the available categories
      - date: ONLY if a date is explicitly mentioned. Parse dates like "October 15th" as "2025-10-15", "yesterday", "last Tuesday", etc. If NO date is mentioned, set to null.
      - notes: Any additional context or description

      USER'S AVAILABLE CATEGORIES: #{categories_list}

      Return ONLY a JSON object like:
      {"amount": 45, "category_name": "Groceries", "date": "2025-10-15", "notes": "groceries at Trader Joe's"}

      IMPORTANT: Only include a date if the user explicitly mentions one. If no date is mentioned, use null for the date field.

      If you cannot extract amount or category, set them to null.
    PROMPT
  end
end

Copilot AI Dec 2, 2025

There's no test coverage for the VoiceEntryProcessor service. Given that the repository has comprehensive test coverage for controllers, tests should be added to cover:

  • Successful audio processing and JSON extraction
  • Error handling when JSON is not found in the response
  • Error handling for API failures
  • Date parsing logic (explicit dates vs. null)
  • Category matching from user's available categories
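
Several of these cases concern only the parsing step, which can be checked without calling Gemini. The sketch below assumes that step is pulled into a standalone function; `extract_entry` and its `today:` parameter are hypothetical names, while the regular expression, the JSON keys, and the fallback to today's date are copied from `VoiceEntryProcessor#process` as quoted above.

```ruby
require 'json'
require 'date'

# Hypothetical helper mirroring the parsing step of VoiceEntryProcessor#process.
# `today` is injectable so the null-date fallback is deterministic in tests.
def extract_entry(response_text, today: Date.today)
  # Greedy match from the first "{" to the last "}" tolerates surrounding text.
  json_match = response_text.match(/\{.*\}/m)
  raise ArgumentError, 'No JSON found in response' unless json_match

  data = JSON.parse(json_match[0])
  {
    amount: data['amount'],
    category_name: data['category_name'],
    date: data['date'] || today.to_s, # a null date falls back to today
    notes: data['notes']
  }
end

reply = 'Sure! {"amount": 45, "category_name": "Groceries", "date": null, "notes": "lunch"} Done.'
entry = extract_entry(reply, today: Date.new(2025, 12, 2))
```

With the fixed `today`, the null date in `reply` becomes "2025-12-02", a reply with an explicit date keeps it, and a reply without braces raises.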

Comment on lines +7 to +8
%div{ style: "margin-top: 1.5rem; padding-top: 1.5rem; border-top: 1px solid var(--pico-muted-border-color, #e0e0e0);" }
  %p{ style: "text-align: center; color: var(--pico-muted-color, #666); font-size: 0.9rem;" } Or enter manually:

Copilot AI Dec 2, 2025

[nitpick] Inline styles in the view reduce maintainability and make it harder to enforce consistent styling. Consider moving these styles to a CSS class:

%div.voice-recorder-divider
  %p.voice-recorder-divider-text Or enter manually:

And define in CSS:

.voice-recorder-divider {
  margin-top: 1.5rem;
  padding-top: 1.5rem;
  border-top: 1px solid var(--pico-muted-border-color, #e0e0e0);
}

.voice-recorder-divider-text {
  text-align: center;
  color: var(--pico-muted-color, #666);
  font-size: 0.9rem;
}
Suggested change
- %div{ style: "margin-top: 1.5rem; padding-top: 1.5rem; border-top: 1px solid var(--pico-muted-border-color, #e0e0e0);" }
-   %p{ style: "text-align: center; color: var(--pico-muted-color, #666); font-size: 0.9rem;" } Or enter manually:
+ %div.voice-recorder-divider
+   %p.voice-recorder-divider-text Or enter manually:

Comment on lines +9 to +11
processor = VoiceEntryProcessor.new(
user: current_user,
audio_file: params[:audio_file].tempfile

Copilot AI Dec 2, 2025

The tempfile from the uploaded audio could be deleted before the VoiceEntryProcessor completes if the request processing is slow or if garbage collection occurs. Consider reading the file content into memory first or ensuring the tempfile persists for the duration of processing:

processor = VoiceEntryProcessor.new(
  user: current_user,
  audio_file: params[:audio_file].tempfile.tap(&:rewind)
)

Alternatively, handle the file content directly rather than passing the path.

Suggested change
- processor = VoiceEntryProcessor.new(
-   user: current_user,
-   audio_file: params[:audio_file].tempfile
+ # Read the uploaded audio file into memory to avoid tempfile deletion issues
+ require 'stringio'
+ audio_io = StringIO.new(params[:audio_file].read)
+ audio_io.set_encoding(params[:audio_file].tempfile.external_encoding) if params[:audio_file].tempfile.respond_to?(:external_encoding)
+ processor = VoiceEntryProcessor.new(
+   user: current_user,
+   audio_file: audio_io
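
One caveat when weighing this suggestion: the processor calls `audio_file.path`, and a StringIO has no `path` method. If a copy of the upload is wanted, a variant that still yields a real file path is sketched below. It is an illustration under assumptions, not code from the PR: `with_audio_path` and the default `.webm` extension are hypothetical, and the sketch relies only on `Tempfile.create` with a block, which removes the file when the block returns.

```ruby
require 'tempfile'

# Hypothetical helper: copies the uploaded bytes into a temp file whose
# lifetime is scoped to the block, and yields its path to the caller.
def with_audio_path(bytes, extension: '.webm')
  Tempfile.create(['voice_entry', extension]) do |file|
    file.binmode      # audio is binary; avoid any encoding conversion
    file.write(bytes)
    file.flush        # make sure readers of the path see all bytes
    yield file.path
  end
end

seen_path = nil
copied = with_audio_path('abc'.b) do |path|
  seen_path = path
  File.binread(path)
end
```

The block's return value is passed through, so a caller could run the processing step inside the block and receive its result, with the file guaranteed to exist for exactly that duration.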

@@ -0,0 +1,76 @@
class VoiceEntryProcessor

Copilot AI Dec 2, 2025

[nitpick] The VoiceEntryProcessor class lacks documentation. Add a class-level comment explaining its purpose, parameters, and return value structure:

# Processes audio files containing voice descriptions of financial entries
# and extracts structured transaction data using Google Gemini AI.
#
# @example
#   processor = VoiceEntryProcessor.new(user: current_user, audio_file: tempfile)
#   result = processor.process
#   # => { success: true, data: { amount: 45.0, category_name: "Groceries", ... }, transcription: "..." }
#
class VoiceEntryProcessor

@@ -0,0 +1,5 @@
class AddVoiceEntriesEnabledToUsers < ActiveRecord::Migration[8.1]

Copilot AI Dec 2, 2025

The migration class uses ActiveRecord::Migration[8.1], but the Rails version in the Gemfile is ~> 8.0. Ensure this migration version is compatible with Rails 8.0, or update to ActiveRecord::Migration[8.0] to match the Rails version being used.

Suggested change
- class AddVoiceEntriesEnabledToUsers < ActiveRecord::Migration[8.1]
+ class AddVoiceEntriesEnabledToUsers < ActiveRecord::Migration[8.0]


populateForm(data) {
  // Populate amount field
  if (data.amount && this.hasAmountFieldTarget) {

Copilot AI Dec 2, 2025

The populateForm method's truthy check if (data.amount && ...) conflates null with zero. When data.amount is null (as documented in the prompt at line 73 of voice_entry_processor.rb), skipping the field is the intended behavior, but the same check also skips a legitimate amount of 0. A better approach is to test for null and undefined explicitly. Consider:

if (data.amount !== null && data.amount !== undefined && this.hasAmountFieldTarget) {
  this.amountFieldTarget.value = data.amount
}

This ensures zero values are properly populated while null values are not.

Suggested change
- if (data.amount && this.hasAmountFieldTarget) {
+ if (data.amount !== null && data.amount !== undefined && this.hasAmountFieldTarget) {
