Pull request overview
This PR introduces voice entry functionality, allowing users to create financial entries by recording their voice. The feature uses Google Gemini AI via the RubyLLM gem to transcribe and extract structured transaction data from audio recordings.
Key Changes:
- Adds voice recording capability with AI-powered transaction extraction using Google Gemini
- Updates Ruby version from 3.4.4 to 3.4.7 across all configuration files
- Implements rate limiting (5 requests/minute) and feature flag for voice entries
Reviewed changes
Copilot reviewed 14 out of 15 changed files in this pull request and generated 19 comments.
| File | Description |
|---|---|
| db/migrate/20251202132403_add_voice_entries_enabled_to_users.rb | Adds voice_entries_enabled boolean flag to users table |
| db/schema.rb | Updates schema version to 8.1 and adds voice_entries_enabled column (plus alphabetical reordering) |
| config/routes.rb | Adds voice_entries resource endpoint (create only) |
| config/initializers/ruby_llm.rb | Configures RubyLLM with Gemini API key from Rails credentials |
| config/credentials.yml.enc | Updates encrypted credentials (likely adding Gemini API key) |
| app/controllers/voice_entries_controller.rb | New controller handling voice entry creation with validation and rate limiting |
| app/services/voice_entry_processor.rb | New service processing audio files via Gemini AI to extract transaction data |
| app/javascript/controllers/voice_recorder_controller.js | New Stimulus controller managing voice recording UI and form population |
| app/views/entries/new.html.haml | Conditionally displays voice recorder interface for enabled users |
| app/views/entries/_voice_recorder.html.haml | Voice recorder button partial |
| app/views/entries/_form.html.haml | Adds data targets to form fields for voice recorder integration |
| Gemfile | Updates Ruby version to 3.4.7 and adds ruby_llm gem |
| Gemfile.lock | Reflects ruby_llm dependencies and Ruby version update |
| Dockerfile | Updates base image to Ruby 3.4.7 |
| Dockerfile.prod | Updates base image to Ruby 3.4.7 |
- date: ONLY if a date is explicitly mentioned. Parse dates like "October 15th" as "2025-10-15", "yesterday", "last Tuesday", etc. If NO date is mentioned, set to null.
- notes: Any additional context or description

USER'S AVAILABLE CATEGORIES: #{categories_list}

Return ONLY a JSON object like:
{"amount": 45, "category_name": "Groceries", "date": "2025-10-15", "notes": "groceries at Trader Joe's"}

[nitpick] The hardcoded year "2025" in the example date format may cause confusion when this code runs in different years. Consider using a more generic example or dynamically generating the current year:
current_year = Date.today.year
<<~PROMPT
...
- date: ONLY if a date is explicitly mentioned. Parse dates like "October 15th" as "#{current_year}-10-15", "yesterday", "last Tuesday", etc. If NO date is mentioned, set to null.
...
{"amount": 45, "category_name": "Groceries", "date": "#{current_year}-10-15", "notes": "groceries at Trader Joe's"}
...
PROMPT

Suggested change:
- - date: ONLY if a date is explicitly mentioned. Parse dates like "October 15th" as "2025-10-15", "yesterday", "last Tuesday", etc. If NO date is mentioned, set to null.
- - notes: Any additional context or description
- USER'S AVAILABLE CATEGORIES: #{categories_list}
- Return ONLY a JSON object like:
- {"amount": 45, "category_name": "Groceries", "date": "2025-10-15", "notes": "groceries at Trader Joe's"}
+ - date: ONLY if a date is explicitly mentioned. Parse dates like "October 15th" as "#{Date.today.year}-10-15", "yesterday", "last Tuesday", etc. If NO date is mentioned, set to null.
+ - notes: Any additional context or description
+ USER'S AVAILABLE CATEGORIES: #{categories_list}
+ Return ONLY a JSON object like:
+ {"amount": 45, "category_name": "Groceries", "date": "#{Date.today.year}-10-15", "notes": "groceries at Trader Joe's"}

# Then add:
#   google:
#     gemini_api_key: YOUR_API_KEY_HERE
config.gemini_api_key = Rails.application.credentials.dig(:google, :gemini_api_key)

The initializer will fail silently if the Gemini API key is not configured, which could lead to runtime errors when voice entries are used. Consider adding a warning or error if the API key is missing:
api_key = Rails.application.credentials.dig(:google, :gemini_api_key)
Rails.logger.warn "Gemini API key not configured" if api_key.blank?
config.gemini_api_key = api_key

Suggested change:
- config.gemini_api_key = Rails.application.credentials.dig(:google, :gemini_api_key)
+ api_key = Rails.application.credentials.dig(:google, :gemini_api_key)
+ Rails.logger.warn "Gemini API key not configured" if api_key.blank?
+ config.gemini_api_key = api_key

class VoiceEntriesController < ApplicationController
  # Rate limiting: 5 requests per minute per user
  before_action :check_voice_entries_access
  before_action :check_rate_limit

  def create
    validate_audio!

    processor = VoiceEntryProcessor.new(
      user: current_user,
      audio_file: params[:audio_file].tempfile
    )

    result = processor.process

    if result[:success]
      render json: result
    else
      render json: result, status: :unprocessable_entity
    end
  rescue => e
    Rails.logger.error "Voice entry creation failed: #{e.message}"
    render json: {
      success: false,
      error: e.message,
      transcription: nil
    }, status: :unprocessable_entity
  end

  private

  def check_voice_entries_access
    unless current_user.voice_entries_enabled?
      render json: {
        success: false,
        error: 'Voice entries feature is not enabled for your account.',
        transcription: nil
      }, status: :forbidden
    end
  end

  def validate_audio!
    audio = params[:audio_file]

    raise 'No audio file provided' unless audio.present?

    # Check content type
    unless audio.content_type&.start_with?('audio/')
      raise 'Invalid file type. Must be audio file.'
    end

    # Check file size (max 5MB)
    if audio.size > 5.megabytes
      raise 'Audio file too large. Maximum size is 5MB.'
    end
  end

  def check_rate_limit
    cache_key = "voice_entries:#{current_user.id}:#{Time.current.to_i / 60}"
    count = Rails.cache.read(cache_key) || 0

    if count >= 5
      render json: {
        success: false,
        error: 'Too many requests. Please wait a moment.',
        transcription: nil
      }, status: :too_many_requests
      return
    end

    Rails.cache.write(cache_key, count + 1, expires_in: 1.minute)
  end
end

There's no test coverage for the new VoiceEntriesController. Given that the repository has comprehensive test coverage for other controllers (e.g., spec/controllers/entries_spec.rb), tests should be added to cover:
- Authentication and authorization checks
- Rate limiting behavior
- Audio file validation
- Success and error scenarios
- Voice entries feature flag check
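
One way to make the rate limiting behavior easy to cover is to isolate the windowing logic from Rails. The sketch below is a plain-Ruby stand-in for `check_rate_limit`, assuming the same key scheme and limit as the controller; `FixedWindowLimiter` and its in-memory `store` are hypothetical names used only for illustration, not code from this PR.

```ruby
# Hypothetical stand-in for the controller's check_rate_limit logic.
# The Hash store replaces Rails.cache so the behavior can be checked without Rails.
class FixedWindowLimiter
  def initialize(limit: 5, window_seconds: 60, store: Hash.new(0))
    @limit = limit
    @window_seconds = window_seconds
    @store = store
  end

  # Returns true and counts the request when the user is under the limit
  # for the current window; returns false without counting otherwise.
  def allow?(user_id, now: Time.now)
    key = "voice_entries:#{user_id}:#{now.to_i / @window_seconds}"
    return false if @store[key] >= @limit

    @store[key] += 1
    true
  end
end

limiter = FixedWindowLimiter.new
window_start = Time.at(1_700_000_040) # a timestamp divisible by 60

# Six requests in the same window: the sixth is rejected.
same_window = Array.new(6) { limiter.allow?(1, now: window_start) }

# A different user and the next window are both unaffected.
other_user = limiter.allow?(2, now: window_start)
next_window = limiter.allow?(1, now: window_start + 60)
```

A controller spec could assert the same three cases (sixth request rejected, per-user isolation, reset in the next window) against the real endpoint.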
class VoiceEntryProcessor
  attr_reader :user, :audio_file

  def initialize(user:, audio_file:)
    @user = user
    @audio_file = audio_file
  end

  def process
    # Create a chat instance with Gemini
    chat = RubyLLM.chat(model: 'gemini-2.5-flash-lite')

    # Ask the question with the audio file
    response = chat.ask(build_system_prompt, with: audio_file.path)

    # Extract text from RubyLLM::Message object
    response_text = if response.respond_to?(:content)
      response.content
    elsif response.respond_to?(:text)
      response.text
    else
      response.to_s
    end

    # Extract JSON from response (in case there's extra text)
    json_match = response_text.match(/\{.*\}/m)
    raise 'No JSON found in response' unless json_match

    data = JSON.parse(json_match[0])

    {
      success: true,
      data: {
        amount: data['amount'],
        category_name: data['category_name'],
        date: data['date'] || Date.today.to_s,
        notes: data['notes']
      },
      transcription: response_text
    }
  rescue => e
    Rails.logger.error "Voice entry processing failed: #{e.message}"
    Rails.logger.error e.backtrace.join("\n")

    {
      success: false,
      error: "Failed to process voice recording: #{e.message}",
      transcription: nil
    }
  end

  private

  def build_system_prompt
    categories_list = user.categories.pluck(:name).join(', ')

    <<~PROMPT
      You are a financial entry assistant. Extract transaction information from speech and return as JSON.

      Extract:
      - amount: Dollar amount as a number (e.g., 45 or 45.50)
      - category_name: Category that best matches the description from the available categories
      - date: ONLY if a date is explicitly mentioned. Parse dates like "October 15th" as "2025-10-15", "yesterday", "last Tuesday", etc. If NO date is mentioned, set to null.
      - notes: Any additional context or description

      USER'S AVAILABLE CATEGORIES: #{categories_list}

      Return ONLY a JSON object like:
      {"amount": 45, "category_name": "Groceries", "date": "2025-10-15", "notes": "groceries at Trader Joe's"}

      IMPORTANT: Only include a date if the user explicitly mentions one. If no date is mentioned, use null for the date field.

      If you cannot extract amount or category, set them to null.
    PROMPT
  end
end

There's no test coverage for the VoiceEntryProcessor service. Given that the repository has comprehensive test coverage for controllers, tests should be added to cover:
- Successful audio processing and JSON extraction
- Error handling when JSON is not found in the response
- Error handling for API failures
- Date parsing logic (explicit dates vs. null)
- Category matching from user's available categories
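
As a rough illustration of the JSON extraction and date fallback cases listed above, the sketch below copies the parsing step from `VoiceEntryProcessor#process` into a standalone method so it can run without RubyLLM or Rails. `extract_entry` and the `today:` parameter are hypothetical names introduced here, and the response strings are made up for the example.

```ruby
require "json"
require "date"

# Hypothetical helper mirroring the parsing step in VoiceEntryProcessor#process.
# `today:` is injected so the null-date fallback is deterministic.
def extract_entry(response_text, today: Date.today)
  json_match = response_text.match(/\{.*\}/m)
  raise "No JSON found in response" unless json_match

  data = JSON.parse(json_match[0])
  {
    amount: data["amount"],
    category_name: data["category_name"],
    date: data["date"] || today.to_s,
    notes: data["notes"]
  }
end

today = Date.new(2025, 12, 2)

# An explicit date is kept as-is; text around the JSON object is ignored.
with_date = extract_entry(
  'Sure: {"amount": 45, "category_name": "Groceries", "date": "2025-10-15", "notes": null}',
  today: today
)

# A null date falls back to today's date.
without_date = extract_entry(
  '{"amount": 12.5, "category_name": null, "date": null, "notes": "coffee"}',
  today: today
)

# A response with no JSON object raises; the processor rescues this into an error hash.
error_message =
  begin
    extract_entry("I could not understand the audio")
  rescue => e
    e.message
  end
```

In a real spec, the RubyLLM call would presumably be stubbed so that these response strings are returned by `chat.ask`.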
%div{ style: "margin-top: 1.5rem; padding-top: 1.5rem; border-top: 1px solid var(--pico-muted-border-color, #e0e0e0);" }
%p{ style: "text-align: center; color: var(--pico-muted-color, #666); font-size: 0.9rem;" } Or enter manually:

[nitpick] Inline styles in the view reduce maintainability and make it harder to enforce consistent styling. Consider moving these styles to a CSS class:
%div.voice-recorder-divider
%p.voice-recorder-divider-text Or enter manually:

And define in CSS:
.voice-recorder-divider {
  margin-top: 1.5rem;
  padding-top: 1.5rem;
  border-top: 1px solid var(--pico-muted-border-color, #e0e0e0);
}
.voice-recorder-divider-text {
  text-align: center;
  color: var(--pico-muted-color, #666);
  font-size: 0.9rem;
}

Suggested change:
- %div{ style: "margin-top: 1.5rem; padding-top: 1.5rem; border-top: 1px solid var(--pico-muted-border-color, #e0e0e0);" }
- %p{ style: "text-align: center; color: var(--pico-muted-color, #666); font-size: 0.9rem;" } Or enter manually:
+ %div.voice-recorder-divider
+ %p.voice-recorder-divider-text Or enter manually:

processor = VoiceEntryProcessor.new(
  user: current_user,
  audio_file: params[:audio_file].tempfile

The tempfile from the uploaded audio could be deleted before the VoiceEntryProcessor completes if request processing is slow or if garbage collection occurs. Consider reading the file content into memory first, or ensuring the tempfile persists for the duration of processing:
processor = VoiceEntryProcessor.new(
  user: current_user,
  audio_file: params[:audio_file].tempfile.tap(&:rewind)
)
Alternatively, handle the file content directly rather than passing the path. Note that VoiceEntryProcessor#process currently calls audio_file.path, so passing an in-memory IO such as StringIO would also require a matching change in the processor.

Suggested change:
- processor = VoiceEntryProcessor.new(
-   user: current_user,
-   audio_file: params[:audio_file].tempfile
+ # Read the uploaded audio file into memory to avoid tempfile deletion issues
+ require 'stringio'
+ audio_io = StringIO.new(params[:audio_file].read)
+ audio_io.set_encoding(params[:audio_file].tempfile.external_encoding) if params[:audio_file].tempfile.respond_to?(:external_encoding)
+ processor = VoiceEntryProcessor.new(
+   user: current_user,
+   audio_file: audio_io

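
A small check, using only the standard library, of how the in-memory variant interacts with the processor in this PR, which passes `audio_file.path` to `chat.ask`. The `upload` tempfile below is a made-up stand-in for the uploaded audio; the point is only that a `Tempfile` exposes `path` while a `StringIO` does not.

```ruby
require "stringio"
require "tempfile"

# Stand-in for the uploaded audio; the bytes are invented for illustration.
upload = Tempfile.new(["voice", ".webm"])
upload.binmode
upload.write("fake audio bytes")
upload.rewind

# Copy the upload into memory, as the suggested change proposes.
audio_io = StringIO.new(upload.read)

# Tempfile responds to #path; StringIO does not, so a call such as
# audio_file.path would raise NoMethodError on the in-memory copy.
tempfile_has_path = upload.respond_to?(:path)
stringio_has_path = audio_io.respond_to?(:path)

upload.close
upload.unlink
```

If the in-memory approach is adopted, the processor would need some other way to hand the audio to the client library; which inputs RubyLLM accepts there is not confirmed by this PR.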
@@ -0,0 +1,76 @@
class VoiceEntryProcessor

[nitpick] The VoiceEntryProcessor class lacks documentation. Add a class-level comment explaining its purpose, parameters, and return value structure:
# Processes audio files containing voice descriptions of financial entries
# and extracts structured transaction data using Google Gemini AI.
#
# @example
# processor = VoiceEntryProcessor.new(user: current_user, audio_file: tempfile)
# result = processor.process
# # => { success: true, data: { amount: 45.0, category_name: "Groceries", ... }, transcription: "..." }
#
class VoiceEntryProcessor

@@ -0,0 +1,5 @@
class AddVoiceEntriesEnabledToUsers < ActiveRecord::Migration[8.1]

The migration class uses ActiveRecord::Migration[8.1], but the Rails version in the Gemfile is ~> 8.0. Ensure this migration version is compatible with Rails 8.0, or update to ActiveRecord::Migration[8.0] to match the Rails version being used.

Suggested change:
- class AddVoiceEntriesEnabledToUsers < ActiveRecord::Migration[8.1]
+ class AddVoiceEntriesEnabledToUsers < ActiveRecord::Migration[8.0]

populateForm(data) {
  // Populate amount field
  if (data.amount && this.hasAmountFieldTarget) {

The populateForm method's truthy check if (data.amount && ...) skips the field when data.amount is null (as documented in the prompt at line 73 of voice_entry_processor.rb), but it also skips it when the amount is 0. A better approach is to handle null, zero, and missing values explicitly. Consider:
if (data.amount !== null && data.amount !== undefined && this.hasAmountFieldTarget) {
this.amountFieldTarget.value = data.amount
}

This ensures zero values are properly populated while null values are not.

Suggested change:
- if (data.amount && this.hasAmountFieldTarget) {
+ if (data.amount !== null && data.amount !== undefined && this.hasAmountFieldTarget) {