Skip to content

Add support for custom selection fields#25

Merged
renatomassaro merged 5 commits intomainfrom
add-support-for-custom-selection-fields
May 4, 2025
Merged

Add support for custom selection fields#25
renatomassaro merged 5 commits intomainfrom
add-support-for-custom-selection-fields

Conversation

@renatomassaro
Copy link
Copy Markdown
Owner

@renatomassaro renatomassaro commented May 3, 2025

Now you no longer have to always SELECT *, you can fetch a subset of the entire row with an API that looks like this:

DB.one({:invoices, :fetch}, [1], select: [:id, :total])

DB.all(Invoice, [], select: [:id, :total])

Future improvements (still in the scope of this PR):

  • Make sure custom selection works for user-defined queries (not only templated queries)
  • Expand coverage to include format: :type
  • Figure out a better name for format: :type. Maybe format: :map or format: :subset?

IMO user-defined queries should almost always SELECT * and have their selection narrowed at runtime via adhoc queries.

Summary by CodeRabbit

Summary by CodeRabbit

  • New Features

    • Added support for selecting specific fields when fetching or inserting data, allowing users to customize which fields are returned in query results.
    • Introduced dynamic query compilation to support ad-hoc queries with custom selected fields.
    • Added a new query to filter records by integer values.
  • Bug Fixes

    • Improved handling of partially loaded data, with unselected fields now returned as not loaded.
  • Tests

    • Expanded test coverage for custom field selection and composite primary keys to ensure correct behavior with new selection features.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented May 3, 2025

Walkthrough

The changes introduce support for selecting specific fields in database queries by allowing a :select option to specify target fields, defaulting to [:*]. Core query functions extract these fields and pass them to the query ID generator, which now creates suffixed query IDs and compiles queries dynamically when needed. SQL compilation logic was enhanced to replace SELECT * with the selected fields and to generate appropriate SQL for SELECT and UPDATE statements. The cast macro’s default target fields were updated to [:*]. Tests were expanded to verify correct behavior with custom selections, composite primary keys, and partially loaded structs. The result formatting in the repo was adjusted to use :map instead of :type.

Changes

File(s) Change Summary
lib/feeb/db.ex Updated one/3, all/3, and insert/2 to extract target_fields from opts[:select] (default [:*]), passing them to Query.get_templated_query_id/3. Added private helper get_query_id_for_select_query/2 to conditionally compile ad-hoc queries based on selected fields.
lib/feeb/db/query.ex Implemented compile_adhoc_query/2 to compile queries selecting subsets of fields from SELECT * originals, validating fields. Refactored get_templated_query_id/3 to handle :__all and :__fetch with suffixed query IDs based on sorted target fields and compile queries if missing. Modified insert and delete handling to use [:*] and simplified delete clause. Added helpers generate_select_clause/2, generate_update_set_clause/1, and get_query_name_suffix/1.
lib/feeb/db/schema.ex Changed default target_fields in cast/2 macro from :all to [:*]. Added a comment questioning why cast is a macro.
lib/feeb/db/repo.ex Replaced all :type format options with :map in result formatting. Renamed and updated private helper from create_types_from_rows/3 to create_maps_from_rows/3 to produce maps of selected fields instead of types.
test/db/db_test.exs Renamed and expanded tests for DB.one/3 and DB.all/3 to cover behavior with custom field selections and composite primary keys, including checks for partially loaded structs. Added tests for user-defined queries with custom selections.
test/db/query_test.exs Added tests for compile_adhoc_query/2 verifying subset field compilation and error handling. Updated all tests to use [:*] consistently for all-fields queries. Added tests for custom target fields in :__all and :__fetch queries, verifying SQL, target fields, bindings, and errors.
priv/test/queries/test/all_types.sql Added new query :get_by_integer selecting all columns from all_types where integer matches a parameter.

Sequence Diagram(s)

sequenceDiagram
    participant Caller
    participant DB
    participant Query

    Caller->>DB: one({domain, :fetch}, bindings, opts)
    DB->>DB: Extract target_fields from opts[:select] (default [:*])
    DB->>Query: get_templated_query_id(domain, :__fetch, target_fields)
    Query->>Query: Generate suffixed query ID from target_fields
    Query->>Query: Compile query if not already compiled
    Query-->>DB: Return query ID
    DB-->>Caller: Return query result with selected fields
Loading
sequenceDiagram
    participant Caller
    participant DB
    participant Query

    Caller->>DB: all(schema, bindings, opts)
    DB->>DB: Extract target_fields from opts[:select] (default [:*])
    DB->>Query: get_templated_query_id(schema, :__all, target_fields)
    Query->>Query: Generate suffixed query ID from target_fields
    Query->>Query: Compile query if not already compiled
    Query-->>DB: Return query ID
    DB-->>Caller: Return query results with selected fields
Loading

Poem

In the warren where queries hop and play,
Now fields can be picked in a custom array!
With suffixes sorted and SQL anew,
Only what’s needed is fetched for you.
Tests leap with delight, partial structs in tow—
Oh, what a field day for Feeb’s bunny show!
🐇✨

✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
lib/feeb/db/query.ex (2)

166-171: Same validation concern as above for SELECT queries – see earlier comment on lines 61-74.


173-184: Same validation concern as above for SELECT … WHERE queries – see earlier comment on lines 61-74.

🧹 Nitpick comments (8)
lib/feeb/db/schema.ex (2)

129-130: Consider documenting the reason for the macro vs function question

The TODO comment asks why this is implemented as a macro. While this is a good question, it would be helpful to either document the rationale or convert it to a regular function if there's no clear reason for it to be a macro.


145-147: Consider implementing the TODO validations

There are TODOs for validating that the arguments match target fields exactly and that required fields are present. Consider implementing these validations as they would improve data integrity.

lib/feeb/db.ex (1)

145-149: Consider adding error handling for invalid field selections

While the implementation correctly handles the happy path, consider adding validation to ensure the selected fields actually exist in the schema and throwing meaningful errors when invalid fields are requested.

def all({_, domain, query_name}, bindings, opts) do
  bindings = if is_list(bindings), do: bindings, else: [bindings]

+ # Validate selected fields if schema is available
+ if schema = opts[:schema] && opts[:select] do
+   invalid_fields = Enum.reject(opts[:select], fn field -> 
+     field == :* || Map.has_key?(schema.__schema__(), field)
+   end)
+   unless invalid_fields == [], do: 
+     raise "Invalid fields selected: #{inspect invalid_fields}"
+ end

  case GenServer.call(get_pid!(), {:query, :all, {domain, query_name}, bindings, opts}) do
    {:ok, rows} -> rows
    {:error, reason} -> raise reason
  end
end
lib/feeb/db/query.ex (5)

80-86: Dead code after raise makes intent unclear

The else branch both raises and then “returns” target_fields, but that expression is never reached:

else
  raise "Not supported for now, add & test it once needed"
  target_fields   # ← unreachable
end

Delete the unreachable line or convert the branch to a proper {:error, reason} tuple if you plan to handle it later.


124-133: Minor: Avoid unnecessary work when a compiled :__delete query already exists

model = Schema.get_model_from_query_id(query_id) is executed even when the query is found in the cache and immediately returned.
Moving the model lookup inside the nil -> clause saves a lookup on the hot path:

-  model = Schema.get_model_from_query_id(query_id)
-
   case get(query_id) do
     {_, _, _} ->
       query_id
     nil ->
+      model = Schema.get_model_from_query_id(query_id)
       compile_templated_query(:__delete, query_id, target_fields, model)
   end

247-255: Empty target_fields produces awkward query-id :$

get_query_name_suffix([]) returns an empty string, so the resulting atom becomes :"__all$" (note the trailing $).
Either drop the suffix altogether for []/[:*] or fall back to "*" to keep the atom readable and unique.

-  |> Enum.join("$")
+  |> case do
+       "" -> "*"
+       suffix -> suffix
+     end

278-291: Implement generate_select_clause/1 more idiomatically

Enum.reduce/3 is overkill here; Enum.join/2 is clearer and allocates less:

-    select_conditions =
-      fields
-      |> Enum.reduce([], fn field, acc ->
-        ["#{field}" | acc]
-      end)
-      |> Enum.reverse()
-      |> Enum.join(", ")
-
-    "SELECT #{select_conditions}"
+    "SELECT " <> Enum.join(fields, ", ")

292-303: generate_update_set_clause/1 can be simplified and hardened

  1. Use Enum.map_join/3 for clarity.
  2. Quote identifiers with " to avoid clashes with reserved words (even after earlier validation).
-defp generate_update_set_clause(fields) when is_list(fields) do
-  set_conditions =
-    fields
-    |> Enum.reduce([], fn field, acc ->
-      ["#{field} = ?" | acc]
-    end)
-    |> Enum.reverse()
-    |> Enum.join(", ")
-
-  "SET #{set_conditions}"
-end
+defp generate_update_set_clause(fields) do
+  "SET " <>
+    Enum.map_join(fields, ", ", fn field ->
+      ~s("#{field}" = ?)
+    end)
+end
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 55926a0 and 7674392.

📒 Files selected for processing (5)
  • lib/feeb/db.ex (3 hunks)
  • lib/feeb/db/query.ex (5 hunks)
  • lib/feeb/db/schema.ex (1 hunks)
  • test/db/db_test.exs (3 hunks)
  • test/db/query_test.exs (5 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
test/db/db_test.exs (4)
lib/feeb/db.ex (3)
  • all (138-140)
  • all (142-149)
  • insert! (157-160)
lib/feeb/db/sqlite.ex (2)
  • one (59-66)
  • all (55-57)
test/support/db/schemas/post.ex (1)
  • new (17-21)
test/support/db/schemas/friend.ex (1)
  • new (17-21)
🔇 Additional comments (18)
lib/feeb/db/schema.ex (1)

130-130: Approved default parameter change from :all to [:*]

Good change to update the default parameter to use the new consistent convention [:*] for representing all fields, which aligns with the changes in query handling throughout the codebase.

test/db/query_test.exs (7)

57-57: Updated parameter to use the new convention

Good change to use the new [:*] convention instead of the previous empty list or :all. This makes the code more consistent throughout the codebase.


66-76: Good test coverage for custom field selection

The new test verifies that specifying a subset of fields for selection works correctly, generating the appropriate SQL query with only the selected fields in the SELECT clause.


82-82: Updated parameter to use the new convention

Consistent usage of the new [:*] convention.


95-95: Updated parameter to use the new convention

Consistent usage of the new [:*] convention.


104-114: Good test for composite primary keys with custom field selection

The test effectively verifies that custom field selection works correctly with schemas that have composite primary keys, ensuring that the correct SQL is generated.


123-123: Updated parameter to use the new convention

Consistent usage of the new [:*] convention.


135-135: Updated parameter to use the new convention

Consistent usage of the new [:*] convention.

test/db/db_test.exs (7)

5-5: Added NotLoaded alias for testing with custom field selection

Good addition of the NotLoaded alias, which is used to verify that unselected fields are properly marked as not loaded.


402-402: Updated test description to match the function's actual arity

Good update to the test description, changing from "one/1" to "one/3" to accurately reflect the function's arity and the additional options parameter being tested.


410-410: Updated test to use the new template query naming convention

Updated from :fetch to :__fetch to match the internal naming convention for templated queries.


417-422: Good test for custom field selection with the fetch query

This test properly verifies that when only specific fields are requested, unselected fields are returned as %NotLoaded{} values. It demonstrates that the select: option works as expected.


424-438: Good test for composite primary keys

This test verifies that the :__fetch templated query works correctly with schemas that have composite primary keys, which is an important edge case to cover.


516-527: Good test for the :__all templated query

This test ensures that the DB.all function works correctly with the :__all templated query, both for the Friend schema (which has pre-existing data) and for a newly inserted Post.


529-537: Good test for custom field selection with the :__all query

This test verifies that when only specific fields are requested in the DB.all function, unselected fields are properly marked as NotLoaded, demonstrating that the select: option works as expected.

lib/feeb/db.ex (3)

100-103: Good implementation of field selection for fetch queries

The code now correctly extracts target_fields from the options or defaults to [:*], and passes them to the query ID generator. This enables custom field selection for fetch queries.


131-134: Good implementation of field selection for all queries

Similar to the fetch queries, this code now supports custom field selection for the DB.all function with schema targets.


153-153: Updated parameter to use the new convention

Changed from empty list to [:*] to maintain consistency with the new convention for specifying all fields.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (3)
lib/feeb/db/query.ex (2)

60-61: String.replace/4 can replace the wrong asterisk

String.replace(sql, "*", …, global: false) replaces the first *, but that might not be the star in SELECT * (e.g. json_extract(data,'$.foo*')).
Safer to target the exact SELECT * FROM pattern:

-new_sql = String.replace(sql, "*", Enum.join(sorted_target_fields, ", "), global: false)
+new_sql =
+  Regex.replace(
+    ~r/SELECT\s+\*\s+FROM/i,
+    sql,
+    "SELECT #{Enum.join(sorted_target_fields, ", ")} FROM",
+    global: false
+  )

This avoids accidental mutation of other * occurrences.


255-263: Minor: Enum.reduce/3 here is over-engineering

Since you already sort the list, building the suffix can be simpler and clearer:

-defp get_query_name_suffix(target_fields) when is_list(target_fields) do
-  target_fields
-  |> Enum.sort()
-  |> Enum.reduce([], fn field, acc ->
-    ["#{field}" | acc]
-  end)
-  |> Enum.reverse()
-  |> Enum.join("$")
-end
+defp get_query_name_suffix(target_fields) do
+  target_fields
+  |> Enum.sort()
+  |> Enum.map(&to_string/1)
+  |> Enum.join("$")
+end

This reduces two passes over the list to one and is more idiomatic.

priv/test/queries/test/all_types.sql (1)

10-12: Consider adding an index on integer to support the new :get_by_integer query

The newly-added query will benefit from a database index if the all_types table is large:

CREATE INDEX IF NOT EXISTS all_types_integer_idx ON all_types(integer);

Not part of this PR’s code, but worth noting for deployment.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7674392 and 8ff4e9c.

📒 Files selected for processing (5)
  • lib/feeb/db.ex (4 hunks)
  • lib/feeb/db/query.ex (5 hunks)
  • priv/test/queries/test/all_types.sql (1 hunks)
  • test/db/db_test.exs (4 hunks)
  • test/db/query_test.exs (6 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • test/db/db_test.exs
  • test/db/query_test.exs
  • lib/feeb/db.ex
🧰 Additional context used
🧠 Learnings (1)
lib/feeb/db/query.ex (1)
Learnt from: renatomassaro
PR: renatomassaro/FeebDB#25
File: lib/feeb/db/query.ex:61-74
Timestamp: 2025-05-03T20:33:34.811Z
Learning: In FeebDB, when implementing field validation for custom selection fields, consider that validation errors would still be runtime errors, but can provide clearer error messages closer to the source of the problem.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8ff4e9c and 86dca78.

📒 Files selected for processing (2)
  • lib/feeb/db/repo.ex (3 hunks)
  • test/db/db_test.exs (6 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/db/db_test.exs
🔇 Additional comments (4)
lib/feeb/db/repo.ex (4)

286-287: Function signature change to reflect format option renaming.

The format option has been renamed from :type to :map to better describe its purpose in returning maps with selected fields. This change aligns with the PR objective of supporting custom field selection.


298-299: Consistent renaming of format option for list results.

This change maintains consistency with the single-row case by using :map format instead of :type.


304-305: Format check updated for insert results.

The format check has been updated to use :map instead of :type, keeping it consistent with other format changes throughout the file.


367-372:

Details

✅ Verification successful

Implementation of create_maps_from_rows has future performance considerations.

The implementation creates full schema results before extracting only the requested fields, which is noted as not being optimal for performance. While prioritizing readability is reasonable for now, this might become a bottleneck for large datasets or complex schemas.

Consider optimizing this in the future by directly creating maps with only the requested fields rather than building the full schema first. This would avoid the overhead of constructing and then filtering data.

defp create_maps_from_rows(query_id, {_, {fields_bindings, _}, :select} = query, rows) do
  # Performance-wise, not the best solution, but I'd rather keep the code readable for a bit
  # longer. Simply create the full schema and use only the fields the user selected
  create_schema_from_rows(query_id, query, rows)
  |> Enum.map(fn full_result -> Map.take(full_result, fields_bindings) end)
end

🌐 Web query:

What is the performance impact of Map.take in Elixir compared to direct map creation?

💡 Result:

Based on benchmarking results, Map.take in Elixir is notably slower than alternative methods for retrieving multiple values from a map. When comparing different approaches for extracting multiple values from maps, some interesting performance patterns emerge.

Performance Comparison

According to benchmarks, Map.take is significantly slower than using repeated Map.fetch operations:

  • Map.take(map, [key_1, ..., key_100]) is approximately 2.5 times slower than using Enum.map([key_1, ..., key_100], &Map.fetch(map, &1))[1]
  • For larger key sets with 1000 keys, the performance gap widens to about 3.5 times slower[1]

This result is counterintuitive, as one might expect a specialized function like Map.take to be more optimized than making repeated individual calls.

Implementation Details

Looking at the Map.take implementation in Elixir's source code, we can see it works by iterating through the provided keys, checking if each key exists in the map, and building an accumulator list of key-value pairs which is eventually converted to a new map using :maps.from_list/1[6]. This implementation might explain some of the performance characteristics.

Alternatives and Their Performance

When considering different methods for accessing map values:

  1. Pattern matching is consistently the fastest approach - about 3 times faster than Map.get/2 and approximately 6 times faster than the Access behavior (square bracket syntax)[7]

  2. Map.get/2 performs better than the Access behavior (square bracket syntax) but is still significantly slower than direct pattern matching[7]

  3. Direct map creation through pattern matching or other techniques outperforms Map.take in most scenarios

Performance Considerations for Multiple Operations

For applications that require multiple map transformations, the Elixir documentation itself recommends:

"If you find yourself doing multiple calls to Map.filter/2 and Map.reject/2 in a pipeline, it is likely more efficient to use Enum.map/2 and Enum.filter/2 instead and convert to a map at the end using Map.new/1."[10]

This suggests that for complex map manipulations, working with enumerable functions and only converting to a map at the end can be more efficient than using the specialized Map functions.

When to Worry About These Differences

While these performance differences exist, it's important to note that in many real-world applications, the difference may be negligible. For example, when comparing Map.get/2 with the Access behavior, the difference was only about 0.5% in some benchmarks[5].

For most applications, code readability and maintainability should take precedence over these performance considerations unless you're dealing with performance-critical code paths that are executed frequently with large data structures.

Citations:


Performance Impact of Map.take in create_maps_from_rows

Based on recent benchmarks, Map.take/2 is about 2.5× slower for 100 keys and up to 3.5× slower for 1 000 keys compared to repeated fetches or direct map creation. While this overhead is usually negligible in most code paths, it’s worth keeping in mind for hot loops or very large datasets.

File: lib/feeb/db/repo.ex
Lines: 367–372

Current implementation:

defp create_maps_from_rows(query_id, {_, {fields_bindings, _}, :select} = query, rows) do
  # Performance-wise, not the best solution, but I'd rather keep the code readable for a bit
  # longer. Simply create the full schema and use only the fields the user selected
  create_schema_from_rows(query_id, query, rows)
  |> Enum.map(fn full_result -> Map.take(full_result, fields_bindings) end)
end

Recommendations for future optimization:

  • If this path becomes performance-critical, consider building the result maps directly:
    • Use pattern matching in the fn row -> %{key1: row.key1, key2: row.key2, …} end style
    • Or use Enum.map(fields, &{&1, Map.fetch!(row, &1)}) |> Map.new() to avoid the extra conversion step
  • Measure and profile before and after to ensure the change yields real-world benefits without sacrificing maintainability.

@renatomassaro renatomassaro merged commit 4b0bae6 into main May 4, 2025
5 checks passed
@renatomassaro renatomassaro deleted the add-support-for-custom-selection-fields branch May 4, 2025 15:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant