Add support for custom selection fields#25
Conversation
WalkthroughThe changes introduce support for selecting specific fields in database queries by allowing a Changes
Sequence Diagram(s)sequenceDiagram
participant Caller
participant DB
participant Query
Caller->>DB: one({domain, :fetch}, bindings, opts)
DB->>DB: Extract target_fields from opts[:select] (default [:*])
DB->>Query: get_templated_query_id(domain, :__fetch, target_fields)
Query->>Query: Generate suffixed query ID from target_fields
Query->>Query: Compile query if not already compiled
Query-->>DB: Return query ID
DB-->>Caller: Return query result with selected fields
sequenceDiagram
participant Caller
participant DB
participant Query
Caller->>DB: all(schema, bindings, opts)
DB->>DB: Extract target_fields from opts[:select] (default [:*])
DB->>Query: get_templated_query_id(schema, :__all, target_fields)
Query->>Query: Generate suffixed query ID from target_fields
Query->>Query: Compile query if not already compiled
Query-->>DB: Return query ID
DB-->>Caller: Return query results with selected fields
Poem
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (2)
lib/feeb/db/query.ex (2)
166-171: Same validation concern as above forSELECTqueries – see earlier comment on lines 61-74.
173-184: Same validation concern as above forSELECT … WHEREqueries – see earlier comment on lines 61-74.
🧹 Nitpick comments (8)
lib/feeb/db/schema.ex (2)
129-130: Consider documenting the reason for the macro vs function questionThe TODO comment asks why this is implemented as a macro. While this is a good question, it would be helpful to either document the rationale or convert it to a regular function if there's no clear reason for it to be a macro.
145-147: Consider implementing the TODO validationsThere are TODOs for validating that the arguments match target fields exactly and that required fields are present. Consider implementing these validations as they would improve data integrity.
lib/feeb/db.ex (1)
145-149: Consider adding error handling for invalid field selectionsWhile the implementation correctly handles the happy path, consider adding validation to ensure the selected fields actually exist in the schema and throwing meaningful errors when invalid fields are requested.
def all({_, domain, query_name}, bindings, opts) do bindings = if is_list(bindings), do: bindings, else: [bindings] + # Validate selected fields if schema is available + if schema = opts[:schema] && opts[:select] do + invalid_fields = Enum.reject(opts[:select], fn field -> + field == :* || Map.has_key?(schema.__schema__(), field) + end) + unless invalid_fields == [], do: + raise "Invalid fields selected: #{inspect invalid_fields}" + end case GenServer.call(get_pid!(), {:query, :all, {domain, query_name}, bindings, opts}) do {:ok, rows} -> rows {:error, reason} -> raise reason end endlib/feeb/db/query.ex (5)
80-86: Dead code afterraisemakes intent unclearThe
elsebranch both raises and then “returns”target_fields, but that expression is never reached:else raise "Not supported for now, add & test it once needed" target_fields # ← unreachable endDelete the unreachable line or convert the branch to a proper
{:error, reason}tuple if you plan to handle it later.
124-133: Minor: Avoid unnecessary work when a compiled:__deletequery already exists
model = Schema.get_model_from_query_id(query_id)is executed even when the query is found in the cache and immediately returned.
Moving the model lookup inside thenil ->clause saves a lookup on the hot path:- model = Schema.get_model_from_query_id(query_id) - case get(query_id) do {_, _, _} -> query_id nil -> + model = Schema.get_model_from_query_id(query_id) compile_templated_query(:__delete, query_id, target_fields, model) end
247-255: Emptytarget_fieldsproduces awkward query-id:$
get_query_name_suffix([])returns an empty string, so the resulting atom becomes:"__all$"(note the trailing$).
Either drop the suffix altogether for[]/[:*]or fall back to"*"to keep the atom readable and unique.- |> Enum.join("$") + |> case do + "" -> "*" + suffix -> suffix + end
278-291: Implementgenerate_select_clause/1more idiomatically
Enum.reduce/3is overkill here;Enum.join/2is clearer and allocates less:- select_conditions = - fields - |> Enum.reduce([], fn field, acc -> - ["#{field}" | acc] - end) - |> Enum.reverse() - |> Enum.join(", ") - - "SELECT #{select_conditions}" + "SELECT " <> Enum.join(fields, ", ")
292-303:generate_update_set_clause/1can be simplified and hardened
- Use
Enum.map_join/3for clarity.- Quote identifiers with
"to avoid clashes with reserved words (even after earlier validation).-defp generate_update_set_clause(fields) when is_list(fields) do - set_conditions = - fields - |> Enum.reduce([], fn field, acc -> - ["#{field} = ?" | acc] - end) - |> Enum.reverse() - |> Enum.join(", ") - - "SET #{set_conditions}" -end +defp generate_update_set_clause(fields) do + "SET " <> + Enum.map_join(fields, ", ", fn field -> + ~s("#{field}" = ?) + end) +end
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
lib/feeb/db.ex(3 hunks)lib/feeb/db/query.ex(5 hunks)lib/feeb/db/schema.ex(1 hunks)test/db/db_test.exs(3 hunks)test/db/query_test.exs(5 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
test/db/db_test.exs (4)
lib/feeb/db.ex (3)
all(138-140)all(142-149)insert!(157-160)lib/feeb/db/sqlite.ex (2)
one(59-66)all(55-57)test/support/db/schemas/post.ex (1)
new(17-21)test/support/db/schemas/friend.ex (1)
new(17-21)
🔇 Additional comments (18)
lib/feeb/db/schema.ex (1)
130-130: Approved default parameter change from:allto[:*]Good change to update the default parameter to use the new consistent convention
[:*]for representing all fields, which aligns with the changes in query handling throughout the codebase.test/db/query_test.exs (7)
57-57: Updated parameter to use the new conventionGood change to use the new
[:*]convention instead of the previous empty list or:all. This makes the code more consistent throughout the codebase.
66-76: Good test coverage for custom field selectionThe new test verifies that specifying a subset of fields for selection works correctly, generating the appropriate SQL query with only the selected fields in the SELECT clause.
82-82: Updated parameter to use the new conventionConsistent usage of the new
[:*]convention.
95-95: Updated parameter to use the new conventionConsistent usage of the new
[:*]convention.
104-114: Good test for composite primary keys with custom field selectionThe test effectively verifies that custom field selection works correctly with schemas that have composite primary keys, ensuring that the correct SQL is generated.
123-123: Updated parameter to use the new conventionConsistent usage of the new
[:*]convention.
135-135: Updated parameter to use the new conventionConsistent usage of the new
[:*]convention.test/db/db_test.exs (7)
5-5: Added NotLoaded alias for testing with custom field selectionGood addition of the NotLoaded alias, which is used to verify that unselected fields are properly marked as not loaded.
402-402: Updated test description to match the function's actual arityGood update to the test description, changing from "one/1" to "one/3" to accurately reflect the function's arity and the additional options parameter being tested.
410-410: Updated test to use the new template query naming conventionUpdated from
:fetchto:__fetchto match the internal naming convention for templated queries.
417-422: Good test for custom field selection with the fetch queryThis test properly verifies that when only specific fields are requested, unselected fields are returned as
%NotLoaded{}values. It demonstrates that theselect:option works as expected.
424-438: Good test for composite primary keysThis test verifies that the
:__fetchtemplated query works correctly with schemas that have composite primary keys, which is an important edge case to cover.
516-527: Good test for the :__all templated queryThis test ensures that the DB.all function works correctly with the :__all templated query, both for the Friend schema (which has pre-existing data) and for a newly inserted Post.
529-537: Good test for custom field selection with the :__all queryThis test verifies that when only specific fields are requested in the DB.all function, unselected fields are properly marked as NotLoaded, demonstrating that the
select:option works as expected.lib/feeb/db.ex (3)
100-103: Good implementation of field selection for fetch queriesThe code now correctly extracts target_fields from the options or defaults to [:*], and passes them to the query ID generator. This enables custom field selection for fetch queries.
131-134: Good implementation of field selection for all queriesSimilar to the fetch queries, this code now supports custom field selection for the DB.all function with schema targets.
153-153: Updated parameter to use the new conventionChanged from empty list to
[:*]to maintain consistency with the new convention for specifying all fields.
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (3)
lib/feeb/db/query.ex (2)
60-61:String.replace/4can replace the wrong asterisk
String.replace(sql, "*", …, global: false)replaces the first*, but that might not be the star inSELECT *(e.g.json_extract(data,'$.foo*')).
Safer to target the exactSELECT * FROMpattern:-new_sql = String.replace(sql, "*", Enum.join(sorted_target_fields, ", "), global: false) +new_sql = + Regex.replace( + ~r/SELECT\s+\*\s+FROM/i, + sql, + "SELECT #{Enum.join(sorted_target_fields, ", ")} FROM", + global: false + )This avoids accidental mutation of other
*occurrences.
255-263: Minor:Enum.reduce/3here is over-engineeringSince you already sort the list, building the suffix can be simpler and clearer:
-defp get_query_name_suffix(target_fields) when is_list(target_fields) do - target_fields - |> Enum.sort() - |> Enum.reduce([], fn field, acc -> - ["#{field}" | acc] - end) - |> Enum.reverse() - |> Enum.join("$") -end +defp get_query_name_suffix(target_fields) do + target_fields + |> Enum.sort() + |> Enum.map(&to_string/1) + |> Enum.join("$") +endThis reduces two passes over the list to one and is more idiomatic.
priv/test/queries/test/all_types.sql (1)
10-12: Consider adding an index onintegerto support the new:get_by_integerqueryThe newly-added query will benefit from a database index if the
all_typestable is large:CREATE INDEX IF NOT EXISTS all_types_integer_idx ON all_types(integer);Not part of this PR’s code, but worth noting for deployment.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
lib/feeb/db.ex(4 hunks)lib/feeb/db/query.ex(5 hunks)priv/test/queries/test/all_types.sql(1 hunks)test/db/db_test.exs(4 hunks)test/db/query_test.exs(6 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- test/db/db_test.exs
- test/db/query_test.exs
- lib/feeb/db.ex
🧰 Additional context used
🧠 Learnings (1)
lib/feeb/db/query.ex (1)
Learnt from: renatomassaro
PR: renatomassaro/FeebDB#25
File: lib/feeb/db/query.ex:61-74
Timestamp: 2025-05-03T20:33:34.811Z
Learning: In FeebDB, when implementing field validation for custom selection fields, consider that validation errors would still be runtime errors, but can provide clearer error messages closer to the source of the problem.
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
lib/feeb/db/repo.ex(3 hunks)test/db/db_test.exs(6 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- test/db/db_test.exs
🔇 Additional comments (4)
lib/feeb/db/repo.ex (4)
286-287: Function signature change to reflect format option renaming.The format option has been renamed from
:typeto:mapto better describe its purpose in returning maps with selected fields. This change aligns with the PR objective of supporting custom field selection.
298-299: Consistent renaming of format option for list results.This change maintains consistency with the single-row case by using
:mapformat instead of:type.
304-305: Format check updated for insert results.The format check has been updated to use
:mapinstead of:type, keeping it consistent with other format changes throughout the file.
367-372:Details
✅ Verification successful
Implementation of
create_maps_from_rowshas future performance considerations.The implementation creates full schema results before extracting only the requested fields, which is noted as not being optimal for performance. While prioritizing readability is reasonable for now, this might become a bottleneck for large datasets or complex schemas.
Consider optimizing this in the future by directly creating maps with only the requested fields rather than building the full schema first. This would avoid the overhead of constructing and then filtering data.
defp create_maps_from_rows(query_id, {_, {fields_bindings, _}, :select} = query, rows) do # Performance-wise, not the best solution, but I'd rather keep the code readable for a bit # longer. Simply create the full schema and use only the fields the user selected create_schema_from_rows(query_id, query, rows) |> Enum.map(fn full_result -> Map.take(full_result, fields_bindings) end) end
🌐 Web query:
What is the performance impact of Map.take in Elixir compared to direct map creation?💡 Result:
Based on benchmarking results, Map.take in Elixir is notably slower than alternative methods for retrieving multiple values from a map. When comparing different approaches for extracting multiple values from maps, some interesting performance patterns emerge.
Performance Comparison
According to benchmarks, Map.take is significantly slower than using repeated Map.fetch operations:
- Map.take(map, [key_1, ..., key_100]) is approximately 2.5 times slower than using Enum.map([key_1, ..., key_100], &Map.fetch(map, &1))[1]
- For larger key sets with 1000 keys, the performance gap widens to about 3.5 times slower[1]
This result is counterintuitive, as one might expect a specialized function like Map.take to be more optimized than making repeated individual calls.
Implementation Details
Looking at the Map.take implementation in Elixir's source code, we can see it works by iterating through the provided keys, checking if each key exists in the map, and building an accumulator list of key-value pairs which is eventually converted to a new map using :maps.from_list/1[6]. This implementation might explain some of the performance characteristics.
Alternatives and Their Performance
When considering different methods for accessing map values:
Pattern matching is consistently the fastest approach - about 3 times faster than Map.get/2 and approximately 6 times faster than the Access behavior (square bracket syntax)[7]
Map.get/2 performs better than the Access behavior (square bracket syntax) but is still significantly slower than direct pattern matching[7]
Direct map creation through pattern matching or other techniques outperforms Map.take in most scenarios
Performance Considerations for Multiple Operations
For applications that require multiple map transformations, the Elixir documentation itself recommends:
"If you find yourself doing multiple calls to Map.filter/2 and Map.reject/2 in a pipeline, it is likely more efficient to use Enum.map/2 and Enum.filter/2 instead and convert to a map at the end using Map.new/1."[10]
This suggests that for complex map manipulations, working with enumerable functions and only converting to a map at the end can be more efficient than using the specialized Map functions.
When to Worry About These Differences
While these performance differences exist, it's important to note that in many real-world applications, the difference may be negligible. For example, when comparing Map.get/2 with the Access behavior, the difference was only about 0.5% in some benchmarks[5].
For most applications, code readability and maintainability should take precedence over these performance considerations unless you're dealing with performance-critical code paths that are executed frequently with large data structures.
Citations:
- 1: https://elixirforum.com/t/is-map-take-slower-than-running-map-fetch-repeatedly/59421
- 2: https://blog.appsignal.com/2022/09/06/benchmark-your-elixir-apps-performance-with-benchee.html
- 3: https://elixirschool.com/en/lessons/misc/benchee
- 4: https://elixirforum.com/t/benchmarking-dot-in-map-field/58170
- 5: https://stratus3d.com/blog/2019/06/29/performance-of-elixirs-access-behavior/
- 6: https://github.com/elixir-lang/elixir/blob/main/lib/elixir/lib/map.ex
- 7: https://stratus3d.com/blog/2019/07/13/pattern-matching-versus-elixirs-access-behavior/
- 8: https://hexdocs.pm/elixir/Map.html
- 9: https://elixirforum.com/t/map-over-a-map-and-create-a-new-map/37739
- 10: https://hexdocs.pm/elixir/1.18.1/Map.html
Performance Impact of Map.take in create_maps_from_rows
Based on recent benchmarks,
Map.take/2is about 2.5× slower for 100 keys and up to 3.5× slower for 1 000 keys compared to repeated fetches or direct map creation. While this overhead is usually negligible in most code paths, it’s worth keeping in mind for hot loops or very large datasets.File: lib/feeb/db/repo.ex
Lines: 367–372Current implementation:
defp create_maps_from_rows(query_id, {_, {fields_bindings, _}, :select} = query, rows) do # Performance-wise, not the best solution, but I'd rather keep the code readable for a bit # longer. Simply create the full schema and use only the fields the user selected create_schema_from_rows(query_id, query, rows) |> Enum.map(fn full_result -> Map.take(full_result, fields_bindings) end) endRecommendations for future optimization:
- If this path becomes performance-critical, consider building the result maps directly:
• Use pattern matching in thefn row -> %{key1: row.key1, key2: row.key2, …} endstyle
• Or useEnum.map(fields, &{&1, Map.fetch!(row, &1)}) |> Map.new()to avoid the extra conversion step- Measure and profile before and after to ensure the change yields real-world benefits without sacrificing maintainability.
Now you no longer have to always
SELECT *, you can fetch a subset of the entire row with an API that looks like this:Future improvements (still in the scope of this PR):
format: :typeformat: :type. Maybeformat: :maporformat: :subset?IMO user-defined queries should almost always
SELECT *and have their selection narrowed at runtime via adhoc queries.Summary by CodeRabbit
Summary by CodeRabbit
New Features
Bug Fixes
Tests