Skip to content

Conversation

@mtran-code
Copy link
Collaborator

@mtran-code mtran-code commented Oct 21, 2025

Closes #50

  • Accepts vectorized input for queryUnichem
  • if more than 1 input, will query in parallel
  • added 2 tests related to new logic

Summary by CodeRabbit

Release Notes

  • New Features

    • Added support for querying multiple compounds in a single call; results returned as a named list.
    • Added progress parameter to display status updates when processing multiple queries.
  • Enhancements

    • sourceID argument now accepts vectors to specify different IDs for each compound.
    • Improved documentation for multi-query functionality and parameter handling.

@coderabbitai
Copy link

coderabbitai bot commented Oct 21, 2025

Walkthrough

The PR refactors queryUnichemCompound to accept multiple compounds as vectors or lists for batch processing with optional parallel execution, introduces a new progress parameter for user feedback, adds per-compound sourceID validation, and refactors getUnichemSources with improved error handling and column management. Test coverage expands to validate new batch and parallel functionality.

Changes

Cohort / File(s) Summary
Core Function Updates
R/unichem.R
getUnichemSources: Refactored internal flow with early API response error checking, updated column names and default ordering, enhanced debug logging support. queryUnichemCompound: Major refactor introducing multi-query batch framework with vector/list compound support, progress parameter for feedback, per-compound sourceID validation, modular request/response utilities, and parallel execution path (.perform_request_parallel) returning named lists; preserves raw output and single-query fallback.
Documentation Updates
man/queryUnichemCompound.Rd
Added progress parameter documentation (default: "Querying UniChem..."). Expanded compound argument to accept vector/list with results returned as named list. Updated sourceID to support vector matching compound length or single value with recycling (default: NA). Updated value description reflecting multi-query named-list returns.
Test Suite Expansion
tests/testthat/test_unichem.R
Added test blocks for vector inputs and non-sourceID list scenarios. Reformatted expected columns into multiline list. Added expect_setequal(names(sources), expected_columns) assertion for column validation. Converted inline calls to multiline formatting for consistency; expanded coverage for batch query paths.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant queryUnichemCompound
    participant Validator
    participant RequestBuilder
    participant ParallelExec
    participant API
    participant ResponseParser

    rect rgb(200, 230, 255)
    Note over User,ResponseParser: Single Compound (Existing Path)
    User->>queryUnichemCompound: compound="CHEMBL123"
    queryUnichemCompound->>Validator: validate sourceID
    Validator-->>queryUnichemCompound: ✓
    queryUnichemCompound->>RequestBuilder: build request
    RequestBuilder-->>queryUnichemCompound: request object
    queryUnichemCompound->>API: execute query
    API-->>queryUnichemCompound: response
    queryUnichemCompound->>ResponseParser: parse response
    ResponseParser-->>queryUnichemCompound: data frame
    queryUnichemCompound-->>User: return result
    end

    rect rgb(230, 255, 230)
    Note over User,ResponseParser: Multiple Compounds (New Parallel Path)
    User->>queryUnichemCompound: compound=c("CHEMBL123", "CHEMBL456"), progress="Querying..."
    queryUnichemCompound->>Validator: validate per-compound sourceID
    Validator-->>queryUnichemCompound: sourceID vector recycled/validated
    queryUnichemCompound->>RequestBuilder: build requests (per compound)
    RequestBuilder-->>queryUnichemCompound: request list
    queryUnichemCompound->>ParallelExec: .perform_request_parallel(progress="Querying...")
    par Parallel Execution
        ParallelExec->>API: compound 1 query
        ParallelExec->>API: compound 2 query
    and
        API-->>ParallelExec: response 1
        API-->>ParallelExec: response 2
    end
    ParallelExec->>ResponseParser: parse all responses
    ResponseParser-->>ParallelExec: list of data frames
    ParallelExec-->>queryUnichemCompound: named list {CHEMBL123=..., CHEMBL456=...}
    queryUnichemCompound-->>User: return named list
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

The refactor introduces a parallel execution framework with per-compound validation logic, modularizes request/response handling, and significantly expands API surface (batch queries, progress feedback). While changes are focused within UniChem-related files, the density of new logic (compound batching, sourceID recycling, parallel dispatch) and interconnected updates across function signatures, documentation, and test expansion warrant careful review across logic pathways and edge cases.

Poem

A rabbit hops through compounds in parallel stride, 🐰
Where queries run together, side by side,
No more one-at-a-time in endless loops,
Batch processing now in speedy troops!
With progress whispers and named-list treasure,
UniChem searches bring the rabbit pleasure. 🔍✨

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)
Check name Status Explanation Resolution
Out of Scope Changes Check ❓ Inconclusive While the majority of changes in queryUnichemCompound directly implement the issue #50 requirements, the PR also includes modifications to getUnichemSources that are not explicitly mentioned in the linked issue objectives. The getUnichemSources changes involve reworking internal flow, error handling, column renaming, and adding debug logging. Although these could potentially be supporting infrastructure for the parallel execution framework or part of a broader modernization effort, they are not directly required by the stated objectives and warrant clarification on whether they should be included in this PR or separated into a distinct change. Verify that the getUnichemSources modifications are necessary supporting infrastructure for the parallel query framework, or consider moving them to a separate PR focused on internal refactoring. If these changes are essential for the parallel execution functionality, document this rationale in the PR description or commit messages. Otherwise, prioritize the core queryUnichemCompound changes that directly fulfill the issue #50 requirements.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The pull request title "feat: enable parallel queries for unichem" clearly and concisely summarizes the main change in the changeset. It is specific about the feature being added (parallel queries) and the target (unichem), making it immediately understandable to teammates reviewing the repository history. The title directly corresponds to the primary objective implemented in this PR.
Linked Issues Check ✅ Passed The code changes successfully implement the core objectives from issue #50. The queryUnichemCompound function now accepts vectorized input (vector/list of compounds) as required, and parallel execution is enabled through the new progress parameter and .perform_request_parallel() framework. The function returns a named list of results when multiple compounds are supplied, and the documentation and tests have been updated accordingly. These changes directly address both stated requirements of accepting a list input and running multiple queries in parallel.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch mtran/parallelize-queryUnichem

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link

codecov bot commented Oct 21, 2025

Codecov Report

❌ Patch coverage is 88.05031% with 19 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.20%. Comparing base (e1086fe) to head (aaa1c43).
⚠️ Report is 12 commits behind head on main.

Files with missing lines Patch % Lines
R/unichem.R 88.05% 19 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #68      +/-   ##
==========================================
+ Coverage   77.76%   78.20%   +0.43%     
==========================================
  Files          20       20              
  Lines        1309     1413     +104     
==========================================
+ Hits         1018     1105      +87     
- Misses        291      308      +17     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (6)
man/queryUnichemCompound.Rd (2)

18-20: Clarify vector/list handling and “raw” return semantics.

  • Specify whether nested lists are flattened or rejected; current code flattens, which can affect naming. Consider documenting “list of scalars (no nesting)” or that names come from the flattened vector.
  • “raw = TRUE” returns the parsed JSON list, not the HTTP response; say “parsed API JSON” to avoid confusion.

Also applies to: 40-43


32-35: Document progress usage scope.

Note that progress is only used when length(compound) > 1 and is passed to .perform_request_parallel(). Mention how to disable (progress = FALSE) and that single-query path ignores it.

tests/testthat/test_unichem.R (1)

55-71: Stabilize network tests and boolean style.

  • Guard external API tests to avoid CI flakes: add skip_if_offline("www.ebi.ac.uk") and skip_on_cran() at the start of each networked test.
  • Prefer TRUE/FALSE over T/F.
 test_that("queryUnichemCompound handles vector inputs", {
+  skip_if_offline(host = "www.ebi.ac.uk")
+  skip_on_cran()
   fallback <- c(
     "444795",
     "444796"
   )
@@
   results <- queryUnichemCompound(
     type = "sourceID",
     compound = fallback,
     sourceID = 22,
-    progress = FALSE
+    progress = FALSE
   )
@@
 test_that("queryUnichemCompound handles non source id lists", {
+  skip_if_offline(host = "www.ebi.ac.uk")
+  skip_on_cran()
   compounds <- c("538323", "538324")
@@
 })
 test_that("queryUnichemCompound returns the expected results 2", {
+  skip_if_offline(host = "www.ebi.ac.uk")
+  skip_on_cran()
   # Test case 1
   result1 <- queryUnichemCompound(
     type = "inchikey",
     compound = "BSYNRYMUTXBXSQ-UHFFFAOYSA-N",
-    raw = T
+    raw = TRUE
   )
@@
   result2 <- queryUnichemCompound(
     type = "inchikey",
     compound = "BSYNRYMUTXBXSQ-UHFFFAOYSA-N",
-    raw = F
+    raw = FALSE
   )

Also applies to: 73-82, 85-107

R/unichem.R (3)

142-148: Validate allowed identifier types early.

Add an explicit choice assertion to fail fast on invalid types.

-    checkmate::assert_string(type)
+    checkmate::assert_string(type)
+    checkmate::assert_choice(type, c("uci", "inchi", "inchikey", "sourceID"))
     checkmate::assert_flag(request_only)
     checkmate::assert_flag(raw)

190-233: Harden parsing against missing fields; keep shape stable.

The API may omit compounds$inchi or sources for some identifiers. Guard for NULL and return empty data.tables/lists with expected names to avoid subsetting or $ access errors. Also, consider returning NA for missing inchi-derived fields.


165-166: Avoid repeated source list fetches within hot paths.

Memoize getUnichemSources() (already importing memoise) to reduce repeated API calls when validating source IDs across runs/sessions.

Also applies to: 179-180

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2a77bef and 13090aa.

📒 Files selected for processing (4)
  • DESCRIPTION (1 hunks)
  • R/unichem.R (2 hunks)
  • man/queryUnichemCompound.Rd (1 hunks)
  • tests/testthat/test_unichem.R (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: ubuntu-latest (release)
  • GitHub Check: ubuntu-latest (oldrel-1)
  • GitHub Check: macos-latest (release)
  • GitHub Check: Test-Docker-Build
  • GitHub Check: ubuntu-latest (devel)
  • GitHub Check: Code-Coverage
  • GitHub Check: pkgdown
🔇 Additional comments (2)
DESCRIPTION (1)

44-44: RoxygenNote bump looks fine; align CI/tooling.

Ensure roxygen2 7.3.3 is used in dev/CI to avoid churn in generated Rd files.

R/unichem.R (1)

31-35: Switch to .parse_resp_json looks good.

No concerns here.

R/unichem.R Outdated
Comment on lines 247 to 255
responses <- .perform_request_parallel(requests, progress = progress)
names(responses) <- names(requests)

parsed_responses <- Map(.parse_resp_json, responses)

}
if (raw) {
names(parsed_responses) <- names(responses)
return(parsed_responses)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Bound parallelism, set timeouts/backoff, and isolate per-request errors.

Ensure .perform_request_parallel:

  • caps concurrency (e.g., via a worker pool),
  • sets per-request timeouts and retry with backoff for 5xx/429,
  • collects per-request errors without failing the whole batch unless requested (e.g., stop_on_error flag).

If already implemented, ignore; otherwise consider adding.


🏁 Script executed:

#!/bin/bash
# Inspect implementation details of .perform_request_parallel and related request defaults.
rg -nC3 --type=r '\.perform_request_parallel|\.(build|perform)_request|httr2::req_timeout|req_retry'

Length of output: 11237


Based on my verification of the implementation against httr2 documentation:

Add per-request timeouts and consider retrying all transient 5xx errors.

The current implementation already addresses some concerns:

  • Concurrency is capped with a default maximum of 10 concurrent requests
  • The on_error = "continue" setting collects per-request errors without failing the entire batch

However, two gaps remain:

  1. Timeouts are missing: req_timeout() can be applied to set a maximum number of seconds to wait for a request, but is not used in .build_request().

  2. Limited retry coverage: By default, retry only covers 429 and 503 responses. Consider extending .build_request() to also retry other transient 5xx errors (e.g., 500, 502, 504) via the is_transient parameter if your target APIs return them.

🤖 Prompt for AI Agents
In R/unichem.R around lines 247-255, the code returns parsed responses but the
underlying .build_request() lacks per-request timeouts and broader 5xx retry
handling; update .build_request() to attach a req_timeout(timeout_seconds) to
each httr2 request (expose timeout as a parameter or use a sensible default) and
expand the retry policy to include other transient 5xx errors (e.g., 500, 502,
504) by using the is_transient parameter or equivalent so retries cover all
transient server errors in addition to 429/503; ensure these settings are
applied per-request so the parallel runner benefits from timeouts and extended
retries.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
R/unichem.R (1)

164-175: Validate sourceID by membership, not range.

The range check upper = total_sources accepts non-existent sourceID values that fall within the range. For example, if valid IDs are {1, 2, 5, 10}, the current code would incorrectly accept 3, 4, 6, 7, 8, or 9.

Replace the range assertion with membership validation:

-    checkmate::assert_integerish(src_ids, any.missing = FALSE)
-    total_sources <- max(getUnichemSources()$SourceID)
+    checkmate::assert_integerish(src_ids, any.missing = FALSE)
+    valid_sources <- getUnichemSources()$SourceID
     
     if (length(src_ids) == 1L) {
       src_ids <- rep(src_ids, length(compounds))
     } else if (length(src_ids) != length(compounds)) {
       stop(
         "`sourceID` must be length 1 or match the number of compounds ",
         "when type = 'sourceID'"
       )
     }
-    checkmate::assert_integerish(src_ids, lower = 1, upper = total_sources)
-    src_ids
+    
+    if (!all(src_ids %in% valid_sources)) {
+      invalid_ids <- unique(src_ids[!src_ids %in% valid_sources])
+      stop(
+        sprintf(
+          "Invalid sourceID(s): %s. Use getUnichemSources() to see valid IDs.",
+          paste(invalid_ids, collapse = ", ")
+        )
+      )
+    }
+    src_ids
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 13090aa and 191ea37.

📒 Files selected for processing (1)
  • R/unichem.R (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: ubuntu-latest (release)
  • GitHub Check: Code-Coverage
  • GitHub Check: ubuntu-latest (oldrel-1)
  • GitHub Check: ubuntu-latest (devel)
  • GitHub Check: macos-latest (release)
  • GitHub Check: pkgdown
🔇 Additional comments (3)
R/unichem.R (3)

28-104: LGTM! Formatting improvements with no functional changes.

The refactoring improves readability through consistent formatting and alignment while preserving the original logic.


142-156: LGTM! Parameter validation and input flattening are solid.

The validation correctly handles the new progress parameter (flag or string), and use.names = TRUE preserves names when flattening lists.


237-244: LGTM! Naming logic and single-query path are well-implemented.

The naming strategy correctly validates name availability and length before applying names to requests, with a sensible fallback. The single-query path maintains consistency by reusing the same helper functions.

Also applies to: 270-283

@mtran-code mtran-code force-pushed the mtran/parallelize-queryUnichem branch from c0f8a3f to 9af9a80 Compare October 21, 2025 19:47
@mtran-code mtran-code force-pushed the mtran/parallelize-queryUnichem branch from 9af9a80 to aaa1c43 Compare October 21, 2025 20:42
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
R/unichem.R (1)

9-23: Clarify return docs for all_columns default.

Docs imply all columns are always returned. Note the default 2‑column path.

-#' Returns a `data.table` with the following columns:
+#' When `all_columns = FALSE` (default), returns a two‑column `data.table`
+#' with `Name` and `SourceID`. When `all_columns = TRUE`, returns a
+#' `data.table` with the following columns:

Also applies to: 25-26

♻️ Duplicate comments (1)
R/unichem.R (1)

260-264: Verify per-request timeouts and broader transient 5xx retries are configured.

Ensure .build_request() (or helpers) set an explicit req_timeout() and retry on common transient 5xx (500/502/504) in addition to 429/503. If already present elsewhere, ignore.

#!/bin/bash
# Search for timeout and retry usage across the package.
rg -nP --type=r 'req_timeout|req_retry|is_transient|\.(build|perform)_request|\.perform_request_parallel|\.build_unichem_'
🧹 Nitpick comments (8)
tests/testthat/test_unichem.R (4)

5-36: Stabilize networked tests: skip on CRAN/offline to avoid flakiness.

These tests hit UniChem. Add standard skips so CI/CRAN/offline runs don’t fail due to network/service variance.

 test_that("getUnichemSources returns a data.table with the correct columns", {
+  skip_on_cran()
+  skip_if_offline()
   sources <- getUnichemSources(all_columns = TRUE)
 test_that("queryUnichemCompound returns the expected results", {
+  skip_on_cran()
+  skip_if_offline()
 test_that("queryUnichemCompound handles vector inputs", {
+  skip_on_cran()
+  skip_if_offline()
 test_that("queryUnichemCompound handles non source id lists", {
+  skip_on_cran()
+  skip_if_offline()
 test_that("queryUnichemCompound returns the expected results 2", {
+  skip_on_cran()
+  skip_if_offline()

Also applies to: 39-55, 57-73, 75-84, 85-129


57-73: Assert vector-path returns have no per-compound errors.

Strengthen to fail if any element is a captured error from the parallel path.

   expect_length(results, length(fallback))
   expect_named(results, fallback)
   expect_true(all(vapply(results, is.list, logical(1))))
+  # Ensure no per-compound errors were returned
+  expect_true(
+    all(!vapply(results, inherits, logical(1), "unichem_error")),
+    info = "No per-compound errors expected"
+  )

75-84: Broaden checks for non-sourceID list case.

Also assert length and absence of error wrappers for each element.

 test_that("queryUnichemCompound handles non source id lists", {
   compounds <- c("538323", "538324")
   results <- queryUnichemCompound(
     compound = compounds,
     type = "uci",
     progress = FALSE
   )
 
   expect_named(results, compounds)
+  expect_length(results, length(compounds))
+  expect_true(all(vapply(results, is.list, logical(1))))
+  expect_true(all(!vapply(results, inherits, logical(1), "unichem_error")))
 })

87-91: Use TRUE/FALSE instead of T/F in tests.

R CMD check flags T/F usage; prefer explicit logicals.

-  result1 <- queryUnichemCompound(
+  result1 <- queryUnichemCompound(
     type = "inchikey",
     compound = "BSYNRYMUTXBXSQ-UHFFFAOYSA-N",
-    raw = T
+    raw = TRUE
   )
-  result2 <- queryUnichemCompound(
+  result2 <- queryUnichemCompound(
     type = "inchikey",
     compound = "BSYNRYMUTXBXSQ-UHFFFAOYSA-N",
-    raw = F
+    raw = FALSE
   )

Also applies to: 105-109

R/unichem.R (4)

79-79: Defensive rename against UniChem schema drift; fail fast with a clear error.

Current setnames will error or silently misalign if fields change. Guard for missing fields and then rename.

-  data.table::setnames(sources_dt, old_names, new_names)
+  missing <- setdiff(old_names, names(sources_dt))
+  if (length(missing)) {
+    .err(
+      funContext,
+      sprintf(
+        "UniChem 'sources' schema changed; missing fields: %s",
+        paste(missing, collapse = ", ")
+      )
+    )
+  }
+  data.table::setnames(sources_dt, old = old_names, new = new_names)

191-198: Coerce compound IDs to character when building requests.

Avoids surprises with factors/integers in API calls.

   build_request <- function(cmp, src) {
     .build_unichem_compound_req(
       type = type,
-      compound = cmp,
+      compound = as.character(cmp),
       sourceID = if (is.na(src)) NULL else src,
       ...
     )
   }

213-229: Handle compounds with no external mappings (empty sources) without error.

If parsed$compounds$sources is empty, setnames/subset will fail. Initialize a typed empty table.

-    mapped_sources_dt <- .asDT(parsed$compounds$sources)
-    old_names <- c("compoundId", "shortName", "longName", "id", "url")
-    new_names <- c(
-      "compoundID",
-      "Name",
-      "NameLong",
-      "sourceID",
-      "sourceURL"
-    )
-    data.table::setnames(
-      mapped_sources_dt,
-      old = old_names,
-      new = new_names
-    )
+    old_names <- c("compoundId", "shortName", "longName", "id", "url")
+    new_names <- c("compoundID", "Name", "NameLong", "sourceID", "sourceURL")
+    if (is.null(parsed$compounds$sources) || length(parsed$compounds$sources) == 0L) {
+      mapped_sources_dt <- data.table::as.data.table(list(
+        compoundID = integer(),
+        Name = character(),
+        NameLong = character(),
+        sourceID = integer(),
+        sourceURL = character()
+      ))
+    } else {
+      mapped_sources_dt <- .asDT(parsed$compounds$sources)
+      data.table::setnames(mapped_sources_dt, old = old_names, new = new_names)
+    }

125-128: Document multi-query error semantics in @return.

Make it clear that per‑compound errors are wrapped (class unichem_error) instead of stopping.

-#'   compound). If `raw = TRUE`, raw responses are returned instead.
+#'   compound). If `raw = TRUE`, raw responses are returned instead.
+#'   For multiple queries, per‑compound failures are returned as elements
+#'   of class `"unichem_error"` with an `error` message, without stopping.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 191ea37 and aaa1c43.

📒 Files selected for processing (3)
  • R/unichem.R (2 hunks)
  • man/queryUnichemCompound.Rd (1 hunks)
  • tests/testthat/test_unichem.R (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • man/queryUnichemCompound.Rd
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: pkgdown
  • GitHub Check: macos-latest (release)
  • GitHub Check: Code-Coverage
  • GitHub Check: ubuntu-latest (release)
  • GitHub Check: ubuntu-latest (devel)
  • GitHub Check: ubuntu-latest (oldrel-1)

Comment on lines +142 to +149
checkmate::assert_string(type)
checkmate::assert_flag(request_only)
checkmate::assert_flag(raw)
checkmate::assert(
checkmate::check_flag(progress),
checkmate::check_string(progress, min.chars = 1)
)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Validate type against allowed choices early.

Prevents ambiguous errors downstream.

   checkmate::assert_string(type)
+  checkmate::assert_choice(type, c("uci", "inchi", "inchikey", "sourceID"))
🤖 Prompt for AI Agents
In R/unichem.R around lines 142 to 149, the parameter `type` is asserted as a
string but not checked against the allowed set; add an explicit choice
validation early (e.g. define an `allowed_types` vector and call
checkmate::assert_choice(type, choices = allowed_types)) so invalid values fail
fast with a clear message; place this check immediately after the existing
checkmate assertions for `type` and before any downstream logic that depends on
specific `type` values.

@mtran-code mtran-code merged commit c3fbb34 into main Oct 21, 2025
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Feature: queryUnichem pass in list to run in parallel

2 participants