refactor: Update Base64 to match coding standard by Joe-Abraham · Pull Request #32 · Joe-Abraham/velox

Joe-Abraham · 2026-02-27T09:50:03Z

Summary by Sourcery

Refactor Base64 encoding/decoding APIs to use std::string_view and std::string-based interfaces and update all callers accordingly.

Enhancements:

Simplify Base64 and Base64 URL encode/decode APIs to operate on std::string_view inputs and std::string outputs instead of raw pointer/size buffers and pairs.
Unify Base64 decoded-size and MIME decoded-size helpers to string_view-based signatures and internal size management, improving safety and clarity.
Adjust PrestoSQL and SparkSQL Base64/UnBase64 functions, text reader/writer, and plan/expr serialization code to use the updated Base64 interfaces.
Update Base64 unit tests to match the new string_view-based APIs and validation behavior for padding and MIME decoding.

# Conflicts: # velox/functions/prestosql/BinaryFunctions.h

sourcery-ai · 2026-02-27T09:50:09Z

Reviewer's Guide

Refactors the Base64 encoding/decoding API across Velox to use std::string_view + std::string-based interfaces, removes raw buffer/size variants, and updates all call sites and tests to match the new, safer API and padding semantics.

Class diagram for refactored Base64 encoding/decoding API

classDiagram
  namespace facebook_velox_encoding {
    class Base64 {
      <<utility>>
      +static string encode(string_view input, bool includePadding)
      +static void encode(string_view input, string& output)
      +static string encodeUrl(string_view input, bool includePadding)
      +static void encodeUrl(string_view input, string& output)
      +static string encode(const folly_IOBuf* inputBuffer)
      +static string encodeUrl(const folly_IOBuf* inputBuffer)

      +static string decode(string_view input)
      +static Status decode(string_view input, string& output)

      +static string decodeUrl(string_view input)
      +static Status decodeUrl(string_view input, string& output)

      +static Status decodeMime(string_view input, string& output)
      +static void encodeMime(string_view input, string& output)

      +static size_t calculateEncodedSize(size_t inputSize, bool withPadding)
      +static Expected_size_t calculateDecodedSize(string_view input)
      +static Expected_size_t calculateMimeDecodedSize(string_view input)
      +static size_t calculateMimeEncodedSize(size_t binarySize)

      -static bool isPadded(string_view input)
      -static size_t numPadding(string_view input)

      -static Expected_size_t decodeImpl(string_view input, string& output, ReverseIndex& reverseIndex)
      -static Expected_uint8_t base64ReverseLookup(uint8_t character, ReverseIndex& reverseIndex)

      -static const size_t kEncodedBlockByteSize
      -static const size_t kBinaryBlockByteSize
      -static const size_t kReverseIndexSize
      -static const char kPadding
      -static const size_t kMaxLineLength
      -static const char[] kBase64Charset
      -static const char[] kBase64UrlCharset
      -static const uint8_t[] kBase64ReverseIndexTable
      -static const uint8_t[] kBase64UrlReverseIndexTable
    }
  }
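
The diagram lists `calculateEncodedSize(size_t inputSize, bool withPadding)` without showing the arithmetic. A minimal sketch of the usual Base64 size rule follows, assuming the standard mapping of 3 input bytes to 4 output characters. `encodedSizeSketch` is a hypothetical name and this is not the Velox implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical helper illustrating the standard Base64 size rule.
// With padding, the last partial group is rounded up to a full 4-character
// block. Without padding, only the characters that carry data are counted.
std::size_t encodedSizeSketch(std::size_t inputSize, bool withPadding) {
  // Guard the multiplication below against overflow.
  assert(inputSize <= (std::numeric_limits<std::size_t>::max() - 2) / 4);
  if (withPadding) {
    return ((inputSize + 2) / 3) * 4; // 4 * ceil(inputSize / 3)
  }
  return (inputSize * 4 + 2) / 3; // ceil(4 * inputSize / 3)
}
```

Under this rule a 4-byte input needs 8 characters with padding and 6 without, which is why switching the padding default changes the length of every output whose input size is not a multiple of 3.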

File-Level Changes

Change: Refactor Base64 core API to use std::string_view and std::string outputs instead of raw char buffers and sizes, adjusting encode/decode implementations and padding helpers accordingly.
Details:
Add include for memcpy usage where needed.
Change Base64::encode/encodeUrl/encodeMime and corresponding decode methods to accept std::string_view input and std::string& output, removing pointer+size overloads and calculateDecodedSize signatures that mutated inputSize.
Update encodeImpl to clear/reserve the output std::string, compute encoded size internally, and use push_back instead of direct pointer arithmetic for both standard and URL encodings, with optional padding support.
Update decodeImpl and calculateDecodedSize to operate on std::string_view, remove explicit output buffer size checks, manage padding via numPadding(string_view), and build output using push_back.
Adjust isPadded and numPadding helpers to work on std::string_view instead of (char*, size_t), and expose calculateEncodedSize/calculateDecodedSize as private helpers consistent with new signatures.
Update Base64 MIME encode/decode helpers (encodeMime/decodeMime/calculateMimeDecodedSize) to use std::string_view and std::string& with push_back, simplifying pointer handling.
Files: `velox/common/encode/Base64.cpp` `velox/common/encode/Base64.h`

Change: Update unit tests to align with new Base64 API signatures and behavior, especially around decoded-size calculation and MIME helpers.
Details:
Simplify calculateDecodedSizeProperSize test to use the new calculateDecodedSize(std::string_view) without mutating input size and to assert only on returned decoded lengths and error cases.
Update checksPadding and countsPaddingCorrectly tests to call isPadded/numPadding with string literals instead of pointer+length pairs.
Update calculateMimeDecodedSize test to use string_view-based API and error expectations.
Adjust decodeMime and encodeMime test helpers to construct std::string outputs, call the new MIME APIs, and use calculateMimeDecodedSize/calculateMimeEncodedSize solely for capacity reservations.
Files: `velox/common/encode/tests/Base64Test.cpp`

Change: Adapt Presto and Spark SQL function implementations to the new Base64 API and improve safety by using intermediate std::string buffers and memcpy into result vectors.
Details:
Change FromBase64Function and FromBase64UrlFunction to call encoding::Base64::decode/decodeUrl with std::string_view and std::string decodedStr, then resize and memcpy into out_type on success instead of pre-sizing based on calculateDecodedSize and passing raw buffers.
Update UnBase64Function to use calculateMimeDecodedSize(std::string_view) for validation only, then call decodeMime with std::string_view and std::string decodedStr and memcpy into the result.
Change ToBase64UrlFunction to build its result via Base64::encodeUrl(std::string_view,bool) instead of manual buffer sizing and encodeUrl(char,size_t,char).
Change Base64Function (Spark SQL) to use encodeMime(std::string_view,std::string&) and memcpy into the pre-sized Varchar result buffer.
Adjust FromHexFunction and FromBase64UrlFunction templates to use concrete arg_type/arg_type where needed, simplifying the type parameters.
Files: `velox/functions/prestosql/BinaryFunctions.h` `velox/functions/sparksql/UnBase64Function.h` `velox/functions/sparksql/Base64Function.h`

Change: Update DWIO text reader/writer and core serialization to use the new Base64 API directly with std::string_view rather than pointer+size overloads and manual buffer management.
Details:
In TextRowReader::readElement, replace manual decoded-size calculation and Base64::decode into preallocated varBinBuf_ with a single Base64::decode(std::string_view,std::string&) call, then copy the decoded string into varBinBuf_ and use StringView on the decoded content; fall back path for invalid Base64 now uses StringView(str) instead of manually constructing from the buffer.
In TextWriter::writeCellValue for VARBINARY, replace encode(Base64) pointer+size call with Base64::encode(std::string_view,bool) with padding set to true.
In ConstantTypedExpr::serialize and ValuesNode::serialize, replace encode(char*,size_t) with encode(std::string_view) to serialize vectors and plan data via Base64 in a more idiomatic way.
Files: `velox/dwio/text/reader/TextReader.cpp` `velox/dwio/text/writer/TextWriter.cpp` `velox/core/Expressions.cpp` `velox/core/PlanNode.cpp`


Joe-Abraham · 2026-02-27T09:51:28Z

/claude-review

sourcery-ai

Hey - I've found 3 issues, and left some high level feedback:

In TextRowReader::readElement, both the success and error paths construct StringView from temporaries (decodedStr and str) rather than from varBinBuf_, which means the StringView can outlive its backing storage; you should instead build the StringView over varBinBuf_->data() with the appropriate size to preserve the existing ownership semantics.
The new Base64::encode(std::string_view input, bool includePadding = false) and encodeUrl(..., bool includePadding = false) change the default from padded to unpadded output compared to the previous API; please double‑check all call sites (e.g., ConstantTypedExpr::serialize, PlanNode::serialize) to ensure this behavioral change is intentional and, if not, pass includePadding = true explicitly.
For the MIME helpers, callers like UnBase64Function now compute calculateMimeDecodedSize and then decodeMime still grows output incrementally; consider reserving output to the precomputed size inside decodeMime (or passing the expected size in) to avoid repeated reallocations and keep the size logic centralized.


sourcery-ai · 2026-02-27T09:52:42Z

velox/dwio/text/reader/TextReader.cpp

+      std::string decodedStr;
+      Status status = encoding::Base64::decode(str, decodedStr);

      if (status.code() == StatusCode::kOK) {
-        flatVector->set(
-            insertionRow,
-            StringView(varBinBuf_->data(), static_cast<int32_t>(blen.value())));
+        varBinBuf_->resize(decodedStr.size());
+        std::memcpy(varBinBuf_->data(), decodedStr.data(), decodedStr.size());
+        flatVector->set(insertionRow, StringView(decodedStr));


issue (bug_risk): StringView now refers to a temporary std::string which will be destroyed, leading to a dangling pointer.

decodedStr is a local that is destroyed at the end of readElement, so flatVector->set(insertionRow, StringView(decodedStr)) (and similarly StringView(str) in the error path) leaves the vector holding a dangling pointer. Please keep StringView backed by varBinBuf_ (e.g., construct it from varBinBuf_->data() and its size) or otherwise ensure the underlying storage outlives the vector.
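
The ownership pattern the comment asks for can be sketched without Velox. In this sketch `std::string_view` stands in for `StringView`, a `std::vector<char>` member stands in for `varBinBuf_`, and the consumer is assumed to keep the view without copying, as the review comment assumes. `ReaderSketch`, `store`, and `readElementSketch` are hypothetical names.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for a reader that owns a reusable byte buffer.
class ReaderSketch {
 public:
  // Copies `decoded` into the owned buffer and returns a view over that
  // buffer. The view stays valid until the next call to store() or until the
  // reader is destroyed, independent of the lifetime of `decoded`.
  std::string_view store(const std::string& decoded) {
    buffer_.resize(decoded.size());
    if (!decoded.empty()) {
      std::memcpy(buffer_.data(), decoded.data(), decoded.size());
    }
    return std::string_view(buffer_.data(), buffer_.size());
  }

 private:
  std::vector<char> buffer_;
};

std::string_view readElementSketch(ReaderSketch& reader) {
  std::string decodedStr = "payload"; // local, destroyed on return
  // Returning std::string_view(decodedStr) here would dangle.
  return reader.store(decodedStr);
}

void lifetimeExample() {
  ReaderSketch reader;
  std::string_view view = readElementSketch(reader);
  assert(view == "payload"); // backed by the reader, not by the local string
}
```

The view is only as durable as the buffer it points into, so the next `store` call invalidates it. Whether that is acceptable depends on how the consumer holds the value, which the thread does not establish.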

sourcery-ai · 2026-02-27T09:52:42Z

velox/dwio/text/reader/TextReader.cpp

-        flatVector->set(
-            insertionRow,
-            StringView(varBinBuf_->data(), static_cast<int32_t>(blen.value())));
+        varBinBuf_->resize(decodedStr.size());
+        std::memcpy(varBinBuf_->data(), decodedStr.data(), decodedStr.size());
+        flatVector->set(insertionRow, StringView(decodedStr));
      } else {
        // Not valid base64:  just copy as-is for compatibility.
        //


issue (bug_risk): Error-path StringView also points to stack-local storage and will dangle.

In the non-base64 case, you copy into varBinBuf_ but still build StringView(str), where str is local to readElement. That StringView will dangle just like on the success path. Instead, construct the StringView over varBinBuf_->data() (or another buffer with a longer lifetime) to avoid use-after-free.

sourcery-ai · 2026-02-27T09:52:42Z

velox/common/encode/Base64.cpp

+Status Base64::decodeMime(std::string_view input, std::string& output) {
+  size_t inputSize = input.size();
  if (inputSize == 0) {
    return Status::OK();
  }


issue (bug_risk): decodeMime appends to output without clearing it, which can cause data from previous calls to leak into the result.

Previously, callers passed a fresh buffer, so there was no risk of stale data. With std::string& output, the function neither clears nor modifies output when inputSize == 0, and otherwise always appends. Callers that reuse the same std::string will see leftover or accumulated data. To preserve prior behavior, clear output at the start and optionally reserve the size reported by calculateMimeDecodedSize(input) for efficiency.
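
The reuse hazard described here is easy to reproduce in isolation. The sketch below uses a trivial copy in place of real decoding, and `appendOnly` and `clearThenAppend` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Appends without clearing, so a reused output accumulates earlier results.
void appendOnly(std::string_view input, std::string& output) {
  for (char c : input) {
    output.push_back(c);
  }
}

// Clears first, so the empty-input early return also leaves output empty,
// then reserves once before appending.
void clearThenAppend(std::string_view input, std::string& output) {
  output.clear();
  if (input.empty()) {
    return;
  }
  output.reserve(input.size());
  for (char c : input) {
    output.push_back(c);
  }
}

void outputReuseExample() {
  std::string out;
  appendOnly("abc", out);
  appendOnly("de", out);
  assert(out == "abcde"); // stale "abc" leaks into the second result

  clearThenAppend("abc", out);
  clearThenAppend("de", out);
  assert(out == "de");

  clearThenAppend("", out);
  assert(out.empty()); // empty input no longer returns the previous result
}
```

Placing `clear()` before the early return matters: clearing only on the non-empty path would still hand back the previous call's data for empty input.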

Copilot

Pull request overview

Refactors Base64 encode/decode APIs and call sites to use std::string_view inputs and std::string outputs, aligning implementations with updated coding conventions.

Changes:

Updated encoding::Base64 public API signatures (favoring std::string_view + std::string out params) and rewrote core encode/decode internals accordingly.
Migrated multiple Velox call sites (SparkSQL/PrestoSQL functions, text reader/writer, plan/expression serialization) to the new Base64 API.
Simplified and updated Base64 unit tests to match the new API.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 10 comments.

File	Description
velox/functions/sparksql/UnBase64Function.h	Switches to `string_view` + `std::string` decode path before copying into Velox varbinary.
velox/functions/sparksql/Base64Function.h	Switches to `string_view` + `std::string` encode path before copying into Velox varchar.
velox/functions/prestosql/BinaryFunctions.h	Updates Base64/Base64Url encode/decode paths to new `string_view` + `std::string` APIs.
velox/dwio/text/writer/TextWriter.cpp	Updates VARBINARY serialization to Base64 encode from `string_view`.
velox/dwio/text/reader/TextReader.cpp	Updates VARBINARY parsing to Base64 decode into `std::string` then copy.
velox/core/PlanNode.cpp	Updates ValuesNode serialization to new `Base64::encode` overload.
velox/core/Expressions.cpp	Updates ConstantTypedExpr serialization to new `Base64::encode` overload.
velox/common/encode/tests/Base64Test.cpp	Updates tests to new API and output-buffer handling style.
velox/common/encode/Base64.h	Replaces pointer/size overloads with `string_view` and `std::string&` output APIs; adjusts helpers.
velox/common/encode/Base64.cpp	Reimplements encode/decode to write into `std::string` via `push_back`/`reserve`; updates MIME helpers.


Copilot · 2026-02-27T09:57:16Z

velox/dwio/text/reader/TextReader.cpp

+      std::string decodedStr;
+      Status status = encoding::Base64::decode(str, decodedStr);

      if (status.code() == StatusCode::kOK) {
-        flatVector->set(
-            insertionRow,
-            StringView(varBinBuf_->data(), static_cast<int32_t>(blen.value())));
+        varBinBuf_->resize(decodedStr.size());
+        std::memcpy(varBinBuf_->data(), decodedStr.data(), decodedStr.size());
+        flatVector->set(insertionRow, StringView(decodedStr));


flatVector->set(insertionRow, StringView(decodedStr)) stores a view into decodedStr, which is a local variable and will be destroyed at the end of the scope. This leaves the vector holding a dangling pointer. Use the owned buffer (varBinBuf_) when constructing the StringView (e.g., view over varBinBuf_->data() with the decoded size), or ensure the backing storage outlives the vector.

Copilot · 2026-02-27T09:57:16Z

velox/dwio/text/reader/TextReader.cpp

        varBinBuf_->resize(str.size());

        VELOX_CHECK_NOT_NULL(str.data());
-
-        len = str.size();
        memcpy(varBinBuf_->data(), str.data(), str.size());

        // Use StringView, set(vector_size_t idx, T value) fails because
        // strlen(varBinBuf_->data()) is undefined due to lack of null
        // terminator
-        flatVector->set(
-            insertionRow,
-            StringView(varBinBuf_->data(), static_cast<int32_t>(str.size())));
+        flatVector->set(insertionRow, StringView(str));


In the non-base64 branch, the code copies bytes into varBinBuf_ but then sets the vector to StringView(str) instead of viewing the copied buffer. This defeats the purpose of the copy and can make the stored view depend on the lifetime of str. Set the value from varBinBuf_ (data pointer + size) to ensure stable ownership.

Copilot · 2026-02-27T09:57:17Z

velox/common/encode/Base64.cpp

+Status Base64::decodeMime(std::string_view input, std::string& output) {
+  size_t inputSize = input.size();
  if (inputSize == 0) {
    return Status::OK();
  }


decodeMime(..., std::string& output) appends via output.push_back(...) later, but never clears output at the start. If callers reuse the same std::string, results will be concatenated with previous contents. Clear (and ideally reserve) output at function entry.

Copilot · 2026-02-27T09:57:17Z

velox/common/encode/Base64.cpp

  // If there's nothing to encode, do nothing.
+  size_t inputSize = input.size();
  if (inputSize == 0) {
    return;
  }



encodeMime(..., std::string& output) writes using output.push_back(...) but doesn't clear output first. Reusing the same output string will produce concatenated encodings. Clear output at the start (and reserve expected size) to make the API safe and consistent with other encode paths.

Suggested change

-  // If there's nothing to encode, do nothing.
-  size_t inputSize = input.size();
-  if (inputSize == 0) {
-    return;
-  }
+  // Ensure the output starts empty for each call.
+  output.clear();
+  // If there's nothing to encode, leave output empty and return.
+  size_t inputSize = input.size();
+  if (inputSize == 0) {
+    return;
+  }
+  // Reserve the full encoded size, including CRLFs, to avoid reallocations.
+  output.reserve(calculateMimeEncodedSize(inputSize));

Copilot · 2026-02-27T09:57:17Z

velox/core/PlanNode.cpp


-  obj["data"] =
-      encoding::Base64::encode(serializedData.data(), serializedData.size());
+  obj["data"] = encoding::Base64::encode(serializedData);


This call now uses Base64::encode(std::string_view, bool includePadding = false) with the default includePadding=false. Previously, the encode(const char*, size_t) overload encoded with padding. Changing padding behavior in serialized plan JSON can break backward compatibility for stored/transported plans. Consider explicitly passing includePadding=true here (or otherwise guaranteeing stable serialization).

Suggested change

-  obj["data"] = encoding::Base64::encode(serializedData);
+  obj["data"] = encoding::Base64::encode(serializedData, true);
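
To make the compatibility concern concrete, here is a small standalone encoder with an `includePadding` flag. It is an illustrative sketch using the standard Base64 alphabet, not the Velox `encodeImpl`, and `encodeSketch` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Illustrative standard-alphabet Base64 encoder with optional padding.
std::string encodeSketch(std::string_view input, bool includePadding) {
  static constexpr char kCharset[] =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  std::string output;
  output.reserve(((input.size() + 2) / 3) * 4);

  std::size_t i = 0;
  // Full 3-byte groups map to exactly 4 characters.
  for (; i + 3 <= input.size(); i += 3) {
    const unsigned v = (static_cast<unsigned char>(input[i]) << 16) |
        (static_cast<unsigned char>(input[i + 1]) << 8) |
        static_cast<unsigned char>(input[i + 2]);
    output.push_back(kCharset[(v >> 18) & 0x3F]);
    output.push_back(kCharset[(v >> 12) & 0x3F]);
    output.push_back(kCharset[(v >> 6) & 0x3F]);
    output.push_back(kCharset[v & 0x3F]);
  }

  // A trailing group of 1 or 2 bytes yields 2 or 3 data characters.
  const std::size_t rest = input.size() - i;
  if (rest > 0) {
    unsigned v = static_cast<unsigned char>(input[i]) << 16;
    if (rest == 2) {
      v |= static_cast<unsigned char>(input[i + 1]) << 8;
    }
    output.push_back(kCharset[(v >> 18) & 0x3F]);
    output.push_back(kCharset[(v >> 12) & 0x3F]);
    if (rest == 2) {
      output.push_back(kCharset[(v >> 6) & 0x3F]);
    }
    if (includePadding) {
      output.append(rest == 1 ? 2 : 1, '=');
    }
  }
  return output;
}

void paddingExample() {
  assert(encodeSketch("a", true) == "YQ==");
  assert(encodeSketch("a", false) == "YQ");
  assert(encodeSketch("abc", true) == encodeSketch("abc", false));
}
```

Only inputs whose length is not a multiple of 3 produce different text, so a changed default can go unnoticed in tests that happen to use 3-byte-aligned payloads.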

Copilot · 2026-02-27T09:57:17Z

velox/core/Expressions.cpp

    auto serializedValue = out.str();
-    obj["valueVector"] = encoding::Base64::encode(
-        serializedValue.data(), serializedValue.size());
+    obj["valueVector"] = encoding::Base64::encode(serializedValue);


Same concern as in PlanNode.cpp: this now defaults to includePadding=false, potentially changing the serialized expression format compared to the previous padded encoding. If this data is persisted or exchanged, explicitly request padding to preserve compatibility.

Suggested change

-    obj["valueVector"] = encoding::Base64::encode(serializedValue);
+    obj["valueVector"] = encoding::Base64::encode(serializedValue, true);

Copilot · 2026-02-27T09:57:17Z

velox/common/encode/Base64.cpp

-void Base64::encode(const char* input, size_t inputSize, char* output) {
-  encodeImpl(std::string_view(input, inputSize), kBase64Charset, true, output);
+void Base64::encode(std::string_view input, std::string& output) {
+  encodeImpl(input, kBase64Charset, false, output);


Base64::encode(std::string_view, std::string&) hardcodes includePadding=false, while historical base64 defaults typically include padding (and the previous pointer-based encode path did). This is a behavior change for callers migrating to the new API. Consider (a) adding an includePadding parameter to this overload, or (b) defaulting to includePadding=true for standard Base64 to match prior behavior, while keeping URL-safe variants configurable.

Suggested change

-  encodeImpl(input, kBase64Charset, false, output);
+  encodeImpl(input, kBase64Charset, true, output);

Copilot · 2026-02-27T09:57:18Z

velox/functions/sparksql/UnBase64Function.h

+    auto status = encoding::Base64::decodeMime(
+        std::string_view(input.data(), input.size()), decodedStr);
+    if (status.ok()) {
+      result.resize(decodedSize.value());


result is resized based on decodedSize.value() but the copy uses decodedStr.size(). If these diverge (e.g., due to input normalization rules), this can either overflow (decodedStr.size() > result.size()) or leave uninitialized trailing bytes (decodedStr.size() < result.size()). Prefer resizing to decodedStr.size() (or validating equality before copying).

Suggested change

-      result.resize(decodedSize.value());
+      result.resize(decodedStr.size());

Copilot · 2026-02-27T09:57:18Z

velox/functions/sparksql/Base64Function.h

-    encoding::Base64::encodeMime(input.data(), input.size(), result.data());
+    std::string encodedStr;
+    encoding::Base64::encodeMime(
+        std::string_view(input.data(), input.size()), encodedStr);


result is resized based on calculateMimeEncodedSize(input.size()), but the code copies encodedStr.size() bytes without verifying the sizes match. If encodeMime ever produces an output length different from the calculated size, this can write past result or leave extra garbage at the end. Consider resizing result to encodedStr.size() after encoding (or explicitly asserting the sizes match) before copying.

Suggested change

-        std::string_view(input.data(), input.size()), encodedStr);
+        std::string_view(input.data(), input.size()), encodedStr);
+    result.resize(encodedStr.size());
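
Both size-mismatch comments reduce to the same rule: size the destination from the bytes actually produced. A minimal sketch follows, with `std::vector<char>` standing in for the Velox result type and `copyResult` as a hypothetical helper name.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Copies `produced` into `result`, resizing from the actual byte count
// instead of trusting an earlier estimate. This way the copy can neither
// overrun the destination nor leave stale bytes at its end.
void copyResult(const std::string& produced, std::vector<char>& result) {
  result.resize(produced.size());
  if (!produced.empty()) {
    std::memcpy(result.data(), produced.data(), produced.size());
  }
}

void copyResultExample() {
  std::vector<char> result(16, '?'); // pretend this came from a size estimate
  copyResult("abcd", result);
  assert(result.size() == 4);
  assert(std::string(result.data(), result.size()) == "abcd");
}
```

An alternative with the same effect is to keep the precomputed size and assert that it equals the produced size before copying, which turns a silent mismatch into a visible failure.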

Copilot · 2026-02-27T09:57:18Z

velox/common/encode/tests/Base64Test.cpp

-        Base64::calculateMimeDecodedSize(in.data(), in.size()).value();
-    std::string out(decSize, '\0');
-    auto result = Base64::decodeMime(in.data(), in.size(), out.data());
+    const auto decSizeOpt = Base64::calculateMimeDecodedSize(in);


decSizeOpt.value() is used without checking decSizeOpt.hasError(). If a caller reuses this helper with invalid input later, this would throw/assert unexpectedly. Consider asserting decSizeOpt.hasValue() (or failing the test with the error message) before using .value().

Suggested change

-    const auto decSizeOpt = Base64::calculateMimeDecodedSize(in);
+    const auto decSizeOpt = Base64::calculateMimeDecodedSize(in);
+    if (decSizeOpt.hasError()) {
+      VELOX_USER_FAIL(decSizeOpt.error().message());
+    }
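
The check-before-access pattern can be sketched with `std::optional` standing in for the expected-style return type. `decodedSizeSketch` and `decodedSizeOr` are hypothetical names, and the sketch handles padded input only, which is a simplification rather than the Velox validation rules.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>

// Returns the decoded size of padded Base64 input, or nullopt when the
// length cannot belong to a padded encoding (not a multiple of 4).
std::optional<std::size_t> decodedSizeSketch(std::string_view input) {
  if (input.size() % 4 != 0) {
    return std::nullopt;
  }
  std::size_t padding = 0;
  if (!input.empty() && input.back() == '=') {
    ++padding;
    if (input[input.size() - 2] == '=') {
      ++padding;
    }
  }
  return input.size() / 4 * 3 - padding;
}

// Inspects the result before reading it, so invalid input never reaches
// the value access.
std::size_t decodedSizeOr(std::string_view input, std::size_t fallback) {
  const auto size = decodedSizeSketch(input);
  if (!size.has_value()) {
    return fallback;
  }
  return *size;
}
```

In a test helper the fallback branch would normally fail the test with the error message, as the suggested change does.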

refactor: Update Base64 to match coding standard

f610bb6

# Conflicts: # velox/functions/prestosql/BinaryFunctions.h

Copilot AI review requested due to automatic review settings February 27, 2026 09:50
