feat(function): Implement Google Polyline encoding and decoding functions #16516

Open
deepthibose01 wants to merge 6 commits into facebookincubator:main from deepthibose01:feature/google_polyline_functions
Conversation

@deepthibose01 deepthibose01 commented Feb 25, 2026

Abstract

This PR adds scalar functions, with unit tests, that implement Google's polyline encoding and decoding. The logic is based on the existing function implementation in Presto. Part of #16041.

netlify bot commented Feb 25, 2026

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 14daa74
🔍 Latest deploy log https://app.netlify.com/projects/meta-velox/deploys/69b27fc00bb6270008734819

meta-cla bot commented Feb 25, 2026

Hi @deepthibose01!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

@deepthibose01 deepthibose01 changed the title from "[Scalar Function] [Google Polyline Functions] Implement encode and decoding functions" to "[WIP] Google Polyline Functions: Implement encode and decoding functions" Feb 25, 2026
@deepthibose01 deepthibose01 changed the title from "[WIP] Google Polyline Functions: Implement encode and decoding functions" to "[WIP] feat : Implement Google Polyline encode and decoding functions" Feb 25, 2026
@deepthibose01 deepthibose01 changed the title from "[WIP] feat : Implement Google Polyline encode and decoding functions" to "[WIP] feat : Implement Google Polyline encoding and decoding functions" Feb 25, 2026
meta-cla bot commented Feb 26, 2026

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Feb 26, 2026
@deepthibose01 deepthibose01 force-pushed the feature/google_polyline_functions branch 2 times, most recently from ba845ae to 5c3ab3d Compare February 26, 2026 07:26
@Yuhta Yuhta requested a review from jagill February 26, 2026 22:55
@deepthibose01
Author

gtest runs

./_build/debug/velox/functions/prestosql/tests/velox_functions_test --gtest_filter="GeometryFunctionsTest.testGooglePolylineFunctions"
Running main() from /private/tmp/googletest-20250910-5249-qce8ci/googletest-1.17.0/googletest/src/gtest_main.cc
Note: Google Test filter = GeometryFunctionsTest.testGooglePolylineFunctions
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from GeometryFunctionsTest
[ RUN      ] GeometryFunctionsTest.testGooglePolylineFunctions
[       OK ] GeometryFunctionsTest.testGooglePolylineFunctions (23 ms)
[----------] 1 test from GeometryFunctionsTest (23 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (184 ms total)
[  PASSED  ] 1 test.

Collaborator

@jkhaliqi jkhaliqi left a comment

Thanks, could we maybe add the test case where the input geometry is another Geo Type like POLYGON for encode as well?

@deepthibose01
Author

@jkhaliqi Thanks for the review. I have resolved the review comments and added the non-point testing scenarios. Please have a look.

 velox % ./_build/debug/velox/functions/prestosql/tests/velox_functions_test --gtest_filter="GeometryFunctionsTest.testGooglePolylineFunctions"
Running main() from /private/tmp/googletest-20250910-5249-qce8ci/googletest-1.17.0/googletest/src/gtest_main.cc
Note: Google Test filter = GeometryFunctionsTest.testGooglePolylineFunctions
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from GeometryFunctionsTest
[ RUN      ] GeometryFunctionsTest.testGooglePolylineFunctions
[       OK ] GeometryFunctionsTest.testGooglePolylineFunctions (13 ms)
[----------] 1 test from GeometryFunctionsTest (13 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (90 ms total)
[  PASSED  ] 1 test.

@deepthibose01 deepthibose01 requested a review from jkhaliqi March 3, 2026 06:40
@deepthibose01
Author

Thanks @jkhaliqi. I have ensured the changes are made. Please check. Thanks

Collaborator

@jkhaliqi jkhaliqi left a comment

Thank you!

.. function:: google_polyline_encode(points: array(Geometry)) -> encoded: varchar

Encodes an array of Point geometries into a Google Polyline encoded string.
Uses the default precision exponent of 5. The precision used for decoding is 10^precision_exponent.
Collaborator

Typo: decoding -> encoding?
Could we not say directly that it encodes with precision 10^5 as there is no way to override the (default) precision for this function?

Same for the decode function?

Author

Sure. The descriptions are updated to state only the information relevant to each function signature.

#include "velox/functions/prestosql/types/BingTileType.h"
#include "velox/functions/prestosql/types/GeometryRegistration.h"
#include "velox/functions/prestosql/types/SphericalGeographyRegistration.h"
#include "velox/functions/prestosql/GooglePolylineFunctions.h"
Collaborator

Move to the line below GeometryFunctions.h. (the includes are sorted alphabetically).

Author

Updated the file inclusions to follow alphabetical order

/// Encode a single delta value using Google Polyline encoding.
/// https://developers.google.com/maps/documentation/utilities/polylinealgorithm
/// Algorithm:
/// 1. Convert signed to unsigned.
Collaborator

Maybe mention the ZigZag encoding used here.

Author

Added detailed ZigZag encoding documentation in the algorithm description.
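
For readers following the thread, the signed-to-unsigned step can be sketched as below. This is a minimal standalone illustration, not the PR's code; the function names zigZagEncode and zigZagDecode are hypothetical. It follows the polyline algorithm's "shift left by one, invert if negative" rule, which yields the same mapping as ZigZag encoding.

```cpp
#include <cstdint>

// Maps a signed delta to an unsigned value so that small magnitudes stay
// small: 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
constexpr uint64_t zigZagEncode(int64_t value) {
  // Shift in unsigned space to avoid left-shifting a negative signed value.
  const uint64_t shifted = static_cast<uint64_t>(value) << 1;
  return value < 0 ? ~shifted : shifted;
}

// Inverse mapping: the lowest bit carries the sign, the rest the magnitude.
constexpr int64_t zigZagDecode(uint64_t encoded) {
  const int64_t half = static_cast<int64_t>(encoded >> 1);
  return (encoded & 1) ? ~half : half;
}
```

The lowest bit of the encoded value acts as the sign flag, which is why decoding only needs to inspect that bit before shifting.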

}

while (unsignedDelta >= 0x20) {
int64_t nextChunk = (0x20 | (unsignedDelta & 0x1f)) + 63;
Collaborator

Please use a const for magic numbers like 63.

Author

All magic numbers have been replaced with named constants. The constants are properly scoped within the anonymous namespace since they're only used by the internal delta encoding/decoding functions.
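
As a rough illustration of what that refactor might look like, here is a standalone sketch of the delta encoder with named constants in an anonymous namespace. The constant names and the encodeDelta signature are assumptions for illustration; the PR's actual names may differ.

```cpp
#include <cstdint>
#include <string>

namespace {
// Illustrative names only; not taken from the PR.
constexpr uint64_t kChunkBits = 5;          // payload bits per output character
constexpr uint64_t kChunkMask = 0x1f;       // selects the low five bits
constexpr uint64_t kContinuationBit = 0x20; // set when more chunks follow
constexpr uint64_t kAsciiOffset = 63;       // moves chunks into printable ASCII
} // namespace

// Encodes one signed delta as a run of printable characters.
std::string encodeDelta(int64_t delta) {
  // Signed-to-unsigned step: shift left by one, invert if negative.
  uint64_t value = static_cast<uint64_t>(delta) << 1;
  if (delta < 0) {
    value = ~value;
  }
  std::string out;
  // Emit five bits at a time, lowest chunk first.
  while (value >= kContinuationBit) {
    out.push_back(static_cast<char>(
        (kContinuationBit | (value & kChunkMask)) + kAsciiOffset));
    value >>= kChunkBits;
  }
  // The final chunk has no continuation bit.
  out.push_back(static_cast<char>(value + kAsciiOffset));
  return out;
}
```

With the worked example from Google's algorithm page, a delta of -17998321 encodes to "`~oia@".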

out_type<Varchar>& result,
const arg_type<Array<Geometry>>& points,
int64_t precisionExponent) {
VELOX_USER_CHECK_GE(
Collaborator

Is there also a maximum for the exponent? I imagine it cannot be arbitrarily large either.
We need to prevent overflows.

Author

Java only validates minimum precision (>= 1). Should I add maximum validation (e.g., <= 18) as a safety improvement even though Java doesn't have it?
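
One way to reason about an upper bound is sketched below. This is only a back-of-the-envelope helper under stated assumptions, not something the PR or the Java reference contains: it assumes coordinates are bounded in magnitude by maxAbsCoordinate, that scaled values are held in int64_t, and that both the delta between two points and the left shift before chunking must fit. The name maxSafeExponent is hypothetical.

```cpp
// Largest exponent e such that 4 * maxAbsCoordinate * 10^e stays below 2^63.
// The factor 4 is an assumption: 2x for the difference of two scaled
// coordinates and 2x for the left shift applied before chunking.
int maxSafeExponent(double maxAbsCoordinate) {
  constexpr double kInt64Limit = 9223372036854775808.0; // 2^63
  constexpr int kSearchCap = 22; // 10^22 is the largest power of ten exact in a double
  int exponent = 0;
  double scale = 1.0;
  // Grow the exponent while the next power of ten still fits.
  while (exponent < kSearchCap &&
         4.0 * maxAbsCoordinate * (scale * 10.0) < kInt64Limit) {
    scale *= 10.0;
    ++exponent;
  }
  return exponent;
}
```

Under these assumptions, coordinates within ±180 give a bound of 16 rather than 18, so the right limit depends on which inputs the function is meant to accept.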

out_type<Array<Geometry>>& result,
const arg_type<Varchar>& encoded,
int64_t precisionExponent) {
VELOX_USER_CHECK_GE(
Collaborator

Same as above, overflow?

FOLLY_ALWAYS_INLINE void call(
out_type<Varchar>& result,
const arg_type<Array<Geometry>>& points) {
callImpl(result, points, kDefaultPrecisionExponent);
Collaborator

We can save the calculation by hard coding the actual value for 10^5 and use that with a little variation. That way we don't have to re-calculate the same value over and over.

Author

Added precomputed constant kDefaultPrecision = 100000.0 . Both encode and decode functions now check if precisionExponent == kDefaultPrecisionExponent and use the hard-coded value directly, avoiding the std::pow() calculation for the common case (default precision of 10^5).
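
The fast path described here could look roughly like the following. The constants kDefaultPrecisionExponent and kDefaultPrecision are named in the comment above; the helper precisionFor is a hypothetical wrapper added only for illustration.

```cpp
#include <cmath>
#include <cstdint>

constexpr int64_t kDefaultPrecisionExponent = 5;
constexpr double kDefaultPrecision = 100000.0; // 10^5, precomputed

// Returns 10^precisionExponent, skipping std::pow for the default exponent.
inline double precisionFor(int64_t precisionExponent) {
  if (precisionExponent == kDefaultPrecisionExponent) {
    return kDefaultPrecision;
  }
  return std::pow(10.0, static_cast<double>(precisionExponent));
}
```

The single-argument overloads always take the first branch, so they never pay for the std::pow call.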

kMinimumPrecisionExponent);

double precision = std::pow(10.0, static_cast<double>(precisionExponent));
std::string encodedStr(encoded.data(), encoded.size());
Collaborator

Do we need to make a copy or could we not operate on StringView directly?

Author

You're right, the copy is unnecessary. I'll remove it and use the StringView directly, since decodeNextDelta only reads from it.

int64_t b;

do {
VELOX_USER_CHECK_LT(
Collaborator

We don't want to throw an exception for bad user input. Instead, we want to use the Status return. (unwinding the exception is expensive).
Can we refactor this so the caller can use the Status as a return of the call function in case a user error occurs - in this case a bad input with missing data.

Check the other geometry function implementations.

Author

@deepthibose01 deepthibose01 Mar 9, 2026

Thank you very much @czentgr for the review!
I agree and would also like to implement the approach with a Status return. I'd appreciate your guidance on the error-handling approach.
Java reference (GeoFunctions.java:1394) also throws exceptions for invalid input. Should I convert to Status returns (better C++ performance, matches other Velox geometry functions) or keep exceptions to match Java?
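
To make the trade-off concrete, a status-returning decoder might be shaped like the sketch below. DecodeStatus is a hypothetical stand-in and not Velox's Status class, and the decodeNextDelta signature here is assumed for illustration. It reads from a std::string_view, so no copy of the input is made, and reports truncated or invalid input through the return value instead of throwing.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical stand-in for a status type; not Velox's Status class.
struct DecodeStatus {
  bool ok;
  std::string message;
};

// Reads one delta starting at `pos` and advances `pos` past it.
// On failure, `delta` is left untouched and the status explains why.
DecodeStatus decodeNextDelta(
    std::string_view encoded, size_t& pos, int64_t& delta) {
  uint64_t value = 0;
  uint64_t chunk = 0;
  int shift = 0;
  do {
    if (pos >= encoded.size()) {
      return {false, "Input ends in the middle of an encoded value"};
    }
    if (shift >= 64) {
      return {false, "Encoded value has too many chunks"};
    }
    const auto c = static_cast<unsigned char>(encoded[pos++]);
    if (c < 63 || c > 126) {
      return {false, "Character outside the polyline alphabet"};
    }
    chunk = c - 63;
    value |= (chunk & 0x1f) << shift; // low five bits are payload
    shift += 5;
  } while (chunk >= 0x20); // continuation bit set: more chunks follow
  // Undo the signed-to-unsigned step: lowest bit is the sign flag.
  const int64_t half = static_cast<int64_t>(value >> 1);
  delta = (value & 1) ? ~half : half;
  return {true, ""};
}
```

A caller would check `ok` after each delta and propagate the failure, rather than relying on stack unwinding for malformed user input.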

Collaborator

@czentgr czentgr left a comment

Please also run the pre-commit workflow.

@deepthibose01 deepthibose01 force-pushed the feature/google_polyline_functions branch 3 times, most recently from ba00006 to 9aaa667 Compare March 10, 2026 09:41
@deepthibose01
Author

Hi @czentgr,
Thank you for the detailed review! I've addressed all the feedback. Please have a look.

velox_functions_test --gtest_filter="GeometryFunctionsTest.testGooglePolylineFunctions"
Running main() from /private/tmp/googletest-20250910-5249-qce8ci/googletest-1.17.0/googletest/src/gtest_main.cc
Note: Google Test filter = GeometryFunctionsTest.testGooglePolylineFunctions
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from GeometryFunctionsTest
[ RUN      ] GeometryFunctionsTest.testGooglePolylineFunctions
[       OK ] GeometryFunctionsTest.testGooglePolylineFunctions (16 ms)
[----------] 1 test from GeometryFunctionsTest (16 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (91 ms total)
[  PASSED  ] 1 test.
pre-commit run --all-files
Check hooks apply to the repository......................................Passed
Check for useless excludes...............................................Passed
trim trailing whitespace.................................................Passed
fix end of files.........................................................Passed
check for added large files..............................................Passed
check that executables have shebangs.....................................Passed
check that scripts with shebangs are executable..........................Passed
clang-tidy...............................................................Passed
license-header...........................................................Passed
CMake formatter..........................................................Passed
clang-format.............................................................Passed
ruff.....................................................................Passed
ruff-format..............................................................Passed
shellcheck...............................................................Passed
shfmt....................................................................Passed
yamllint.................................................................Passed
yamlfmt..................................................................Passed
zizmor...................................................................Passed
Validate GitHub Actions workflows........................................Passed

@deepthibose01 deepthibose01 requested a review from czentgr March 10, 2026 09:46
Collaborator

@czentgr czentgr left a comment

Thanks! Some comments.

auto arrayVec = makeNullableArrayVector<std::string>({{points}});
auto input = makeRowVector({arrayVec});
std::optional<std::string> result = evaluateOnce<std::string>(
"google_polyline_encode(transform(c0, x -> ST_GEOMETRYFROMTEXT(x)))",
Collaborator

Why not use google_polyline_encode(ST_GeometryFromText(c0)) directly? No need to use the transform function. Let's use ST_GeometryFromText also to align with the other usage (even though it is not case sensitive). Same with the other occurrences.

Author

ST_GeometryFromText does not seem to have an array overload: it returns a single Geometry, while the encode function requires Array<Geometry>. That is why transform is used here, following the existing pattern in this file's tests for GEOMETRY_UNION, ST_LineString, and ST_MultiPoint. The alternative that avoids transform entirely would be to build the geometry array separately, but that would need multiple evaluateOnce calls, which seemed inefficient.

ST_GeometryFromText
FOLLY_ALWAYS_INLINE Status
call(out_type<Geometry>& result, const arg_type<Varchar>& wkt)

google_polyline_encode
call(out_type<Varchar>& result, const arg_type<Array<Geometry>>& points)


const auto testCustomEncode = [&](const std::optional<std::vector<
std::optional<std::string>>>& points,
const int32_t precision,
Collaborator

No need for const.


const auto testCustomDecode =
[&](const std::optional<std::string>& encoded,
const int32_t precision,
Collaborator

No need for const.

expectedPoints) {
auto encodeVec = makeFlatVector<std::string>({encoded.value()});
auto input = makeRowVector({encodeVec});
auto output = evaluate(
Collaborator

Can we use evaluateOnce instead?

Author

The decode function returns an array of strings/points, but evaluateOnce<std::vector<std::string>> does not seem to work. When I tried to use it, it gave this compilation error:

error: no type named 'NativeType' in 'facebook::velox::CppToType<std::vector<std::string>>'
   28 |   using Type = typename CppToType<T>::NativeType;

Also, the existing tests for similar cases, such as ST_Points and ST_Geometries, all seem to use evaluate().

auto arrayVec = makeNullableArrayVector<std::string>({{points}});
auto precisionVec = makeFlatVector<int32_t>({precision});
auto input = makeRowVector({arrayVec, precisionVec});
std::optional<std::string> result = evaluateOnce<std::string>(
Collaborator

Looks like a lot of copy&paste for the encode and decode versions. The only difference is the additional c1 column.

Can precision be a nullopt as well from a function point of view (aka is NULL allowed as column value for precision?). If it is not allowed to be NULL we can use a single lambda with a std::optional for precision and if it is not set use the standard version, otherwise pass the precision.

Author

Both the Java and C++ implementations use primitive types for precision, which cannot be null, and the Java code does not appear to have a @Nullable annotation. I have combined the encode versions and the decode versions as advised.

@deepthibose01 deepthibose01 requested a review from czentgr March 11, 2026 09:20
Collaborator

@czentgr czentgr left a comment

Thanks, looks good. Just one more comment.

@jagill When you get a chance can you please review as well?

"POLYGON((-135 85, -45 85, 45 85, 135 85, -135 85))", 619.00E9);
}

TEST_F(GeometryFunctionsTest, testGooglePolylineFunctions) {
Collaborator

One more thing. Can we add a test that does "testDecode(testEncode(data))". It is a bit tricky with the current way the lambdas work because each function also validates. And also vice versa testEncode(testDecode).

Author

Hi Christian, thanks for the review. Added bidirectional round-trip tests for testEncode(testDecode(encoded)) and testDecode(testEncode(points)). Both test functions now support an optional compose parameter that defaults to false. When compose=true, the function returns its result without validation, enabling composition. The outer function validates the complete round trip, confirming that encode and decode are inverse operations. Tested with precision values: default, 1, 6, and 16.
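
The round-trip property can also be checked outside the Velox test harness. The sketch below is a self-contained approximation of the algorithm and not the PR's implementation: Point, appendDelta, readDelta, encodePolyline, and decodePolyline are hypothetical names, coordinates are encoded in the order they appear in each pair, and rounding uses std::llround as an assumption.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

using Point = std::pair<double, double>;

constexpr uint64_t kChunkMask = 0x1f;
constexpr uint64_t kContinuationBit = 0x20;
constexpr uint64_t kAsciiOffset = 63;

// Appends one delta: shift left, invert if negative, emit 5-bit chunks.
void appendDelta(int64_t delta, std::string& out) {
  uint64_t value = static_cast<uint64_t>(delta) << 1;
  if (delta < 0) {
    value = ~value;
  }
  while (value >= kContinuationBit) {
    out.push_back(static_cast<char>(
        (kContinuationBit | (value & kChunkMask)) + kAsciiOffset));
    value >>= 5;
  }
  out.push_back(static_cast<char>(value + kAsciiOffset));
}

// Reads one delta; returns false on truncated or invalid input.
bool readDelta(std::string_view in, size_t& pos, int64_t& delta) {
  uint64_t value = 0;
  uint64_t chunk = 0;
  int shift = 0;
  do {
    if (pos >= in.size() || shift >= 64) {
      return false;
    }
    const auto c = static_cast<unsigned char>(in[pos++]);
    if (c < kAsciiOffset) {
      return false;
    }
    chunk = c - kAsciiOffset;
    value |= (chunk & kChunkMask) << shift;
    shift += 5;
  } while (chunk >= kContinuationBit);
  const int64_t half = static_cast<int64_t>(value >> 1);
  delta = (value & 1) ? ~half : half;
  return true;
}

std::string encodePolyline(const std::vector<Point>& points, int exponent) {
  const double precision = std::pow(10.0, exponent);
  std::string out;
  int64_t prevFirst = 0;
  int64_t prevSecond = 0;
  for (const auto& [first, second] : points) {
    // Round each coordinate first, then take deltas of the rounded values.
    const int64_t a = std::llround(first * precision);
    const int64_t b = std::llround(second * precision);
    appendDelta(a - prevFirst, out);
    appendDelta(b - prevSecond, out);
    prevFirst = a;
    prevSecond = b;
  }
  return out;
}

std::optional<std::vector<Point>> decodePolyline(
    std::string_view encoded, int exponent) {
  const double precision = std::pow(10.0, exponent);
  std::vector<Point> points;
  int64_t first = 0;
  int64_t second = 0;
  size_t pos = 0;
  while (pos < encoded.size()) {
    int64_t dFirst = 0;
    int64_t dSecond = 0;
    if (!readDelta(encoded, pos, dFirst) ||
        !readDelta(encoded, pos, dSecond)) {
      return std::nullopt; // malformed input, no exception thrown
    }
    first += dFirst;
    second += dSecond;
    points.emplace_back(first / precision, second / precision);
  }
  return points;
}
```

Comparing re-encoded strings, rather than decoded doubles, sidesteps floating-point equality issues when asserting the round trip.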

Making common impl private

Test file for the new function

Using similar points check for encoding as in presto side

brace mismatch

More test cases for validation wrt Java test scenarios

Custom encode/decode tests

Added additonal test scenarios from Java

Added additonal test scenarios from Java
@deepthibose01 deepthibose01 force-pushed the feature/google_polyline_functions branch from 4b210fa to b7359c4 Compare March 12, 2026 08:37
@deepthibose01 deepthibose01 requested a review from czentgr March 12, 2026 08:38
@deepthibose01 deepthibose01 changed the title from "[WIP] feat : Implement Google Polyline encoding and decoding functions" to "feat(function): Implement Google Polyline encoding and decoding functions" Mar 12, 2026
@deepthibose01 deepthibose01 force-pushed the feature/google_polyline_functions branch from b7359c4 to 64a88ae Compare March 12, 2026 08:44
@deepthibose01 deepthibose01 force-pushed the feature/google_polyline_functions branch from 64a88ae to 14daa74 Compare March 12, 2026 08:56
@deepthibose01 deepthibose01 marked this pull request as ready for review March 12, 2026 08:59

3 participants