feat: Add Spark months_between function #14909
zml1206 wants to merge 5 commits into facebookincubator:main
SELECT months_between('1997-02-28 10:30:00', '1996-10-30', true); -- 3.94959677
SELECT months_between('1997-02-28 10:30:00', '1996-10-30', false); -- 3.9495967741935485
SELECT months_between('1997-02-28 10:30:00', '1996-03-31 11:00:00', true); -- 11
spark-sql (default)> SELECT months_between('1997-02-28 10:30:00', '1996-03-31 11:00:00', true);
11.0
SELECT months_between('1997-02-28 10:30:00', '1996-10-30', true); -- 3.94959677
SELECT months_between('1997-02-28 10:30:00', '1996-10-30', false); -- 3.9495967741935485
SELECT months_between('1997-02-28 10:30:00', '1996-03-31 11:00:00', true); -- 11
SELECT months_between('1997-02-21 10:30:00', '1996-03-21 11:00:00', true); -- 11
spark-sql (default)> SELECT months_between('1997-02-21 10:30:00', '1996-03-21 11:00:00', true);
11.0
velox/functions/lib/TimeUtils.h
Outdated
inline constexpr int64_t kSecondsInMinute = 60;
inline constexpr int64_t kSecondsInHour = 3600;
inline constexpr int64_t kSecondsInDay = 86'400;
inline constexpr int64_t kSecondsInMonth = 2'678'400;
Could you use an expression such as kSecondsInDay * 31 to represent this constant?
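A minimal sketch of what this suggestion might look like, assuming the constants keep the names shown in the diff. Deriving each constant from the previous one keeps the 31-day month convention visible instead of hiding it in the literal 2'678'400. The static_assert is an addition for illustration and is not part of the PR.

```cpp
#include <cstdint>

inline constexpr int64_t kSecondsInMinute = 60;
inline constexpr int64_t kSecondsInHour = 60 * kSecondsInMinute;
inline constexpr int64_t kSecondsInDay = 24 * kSecondsInHour;
// months_between treats every month as 31 days when computing the
// fractional part, so the month length is derived from the day length.
inline constexpr int64_t kSecondsInMonth = kSecondsInDay * 31;

// Illustrative check: the derived value matches the original literal.
static_assert(kSecondsInMonth == 2'678'400);
```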
  std::optional<DateTimeUnit> unit_ = std::nullopt;
};

template <typename T>
velox/functions/lib/DateTimeUtil.h
Outdated
}

FOLLY_ALWAYS_INLINE bool isEndDayOfMonth(const std::tm& tm) {
  const auto endDay = util::getMaxDayOfMonth(getYear(tm), getMonth(tm));
Remove endDay since it is used only once.
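One way the helper could read with the temporary removed. This is a sketch under assumptions: maxDayOfMonth and makeTm are hypothetical stand-ins written for this example (the PR calls util::getMaxDayOfMonth with getYear(tm) and getMonth(tm)), and the comparison against tm_mday is a guess at the part of the function body that the diff excerpt does not show.

```cpp
#include <ctime>

// Hypothetical stand-in for Velox's util::getMaxDayOfMonth; month is 1-12.
constexpr int maxDayOfMonth(int year, int month) {
  if (month == 2) {
    const bool leap = (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    return leap ? 29 : 28;
  }
  return (month == 4 || month == 6 || month == 9 || month == 11) ? 30 : 31;
}

// Hypothetical helper that builds a std::tm from a calendar date.
constexpr std::tm makeTm(int year, int month, int day) {
  std::tm tm{};
  tm.tm_year = year - 1900; // std::tm counts years from 1900.
  tm.tm_mon = month - 1; // std::tm months are 0-11.
  tm.tm_mday = day;
  return tm;
}

// The max-day lookup is used directly in the comparison, so no named
// temporary is needed.
constexpr bool isEndDayOfMonth(const std::tm& tm) {
  return tm.tm_mday == maxDayOfMonth(tm.tm_year + 1900, tm.tm_mon + 1);
}
```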
velox/functions/lib/DateTimeUtil.h
Outdated
}

FOLLY_ALWAYS_INLINE double
monthsBetween(const std::tm& tm1, const std::tm& tm2, const bool roundOff) {
const bool -> bool; drop const when passing by value.
      "months_between(c0, c1, c2)", timestamp1, timestamp2, roundOff);
};

EXPECT_EQ(
Could you add more tests with roundOff set to false to make sure it works well?
velox/functions/lib/DateTimeUtil.h
Outdated
}

FOLLY_ALWAYS_INLINE double
monthsBetween(const std::tm& tm1, const std::tm& tm2, bool roundOff) {
Only Spark SQL has this function, so we could move it into struct MonthsBetweenFunction.
jinchengchenghh left a comment
LGTM, could you also help take a look? Thanks! @rui-mo
rui-mo left a comment
Would you please update apache/gluten#10782 to ensure all Spark tests pass? Thanks.
It seems that all Spark tests passed. Is there something I missed?
address comments
2023776 to 429b40c
@zml1206 I just noticed that some workflows failed and want to make sure it works.
They are
rui-mo left a comment
Thanks. Just two nits. Looks good overall.
@bikramSingh91 has imported this pull request. If you are a Meta employee, you can view this in D83490561.
@bikramSingh91 merged this pull request in a6fa169.

Adds Spark months_between function.
https://github.com/apache/spark/blob/29434ea766b0fc3c3bf6eaadb43a8f931133649e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala#L2031
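
To make the semantics concrete, here is a self-contained sketch of the calculation as I understand it from the linked Spark code; it is not the code merged in this PR. SimpleTimestamp, maxDayOfMonth, and roundHalfUp are hypothetical names introduced for this example (the PR operates on std::tm values), and time zone handling is left out. The sketch assumes that the result is a whole number of months when both dates fall on the same day of month or both are month ends, and that otherwise the fraction is computed in seconds against a 31-day month and optionally rounded to 8 decimal places.

```cpp
#include <cstdint>

// Hypothetical plain-field timestamp used only in this sketch.
struct SimpleTimestamp {
  int year;
  int month; // 1-12
  int day; // 1-31
  int hour;
  int minute;
  int second;
};

constexpr int64_t kSecondsInDay = 86'400;
constexpr int64_t kSecondsInMonth = kSecondsInDay * 31;

// Hypothetical stand-in for a max-day-of-month lookup.
constexpr int maxDayOfMonth(int year, int month) {
  if (month == 2) {
    const bool leap = (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    return leap ? 29 : 28;
  }
  return (month == 4 || month == 6 || month == 9 || month == 11) ? 30 : 31;
}

// Rounds to the nearest integer as floor(value + 0.5).
constexpr int64_t roundHalfUp(double value) {
  const double shifted = value + 0.5;
  const int64_t truncated = static_cast<int64_t>(shifted);
  return static_cast<double>(truncated) > shifted ? truncated - 1 : truncated;
}

constexpr double monthsBetween(
    const SimpleTimestamp& t1,
    const SimpleTimestamp& t2,
    bool roundOff) {
  const double monthDiff =
      (t1.year * 12 + t1.month) - (t2.year * 12 + t2.month);
  const bool bothMonthEnds = t1.day == maxDayOfMonth(t1.year, t1.month) &&
      t2.day == maxDayOfMonth(t2.year, t2.month);
  // Same day of month, or both month ends: the time of day is ignored.
  if (t1.day == t2.day || bothMonthEnds) {
    return monthDiff;
  }
  const int64_t secondsInDay1 = t1.hour * 3600 + t1.minute * 60 + t1.second;
  const int64_t secondsInDay2 = t2.hour * 3600 + t2.minute * 60 + t2.second;
  const int64_t secondsDiff =
      (t1.day - t2.day) * kSecondsInDay + secondsInDay1 - secondsInDay2;
  const double diff =
      monthDiff + static_cast<double>(secondsDiff) / kSecondsInMonth;
  // With roundOff, keep 8 digits after the decimal point.
  return roundOff ? roundHalfUp(diff * 1e8) / 1e8 : diff;
}
```

With the inputs from the documentation examples in the review thread, monthsBetween({1997, 2, 28, 10, 30, 0}, {1996, 10, 30, 0, 0, 0}, true) works out to 4 - 135000 / 2678400, which rounds to 3.94959677, and the two month-end and same-day cases both return 11.0.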