How to make rescaling utility functions more general? #6323
Replies: 13 comments 9 replies
-
Some references:
|
Beta Was this translation helpful? Give feedback.
-
Would you test this to confirm whether we indeed incur lots of copying for the error message strings? This data point would help direct the discussion. |
Beta Was this translation helpful? Give feedback.
-
I would vote for something like |
Beta Was this translation helpful? Give feedback.
-
Can folly |
Beta Was this translation helpful? Give feedback.
-
I think there are 2 different mechanism to vote for/discuss here:
As previously suggested, for this we can refer to any of the available implementations like absl::Status , rocksdb::Status , arrow::Result , impala:Status, kudu:Status
BTW, is the plan to start with lower level APIs like cast conversion utils and eventually bubble them up to higher levels like expression eval's main loop? |
Beta Was this translation helpful? Give feedback.
-
Do we have conclusion? |
Beta Was this translation helpful? Give feedback.
-
I also vote we create a lightweight status class instead of folly::Expected (or similar) as it gives us more flexibility to evolve it and add the members we need. Although we are only targeting the rescaling path for now, I suspect this will quickly end up being used in other parts of the codebase. Happy to help iterate on a PR is anyone would like to take a stab at it. |
Beta Was this translation helpful? Give feedback.
-
I am just curious about the refactor scope. If we wanted to replace all exception-throwing code in Velox to status-return (do we?), will one have to catch exceptions and handle returned statuses at the same time when calling Velox's API? Since exceptions still be thrown from std library or possibly from 3rd libraries. Just wanted to make sure we won't create more complexity to users here. |
Beta Was this translation helpful? Give feedback.
-
I think we should simply add it to the rescaling path for now. We can convert Status to Velox Error outside. |
Beta Was this translation helpful? Give feedback.
-
So currently only parquet writer is copied from arrow, right? What's the long team plan for Velox Parquet writer? Will we re-implement it or use arrow as it is? I don't think copying the code from arrow is good idea. We will eventually customize code but just leave the "arrow" name there without hardcoded dependency. |
Beta Was this translation helpful? Give feedback.
-
I start looking at this as part of vendoring the Parquet writer code (#6193). I'll send out a draft soon based on the Status/Result classes Arrow has today, which are pretty handy. |
Beta Was this translation helpful? Give feedback.
-
FYI, I just merged today a first version of a Status class for Velox. It's heavily based on Arrow's: When I'm back from holiday's break I'll submit a proposal for an Result/Expect class. Seems like the one in folly is pretty functional and get be easily used with our Status class. |
Beta Was this translation helpful? Give feedback.
-
As mentioned in #6051 (comment) .
After #5801, when rescaling utility functions are implemented, it's better to return an error message instead of throwing an exception, and it will probably introduce a field called
errorMessage
. But if we introduce some APIs likefolly::Expected
or something like that, it might be more clear. Should we introduce these APIs?If a rescaling function returns
folly::Expected
, it should befolly::Expected<TOutput, ERROR>
whichTOutput
standsint64_t
orint128_t
andERROR
could be a string-type or an enum-type value. If we use a string-type value asERROR
, it might contain a lot of copying so an enum-type may be better.Should we add a new enum class for the rescaling function's error reason? And What should be contained in this enum class?
/cc @mbasmanova @jinchengchenghh @rui-mo
Beta Was this translation helpful? Give feedback.
All reactions