fix: preserve async UDF return field metadata#22663
Conversation
9d1f609 to
7b117e0
Compare
paleolimbot
left a comment
There was a problem hiding this comment.
To the extent that I'm qualified to approve these things, this looks good to me!
| pub fn field(&self, _input_schema: &Schema) -> Result<Field> { | ||
| Ok(self.return_field.as_ref().clone().with_name(&self.name)) |
There was a problem hiding this comment.
i wonder if we're better off deprecating this function and do something similar to ScalarFunctionExpr and implement PhysicalExpr::return_field 🤔
datafusion/datafusion/physical-expr/src/scalar_function.rs
Lines 283 to 285 in e2db766
There was a problem hiding this comment.
Good idea. I have updated the PR to deprecate fn field and implemented fn return_field.
34f1794 to
b1241ec
Compare
|
Thank you for opening this pull request! Reviewer note: cargo-semver-checks reported the current version number is not SemVer-compatible with the changes in this pull request (compared against the base branch). Details |
Which issue does this PR close?
Closes #22662.
Rationale for this change
Async scalar UDFs can compute output field metadata in
return_field_from_args(...), butAsyncFuncExprrebuilt the output field from only name, data type, and nullability. This dropped metadata from async UDF result fields.What changes are included in this PR?
This PR updates
AsyncFuncExpr::field(...)to preserve the plannedreturn_fieldmetadata and only rename the field for the async expression output.It also adds a regression test that verifies an async UDF result batch preserves metadata attached by
return_field_from_args(...).Are these changes tested?
Yes.
Added a regression test in:
datafusion/core/tests/user_defined/user_defined_async_scalar_functions.rsThe test fails without the fix and passes with the fix.
Are there any user-facing changes?
Yes.
Async scalar UDF result fields now preserve metadata attached by
return_field_from_args(...).