fix: preserve async UDF return field metadata #22663

Open
Kontinuation wants to merge 2 commits into apache:main from Kontinuation:fix-async-udf-field-metadata

@Kontinuation
Member

Which issue does this PR close?

Closes #22662.

Rationale for this change

Async scalar UDFs can compute output field metadata in return_field_from_args(...), but AsyncFuncExpr rebuilt the output field from only name, data type, and nullability. This dropped metadata from async UDF result fields.

What changes are included in this PR?

This PR updates AsyncFuncExpr::field(...) to preserve the planned return_field metadata and only rename the field for the async expression output.
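The difference between rebuilding a field and renaming it can be illustrated with a minimal sketch. It assumes a simplified stand-in `Field` struct rather than the real Arrow type, and the helper names (`rebuild_lossy`, `rename_preserving`, `sample_planned_field`), the `unit` metadata key, and the `async_out` name are all hypothetical, chosen only for illustration.

```rust
use std::collections::HashMap;

/// Simplified stand-in for an Arrow field (hypothetical, not the real type).
#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
    pub data_type: String,
    pub nullable: bool,
    pub metadata: HashMap<String, String>,
}

impl Field {
    /// Builds a field with empty metadata.
    pub fn new(name: &str, data_type: &str, nullable: bool) -> Self {
        Self {
            name: name.to_string(),
            data_type: data_type.to_string(),
            nullable,
            metadata: HashMap::new(),
        }
    }

    /// Replaces the metadata map, keeping everything else.
    pub fn with_metadata(mut self, metadata: HashMap<String, String>) -> Self {
        self.metadata = metadata;
        self
    }

    /// Replaces the name, keeping everything else (including metadata).
    pub fn with_name(mut self, name: &str) -> Self {
        self.name = name.to_string();
        self
    }
}

/// Behavior described in the rationale: rebuild from name, data type,
/// and nullability only, so the metadata is lost.
pub fn rebuild_lossy(planned: &Field, output_name: &str) -> Field {
    Field::new(output_name, &planned.data_type, planned.nullable)
}

/// Behavior described in this PR: clone the planned field and only rename it.
pub fn rename_preserving(planned: &Field, output_name: &str) -> Field {
    planned.clone().with_name(output_name)
}

/// Hypothetical planned return field carrying one metadata entry.
pub fn sample_planned_field() -> Field {
    let metadata = HashMap::from([("unit".to_string(), "meters".to_string())]);
    Field::new("udf_result", "Float64", false).with_metadata(metadata)
}
```

Calling `rebuild_lossy(&sample_planned_field(), "async_out")` yields a field with an empty metadata map, while `rename_preserving` keeps the `unit` entry and changes only the name.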

It also adds a regression test that verifies an async UDF result batch preserves metadata attached by return_field_from_args(...).

Are these changes tested?

Yes.

Added a regression test in:

  • datafusion/core/tests/user_defined/user_defined_async_scalar_functions.rs

The test fails without the fix and passes with the fix.

Are there any user-facing changes?

Yes.

Async scalar UDF result fields now preserve metadata attached by return_field_from_args(...).

github-actions Bot added the physical-expr and core labels on May 31, 2026
Kontinuation force-pushed the fix-async-udf-field-metadata branch from 9d1f609 to 7b117e0 on May 31, 2026 12:15
Member

@paleolimbot paleolimbot left a comment

To the extent that I'm qualified to approve these things, this looks good to me!

Comment on lines +91 to +92
pub fn field(&self, _input_schema: &Schema) -> Result<Field> {
    Ok(self.return_field.as_ref().clone().with_name(&self.name))
Contributor

I wonder if we're better off deprecating this function and, as ScalarFunctionExpr does, implementing PhysicalExpr::return_field instead 🤔

fn return_field(&self, _input_schema: &Schema) -> Result<FieldRef> {
    Ok(Arc::clone(&self.return_field))
}
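To show what returning `Arc::clone(&self.return_field)` implies, here is a small sketch under stated assumptions: `Field`, `FieldRef`, `AsyncExprModel`, and `planned_field` are simplified, hypothetical stand-ins rather than the DataFusion or Arrow types. The point is only that cloning the `Arc` hands out the same planned field, metadata included, without copying it.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Simplified stand-in for an Arrow field (hypothetical).
#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
    pub metadata: HashMap<String, String>,
}

/// Shared, reference-counted field handle.
pub type FieldRef = Arc<Field>;

/// Hypothetical model of an expression that stores its planned return field.
pub struct AsyncExprModel {
    return_field: FieldRef,
}

impl AsyncExprModel {
    pub fn new(return_field: FieldRef) -> Self {
        Self { return_field }
    }

    /// Returns the planned field by bumping the reference count;
    /// no field data (and no metadata) is copied or rebuilt.
    pub fn return_field(&self) -> FieldRef {
        Arc::clone(&self.return_field)
    }
}

/// Hypothetical planned field with one metadata entry.
pub fn planned_field() -> FieldRef {
    Arc::new(Field {
        name: "udf_result".to_string(),
        metadata: HashMap::from([("unit".to_string(), "meters".to_string())]),
    })
}
```

In this model, `Arc::ptr_eq` on the planned field and the value returned by `return_field()` is true, so whatever metadata was attached at planning time is what callers see.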

Member Author

Good idea. I have updated the PR to deprecate fn field and implement fn return_field.
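A deprecate-and-keep-working pattern of this kind might look roughly like the sketch below. It is not the PR's actual diff: `Field`, `AsyncExprModel`, `call_legacy_field`, and the deprecation note text are hypothetical, and the schema argument and `Result` wrapper are omitted for brevity.

```rust
use std::sync::Arc;

/// Simplified stand-in for an Arrow field (hypothetical).
#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
    pub nullable: bool,
}

pub type FieldRef = Arc<Field>;

/// Hypothetical model of an expression with an output name and a planned field.
pub struct AsyncExprModel {
    name: String,
    return_field: FieldRef,
}

impl AsyncExprModel {
    pub fn new(name: &str, return_field: FieldRef) -> Self {
        Self { name: name.to_string(), return_field }
    }

    /// Preferred accessor: shares the planned field as-is.
    pub fn return_field(&self) -> FieldRef {
        Arc::clone(&self.return_field)
    }

    /// Legacy accessor kept for downstream callers; it still works,
    /// but using it now produces a compiler warning.
    #[deprecated(note = "use `return_field` instead")]
    pub fn field(&self) -> Field {
        let mut field = self.return_field.as_ref().clone();
        field.name = self.name.clone();
        field
    }
}

/// Callers that cannot migrate yet can silence the warning locally.
#[allow(deprecated)]
pub fn call_legacy_field(expr: &AsyncExprModel) -> Field {
    expr.field()
}
```

Adding `#[deprecated]` does not break existing callers, though downstream crates that use the method will see a compiler warning.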

github-actions Bot added the physical-plan label on Jun 5, 2026
Kontinuation force-pushed the fix-async-udf-field-metadata branch from 34f1794 to b1241ec on June 5, 2026 16:12

github-actions Bot commented Jun 5, 2026

Thank you for opening this pull request!

Reviewer note: cargo-semver-checks reported the current version number is not SemVer-compatible with the changes in this pull request (compared against the base branch).

Details
     Cloning apache/main
    Building datafusion v53.1.0 (current)
       Built [  98.898s] (current)
     Parsing datafusion v53.1.0 (current)
      Parsed [   0.036s] (current)
    Building datafusion v53.1.0 (baseline)
       Built [ 100.064s] (baseline)
     Parsing datafusion v53.1.0 (baseline)
      Parsed [   0.035s] (baseline)
    Checking datafusion v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   0.627s] 223 checks: 223 pass, 30 skip
     Summary no semver update required
    Finished [ 202.847s] datafusion
    Building datafusion-physical-expr v53.1.0 (current)
       Built [  28.506s] (current)
     Parsing datafusion-physical-expr v53.1.0 (current)
      Parsed [   0.048s] (current)
    Building datafusion-physical-expr v53.1.0 (baseline)
       Built [  28.969s] (baseline)
     Parsing datafusion-physical-expr v53.1.0 (baseline)
      Parsed [   0.051s] (baseline)
    Checking datafusion-physical-expr v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   0.355s] 223 checks: 222 pass, 1 fail, 0 warn, 30 skip

--- failure type_method_marked_deprecated: type method #[deprecated] added ---

Description:
A type method is now #[deprecated]. Downstream crates will get a compiler warning when using this method.
        ref: https://doc.rust-lang.org/reference/attributes/diagnostics.html#the-deprecated-attribute
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.48.0/src/lints/type_method_marked_deprecated.ron

Failed in:
  method datafusion_physical_expr::async_scalar_function::AsyncFuncExpr::field in /home/runner/work/datafusion/datafusion/datafusion/physical-expr/src/async_scalar_function.rs:92

     Summary semver requires new minor version: 0 major and 1 minor checks failed
    Finished [  59.206s] datafusion-physical-expr
    Building datafusion-physical-plan v53.1.0 (current)
       Built [  36.056s] (current)
     Parsing datafusion-physical-plan v53.1.0 (current)
      Parsed [   0.127s] (current)
    Building datafusion-physical-plan v53.1.0 (baseline)
       Built [  36.035s] (baseline)
     Parsing datafusion-physical-plan v53.1.0 (baseline)
      Parsed [   0.126s] (baseline)
    Checking datafusion-physical-plan v53.1.0 -> v53.1.0 (no change; assume patch)
     Checked [   0.613s] 223 checks: 223 pass, 30 skip
     Summary no semver update required
    Finished [  75.169s] datafusion-physical-plan

github-actions Bot added the auto detected api change label on Jun 5, 2026

Labels

auto detected api change, core, physical-expr, physical-plan

Development

Successfully merging this pull request may close these issues.

AsyncFuncExpr drops async UDF return field metadata

3 participants