Skip to content

variant_schema udf#37

Draft
sdf-jkl wants to merge 36 commits intodatafusion-contrib:mainfrom
sdf-jkl:variant_schema_udf
Draft

variant_schema udf#37
sdf-jkl wants to merge 36 commits intodatafusion-contrib:mainfrom
sdf-jkl:variant_schema_udf

Conversation

@sdf-jkl
Copy link
Copy Markdown
Contributor

@sdf-jkl sdf-jkl commented Mar 2, 2026

@sdf-jkl
Copy link
Copy Markdown
Contributor Author

sdf-jkl commented Mar 2, 2026

It works, but I'm not happy about encoding VariantSchema into binary. replaced with json

@friendlymatthew, you can take a look

Copy link
Copy Markdown
Member

@friendlymatthew friendlymatthew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sdf-jk . Directionally this makes sense but I think it needs a simplification pass

I think there's a bunch of leftover infrastructure from the removed UDAF that has no callers in the scalar UDF, we can probably strip this out.

The type widening logic is also incomplete and makes me a bit nervous. Plus it only kicks in when merging elements within a single array. I'd rather just do equality checks for now and add widening in a follow up with full support

Comment thread src/variant_schema.rs Outdated
Comment thread src/variant_schema.rs Outdated
Comment thread src/variant_schema.rs Outdated
Comment thread src/variant_schema.rs Outdated
Comment thread src/variant_schema.rs Outdated
Comment thread src/variant_schema.rs Outdated
Comment thread src/variant_schema.rs Outdated
Comment on lines +346 to +365
/// Prints schema in a presentable manner
pub fn print_schema(schema: &VariantSchema) -> String {
match schema {
VariantSchema::Primitive(s) => format!("{s}"),

VariantSchema::Variant => "VARIANT".to_string(),

VariantSchema::Array(inner) => {
format!("ARRAY<{}>", print_schema(inner))
}

VariantSchema::Object(fields) => {
let parts: Vec<String> = fields
.iter()
.map(|(k, v)| format!("{k}: {}", print_schema(v)))
.collect();
format!("OBJECT<{}>", parts.join(", "))
}
}
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems we are reinventing std::fmt::Display.

We could probably just impl Display and call sites use variant_schema.to_string() or format!("{variant_schema"})

Copy link
Copy Markdown
Contributor Author

@sdf-jkl sdf-jkl Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried that earlier and it didn't look good. I can give it another shot, just to show you how it looks.
I misread that. This sounds good.

Comment thread src/variant_schema.rs Outdated
Comment thread src/variant_schema.rs Outdated
Comment thread src/variant_schema.rs Outdated
Copy link
Copy Markdown
Member

@friendlymatthew friendlymatthew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sdf-jk . Directionally this makes sense but I think it needs a simplification pass

I think there's a bunch of leftover infrastructure from the removed UDAF that has no callers in the scalar UDF, we can probably strip this out.

The type widening logic is also incomplete and makes me a bit nervous. Plus it only kicks in when merging elements within a single array. I'd rather just do equality checks for now and add widening in a follow up with full support

@friendlymatthew
Copy link
Copy Markdown
Member

Sorry I think I removed serde_json when I was performing a manual merge

@sdf-jkl
Copy link
Copy Markdown
Contributor Author

sdf-jkl commented Mar 3, 2026

@friendlymatthew thanks, I'll address the comments.

Sorry I think I removed serde_json when I was performing a manual merge

No worries

@sdf-jkl
Copy link
Copy Markdown
Contributor Author

sdf-jkl commented Mar 4, 2026

@friendlymatthew thanks, I've addressed your comments. Ready for another pass.

@sdf-jkl sdf-jkl requested a review from friendlymatthew March 4, 2026 01:54
@sdf-jkl sdf-jkl marked this pull request as draft March 24, 2026 19:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

variant_schema initial support

2 participants