[substrait]: schema roundtrip fails #15210

Open
niebayes opened this issue Mar 13, 2025 · 0 comments
Labels
bug Something isn't working

Comments

Contributor

niebayes commented Mar 13, 2025

Describe the bug

If a DataFusion schema contains a timestamp column whose timezone is anything other than "UTC", the schema roundtrip fails.
Specifically, converting the schema to a Substrait named struct and then converting that named struct back to a schema always yields a timestamp column whose timezone is "UTC" rather than the original timezone.
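
To illustrate the kind of lossy mapping that would produce this behavior, here is a minimal, standalone sketch. It does not use the real DataFusion or Substrait APIs: `ArrowType`, `SubstraitType`, `to_substrait`, `from_substrait`, and `roundtrips` are hypothetical stand-ins, and the assumption that the intermediate timezone-aware type carries no timezone string (so the reverse conversion has to pick a default such as "UTC") is my reading of the symptom, not something confirmed from the code.

```rust
// Hypothetical, simplified stand-ins for the real Arrow and Substrait types.
#[derive(Debug, Clone, PartialEq)]
enum ArrowType {
    Int32,
    // Timestamp with an optional timezone string.
    Timestamp(Option<String>),
}

#[derive(Debug, Clone, PartialEq)]
enum SubstraitType {
    I32,
    // Timestamp without timezone.
    Timestamp,
    // Timezone-aware timestamp; assumed to carry no timezone string.
    TimestampTz,
}

// Forward conversion: the concrete timezone string is dropped here.
fn to_substrait(t: &ArrowType) -> SubstraitType {
    match t {
        ArrowType::Int32 => SubstraitType::I32,
        ArrowType::Timestamp(None) => SubstraitType::Timestamp,
        ArrowType::Timestamp(Some(_)) => SubstraitType::TimestampTz,
    }
}

// Reverse conversion: with no timezone available, a default is substituted.
fn from_substrait(t: &SubstraitType) -> ArrowType {
    match t {
        SubstraitType::I32 => ArrowType::Int32,
        SubstraitType::Timestamp => ArrowType::Timestamp(None),
        SubstraitType::TimestampTz => ArrowType::Timestamp(Some("UTC".to_string())),
    }
}

// True when converting to the intermediate type and back preserves the input.
fn roundtrips(t: &ArrowType) -> bool {
    from_substrait(&to_substrait(t)) == *t
}
```

In this model, `roundtrips` holds for `ArrowType::Timestamp(None)` and `ArrowType::Timestamp(Some("UTC".to_string()))`, but not for `ArrowType::Timestamp(Some("+00:00".to_string()))`, which matches the failure described above.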

To Reproduce

The following test fails. Note that the "time_with_tz" column uses the timezone "+00:00", which is not the string "UTC".

#[test]
fn named_struct_names() -> Result<()> {
    let schema = DFSchemaRef::new(DFSchema::try_from(Schema::new(vec![
        Field::new("int", DataType::Int32, true),
        Field::new(
            "struct",
            DataType::Struct(Fields::from(vec![Field::new(
                "inner",
                DataType::List(Arc::new(Field::new_list_field(DataType::Utf8, true))),
                true,
            )])),
            true,
        ),
        Field::new("trailer", DataType::Float64, true),
        Field::new("time", DataType::Timestamp(TimeUnit::Second, None), true),
        Field::new(
            "time_with_tz",
            DataType::Timestamp(TimeUnit::Second, Some("+00:00".into())),
            true,
        ),
    ]))?);

    let named_struct = to_substrait_named_struct(&schema)?;

    // Struct field names should be flattened DFS style
    // List field names should be omitted
    assert_eq!(
        named_struct.names,
        vec!["int", "struct", "inner", "trailer", "time", "time_with_tz"]
    );

    let roundtrip_schema =
        from_substrait_named_struct(&test_consumer(), &named_struct)?;
    assert_eq!(schema.as_ref(), &roundtrip_schema);
    Ok(())
}

Expected behavior

No response

Additional context

No response
