Cannot create a table from a pyarrow schema #2030

Open
@DavidEscott

Description

Apache Iceberg version

0.9.0

Please describe the bug 🐞

I can't figure out how one is actually expected to create a table from a pyarrow schema.

In my use case, I have a pandas or polars DataFrame from which I have directly created a pyarrow schema. I would like to create a new, empty Iceberg table from this fresh pyarrow schema. It has no field IDs, and I would like pyiceberg to generate them.

io.pyarrow.pyarrow_to_schema would seem to be the way to convert the schema to an Iceberg schema. It first calls the _HasIds() visitor, which seemingly returns False because the schema doesn't have IDs.

Next it tries to apply the name_mapping, but I don't want to provide a name_mapping.

It then fails with an error saying that the Iceberg table does not have 'schema.name-mapping.default' defined, which is circular, as I don't have a table yet.


It seems like pyarrow_to_schema is missing a case where one would apply the _ConvertToIcebergWithoutIDs visitor to the arrow schema, as is done in _pyarrow_to_schema_without_ids.
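
To make the control flow described above concrete, here is a minimal pure-Python sketch. It is not pyiceberg's code: ArrowField, IcebergField, has_ids, to_iceberg and the assign_fresh_ids flag are all hypothetical stand-ins invented for illustration, and numbering fresh IDs sequentially from 1 is an assumption of the sketch. It only models the three outcomes described in this report (IDs present, name mapping supplied, neither) plus the extra branch being asked for.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ArrowField:
    """Hypothetical stand-in for an Arrow field; field_id is None when no ID is attached."""
    name: str
    type_name: str
    field_id: Optional[int] = None


@dataclass(frozen=True)
class IcebergField:
    """Hypothetical stand-in for an Iceberg field, which always has an ID."""
    field_id: int
    name: str
    type_name: str


def has_ids(fields: List[ArrowField]) -> bool:
    # Plays the role the report attributes to the _HasIds() visitor.
    return all(f.field_id is not None for f in fields)


def to_iceberg(
    fields: List[ArrowField],
    name_mapping: Optional[Dict[str, int]] = None,
    assign_fresh_ids: bool = False,  # hypothetical flag, not a pyiceberg parameter
) -> List[IcebergField]:
    if has_ids(fields):
        # Case 1: the Arrow schema already carries field IDs.
        return [IcebergField(f.field_id, f.name, f.type_name) for f in fields]
    if name_mapping is not None:
        # Case 2: look up IDs by column name.
        return [IcebergField(name_mapping[f.name], f.name, f.type_name) for f in fields]
    if assign_fresh_ids:
        # Case 3 (the branch this report asks about): generate new IDs.
        return [IcebergField(i, f.name, f.type_name) for i, f in enumerate(fields, start=1)]
    # Without case 3, a schema with no IDs and no mapping can only fail here.
    raise ValueError("schema has no field IDs and no name mapping was provided")


fields = [ArrowField("id", "int64"), ArrowField("name", "string")]
converted = to_iceberg(fields, assign_fresh_ids=True)
```

In this sketch, calling to_iceberg(fields) with neither a name mapping nor the flag raises ValueError, which mirrors the failure described above for a schema that does not yet belong to any table.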

Is there some reason we are not supposed to build tables this way?

Willingness to contribute

  • I can contribute a fix for this bug independently
  • I would be willing to contribute a fix for this bug with guidance from the Iceberg community
  • I cannot contribute a fix for this bug at this time
