UnicodeEncodeError in mdio_to_segy when EBCDIC header contains non-ASCII characters #549

Open
ajmalrasi opened this issue May 7, 2025 · 0 comments

Issue

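When converting MDIO to SEG-Y with mdio_to_segy(), the export fails with a UnicodeEncodeError if the textual (EBCDIC) header contains non-ASCII characters. In the traceback below, the offending character is an en dash (U+2013) at position 426 of the header text.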

Traceback

Traceback (most recent call last):
  File "/home/dask/multidimio-trace-selector/client.py", line 593, in <module>
    status, message, flag_data = main()
  File "/home/dask/multidimio-trace-selector/client.py", line 482, in main
    max_cpu_mdio_to_sgy, mem_usage_mdio_to_sgy, cut_segy_file_size = create_segy(code_runtime_logger, MDIO_FILE, SEGY_DESTINATION, TEMP_DESTINATION, client, cut_mask)
  File "/home/dask/multidimio-trace-selector/client.py", line 95, in create_segy
    mdio_to_segy(
  File "/usr/local/lib/python3.10/site-packages/mdio/converters/mdio.py", line 121, in mdio_to_segy
    mdio, segy_factory = feature.result()
  File "/usr/local/lib/python3.10/site-packages/distributed/client.py", line 401, in result
    return self.client.sync(self._result, callback_timeout=timeout)
  File "/usr/local/lib/python3.10/site-packages/mdio/segy/creation.py", line 97, in mdio_spec_to_segy
    text_bytes = factory.create_textual_header(text_str)
  File "/usr/local/lib/python3.10/site-packages/segy/factory.py", line 140, in create_textual_header
    return text_spec.encode(text)
  File "/usr/local/lib/python3.10/site-packages/segy/schema/text_header.py", line 98, in encode
    return self.processor.encode(string)
  File "/usr/local/lib/python3.10/site-packages/segy/schema/text_header.py", line 42, in encode
    buffer = text.encode("ascii")
UnicodeEncodeError: 'ascii' codec can't encode character '\u2013' in position 426: ordinal not in range(128)
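
The last frame of the traceback shows the header text being passed to `text.encode("ascii")`, which rejects any character above code point 127. As a possible stopgap, the header text could be reduced to ASCII before export. The snippet below is only a sketch of that idea: `to_ascii_header`, `PUNCTUATION`, and the sample header line are hypothetical and not part of mdio or segy, and where such a step would best be applied in the export path is not established here. It replaces each character one-for-one so the fixed header length is preserved.

```python
import unicodedata

# Hypothetical mapping of common typographic punctuation to ASCII stand-ins.
PUNCTUATION = {
    "\u2013": "-",  # en dash (the character reported in the traceback)
    "\u2014": "-",  # em dash
    "\u2018": "'",  # left single quote
    "\u2019": "'",  # right single quote
    "\u201c": '"',  # left double quote
    "\u201d": '"',  # right double quote
}


def to_ascii_header(text: str, placeholder: str = "?") -> str:
    """Hypothetical helper: return an ASCII-only copy of `text`, same length.

    Each non-ASCII character is replaced by exactly one ASCII character so
    that a fixed-width textual header keeps its layout.
    """
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        elif ch in PUNCTUATION:
            out.append(PUNCTUATION[ch])
        else:
            # Decompose (e.g. an accented letter into base letter + accent)
            # and keep the ASCII part only if it is a single character.
            ascii_part = [c for c in unicodedata.normalize("NFKD", ch) if ord(c) < 128]
            out.append(ascii_part[0] if len(ascii_part) == 1 else placeholder)
    return "".join(out)


# Made-up header line containing an en dash, mimicking the reported failure.
sample = "C01 SURVEY 2010\u20132012"

try:
    sample.encode("ascii")
    raised = False
except UnicodeEncodeError:
    raised = True  # same error class as in the traceback

cleaned = to_ascii_header(sample)
encoded = cleaned.encode("ascii")  # no longer raises
```

This loses information (the en dash becomes a plain hyphen, unmapped symbols become `?`), so it is a workaround rather than a fix for the underlying encoding step.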

Sample File
A sample SEG-Y file that reproduces the error:
gs://tgs-geophysical-test-samples/AP_Data/900_General/WeirdCharinEBCDIC.sgy

Ingestion Parameters

{
  'segy_path': 'gs://tgs-geophysical-test-samples/AP_Data/900_General/WeirdCharinEBCDIC.sgy',
  'mdio_path_or_buffer': 'gs://tgs-geophysical-test-samples/AJ/7707/7707_WeirdCharinEBCDIC.mdio',
  'index_names': ('inline', 'xline'),
  'index_bytes': (189, 193),
  'index_types': ('int32', 'int32'),
  'chunksize': [128, 128, 128],
  'overwrite': True
}