Disrupted bootstrapping with ctrl-C leads to FileNotFoundError #2232

Open
perezja opened this issue Jan 30, 2025 · 1 comment
perezja commented Jan 30, 2025

I am using a conda environment for my flow. After I interrupted a run with Ctrl-C during the bootstrapping step, every subsequent run fails with:

`FileNotFoundError: [Errno 2] No such file or directory: '/mnt/fsx/homes/japerez/.conda_envs/metaflow/linux-64/2e4cd95c23c6862/bin/python'`

Apparently, some metadata persisted from the interrupted run and tells the orchestrator that the conda environment was bootstrapped successfully. The orchestrator then looks for the environment at a hashed path that doesn't exist.
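To illustrate the kind of guard I have in mind, here is a minimal sketch (not Metaflow's actual code): before trusting any cached "bootstrapped" state, confirm the interpreter is really on disk. The helper name `env_is_usable` and the `bin/python` layout check are my own assumptions.

```python
import os
import tempfile


def env_is_usable(env_prefix: str) -> bool:
    """Return True only if <env_prefix>/bin/python exists and is executable.

    Hypothetical helper: a cached "bootstrapped" marker alone is not trusted.
    """
    python_path = os.path.join(env_prefix, "bin", "python")
    return os.path.isfile(python_path) and os.access(python_path, os.X_OK)


# Demo with throwaway directories (no real conda environment involved).
with tempfile.TemporaryDirectory() as tmp:
    # Simulates an interrupted bootstrap: directory exists, no interpreter.
    half_built = os.path.join(tmp, "half_built")
    os.makedirs(half_built)
    broken_ok = env_is_usable(half_built)

    # Simulates a finished bootstrap: an executable bin/python is present.
    complete = os.path.join(tmp, "complete")
    os.makedirs(os.path.join(complete, "bin"))
    fake_python = os.path.join(complete, "bin", "python")
    with open(fake_python, "w") as fh:
        fh.write("#!/bin/sh\n")
    os.chmod(fake_python, 0o755)
    complete_ok = env_is_usable(complete)

print(broken_ok, complete_ok)
```

If a check like this failed, the environment could be rebuilt instead of handing a missing path to `subprocess.Popen`.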

Seems like a pretty straightforward issue to fix.

As a workaround, I had to manually add an extra package (e.g., pyyaml) to trigger a new bootstrapping event. After that, my flow launched successfully.

@conda_base(
    python='3.11',
    packages={
        'pandas': '2.2.3',
        'pyarrow': '18.1.0',
        's3fs': '2024.12.0',
        'pydantic-settings': '2.7.1',
        'pyyaml': '*'
    }
)
class DataStoreManifestFlow2(FlowSpec):
    ...
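My guess at why adding a package helps: the directory name in the error (`2e4cd95c23c6862`) looks like a hash of the environment spec, so changing the package set yields a new ID and forces a fresh bootstrap. A minimal sketch of that idea follows. The function `env_id` and the SHA-1-over-JSON scheme are illustrative assumptions, not necessarily how Metaflow computes the ID.

```python
import hashlib
import json


def env_id(python: str, packages: dict) -> str:
    # Hypothetical scheme: hash a canonical JSON dump of the spec and
    # truncate it to a short directory-friendly ID.
    spec = json.dumps({"python": python, "packages": packages}, sort_keys=True)
    return hashlib.sha1(spec.encode("utf-8")).hexdigest()[:15]


base = {
    "pandas": "2.2.3",
    "pyarrow": "18.1.0",
    "s3fs": "2024.12.0",
    "pydantic-settings": "2.7.1",
}
before = env_id("3.11", base)
after = env_id("3.11", {**base, "pyyaml": "*"})

# A different spec maps to a different ID, so the stale directory is bypassed.
print(before != after)
```

Under this assumption the stale entry for the old ID is never repaired. It is only sidestepped, which matches what I saw.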

version

name : metaflow
version : 2.13.3
description : Metaflow: More Data Science, Less Engineering

command

python src/run_generate_manifest.py --environment=conda run

error

Metaflow 2.13.3 executing DataStoreManifestFlow2 for user:japerez
Validating your flow...
The graph looks good!
Running pylint...
Pylint not found, so extra checks are disabled.
2025-01-30 00:18:03.027 Bootstrapping virtual environment(s) ...
2025-01-30 00:18:03.071 Virtual environment(s) bootstrapped!
2025-01-30 00:18:03.483 Workflow starting (run-id 5778):
INFO:botocore.credentials:Found credentials from IAM Role: ds_general_compute_instance_role_phi
2025-01-30 00:18:04.482 Workflow failed.
2025-01-30 00:18:04.482 Terminating 0 active tasks...
2025-01-30 00:18:04.482 Flushing logs...
Internal error
Traceback (most recent call last):
  File "/mnt/fsx/homes/japerez/.cache/pypoetry/virtualenvs/cbio-automation-xFts6JfH-py3.11/lib/python3.11/site-packages/metaflow/cli.py", line 629, in main
    start(auto_envvar_prefix="METAFLOW", obj=state)
  File "/mnt/fsx/homes/japerez/.cache/pypoetry/virtualenvs/cbio-automation-xFts6JfH-py3.11/lib/python3.11/site-packages/metaflow/_vendor/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/fsx/homes/japerez/.cache/pypoetry/virtualenvs/cbio-automation-xFts6JfH-py3.11/lib/python3.11/site-packages/metaflow/_vendor/click/core.py", line 782, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/mnt/fsx/homes/japerez/.cache/pypoetry/virtualenvs/cbio-automation-xFts6JfH-py3.11/lib/python3.11/site-packages/metaflow/cli_components/utils.py", line 69, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/fsx/homes/japerez/.cache/pypoetry/virtualenvs/cbio-automation-xFts6JfH-py3.11/lib/python3.11/site-packages/metaflow/_vendor/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/fsx/homes/japerez/.cache/pypoetry/virtualenvs/cbio-automation-xFts6JfH-py3.11/lib/python3.11/site-packages/metaflow/_vendor/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/fsx/homes/japerez/.cache/pypoetry/virtualenvs/cbio-automation-xFts6JfH-py3.11/lib/python3.11/site-packages/metaflow/tracing/__init__.py", line 27, in wrapper_func
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/fsx/homes/japerez/.cache/pypoetry/virtualenvs/cbio-automation-xFts6JfH-py3.11/lib/python3.11/site-packages/metaflow/cli_components/run_cmds.py", line 127, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/fsx/homes/japerez/.cache/pypoetry/virtualenvs/cbio-automation-xFts6JfH-py3.11/lib/python3.11/site-packages/metaflow/_vendor/click/decorators.py", line 33, in new_func
    return f(get_current_context().obj, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/fsx/homes/japerez/.cache/pypoetry/virtualenvs/cbio-automation-xFts6JfH-py3.11/lib/python3.11/site-packages/metaflow/cli_components/run_cmds.py", line 360, in run
    runtime.execute()
  File "/mnt/fsx/homes/japerez/.cache/pypoetry/virtualenvs/cbio-automation-xFts6JfH-py3.11/lib/python3.11/site-packages/metaflow/runtime.py", line 512, in execute
    self._launch_workers()
  File "/mnt/fsx/homes/japerez/.cache/pypoetry/virtualenvs/cbio-automation-xFts6JfH-py3.11/lib/python3.11/site-packages/metaflow/runtime.py", line 964, in _launch_workers
    self._launch_worker(task)
  File "/mnt/fsx/homes/japerez/.cache/pypoetry/virtualenvs/cbio-automation-xFts6JfH-py3.11/lib/python3.11/site-packages/metaflow/runtime.py", line 990, in _launch_worker
    worker = Worker(task, self._max_log_size, self._config_file_name)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/fsx/homes/japerez/.cache/pypoetry/virtualenvs/cbio-automation-xFts6JfH-py3.11/lib/python3.11/site-packages/metaflow/runtime.py", line 1601, in __init__
    self._proc = self._launch()
                 ^^^^^^^^^^^^^^
  File "/mnt/fsx/homes/japerez/.cache/pypoetry/virtualenvs/cbio-automation-xFts6JfH-py3.11/lib/python3.11/site-packages/metaflow/runtime.py", line 1666, in _launch
    return subprocess.Popen(
           ^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/lib/python3.11/subprocess.py", line 1026, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/opt/miniconda3/lib/python3.11/subprocess.py", line 1950, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: '/mnt/fsx/homes/japerez/.conda_envs/metaflow/linux-64/2e4cd95c23c6862/bin/python'

@savingoyal
Collaborator

thanks for the issue. we are taking a look!
