Skip to content

Add support for Pbs #69

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 31 commits into
base: master
Choose a base branch
from
Open

Add support for Pbs #69

wants to merge 31 commits into from

Conversation

ekouts
Copy link
Collaborator

@ekouts ekouts commented May 20, 2025

This is still work-in-progress

@ekouts ekouts marked this pull request as ready for review May 26, 2025 10:43
@ekouts ekouts requested review from Copilot, jpdorsch and theely May 26, 2025 10:56
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Adds comprehensive PBS scheduler support across the codebase, including model adjustments, exception handling, client integration, API schema extensions, updated shell scripts for job submission, and Docker-based PBS cluster provisioning.

  • Core models and exceptions updated to support PBS entities and optional fields.
  • Firecrest API endpoints and response models extended to include PBS types.
  • Shell scripts now fall back to PBS_JOBID, and Docker Compose plus Dockerfiles configured for a local PBS cluster.

Reviewed Changes

Copilot reviewed 34 out of 34 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
src/lib/scheduler_clients/models.py Made JobTime.limit optional to accommodate missing values.
src/lib/scheduler_clients/init.py Imported PBS models and updated __all__.
src/lib/exceptions.py Introduced new PbsError exception subclass.
src/firecrest/status/models.py Extended node/partition/reservation responses to include PBS types.
src/firecrest/filesystem/transfer/scripts/slurm_job_move.sh Added PBS_JOBID fallback in echo statement.
src/firecrest/filesystem/transfer/scripts/slurm_job_extract.sh Added PBS_JOBID fallback in echo statement.
src/firecrest/filesystem/transfer/scripts/slurm_job_downloader.sh Added PBS_JOBID fallback in echo statement.
src/firecrest/filesystem/transfer/scripts/slurm_job_delete.sh Added PBS_JOBID fallback in echo statement.
src/firecrest/filesystem/transfer/scripts/slurm_job_copy.sh Added PBS_JOBID fallback in echo statement.
src/firecrest/filesystem/transfer/scripts/slurm_job_compress.sh Added PBS_JOBID fallback in echo statement.
src/firecrest/filesystem/transfer/router.py Added trailing commas and formatting; duplicated work_dir logic.
src/firecrest/dependencies.py Registered PbsCliClient for SchedulerType.pbs.
src/firecrest/config.py Added pbs to SchedulerType enum.
src/firecrest/compute/models.py Extended submit/response models to include PBS job types.
f7t-api-config.local-env.yaml Added a cluster-pbs entry with PBS-specific directives.
docker-compose.yml Added pbs service and commented-out platform entry.
build/docker/pbs-cluster/supervisord.conf Created supervisord configuration for PBS services.
build/docker/pbs-cluster/ssh/sshd_config_base Base SSHD config with PBS cluster settings.
build/docker/pbs-cluster/init_sequence.sh Initialization script for PBS postinstall and service startup.
build/docker/pbs-cluster/Dockerfile Dockerfile to build an Ubuntu-based PBS cluster image.

Comment on lines +158 to +160
work_dir = next(
iter([fs.path for fs in system.file_systems if fs.default_work_dir]), None
)
Copy link
Preview

Copilot AI May 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The logic to compute work_dir using next(iter(...), None) is repeated multiple times; consider extracting it into a helper function to reduce duplication.

Suggested change
work_dir = next(
iter([fs.path for fs in system.file_systems if fs.default_work_dir]), None
)
work_dir = _get_default_work_dir(system.file_systems)

Copilot uses AI. Check for mistakes.


RUN useradd -ms /bin/bash pbsuser

RUN apt-get update && apt-get install -y postgresql postgresql-contrib && rm -rf /var/lib/apt/lists/*
Copy link
Preview

Copilot AI May 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are multiple consecutive apt-get update commands; consider combining package installs into a single RUN block to reduce image layers and speed up builds.

Copilot uses AI. Check for mistakes.



PubkeyAcceptedKeyTypes ssh-rsa,ssh-rsa-cert-v01@openssh.com,ssh-ed25519-cert-v01@openssh.com,ssh-ed25519
PermitRootLogin yes
Copy link
Preview

Copilot AI May 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate PermitRootLogin directive found (also on line 23); consolidate or remove redundant entries to avoid ambiguity.

Suggested change
PermitRootLogin yes

Copilot uses AI. Check for mistakes.

ekouts and others added 3 commits May 26, 2025 13:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants