Releases: aws-samples/aws-eda-slurm-cluster

aws-eda-slurm-cluster v2.12.1

14 Feb 22:37
b04d0d2

What's Changed

  • Hot fix

Bug Fixes

  • Bug #309: cron job to update users_groups.json broken

Full Changelog: v2.12.0...v2.12.1

aws-eda-slurm-cluster v2.12.0

14 Feb 15:27
fac7a65

What's Changed

  • Configure OCI containers
  • Add an option to enable the Pyxis plugin so that Docker containers can be run.
  • Update installer to monitor status of ParallelCluster stack after configuration stack is updated.
  • Various Exostellar Infrastructure Optimizer configuration fixes.
  • Enable ENA Express if compute node supports it.

New Features

  • Feature #259: Build pyxis spack plugin on login nodes
  • Feature #291: Add support for rootless OCI containers
  • Feature #305: Enable custom code in on_compute_node_configured.sh
  • Feature #308: Add support for ENA express

Bug Fixes

  • Bug #301: Xio without RES or extra mounts throws an exception
  • Bug #306: Set the default volume size to the size of the root volume in the VM image AMI

Full Changelog: v2.11.0...v2.12.0

aws-eda-slurm-cluster v2.11.0

02 Jan 18:23
2c93c97

What's Changed

  • Add support for ParallelCluster 3.12.0
  • Change the RealMemory of compute nodes to match the total memory of the instance type. This prevents users from accidentally requesting twice as much memory as they want when they do not specify 95% of the actual memory.
  • Document how to configure Slurm accounting.
  • Fix XIO bugs, enhance configuration support
  • Update RES templates for the latest RES version. Change the Keycloak instance type from t3.micro to c7a.medium for stability.
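
The RealMemory change can be illustrated with a small sketch. It is not code from this project and only approximates how a scheduler picks a node: it assumes two hypothetical node sizes (16 GiB and 32 GiB) and a hypothetical helper, smallest_fitting_node, that returns the smallest node whose advertised memory covers the request. Under those assumptions, a request for a full 16 GiB lands on the 32 GiB node when RealMemory is 95% of the total, and on the 16 GiB node when RealMemory equals the total.

```python
# Simplified model, not the project's implementation: pick the smallest node
# whose advertised RealMemory is large enough for the requested memory.

def smallest_fitting_node(request_mb, node_memory_mb, real_memory_fraction):
    """Return the total memory (MB) of the smallest node that fits, or None."""
    for total_mb in sorted(node_memory_mb):
        # RealMemory advertised to the scheduler for this node size.
        real_memory_mb = int(total_mb * real_memory_fraction)
        if real_memory_mb >= request_mb:
            return total_mb
    return None


# Hypothetical node sizes, chosen only for illustration.
NODE_SIZES_MB = [16 * 1024, 32 * 1024]

# A user asks for the full memory of the smaller node.
request_mb = 16 * 1024

node_at_95_percent = smallest_fitting_node(request_mb, NODE_SIZES_MB, 0.95)
node_at_100_percent = smallest_fitting_node(request_mb, NODE_SIZES_MB, 1.0)

print(node_at_95_percent, node_at_100_percent)
```

In this model, the 95% setting doubles the memory of the selected node for the same request, which is the situation the change is described as preventing.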

New Features

  • Feature #272: Improve documentation for ClusterConfig section
  • Feature #275: Update RES templates for latest version
  • Feature #277: Enable spot for only certain InstanceTypes
  • Feature #282: Document the command11SubmitterDeconfigure command in the instructions
  • Feature #283: Change RealMemory of compute nodes to match total instance type memory
  • Feature #295: Add support for ParallelCluster 3.12.0

Bug Fixes

  • Bug #253: ParallelCluster incorrectly requiring FSxZ egress rules
  • Bug #280: Unable to create 3.10.1 cluster
  • Bug #288: External login node can't access slurmdbd
  • Bug #293: Use of uninitialized variable

Full Changelog: v2.10.0...v2.11.0

aws-eda-slurm-cluster v2.10.0

02 Jan 18:07
ada1a31

What's Changed

  • Add support for ParallelCluster 3.11.1.
  • Add support for Exostellar Infrastructure Optimizer
  • Update default instance types to use latest instance types. Use more instance types if only using on-demand or spot.
  • Update lambdas from Python 3.9 to 3.12.

New Features

  • Feature #226: Add Exostellar support
  • Feature #268: Add support for ParallelCluster 3.11.1

Bug Fixes

  • Bug #266: Document which python program needs to be updated if I need to create the user/group json file a different way
  • Bug #267: slurmctld log shows "error: Node XXXXXXXXXX appears to have a different slurm.conf than the slurmctld."
  • Fix bug in XIO configuration
  • Require instances to have at least 4GB of memory.

Full Changelog: v2.9.0...v2.10.0

aws-eda-slurm-cluster v2.9.0

02 Jan 17:47
694e464

What's Changed

  • Change the names of the compute resources to include the instance type and also the number of cores and amount of memory. This makes it easier for users to select compute nodes to use for their jobs.

New Features

  • Add UseOnDemand configuration option that is similar to UseSpot so you can configure a cluster without on-demand instances if you want.

Bug Fixes

  • Bug #261: Restore memory based partitions
  • Bug #262: Default excludes incorrect if not using default includes
  • Bug #264: Create partitions with number of cores and amount of memory in name

Full Changelog: v2.8.0...v2.9.0

aws-eda-slurm-cluster v2.8.0

02 Oct 20:51
32aa3c3

New Features

  • #258: Add support for ParallelCluster 3.11.0

v2.7.1

09 Sep 19:09
2d84608

What's Changed

  • Clean up security groups and permissions for extra mounts by @cartalla in #246
  • Update deployment-prerequisites.md by @cartalla in #247

Full Changelog: v2.7.0...v2.7.1

aws-eda-slurm-cluster v2.7.0

09 Sep 19:03
2a533f8

What's Changed

  • Add ParallelCluster 3.10.0, 3.10.1 support by @cartalla in #244

New Features

  • Feature #242: Add support for ParallelCluster 3.10.0
  • Feature #243: Add support for ParallelCluster 3.10.1

Bug Fixes

  • Bug #221: Running install.sh with -cdk-cmd update in rapid succession can damage the cluster

Full Changelog: v2.6.0...v2.7.0

aws-eda-slurm-cluster v2.6.0

09 Sep 18:59
8ee5253

What's Changed

  • Update deployment docs by @cartalla in #234
  • Do not auto-prune instance types if there are too many by @cartalla in #235
  • Support ParallelCluster 3.9.2 and 3.9.3. Fix ansible playbooks. by @cartalla in #241

New Features

  • Feature #236: Add support for ParallelCluster 3.9.2
  • Feature #240: Add support for ParallelCluster 3.9.3

Bug Fixes

  • Bug #220: Reducing the number of compute resources too aggressively
  • Bug #222: Documentation corrections required on deploy-parallel-cluster documentation
  • Bug #238: HeadNode fails to configure due to an Ansible change. on_head_node_configured.sh fails because Ansible has deprecated ansible.builtin.include
  • Bug #239: Documentation update: location of licenses is incorrect on doc page

Full Changelog: v2.5.0...v2.6.0

aws-eda-slurm-cluster v2.5.0

09 Sep 18:52
8dff7cd

What's Changed

  • Add support for ParallelCluster versions 3.9.0 and 3.9.1 by @cartalla in #232

New Features

  • Feature #229: Add support for ParallelCluster version 3.9.0 and 3.9.1

Bug Fixes

  • Bug #204: Can only configure 3 clusters on a submitter host
  • Bug #230: Python 3.8 Lambda deprecated on 10/12/2024
    Update lambdas to use new version of python
  • Bug #231: Cluster fails to deploy because create_slurm_accounts.py fails

Full Changelog: v2.4.0...v2.5.0