Base Command Manager / Bright Cluster Manager Release Notes
Release notes for Bright 8.2-25
== General ==
- Known Issues
* On SLES*, updating cmdaemon together with cm-boost leaves cmdaemon temporarily unable to start automatically during the package update. It reports "error while loading shared libraries: libboost_thread.so.1.68.0". The workaround is to start cmdaemon manually after the package update has completed.
- Improvements
* Added CUDA 11.3 packages
* Updated cuda-driver to version 460.73.01
* Updated cuda11.2 to version 11.2.2
* Updated cm-openssl to 1.1.1k
* Added smbk5pwd to openldap-servers package
* Added cm-boost module files
== cmdaemon ==
- Improvements
* Added extra checks for a rare crash in head nodes IPs RPC
* Allow the occupation rate to be sampled for groupings other than partition
* New adv. config. option WlmDefaultDrainMessage, which allows the default drain message to be changed
* New adv. config. option RsyncAlwaysExclude allowing for a global exclude list to be added to all rsyncs
* New adv. config. option SlurmNeverRestart to prevent cmdaemon restarting slurmctld
* Added cmd.service hooks so that starts, stops, and crashes can be reported by means other than email
* By default, perform a daily malloc trim to reduce memory usage, controlled by adv. config. option MallocTrimInterval
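For readers unfamiliar with the term, a "malloc trim" asks the C library to hand free heap memory back to the operating system. The sketch below shows the underlying glibc call, malloc_trim(), invoked from Python through ctypes. It is only an illustration: these notes do not state how cmdaemon performs the trim internally, and the helper name trim_heap is hypothetical. The value format of MallocTrimInterval is not covered here.

```python
import ctypes
import ctypes.util


def trim_heap():
    """Ask glibc to return free heap memory to the OS.

    Returns 1 if memory was released, 0 if none could be released,
    or None when malloc_trim is unavailable (non-glibc platforms).
    Hypothetical helper, for illustration only.
    """
    libc_name = ctypes.util.find_library("c")
    if libc_name is None:
        return None
    try:
        libc = ctypes.CDLL(libc_name)
        malloc_trim = libc.malloc_trim  # glibc-specific symbol
    except (OSError, AttributeError):
        return None
    malloc_trim.argtypes = [ctypes.c_size_t]
    malloc_trim.restype = ctypes.c_int
    # A pad of 0 requests that as much free memory as possible is released.
    return malloc_trim(0)


if __name__ == "__main__":
    print("malloc_trim result:", trim_heap())
```

A long-running daemon that frees large allocations can keep a high resident size until such a trim is performed, which is why running it periodically can reduce reported memory usage.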
- Fixed Issues
* Device status enum metric not showing up for newly added devices
* In some cases, an issue with generating a hostlist expression in [] format
* Ensure FastSchedule option is stripped when upgrading to Slurm > v19
* An issue with cuda-dcgm service segfault with the latest cuda-driver packages
* Ensure that on HA clusters slurmctld is not restarted when cmdaemon is restarted
* An issue with createramdisk with recent kernel versions on centos8/rhel8
* Possible memory corruption when using file customization
* In some cases, an issue with bringing up a VLAN on top of a bond interface
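As background for the hostlist fix above, a hostlist expression in [] format folds numbered node names into ranges, for example node[001-003,005]. The following is a minimal sketch of that folding, assuming names consist of a prefix followed by a zero-padded number. The function compress_hostlist is hypothetical and is not the cmdaemon implementation.

```python
import re
from itertools import groupby


def compress_hostlist(hosts):
    """Fold names such as node001, node002 into node[001-002].

    Hypothetical helper, for illustration only.
    """
    groups = {}  # (prefix, digit width) -> set of numeric suffixes
    plain = []   # names without a trailing number are kept as-is
    for host in hosts:
        match = re.fullmatch(r"(.*?)(\d+)", host)
        if match:
            prefix, digits = match.groups()
            groups.setdefault((prefix, len(digits)), set()).add(int(digits))
        else:
            plain.append(host)

    parts = list(plain)
    for (prefix, width), numbers in sorted(groups.items()):
        ordered = sorted(numbers)
        ranges = []
        # Consecutive numbers share the same (value - position) key.
        for _, run in groupby(enumerate(ordered), key=lambda p: p[1] - p[0]):
            run = [number for _, number in run]
            first, last = run[0], run[-1]
            if first == last:
                ranges.append(f"{first:0{width}d}")
            else:
                ranges.append(f"{first:0{width}d}-{last:0{width}d}")
        if len(ordered) == 1:
            parts.append(prefix + ranges[0])
        else:
            parts.append(f"{prefix}[{','.join(ranges)}]")
    return ",".join(parts)


if __name__ == "__main__":
    print(compress_hostlist(["node001", "node002", "node003", "node005"]))
```

Grouping by digit width keeps zero padding intact, so node9 and node10 are reported separately rather than merged into a range with ambiguous padding.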
== node-installer ==
- Fixed Issues
* Added fips dracut module to createramdisk for rhel and centos when available, to support fips in the original ramdisk
== cuda-dcgm ==
- Fixed Issues
* An issue with cuda-dcgm service segfault with the latest cuda-driver packages
== ml ==
- New Features
* Introduced ML package cm-dynet for CUDA 11.2
* Introduced ML package cm-fastai2 for CUDA 10.2 and CUDA 11.2
* Introduced ML package cm-gpytorch for CUDA 11.2
* Introduced ML package cm-ml-distdeps for CUDA 11.2
* Introduced ML package cm-nccl2 for CUDA 11.2
* Introduced ML package cm-pytorch-extra for CUDA 10.2 and CUDA 11.2
* Introduced ML package cm-pytorch for CUDA 11.2
* Introduced package cm-chainer for CUDA 11.2
* Introduced package cm-cub for CUDA 11.2
* Introduced package cm-cudnn8.0 for CUDA 11.2
* Introduced package cm-cudnn8.1 for CUDA 10.2
* Introduced package cm-cudnn8.1 for CUDA 11.2
* Introduced package cm-cutensor for CUDA 10.2 and CUDA 11.2
* Introduced package cm-ml-pythondeps for CUDA 11.2
* Introduced package cm-opencv4 for CUDA 11.2
* Introduced package cm-openmpi-geib for CUDA 11.2
* Introduced package cm-tensorflow2-extra-* for CUDA 10.2
* Introduced package cm-xgboost for CUDA 11.2
* Updated cm-cub-* packages to v1.12.0
* Updated cm-cudnn8.1-* packages to v8.1.1.33
* Updated cm-cutensor-* packages to v1.3.0
* Updated cm-dynet-* packages to v2.1.2
* Updated cm-fastai2-* packages to v2.3.1
* Updated cm-gpytorch-* packages to v1.4.1
* Updated cm-horovod-* packages to v0.21.3
* Updated cm-mxnet-* packages to v1.8.0
* Updated cm-nccl2-* packages to v2.9.8
* Updated cm-opencv3-* packages to v3.4.14
* Updated cm-opencv4-* packages to v4.5.2
* Updated cm-pytorch-* packages to v1.8.1 and moved extra dependencies to cm-pytorch-extra-* packages (e.g. torchvision, torchtext)
* Updated cm-tensorrt-* packages to v7.2.3.4 (cuDNN 8.1)
* Updated cm-xgboost-* packages to v1.4.1
- Improvements
* Stopped upgrading PyTorch and its related ML packages for sles12
* Unified git packages under cm-git
== pbspro2020 ==
- New Features
* Upgraded PBS Pro 2020 to 2020.1.3
== slurm20 ==
- New Features
* Upgraded slurm20 packages to 20.02.7 (CVE-2021-31215)
== slurm20.11 ==
- New Features
* Introduced Slurm 20.11 integration
== wlm-setup ==
- Improvements
* Support for Slurm 20.11