Base Command Manager / Bright Cluster Manager Release Notes
Release notes for Bright 9.1-4
== General ==
- New Features
* Support for Ubuntu 20.04
* Update Docker to v19.03.15
- Improvements
* Added smbk5pwd to openldap-servers package
* cm-nvhpc: updated to version 21.2
* Added mlnx-ofed52 package
- Fixed Issues
* Ensure cm-python37 contains the dmidecode module
* cm-bios-tools: set content-type header info for redfish communication, without changing other header data
* An issue with linking cm-libpam / cm-check-alloc
== cmdaemon ==
- Improvements
* Validation for the partition node basename and digits
* Added advanced configuration option ForkUpdateOOM as a way to prevent dmesg from filling up with cmd OOM messages
* By default, perform a daily malloc trim to reduce memory usage, controlled by advanced configuration option MallocTrimInterval
* In some cases, a Prometheus query could take a long time, use a lot of memory, and time out
* Disable all monitoring recorders during an upgrade
* Map jobs to AMD GPUs
- Fixed Issues
* An issue with network interface bonding options on Ubuntu
* In some cases, an issue with shutting down the director due to the timeout for monitoring backup
* An issue with triggering sync to the passive headnode after cloning an image
* An issue with detecting the nodes of UGE job array tasks in cmsh/BrightView
* Possible memory corruption when using file customization
* Ensure slurmd is restarted along with slurmctld on specific configuration changes
* Ensure SchedulerType from SlurmServerRole is added to slurm.conf
* An issue that can cause excessive Jobs Ended messages, especially when using Job Arrays
* Rare deadlock in Prometheus entity cache trim
* Introduced batching for cases where cmdaemon calls sacct with a large number of job IDs as arguments, to prevent possible crashes
* Setting timezone to an empty string can crash cmsh
* An issue with converting information timeouts from milliseconds to seconds, which can cause long RPC delays
* An issue with UGE job array metric collection on non-joined cgroup layouts
* In some cases with multiple edge sites, a wrong node can be displayed as booting
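
The sacct batching fix listed above can be illustrated with a minimal sketch. This is not cmdaemon's actual implementation: the function names and the batch size of 500 are hypothetical and chosen only for illustration. The sketch relies only on sacct accepting a comma-separated job list through its -j option, and it builds the argument lists without running them.

```python
from typing import Iterator, List, Sequence


def batch_job_ids(job_ids: Sequence[str], batch_size: int = 500) -> Iterator[List[str]]:
    """Yield consecutive slices of job_ids, each at most batch_size long.

    batch_size=500 is an illustrative default, not a value taken from cmdaemon.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(job_ids), batch_size):
        yield list(job_ids[start:start + batch_size])


def build_sacct_commands(job_ids: Sequence[str], batch_size: int = 500) -> List[List[str]]:
    """Return one sacct argument list per batch (commands are built, not executed)."""
    return [
        ["sacct", "-j", ",".join(batch)]
        for batch in batch_job_ids(job_ids, batch_size)
    ]
```

Splitting the ID list this way keeps each command line bounded in length, at the cost of one sacct invocation per batch.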
== node-installer ==
- Fixed Issues
* Allow mkinitrd_cm and install-ipxe to recognize sle-hpc as a valid SLES distribution
== cmsh ==
- Improvements
* Tab completion for the -t / --type option
* Added timeout to cmsh power command
* Fixed an issue with cmsh packages installed size for Ubuntu
== ml ==
- New Features
* Introduced package cm-cudnn8.0 for CUDA 11.2
* Introduced package cm-cutensor for CUDA 10.2 and CUDA 11.2
* Introduced package cm-ml-pythondeps for CUDA 11.2
* Introduced package cm-ml-distdeps for CUDA 11.2
* Introduced package cm-cudnn8.1 for CUDA 10.2
* Introduced package cm-cudnn8.1 for CUDA 11.2
* Introduced package cm-tensorflow2-extra-* for CUDA 10.2
* Updated cm-tensorflow2-* packages to v2.4.1
* Updated cm-gpytorch-* packages to v1.3.1
* Updated cm-horovod-* packages to v0.21.3
* Updated cm-opencv3-* packages to v3.4.13
* Updated cm-tensorflow-* packages to v1.15.5 (end-of-life reached)
* Updated cm-xgboost-* packages to v1.3.3
* Updated cm-tensorrt-* packages to v7.2.2
* Updated cm-pytorch-* packages to v1.7.1
* Updated cm-onnx-* packages to v1.8.1
* Updated cm-dynet-* packages to v2.1.2
* Introduced package cm-chainer for CUDA 11.2
* Introduced package cm-xgboost for CUDA 11.2
* Introduced package cm-opencv4 for CUDA 11.2
* Introduced package cm-nccl2 for CUDA 11.2
* Introduced package cm-openmpi-geib for CUDA 11.2
* Introduced cm-opencv4-* packages
- Improvements
* Deprecated cm-bazel package
* Renamed cm-tensorrt-cuda10.2-gcc package to cm-tensorrt-cuda10.2
* Deprecated cm-fastai-* packages
* Deprecated cm-tensorflow-*, cm-horovod-tensorflow-*, cm-onnx-tensorflow-* and cm-keras-* packages
* Switched GCC support for several ML packages from GCC 5 (-gcc) to GCC 8 (-gcc8)
* Introduced package cm-fastai2 for CUDA 10.2