Base Command Manager / Bright Cluster Manager Release Notes

Release notes for Bright 9.2-3

== General ==

- Improvements

* Upgrade openssl to 3.0.3
* Add Mellanox 5.6 OFED stack (mlnx-ofed56 packages)

- Fixed Issues

* cm-kubernetes: Increase the resource requests and limits for flannel to prevent the cgroups from running out of memory
* cm-kubernetes: Make the https://:30443/dashboard ingress redirect to /dashboard/, which resolves an issue where the browser shows an empty page instead of the dashboard

== cmdaemon ==

- New Features

* Automatic detection and configuration of the Slurm generic resources (GRES) settings for Multi-Instance GPU (MIG) devices

- Improvements

* Include the command line arguments in the information events generated by cmdaemon when a kubectl command times out
* Modifying a network in cmdaemon that is used by Kubernetes will now request the relevant Kubernetes services to update their configuration and restart
* Add capability to monitor and capture CUDA GPU XID errors
* Enable the cmdaemon API calls to constrain an HPC job's location

- Fixed Issues

* An issue where the chargeback filter for user project managers does not return all matches
* Speed up the batch operations to hold, suspend, release, or remove WLM jobs, which resolves an issue with RPC errors reported by cmsh
* An issue where password crypt can generate duplicate edge site secret hashes
* Add a workaround in cmdaemon for an issue where some older base distribution versions of openssl are unable to create FIPS-compliant DH parameters during add-on installation
* Remove the JobInformation rows from the database when removing a WLM setup
* An issue where WLM jobs are no longer cached in cmdaemon after head node HA takeover
* Send the rsyslog log to both head nodes for on-prem compute nodes
* An issue where the UGE server is started on both directors in an edge HA setup, instead of only on the active one
* An issue where the active UGE WLM server does not migrate to the active cmdaemon head node

== cm-create-image ==

- Fixed Issues

* An issue where the sanity checks fail for archives created with leading "./" in the filenames
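Archives created with a command such as `tar -C <dir> -cf image.tar .` store member names with a leading "./". The following is a minimal sketch, not the actual cm-create-image sanity check, of how such names can be normalized before comparing them against expected paths. The function names and the demo file names are hypothetical.

```python
import io
import tarfile


def normalized_member_names(archive: tarfile.TarFile) -> list[str]:
    """Return member names with any leading "./" removed, skipping the root entry."""
    names = []
    for member in archive.getmembers():
        name = member.name
        while name.startswith("./"):
            name = name[2:]
        if name not in ("", "."):
            names.append(name)
    return names


def build_demo_archive() -> bytes:
    """Build an in-memory tar archive whose member names carry a leading "./"."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        # Hypothetical files, chosen only for illustration.
        for name, data in (("./etc/os-release", b"ID=demo\n"), ("./bin/sh", b"")):
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()
```

A check that looks for `etc/os-release` can then be run against the normalized names, regardless of whether the archive was created with or without the "./" prefix.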

== cm-kubernetes-setup ==

- Improvements

* Use the --overwrite command line flag when running kubectl taint to avoid errors when the taint already exists
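As an illustration of the --overwrite behavior, the sketch below builds and runs a `kubectl taint` invocation from Python. It is not the code used by cm-kubernetes-setup. The helper names and the example node name, key, and value are assumptions made for this example.

```python
import subprocess


def taint_command(node: str, key: str, value: str, effect: str = "NoSchedule") -> list[str]:
    """Build a kubectl taint command that replaces an existing taint instead of failing."""
    return ["kubectl", "taint", "nodes", node, f"{key}={value}:{effect}", "--overwrite"]


def apply_taint(node: str, key: str, value: str, effect: str = "NoSchedule") -> str:
    """Run the command; requires kubectl and cluster access on the calling host."""
    result = subprocess.run(
        taint_command(node, key, value, effect),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Without --overwrite, kubectl rejects an update to a taint that already exists with the same key and effect, so a setup step that is run a second time would report an error.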

- Fixed Issues

* Ensure swap is disabled on the compute nodes running Kubernetes
* Allow shorewall traffic between calico (cali+) wildcard interfaces to be routed back to the same interface. This resolves an issue where some services are unable to connect and report a timeout
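For the swap item above, one way to verify the result on a compute node is to inspect /proc/swaps, which lists the active swap areas below a single header line. This is a minimal sketch under that assumption, not the check performed by cm-kubernetes-setup. The function names are hypothetical.

```python
from pathlib import Path


def active_swap_devices(proc_swaps_text: str) -> list[str]:
    """Return the swap area names listed in /proc/swaps content (first line is a header)."""
    lines = proc_swaps_text.splitlines()[1:]
    return [line.split()[0] for line in lines if line.strip()]


def swap_is_disabled(path: str = "/proc/swaps") -> bool:
    """Return True when no swap areas are active on this Linux host."""
    return not active_swap_devices(Path(path).read_text())
```

By default the kubelet refuses to start when swap is enabled, so a node that still lists a swap area here is likely to fail to join the cluster.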

== cm-wlm-setup ==

- Improvements

* Enable prejob in the WLM when cm-wlm-setup is run in "express" mode and at least one health check has been selected as a prejob health check

== cmsh ==

- Improvements

* Add job state selection flags for cmsh job operations, so that WLM jobs can be filtered by their state

== pythoncm ==

- Improvements

* Add stdin, stdout, and stderr wrappers to the pythoncm jobinfo
* Add methods to the pythoncm WLM job information for retrieving the latest metric data