Base Command Manager / Bright Cluster Manager Release Notes
Release notes for Bright 6.0-28 (released 2013-01-29)
== cmdaemon ==
* Changed cmd init script. Use /proc/<pid>/oom_score_adj file if available, else use /proc/<pid>/oom_adj file to set the OOM adjust value. On newer kernels oom_adj is deprecated.
* Increased database softwareimage.name size to 256 (from 32)
* Added: allow specifying custom cuda path with advanced config: NvidiaSmiCommand, NvidiaSmiLibs, CudaVersion
* Added: burn status to pythoncm
* Added: hold/suspend commands for slurm jobs
* Added: allow certain SCSI entries to be ignored (adv. config: ScsiIgnoreModels)
* Added: AJAX console backend - resize feature
* Added: allow custom power script to set the power info message
* Added: CMCloud to readonly
* Improved: display of single object submodes in cmsh
* Improved: unicode support for pythoncm
* Fixed: lowercase check for runIf in cmsh
* Fixed: allow uppercase in passwords from json
* Fixed: AJAX console backend - issue with closing the rshell window
* Fixed: issue with slurm job time parsing
* Fixed: SGE configuration if ResolveToExternalName=true in cmd.conf
* Fixed: service died events not being logged to event log file
* Fixed: use script timeout when running prototype --initialize
* Fixed: return NaN for metrics when script exit is != 0
* Fixed: category based statistics for constant value
* Fixed: Node certificate regeneration
* Fixed: slurm grestypes being listed twice
* Fixed: send slurm queue update events after config update forces slurm restart
* Fixed: incorrect DEVICE name for BOOTIF:0 alias.
* Fixed: possible cmsh deadlock on set disksetup
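
The OOM adjust fallback from the cmd init script item above can be sketched as follows. This is a minimal illustration in Python, not the actual init script; the function name `pick_oom_file` and the `proc_root` parameter are hypothetical and exist only so the logic can be tried outside of /proc.

```python
import os


def pick_oom_file(pid, proc_root="/proc"):
    """Return the OOM adjust file to write for a process, or None.

    Prefers oom_score_adj (newer kernels) and falls back to the
    deprecated oom_adj. The names pick_oom_file and proc_root are
    hypothetical and only serve this sketch.
    """
    base = os.path.join(proc_root, str(pid))
    for name in ("oom_score_adj", "oom_adj"):
        candidate = os.path.join(base, name)
        if os.path.exists(candidate):
            return candidate
    # Neither file exists, for example because the process is gone
    return None
```

The two files use different value ranges (oom_score_adj: -1000 to 1000, oom_adj: -17 to 15), so the value written has to match the file that was picked.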
== node-installer ==
* Added: BCM_RESET_NIC=X special kernel cmdline parameter which will cause a reset of the NIC and then delay for X seconds before attempting to start the NIC again.
* Fixed: issue with booting/provisioning over bonded interface when a member interface is down
* Fixed: an issue where cloning a disk layout containing swap on software RAID did not work. This affected the setup of failover head nodes with such disk layouts.
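
How a BCM_RESET_NIC=X value could be picked out of the kernel command line is sketched below. This is only an illustration, not the node-installer code: the function name `parse_reset_nic_delay` is hypothetical, and the sketch assumes that X is a non-negative whole number of seconds.

```python
def parse_reset_nic_delay(cmdline):
    """Return the delay in seconds from BCM_RESET_NIC=X, or None.

    cmdline is the kernel command line as one string, for example the
    contents of /proc/cmdline. The function name is hypothetical, and
    treating X as a non-negative integer is an assumption.
    """
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "BCM_RESET_NIC" and sep:
            try:
                delay = int(value)
            except ValueError:
                return None  # value is not a whole number
            return delay if delay >= 0 else None
    # Parameter not present on the command line
    return None
```

A caller would reset the NIC and wait for the returned number of seconds only when the result is not None.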
== node-installer-slave ==
* Added: save clone install log file on the primary node
* Added: preserve custom kernel cmdline options during clone install
== manuals ==
* updated: partitioning defaults depending on disk size during head node installation
* synchronizing/exporting properties across items with cmgui-right-click menu options
* synchronizing/exporting properties across items with cmgui-right-click menu options relocation into correct section
* improved: configuration techniques section moved from post-install software installation to configuration chapter
* improved: margins
== buildmaster ==
* Do not install/include pbspro-green packages
* Added: kernel-firmware package to default dist package list for sles11sp2
* Added: cm-config-dhclient package to slave image/nodes. On RHEL5 and RHEL6 based nodes this prevents dhclient from overwriting the /etc/resolv.conf file.
* Improved: boot partition on raid1 in default raid5 disk layouts for head node
* Fixed: turn on boot.md service for sles, when software raid is used
* Fixed: default exclude-list{sync,grab,grabnew} to explicitly exclude pbspro directories
== cluster-tools ==
* cmsub: allow use of several workload managers in parallel
* Disabled powersave option for pbspro
* Added: support for creating additional shared resources via xml
* Fixed: configure category openlavaclient role
* Fixed: pbspro cert path
* Fixed: issue with wlm-setup workload manager configuration if any cloud provider is set up
* Fixed: cmsub: issue with workload manager specified inside a jobscript
* Fixed: issue with cm-updates-extra.repo.sles11* repo files. Changed from default update repos to extra update repos.
== cm-config-dhclient ==
* For rhel5 based distros, prevent dhclient from overwriting /etc/resolv.conf; added dhclient-enter-hooks file.
== cm-config-dhcp ==
* Fixed: Update dhcp config to sp2 version, set previous settings. Set DHCLIENT_BIN variable to dhclient
== cm-config-intelcompliance-master ==
* Fixes for rhel5/rhel6 /etc/dat.conf
* Fix for rhel5/rhel6 /etc/dat.conf, which is normally a link to /etc/{ofed,rdma}/dat.conf, but if mlnx or qlgc ofed is installed /etc/dat.conf is changed to a file by ofed rpms.
* Changed exclude libraries, for Dell tools i386 libs.
* changes made to intel cluster check config files, for update to 2.0
* updated intel cluster checker version
* update of intel-cluster-runtime from 3.3 to 3.4
== cm-config-intelcompliance-slave ==
* Fixes for rhel5/rhel6 /etc/dat.conf
* Fix for rhel5/rhel6 /etc/dat.conf, which is normally a link to /etc/{ofed,rdma}/dat.conf, but if mlnx or qlgc ofed is installed /etc/dat.conf is changed to a file by ofed rpms.
* updated intel cluster checker version
* update of intel-cluster-runtime from 3.3 to 3.4
== cm-config-ssh ==
* Fixed an issue with logging in using 'zsh' shell on some systems (RHEL6).
== cm-webportal ==
* Fixed: Mail address in user portal's contact information being incorrect
== cmgui ==
* Added: lastname to user properties
* Added: option to remove user home directory
* Added: Shared to slurm job queue
* Added: allow mirrored racks
* Fixed: focus of new windows
* Fixed: node ip clone, negative ips will be set 0.0.0.0
* Fixed: update rackview on device add/update/remove
* Fixed: in some cases cmgui showing cloud nodes in the wrong queue
== cuda42 ==
* Added checks to the verify scripts for whether the cuda-libs and cuda-driver packages are installed. Added logging to the verify scripts.
* Adjustment of lspci command in verify_cuda script.
== cuda50 ==
* Added checks to the verify scripts for whether the cuda-libs and cuda-driver packages are installed. Added logging to the verify scripts.
* Adjustment of lspci command in verify_cuda script.
* Added support to opencl sdk for CUDA device compute capability 3.5
* Replaced the driver version string for shared modules with "current". This makes it possible to switch between different CUDA 5.0 kernel driver versions without having to update the shared packages.
== cvos-config-dhcp ==
* Update dhcp config to sp2 version, set previous settings. Set DHCLIENT_BIN variable to dhclient (this was done in the finalize script, but this package would overwrite the change).
== cvos-config-intelcompliance-master ==
* Fixes for rhel5/rhel6 /etc/dat.conf
* Fix for rhel5/rhel6 /etc/dat.conf, which is normally a link to /etc/{ofed,rdma}/dat.conf, but if mlnx or qlgc ofed is installed /etc/dat.conf is changed to a file by ofed rpms.
* Changed exclude libraries, for Dell tools i386 libs.
* changes made to intel cluster check config files, for update to 2.0
* updated intel cluster checker version
* update of intel-cluster-runtime from 3.3 to 3.4
== cvos-config-intelcompliance-slave ==
* Fixes for rhel5/rhel6 /etc/dat.conf
* Fix for rhel5/rhel6 /etc/dat.conf, which is normally a link to /etc/{ofed,rdma}/dat.conf, but if mlnx or qlgc ofed is installed /etc/dat.conf is changed to a file by ofed rpms.
* updated intel cluster checker version
* update of intel-cluster-runtime from 3.3 to 3.4
== finalise-base ==
* Set rhnsd init script off for rhel5 and rhel6 images. Compute nodes inherit the same system ID from the image, so if they all connect to RHN this leads to a warning on RHN.
* Removed changing/fixing /etc/sysconfig/network/dhcp file. This is configured with cm-config-dhcp package.
* RHEL6 based master and slave: create an empty /etc/dhclient.conf file if it does not exist, to prevent a dangling link (/etc/dhcp/dhclient.conf).
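
The dangling link fix in the last item can be illustrated with a short Python sketch. It assumes that /etc/dhcp/dhclient.conf is a symbolic link pointing at /etc/dhclient.conf, which is how the item reads but is not spelled out. The function name `ensure_link_target` is hypothetical, and this is not the actual finalise script.

```python
import os


def ensure_link_target(link_path, target_path):
    """Create an empty target_path if it does not exist yet.

    Returns True when link_path resolves to an existing file
    afterwards. The function name is hypothetical; the link layout
    is an assumption of this sketch.
    """
    if not os.path.lexists(target_path):
        # Append mode creates the file and never truncates content
        with open(target_path, "a"):
            pass
    # os.path.exists follows symlinks, so it is False for a dangling link
    return os.path.exists(link_path)
```

On a real system the call would look like `ensure_link_target("/etc/dhcp/dhclient.conf", "/etc/dhclient.conf")`. An existing configuration file is left untouched.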
== intel-cluster-checker ==
* Update from 2.0.013 to 2.0.014
* Added CLCK_CONFIG_FILE env variable
* Changed CLCK_NODELIST env variable: cluster checker uses the env variable directly if the node list is not set in the config file or on the command line.
* Fixed: log dir env variable changed for 2.0
* Update from 1.8 to 2.0
== intel-cluster-runtime ==
* update from 3.3 to 3.4
== intel-wired-ethernet-drivers ==
* update of driver versions. e1000e: from 2.1.4 to 2.2.14, igb: from 4.0.17 to 4.1.2, ixgbe: from 3.11.33 to 3.12.6.
== node-installer-nfsroot ==
* Added kernel-firmware package to package list for sles11
* Updated kernel packages for sl5, sl6, rhel6
== pbspro-green ==
* Fixed: get_power_cycle_strategy call
== qlgc-ofed ==
* Changed kernel module build options, removed nfsrdma, for rhel6. Building nfsrdma modules leads to build failures.
* Changed kernel module build options, added nfsrdma, for sles11 and rhel6
== slurm ==
* Fixed: issue with slurm autogenerated section header
* Fixed: issue with header slurmdbdb