Base Command Manager / Bright Cluster Manager Release Notes
Release notes for Bright 6.0-32 (released 2013-06-19)
== cmdaemon ==
* Added /home to cloud director sync exclude list
* Added: max-lease-time 10 for scalemp dhcp configs
* Added: pythoncm example for creating a set of nodes
* Added: getStatisticalData wrapper call to pythoncm.Cluster
* Added: get_device_status_token to power profile
* Added: set_device_status_token for power profile
* Added: drain overview token to readonly profile
* Added: advanced config option DisableQuorum to disable quorum: the passive head node will take over without querying the nodes that are up
* Added: PowerDistributionUnitHeartbeatPort advanced config option
* Added: an extra tag to the disk layout XML format, which allows swap priority to be set. For example: a 12G partition of type linux swap with swap priority 10. Or for raid: members a1 and b2, level 1, with swap priority 10.
* Improved: change default bmc user name to bright and default bmc user id to 4
* Improved: changed openvpn verbosity level in the headnode's client-tunX.conf file from 3 to 1
* Fixed: prevent snmp_sess_select flooding /var/spool/cmd: use snmp large fsset as default
* Fixed: LSF in cloud setups (add vpn ip of head nodes into LSF hosts file)
* Fixed: exclude /cm/shared/apps/lsf/{,var}/conf/hosts from synchronization head node<-->clouddirector
* Fixed: duplication of the chkpnt LSF queue parameter in lsb.queues
* Fixed: LSF cloud host groups settings
* Fixed: networks with empty domainnames being added to postfix config
* Fixed: json cmmain:getProfile call
* Fixed: do not write gateway if headnode has a dhcp IP address on the external network
* Fixed: setting slots in LSF/openlava when slots > 9
* Fixed: allow deviceIsUp healthcheck for mic and gpu
* Fixed: cmd -i parsing of the provisioning role, where the "all images" setting was not always respected
* Fixed: EC2 AMIs being listed for bright versions that do not match the running cmdaemon
* Fixed: Possible crash in pythoncm when removing an object while no longer connected
* Fixed: cmsh crash in cloud mode during connect
* Fixed: slowdown and incorrect behavior for metric collection scripts
* Fixed: clear restart required message when doing 'open -e -n ...'
* Fixed: PBS Pro and TORQUE node drain/undrain when no queue is assigned
* Fixed: in some cases many admin emails were sent if failover could not be completed
== node-installer ==
* Added: cisco ucs check when configuring bmc user
* Added: an extra tag to the disk layout XML format, which allows swap priority to be set. For example: a 12G partition of type linux swap with swap priority 10. Or for raid: members a1 and b2, level 1, with swap priority 10. Swap priority is now included when generating XML from the current disk layout. In addition, this fixes a bug where generating XML would miss any swaps on devices that did not also contain other mounted file systems.
* Fixed: issue where, with requirefullinstallconfirmation set to yes, a node could still do a full install if the disk layout does not match, without asking the user to confirm.
== manuals ==
* added: kernel module selection by cm-create-image
* added: how to acknowledge all events in cmgui
* added: overwriting/preserving DAS filesystem during cmha-setup
* added: updated hybrid cloud setup images
* added: list of environmental variables under cmdaemon
* added: build-config.xml
* fixed: data-aware scheduling (cmsub) only submits jobs to the cloud, not to the regular nodes
* updated: node-lists syntax
* improved: running cmgui with cluster-on-demand
* improved: cluster-on-demand - using HVM AMIs on it
* updated: cloud director provisioning considerations
* improved: added that 40GB is the default storage size in cloud provisioning
* updated: json cmgui
== buildmaster ==
* Added /home/* to the default sync exclude list for all nodes.
* Added: deviceIsUp metric to gpuUnit health configuration by default
* Fixed: pbspro license server param not properly set during remote install
* usb-install: fix issue where usb device does not get mounted
* Fixed: issue where ifaceorder file gets reverted back to the original even after updating the order
== cluster-tools ==
* Added: default value for missing cluster manager ID
* Improved: check whether LSF profile is loaded before submitting cmsub jobs
* cm-create-image: fix reading kernel modules when an empty line exists
* Fixed: simple net init for netmap, which prevents automatic ip guess
* Fixed: fsexport /home not needed for tunnel networks
* cm-register-node: update fsexports for both head nodes in HA setup
* cloud-setup: Changed the default baseaddress of the netmap network from 172.31.0.0 to 172.30.0.0
* cloud network now defaults to 172.31.0.0/16
* cm-register-node: update Master:master on registered node in ha setup
* cm-register-node: make fakeprovisioning node param update generic (ha setups)
* cm-create-image: exit/fail gracefully when user tries to create a software image using the head node as the source
* cm-register-node: fix attribute name 'myhostname'
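
The address changes above (netmap base address moved to 172.30.0.0, cloud network defaulting to 172.31.0.0/16) can be illustrated with a small overlap check. This is only a sketch: the /16 prefix length for the netmap network is an assumption, since the notes give only its base address.

```python
import ipaddress

# Cloud network default, as stated in the notes above.
cloud_net = ipaddress.ip_network("172.31.0.0/16")

# New netmap base address from the notes; the /16 prefix length is an assumption.
netmap_net = ipaddress.ip_network("172.30.0.0/16")

# Previous netmap base address, with the same assumed prefix length.
old_netmap_net = ipaddress.ip_network("172.31.0.0/16")

# True when the two ranges share at least one address.
new_conflict = netmap_net.overlaps(cloud_net)
old_conflict = old_netmap_net.overlaps(cloud_net)

print("new netmap overlaps cloud network:", new_conflict)
print("old netmap overlaps cloud network:", old_conflict)
```

Under the assumed prefix length, the new netmap range no longer overlaps the cloud network default, while the previous base address would have.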
== cm-config-ssh ==
* Fixed: in some cases users' ~/.ssh directory not being created with the proper permissions
== cm-config-ssh-slave ==
* Fixed: in some cases users' ~/.ssh directory not being created with the proper permissions
== cmgui ==
* Added: ssh button for chassis
* Move to 172.30 for netmap and 172.31 for cloudnet
* Improved: shift-click to acknowledge events
* Improved: make ec2 spot requests not persistent
* Improved: Speed up job node selection filter
* Improved: time parsing for workload-job group plots
* Fixed: ip in cloud folder overview
* Fixed: event acknowledgment
== cuda42 ==
* Build nvidia kernel module in /tmp using the cuda driver init script
* Updated verify scripts: unload gcc module before loading cuda/toolkit module, use more CPU cores for compilation, changed check for installed cuda packages.
* Changed /cm/local module paths, to be able to also use more up to date nvidia driver package.
* Fixed MANPATH for the toolkit
* Changed environment modules to use prepend-path instead of append-path.
== cuda50 ==
* Bug fix for pynvml.
* Updated nvidia driver, libs, xorg from 304.54 to 304.84
* Build nvidia kernel module in /tmp using the cuda driver init script
* Update of driver/libs from 304.84/304.54 to 319.23
* Update of TDK from 3.304.4 to 3.304.5
* Changed environment modules to use prepend-path instead of append-path.
* Updated verify scripts: unload gcc module before loading cuda/toolkit module, use more CPU cores for compilation, changed check for installed cuda packages.
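
The effect of switching the environment modules from append-path to prepend-path can be sketched as follows. This is a simplified model of what the two modulefile commands do to a colon-separated variable, not the environment-modules implementation, and the directory names are hypothetical.

```python
def prepend_path(current, entry):
    """Simplified model of prepend-path: put entry at the front of the list."""
    parts = [p for p in current.split(":") if p]
    return ":".join([entry] + parts)


def append_path(current, entry):
    """Simplified model of append-path: put entry at the end of the list."""
    parts = [p for p in current.split(":") if p]
    return ":".join(parts + [entry])


base = "/usr/bin:/bin"
cuda_bin = "/hypothetical/cuda/bin"  # hypothetical directory, for illustration only

prepended = prepend_path(base, cuda_bin)
appended = append_path(base, cuda_bin)

print(prepended)
print(appended)
```

Because search paths are scanned from left to right, a prepended entry takes precedence over a system copy of the same file name, whereas an appended entry is only used when nothing earlier in the list matches.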
== finalise-base ==
* Set installonly_limit to 0 in yum.conf, to prevent yum automatically removing older kernels which can still be used by the software image.
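
A quick way to confirm the setting is to read it back from the configuration. The sketch below parses a sample configuration string rather than a real file; the sample contents and the helper name `installonly_limit` are illustrative assumptions.

```python
import configparser

# Sample yum.conf contents for illustration; a real file has more options.
sample_conf = """[main]
cachedir=/var/cache/yum
installonly_limit=0
"""


def installonly_limit(conf_text):
    """Return installonly_limit from the [main] section, or None if unset."""
    # Interpolation is disabled so that '%' characters in values are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    return parser.getint("main", "installonly_limit", fallback=None)


limit = installonly_limit(sample_conf)
# In yum, a value of 0 disables the limit, so older kernels are not removed automatically.
print("installonly_limit =", limit)
```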
== hdf5 ==
* Fixed libdir and dependency_libs path bug. .la files were pointing to RPM build directory, instead of pointing to /cm/shared/..... directory.
== hdf5-18 ==
* Fixed libdir and dependency_libs path bug. .la files were pointing to RPM build directory, instead of pointing to /cm/shared/..... directory.
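
The kind of problem fixed in the hdf5 and hdf5-18 packages can be detected by checking the `libdir` entry of a libtool `.la` file. The following is a minimal sketch that works on the text of a `.la` file; the sample paths and helper names are hypothetical, and only the `/cm/shared` prefix is taken from the notes.

```python
import re

# libtool .la files contain a line of the form: libdir='/some/path'
LIBDIR_RE = re.compile(r"^libdir='([^']*)'", re.MULTILINE)


def la_libdir(la_text):
    """Return the libdir value from the text of a .la file, or None if absent."""
    match = LIBDIR_RE.search(la_text)
    return match.group(1) if match else None


def points_into(la_text, prefix="/cm/shared"):
    """Return True when libdir lies at or below the given prefix."""
    libdir = la_libdir(la_text)
    if libdir is None:
        return False
    return libdir == prefix or libdir.startswith(prefix + "/")


# Hypothetical samples: one pointing at a build directory, one at the shared tree.
broken_la = "libdir='/hypothetical/rpmbuild/BUILDROOT/lib'\n"
fixed_la = "libdir='/cm/shared/hypothetical/lib'\n"

print(points_into(broken_la))
print(points_into(fixed_la))
```

The same check could be extended to the `dependency_libs` line, which is not covered by this sketch.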
== intel-compilers-2013 ==
* Update from 2013.2 to 2013.3
* Update of module files.
* Update from 2013.3 to 2013.4
== intel-itac ==
* Updated module file. Fixed variable value search and replace in spec file.
== intel-wired-ethernet-drivers ==
* update of ixgbe driver from 3.12.6 to 3.13.10
* Update of ixgbevf driver from 3.14.5 to 3.15.1
* update of igbvf driver from 2.0.4 to 2.3.2
== iperf ==
* Fixed package update problems.
== pgi ==
* Run PGI makelocalrc in the post section of the RPM, because the gcc version can differ per distro release update.
* Updated from 2013-132 to 2013-135
* update from 12.10 to 13.2