[clustervisor-announce] ClusterVisor 1.x Released!

ClusterVisor Announcements clustervisor-announce at lists.advancedclustering.com
Fri Apr 28 17:05:44 CDT 2023


We are proud to announce that the 1.x release of ClusterVisor is now
publicly available for those who want to upgrade from the 0.x release. The
1.x release includes many changes and improvements, such as a revamped
stats database, integration with Slurm, a redesigned web interface, a new
appliance node to help facilitate disaster recovery, and much more. All of
the details can be found in the webinar we made for the new release:
https://www.advancedclustering.com/did-you-miss-our-clustervisor-1-x-launch-webinar-watch-it-here/

Aside from critical bug patches, the 0.x release will not receive any
further updates, so all new features for ClusterVisor will require
upgrading to 1.x. We will continue to support 0.x releases should you wish
to stay on that version. While upgrading from 0.x to 1.x is free, it does
require some migrations to be performed. If this is something you are
interested in, please contact us at support at advancedclustering.com and
one of our technical support agents can guide you through the process of
getting the latest version of ClusterVisor on your cluster.

As a brief reminder (this is also covered at the beginning of the webinar
linked above), the version scheme is a bit different in the 1.x release.
The version is now tracked as 1.YY.MM-BUILD, where YY is the year (23 for
2023), MM is the month (04 for April), and BUILD is the build ID (which
monotonically increments with each build we make of CV). The leading "1."
major version indicates compatibility with your current setup, with no
breaking changes, so all new releases in 1.x can be installed via a
`dnf update`. The rest of this email contains the release notes for all of
the changes made between 0.5.44-5 and 1.23.04-2316, listed in reverse
chronological order.
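
For anyone scripting around the 1.x version scheme described above, here is
a minimal Python sketch of parsing and comparing 1.YY.MM-BUILD strings. It
is not part of ClusterVisor: the names CVVersion, parse_cv_version, and
is_compatible_upgrade are hypothetical, and the rule "same major version
and newer" is only our reading of the compatibility note above.

```python
import re
from typing import NamedTuple


class CVVersion(NamedTuple):
    """Hypothetical container for a 1.YY.MM-BUILD version string."""
    major: int
    year: int    # two-digit year, e.g. 23 for 2023
    month: int   # 1-12
    build: int   # build ID, increases with every build


# Assumed pattern: MAJOR.YY.MM-BUILD with two-digit year and month.
_VERSION_RE = re.compile(r"^(\d+)\.(\d{2})\.(\d{2})-(\d+)$")


def parse_cv_version(text: str) -> CVVersion:
    """Parse a 1.x-style version string, raising ValueError otherwise."""
    match = _VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a 1.YY.MM-BUILD version: {text!r}")
    major, year, month, build = (int(group) for group in match.groups())
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range in version: {text!r}")
    return CVVersion(major, year, month, build)


def is_compatible_upgrade(installed: str, candidate: str) -> bool:
    """Return True if candidate is newer and shares the major version.

    Tuples compare field by field, so (major, year, month, build)
    orders versions chronologically within a major version.
    """
    old = parse_cv_version(installed)
    new = parse_cv_version(candidate)
    return new.major == old.major and new > old
```

For example, parse_cv_version("1.23.04-2316") yields major 1, year 23,
month 4, and build 2316. A 0.x string such as "0.5.44-5" does not follow
this scheme and is rejected with a ValueError.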

Release 1.23.04

   - Fourth Release


   - Fixed file handler limits in cv-serverd and cv-statsd daemons
   - Optimized performance of cv-stats to better handle querying large
   number of nodes
   - Optimized cv-clientd to minimize server calls on start to better
   handle large number of nodes starting at the same time
   - Optimized cv-serverd to maximize performance when initializing clients
   starting up
   - Fixed client-side to also handle chunked munge data for larger packets
   - Fixed bug in the networking config plugin to handle IP aliases
   correctly in EL9
   - Fixed bug in power stat plugin where it will disable itself if any
   IPMI query times out
   - Updated migrations to handle new format used by racklayout
   - Added --merge and --migrate to cv-db-image
   - Patched appliance-install script to ensure that cv-serverd is online
   before running CV client scripts
   - Patched cv-installer to use MongoDB 4.0 in EL9 to mitigate against
   cv-db-image breaking from API changes in 4.2+
   - Fixed bug where units for user-defined group stats were not included
   in the stat definition
   - Fixed bug in node selectors so that --rack and --chassis both work as
   expected again
   - Fixed bug in firmware stat plugin for bmc_version stat


   - Third Release


   - Patches clustervisor-migrate package to install with the correct
   dependencies without trying to upgrade CV to 1.x too early


   - Second Release


   - Fixed typo in schema file
   - Fixed bug in Slurm accounting API
   - Added fallback in megaraid stat plugin to check /opt/MegaRAID for
   storcli binary
   - Patched installer to use 4.0 for MongoDB to fix deprecation that
   breaks cv-db-image
   - Fixed bug in networking config plugin
   - Removed file handle cap on cv-serverd and cv-statsd
   - Fixed handling renames when no plugins have been run on the node yet
   - Patched cv-stats to support appliance nodes


   - First Release


   - Fixed bug in cv-reconfigure missing a variable definition
   - Fixed bug in networking config plugin that erroneously reconfigured
   the firewall
   - Fixes bug in infiniband stat plugin to run without crashing when
   counters are not present
   - Fixes bug in emails sent by monitoring rules having the wrong
   description
   - Adds ability to enable/disable monitoring actions without needing to
   disable every rule attached to it

Release 1.23.03

   - Fifth Release


   - Changed cv-statsd to no longer delete everything past the retention
   period on start or during its loop; it now only deletes what has expired
   since the time it started. This avoids expensive starts of cv-statsd
   after long gaps of inactivity and defers clearing out older stats to a
   later point in time of the sysadmin's choosing
   - Patches plugin system behavior when appliance is not present
   - Added --toggle-stats to cv-serveradm to toggle on/off the stats loop
   in cv-serverd without restarting the server
   - Added --offline to cv-migrate to tweak migrations to support offline
   clusters
   - Prevent clustervisor-migrate package from updating any other CV
   packages
   - Improved error messages in cv-installer
   - Patched bug in cloner-reconfigure writing erroneous entries to GRUB in
   EL8
   - Patched bug in Slurm wrapper for CV not returning the correct HTTP
   status codes
   - Patched bug in IPMI stat plugin to prevent disabling the plugin from
   false positives during runtime
   - Added handler for "firewall" field to networking config plugin to
   enable/disable firewalld
   - Fixed config template for cluster-mces-errors monitoring rule
   - Fixed edge case bug where config plugins fired when triggered by
   unrelated fields
   - The help text for 'list-roles' is now 'distro-roles' in cv-image
   image-new
   - Added cv-slurmadm to synchronize system users with slurm users
   - Patched cv-useradm to delete LDAP users from groups when a user is
   being deleted


   - Fourth Release


   - Makes clustervisor-migrate package accessible in air gap environments
   - Fixes behavior in cv-useradm when deleting LDAP users to also remove
   them from their groups first


   - Third Release


   - Fixes build system to embed the correct cloner-reconfigure binary in
   the EL9 release of the RPMs


   - Second Release


   - Fixes race condition in config plugin system in cv-serverd
   - Fixes cloner-reconfigure to use a backwards compatible build to ensure
   cloner works for any supported OS combination regardless of the hosting OS
   version


   - First Release


   - Fixed bug in cv-image to ensure clustervisor-client package is fully
   installed during image creation
   - Fixed bug in cv-statsd causing partial crashing behavior when device
   list is changed
   - Fixes bug in power stats plugin for slower IPMI devices
   - Removes clustervisor-migrate package when updating clustervisor-server
   if it is still installed

Release 1.23.02

   - Second Release


   - Fixes bug in cv-statsd that causes process to hang after two weeks of
   continuously running (e.g. after the first data retention period for stats
   data)
   - Improved performance of cv-statsd
   - Minor bug fixes for stat plugins used by cv-clientd (i.e. disks, mem,
   net, power, and tuned)


   - First Release

Server:

   - Fixed edge case for handling renamed or deleted monitoring rules to
   cease tracking them afterwards.
   - Added stat plugin for BIOS and BMC versions.
   - Added templates for common monitoring rules
   - Added feature to cv-image to update an installed OS to the packages
   and config in the cv-image template
   - Include new plugins added in 1.x to migration from 0.x to 1.x for
   nodes.
   - Fixed minor bug in cv-serverd that occurred sometimes on startup
   - Fixed minor bug in cloner for EL8 and existing systems
   - Fixed cloner to not require installation of clustervisor-client in
   image during cloning process

Web:

   - Added heatmap min/max support to the rack dashboard widget
   - Added templates for dashboard and device details, includes
   upload/download for templates
   - Added support for 0U slots for PDUs in rack diagram
   - Added lazy loading for device details charts for tabs with multiple
   devices to improve first load time
   - Fixed exports download to work for non-Chrome browsers
   - Fixed minor issues with monitoring rules, expression builder, and
   dashboard shown in web interface

Release 1.23.01
*Highlights of the changes made (in no particular order)*

   - New global tmp directory within CV
   - New schema version, 1.0 to 1.1
   - New /clustervisor directory for holding application state (backups,
   images, netboot, databases, etc)
   - New hourly backup of the ClusterVisor database
   - New appliance system for dedicated nodes running ClusterVisor
   - New templates system for more quickly creating configuration entries
   - Renamed cv-yumrepod to cv-packaged
   - New test profile system for the testing framework
   - Moved the server plugins into the common module; now called the
   "config plugins"
   - New config plugins added: limits, lmod, selinux, bootoptions, tuned
   - ClusterVisor is now licensed and requires a license file from ACT to
   start the ClusterVisor server
   - Config plugins are now run on compute nodes to distribute load and
   remove majority of server caching
   - Bug fixes for cv-authsync, cv-conf, cv-clientd, and the config plugin
   system
   - Addition to cv-cloner to show cloner states
   - Addition to cv-commit to clear both reconfigure and overflow queue
   from the utility
   - Addition to cv-serveradm to send a test email
   - Addition to cv-clientd to retain its own stats in memory
   - Addition to cv-conf to support multi-line strings in YAML
   - Addition to cv-serverd to serve a bootstrapping system for new nodes
   - Addition to ldapauth plugin to support multiple LDAP domains
   - Addition to yumrepo_server to support multiple repos
   - Addition to ssh config plugin to support setting the PermitRootLogin
   setting for the SSH server
   - Addition to cv-reconfigure to re-run enable instructions for config
   plugins on demand
   - Addition to cli commands with node selectors to include -A /
   --appliance for including appliance nodes
   - Added cv-scheduler to interop with slurm
   - Added common.cv_parser for handling expression language
   - Added common.cv_sqlite for interfacing with SQLite databases
   - Added cv-migrate to migrate from 0.x to 1.x
   - Added cv-image for building base install images to bootstrap nodes
   - Added appliance collection and support for multiple collections that
   map to a "node-like" entity
   - Added an automated installation system for the appliance
   - Fixed memory leak bug in pymunge
   - Node groups are now a collection that link to devices (not just nodes)
   - Authorized SSH keys for root user can now be distributed using new
   config.ssh configurations
   - Nodes can now be bootstrapped into ClusterVisor through a single
   command provided by the server
   - Support added for EL9
   - Rewrote stat plugins and moved into common package
   - Removed background server plugins (were deprecated)
   - Removed hide_* fields from config.permissions (no longer needed after
   web UI rewrite)
   - Removed 0.4 schema (was deprecated)
   - Replaced InfluxDB with cv-statsd and new custom stats database system
   - Replaced the monitoring system, now uses new stats database and
   expression language