Nutanix LCM Framework Upgrade Process

Nutanix Life Cycle Manager (LCM) Upgrade Process

The Nutanix LCM (Life Cycle Manager) framework follows a fixed, predefined, step-by-step process to upgrade hardware firmware such as SATADOM, BMC, BIOS, HBA, and disk firmware. The host first boots into the Phoenix image, and the firmware upgrade then takes place from within Phoenix.

You must configure rules in your external firewall to allow LCM updates. See the LCM Firewall Requirements for details. Consult the LCM Guide for full details on using the feature.

LCM’s ability to inventory or update certain components may depend on which versions of AOS and Foundation are running on the cluster. Users who want to see the full list of available updates should consider bringing AOS and Foundation up to date first, or check the LCM Release Notes to see whether any of these dependencies apply to their environment.

First, pre-checks run to make sure that the cluster is in a good state for the upgrade to proceed. Prism reports any pre-checks that fail, and you can consult KB-4584 for an explanation of each pre-check and how to resolve the underlying issue. Once the issue that caused the pre-check to fail is resolved, run a fresh Inventory and then try the upgrade operation again.

With the exception of certain modules for Dell platforms, all firmware updates performed through LCM require the hosts to boot into a CentOS-based staging area called Phoenix.

LCM has built-in logic that determines the order in which firmware updates are applied, so users do not need to work out which updates to perform first. Users can simply select the action “Update All” and LCM will automatically satisfy all dependencies between the firmware components.
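
The ordering idea can be illustrated with a small topological sort. This is only a sketch: the dependency map below is hypothetical and is not LCM's actual rule set, and nothing here is taken from LCM's own code.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map: each key must be updated AFTER the
# components in its set. Illustrative only, not Nutanix's real rules.
DEPENDS_ON = {
    "BMC": set(),
    "BIOS": {"BMC"},
    "HBA": set(),
    "Disk": {"HBA"},
}


def update_order(depends_on):
    """Return one valid update order that respects every dependency."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    print(update_order(DEPENDS_ON))
```

Any order the function returns places a component after everything it depends on, which is the property "Update All" relieves the user from working out by hand.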

If multiple hosts are selected for firmware updates, LCM evacuates User VMs from the hosts one at a time and boots each host into the Phoenix staging area to perform the updates. No User VMs are powered off, and your workload should continue to be served without disruption.

Depending on the firmware being upgraded, you may see your hypervisor reboot several times back into Phoenix. This is expected behavior and you should not try to intervene.

Once the firmware updates are completed, the selected node boots back into the hypervisor and powers up the local Controller VM, making sure that all cluster services are up and running.

Finally, LCM makes sure that the local hypervisor is once again a schedulable node and can host User VMs before the upgrade continues to the next node.
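
The rolling sequence described above can be sketched as a simple loop. The function and callback names are hypothetical stand-ins for the evacuation, Phoenix update, and health-check steps; this is a model of the flow, not how LCM is implemented.

```python
def rolling_update(hosts, evacuate, update_firmware, is_schedulable):
    """Update hosts one at a time; stop at the first host that does not recover.

    All three callbacks are hypothetical placeholders supplied by the caller.
    """
    completed = []
    for host in hosts:
        evacuate(host)          # live-migrate User VMs off the host
        update_firmware(host)   # stands in for the Phoenix boot and firmware flash
        if not is_schedulable(host):
            # Never move on to the next node while this one is unhealthy.
            raise RuntimeError(f"{host} did not return to service")
        completed.append(host)
    return completed


if __name__ == "__main__":
    done = rolling_update(
        ["host-1", "host-2", "host-3"],
        evacuate=lambda h: print(f"evacuating {h}"),
        update_firmware=lambda h: print(f"updating {h}"),
        is_schedulable=lambda h: True,
    )
    print("updated:", done)
```

Because only one host is out of service at any moment, the rest of the cluster keeps serving the migrated VMs throughout the run.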

LCM Operation

LCM performs two functions: taking inventory of the cluster and performing updates on the cluster. Note that an LCM update is not reversible.

Before performing an update, LCM runs a set of pre-checks to verify the state of the cluster. If any checks fail, the update operation is aborted.
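
The "check first, abort on any failure" behaviour can be sketched as follows. The check names in the example are hypothetical; the real pre-checks are the ones documented in KB-4584.

```python
def run_prechecks(checks):
    """Run every check and return the names of the ones that failed."""
    return [name for name, check in checks.items() if not check()]


def start_update(checks, do_update):
    """Run do_update only if every pre-check passes."""
    failed = run_prechecks(checks)
    if failed:
        # Abort before anything is changed on the cluster.
        raise RuntimeError("Pre-checks failed: " + ", ".join(failed))
    return do_update()


if __name__ == "__main__":
    # Hypothetical check names, for illustration only.
    checks = {
        "all_cvms_up": lambda: True,
        "foundation_service_stopped": lambda: True,
    }
    print(start_update(checks, lambda: "update started"))
```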

Nutanix LCM operations are written to the following log files:

  • genesis.out
  • lcm_ops.out
  • lcm_ops.trace
  • lcm_wget.log

The Nutanix LCM log files record all operations, including successes and failures. If an operation fails, LCM suspends it to wait for mitigation.
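
When an operation fails, a quick scan of these files can help locate the failure. The sketch below assumes the logs are plain text and that failures are marked with keywords such as ERROR or CRITICAL; both the keywords and the example directory are assumptions, so adjust them to what you see on your own CVM.

```python
from pathlib import Path

# File names taken from the list above.
LCM_LOGS = ["genesis.out", "lcm_ops.out", "lcm_ops.trace", "lcm_wget.log"]


def find_error_lines(log_dir, keywords=("ERROR", "CRITICAL")):
    """Yield (file name, line number, text) for lines containing any keyword."""
    for name in LCM_LOGS:
        path = Path(log_dir) / name
        if not path.is_file():
            continue  # a log may be absent on this node
        with path.open(errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if any(word in line for word in keywords):
                    yield name, number, line.rstrip()


if __name__ == "__main__":
    # Assumed log directory on a CVM; change it if yours differs.
    for name, number, text in find_error_lines("/home/nutanix/data/logs"):
        print(f"{name}:{number}: {text}")
```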

The LCM framework can also update itself when necessary. Although connected to Nutanix AOS, the framework is not tied to the AOS release cycle.

Nutanix LCM Upgrade Alert Message:

To apply these updates, each node will boot into a separate utility, one node at a time. Each node could reboot multiple times (depending on component/version) and user workloads will not be affected as automatic migration of workloads to other nodes will be handled by the update process. Disk checks will run before and after the update process. Please ensure your cluster has sufficient compute and storage resources to handle one node being offline during the rolling update process. Refer to KB 6945 for more details.
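
The "one node offline" requirement in this message can be reasoned about with a simple worst-case check. This is a deliberately simplified sketch: it looks at a single resource, ignores replication factor and CVM reservations, and the function name and figures are hypothetical.

```python
def survives_one_node_offline(node_capacity_gib, used_gib):
    """True if the remaining nodes can hold current usage with the largest node offline."""
    if len(node_capacity_gib) < 2:
        return False  # a single node has nowhere to migrate workloads
    remaining = sum(node_capacity_gib) - max(node_capacity_gib)
    return used_gib <= remaining


if __name__ == "__main__":
    # Example: three nodes with 256 GiB of memory each, 400 GiB in use.
    print(survives_one_node_offline([256, 256, 256], 400))
```

Removing the largest node is the worst case, so a cluster that passes this check for memory has headroom for any single node to go offline during the rolling update.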

Upgrade Time Estimate via LCM

This depends on the number of firmware updates being performed on a given node and how long it takes to evacuate User VMs from each host. A SATADOM firmware upgrade through LCM, for example, tends to take about 45 minutes per node. During a firmware upgrade, the LCM framework places the CVM and AHV host into maintenance mode so that the hosted VMs are live-migrated to another host.
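
Because nodes are updated one at a time, a rough total is the per-node time multiplied by the node count. The sketch below uses the 45-minute SATADOM figure from the text as its default; the separate evacuation parameter is a hypothetical addition for clusters where VM migration takes noticeable extra time.

```python
def estimate_minutes(node_count, firmware_minutes_per_node=45,
                     evacuation_minutes_per_node=0):
    """Rough serial estimate: nodes are updated one after another."""
    per_node = firmware_minutes_per_node + evacuation_minutes_per_node
    return node_count * per_node


if __name__ == "__main__":
    total = estimate_minutes(4)
    print(f"4 nodes: about {total} minutes ({total / 60:.1f} hours)")
```

For a four-node cluster this gives 4 × 45 = 180 minutes, or about three hours, before any extra evacuation time is added.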

LCM Dependency

LCM depends on AOS and Foundation: AOS must be at least 5.9.x and Foundation at least 4.2.x to upgrade firmware and software through the LCM framework. In addition, all CVMs in the cluster must be up, and the Foundation service must be in a stopped state across the cluster. This service is typically not running unless an LCM upgrade or Cluster Expand operation is taking place.
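
The version minimums above can be checked with a small comparison helper. This is a sketch that assumes purely numeric, dot-separated version strings; how you obtain the installed versions is left out, and the function names are hypothetical.

```python
MIN_AOS = (5, 9)         # minimum AOS version stated above
MIN_FOUNDATION = (4, 2)  # minimum Foundation version stated above


def parse_version(text):
    """Turn '5.10.3' into (5, 10, 3); non-numeric parts are not handled."""
    return tuple(int(part) for part in text.split("."))


def meets_lcm_minimums(aos, foundation):
    """True if both versions are at or above the stated minimums."""
    return (parse_version(aos) >= MIN_AOS
            and parse_version(foundation) >= MIN_FOUNDATION)


if __name__ == "__main__":
    print(meets_lcm_minimums("5.10.3", "4.3"))
    print(meets_lcm_minimums("5.8.2", "4.3"))
```

Comparing tuples of integers rather than raw strings matters here: as text, "5.10" sorts before "5.9", while as numbers 5.10 is the later release.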

Foundation Upgrading Process

The Foundation binaries are updated across all CVMs. No running services, CVMs, or hypervisors are restarted.
The Foundation upgrade takes up to 1 minute.

Conclusion

The Nutanix LCM framework is robust and follows a predefined, step-by-step process to upgrade hardware firmware and software. It is designed so that these upgrades require no downtime for user workloads.

Thanks for reading the Hyperhci Tech Blog. Stay tuned.