Nutanix Cluster’s Components and Acropolis Services Explained

HyperHCI Admin Nutanix Cluster May 24, 2019

Nutanix cluster components are the core services that keep a Nutanix Acropolis cluster running.
A Nutanix cluster is a distributed architecture: each node shares its resources with the rest of the cluster, and tasks and responsibilities are distributed across the nodes.

All components run on multiple nodes in the cluster and depend on connectivity with the peers that also run the component. Most components also depend on other components for information.

Cluster Components Overview

Zeus : Access interface for Zookeeper.
Zookeeper : Manages cluster configuration
Medusa : Access interface for Cassandra
Cassandra : Distributed metadata store
Stargate : Data I/O manager for the cluster
Curator : Handles MapReduce cluster management and cleanup
Genesis : Cluster component & service manager
Chronos : Job and task scheduler
Cerebro : Replication/DR manager
Pithos : vDisk configuration manager
Prism : Management interface for nCLI, APIs and HTML5 web console
Acropolis Services
Dynamic Scheduler

Cluster Components in Detail

Zeus

Zeus is the interface to Zookeeper; it gives other components access to information about the Nutanix cluster
A key element of a distributed system is a method for all nodes to store and update the cluster’s configuration
Cluster configuration contains information about physical components (nodes, disks) and logical components (storage containers) in the cluster
Zeus keeps track of node IP addresses, capacities, and data replication rules such as RF-2 or RF-3
All of this cluster information is stored in Zookeeper and accessed through Zeus
Zeus is the Nutanix library that all other components use to access the cluster configuration
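
To make the idea concrete, the cluster configuration described above can be pictured as a small data structure. This is only an illustrative sketch: the class and field names (ClusterConfig, Node, Disk, replication_factor and so on) are invented for the example and are not the actual Zeus schema or API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Disk:
    disk_id: str
    capacity_gib: int


@dataclass
class Node:
    ip_address: str
    disks: List[Disk] = field(default_factory=list)

    def capacity_gib(self) -> int:
        # Physical capacity of a node is the sum of its disks.
        return sum(disk.capacity_gib for disk in self.disks)


@dataclass
class ClusterConfig:
    # Hypothetical layout: physical components (nodes, disks),
    # logical components (storage containers) and the replication rule.
    replication_factor: int
    nodes: List[Node] = field(default_factory=list)
    storage_containers: List[str] = field(default_factory=list)

    def raw_capacity_gib(self) -> int:
        return sum(node.capacity_gib() for node in self.nodes)


config = ClusterConfig(
    replication_factor=2,
    nodes=[
        Node("10.0.0.11", [Disk("d1", 1000), Disk("d2", 2000)]),
        Node("10.0.0.12", [Disk("d3", 2000)]),
    ],
    storage_containers=["default-container"],
)
print(config.raw_capacity_gib())
```

Every other component would read a structure like this through one shared library instead of talking to the configuration store directly, which is the role the text assigns to Zeus.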

Zookeeper

Key Role: Cluster configuration manager
Zookeeper is accessed via an interface called Zeus
Zookeeper runs on either three or five nodes, depending on the redundancy factor that is applied to the cluster
One Zookeeper node is elected as the leader in the cluster
The leader receives all requests for information and confers with the follower nodes
If the leader stops responding, a new leader is elected automatically
Zookeeper has no dependencies, meaning that it can start without any other cluster components running
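
The choice of three or five nodes makes sense once you look at majority quorums: a ZooKeeper-style ensemble keeps working as long as more than half of its members are reachable. The sketch below does that arithmetic. The mapping of RF-2 to three nodes and RF-3 to five nodes is an assumption made for the example, and the function names are hypothetical.

```python
def quorum_size(ensemble_size: int) -> int:
    # A majority is more than half of the members.
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    # Members that can fail while a majority is still reachable.
    return ensemble_size - quorum_size(ensemble_size)


# Assumed mapping from replication factor to Zookeeper ensemble size.
ASSUMED_ENSEMBLE_SIZE = {2: 3, 3: 5}

for rf, size in ASSUMED_ENSEMBLE_SIZE.items():
    print(
        f"RF-{rf}: {size} Zookeeper nodes, quorum of {quorum_size(size)}, "
        f"survives {tolerated_failures(size)} failure(s)"
    )
```

Under these assumptions a three-node ensemble survives one failed member and a five-node ensemble survives two, which lines up with the number of failures each redundancy level is meant to absorb.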

Figure: Zeus and Zookeeper services explained

Medusa

Medusa is the interface to Cassandra
Stargate and Curator communicate with Cassandra through Medusa
Medusa keeps track of where the data for hosts and their virtual machines is stored
Medusa also tracks where the replicas of VM data are stored, so that the data remains safe if a node fails
Medusa is a Nutanix abstraction layer that sits in front of the database that holds this metadata
The database is distributed across all nodes in the cluster, using a modified form of Apache Cassandra

Cassandra

Key Role: Distributed metadata store
Cassandra is a distributed, high-performance, scalable database that stores all metadata about the guest VM data stored in a Nutanix datastore
In the case of NFS datastores, Cassandra also holds small files saved in the datastore
When a file reaches 512K in size, the cluster creates a vDisk to hold the data
Cassandra stores and manages all of the cluster metadata in a distributed ring-like manner based upon a heavily modified Apache Cassandra
The Paxos algorithm is used to enforce strict consistency. The Paxos service runs on every node in the cluster
Cassandra is accessed via an interface called Medusa
Cassandra runs on all nodes of the cluster. These nodes communicate with each other once a second using the Gossip protocol, ensuring that the state of the database is current on all nodes
Cassandra depends on Zeus to gather information about the cluster configuration
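
The "ring-like" distribution mentioned above can be illustrated with generic consistent hashing. This is a textbook sketch, not the Nutanix implementation: the MetadataRing class, the SHA-256 token function and the replication_factor parameter are all assumptions made for the example.

```python
import bisect
import hashlib
from typing import List


def token(key: str) -> int:
    # Map any string to a position on the ring.
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


class MetadataRing:
    def __init__(self, nodes: List[str], replication_factor: int = 3) -> None:
        if replication_factor > len(nodes):
            raise ValueError("need at least as many nodes as replicas")
        self.replication_factor = replication_factor
        # Each node owns one position (token) on the ring.
        self._ring = sorted((token(node), node) for node in nodes)
        self._tokens = [t for t, _ in self._ring]

    def replicas_for(self, key: str) -> List[str]:
        # Walk clockwise from the key's position and take the next
        # replication_factor nodes, wrapping around the end of the ring.
        start = bisect.bisect_right(self._tokens, token(key)) % len(self._ring)
        return [
            self._ring[(start + i) % len(self._ring)][1]
            for i in range(self.replication_factor)
        ]


ring = MetadataRing(["cvm-1", "cvm-2", "cvm-3", "cvm-4"])
print(ring.replicas_for("vdisk-42:extent-7"))
```

Because the owner of a key is computed from its hash, any node can work out where a piece of metadata lives without asking a central directory.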

Figure: Medusa and Cassandra services explained

Stargate

Key Role: Data I/O manager
Stargate is responsible for all data management and I/O operations and is the main interface from the hypervisor (via NFS, iSCSI, or SMB)
A distributed system that presents storage to other systems (such as a hypervisor) needs a unified component for receiving and processing that data
All read and write requests are sent across vSwitchNutanix to the Stargate process running on that node
Stargate depends on Medusa to gather metadata and Zeus to gather cluster configuration data
Stargate service runs on every node in the cluster in order to serve localized I/O

Curator

Key Role: MapReduce cluster management and cleanup
Curator runs on every node of the cluster and tracks the metadata and the VM data held on storage
A Curator master node periodically scans the metadata database and identifies cleanup and optimization tasks that Stargate or other components should perform
The work of analyzing the metadata is shared across the other Curator nodes using a MapReduce algorithm
Curator depends on Zeus to learn which nodes are available, and Medusa to gather metadata. Based on that analysis, it sends commands to Stargate
Curator also handles disk balancing and information lifecycle management
One node is elected as the Curator master, which manages task and job delegation
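
The MapReduce-style scan described above can be sketched in a few lines. The data model here is invented for illustration (per-node shards mapping vDisks to the extents they reference) and the cleanup rule, flagging extents that nothing references, is just one example of a task such a scan might produce.

```python
from collections import Counter
from functools import reduce

# Hypothetical metadata shards, one per node: vDisk -> referenced extent ids.
shards = {
    "node-a": {"vdisk-1": ["e1", "e2"], "vdisk-2": ["e2"]},
    "node-b": {"vdisk-3": ["e3"]},
}
# Hypothetical set of extents that currently exist on disk.
all_extents = {"e1", "e2", "e3", "e4"}


def map_shard(shard: dict) -> Counter:
    # Map step: each node counts references within its own shard.
    counts = Counter()
    for extents in shard.values():
        counts.update(extents)
    return counts


def reduce_counts(left: Counter, right: Counter) -> Counter:
    # Reduce step: merge partial counts into cluster-wide totals.
    return left + right


partials = [map_shard(shard) for shard in shards.values()]
totals = reduce(reduce_counts, partials, Counter())

# Extents with no references are candidates for cleanup.
unreferenced = sorted(all_extents - set(totals))
print(totals["e2"], unreferenced)
```

In this toy run, the master would turn the unreferenced list into cleanup tasks; the map step is the part that can run in parallel on every node.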

Figure: Stargate and Curator services explained

Genesis

Key Role: Cluster component & service manager
Genesis is a process that runs on each node and is responsible for service interactions such as start, stop and restart
Genesis runs independently of the cluster and does not require the cluster to be configured or running
The only requirement for Genesis is that Zookeeper is up and running
The cluster_init and cluster_status pages are displayed by the Genesis process

Chronos

Key Role: Job and task scheduler
Chronos is responsible for taking the jobs and tasks resulting from a Curator scan and scheduling/throttling tasks among nodes
Chronos runs on every node and is controlled by an elected Chronos Master that is responsible for the task and job delegation and runs on the same node as the Curator Master
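
As a rough picture of "scheduling/throttling tasks among nodes", the sketch below hands tasks out round-robin and caps how many each node may hold at once. The schedule function, the max_per_node limit and the task names are hypothetical; the real scheduling policy is not described in this post.

```python
from collections import deque
from typing import Dict, List, Tuple


def schedule(
    tasks: List[str], nodes: List[str], max_per_node: int
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Assign tasks round-robin, at most max_per_node per node.

    Returns the per-node assignments and the tasks left waiting.
    """
    assignments: Dict[str, List[str]] = {node: [] for node in nodes}
    pending = deque(tasks)
    for _ in range(max_per_node):
        for node in nodes:
            if not pending:
                return assignments, []
            assignments[node].append(pending.popleft())
    # Anything still pending is throttled until a slot frees up.
    return assignments, list(pending)


assigned, waiting = schedule(
    ["t1", "t2", "t3", "t4", "t5"], ["node-a", "node-b"], max_per_node=2
)
print(assigned, waiting)
```

The cap is the throttle: background work produced by a scan is spread across nodes but never allowed to pile up on any one of them.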

Cerebro

Key Role: Replication/DR manager
Cerebro is responsible for the replication and DR capabilities of DSF
This includes the scheduling of snapshots, the replication to remote sites, and the site migration/failover
Cerebro replicates data from the primary data center (DC) to the DR site when a remote site and a protection domain are configured
Cerebro takes scheduled, automatic snapshots according to the protection domain configuration
Cerebro runs on every node in the Nutanix cluster and all nodes participate in replication to remote clusters/sites

Pithos

Key Role: vDisk configuration manager
Pithos is responsible for vDisk (DSF file) configuration data
Pithos runs on every node and is built on top of Cassandra

Prism

Key Role: UI and API
Prism provides a management gateway for administrators to configure and monitor the Nutanix cluster. This includes the nCLI, REST API and HTML5 UI web console
Prism runs on every node in the cluster, with one node elected as leader
All requests are forwarded from followers to the leader using Linux iptables
This allows administrators to access Prism using any Controller VM IP address
If the Prism leader fails, a new leader is elected
Prism communicates with Zeus for cluster configuration data and Cassandra for statistics to present to the user
It also communicates with the ESXi hosts for VM status and related information

Acropolis Services

The Nutanix Acropolis service runs in a master-slave fashion on every CVM, with an elected Acropolis Master that is responsible for task scheduling, execution, IPAM, and so on. As with other components that have a master, if the Acropolis Master fails, a new one is elected.

The role breakdown for each Acropolis service type is shown below:

Acropolis Master
- Task scheduling & execution
- Stat collection / publishing
- Network Controller (for hypervisor)
- VNC proxy (for hypervisor)
- HA (for hypervisor)
Acropolis Slave
- Stat collection / publishing
- VNC proxy (for hypervisor)

Figure: conceptual view of the Acropolis Master / Slave relationship

Dynamic Scheduler

Efficient scheduling of resources is critical to ensure that resources are effectively consumed. The Acropolis Dynamic Scheduler extends the traditional approach to scheduling, which relies on compute utilization (CPU/memory) to make placement decisions.

It leverages compute, as well as storage and other metrics, to drive VM and volume (ABS) placement decisions. This ensures that resources are effectively consumed and end-user performance is optimal.

Resource scheduling can be broken down into two key areas:

Initial placement
- Where an item is scheduled at power-on
Runtime Optimization
- Movement of workloads based upon runtime metrics

The original Acropolis Scheduler has handled initial placement decisions since its release. Released in AOS 5.0, the Acropolis Dynamic Scheduler expands upon this to provide runtime resource optimization.
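
An initial placement decision that looks at storage as well as compute could be sketched as follows. Everything in this example is an assumption made for illustration: the Host and VM fields, the storage_busy_limit threshold and the tie-breaking rule are hypothetical and do not describe the scheduler's actual algorithm.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Host:
    name: str
    cpu_free: float       # fraction of CPU still available, 0..1
    mem_free_gib: float
    storage_busy: float   # fraction of storage controller in use, 0..1


@dataclass
class VM:
    name: str
    cpu: float            # fraction of one host's CPU the VM needs
    mem_gib: float


def pick_host(
    vm: VM, hosts: List[Host], storage_busy_limit: float = 0.85
) -> Optional[Host]:
    # Keep only hosts with enough compute and a storage path that is not hot.
    candidates = [
        h for h in hosts
        if h.cpu_free >= vm.cpu
        and h.mem_free_gib >= vm.mem_gib
        and h.storage_busy < storage_busy_limit
    ]
    if not candidates:
        return None
    # Prefer the least busy storage, then the most free CPU.
    return max(candidates, key=lambda h: (-h.storage_busy, h.cpu_free))


hosts = [
    Host("host-1", cpu_free=0.6, mem_free_gib=64, storage_busy=0.9),
    Host("host-2", cpu_free=0.3, mem_free_gib=32, storage_busy=0.4),
    Host("host-3", cpu_free=0.5, mem_free_gib=48, storage_busy=0.4),
]
chosen = pick_host(VM("vm-1", cpu=0.2, mem_gib=24), hosts)
print(chosen.name if chosen else "no host fits")
```

A compute-only scheduler would have picked host-1 here because it has the most free CPU and memory; taking storage load into account steers the VM to host-3 instead. Runtime optimization would apply the same kind of check repeatedly to workloads that are already running.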

Figure: high-level view of the scheduler architecture

Conclusion

Every component of the Nutanix Acropolis cluster plays its own part in keeping the cluster services up and running, handling the virtual machine workload, and tolerating hardware and software component failures up to the limit the cluster is configured to sustain.

Thanks for reading, and feel free to share this post with your colleagues! 🙂
