Nutanix cluster components are the core services that keep a Nutanix Acropolis cluster running.
A Nutanix cluster is a distributed architecture: each node of the Nutanix Acropolis cluster shares resources with the rest of the cluster so that tasks and responsibilities are distributed across all nodes.
All components run on multiple nodes in the cluster and depend on connectivity to the peers that also run the component. Most components also depend on other components for information.
Cluster Components Overview
- Zeus: Access interface for Zookeeper
- Zookeeper: Manages cluster configuration
- Medusa: Access interface for Cassandra
- Cassandra: Distributed metadata store
- Stargate: Data I/O manager for the cluster
- Curator: Handles MapReduce cluster management and cleanup
- Genesis: Cluster component & service manager
- Chronos: Job and task scheduler
- Cerebro: Replication/DR manager
- Pithos: vDisk configuration manager
- Prism: Management interface for nCLI, APIs and the HTML5 web console
- Acropolis Services
- Dynamic Scheduler
Cluster Components in Detail
Zeus
- Zeus is the interface to Zookeeper that gives other components access to information about the Nutanix cluster
- A key element of a distributed system is a method for all nodes to store and update the cluster’s configuration
- Cluster configuration contains information about the physical components (nodes, disks) and logical components (storage containers) in the cluster
- Zeus keeps track of node IP addresses, capacities, and data replication rules such as RF-2 or RF-3
- Stores all cluster configuration information (see the sketch after this list)
- Zeus is the Nutanix library that all other components use to access the cluster configuration
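As a rough, hedged illustration of what a Zeus-style configuration read looks like, the sketch below uses the open-source kazoo Zookeeper client to fetch and parse a configuration document from a hypothetical znode; the actual Zeus library, znode layout, and data format are internal to Nutanix.

```python
# Hedged sketch: reading a cluster configuration document from Zookeeper.
# The znode path and JSON layout are illustrative assumptions, not the
# actual Nutanix Zeus/Zookeeper schema.
import json
from kazoo.client import KazooClient

zk = KazooClient(hosts="zk1:2181,zk2:2181,zk3:2181")  # hypothetical Zookeeper ensemble
zk.start()

data, stat = zk.get("/cluster/config")        # hypothetical znode
config = json.loads(data.decode("utf-8"))

print("nodes:", [n["ip"] for n in config["nodes"]])
print("replication factor:", config["redundancy_factor"])

zk.stop()
```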
Zookeeper
- Key Role: Cluster configuration manager
- Zookeeper is accessed via an interface called Zeus
- Zookeeper runs on either three or five nodes, depending on the redundancy factor that is applied to the cluster
- One Zookeeper node is elected as the leader in the cluster
- The leader receives all requests for information and confers with the follower nodes
- If the leader stops responding, a new leader is elected automatically
- Zookeeper has no dependencies, meaning that it can start without any other cluster components running
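To make the leader/follower behavior concrete, here is a hedged sketch of leader election using kazoo's built-in Election recipe. This shows the general Zookeeper pattern, not Nutanix's internal implementation.

```python
# Hedged sketch: Zookeeper leader election with kazoo's Election recipe.
# Each participant blocks in run() until it wins the election; if the current
# leader disappears, one of the remaining contenders is elected automatically.
import socket
from kazoo.client import KazooClient

zk = KazooClient(hosts="zk1:2181,zk2:2181,zk3:2181")  # hypothetical ensemble
zk.start()

election = zk.Election("/election/config-leader", identifier=socket.gethostname())

def serve_as_leader():
    # Placeholder for leader duties, e.g. answering configuration requests.
    print("I am the leader now")

election.run(serve_as_leader)   # blocks until elected, then invokes the callback
```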
Medusa
- Medusa is the access interface for Cassandra
- Stargate and Curator communicate with Cassandra through Medusa
- Medusa stores and keeps track of data about hosts and the virtual machines they host
- Medusa keeps track of each VM's master data replicas so that data stays safe in case of a node failure
- Medusa is a Nutanix abstraction layer that sits in front of the database that holds this metadata
- The database is distributed across all nodes in the cluster, using a modified form of Apache Cassandra
Cassandra
- Key Role: Distributed metadata store
- Cassandra is a distributed, high-performance, scalable database that stores all metadata about the guest VM data stored in a Nutanix datastore
- In the case of NFS datastores, Cassandra also holds small files saved in the datastore
- When a file reaches 512K in size, the cluster creates a vDisk to hold the data
- Cassandra stores and manages all of the cluster metadata in a distributed ring-like manner based upon a heavily modified Apache Cassandra
- The Paxos algorithm is used to enforce strict consistency, and the service runs on every node in the cluster
- Cassandra is accessed via an interface called Medusa
- Cassandra runs on all nodes of the cluster. These nodes communicate with each other once a second using the Gossip protocol, ensuring that the state of the database is current on all nodes
- Cassandra depends on Zeus to gather information about the cluster configuration
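The ring-like distribution of metadata can be pictured with a plain consistent-hashing sketch. This is a generic illustration of the technique only; the modified Cassandra ring, token assignment, and Paxos layer that Nutanix actually uses differ.

```python
# Hedged sketch: consistent-hash ring placement of metadata keys with RF replicas.
# Node names and key format are made up for illustration.
import bisect
import hashlib

def token(value: str) -> int:
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes: list[str]):
        # One token per node for simplicity; real rings use many virtual tokens.
        self.ring = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.ring]

    def replicas(self, key: str, rf: int = 2) -> list[str]:
        """Return the rf nodes that own `key`, walking clockwise around the ring."""
        start = bisect.bisect(self.tokens, token(key)) % len(self.ring)
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(rf)]

ring = Ring(["cvm-a", "cvm-b", "cvm-c", "cvm-d"])
print(ring.replicas("vdisk-42:extent-7", rf=2))  # two nodes own copies of this key
```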
Stargate
- Key Role: Data I/O manager
- Stargate is responsible for all data management and I/O operations and is the main interface from the hypervisor (via NFS, iSCSI, or SMB)
- A distributed system that presents storage to other systems (such as a hypervisor) needs a unified component for receiving and processing the data it receives
- All read and write requests are sent across the internal vSwitch (vSwitchNutanix) to the Stargate process running on that node
- Stargate depends on Medusa to gather metadata and Zeus to gather cluster configuration data
- Stargate service runs on every node in the cluster in order to serve localized I/O
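As a heavily simplified, hypothetical picture of the write path, the sketch below acknowledges a write only after RF copies exist on the local node plus RF-1 peers; the node names and transport call are made up, and the real Stargate path (oplog, extent store, checksums) is far more involved.

```python
# Hedged sketch: acknowledging a write only after RF copies exist.
# Purely conceptual -- send_to_node() and the node names are hypothetical.
import random

def send_to_node(node: str, data: bytes) -> None:
    print(f"wrote {len(data)} bytes to {node}")

def replicate_write(data: bytes, local_node: str, all_nodes: list[str], rf: int = 2) -> list[str]:
    """Write locally first, then to rf-1 remote nodes; ack only after all succeed."""
    remotes = [n for n in all_nodes if n != local_node]
    targets = [local_node] + random.sample(remotes, rf - 1)
    for node in targets:
        send_to_node(node, data)
    return targets  # the hypervisor's write is acknowledged at this point

if __name__ == "__main__":
    nodes = ["cvm-a", "cvm-b", "cvm-c", "cvm-d"]
    print(replicate_write(b"4KB block", "cvm-a", nodes, rf=2))
```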
Curator
- Key Role: MapReduce cluster management and cleanup
- Curator runs on each node of the cluster and tracks the metadata and each VM's master data on storage
- A Curator master node periodically scans the metadata database and identifies cleanup and optimization tasks that Stargate or other components should perform
- The metadata analysis is shared across the other Curator nodes using a MapReduce algorithm (sketched after this list)
- Curator depends on Zeus to learn which nodes are available, and Medusa to gather metadata. Based on that analysis, it sends commands to Stargate
- It performs disk balancing and information life cycle management
- A Curator master node is elected, and it manages task and job delegation
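A toy MapReduce-style garbage scan gives a feel for what Curator's periodic scans do. The extent records and vDisk references below are simplified assumptions, not Curator's real data model.

```python
# Hedged sketch: a MapReduce-style scan to find unreferenced extents.
# Map phase: emit (extent_id, source) pairs; reduce phase: extents that exist
# on disk but are referenced by no vDisk become cleanup candidates.
from collections import defaultdict

extent_store = ["e1", "e2", "e3", "e4"]                    # extents physically on disk (made up)
vdisk_maps = {"vdisk-1": ["e1", "e3"], "vdisk-2": ["e3"]}  # extents referenced by vDisks (made up)

def map_phase():
    for extent in extent_store:
        yield extent, "stored"
    for extents in vdisk_maps.values():
        for extent in extents:
            yield extent, "referenced"

def reduce_phase(pairs):
    grouped = defaultdict(set)
    for extent, tag in pairs:
        grouped[extent].add(tag)
    return [e for e, tags in grouped.items() if "referenced" not in tags]

garbage = reduce_phase(map_phase())
print("cleanup candidates:", garbage)   # ['e2', 'e4'] -> tasks handed off for execution
```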
Genesis
- Key Role: Cluster component & service manager
- Genesis is a process that runs on each node and is responsible for service interactions such as start, stop, and restart
- Genesis runs independently of the cluster and does not require the cluster to be configured or running
- Genesis does, however, depend on Zookeeper being up and running
- The cluster_init and cluster_status pages are displayed by the Genesis process
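Conceptually, a service manager like this boils down to a supervision loop: check each managed service and restart anything that has stopped. The sketch below is a toy model with made-up service names and status checks, not how Genesis is implemented.

```python
# Hedged sketch: a toy service supervision loop.
# The services dict and start/is_running functions are purely illustrative.
services = {"zookeeper": True, "cassandra": False, "stargate": True}

def is_running(name: str) -> bool:
    return services[name]

def start(name: str) -> None:
    print(f"starting {name}")
    services[name] = True

def supervise_once(managed: list[str]) -> None:
    for name in managed:
        if not is_running(name):
            start(name)

supervise_once(["zookeeper", "cassandra", "stargate"])  # restarts the stopped service
```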
Chronos
- Key Role: Job and task scheduler
- Chronos is responsible for taking the jobs and tasks resulting from a Curator scan and scheduling/throttling tasks among nodes
- Chronos runs on every node and is controlled by an elected Chronos Master that is responsible for the task and job delegation and runs on the same node as the Curator Master
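Conceptually, throttling means never letting more than a fixed number of background tasks run at once on a node. The sketch below illustrates that idea with a bounded thread pool; it is not Nutanix's scheduler.

```python
# Hedged sketch: throttled execution of background tasks (e.g. from a Curator scan).
# A bounded thread pool caps how many tasks run concurrently.
from concurrent.futures import ThreadPoolExecutor
import time

def run_task(task_id: int) -> str:
    time.sleep(0.1)          # stand-in for real work such as extent cleanup
    return f"task {task_id} done"

tasks = range(10)
with ThreadPoolExecutor(max_workers=3) as pool:   # at most 3 tasks in flight
    for result in pool.map(run_task, tasks):
        print(result)
```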
Cerebro
- Key Role: Replication/DR manager
- Cerebro is responsible for the replication and DR capabilities of DSF
- Cerebro is also responsible for data replication from the primary data center to the DR site when a remote site and protection domain are configured
- Cerebro takes scheduled automatic snapshots according to the protection domain configuration
- This includes the scheduling of snapshots, the replication to remote sites, and the site migration/failover
- Cerebro runs on every node in the Nutanix cluster and all nodes participate in replication to remote clusters/sites
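Snapshot scheduling largely comes down to deciding whether a protection domain is due for a new snapshot and which old snapshots fall outside retention. The sketch below shows that logic with assumed, hypothetical schedule and retention fields.

```python
# Hedged sketch: deciding snapshot actions for a protection domain.
# The schedule/retention fields are illustrative assumptions, not Cerebro's model.
from datetime import datetime, timedelta

def snapshot_actions(last_snap: datetime, snaps: list[datetime],
                     interval: timedelta, retain: int, now: datetime):
    take_new = now - last_snap >= interval
    expired = sorted(snaps)[:-retain] if len(snaps) > retain else []
    return take_new, expired

now = datetime(2023, 1, 1, 12, 0)
snaps = [now - timedelta(hours=h) for h in (1, 2, 3, 4, 5)]
take, expired = snapshot_actions(last_snap=max(snaps), snaps=snaps,
                                 interval=timedelta(hours=1), retain=3, now=now)
print("take new snapshot:", take)          # True: the interval has elapsed
print("snapshots to expire:", expired)     # the two oldest, beyond retention of 3
```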
Pithos
- Key Role: vDisk configuration manager
- Pithos is responsible for vDisk (DSF file) configuration data
- Pithos runs on every node and is built on top of Cassandra
Prism
- Key Role: UI and API
- Prism provides a management gateway for administrators to configure and monitor the Nutanix cluster. This includes the nCLI, REST API and HTML5 UI web console
- Prism runs on every node in the cluster, with one node elected as the leader
- All requests are forwarded from followers to the leader using Linux iptables
- This allows administrators to access Prism using any Controller VM IP address
- If the Prism leader fails, a new leader is elected
- Prism communicates with Zeus for cluster configuration data and Cassandra for statistics to present to the user
- It also communicates with the ESXi hosts for VM status and related information
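Since Prism exposes a REST API, a quick way to see it in action is a small script with Python's requests library. The port and endpoint path below reflect the Prism Element v2.0 API as commonly documented, but treat them as assumptions and confirm the exact paths in your cluster's REST API Explorer; the address and credentials are placeholders.

```python
# Hedged sketch: querying cluster details from the Prism REST API.
# Endpoint path assumed from the Prism Element v2.0 API; verify in the API Explorer.
import requests
from requests.auth import HTTPBasicAuth

PRISM = "https://10.0.0.10:9440"           # any CVM IP or the cluster VIP (placeholder)
AUTH = HTTPBasicAuth("admin", "password")  # replace with real credentials

resp = requests.get(
    f"{PRISM}/PrismGateway/services/rest/v2.0/cluster",
    auth=AUTH,
    verify=False,   # self-signed certs are common on lab clusters; use a CA bundle in production
    timeout=10,
)
resp.raise_for_status()
cluster = resp.json()
print(cluster.get("name"), cluster.get("version"))
```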
Read more: Nutanix Prism Core Architecture Explained
Acropolis Services
The Nutanix Acropolis service runs in a master/slave fashion on every CVM, with an elected Acropolis Master responsible for task scheduling, execution, IPAM, and so on. As with other components that have a master, if the Acropolis Master fails, a new one is elected.
The role breakdown for the Acropolis Master and Slaves is shown below:
- Acropolis Master
  - Task scheduling & execution
  - Stat collection / publishing
  - Network Controller (for hypervisor)
  - VNC proxy (for hypervisor)
  - HA (for hypervisor)
- Acropolis Slave
  - Stat collection / publishing
  - VNC proxy (for hypervisor)
Here we show a conceptual view of the Acropolis Master / Slave relationship:
Dynamic Scheduler
Efficient scheduling of resources is critical to ensuring that resources are consumed effectively. The Acropolis Dynamic Scheduler extends the traditional approach of scheduling, which relies on compute utilization (CPU/memory) alone to make placement decisions.
It leverages compute as well as storage and other metrics to drive VM and volume (ABS) placement decisions. This ensures that resources are consumed effectively and end-user performance is optimal.
Nutanix Resource scheduling can be broken down into two key areas:
- Initial placement
- Where an item is scheduled at power-on
- Runtime Optimization
- Movement of workloads based upon runtime metrics
The original Acropolis Scheduler had handled initial placement decisions since its release. Introduced in AOS 5.0, the Acropolis Dynamic Scheduler expands on this to provide runtime resource optimization.
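A very reduced model of such a placement decision is to score each host on compute and storage headroom and pick the least loaded one. The hosts, weights, and metrics below are illustrative assumptions, not the actual Acropolis Dynamic Scheduler algorithm.

```python
# Hedged sketch: initial placement that considers CPU, memory, and storage usage.
# Hosts and weights are made up; the real scheduler uses richer runtime metrics.
hosts = {
    "host-a": {"cpu": 0.70, "mem": 0.60, "storage": 0.40},
    "host-b": {"cpu": 0.30, "mem": 0.50, "storage": 0.80},
    "host-c": {"cpu": 0.40, "mem": 0.40, "storage": 0.50},
}

WEIGHTS = {"cpu": 0.4, "mem": 0.4, "storage": 0.2}

def placement_score(util: dict) -> float:
    """Lower is better: a weighted sum of current utilization."""
    return sum(WEIGHTS[k] * util[k] for k in WEIGHTS)

best = min(hosts, key=lambda h: placement_score(hosts[h]))
print("place new VM on:", best)   # host-c in this example
```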
The figure shows a high-level view of the scheduler architecture:
Conclusion
Every component of the Nutanix Acropolis cluster plays its own part in keeping the cluster services up and running, handling virtual machine workloads, and tolerating hardware and software component failures up to a sustainable limit.
Thanks for being here, and please share this post with your buddies! 🙂