Nutanix Metro Availability Troubleshooting

Today I will explore the top 5 common errors and issues with Nutanix Disaster Recovery Metro Availability, and share tips and tricks to troubleshoot them through to resolution. This post should help anyone who is stuck on a Nutanix Metro Availability technical issue and is looking for a solution.

At the time of writing, the Nutanix Metro Availability feature is available with VMware ESXi only; support for Nutanix AHV is expected in a future release. I am sharing the common errors and issues you might face during or after Metro Availability configuration.

Nutanix Metro Availability common Issues & Troubleshooting

I have listed the top 5 common errors and issues with Nutanix Disaster Recovery Metro Availability here, and the good news is that each one comes with its resolution.

Let's explore these top 5 common technical issues and work through the solution to each of them step by step.

Automatic Promoting Active-Passive Issue

Issue 1: Alert – A130116 – Automatic Promote Metro Availability

Symptoms:

The A130116 – Automatic Promote Metro Availability alert is raised on the standby site in a Witness Metro configuration when the PD (protection domain) on the standby site is promoted to Active.

Possible conditions are:

Resolution

Metro Availability Failure Issue

Issue 2: Alert – A130117 – Failed to update Metro Availability failure handling

Issue 2.1: Alert – A130118 – Failed to update Metro Availability failure handling on the remote site

Symptoms

These alerts are raised when updating failure handling on the source Metro Availability protection domain fails. They may be generated for the following reasons.

Resolution

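Find the Cerebro master CVM:

cvm$ cerebro_cli get_master_location

Check the ping latency to the remote site CVMs, listing only entries above 10 ms or marked unreachable:

cvm$ cat ~/data/logs/sysstats/ping_remotes.INFO | egrep -v "IP : time" | awk '/^#TIMESTAMP/ || $3>10.00 || $3=="unreachable"' | egrep -B1 " ms|unreachable" | egrep -v "\-\-"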

The output looks similar to this:

#TIMESTAMP 2941677438847 : 10/02/2020 09:10:50 PM

10.X.X.X : 180 ms
10.Y.Y.Y : unreachable
10.Z.Z.Z : 180 ms
10.Q.Q.Q : unreachable
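The same filtering can be wrapped in a small reusable function. This is a minimal sketch, assuming the log keeps the line format shown in the sample output above; the function name `flag_bad_remotes` and its threshold argument are illustrative and not part of any Nutanix tooling.

```shell
#!/usr/bin/env bash
# Sketch: list remote CVMs whose ping time exceeds a threshold or that are
# unreachable, reading ping_remotes.INFO-style text from stdin.
# Assumed line formats (taken from the sample output above):
#   #TIMESTAMP <epoch> : <date>
#   <ip> : <number> ms
#   <ip> : unreachable
flag_bad_remotes() {
  local threshold_ms="${1:-10}"   # illustrative default, adjust as needed
  awk -v limit="$threshold_ms" '
    /^#TIMESTAMP/ { stamp = $0; next }             # remember the latest timestamp
    $3 == "unreachable" || ($4 == "ms" && $3 + 0 > limit) {
      if (stamp != "") { print stamp; stamp = "" } # print each timestamp once
      print                                        # print the offending entry
    }
  '
}
```

On a CVM it could be run as `flag_bad_remotes 10 < ~/data/logs/sysstats/ping_remotes.INFO`. Timestamps with no slow or unreachable entries are skipped, so the output stays short.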
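
Confirm that the Stargate (2009) and Cerebro (2020) ports are reachable on the remote site CVMs:

cvm$ nc -v <remote site CVM IP> 2009
cvm$ nc -v <remote site CVM IP> 2020

Confirm that the services are stable; process IDs that keep changing indicate restarts:

cvm$ watch -d "genesis status"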
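The two port checks can be looped over several remote CVMs. This is a minimal sketch, assuming bash and the coreutils `timeout` command are available; it swaps `nc` for bash's built-in `/dev/tcp` redirection, and the function names `port_open` and `check_remote_cvms` are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: probe TCP ports 2009 and 2020 on each remote CVM given as an argument.
# Uses bash's /dev/tcp pseudo-device instead of nc (an assumption about the shell).

# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
port_open() {
  local host="$1" port="$2"
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Print one line per host/port pair: "<host>:<port> open" or "<host>:<port> CLOSED".
check_remote_cvms() {
  local host port
  for host in "$@"; do
    for port in 2009 2020; do
      if port_open "$host" "$port"; then
        echo "$host:$port open"
      else
        echo "$host:$port CLOSED"
      fi
    done
  done
}
```

For example, `check_remote_cvms 10.X.X.X 10.Y.Y.Y` prints four lines, one per CVM and port. A CLOSED result only says the connection did not succeed within the timeout; it does not tell a firewall block from a stopped service.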

High Network Latency Issue

Issue 3: High network latency between Metro Availability Protection Domains (also known as stretched clusters)

Symptoms

When network latency between the active and standby metro clusters stays above 5 ms for 10 seconds, the commit acknowledgement to the VM is delayed. As a result, the remote site is shown in the Incompatible Remote Sites list with LATENCY “Bad”, and the metro relationship is automatically disabled.
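For a rough reading against the 5 ms limit, the average round-trip time reported by `ping` can be compared with the threshold. This is a minimal sketch, assuming Linux iputils `ping` output (the `rtt min/avg/max/mdev` summary line); the function names are illustrative, and a short ping run only approximates the sustained 10-second condition described above.

```shell
#!/usr/bin/env bash
# Sketch: compare the average ping round-trip time with a latency limit.

# Read ping output on stdin and print the average RTT in ms.
# Expects the summary line: "rtt min/avg/max/mdev = a/b/c/d ms".
avg_rtt_ms() {
  awk -F'=' '/^rtt / { split($2, parts, "/"); print parts[2] }'
}

# Exit 0 if the given RTT is at or below the limit (default 5 ms), non-zero otherwise.
latency_ok() {
  local rtt="$1" limit="${2:-5}"
  [ -n "$rtt" ] || return 2          # no measurement, treat as a failure
  awk -v r="$rtt" -v l="$limit" 'BEGIN { exit !(r + 0 <= l + 0) }'
}
```

A possible use from a CVM would be `rtt=$(ping -c 10 <remote site CVM IP> | avg_rtt_ms); latency_ok "$rtt" || echo "latency too high: ${rtt} ms"`.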

Resolution

Witness VM Not Reachable Issue

Issue 4: Alert – A130115 – Witness VM Not Reachable

Symptoms

The Nutanix Metro Availability Witness VM Not Reachable alert is generated when a cluster involved in Metro Availability cannot contact the Witness VM on the network, gets no response from it, or cannot authenticate to it. The possible causes, with the checks for each, are listed below.

Resolution

  1. Witness VM is down or not reachable over the network
    • Check that the Witness VM is up and running.
    • Ping the Witness VM to confirm that it is accessible over the network.
  2. Witness VM internal server errors
    • Check for any errors or alerts reported on the Witness VM.
  3. Witness VM authentication/password errors
    • Confirm that the password used to connect to the Witness VM is valid.

Nutanix Files Server Migration Issue

Issue 5: Nutanix Files: Issues while migrating a Nutanix Files server between ESXi Nutanix clusters (Metro Availability pair)

Symptoms

When migrating a Nutanix Files server cluster to the remote site, where the remote site is the other side of a Nutanix Metro Availability pair, you may see an issue during activation: the task to activate the Nutanix Files server hangs at 47% until it finally times out.

Resolution

Note: Removing a VM from inventory does not delete the VM from disk.

You should now have just one entry per FSVM, with the correct name.

Having completed the workaround above, you should be able to run the Activate workflow for the migrated Files server without issue.

Conclusion

Hopefully, this post helps you resolve the Nutanix Metro Availability issues covered here.

Thanks for being with HyperHCI Tech Blog. Stay tuned, and follow us on social networks to stay connected.
