
Fixing Teaming and Failover Drift on a vSwitch

Standardization is key for automation! Without proper standardization, every (automated) change becomes a huge risk. I work for a large enterprise and have come to realize that we sometimes lack the necessary standardization. Even though everything is well documented, I found discrepancies between how a vSwitch should be configured and how it was actually configured. Primarily, the number of active uplinks didn’t match what was documented. I discovered several vSwitches and portgroups with just one active uplink, while the standard is to have both uplinks active. Since these were standard vSwitches, which are configured per host, I didn’t have a central tool to rectify the configuration error across the environment, so I wrote a script to correct the issue on every host.

Before I dive into the script, let me briefly explain why standardization and automation are so important.

Why Standardization and Automation Matter

Standardization ensures consistency across your IT environment. When systems are standardized, you can predict their behavior, apply changes with confidence, and reduce the risk of errors. Imagine the impact if the network team decided to patch one physical switch, and some of your ESXi hosts only had a single uplink configured. You could lose management connectivity to hundreds of hosts, putting the stability of your environment at risk. Proper standardization ensures that every host and network component follows the same rules, reducing this risk.

Automation builds on top of standardization. Once your environment is standardized, automation can help you efficiently maintain and enforce those standards across your infrastructure. With automation, you can quickly detect and correct configuration drifts, scale changes across large environments, and eliminate human error from repetitive tasks. This is especially important in large-scale environments where manual management is impractical and error-prone.

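For every host in scope, the script checks whether vSwitch0 has an uplink on standby. If it finds one, it changes the vSwitch teaming and failover settings so that the standby uplinks become active alongside the existing active ones. It then checks the teaming and failover settings of every portgroup on that vSwitch. If it finds a portgroup with an uplink on standby, it forces that portgroup to adopt the vSwitch’s settings. Be aware that when the script corrects a vSwitch, it also sets load balancing to source MAC hash and enables failback and switch notification, so adjust those values in the script if your standard differs.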
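The drift check itself is small enough to try out without touching a host. The sketch below is a minimal illustration, not part of the script: Get-MergedUplinkList is a hypothetical helper name, and the two test objects are stand-ins that only mimic the ActiveAdapters and StandbyAdapters properties the script reads from the esxcli failover policy.

```powershell
# Hypothetical helper (not in the original script): given a failover policy
# object exposing ActiveAdapters and StandbyAdapters, return the
# comma-separated uplink list to activate, or $null when nothing is on standby.
function Get-MergedUplinkList {
    param (
        [Parameter(Mandatory = $true)]
        $FailoverPolicy
    )

    # Wrap in @() and drop empty values so a single adapter name, an empty
    # array, and $null are all handled the same way.
    $standby = @($FailoverPolicy.StandbyAdapters | Where-Object { $_ })
    if ($standby.Count -eq 0) {
        return $null  # No drift: nothing is on standby
    }

    $active = @($FailoverPolicy.ActiveAdapters | Where-Object { $_ })
    return ($active + $standby) -join ","
}

# Stand-in policy objects (assumed shapes, for illustration only)
$drifted   = [pscustomobject]@{ ActiveAdapters = @("vmnic0"); StandbyAdapters = @("vmnic1") }
$compliant = [pscustomobject]@{ ActiveAdapters = @("vmnic0", "vmnic1"); StandbyAdapters = @() }

Get-MergedUplinkList -FailoverPolicy $drifted    # vmnic0,vmnic1
Get-MergedUplinkList -FailoverPolicy $compliant  # returns nothing
```

Running the check against stand-in objects like these is a cheap way to confirm the logic before pointing the script at production hosts.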

Script Parameters:

  1. -vCenterServers (Mandatory):
    Specifies the vCenter Server(s) you want to connect to. You can provide one or multiple servers as an array. Example: @("vcenter1.local.domain", "vcenter2.local.domain").
  2. -vCenterUsername (Mandatory):
    The username to authenticate with the vCenter Server(s). Example: "administrator@vsphere.local".
  3. -vCenterPassword (Mandatory):
    The password to authenticate with the vCenter Server(s). The script accepts it as a plain-text string, so make sure to handle this securely in production environments.
  4. -clusterName (Optional):
    If provided, the script will only target the specified cluster. If omitted, it will run against all clusters within the vCenter Server(s). Example: "Prod01".
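One possible way to keep the password off the command line is to build a PSCredential and hand that to Connect-VIServer through its -Credential parameter. This is a sketch of an alternative, not how the script currently works: New-VCenterCredential is a hypothetical helper name, and using it would mean replacing the -vCenterUsername and -vCenterPassword parameters in the script.

```powershell
# Hypothetical helper (not in the original script): wrap a username and a
# SecureString password in a PSCredential object.
function New-VCenterCredential {
    param (
        [Parameter(Mandatory = $true)]
        [string]$Username,

        [Parameter(Mandatory = $true)]
        [securestring]$Password
    )

    return New-Object System.Management.Automation.PSCredential ($Username, $Password)
}

# Interactive use (commented out so the block runs unattended):
# prompt for the password so it never shows up in shell history.
# $securePassword = Read-Host -Prompt "vCenter password" -AsSecureString
# $credential = New-VCenterCredential -Username "administrator@vsphere.local" -Password $securePassword
# Connect-VIServer -Server "vcenter1.local.domain" -Credential $credential -ErrorAction Stop
```

For a purely interactive run, Get-Credential gives the same result with a single prompt. The helper is mainly useful when the SecureString comes from somewhere else, such as a secrets vault.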

The script ensures that your vSwitches and portgroups are compliant with your organization’s standard failover and teaming policies, thus reducing the risk of single points of failure.

Thank you for reading my blog, and feel free to reach out to me with any questions or feedback!

<#
.SYNOPSIS
Script: nicTeamingPolicy
Version: 1.0 (Tested)
Date: Sept 09, 2024
Author: Kabir Ali - info@kablog.nl
Description: This script corrects the teaming and failover settings of vSwitch0 and its portgroups
Version history:
0.1 - Sept 09 - Initial version
1.0 - Sept 16 - Tested release

.EXAMPLE
.\nicTeamingPolicy.ps1 -vCenterServers "vcenter1.local.domain", "vcenter2.local.domain" -vCenterUsername "administrator@vsphere.local" -vCenterPassword "VMware1!" -clusterName "Prod01"
#>


Param (
    [Parameter(Mandatory = $true)]
    [ValidateNotNullOrEmpty()]
    [string[]]$vCenterServers,  # Array of vCenter Server(s)

    [Parameter(Mandatory = $true)]
    [ValidateNotNullOrEmpty()]
    [string]$vCenterUsername,  # vCenter Username

    [Parameter(Mandatory = $true)]
    [ValidateNotNullOrEmpty()]
    [string]$vCenterPassword,  # vCenter Password

    [Parameter(Mandatory = $false)]
    [string]$clusterName = "*"  # Optional Cluster Name (default to all)
)


# Function used to get the configured vSwitch failover settings
Function Get-vSwitchFailover {
    Param (
        $esxcli,  # The EsxCli object
        [string]$vSwitchName  # Name of the vSwitch
    )

    # Check if $esxcli is not empty
    if ($esxcli) {
        $failoverPolicy = $esxcli.network.vswitch.standard.policy.failover.get.Invoke(@{
            vswitchname = $vSwitchName
        })
        return $failoverPolicy
    } else {
        Write-Warning "Function received empty esxcli. Check connectivity to host."
        return  # Exit the function if $esxcli is null
    }
}

# Function used to change the configured vSwitch failover settings
Function Set-vSwitchFailover {
    Param (
        $esxcli,  # The EsxCli object
        [string]$activeUplinks,  # Active uplinks (comma-separated)
        [string]$vSwitchName  # Name of the vSwitch
    )

    if ($esxcli) {
        # Set the failover policy for the vSwitch
        $esxcli.network.vswitch.standard.policy.failover.set.Invoke(@{
            vswitchname = $vSwitchName;  # Specify vSwitch name
            loadbalancing = "mac";  # Load Balancing policy based on MAC
            failback = $true;  # Enable failback
            notifyswitches = $true;  # Notify switches when a failover happens
            activeuplinks = $activeUplinks;  # Set active uplinks
            standbyuplinks = $null;  # No standby uplinks
        })
    } else {
        Write-Warning "Function received empty esxcli. Check connectivity to host."
        return  # Exit the function if $esxcli is null
    }
}

# Function used to get a list of portgroups
Function Get-portGroupList {
    Param (
        $esxcli,  # The EsxCli object
        [string]$vSwitchName  # Name of the vSwitch
    )

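    if ($esxcli) {
        # Retrieve the port groups on the host and keep only those on the specified vSwitch
        # (VirtualSwitch is the property PowerCLI exposes for esxcli's "Virtual Switch" column)
        $portGroups = $esxcli.network.vswitch.standard.portgroup.list.Invoke() |
            Where-Object { $_.VirtualSwitch -eq $vSwitchName }
        return $portGroups
    } else {
        Write-Warning "Function received empty esxcli. Check connectivity to host."
        return  # Exit the function if $esxcli is null
    }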
}

# Function used to get the configured portgroup failover settings
Function Get-portGroupFailover {
    Param (
        [string]$portGroupName,  # Port group name
        [string]$vSwitchName  # Name of the vSwitch
    )

    # Note: $esxcli is not a parameter of this function; it is resolved from the calling scope
    if ($esxcli) {
        # Get the failover policy for the port group
        $failoverPolicy = $esxcli.network.vswitch.standard.portgroup.policy.failover.get.Invoke(@{
            portgroupname = $portGroupName
        })
        return $failoverPolicy
    } else {
        Write-Warning "Function received empty esxcli. Check connectivity to host."
        return  # Exit the function if $esxcli is null
    }
}

# Function used to force the portgroup to use the vSwitch settings
Function Set-portGroupFailover {
    Param (
        $esxcli,  # The EsxCli object
        [string]$portgroupname  # Port group name
    )

    if ($esxcli) {
        # Reset the port group to use the virtual switch's failover policy
        $esxcli.network.vswitch.standard.portgroup.policy.failover.set.Invoke(@{
            portgroupname = $portGroupName;
            usevswitch = $true  # Reset to use the vSwitch's failover settings
        })
    } else {
        Write-Warning "Function received empty esxcli. Check connectivity to host."
        return  # Exit the function if $esxcli is null
    }
}

# Log the start of the script
Write-Output "-------------" 
Write-Output "Starting script on $(Get-Date)"  # Output the current time and date

foreach ($vCenterServer in $vCenterServers) {

    # Connect to the vCenter
    try {
        Connect-VIServer -Server $vCenterServer -User $vCenterUsername -Password $vCenterPassword -ErrorAction Stop
        Write-Host "Connected to $vCenterServer successfully." -ForegroundColor Green
    } catch {
        Write-Warning "Failed to connect to vCenter: $vCenterServer. Error: $_"
        continue  # Skip this vCenter and move on to the next one
    }
    
    # Get the cluster(s) to work on
    if ($clusterName) {
        $clusters = get-cluster -name $clusterName -ErrorAction SilentlyContinue
    } else {
        $clusters = get-cluster
    }

    # Loop through each cluster
    foreach ($cluster in $clusters) {
        Write-Output "Cluster: $($cluster) found in current vCenter: $($vCenterServer)"

        # Get all hosts in the cluster
        $vmhosts = get-cluster -name $cluster.name | get-vmhost
    
        foreach ($vmhost in $vmhosts) {
            $esxcli = Get-EsxCli -VMHost $vmhost -V2  # Retrieve EsxCli object
            $currentFailover = Get-vSwitchFailover -esxcli $esxcli -vSwitchName vSwitch0  # Get failover policy of vSwitch0

            if ($currentFailover.StandbyAdapters -ne $null) {
                write-warning "Not all uplinks active on the vSwitch0 on ESXi host $($vmhost)!"
                $adapters = $currentFailover.ActiveAdapters + $currentFailover.StandbyAdapters  # Combine active and standby adapters
                $adaptersString = $adapters -join ","  # Create a string of the adapters
                Write-Output "Changing failover settings on ESXi host $($vmhost) for vSwitch0."
                Set-vSwitchFailover -esxcli $esxcli -activeUplinks $adaptersString -vSwitchName vSwitch0  # Set new failover settings
            }

            # List all port groups on the vSwitch
            $portGroups = Get-portGroupList -esxcli $esxcli -vSwitchName vSwitch0

            # Loop through each port group to get its failover policy
            foreach ($portGroup in $portGroups) {
                $portGroupName = $portGroup.Name
                $portGroupFailover = Get-portGroupFailover -portGroupName $portGroupName -vSwitchName vSwitch0

                # If the port group has a standby uplink, reset it to inherit the vSwitch settings
                if ($portGroupFailover.StandbyAdapters -ne $null) {
                    write-warning "Not all uplinks active on portgroup $($portGroupName)!"
                    Write-Output "Changing failover settings on ESXi host $($vmhost) for portgroup $($portGroupName)."
                    Set-portGroupFailover -esxcli $esxcli -portgroupname $portGroupName  # Reset port group to use vSwitch settings
                }
            }
        }
    }

    # Disconnect from the current vCenter, then report it
    Disconnect-VIServer -Server $vCenterServer -Confirm:$false
    Write-Host "Disconnected from $vCenterServer successfully." -ForegroundColor Yellow
}

#Broadcom #VMware #vSwitch #PowerShell #PowerCLI
