Fixing Teaming and Failover Drift on a vSwitch
Standardization is key for automation! Without proper standardization, every (automated) change becomes a huge risk. I work for a large enterprise and have come to realize that we sometimes lack the necessary standardization. Even though everything is well documented, I found discrepancies between how a vSwitch should be configured and how it was actually configured. Primarily, the number of active uplinks didn’t match what was documented: I discovered several vSwitches and portgroups with just one active uplink, while the standard is to have both uplinks active. Since these were standard vSwitches, which are configured per host, I didn’t have a central tool to rectify the configuration error across the environment, so I wrote a script to correct the issue on every host.
Before I dive into the script, let me briefly explain why standardization and automation are so important.
Why Standardization and Automation Matter
Standardization ensures consistency across your IT environment. When systems are standardized, you can predict their behavior, apply changes with confidence, and reduce the risk of errors. Imagine the impact if the network team decided to patch one physical switch, and some of your ESXi hosts only had a single uplink configured. You could lose management connectivity to hundreds of hosts, putting the stability of your environment at risk. Proper standardization ensures that every host and network component follows the same rules, reducing this risk.
Automation builds on top of standardization. Once your environment is standardized, automation can help you efficiently maintain and enforce those standards across your infrastructure. With automation, you can quickly detect and correct configuration drifts, scale changes across large environments, and eliminate human error from repetitive tasks. This is especially important in large-scale environments where manual management is impractical and error-prone.
For every ESXi host in the targeted cluster(s), the script checks whether vSwitch0 has an uplink on standby. If it finds one, it changes the vSwitch teaming and failover settings so that all uplinks are active. It then checks the teaming and failover settings of every portgroup on that vSwitch. If it finds a portgroup with an uplink on standby, it forces that portgroup to inherit the vSwitch’s settings.
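The core of the fix is simply "promote every standby uplink to active". A minimal sketch of that step, written against plain values so it runs without a vCenter connection; the helper name Get-DesiredActiveUplinks is hypothetical and not part of the script:

```powershell
# Hypothetical helper (not in the original script): builds the comma-separated
# list of uplinks that should be active once standby uplinks are promoted.
function Get-DesiredActiveUplinks {
    param (
        [string[]]$ActiveAdapters,   # Uplinks currently active
        [string[]]$StandbyAdapters   # Uplinks currently on standby
    )
    # Wrap both inputs in @() so a single value or $null still concatenates as an array,
    # drop empty entries, and keep the first occurrence of each uplink name.
    $all = @($ActiveAdapters) + @($StandbyAdapters) | Where-Object { $_ } | Select-Object -Unique
    return ($all -join ",")
}

# Example with the kind of values a failover policy might contain
Get-DesiredActiveUplinks -ActiveAdapters @("vmnic0") -StandbyAdapters @("vmnic1")   # vmnic0,vmnic1
```

Wrapping the inputs in @() matters: adding a plain string and an array in PowerShell concatenates text instead of building a list.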
Script Parameters:
-vCenterServers (Mandatory): Specifies the vCenter Server(s) you want to connect to. You can provide one or multiple servers as an array. Example: @("vcenter1.local.domain", "vcenter2.local.domain")
-vCenterUsername (Mandatory): The username to authenticate with the vCenter Server(s). Example: "administrator@vsphere.local"
-vCenterPassword (Mandatory): The password to authenticate with the vCenter Server(s). Make sure to handle this securely in production environments.
-clusterName (Optional): If provided, the script will only target the specified cluster. If omitted, it will run against all clusters within the vCenter Server(s). Example: "Prod01"
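One way to handle the password more securely is to pass a PSCredential instead of a plain-text string. The sketch below is an assumption about how that could look, not how the script works today; the helper name New-VCenterCredential is hypothetical, while Read-Host -AsSecureString and the -Credential parameter of Connect-VIServer are standard.

```powershell
# Hypothetical helper (not in the original script): wraps a username and a
# SecureString password in a PSCredential object.
function New-VCenterCredential {
    param (
        [Parameter(Mandatory = $true)]
        [string]$Username,
        [Parameter(Mandatory = $true)]
        [securestring]$Password
    )
    return New-Object System.Management.Automation.PSCredential ($Username, $Password)
}

# Assumed interactive usage (requires VMware PowerCLI for Connect-VIServer):
# $securePassword = Read-Host -Prompt "vCenter password" -AsSecureString
# $credential = New-VCenterCredential -Username "administrator@vsphere.local" -Password $securePassword
# Connect-VIServer -Server "vcenter1.local.domain" -Credential $credential -ErrorAction Stop
```

With this approach the password never appears on the command line or in the shell history.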
The script ensures that your vSwitches and portgroups are compliant with your organization’s standard failover and teaming policies, thus reducing the risk of single points of failure.
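Before letting anything change the configuration, a read-only pass can show where the drift is. The sketch below is one possible way to do that with PowerCLI's Get-NicTeamingPolicy cmdlet, which is not what the script itself uses; the filter name Select-TeamingDrift is hypothetical, and the commented pipeline assumes an existing Connect-VIServer session.

```powershell
# Hypothetical filter (not in the original script): passes through only the
# teaming policies that still have at least one standby uplink.
function Select-TeamingDrift {
    param (
        [Parameter(ValueFromPipeline = $true)]
        $Policy
    )
    process {
        # StandbyNic is the property exposed by Get-NicTeamingPolicy output objects
        if (@($Policy.StandbyNic | Where-Object { $_ }).Count -gt 0) {
            $Policy
        }
    }
}

# Assumed usage against a live environment (read-only, changes nothing):
# Get-VMHost | Get-VirtualSwitch -Standard | Get-NicTeamingPolicy | Select-TeamingDrift
# Get-VMHost | Get-VirtualPortGroup -Standard | Get-NicTeamingPolicy | Select-TeamingDrift
```

Anything this returns is a vSwitch or portgroup the script would touch, which makes it a useful sanity check ahead of a change window.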
Thank you for reading my blog, and feel free to reach out to me with any questions or feedback!
<#
.SYNOPSIS
    Script: nicTeamingPolicy
    Version: 1.0 (Tested)
    Date: Sept 09, 2024
    Author: Kabir Ali - info@kablog.nl
    Description: This script will correct the teaming and failover settings of the vSwitch
    Version history:
    0.1 - Sept 09 - Initial version
    1.0 - Sept 16 - Tested release
.EXAMPLE
    .\nicTeamingPolicy.ps1 -vCenterServers @("vcenter1.local.domain", "vcenter2.local.domain") -vCenterUsername "administrator@vsphere.local" -vCenterPassword "VMware1!" -clusterName "Prod01"
#>
Param (
    [Parameter(Mandatory = $true)]
    [ValidateNotNullOrEmpty()]
    [string[]]$vCenterServers,      # Array of vCenter Server(s)

    [Parameter(Mandatory = $true)]
    [ValidateNotNullOrEmpty()]
    [string]$vCenterUsername,       # vCenter Username

    [Parameter(Mandatory = $true)]
    [ValidateNotNullOrEmpty()]
    [string]$vCenterPassword,       # vCenter Password

    [Parameter(Mandatory = $false)]
    [string]$clusterName = "*"      # Optional Cluster Name (default to all)
)

# Function used to get the configured vSwitch failover settings
Function Get-vSwitchFailover {
    Param (
        $esxcli,                    # The EsxCli object
        [string]$vSwitchName        # Name of the vSwitch
    )
    if ($esxcli) {
        return $esxcli.network.vswitch.standard.policy.failover.get.Invoke(@{ vswitchname = $vSwitchName })
    }
    Write-Warning "Function received empty esxcli. Check connectivity to host."
}

# Function used to change the configured vSwitch failover settings
Function Set-vSwitchFailover {
    Param (
        $esxcli,                    # The EsxCli object
        [string]$activeUplinks,     # Active uplinks (comma-separated)
        [string]$vSwitchName        # Name of the vSwitch
    )
    if ($esxcli) {
        # Set the failover policy for the vSwitch
        $esxcli.network.vswitch.standard.policy.failover.set.Invoke(@{
            vswitchname    = $vSwitchName      # Specify vSwitch name
            loadbalancing  = "mac"             # Load balancing policy based on MAC
            failback       = $true             # Enable failback
            notifyswitches = $true             # Notify switches when a failover happens
            activeuplinks  = $activeUplinks    # Set active uplinks
            standbyuplinks = $null             # No standby uplinks
        })
    }
    else {
        Write-Warning "Function received empty esxcli. Check connectivity to host."
    }
}

# Function used to get a list of portgroups on a vSwitch
Function Get-portGroupList {
    Param (
        $esxcli,                    # The EsxCli object
        [string]$vSwitchName        # Name of the vSwitch
    )
    if ($esxcli) {
        # The list call returns the portgroups of every standard vSwitch on the host,
        # so keep only the ones that belong to the specified vSwitch
        return $esxcli.network.vswitch.standard.portgroup.list.Invoke() |
            Where-Object { $_.VirtualSwitch -eq $vSwitchName }
    }
    Write-Warning "Function received empty esxcli. Check connectivity to host."
}

# Function used to get the configured portgroup failover settings
Function Get-portGroupFailover {
    Param (
        $esxcli,                    # The EsxCli object (was missing from the parameter list)
        [string]$portGroupName      # Port group name
    )
    if ($esxcli) {
        return $esxcli.network.vswitch.standard.portgroup.policy.failover.get.Invoke(@{ portgroupname = $portGroupName })
    }
    Write-Warning "Function received empty esxcli. Check connectivity to host."
}

# Function used to force the portgroup to use the vSwitch settings
Function Set-portGroupFailover {
    Param (
        $esxcli,                    # The EsxCli object
        [string]$portGroupName      # Port group name
    )
    if ($esxcli) {
        # Reset the port group to use the virtual switch's failover policy
        $esxcli.network.vswitch.standard.portgroup.policy.failover.set.Invoke(@{
            portgroupname = $portGroupName
            usevswitch    = $true
        })
    }
    else {
        Write-Warning "Function received empty esxcli. Check connectivity to host."
    }
}

# Log the start of the script
Write-Output "-------------"
Write-Output "Starting script on $(Get-Date)"

foreach ($vCenterServer in $vCenterServers) {
    # Connect to the vCenter
    try {
        Connect-VIServer -Server $vCenterServer -User $vCenterUsername -Password $vCenterPassword -ErrorAction Stop
        Write-Host "Connected to $vCenterServer successfully." -ForegroundColor Green
    }
    catch {
        Write-Warning "Failed to connect to vCenter: $vCenterServer. Error: $_"
        break   # Stop processing if unable to connect to vCenter
    }

    # Get the cluster(s) to work on ($clusterName defaults to "*", which matches all clusters)
    $clusters = Get-Cluster -Name $clusterName -ErrorAction SilentlyContinue

    # Loop through each cluster
    foreach ($cluster in $clusters) {
        Write-Output "Cluster: $($cluster) found in current vCenter: $($vCenterServer)"

        # Get all hosts in the cluster
        $vmhosts = $cluster | Get-VMHost
        foreach ($vmhost in $vmhosts) {
            $esxcli = Get-EsxCli -VMHost $vmhost -V2   # Retrieve EsxCli object
            if (-not $esxcli) {
                Write-Warning "Could not retrieve esxcli for ESXi host $($vmhost). Skipping host."
                continue
            }

            # Get failover policy of vSwitch0
            $currentFailover = Get-vSwitchFailover -esxcli $esxcli -vSwitchName vSwitch0
            if ($currentFailover.StandbyAdapters) {
                Write-Warning "Not all uplinks active on the vSwitch0 on ESXi host $($vmhost)!"
                # Combine active and standby adapters into one comma-separated string
                $adapters = @($currentFailover.ActiveAdapters) + @($currentFailover.StandbyAdapters)
                $adaptersString = $adapters -join ","
                Write-Output "Changing failover settings on ESXi host $($vmhost) for vSwitch0."
                Set-vSwitchFailover -esxcli $esxcli -activeUplinks $adaptersString -vSwitchName vSwitch0
            }

            # Loop through each port group on the vSwitch and check its failover policy
            $portGroups = Get-portGroupList -esxcli $esxcli -vSwitchName vSwitch0
            foreach ($portGroup in $portGroups) {
                $portGroupName = $portGroup.Name
                $portGroupFailover = Get-portGroupFailover -esxcli $esxcli -portGroupName $portGroupName

                # If standby adapters exist, reset the port group to use the vSwitch settings
                if ($portGroupFailover.StandbyAdapters) {
                    Write-Warning "Not all uplinks active on portgroup $($portGroupName)!"
                    Write-Output "Changing failover settings on ESXi host $($vmhost) for portgroup $($portGroupName)."
                    Set-portGroupFailover -esxcli $esxcli -portGroupName $portGroupName
                }
            }
        }
    }

    # Disconnect from the vCenter
    Disconnect-VIServer -Server $vCenterServer -Confirm:$false
    Write-Host "Disconnected from $vCenterServer successfully." -ForegroundColor Yellow
}
#Broadcom #VMware #vSwitch #PowerShell #PowerCLI