DRS vMotion Limiter
vGPUs are becoming more and more common in today’s infrastructure, but new features like this do not always play nicely with existing ones. As an example, my current customer is using vGPUs in their infrastructure, and they face a problem when combining DRS with vGPU-enabled VMs. Whenever too many vMotions happen at the same time, the stun time creates all sorts of issues for the end users. As a workaround, they have set DRS to manual just to be able to control the number of simultaneous vMotions. As this is a repeating task, it should be automated with PowerShell.
Here is a script that gives you the ability to control the number of simultaneous vMotions started by DRS.
<#
Author: Kabir Ali - info@kablog.nl
Scriptname: Apply_DRS_Recommendation.ps1 (ADR)
Version: 1.0 (Tested)
Date: January 12 2023
Why: VMs with vGPUs don't like it when they are migrated all at once. This script will apply the DRS recommendations, where you can set how many migrations should happen simultaneously.
#>
<#
Example:
.\Apply_DRS_Recommendation.ps1 -vCenter "vCenter@local.domain" -Cluster "Production" -Limit "1"
#>
Param (
    [Parameter(Mandatory = $true)][string]$vCenter,
    [Parameter(Mandatory = $true)][string]$Cluster,
    # Typed as int so the comparison with the task count is numeric; -Limit "1" is still accepted
    [Parameter(Mandatory = $true)][int]$Limit
)

# Function to check the number of running vMotions and sleep while the limit is hit
Function running_vMotions {
    # Get number of running vMotions; @() makes sure .Count is reliable for 0 or 1 results
    $num_of_running_vmotions = @(Get-Task -Status Running | where { $_.Name -eq 'RelocateVM_Task' }).Count
    # -ge instead of -eq, so the script also waits when vMotions started elsewhere push the count past the limit
    while ($num_of_running_vmotions -ge $Limit) {
        Write-Output "vMotion Limit Hit. Number of running vMotions $($num_of_running_vmotions) - Sleeping for 3s"
        Start-Sleep 3
        $num_of_running_vmotions = @(Get-Task -Status Running | where { $_.Name -eq 'RelocateVM_Task' }).Count
    }
}

# Function to start the vMotion to the destination host recommended by DRS
Function vMotion ($VM_Name, $Dest_host) {
    Get-VM -Name $VM_Name | Move-VM -Destination $Dest_host -RunAsync | Out-Null
    Write-Output "vMotion started for VM: $($VM_Name) to host $($Dest_host)"
}

# Bypass SSL certificate verification
Add-Type @"
using System.Net;
using System.Security.Cryptography.X509Certificates;
public class TrustAllCertsPolicy : ICertificatePolicy {
    public bool CheckValidationResult(
        ServicePoint srvPoint, X509Certificate certificate,
        WebRequest request, int certificateProblem) {
        return true;
    }
}
"@
[System.Net.ServicePointManager]::CertificatePolicy = New-Object TrustAllCertsPolicy

# Force TLS 1.2 for the connection
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12

# Connect to the vCenter
try {
    Connect-VIServer -Server $vCenter -ErrorAction Stop
}
Catch {
    Write-Warning -Message "Error: Failed to connect to the vCenter. Stopping script."
    # exit instead of Break: Break outside a loop can also terminate a calling script
    exit 1
}

# For the given cluster get DRS Recommendations
$drs_recom = Get-Cluster -Name $Cluster | Get-DrsRecommendation

# Break output of Get-DrsRecommendation into array.
# The Dest_host value still carries the trailing dot of the sentence; it is removed further down.
[array]$drs_text = @()
foreach ($drs_line in $drs_recom.Recommendation) {
    $drs_text += New-Object PSObject -Property @{
        VM_Name     = $drs_line.Split()[2] -replace ''''
        Source_host = $drs_line.Split()[5] -replace ''''
        Dest_host   = $drs_line.Split()[8] -replace ''''
        Status      = "Waiting"
    }
}

# Are there any recommendations to apply?
if ($drs_text.Count -gt 0) {
    # Loop through the array
    foreach ($drs_entry in $drs_text) {
        # Check number of running vMotions
        running_vMotions

        # Remove last dot from ESXi host
        $VM_Name = $drs_entry.VM_Name
        $Dest_fqdn = $drs_entry.Dest_host
        $Dest_host = $Dest_fqdn.TrimEnd('.')

        # Display status
        Write-Output "vMotion will start for $($VM_Name) to host $($Dest_host)"

        # Start new vMotion
        vMotion $VM_Name $Dest_host
    }
}
else {
    Write-Output "No DRS recommendations pending."
}

# Disconnect vCenter.
Disconnect-VIServer -Server * -Confirm:$false
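The script picks the VM and host names out of the recommendation text by word position, which breaks as soon as a VM name contains a space. Below is a minimal sketch of a regex-based alternative. The function name `ConvertFrom-DrsRecommendationText` is hypothetical, and the sentence layout it expects ("Migrate VM '...' from host '...' to host '...'.") is an assumption inferred from the positions the script uses, so check it against the output in your own environment first.

```powershell
# Hypothetical helper, not part of the original script.
# Assumed input format (inferred from the Split() positions used in the script):
#   Migrate VM '<vm>' from host '<source>' to host '<destination>'.
function ConvertFrom-DrsRecommendationText {
    param(
        [Parameter(Mandatory = $true)][string]$Text
    )

    # Named groups capture what sits between the single quotes.
    # The trailing dot of the sentence is outside the last group, so it is never captured.
    $pattern = "^Migrate VM '(?<vm>.+)' from host '(?<src>[^']+)' to host '(?<dst>[^']+)'"

    if ($Text -match $pattern) {
        [pscustomobject]@{
            VM_Name     = $Matches['vm']
            Source_host = $Matches['src']
            Dest_host   = $Matches['dst']
            Status      = 'Waiting'
        }
    }
    # No output when the text does not match, so unknown recommendation types are skipped
}

# Usage with a made-up recommendation line
$sample = "Migrate VM 'vgpu desktop 01' from host 'esx01.lab.local' to host 'esx02.lab.local'."
ConvertFrom-DrsRecommendationText -Text $sample
```

With this approach the destination host comes back without the trailing dot, so the separate cleanup step would no longer be needed.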
Add this script to the Windows Task Scheduler, and you will never have to worry about too many DRS vMotions happening at the same time!
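Scheduling the script can be done from PowerShell as well. The following is a rough sketch using the ScheduledTasks module that ships with recent Windows versions. The function names, the script path, the vCenter name, the task name and the 15-minute interval are all made-up examples, and it assumes the task runs under an account that is allowed to connect to the vCenter without a credential prompt.

```powershell
# Hypothetical helper: builds the powershell.exe argument string for the task action
function Get-AdrTaskArgument {
    param(
        [Parameter(Mandatory = $true)][string]$ScriptPath,
        [Parameter(Mandatory = $true)][string]$vCenter,
        [Parameter(Mandatory = $true)][string]$Cluster,
        [Parameter(Mandatory = $true)][int]$Limit
    )
    '-NoProfile -ExecutionPolicy Bypass -File "{0}" -vCenter "{1}" -Cluster "{2}" -Limit {3}' -f $ScriptPath, $vCenter, $Cluster, $Limit
}

# Hypothetical helper: registers a task that repeats every $IntervalMinutes minutes (Windows only)
function Register-AdrTask {
    param(
        [Parameter(Mandatory = $true)][string]$Argument,
        [string]$TaskName = 'DRS vMotion Limiter',
        [int]$IntervalMinutes = 15
    )
    $action  = New-ScheduledTaskAction -Execute 'powershell.exe' -Argument $Argument
    # Start now and repeat; some older Windows versions may also require -RepetitionDuration
    $trigger = New-ScheduledTaskTrigger -Once -At (Get-Date) -RepetitionInterval (New-TimeSpan -Minutes $IntervalMinutes)
    Register-ScheduledTask -TaskName $TaskName -Action $action -Trigger $trigger
}

# Build the argument string; the path and names below are examples only
$taskArgument = Get-AdrTaskArgument -ScriptPath 'C:\Scripts\Apply_DRS_Recommendation.ps1' -vCenter 'vcenter.lab.local' -Cluster 'Production' -Limit 1
$taskArgument
```

To actually create the task, run `Register-AdrTask -Argument $taskArgument` from an elevated PowerShell session on the machine that should run the script.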