An admission controlled resource scheduler for large scale vSphere deployments
While DRS (Distributed Resource Scheduler) in vSphere handles CPU and Memory allocations within a single vSphere cluster, larger deployments require another layer of scheduling to make the use of multiple clusters transparent. This class therefore doesn't replace DRS; it works on top of it.
The scheduler in this class performs admission control to make sure clusters don't get overloaded. It does so by adding metrics on top of the CPU and Memory reservation system that DRS already provides. After admission control it also performs very basic initial placement. Note that in-cluster placement and load balancing are left to DRS, and that no cross-cluster load balancing is done.
This class uses the concept of a Pod: a set of clusters that share a set of datastores. From a datastore perspective, a VM can be placed on any host or cluster in the Pod, so admission control is done at the Pod level first. Pods are automatically discovered based on lists of clusters and datastores, as the sketch below illustrates.
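To make the discovery step concrete, here is a minimal sketch. It assumes discovery amounts to grouping clusters by the exact set of datastores they can see; all names here (Pod, discover_pods, the input mapping) are hypothetical and not taken from the actual implementation.

    from collections import defaultdict
    from dataclasses import dataclass

    @dataclass
    class Pod:
        clusters: list          # cluster names or managed object references
        datastores: frozenset   # datastores shared by every cluster in the pod

    def discover_pods(cluster_to_datastores):
        """Group clusters that see an identical set of datastores into one Pod."""
        grouped = defaultdict(list)
        for cluster, datastores in cluster_to_datastores.items():
            grouped[frozenset(datastores)].append(cluster)
        return [Pod(clusters=c, datastores=ds) for ds, c in grouped.items()]

    # cluster-a and cluster-b share ds-1/ds-2, so they end up in the same Pod.
    pods = discover_pods({
        "cluster-a": {"ds-1", "ds-2"},
        "cluster-b": {"ds-1", "ds-2"},
        "cluster-c": {"ds-3"},
    })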
Admission control covers the following metrics (a consolidated sketch follows the list):
Host availability: If no hosts are available within a cluster or pod, admission is denied.
Minimum free space: If a datastore falls below a configured free space percentage, admission to it is denied. Admission to a pod is granted as long as at least one datastore passes admission control.
Maximum number of VMs: If a Pod exceeds a configured number of powered-on VMs, admission is denied. This is a crude but effective catch-all metric in case users didn't set proper individual CPU or Memory reservations, or if the scalability limit doesn't originate from CPU or Memory.
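The three checks above could be consolidated as in the following sketch. The field names (available_hosts, powered_on_vms, the free-space figures) and threshold parameters are assumptions for illustration; the real implementation may structure the checks differently.

    from dataclasses import dataclass

    @dataclass
    class ClusterStats:
        available_hosts: int
        powered_on_vms: int

    @dataclass
    class DatastoreStats:
        capacity: int    # bytes
        free_space: int  # bytes

    def admit_to_pod(clusters, datastores, max_vms, min_free_pct):
        """Return (admitted, reason); all three checks must pass."""
        # Host availability: at least one usable host somewhere in the pod.
        if not any(c.available_hosts > 0 for c in clusters):
            return False, "no hosts available"
        # Maximum number of VMs: crude catch-all scalability limit for the pod.
        if sum(c.powered_on_vms for c in clusters) >= max_vms:
            return False, "pod VM limit reached"
        # Minimum free space: one passing datastore is enough at the pod level.
        if not any(100.0 * d.free_space / d.capacity >= min_free_pct
                   for d in datastores):
            return False, "all datastores below free-space threshold"
        return True, "admitted"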
Placement after admission control:
Cluster selection: A load metric combining CPU and Memory load is used to select the "least loaded" cluster. The metric is very crude and only meant for rough load balancing, but if DRS clusters are large enough it is good enough in most cases. See the sketch after this list.
Datastore selection: No placement intelligence is currently implemented here.
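A sketch of the "least loaded" selection follows. The exact way CPU and Memory load are combined isn't specified above, so taking the worse of the two utilizations is just one plausible choice; all field names are hypothetical.

    from dataclasses import dataclass

    @dataclass
    class ClusterLoad:
        name: str
        cpu_demand_mhz: int
        cpu_capacity_mhz: int
        mem_demand_mb: int
        mem_capacity_mb: int

    def load_metric(c):
        """Crude combined load: the worse of CPU and Memory utilization."""
        return max(c.cpu_demand_mhz / c.cpu_capacity_mhz,
                   c.mem_demand_mb / c.mem_capacity_mb)

    def pick_least_loaded(clusters):
        return min(clusters, key=load_metric)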
Usage: Instantiate the class, call make_placement_decision, then use the exposed computer (cluster), resource pool, vm_folder and datastore. Once computed, an updated placement currently can't be generated; see the hedged sketch below.
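A minimal usage sketch. The class name and constructor arguments are assumptions; only make_placement_decision and the four exposed values (computer, resource pool, vm_folder, datastore) are named in the description above, and the exact attribute spellings may differ.

    # Hypothetical class and constructor arguments.
    scheduler = AdmissionControlledResourceScheduler(clusters, datastores, config)
    scheduler.make_placement_decision()

    cluster = scheduler.computer        # selected cluster (compute resource)
    pool = scheduler.resource_pool      # resource pool within that cluster
    folder = scheduler.vm_folder        # folder in which to create the VM
    datastore = scheduler.datastore     # selected datastore

    # Note: the decision is one-shot; calling again for an updated
    # placement isn't currently supported.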