About resize requests in a MIG


This document explains how resize requests in a managed instance group (MIG) work and describes their limitations. To learn about other ways to add virtual machine (VM) instances to a MIG, see Add and remove VMs from a MIG.

Use MIG resize requests to create multiple VMs all at once. This approach has the following benefits:

  • Resize requests help you avoid charges for partial capacity that Compute Engine might create as it provisions all resources.

  • Resize requests use the flex-start provisioning model (Preview), which improves your chances of obtaining high-demand resources like GPUs.

How resize requests work

The following sections outline how MIG resize requests work.

On creation

When you create a MIG resize request, you must specify the following properties:

  • To define the number of VMs to create, use one of the following properties:

    • resizeBy: the number of VMs to create. The MIG automatically generates the names of the VMs.

    • instanceNames: a list of names of VMs to create. The MIG creates as many VMs as the number of names you specify. This property is in Preview and is useful if your workload requires specific VM names.

  • requestedRunDuration: the duration for which the VMs must run. The duration must be between 10 minutes and seven days. After the run duration ends, the MIG automatically deletes the VMs. When you create a resize request in a MIG that uses the features and services available from Cluster Director, this property is optional. In such MIGs, if you don't specify a run duration for a resize request, then the VMs run until the end of the reservation that the MIG uses.
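
The property rules above can be sketched as a small validation helper. This is a minimal illustration, not the Compute Engine client library: the function name build_resize_request, the run_duration_optional flag (standing in for a MIG that uses Cluster Director features), the rule that resizeBy and instanceNames are mutually exclusive, and the encoding of the duration as a dictionary with a seconds field are all assumptions made for this example.

```python
from datetime import timedelta

# Bounds stated in this document: between 10 minutes and seven days.
MIN_RUN_DURATION = timedelta(minutes=10)
MAX_RUN_DURATION = timedelta(days=7)


def build_resize_request(name, run_duration=None, resize_by=None,
                         instance_names=None, run_duration_optional=False):
    """Return a dict describing a resize request, or raise ValueError.

    Hypothetical helper: the dict layout is an assumption for illustration.
    """
    # Assumption: exactly one of resizeBy or instanceNames is given.
    if (resize_by is None) == (instance_names is None):
        raise ValueError("specify exactly one of resize_by or instance_names")

    body = {"name": name}
    if resize_by is not None:
        if resize_by < 1:
            raise ValueError("resize_by must be at least 1")
        body["resizeBy"] = resize_by
    else:
        if not instance_names:
            raise ValueError("instance_names must not be empty")
        # The MIG creates as many VMs as there are names.
        body["instanceNames"] = list(instance_names)

    if run_duration is None:
        # The run duration may be omitted only where it is optional.
        if not run_duration_optional:
            raise ValueError("run_duration is required")
    else:
        if not MIN_RUN_DURATION <= run_duration <= MAX_RUN_DURATION:
            raise ValueError("run_duration must be between 10 minutes and 7 days")
        body["requestedRunDuration"] = {"seconds": int(run_duration.total_seconds())}
    return body
```

For example, build_resize_request("rr-1", run_duration=timedelta(hours=2), resize_by=4) describes a request for four automatically named VMs that run for two hours, while a five-minute run duration is rejected.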

After creation

After you create a MIG resize request, the request goes through different states. The following diagram shows these states:

A diagram of each state that Compute Engine can set a resize request to.

The states shown in the preceding diagram are as follows:

  • CREATING: Compute Engine has received the resize request. The MIG's target size increases by the number of VMs specified in the request, and the MIG creates managed instances in the CREATING state. These managed instances represent the VMs that the MIG creates when the resize request succeeds.

  • ACCEPTED: the request has been accepted and created. The underlying scheduler mechanism, the Dynamic Workload Scheduler (DWS), schedules the creation of the requested resources based on resource availability and the run duration specified in the request. If you lack quota for the requested resources or the resources are temporarily unavailable, then the DWS persists the request until you have sufficient quota and the resources become available.

  • SUCCEEDED: the MIG created the requested number of VMs all at once. The VMs run until the MIG deletes them after the specified run duration ends, or until you delete the VMs.

  • FAILED: the resize request failed due to a technical error and Compute Engine decreased the target size of the MIG by the number of requested VMs.

  • CANCELLED: a user canceled the resize request. Canceling a resize request stops the MIG from creating the requested resources. After canceling a resize request, Compute Engine decreases the MIG's target size by the number of requested VMs and automatically deletes the request after 14 days. Optionally, you can delete a resize request before Compute Engine automatically deletes it.
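
The states above, and their effect on the MIG's target size, can be modeled as a small state machine. This is a sketch under stated assumptions: the class and the TRANSITIONS table are hypothetical, and the set of allowed transitions is inferred from the state descriptions rather than taken from the diagram, so it might omit edges that Compute Engine permits.

```python
from enum import Enum


class State(Enum):
    CREATING = "CREATING"
    ACCEPTED = "ACCEPTED"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"
    CANCELLED = "CANCELLED"


# Assumed transitions, inferred from the state descriptions.
TRANSITIONS = {
    State.CREATING: {State.ACCEPTED},
    State.ACCEPTED: {State.SUCCEEDED, State.FAILED, State.CANCELLED},
    State.SUCCEEDED: set(),
    State.FAILED: set(),
    State.CANCELLED: set(),
}


class ResizeRequestModel:
    """Toy model of one resize request and the target size of its MIG."""

    def __init__(self, mig_target_size, requested_vms):
        self.requested_vms = requested_vms
        self.state = State.CREATING
        # On creation, the target size increases by the requested VM count.
        self.target_size = mig_target_size + requested_vms

    def transition(self, new_state):
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"cannot go from {self.state.name} to {new_state.name}")
        # On failure or cancellation, the target size decreases again.
        if new_state in (State.FAILED, State.CANCELLED):
            self.target_size -= self.requested_vms
        self.state = new_state
```

In this model, a request for three VMs in a MIG with a target size of two raises the target size to five, and canceling the request after it reaches ACCEPTED returns the target size to two.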

If you delete a MIG containing resize requests, then this operation also deletes any resize requests and VMs in the MIG. However, if you delete a MIG when the MIG is creating VMs to fulfill a resize request, Compute Engine waits until the MIG has finished creating the requested number of VMs and the state of the resize request transitions to SUCCEEDED before deleting the MIG.

Limitations

The following sections outline the limitations for creating MIG resize requests.

For resize requests

MIG resize requests have the following limitations:

  • You can only use resize requests to obtain the following GPU machine types:

    • For zonal MIGs, all GPU machine types

    • For regional MIGs, all GPU machine types except A4 and A3 Ultra

  • You can only cancel resize requests that are in the ACCEPTED state.

  • You can only delete a resize request after it succeeds (SUCCEEDED), fails (FAILED), or is canceled (CANCELLED).
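
The cancel and delete rules in the preceding list reduce to two state checks. A minimal sketch follows. The function names can_cancel and can_delete are hypothetical and not part of any Google Cloud library. The state strings are the ones used in this document.

```python
# States in which each operation is allowed, per the limitations above.
CANCELLABLE_STATES = frozenset({"ACCEPTED"})
DELETABLE_STATES = frozenset({"SUCCEEDED", "FAILED", "CANCELLED"})


def can_cancel(state):
    """Return True if a resize request in this state can be canceled."""
    return state in CANCELLABLE_STATES


def can_delete(state):
    """Return True if a resize request in this state can be deleted."""
    return state in DELETABLE_STATES
```

A script that cleans up resize requests could use checks like these to decide whether to cancel a request first and delete it afterward, instead of attempting an operation that the request's current state doesn't allow.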

For the instance template

When you want to create a MIG resize request, the MIG's instance template must adhere to the following:

For the MIG

For the MIG in which you want to create resize requests, the following limitations apply:

Quota for GPU VMs with requested run duration

GPU VMs that are configured to be automatically deleted after a predefined run duration of seven days or less can consume either preemptible or standard allocation quota. This behavior is intended to make it easier to obtain allocation quota for temporary but uninterrupted workloads. For more information about this behavior, see GPU VMs and preemptible allocation quotas.

Pricing

There are no costs associated with creating, canceling, or deleting resize requests. Instead, you're charged as follows:

  • Charges start when the MIG creates your requested number of VMs.

  • Charges stop when one of the following occurs:

    • The MIG automatically deletes the VMs at the end of their run duration.

    • You delete the VMs.
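
The charging window described above can be expressed as a short calculation. This sketch only computes how long charges apply to the VMs and says nothing about rates. The function name billable_duration and its parameters are hypothetical, and it assumes that all VMs from one request share the same creation time.

```python
from datetime import datetime, timedelta, timezone


def billable_duration(created_at, run_duration, deleted_at=None):
    """Return how long charges apply to VMs created by a resize request.

    Charges start at created_at and stop at the end of the run duration
    or at manual deletion, whichever comes first.
    """
    scheduled_end = created_at + run_duration
    end = scheduled_end if deleted_at is None else min(deleted_at, scheduled_end)
    # Guard against a deletion timestamp earlier than creation.
    return max(end - created_at, timedelta(0))


created = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
full_run = billable_duration(created, timedelta(hours=4))
early_delete = billable_duration(created, timedelta(hours=4),
                                 deleted_at=created + timedelta(hours=1))
```

Here full_run is four hours, because the MIG deletes the VMs at the end of their run duration, and early_delete is one hour, because the VMs were deleted manually before the run duration ended.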
