Auto-provisioning profiles

Auto-provisioning profiles control how the node auto-provisioner behaves when it considers consolidating a Fleet. They give you a knob to trade off aggressive cost optimization against workload disruption, so you can match the auto-provisioner’s behavior to the sensitivity of your workloads.

Background: consolidation

CFKE continuously evaluates whether the current set of nodes in a Fleet is the most efficient way to run your workloads. When it sees that pods could be repacked onto fewer or smaller nodes, it consolidates: pods are evicted from under-utilized nodes, rescheduled onto a tighter footprint, and the freed nodes are removed. Consolidation is the main reason CFKE delivers strong cost efficiency without the user having to right-size anything by hand.

The trade-off is that consolidation involves pod evictions. Workloads that tolerate restarts benefit from aggressive consolidation because they pay a lower infrastructure bill. Workloads that are sensitive to disruption (long-running jobs, stateful services, sessions with warm caches) may prefer a calmer environment, even if it means paying for some unused capacity.

Available profiles

You can pick the consolidation policy that fits each Fleet:

Conservative (conservative, default): the auto-provisioner prioritizes the stability of the Fleet. It removes nodes that are already empty, for example because their pods finished, scaled down, or moved away on their own, but it does not evict running pods purely to repack them onto a cheaper layout. This profile keeps long-running workloads stable at the price of leaving some unused capacity around for longer.
Aggressive (aggressive): the auto-provisioner prioritizes cost optimization. It will disrupt and replace under-utilized nodes whenever it detects a cheaper layout, including evicting running pods to repack them onto fewer or smaller nodes. This produces the lowest infrastructure cost but the highest rate of pod evictions.
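
To make the difference between the two profiles concrete, here is a toy model in Python. It is only a sketch and not how CFKE is implemented: the Node type, the removable_nodes function, and the single capacity number per node are assumptions made for illustration, and load is treated as freely divisible. A real auto-provisioner also weighs individual pod sizes, scheduling constraints, and pricing.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    capacity: int  # schedulable units, e.g. CPU millicores (illustrative)
    used: int      # units requested by pods currently on the node


def removable_nodes(nodes, profile):
    """Return names of nodes this toy model would remove under `profile`."""
    if profile not in ("conservative", "aggressive"):
        raise ValueError(f"unknown profile: {profile!r}")

    # Both profiles remove nodes that are already empty.
    removed = [n.name for n in nodes if n.used == 0]
    if profile == "conservative":
        return removed

    # Aggressive: also drain a node when its load fits into the spare
    # room on the nodes that stay. Least-loaded nodes are tried first.
    busy = [n for n in nodes if n.used > 0]
    free = {n.name: n.capacity - n.used for n in busy}
    for cand in sorted(busy, key=lambda n: n.used):
        spare_elsewhere = sum(v for k, v in free.items() if k != cand.name)
        if len(free) > 1 and cand.used <= spare_elsewhere:
            del free[cand.name]
            remaining = cand.used
            for name in free:  # spread the evicted load over the kept nodes
                take = min(free[name], remaining)
                free[name] -= take
                remaining -= take
            removed.append(cand.name)
    return removed


fleet = [Node("a", 4000, 3000), Node("b", 4000, 500), Node("c", 4000, 0)]
print(removable_nodes(fleet, "conservative"))  # only the empty node
print(removable_nodes(fleet, "aggressive"))    # the empty node plus the drained one
```

In this example the conservative profile removes only the empty node c, while the aggressive profile also drains node b because its load fits on node a.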

Picking a profile

The right profile depends on what your workloads look like:

Stateless services with fast restarts and no warm-cache dependency: the aggressive profile is usually the right choice. Evictions are cheap and the savings add up.
Long-running batch jobs, large model training runs, services with expensive warm-up, or workloads with tight SLOs around availability: the conservative profile, which is also the default, avoids surprise evictions and keeps workloads on stable nodes.
Mixed clusters: run multiple Fleets, each with its own profile, and use Fleet constraints and pod scheduling rules to direct workloads to the Fleet whose policy fits.
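
The guidance above can be written down as a small decision helper. This is a sketch of the rule of thumb only: the suggest_profile function and its parameter names are hypothetical and not part of any CFKE API.

```python
def suggest_profile(*, restart_tolerant, expensive_warmup=False,
                    long_running=False, tight_availability_slo=False):
    """Map workload traits to a profile name, following the guidance above."""
    disruption_sensitive = (
        not restart_tolerant
        or expensive_warmup
        or long_running
        or tight_availability_slo
    )
    # Any sign of disruption sensitivity points to the default profile.
    return "conservative" if disruption_sensitive else "aggressive"


# A stateless web service with fast restarts.
print(suggest_profile(restart_tolerant=True))
# A multi-day training run.
print(suggest_profile(restart_tolerant=True, long_running=True))
```

For a mixed cluster, you would apply the same check per workload group and place each group on a Fleet with the matching profile.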

Pod-level controls such as PodDisruptionBudget, terminationGracePeriodSeconds, and the karpenter.sh/do-not-disrupt annotation continue to work alongside profiles. Profiles set the default behavior for the Fleet; pod-level controls fine-tune individual workloads.
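
As an illustration of those pod-level controls, the Python sketch below builds a PodDisruptionBudget and a pod template fragment as plain dictionaries and prints them as JSON. The disruption_controls helper, the checkout app label, and the chosen values are assumptions for the example. The pod template is a fragment meant to be merged into your own workload manifest, so it omits containers.

```python
import json


def disruption_controls(app, min_available, grace_seconds, do_not_disrupt):
    """Build a PodDisruptionBudget and a pod template fragment for `app`."""
    pdb = {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": f"{app}-pdb"},
        "spec": {
            # Keep at least this many matching pods running during evictions.
            "minAvailable": min_available,
            "selector": {"matchLabels": {"app": app}},
        },
    }
    annotations = {}
    if do_not_disrupt:
        # Annotation values are strings, so "true" is quoted.
        annotations["karpenter.sh/do-not-disrupt"] = "true"
    pod_template = {
        "metadata": {"labels": {"app": app}, "annotations": annotations},
        # Time a pod gets to shut down cleanly once it is evicted.
        "spec": {"terminationGracePeriodSeconds": grace_seconds},
    }
    return pdb, pod_template


pdb, template = disruption_controls("checkout", 2, 60, do_not_disrupt=True)
print(json.dumps(pdb, indent=2))
print(json.dumps(template, indent=2))
```

A workload set up this way keeps its own guarantees even on a Fleet that uses the aggressive profile.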

Configuring a profile

Set the profile when you create or update a Fleet through the API or the console. In the API, set the scalingProfile field on the Fleet to aggressive or conservative. If you omit the field, the Fleet uses the conservative profile.

```json
{
  "scalingProfile": "aggressive"
}
```
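
If you generate the request body from a script, a small helper can validate the value before you send it. The Python sketch below only builds the JSON fragment shown above. The fleet_profile_payload function is hypothetical, and the endpoint, authentication, and any other Fleet fields are left out because they are not covered here.

```python
import json

VALID_PROFILES = ("conservative", "aggressive")


def fleet_profile_payload(profile=None):
    """Return the Fleet fragment that selects `profile`.

    With no profile, the field is omitted, so the Fleet falls back
    to the default (conservative) profile.
    """
    if profile is None:
        return {}
    if profile not in VALID_PROFILES:
        raise ValueError(
            f"scalingProfile must be one of {VALID_PROFILES}, got {profile!r}"
        )
    return {"scalingProfile": profile}


print(json.dumps(fleet_profile_payload("aggressive"), indent=2))
```

Rejecting unknown values locally catches typos such as "agressive" before the request leaves your machine.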

Each Fleet has its own profile, so you can mix aggressive and conservative Fleets in the same cluster and steer workloads to the right one with Fleet constraints and pod scheduling rules.
