General Scaling
Core concepts and decision criteria for NodeGroup scaling operations.
Scaling Timeline
Scale-Up Operations (typically 5-8 minutes total):
- Initiation: 0-30 seconds (validation and planning)
- Resource allocation: 30 seconds - 2 minutes (VM provisioning)
- Node bootstrap: 2-5 minutes (OS initialization and Kubernetes registration)
- Health checks: 1-2 minutes (readiness verification)
Scale-Down Operations (typically 3-5 minutes total):
- Workload migration: 1-3 minutes (pod eviction and rescheduling)
- Node draining: 30 seconds - 1 minute (graceful removal)
- Resource cleanup: 30 seconds - 1 minute (VM termination)
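The totals above can be used to pick a timeout when a script waits for a scaling operation to finish. The following is a minimal sketch, not a platform API: `wait_for_scaling`, the `is_done` callback, and the 1.5x safety factor are all hypothetical names and values chosen for illustration. The caller supplies whatever status check the environment provides.

```python
import time
from typing import Callable

# Upper bounds of the totals quoted above, in seconds.
MAX_EXPECTED_SECONDS = {"scale_up": 8 * 60, "scale_down": 5 * 60}


def wait_for_scaling(
    operation: str,
    is_done: Callable[[], bool],
    safety_factor: float = 1.5,   # assumption: extra margin over the quoted total
    poll_seconds: float = 15.0,   # assumption: polling interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll is_done() until it returns True or the time budget runs out.

    Returns True if the operation finished in time, False on timeout.
    """
    deadline = clock() + MAX_EXPECTED_SECONDS[operation] * safety_factor
    while True:
        if is_done():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_seconds)
```

Passing `clock` and `sleep` as parameters keeps the helper testable without real waiting. A timeout here only indicates that the operation took longer than expected, not that it failed.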
Scaling Constraints
Node Limits:
- Minimum: 1 node per worker NodeGroup
- Maximum: 10 nodes per NodeGroup
- Precondition: the NodeGroup must be in the "Ready" state before it can be scaled
Resource Requirements:
- vCloud quota availability for CPU, memory, and storage
- IP address availability within assigned subnets
- Sufficient cluster-level resources for orchestration
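The node limits and the "Ready" requirement above lend themselves to a pre-flight check before a scaling request is submitted. The sketch below is illustrative only: the `NodeGroup` dataclass and `validate_scale_request` function are hypothetical and do not represent the platform's API. Quota and IP availability are not checked here because they depend on data outside this section.

```python
from dataclasses import dataclass
from typing import List

MIN_NODES = 1    # minimum per worker NodeGroup, as stated above
MAX_NODES = 10   # maximum per NodeGroup, as stated above


@dataclass
class NodeGroup:
    """Hypothetical, simplified view of a NodeGroup."""
    name: str
    state: str
    node_count: int


def validate_scale_request(group: NodeGroup, target: int) -> List[str]:
    """Return a list of problems; an empty list means the request looks valid."""
    problems = []
    if group.state != "Ready":
        problems.append(
            f"NodeGroup {group.name} is in state {group.state!r}, expected 'Ready'"
        )
    if target < MIN_NODES:
        problems.append(f"target {target} is below the minimum of {MIN_NODES}")
    if target > MAX_NODES:
        problems.append(f"target {target} is above the maximum of {MAX_NODES}")
    return problems
```

Returning every problem at once, rather than raising on the first, lets an operator correct a request in a single pass.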
When to Scale
Scale Up Scenarios
- Performance Issues: CPU utilization above 70%, memory utilization above 80%, or pods failing to schedule
- Capacity Planning: Traffic growth, seasonal events, or high availability requirements
- Development: Testing and deployment activities requiring extra capacity
Scale Down Scenarios
- Efficiency: Resource utilization below 30% across multiple nodes
- Cost Optimization: Reducing unnecessary infrastructure expenses
- Operational: Maintenance windows, off-peak periods, or project completion
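The numeric thresholds in these scenarios can be expressed as a simple decision rule. This is a rough sketch under stated assumptions, not a recommended autoscaling policy: `NodeMetrics` and `recommend_scaling` are hypothetical names, utilization is averaged across nodes for the scale-up check, and "multiple nodes" is read as at least two nodes with both CPU and memory below 30%.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodeMetrics:
    """Hypothetical per-node utilization sample, in percent."""
    cpu_pct: float
    memory_pct: float


def recommend_scaling(nodes: List[NodeMetrics], unschedulable_pods: int = 0) -> str:
    """Return 'scale_up', 'scale_down', or 'hold' based on the thresholds above."""
    if not nodes:
        raise ValueError("at least one node sample is required")

    avg_cpu = sum(n.cpu_pct for n in nodes) / len(nodes)
    avg_mem = sum(n.memory_pct for n in nodes) / len(nodes)

    # Scale up: CPU above 70%, memory above 80%, or pods that cannot be scheduled.
    if avg_cpu > 70 or avg_mem > 80 or unschedulable_pods > 0:
        return "scale_up"

    # Scale down: utilization below 30% on multiple (assumed: two or more) nodes.
    idle = [n for n in nodes if n.cpu_pct < 30 and n.memory_pct < 30]
    if len(idle) >= 2:
        return "scale_down"

    return "hold"
```

Scale-up is checked first so that a capacity shortfall always takes priority over a cost saving. The non-numeric criteria (capacity planning, maintenance windows, project completion) still require human judgment.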
Scaling Strategies
Conservative Scaling
- Add or remove 1-2 nodes at a time
- Monitor the impact before scaling further
- Best for: Production environments, cost-sensitive workloads
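A conservative rollout can be planned as a sequence of intermediate node counts, each at most two nodes away from the previous one. The helper below is a hypothetical sketch (`plan_conservative_steps` is not a platform function). It only produces the plan; monitoring between steps is left to the operator.

```python
from typing import List


def plan_conservative_steps(current: int, target: int, max_step: int = 2) -> List[int]:
    """Return the node counts to apply in order, changing by at most max_step each time."""
    if max_step < 1:
        raise ValueError("max_step must be at least 1")

    steps = []
    count = current
    while count != target:
        # Clamp the remaining difference to the range [-max_step, +max_step].
        delta = max(-max_step, min(max_step, target - count))
        count += delta
        steps.append(count)
    return steps


# Example: growing from 3 to 8 nodes in steps of at most 2.
print(plan_conservative_steps(3, 8))  # [5, 7, 8]
```

The same function covers scale-down: going from 6 to 3 nodes yields the steps 4 and then 3.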
Aggressive Scaling
- Scale rapidly to meet immediate demand
- Accept higher initial over-provisioning
- Best for: High-availability requirements, variable workloads
Predictive Scaling
- Scale based on historical patterns
- Pre-scale before anticipated load increases
- Best for: Scheduled workloads, known traffic patterns
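Predictive scaling can be sketched as a lookup from historical peak load to a node count that is applied ahead of the expected peak. Everything in this example is an assumption made for illustration: the function names, the load unit (for example, requests per second), the per-node capacity figure, the 20% headroom, and the one-hour lead time. The 1-10 node range comes from the limits stated earlier.

```python
import math
from typing import Dict, List

MIN_NODES = 1
MAX_NODES = 10


def nodes_for_load(expected_load: float, capacity_per_node: float,
                   headroom: float = 0.2) -> int:
    """Nodes needed to serve expected_load with spare headroom, clamped to the limits."""
    needed = math.ceil(expected_load * (1 + headroom) / capacity_per_node)
    return max(MIN_NODES, min(MAX_NODES, needed))


def prescale_plan(hourly_history: Dict[int, List[float]], capacity_per_node: float,
                  lead_hours: int = 1) -> Dict[int, int]:
    """Map each hour of day (0-23) at which to scale to the node count to apply.

    hourly_history maps an hour of day to the loads observed at that hour on
    past days. The historical peak is used, and the change is scheduled
    lead_hours before the hour it is meant to cover.
    """
    plan = {}
    for hour, samples in hourly_history.items():
        scale_at = (hour - lead_hours) % 24
        plan[scale_at] = nodes_for_load(max(samples), capacity_per_node)
    return plan
```

Using the historical peak rather than the mean leans toward over-provisioning. A workload with rare outliers might use a high percentile instead.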