---
name: gpu-capacity
version: 2.0.0
author: "Eliad Maimon <eliadm@amazon.com>"
display_name: GPU Capacity
description: "Help AWS internal users guide their external customers on how to get GPU/accelerator capacity on AWS — activate when someone asks about GPU availability, where to find instances (P5, P4, P5e, P5en, P6, Trn, Trn2, Inf2, G-series, etc.), capacity blocks, spot GPU, how to get GPU capacity, needs guidance choosing between Spot, Capacity Blocks, On-Demand, and ODCRs, OR asks about GPU chip availability by name (B200, B300, H100, H200, A100, L4, L40S), OR asks which region has a specific instance/GPU, OR asks about regional availability of accelerator instances, OR asks how to escalate a capacity request, OR asks about the ML Demand Intake process."
icon: "🖥️"
trigger: "GPU availability, how to get GPU, find capacity blocks, spot GPU capacity, GPU options, where can I get instances, H100, H200, B200, B300, A100, L4, L40S, P5, P5e, P5en, P6, p6-b200, p6-b300, Trn, Trn2, Inf2, capacity, which region, where is available, find in region, regional availability, where can I find, escalate, escalation, ML Demand Intake, can't get capacity, no capacity available, capacity request, approval process"
inputs:
  - name: query
    description: "The user's GPU capacity question in natural language."
    type: string
    required: true
tools: [browser_launch, browser_navigate, browser_screenshot, browser_read_page, browser_click, browser_type, browser_run_js, browser_batch, url_fetch, run_python, file_write, open_in_session_tab]
depends-on: [browser]
last-verified: 2026-05-27
model_compatibility: [smart]
source: "slack:#quick-desktop-productivity"
source_ts: "1780419212.340839"
---

## Overview

This skill helps AWS internal users guide their external customers on how to get GPU/accelerator capacity on AWS, by combining live data from internal tools with decision-tree guidance. It scrapes live availability, searches capacity blocks, checks pricing, and provides purchasing recommendations.

**Performance target:** under 45 seconds total. If any tool takes longer than 15 seconds, skip it and provide a direct link instead.
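
The time budget above can be sketched as a small runner. This is a minimal illustration, not part of the skill's tooling: `run_steps`, the `(name, fetch, fallback_url)` step shape, and the result dictionaries are hypothetical names. It assumes each tool call can be wrapped in a zero-argument function.

```python
import concurrent.futures
import time
from typing import Callable, Dict, List, Tuple

STEP_TIMEOUT_S = 15.0   # per-tool limit from the performance target
TOTAL_BUDGET_S = 45.0   # overall limit from the performance target

# Hypothetical step shape: (name, zero-argument fetch function, fallback URL).
Step = Tuple[str, Callable[[], object], str]


def run_steps(steps: List[Step],
              step_timeout: float = STEP_TIMEOUT_S,
              total_budget: float = TOTAL_BUDGET_S) -> Dict[str, dict]:
    """Run steps in order; replace slow or failing steps with their direct link."""
    results: Dict[str, dict] = {}
    started = time.monotonic()
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=max(1, len(steps)))
    try:
        for name, fetch, fallback_url in steps:
            remaining = total_budget - (time.monotonic() - started)
            timeout = min(step_timeout, remaining)
            if timeout <= 0:
                # Overall budget is spent: skip and hand back the link.
                results[name] = {"status": "skipped", "link": fallback_url}
                continue
            future = pool.submit(fetch)
            try:
                data = future.result(timeout=timeout)
                results[name] = {"status": "ok", "data": data}
            except Exception:
                # Timeout or tool error: fall back to the direct link.
                results[name] = {"status": "skipped", "link": fallback_url}
    finally:
        # Threads cannot be force-killed, so abandoned steps are not awaited.
        pool.shutdown(wait=False, cancel_futures=True)
    return results
```

A skipped entry carries only the link, so the final response can still point the user at the tool that was too slow.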

**No exploration:** Each step below provides exact selectors, JS scripts, and element references. Execute them directly — do NOT explore the page, list elements, or take investigative screenshots first. If the prescribed approach fails on first attempt, take ONE diagnostic screenshot, adapt selectors based on what you see, and retry once. If that also fails, bail with a direct link.

**CB-first rule:** When the user needs capacity for more than 24 hours and wants predictability (training runs, demos, planned workloads), recommend **Capacity Blocks (CBs) first**: they offer guaranteed capacity, no interruption, and better pricing than On-Demand (OD). Suggest Spot or OD as alternatives only if no CBs are available for the timeframe, or if the user explicitly prioritizes budget over reliability.

## Key Resources

| Resource | URL | Focus |
| --- | --- | --- |
| GPU Availability Tracker | https://gpu-availability.emea-genai.startups.aws.dev/gpu-availability/ | Real-time availability by region |
| Capacity Blocks Finder | https://capacity-finder.emea-genai.startups.aws.dev/capacity-finder/ | Reservable capacity blocks |
| EC2 GPU Genie | https://d3px0ly4ak66lc.cloudfront.net/instance-pricing | Pricing (OD, Spot, CB) |
| Spot GPU Wiki | https://w.amazon.com/bin/view/Users/hyangelo/EC2SpotGPUCapacity/ | Spot capacity dashboard |
| GPU Instance Reference | https://d3px0ly4ak66lc.cloudfront.net/accelerator-to-aws-instance-mapping | Full accelerator → instance type mapping |
| Decision Tree Blog | https://builder.aws.com/content/3B88fT4nDiNkg5Wrya4WV5zroSS/how-to-get-gpu-capacity-on-aws | Purchasing decision logic |
| EC2 Capacity Escalation | https://w.amazon.com/bin/view/AWS/Teams/Core_Services/EC2_Capacity_Escalation_Approvals | Formal capacity allocation mechanism |

## Common Aliases

- **H100** → p5
- **H200** → p5e.48xlarge / p5en.48xlarge (always check BOTH when user asks about H200)
- **B200** → p6-b200
- **B300** → p6-b300
- **A100** → p4d / p4de
- **L4** → g6.xlarge through g6.48xlarge
- **L40S** → g6e.xlarge through g6e.48xlarge
- **RTX 6000 Pro** → g7e.2xlarge through g7e.48xlarge
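
The alias list above can be expressed as a lookup table. This is a minimal sketch: `GPU_ALIASES` and `resolve_alias` are hypothetical names, and representing the G-series size ranges by their family prefix (`g6`, `g6e`, `g7e`) is an assumption made for brevity.

```python
from typing import Dict, List

# Chip name -> instance types or families, transcribed from Common Aliases.
GPU_ALIASES: Dict[str, List[str]] = {
    "H100": ["p5"],
    "H200": ["p5e.48xlarge", "p5en.48xlarge"],  # always check BOTH
    "B200": ["p6-b200"],
    "B300": ["p6-b300"],
    "A100": ["p4d", "p4de"],
    "L4": ["g6"],               # g6.xlarge through g6.48xlarge (family prefix)
    "L40S": ["g6e"],            # g6e.xlarge through g6e.48xlarge (family prefix)
    "RTX 6000 PRO": ["g7e"],    # g7e.2xlarge through g7e.48xlarge (family prefix)
}


def resolve_alias(name: str) -> List[str]:
    """Map a chip name to instance types; pass unknown names through unchanged."""
    key = " ".join(name.upper().split())  # case- and whitespace-insensitive
    return list(GPU_ALIASES.get(key, [name.strip()]))
```

Names that are already instance types (for example `p5en.48xlarge`) are returned as-is, so the function is safe to call on every extracted parameter.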

## Workflow

### Step 1: Parse Intent and Extract Parameters
- **Intents:** `availability_check`, `capacity_blocks_search`, `spot_info`, `pricing_info`, `general_guidance`, `escalation_request`.
- **Alias resolution:** resolve GPU chip names to instance types using Common Aliases before proceeding.

### Step 2: Check GPU Availability Tracker
Navigate to https://gpu-availability.emea-genai.startups.aws.dev/gpu-availability/ and filter by instance type. Extract the availability tables via JS. **H200 rule:** run the check for BOTH p5e.48xlarge AND p5en.48xlarge. Do NOT report OD or Spot as available if its success rate is 0%.
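
The 0% rule can be sketched as a filter over extracted rows. The row shape is an assumption: the keys `od_success_rate` and `spot_success_rate` are hypothetical and do not reflect the tracker's actual column names.

```python
from typing import Dict, List


def available_options(row: Dict[str, object]) -> List[str]:
    """Return the purchase options that may be reported as available for a row.

    An option is reported only when its success rate is a number above 0;
    a 0% or missing rate is never reported as available.
    """
    options: List[str] = []
    for label, key in (("On-Demand", "od_success_rate"),
                       ("Spot", "spot_success_rate")):
        rate = row.get(key)
        if isinstance(rate, (int, float)) and rate > 0:
            options.append(label)
    return options
```

Treating a missing rate the same as 0% errs on the side of not overpromising capacity.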

### Step 3: Search Capacity Blocks Finder (CONDITIONAL)
Run when any of the following holds: the question is a general capacity question, the user explicitly requests CBs, the duration exceeds 24 hours, or the OD success rate is 0%. Navigate to https://capacity-finder.emea-genai.startups.aws.dev/capacity-finder/ and fill the Streamlit form.
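
The run condition for this step can be sketched as a predicate. `should_search_capacity_blocks` is a hypothetical name, and mapping "general capacity question" and "explicit CB request" to the Step 1 intents `general_guidance` and `capacity_blocks_search` is an assumption.

```python
from typing import Optional


def should_search_capacity_blocks(intent: str,
                                  duration_hours: Optional[float] = None,
                                  od_success_rate: Optional[float] = None) -> bool:
    """Return True when any trigger for the Capacity Blocks search applies."""
    # General capacity question or explicit CB request (assumed intent mapping).
    if intent in ("general_guidance", "capacity_blocks_search"):
        return True
    # Duration strictly greater than 24 hours.
    if duration_hours is not None and duration_hours > 24:
        return True
    # On-Demand success rate observed at exactly 0%.
    if od_success_rate is not None and od_success_rate == 0:
        return True
    return False
```

Unknown values (`None`) never trigger the search on their own, so a pricing-only question skips this step.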

### Step 4: Check EC2 GPU Genie — Pricing (CONDITIONAL)
Run when the user explicitly asks about pricing or cost. Use https://d3px0ly4ak66lc.cloudfront.net/instance-pricing for OD, Spot, and CB pricing.

### Step 5: Check Spot GPU Capacity via Wiki Dashboard (CONDITIONAL)
Run when the user asks about Spot capacity. Navigate to https://w.amazon.com/bin/view/Users/hyangelo/EC2SpotGPUCapacity/ and filter by zone and instance type.

### Step 6: Decision Tree Guidance
Choose the purchasing option based on the workload:

- **Spot** → fault-tolerant, batch, budget
- **Capacity Blocks** → planned runs, >24h, guaranteed
- **On-Demand** → dev/proto
- **ODCRs** → production inference, guaranteed AZ
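
The workload-to-option mapping can be sketched as a tag lookup. The tags are taken from the guidance above, but scoring options by tag overlap and returning ties is an assumption made for illustration; `DECISION_TABLE` and `recommend` are hypothetical names, and the linked Decision Tree Blog remains the reference for the actual logic.

```python
from typing import Dict, Iterable, List, Set

# Option -> workload tags, transcribed from the decision guidance.
DECISION_TABLE: Dict[str, Set[str]] = {
    "Spot": {"fault-tolerant", "batch", "budget"},
    "Capacity Blocks": {"planned", ">24h", "guaranteed"},
    "On-Demand": {"dev", "proto"},
    "ODCRs": {"production inference", "guaranteed AZ"},
}


def recommend(traits: Iterable[str]) -> List[str]:
    """Return the option(s) whose tags overlap most with the workload traits.

    Overlap scoring is an illustrative assumption; ties are all returned,
    and an empty list means no tag matched.
    """
    wanted = set(traits)
    scores = {option: len(tags & wanted) for option, tags in DECISION_TABLE.items()}
    best = max(scores.values())
    if best == 0:
        return []
    return [option for option, score in scores.items() if score == best]
```

Returning ties rather than picking one keeps the final choice with the person answering the customer.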

### Step 7: Compose Response
Structure: (1) direct answer, (2) availability table, (3) recommendation, (4) self-service links. Always include a success-rate caveat alongside any OD/Spot percentages.
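
The four-part structure can be sketched as a small formatter. The table columns, the row keys (`region`, `instance_type`, `od`, `spot`), and the caveat wording are all assumptions; the skill only requires that the caveat be present.

```python
from typing import Dict, List

# Placeholder wording (assumption): the skill does not prescribe the exact text.
SUCCESS_RATE_CAVEAT = (
    "Note: success rates are point-in-time observations, not a guarantee of capacity."
)


def compose_response(direct_answer: str,
                     rows: List[Dict[str, str]],
                     recommendation: str,
                     links: List[str]) -> str:
    """Assemble the response: answer, table plus caveat, recommendation, links."""
    lines = [direct_answer, ""]
    # (2) Availability table; hypothetical column layout.
    lines += ["| Region | Instance type | OD success rate | Spot success rate |",
              "| --- | --- | --- | --- |"]
    for row in rows:
        lines.append("| {region} | {instance_type} | {od} | {spot} |".format(**row))
    # The caveat sits directly under the percentages it qualifies.
    lines += ["", SUCCESS_RATE_CAVEAT, ""]
    # (3) Recommendation and (4) self-service links.
    lines += ["Recommendation: " + recommendation, "", "Self-service links:"]
    lines += ["- " + url for url in links]
    return "\n".join(lines)
```

Placing the caveat next to the table keeps it attached to the numbers even if the response is later trimmed.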

### Step 8: Escalation Paths (CONDITIONAL)
Run when all self-serve options show 0% or the user asks about escalation. Escalate through the tiers in order: (1) Slack #accelerated-computing-interest, (2) Geo POC pre-qualification, (3) formal ML Demand Intake (for P4*/P5*/P6*/Trn* when console self-serve fails).
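
The trigger and tier selection can be sketched as follows. All function names are hypothetical. Treating an empty list of observations as "do not escalate" and omitting the ML Demand Intake tier for instance families outside P4*/P5*/P6*/Trn* are assumptions.

```python
from fnmatch import fnmatchcase
from typing import List, Sequence

ESCALATION_TIERS = [
    "Slack #accelerated-computing-interest",
    "Geo POC pre-qualification",
    "Formal ML Demand Intake",
]

# Families eligible for ML Demand Intake, lowercased for matching.
INTAKE_PATTERNS = ("p4*", "p5*", "p6*", "trn*")


def should_escalate(success_rates: Sequence[float], user_asked: bool = False) -> bool:
    """True when the user asks, or every observed self-serve rate is 0%."""
    all_zero = len(success_rates) > 0 and all(rate == 0 for rate in success_rates)
    return user_asked or all_zero


def intake_eligible(instance_type: str) -> bool:
    """True for instance types in the P4*/P5*/P6*/Trn* families."""
    name = instance_type.lower()
    return any(fnmatchcase(name, pattern) for pattern in INTAKE_PATTERNS)


def escalation_path(instance_type: str) -> List[str]:
    """Return the tiers in order, adding ML Demand Intake only when eligible."""
    tiers = list(ESCALATION_TIERS[:2])
    if intake_eligible(instance_type):
        tiers.append(ESCALATION_TIERS[2])
    return tiers
```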

## Output
Concise, actionable response with direct answer, availability table, recommendation, and self-service links.
