From pre-flight VRAM checks to run analysis, code diagnosis, and cost tracking. Try the interactive scan, then scroll to see what a real Alloc workflow looks like.
Pick a model and GPU below. Alloc estimates VRAM, cost, and feasibility instantly.
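The arithmetic behind a pre-flight estimate like this can be sketched with a common rule of thumb (our illustration, not necessarily Alloc's exact estimator): mixed-precision Adam training holds roughly 16 bytes of model state per parameter.

```python
# Back-of-envelope VRAM estimate — a sketch of the kind of arithmetic a
# pre-flight check performs, NOT Alloc's actual model. Mixed-precision Adam
# holds ~16 bytes/param: fp16 weights (2) + fp16 grads (2) +
# fp32 master weights (4) + two fp32 Adam moments (4 + 4).
GIB = 1024**3

def training_vram_gib(n_params: float, bytes_per_param: int = 16) -> float:
    """Model-state VRAM only; activations and CUDA overhead come on top."""
    return n_params * bytes_per_param / GIB

print(f"{training_vram_gib(7e9):.0f} GiB")   # → 104 GiB for a 7B model
```

Activations scale with batch size and sequence length, which is why the interactive scan asks for both.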
GPU Comparison
You just saw the pre-flight. Here's what happens when you train.
After training, Alloc surfaces bottlenecks, phase breakdowns, and right-sizing recommendations.
4x A100-80GB · FSDP · feat/llama-finetune
Peak VRAM
51.2 GB
/ 80 GB per GPU
GPU Busy %
47%
across 4 GPUs
Step Time (p50)
284 ms
p90: 312 ms
Dataloader Wait
42%
of step time
Step Phase Breakdown
GPU Utilization
46%
VRAM Usage (GB)
51 GB
Recommendations
Your GPUs sit idle for 42% of each step while waiting on data loading.
FSDP with gradient checkpointing could reduce per-GPU VRAM by ~30%, enabling larger batch sizes.
Peak VRAM is 51.2 GB on 80 GB GPUs (64% utilization). With FSDP sharding across 4 GPUs, per-GPU usage drops to ~13 GB — an A10G (24 GB) could handle this at lower cost.
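The right-sizing recommendation above is straightforward arithmetic. A sketch with this run's numbers (the even 4-way split is an assumption — real FSDP sharding is only approximately uniform):

```python
# Numbers from the run above; assumes model state shards evenly across
# ranks, which real FSDP only approximates.
peak_gb, capacity_gb, num_gpus = 51.2, 80.0, 4

utilization = peak_gb / capacity_gb    # 0.64 → the 64% in the report
sharded_gb = peak_gb / num_gpus        # ≈ 12.8 GB per GPU
fits_a10g = sharded_gb <= 24.0         # A10G has 24 GB of VRAM

print(f"{utilization:.0%}, {sharded_gb:.1f} GB, fits A10G: {fits_a10g}")
# → 64%, 12.8 GB, fits A10G: True
```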
Config Comparison
| GPU | Strategy | Est. Cost | Status |
|---|---|---|---|
| 4x A100-80GB | FSDP | $12.40/hr | current |
| 4x A10G-24GB | FSDP | $5.40/hr | explore |
| 2x A100-80GB | FSDP | $6.20/hr | in fleet |
| 4x H100-80GB | FSDP | $16.80/hr | in fleet |
Now let's catch issues before they cost you GPU hours.
Alloc statically analyzes your training code to find common performance issues and suggests improvements.
$ alloc diagnose train.py
⚠  DL001     num_workers=0 (default)        high   train.py:47
⚠  PREC001   No mixed precision detected    high   train.py:82
✔  DIST002   FSDP configured correctly             train.py:31
2 issues found, 1 check passed
Run alloc diagnose --diff for patches
$ alloc diagnose --diff train.py
@@ -45,2 +45,5 @@
train_dataset = load_dataset("custom/data")
-train_loader = DataLoader(train_dataset, batch_size=32)
+train_loader = DataLoader(
+ train_dataset, batch_size=32,
+ num_workers=8, pin_memory=True, prefetch_factor=4
+)
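The diff above fixes DL001. The other flagged issue, PREC001, is typically addressed with PyTorch's autocast. A minimal, self-contained sketch (our illustration, not `alloc diagnose` output; it uses CPU + bfloat16 so it runs anywhere — on CUDA you would use float16 with a `GradScaler`):

```python
import torch
from torch import nn

# Toy stand-ins for the real model and optimizer flagged at train.py:82.
model = nn.Linear(16, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
batch = torch.randn(32, 16)

optimizer.zero_grad(set_to_none=True)
# Mixed precision: run the forward pass in a low-precision dtype.
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    loss = model(batch).pow(2).mean()
loss.backward()
optimizer.step()
```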
All of this rolls up into budget and savings tracking.
Track GPU spend, see realized savings, and set budget guardrails for your team.
Monthly Budget
$2,847 / $5,000
$2,153 remaining
57% burned
Potential Savings
$1,200
Realized Savings
$680
Jobs Right-Sized
4
OOMs Prevented
2
Recent Runs
| Run | GPU | Cost | Status |
|---|---|---|---|
| llama3-8b-finetune | 4x A100-80GB | $4.23 | underutilized |
| mistral-7b-eval | 1x A100-80GB | $0.89 | balanced |
| qwen-72b-pretrain | 8x H100-80GB | $18.41 | compute bound |
| llama3-70b-lora | 4x A100-80GB | $6.72 | failed |
Install Alloc in one line. Get VRAM estimates, bottleneck detection, and cost tracking from your first run.