QUICKSTART

Get Started with Alloc in 60 Seconds

Profile GPU workloads, estimate VRAM requirements, and get right-sizing suggestions, all without modifying your training code.

FROM ALLOCGUY

The single biggest time sink in ML isn't training. It's the iteration loop before training. Picking a GPU, guessing at batch sizes, waiting in a cloud queue, hitting OOM, adjusting, re-queuing. I've watched teams burn through an entire week just finding a config that doesn't crash.

Cloud GPU costs are steep: H100s at $3–5/hr, A100s at $2–4/hr, even a modest L4 at $0.50–0.80/hr. Every wasted hour has a real price tag. A single failed 8-GPU training run can easily waste $100+ before you even see an error message.

Alloc takes 60 seconds to install and run. That first scan could save you hours of trial-and-error and hundreds in wasted compute. I'd rather you find out your job needs 67 GB of VRAM on your laptop for free than on an A100 for $3/hr.

– allocguy

1. Install

Install the Alloc CLI from PyPI. Works on Linux and macOS with Python 3.8+.

pip install alloc

For GPU monitoring with NVIDIA Management Library (NVML) support, install the GPU extras:

pip install "alloc[gpu]"

Verify the installation:

alloc version

2. Ghost Scan: Static Analysis

Ghost scan analyzes your training script and model definition without executing anything. It produces a VRAM breakdown estimate showing parameters, optimizer states, activations, and framework overhead.

alloc ghost train_7b.py --dtype bf16

Ghost scan will output:

  • Estimated VRAM range for your model and dtype
  • Parameter count and memory breakdown by category
  • Whether your model fits on the target GPU
  • Suggestions for reducing memory if it does not fit

No GPU required. Ghost scan runs entirely on CPU using static analysis.
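
For intuition about what goes into such an estimate, here is the standard back-of-envelope accounting for full fine-tuning with AdamW. This is generic arithmetic, not Alloc's internal estimator, and it omits activations, which scale with batch size, sequence length, and gradient checkpointing:

# Rough VRAM accounting for training a 7B model in bf16 with AdamW.
# Generic arithmetic for intuition only -- not Alloc's estimator.
GB = 1e9
params = 7e9                      # 7B parameters

weights   = params * 2            # bf16 weights: 2 bytes each    -> 14 GB
gradients = params * 2            # bf16 gradients                -> 14 GB
optimizer = params * (4 + 4)      # AdamW m and v in fp32         -> 56 GB

total = weights + gradients + optimizer
print(f"Before activations and overhead: {total / GB:.0f} GB")    # ~84 GB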

3. Probe Run: Live GPU Profiling

Wrap your training command with alloc run to profile actual GPU utilization during execution. By default, Alloc uses calibrate-and-exit mode: it automatically stops when GPU metrics stabilize, so you don't need to wait for a full training run.

alloc run python train.py

The probe captures:

  • Real-time VRAM usage, GPU utilization, and memory bandwidth
  • Multi-GPU process-tree discovery
  • Hardware context (driver version, CUDA version, SM architecture)
  • A fit/no-fit verdict with right-sizing suggestions

For a full-duration profile instead of calibrate-and-exit:

alloc run --full python train.py

Alloc never modifies your training code or changes its exit code. If Alloc encounters an error, your training continues unaffected.
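
Live profiling of this kind is built on NVML, which is why the GPU extra exists. The sketch below shows the general shape of calibrate-and-exit sampling using the pynvml bindings (assumed to be what the extra installs): poll VRAM until the readings stop moving, then exit. It illustrates the idea only; Alloc's actual stabilization criterion isn't specified in this guide.

import time
from collections import deque

import pynvml  # NVML bindings; assumed to be provided by the [gpu] extra

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)   # first GPU

window = deque(maxlen=10)                       # last 10 VRAM samples
try:
    while True:
        mem  = pynvml.nvmlDeviceGetMemoryInfo(handle)
        util = pynvml.nvmlDeviceGetUtilizationRates(handle)
        window.append(mem.used)
        print(f"VRAM {mem.used / 2**30:.2f} GiB  GPU {util.gpu}%")

        # "Stable" here: a full window whose spread is under 1% of its peak.
        stable = (
            len(window) == window.maxlen
            and max(window) > 0
            and max(window) - min(window) < 0.01 * max(window)
        )
        if stable:
            break
        time.sleep(1.0)
finally:
    pynvml.nvmlShutdown()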

4. Remote Scan: No GPU Needed

Run a VRAM estimation from anywhere: your laptop, a CI runner, or a Slack bot. No GPU required. Alloc uses its model catalog to estimate memory requirements for known architectures.

alloc scan --model llama-3-70b --gpu A100-80GB

Returns an estimated VRAM breakdown and fit verdict for the specified model and GPU combination. Useful for capacity planning before provisioning hardware.
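
At its core, the fit verdict boils down to comparing an estimated requirement against the target card's capacity. A minimal sketch of that check follows; the capacity table, the fits() helper, and the flat 10% overhead factor are illustrative assumptions, not Alloc's catalog values.

# Minimal fit check: catalog-style estimate vs. device capacity.
# Capacities are published VRAM sizes; the overhead factor is a guess.
GPU_VRAM_GB = {"A100-80GB": 80, "A100-40GB": 40, "H100-80GB": 80, "L4": 24}

def fits(param_count: float, bytes_per_param: int, gpu: str,
         overhead: float = 0.10) -> bool:
    need_gb = param_count * bytes_per_param / 1e9 * (1 + overhead)
    return need_gb <= GPU_VRAM_GB[gpu]

# llama-3-70b weights alone are ~140 GB in bf16:
print(fits(70e9, 2, "A100-80GB"))   # False -- needs sharding or quantization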

5. Upload: Dashboard Integration

Optionally, log in and upload your profiling artifacts to the Alloc dashboard for historical tracking, team sharing, and AI-powered optimization suggestions.

alloc login
alloc upload alloc_artifact.json.gz

Uploads are optional and never block your training. All profiling works fully offline. Upload when you're ready.

All Commands

Command          Description
alloc ghost      Static VRAM estimation from a training script or model definition
alloc run        Live GPU profiling with auto-calibrate (wraps your training command)
alloc scan       Remote VRAM scan using the model catalog (no GPU needed)
alloc login      Authenticate with your Alloc account
alloc upload     Upload a profiling artifact to the Alloc dashboard
alloc catalog    List supported models and GPU configurations
alloc version    Print the installed Alloc CLI version

Configuration

Alloc can be configured through environment variables. All are optional. The CLI works out of the box with sensible defaults.

Variable         Description
ALLOC_API_URL    API endpoint for uploads and remote scans. Defaults to the Alloc cloud API. Set this to your own endpoint for air-gapped deployments.
ALLOC_TOKEN      Authentication token. Automatically set by alloc login. Can also be set manually for CI/CD environments.
ALLOC_UPLOAD     Set to 1 to automatically upload artifacts after each run. Disabled by default.
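
In a CI job, these variables can be set before the CLI is invoked. Here is one way to do it from Python; the endpoint URL and the secret name are placeholders, and driving the CLI through subprocess is just one option.

import os
import subprocess

# Placeholders: substitute your own endpoint and CI secret name.
os.environ["ALLOC_API_URL"] = "https://alloc.internal.example/api"  # air-gapped endpoint
os.environ["ALLOC_TOKEN"]   = os.environ["MY_CI_ALLOC_TOKEN"]       # hypothetical secret
os.environ["ALLOC_UPLOAD"]  = "1"                                   # auto-upload artifacts

# Child processes inherit the environment, so the CLI picks these up.
subprocess.run(["alloc", "run", "python", "train.py"], check=True)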

Python API

You can also use Alloc programmatically from Python. The alloc.ghost() function accepts a PyTorch model and returns a VRAM estimation report.

import alloc

# Pass your model to ghost for static VRAM estimation
report = alloc.ghost(model)

# Access the estimation results
print(report.estimated_vram)
print(report.breakdown)
print(report.verdict)

The Python API performs the same static analysis as the CLI ghost command, making it easy to integrate VRAM checks into your training pipelines or notebooks.
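
For example, a pipeline could gate an expensive launch on the estimate. The toy model below is a stand-in, and treating report.estimated_vram as a byte count is an assumption; consult the report schema for your Alloc version.

import alloc
from torch import nn

# Toy stand-in; use your real model here. (Assumes PyTorch is installed.)
model = nn.Sequential(nn.Linear(8192, 8192), nn.GELU(), nn.Linear(8192, 8192))

report = alloc.ghost(model)

# Gate a job on the estimate. estimated_vram's exact units/type aren't
# specified in this guide -- bytes is assumed for illustration.
TARGET_BYTES = 80e9          # A100-80GB
if report.estimated_vram > TARGET_BYTES:
    raise SystemExit(f"Estimated {report.estimated_vram / 1e9:.1f} GB "
                     f"exceeds 80 GB target; not launching.")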

Next Steps

Want to see it in action first? Try the interactive demo.