Playground
Tools.
Working calculators, visualizers, and benchmarks for ML systems work. Open source, no auth, no telemetry. Pick a tool below.
Throughput Calculator · LIVE
Back-of-envelope tokens/sec for a given model, precision, and hardware. Memory-bound regime only; assumes batched serving with healthy KV cache headroom.
Model size: 70B params
Batch size: 8
Sequence length: 4096 tokens
Precision: fp8
GPU: H100 SXM
Estimate
Tokens / sec / GPU: 383
Memory required: 155.9 GB
Fits on 1 GPU? No, needs sharding
Param memory (fp8): 70.0 GB
KV cache @ batch=8, seq=4096: 85.9 GB
HBM bandwidth (H100 SXM): 3350 GB/s
Compute @ fp8: 1979 TFLOPS
⚠ Estimate is memory-bound roofline only. Actual numbers depend on kernel quality, continuous batching, speculative decoding, and a dozen other things this tool doesn't model.
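The arithmetic behind the panel can be sketched in a few lines of Python. This is a reconstruction under stated assumptions, not the tool's actual source. The architecture (80 layers, hidden size 8192, full multi-head attention, fp16 KV entries) and the 80 GB capacity for the H100 SXM are not shown on the page; they are chosen because they reproduce the displayed 85.9 GB and the "needs sharding" verdict. The throughput formula that matches 383 streams only the weights on each decode step and ignores KV cache reads; that is inferred from the numbers. All names (`Workload`, `kv_cache_bytes`, and so on) are hypothetical.

```python
from dataclasses import dataclass

GB = 1e9  # decimal gigabytes, matching the units shown in the panel


@dataclass
class Workload:
    params: float = 70e9          # model size in parameters
    bytes_per_param: float = 1.0  # fp8 weights
    batch: int = 8
    seq_len: int = 4096
    # Assumed architecture (not stated on the page).
    n_layers: int = 80
    hidden: int = 8192
    kv_bytes: float = 2.0         # assumed fp16 KV cache entries


def param_bytes(w: Workload) -> float:
    return w.params * w.bytes_per_param


def kv_cache_bytes(w: Workload) -> float:
    # K and V tensors per layer, one hidden-sized vector per cached token.
    return 2 * w.n_layers * w.hidden * w.kv_bytes * w.batch * w.seq_len


def tokens_per_sec(w: Workload, hbm_bytes_per_sec: float) -> float:
    # Memory-bound decode: each step reads the weights once from HBM
    # and emits one token per sequence in the batch.
    return w.batch * hbm_bytes_per_sec / param_bytes(w)


def fits_on_one_gpu(w: Workload, gpu_mem_bytes: float) -> bool:
    return param_bytes(w) + kv_cache_bytes(w) <= gpu_mem_bytes


if __name__ == "__main__":
    w = Workload()
    total = param_bytes(w) + kv_cache_bytes(w)
    print(f"Param memory: {param_bytes(w) / GB:.1f} GB")      # 70.0 GB
    print(f"KV cache:     {kv_cache_bytes(w) / GB:.1f} GB")   # 85.9 GB
    print(f"Total:        {total / GB:.1f} GB")               # 155.9 GB
    print(f"Tokens/sec:   {tokens_per_sec(w, 3350 * GB):.0f}")  # 383
    print(f"Fits on 1 GPU (80 GB assumed): {fits_on_one_gpu(w, 80 * GB)}")  # False
```

Because the throughput term ignores KV cache traffic, the sketch inherits the same blind spots as the warning above: at long sequence lengths or large batches, KV reads can rival weight reads and the real figure will be lower.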
- Beta
GPU memory planner
Estimate VRAM for any combination of model size, precision, batch, sequence length, optimizer, and parallelism strategy — for both training and inference.
Open tool →
- Experimental
Tokenizer explorer
Paste any text and see how seven popular tokenizers split it — BPE, WordPiece, SentencePiece, tiktoken, and friends, side by side.
Open tool →
- Soon
Attention pattern atlas
A gallery of attention head signatures from popular open models — induction heads, name movers, sink tokens, retrieval circuits — annotated and searchable.
Open tool →