Static Batching

Grouping requests into fixed batches that all start and finish together.

Definition

Static batching groups requests into fixed batches that are assembled before execution and run together until the whole batch completes. It is simple to implement but inefficient for autoregressive generation, where output lengths vary: requests that finish early leave their batch slots idle until the slowest request in the batch completes, and no queued request can start until then. Continuous batching was developed to address this, letting requests enter and leave the batch at each decoding step.
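
The cost of waiting for the slowest request can be illustrated with a small simulation. The sketch below is a toy model, not a real serving engine: it assumes each request needs a known, positive number of decode steps, that one step advances every active request by one token, and that requests are served in arrival order. The function names and the example lengths are hypothetical, chosen only for illustration.

```python
from typing import List


def static_batching_steps(output_lengths: List[int], batch_size: int) -> int:
    """Total decode steps when batches are fixed before execution.

    Toy assumption: each batch occupies the device until its longest
    request finishes, and the next batch cannot start before that.
    """
    total = 0
    for start in range(0, len(output_lengths), batch_size):
        batch = output_lengths[start:start + batch_size]
        total += max(batch)  # short requests wait for the longest one
    return total


def continuous_batching_steps(output_lengths: List[int], batch_size: int) -> int:
    """Total decode steps when a freed slot is refilled at the next step.

    Toy assumption: admitting a request costs nothing beyond its own
    decode steps (no separate prefill cost is modeled).
    """
    pending = list(output_lengths)  # requests waiting, in arrival order
    active: List[int] = []          # remaining tokens per running request
    steps = 0
    while pending or active:
        # Fill any free slots before running the next decode step.
        while pending and len(active) < batch_size:
            active.append(pending.pop(0))
        steps += 1
        # Each active request emits one token; finished requests leave.
        active = [r - 1 for r in active if r - 1 > 0]
    return steps


if __name__ == "__main__":
    lengths = [2, 8, 3, 8]  # hypothetical output lengths, in tokens
    print(static_batching_steps(lengths, batch_size=2))      # 16
    print(continuous_batching_steps(lengths, batch_size=2))  # 13
```

With these example lengths and a batch size of 2, the static schedule takes 16 steps because each batch runs for 8 steps regardless of its short member, while the continuous schedule takes 13 because the third request starts as soon as the first one finishes. Real systems also have to account for prefill cost, memory limits, and scheduling overhead, which this sketch ignores.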