
Late Chunking

Embedding a whole document first, then splitting it, so each chunk keeps full-document context.

Definition

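Late chunking runs an entire document through a long-context embedding model before splitting it into chunks: the model produces token-level embeddings for the whole text, and each chunk's vector is then pooled from the tokens inside that chunk's span. Each chunk vector therefore reflects the meaning of the whole text rather than the chunk in isolation. This preserves context, such as what a pronoun refers to, that ordinary chunk-then-embed pipelines lose.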
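The pooling step can be illustrated with a minimal sketch. It assumes the token-level embeddings for the whole document have already been produced by a long-context embedding model; here they are replaced by a small hand-written list. The function name `late_chunk_pool`, the `[start, end)` span format, and the choice of mean pooling are illustrative assumptions, not any particular library's API.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def late_chunk_pool(
    token_embeddings: Sequence[Sequence[float]],
    chunk_spans: Sequence[Tuple[int, int]],
) -> List[Vector]:
    """Mean-pool document-level token vectors over each [start, end) token span.

    Hypothetical helper: `token_embeddings` is assumed to come from a single
    forward pass over the whole document, so every token vector already
    carries context from outside its own chunk.
    """
    chunk_vectors: List[Vector] = []
    for start, end in chunk_spans:
        if not 0 <= start < end <= len(token_embeddings):
            raise ValueError(f"invalid span: ({start}, {end})")
        span = token_embeddings[start:end]
        dim = len(span[0])
        # Average each dimension across the tokens that fall inside the chunk.
        chunk_vectors.append(
            [sum(tok[d] for tok in span) / len(span) for d in range(dim)]
        )
    return chunk_vectors


# Toy stand-in for real model output: 4 tokens, 2 dimensions each (assumed data).
token_embeddings = [[1.0, 0.0], [3.0, 2.0], [0.0, 4.0], [2.0, 2.0]]

# Chunk boundaries are chosen after embedding, as token index ranges.
chunk_spans = [(0, 2), (2, 4)]

chunk_vectors = late_chunk_pool(token_embeddings, chunk_spans)
print(chunk_vectors)  # [[2.0, 1.0], [1.0, 3.0]]
```

In a conventional chunk-then-embed pipeline, each chunk's text would be sent through the model separately, so its token vectors could not reflect anything outside the chunk. Here the only per-chunk work is the averaging; the contextual information is already in the token vectors.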