Late Chunking
Embedding a whole document first, then splitting it, so each chunk keeps full-document context.
Definition
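Late chunking runs an entire document through the embedding model before splitting it into chunks: the model first produces context-aware token embeddings for the full text, and each chunk's vector is then pooled (typically averaged) from the tokens inside that chunk. Each chunk's vector therefore reflects the meaning of the whole text rather than the chunk in isolation. This preserves context, such as what a pronoun refers to, that ordinary chunk-then-embed pipelines lose.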
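The difference in ordering can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not a real embedding pipeline: `toy_contextual_encode` is a hypothetical stand-in for a long-context transformer (it simply blends each token vector with the mean of whatever text it is encoded with), and the names `late_chunk`, `naive_chunk`, and `mean_pool` are invented for this example.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def mean_pool(vectors: Sequence[Vector]) -> Vector:
    """Average a non-empty sequence of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def toy_contextual_encode(token_vectors: Sequence[Vector], mix: float = 0.5) -> List[Vector]:
    """Hypothetical stand-in for a transformer encoder (an assumption for
    illustration): blend each token vector with the mean of all the token
    vectors it is encoded alongside."""
    context = mean_pool(token_vectors)
    return [
        [(1 - mix) * x + mix * c for x, c in zip(vec, context)]
        for vec in token_vectors
    ]


def late_chunk(token_vectors: Sequence[Vector], spans: Sequence[Span]) -> List[Vector]:
    """Late chunking: encode the whole document once, then pool per chunk."""
    contextual = toy_contextual_encode(token_vectors)
    return [mean_pool(contextual[start:end]) for start, end in spans]


def naive_chunk(token_vectors: Sequence[Vector], spans: Sequence[Span]) -> List[Vector]:
    """Chunk-then-embed baseline: each chunk is encoded in isolation."""
    return [
        mean_pool(toy_contextual_encode(token_vectors[start:end]))
        for start, end in spans
    ]


# Four toy token vectors, split into two chunks of two tokens each.
tokens = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
spans = [(0, 2), (2, 4)]

late = late_chunk(tokens, spans)    # each chunk vector carries a trace of the other chunk
naive = naive_chunk(tokens, spans)  # each chunk vector only sees its own tokens
```

In this toy setup the late-chunked vectors are pulled toward the document-wide mean, while the baseline vectors contain no information from outside their own chunk. A real system would replace the stand-in encoder with a long-context embedding model and derive the spans from tokenizer offsets.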