Data Lineage
The traceable record of where data came from and how it was transformed over time.
Definition
Data lineage is the traceable record of where a dataset originated and how it was transformed at each step, from collection through filtering, cleaning, and merging. It documents sources, processing operations, and dependencies so a final dataset can be audited and reproduced. For AI work, lineage supports debugging, licensing and privacy compliance, and trust in the data behind a model.
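As a rough illustration of the definition above, the sketch below records one lineage entry per processing step, with a content hash of the data going in and coming out. It is a minimal sketch, not a standard or a library API: the `Lineage` class, the `fingerprint` helper, the step names, and the `example://raw-collection` source are all hypothetical names chosen for this example.

```python
import hashlib
import json
from dataclasses import dataclass, field


def fingerprint(rows):
    """Return a short, deterministic hash of a list of JSON-serializable rows."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


@dataclass
class Lineage:
    """Hypothetical lineage log: one source plus an ordered list of steps."""
    source: str
    steps: list = field(default_factory=list)

    def apply(self, name, func, rows):
        """Run one transformation and log what went in and what came out."""
        result = func(rows)
        self.steps.append({
            "operation": name,
            "input_hash": fingerprint(rows),
            "output_hash": fingerprint(result),
            "rows_in": len(rows),
            "rows_out": len(result),
        })
        return result


# Toy dataset and a hypothetical source identifier.
raw = [{"text": " Hello "}, {"text": ""}, {"text": "world"}]
lineage = Lineage(source="example://raw-collection")

# Filtering step: drop rows whose text is empty after trimming.
filtered = lineage.apply(
    "drop_empty", lambda rs: [r for r in rs if r["text"].strip()], raw
)
# Cleaning step: strip surrounding whitespace from each row.
cleaned = lineage.apply(
    "strip_whitespace", lambda rs: [{"text": r["text"].strip()} for r in rs], filtered
)

for step in lineage.steps:
    print(step["operation"], step["rows_in"], "->", step["rows_out"])
```

Because each step's output hash should equal the next step's input hash, a break in that chain points to data that changed outside the recorded operations, which is the kind of audit the definition describes.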