Big Data
Datasets too large, fast-moving, or varied for traditional tools, commonly characterized by volume, velocity, and variety.
Definition
Big data refers to datasets whose volume, velocity, and variety exceed the capabilities of traditional data processing systems. Handling such data typically relies on distributed systems, data lakes, and specialized pipelines. Modern AI training depends on massive corpora of web text, images, code, and other sources, which are collected and cleaned through such pipelines before training begins.
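As a rough illustration of the "cleaned through such pipelines" step, the following minimal Python sketch streams documents one at a time, normalizes whitespace, drops very short entries, and removes exact duplicates by hash. The function name `clean_corpus` and the `min_chars` threshold are hypothetical choices made for this example, not part of any particular system. Real pipelines would typically run similar logic across many machines rather than in a single process.

```python
import hashlib
from typing import Iterable, Iterator


def clean_corpus(docs: Iterable[str], min_chars: int = 20) -> Iterator[str]:
    """Yield whitespace-normalized, sufficiently long, exactly-deduplicated documents.

    Hypothetical helper for illustration; `min_chars` is an assumed threshold.
    """
    seen = set()  # hashes of documents already emitted (held in memory in this sketch)
    for doc in docs:
        # Collapse runs of spaces, tabs, and newlines into single spaces.
        text = " ".join(doc.split())
        # Drop fragments too short to be useful.
        if len(text) < min_chars:
            continue
        # Case-insensitive exact-duplicate check via a content hash.
        digest = hashlib.sha256(text.lower().encode("utf-8")).hexdigest()
        if digest in seen:
            continue
        seen.add(digest)
        yield text


if __name__ == "__main__":
    raw = [
        "Big data  needs distributed\nsystems.",
        "big data needs distributed systems.",  # duplicate after normalization
        "too short",
        "A data lake stores raw files cheaply.",
    ]
    for line in clean_corpus(raw):
        print(line)
```

Because the function is a generator, it never holds the full corpus in memory, which loosely mirrors how streaming pipelines process data too large to load at once. The in-memory `seen` set is the part that would not scale as written; at larger volumes it would presumably be replaced by a distributed or approximate deduplication method.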