All terms
Hardware & Systems
Tensor Core
Specialized units in NVIDIA GPUs that accelerate the matrix math behind deep learning.
Definition
Tensor Cores are specialized execution units inside NVIDIA GPUs, introduced with the Volta architecture, that accelerate the matrix multiply-accumulate operations dominating deep learning. Each performs a small matrix multiplication and accumulation per clock cycle, taking low-precision inputs such as FP16, BF16, FP8, or INT8 while accumulating in higher precision. This mixed-precision approach sharply increases throughput and reduces memory use, and it is the main source of the advertised performance in NVIDIA's data-center GPUs.