
Megatron-LM

NVIDIA's framework for training very large language models with combined parallelism strategies.

Definition

Megatron-LM is an open-source framework from NVIDIA Research for training large language models efficiently. It combines tensor parallelism, pipeline parallelism, and data parallelism (a combination sometimes called 3D parallelism) within an optimized PyTorch codebase. Tensor parallelism splits the weight matrices of individual layers across GPUs, pipeline parallelism assigns consecutive groups of layers to different GPUs, and data parallelism splits each training batch across replicas of the model. Many of the largest publicly released models have been trained using Megatron-LM or frameworks derived from it.
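To make the idea of combining the three strategies concrete, the sketch below shows how a pool of GPUs can be factored into tensor-parallel, pipeline-parallel, and data-parallel dimensions. It is an illustration only: the `ParallelLayout` class, its field names, and the rank ordering (tensor-parallel index varying fastest) are assumptions made for this example, not Megatron-LM's API or its actual rank assignment.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ParallelLayout:
    """Hypothetical helper: divides a pool of GPUs among three parallelism dimensions."""

    world_size: int          # total number of GPUs
    tensor_parallel: int     # GPUs sharing the weights of each layer
    pipeline_parallel: int   # number of pipeline stages (groups of layers)

    def __post_init__(self) -> None:
        model_parallel = self.tensor_parallel * self.pipeline_parallel
        if model_parallel <= 0 or self.world_size % model_parallel != 0:
            raise ValueError(
                "world_size must be a positive multiple of "
                "tensor_parallel * pipeline_parallel"
            )

    @property
    def data_parallel(self) -> int:
        # Whatever is left after model parallelism becomes model replicas.
        return self.world_size // (self.tensor_parallel * self.pipeline_parallel)

    def coords(self, rank: int) -> Tuple[int, int, int]:
        """Map a flat GPU rank to (pipeline stage, data replica, tensor shard).

        Assumed ordering for illustration: the tensor index varies fastest,
        then the data-parallel index, then the pipeline stage.
        """
        if not 0 <= rank < self.world_size:
            raise ValueError("rank out of range")
        tp = rank % self.tensor_parallel
        dp = (rank // self.tensor_parallel) % self.data_parallel
        pp = rank // (self.tensor_parallel * self.data_parallel)
        return pp, dp, tp


if __name__ == "__main__":
    layout = ParallelLayout(world_size=16, tensor_parallel=2, pipeline_parallel=4)
    print("data-parallel replicas:", layout.data_parallel)
    for rank in range(layout.world_size):
        print(rank, layout.coords(rank))
```

With 16 GPUs, 2-way tensor parallelism, and 4 pipeline stages, each model replica occupies 8 GPUs, which leaves room for 2 data-parallel replicas. Every GPU ends up with a unique coordinate along the three dimensions, which is where the name "3D parallelism" comes from.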