Optimization
Rejection Sampling Fine-Tuning
Fine-tuning on a model's own outputs after filtering for the best ones.
Definition
Rejection sampling fine-tuning samples many candidate outputs per prompt from a model, keeps only those that a scorer (such as a reward model or an answer checker) accepts, and fine-tunes the model on that filtered set with ordinary supervised learning. By training on its own highest-scoring samples, the model reinforces the behaviors the scorer rewards, so the result is only as good as the scorer. It is a simple way to improve reasoning or instruction-following without the complexity of full reinforcement learning, and the same sample-and-filter step is often used to build distillation or preference datasets.
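The sample-and-filter step described above can be sketched as follows. This is a minimal illustration, not a reference implementation: `rejection_sample`, `toy_generate`, `toy_score`, `num_samples`, and `threshold` are all hypothetical names, and the toy generator and scorer stand in for a real language model and a real reward model or answer checker.

```python
import random
from typing import Callable, List, Tuple


def rejection_sample(
    prompts: List[str],
    generate: Callable[[str], str],
    score: Callable[[str, str], float],
    num_samples: int = 8,
    threshold: float = 0.5,
) -> List[Tuple[str, str]]:
    """Return (prompt, completion) pairs whose score meets the threshold."""
    kept = []
    for prompt in prompts:
        for _ in range(num_samples):
            completion = generate(prompt)
            # Reject any sample the scorer rates below the threshold.
            if score(prompt, completion) >= threshold:
                kept.append((prompt, completion))
    return kept


# Toy stand-ins (assumptions): a noisy "model" that adds two integers
# and an exact-match checker that plays the role of the scorer.
rng = random.Random(0)


def toy_generate(prompt: str) -> str:
    a, b = map(int, prompt.split("+"))
    return str(a + b + rng.choice([-1, 0, 0, 1]))  # sometimes off by one


def toy_score(prompt: str, completion: str) -> float:
    a, b = map(int, prompt.split("+"))
    return 1.0 if int(completion) == a + b else 0.0


# The filtered pairs form the supervised fine-tuning set.
sft_data = rejection_sample(["2+3", "10+7"], toy_generate, toy_score)
```

In a real pipeline, `sft_data` would then be passed to a standard supervised fine-tuning loop; that step is omitted here because it depends on the training framework in use.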