The effectiveness of deep learning is often limited by spurious correlations in training data. We propose Diffusing DeBias (DDB), which exploits conditional diffusion probabilistic models to generate synthetic bias-aligned images. These synthetic samples are used to train a Bias Amplifier (BA) that avoids memorizing the scarce bias-conflicting real samples and can be plugged into both two-step and end-to-end unsupervised debiasing recipes. DDB yields state-of-the-art results on multiple popular biased benchmarks without degrading performance on unbiased data.
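To make the generation step concrete, below is a minimal sketch of ancestral DDPM sampling with classifier-free guidance toward a class label; on a biased training set, this concentrates samples on bias-aligned appearance. This is an illustrative sketch, not the paper's implementation: `eps_model`, `null_label`, `guidance_scale`, and the linear noise schedule are all placeholder assumptions.

```python
# Hypothetical sketch: class-conditional DDPM sampling with classifier-free
# guidance (CFG). Interface and hyperparameters are illustrative, not DDB's.
import torch

def ddpm_cfg_sample(eps_model, class_labels, null_label, guidance_scale=3.0,
                    timesteps=1000, image_shape=(3, 64, 64), device="cpu"):
    """Sample images conditioned on `class_labels` via CFG.

    eps_model(x_t, t, y) -> predicted noise; `null_label` is the index of the
    learned unconditional embedding used during CFG training (assumption).
    """
    betas = torch.linspace(1e-4, 0.02, timesteps, device=device)  # linear schedule
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)

    n = class_labels.shape[0]
    x = torch.randn(n, *image_shape, device=device)  # start from pure noise
    null = torch.full_like(class_labels, null_label)

    for t in reversed(range(timesteps)):
        t_batch = torch.full((n,), t, device=device, dtype=torch.long)
        # Guided noise estimate: push samples toward the class mode, which on
        # a biased dataset emphasizes bias-aligned appearance.
        eps_cond = eps_model(x, t_batch, class_labels)
        eps_uncond = eps_model(x, t_batch, null)
        eps = eps_uncond + guidance_scale * (eps_cond - eps_uncond)

        # Standard DDPM posterior mean; add noise except at the final step.
        coef = betas[t] / torch.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps) / torch.sqrt(alphas[t])
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[t]) * noise
    return x
```

For a quick smoke test, any callable with the assumed signature works, e.g. `ddpm_cfg_sample(lambda x, t, y: torch.zeros_like(x), torch.tensor([0, 1]), null_label=2, timesteps=50, image_shape=(3, 8, 8))`.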
DDB trains a class-conditional diffusion model on the (biased) training set and then uses classifier-free guidance to sample a large set of synthetic images that amplify the dataset's bias patterns. A Bias Amplifier is trained on these synthetic bias-aligned samples; because it never sees the real dataset, it cannot memorize the scarce bias-conflicting examples. The BA then extracts pseudo-labels or per-sample signals for downstream debiasing algorithms (e.g., GroupDRO or LfF-style reweighting).
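A hedged sketch of the Bias Amplifier stage follows: fit a classifier on the synthetic images only, then treat its errors on the real training set as pseudo group labels that a GroupDRO- or LfF-style method can consume. Function names, the error-based grouping rule, and hyperparameters here are assumptions for illustration, not the paper's exact recipe.

```python
# Hypothetical sketch of the Bias Amplifier (BA) stage. The BA is trained on
# synthetic bias-aligned data only, so there are no scarce bias-conflicting
# real examples for it to memorize.
import torch
import torch.nn as nn

def train_bias_amplifier(model, synth_loader, epochs=10, lr=1e-3, device="cpu"):
    """Fit `model` on synthetic (image, class-label) pairs."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in synth_loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model

@torch.no_grad()
def pseudo_group_labels(model, real_loader, device="cpu"):
    """Per-sample signal for downstream debiasing: real samples the BA
    misclassifies are flagged as bias-conflicting (group 1), the rest as
    bias-aligned (group 0); a GroupDRO-style objective can then upweight
    the worst group, or the signal can reweight an LfF-style loss."""
    model.eval()
    groups = []
    for x, y in real_loader:
        preds = model(x.to(device)).argmax(dim=1).cpu()
        groups.append((preds != y).long())  # 1 = bias-conflicting
    return torch.cat(groups)
```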
Massimiliano Ciranni, Vito Paolo Pastore, Roberto Di Via, Enzo Tartaglione, Francesca Odone, Vittorio Murino. Diffusing DeBias: Synthetic Bias Amplification for Model Debiasing. NeurIPS 2025.