Introduction to Generative Modelling
Learn how machines dream up new data, why we care about \(p(\mathbf{x})\), and how to do it responsibly.
"A generator walked into a bar. The barman said, 'We don't serve your type.' The generator replied, 'Let me sample a different persona.'"
What is Generative Modelling?
Generative modelling focuses on creating models that learn the underlying probability distribution of data, \(p(\mathbf{x})\). Once trained, they can sample new data that feels like the real thing—images, audio, text, even strategies. Two big tribes: explicit models that write down densities, and implicit models that learn transformations from simple noise to rich structure.
Explicit Density Models
- Write down \(p(\mathbf{x})\) and learn its parameters by maximizing likelihood.
- Examples: Gaussian Mixture Models, Hidden Markov Models, Normalizing Flows (invertible steps with tractable change-of-variables densities), and autoregressive models such as PixelCNN and WaveNet (see the sketch after this list).
- Variational Autoencoders keep the density explicit but approximate: encode → sample → decode, trained by maximizing a variational lower bound on the likelihood.
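Because an explicit model literally writes down \(p(\mathbf{x})\), you can evaluate log-likelihoods and draw fresh samples from the same object. Here is a minimal sketch using scikit-learn's GaussianMixture; the toy 2-D blobs and the three-component choice are illustrative assumptions, not anything from a real pipeline.

```python
# Explicit density sketch: fit a Gaussian Mixture Model by maximum
# likelihood, then evaluate log p(x) and sample new points from it.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Toy dataset: three blobs standing in for "real" data (an assumption).
data = np.vstack([
    rng.normal(loc=(0, 0), scale=0.5, size=(200, 2)),
    rng.normal(loc=(4, 4), scale=0.5, size=(200, 2)),
    rng.normal(loc=(0, 4), scale=0.5, size=(200, 2)),
])

gmm = GaussianMixture(n_components=3, random_state=0)
gmm.fit(data)                         # EM maximizes the data log-likelihood

log_p = gmm.score_samples(data[:5])   # explicit: we can evaluate log p(x)
samples, _ = gmm.sample(10)           # ...and draw brand-new points
print("log p(x) of first 5 points:", log_p)
print("10 fresh samples:\n", samples)
```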
Implicit Density Models
- Skip the explicit density; learn a transformation from simple noise to data space.
- Examples: GANs, where a generator and a discriminator play an adversarial game; the density is never written down, but sampling stays cheap (a minimal sketch follows this list).
- Sampling feels like a party trick: draw noise, press go, watch structure appear.
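To make "draw noise, press go" concrete, here is a minimal GAN sketch in PyTorch that learns to turn uniform noise into samples from a 1-D Gaussian. Every hyperparameter here (layer sizes, learning rate, step count) is an illustrative guess, not a tuned recipe.

```python
# Minimal GAN: the generator maps noise to samples resembling N(3, 1)
# without ever writing down a density.
import torch
import torch.nn as nn

torch.manual_seed(0)

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))  # noise -> data
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))  # data -> logit

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(2000):
    real = torch.randn(64, 1) + 3.0   # "real" data ~ N(3, 1)
    noise = torch.rand(64, 8)         # simple noise source
    fake = G(noise)

    # Discriminator: push real toward 1, fake toward 0
    d_loss = (bce(D(real), torch.ones(64, 1))
              + bce(D(fake.detach()), torch.zeros(64, 1)))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator: fool D into outputting 1 on fakes
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()

with torch.no_grad():
    samples = G(torch.rand(1000, 8))
print(f"fake mean={samples.mean().item():.2f}, "
      f"std={samples.std().item():.2f}  (target: 3.00, 1.00)")
```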
Where do we use these models?
Medicine (GANs in scrubs)
- Image synthesis to ease data scarcity; MRI → CT translation for easier diagnosis.
- De novo molecular design and drug-target interaction prediction.
- Personalized treatment effects and prognosis prediction via patient-specific generators.
Refs: han2018synthesis, yi2019generative, dar2019image, olivecrona2017molecular, esteban2017real.
Sports (simulators that trash-talk)
- Player trajectory synthesis to explore offensive/defensive strategies.
- Simulated game situations for practice without injuries.
- Performance prediction and virtual coaching with synthetic scenarios.
Refs: lowe2018multi, le2017coordinated, seoane2019human, andrychowicz2017hindsight.
Entertainment (pixels that perform)
- Image, music, and text generation to co-create with artists.
- Style transfer for visuals and video; procedural worlds for games.
- NPC behavior generation for more immersive play.
Refs: goodfellow2014generative, karras2019style, briot2019deep, justesen2019deep.
Education (teachers with turbo mode)
- Automatic question generation and practice sets.
- Essay scoring assistants (with human oversight).
- Personalized learning materials and knowledge tracing.
Refs: du2017learning, taghipour2016neural, piech2015deep.
Datasets that fuel the magic
Text
- Wikipedia (bojanowski2017enriching)
- Common Crawl (najafabadi2015deep)
- Gutenberg (lahiri2014text)
Video
- Kinetics (kay2017kinetics)
- Sports-1M (karpathy2014large)
- HMDB-51 (kuehne2011hmdb)
Audio
- Free Music Archive (de2016fma)
- Speech Commands (warden2018speech)
- UrbanSound8K (salamon2014dataset)
Images
- ImageNet (deng2009imagenet)
- COCO (lin2014microsoft)
- Places365 (zhou2017places)
How do we judge a generator?
Text
Perplexity for likelihood vibes; BLEU for machine translation faithfulness; humans for fluency and coherence.
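Perplexity is just the exponential of the average negative log-likelihood per token. A quick sketch, with made-up token probabilities standing in for a real language model's outputs:

```python
# Perplexity sketch: exp of the mean negative log-likelihood the model
# assigns to held-out tokens. Probabilities below are illustrative stand-ins.
import math

# p(token_t | context) for each token in a tiny held-out sequence
token_probs = [0.20, 0.05, 0.60, 0.10]

nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
perplexity = math.exp(nll)
print(f"perplexity = {perplexity:.2f}")  # lower = the model is less "surprised"
```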
Video
PSNR/SSIM for fidelity; human checks for realism and story flow.
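A sketch of frame-level PSNR and SSIM using scikit-image, with random arrays standing in for a reference frame and a generated one:

```python
# Fidelity metrics sketch: PSNR and SSIM between a reference frame and a
# "generated" frame. The random frames here are placeholders.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

rng = np.random.default_rng(0)
reference = rng.random((64, 64))  # ground-truth frame, values in [0, 1]
generated = np.clip(reference + rng.normal(0, 0.05, (64, 64)), 0, 1)

print("PSNR:", peak_signal_noise_ratio(reference, generated, data_range=1.0))
print("SSIM:", structural_similarity(reference, generated, data_range=1.0))
```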
Audio
MOS for human delight; MSE/STOI for signal quality and intelligibility.
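MOS needs actual humans, but the signal-quality side can be sketched, assuming NumPy for MSE and the third-party pystoi package (pip install pystoi) for STOI; the sine-plus-noise signals and 16 kHz rate are placeholder assumptions:

```python
# Audio metrics sketch: MSE (signal quality) and STOI (intelligibility).
import numpy as np
from pystoi import stoi

fs = 16000
t = np.arange(fs) / fs                # one second of audio
clean = np.sin(2 * np.pi * 440 * t)   # reference signal (illustrative)
degraded = clean + 0.05 * np.random.default_rng(0).normal(size=fs)

mse = np.mean((clean - degraded) ** 2)
intelligibility = stoi(clean, degraded, fs, extended=False)  # ~0..1, higher is better
print(f"MSE = {mse:.4f}, STOI = {intelligibility:.3f}")
```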
Images
FID and Inception Score for distribution match; humans for "does this feel right?".
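FID is the Fréchet distance between two Gaussians fitted to feature statistics of real and generated images. A sketch of the formula itself, with random vectors standing in for the Inception features used in practice:

```python
# FID sketch: ||mu1 - mu2||^2 + Tr(S1 + S2 - 2*sqrtm(S1 @ S2)),
# computed on stand-in features rather than real Inception activations.
import numpy as np
from scipy.linalg import sqrtm

def fid(feats_real, feats_fake):
    mu1, mu2 = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    s1 = np.cov(feats_real, rowvar=False)
    s2 = np.cov(feats_fake, rowvar=False)
    covmean = sqrtm(s1 @ s2)
    if np.iscomplexobj(covmean):   # numerical noise can add tiny imaginary parts
        covmean = covmean.real
    return np.sum((mu1 - mu2) ** 2) + np.trace(s1 + s2 - 2 * covmean)

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1, (500, 16))  # stand-in "Inception" features
fake = rng.normal(0.3, 1, (500, 16))
print(f"FID = {fid(real, fake):.3f}")  # lower = distributions match better
```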
Dangers & Responsible Use
- Misinformation: generators can hallucinate with confidence.
- Ethics: biases in data become biases in outputs.
- Manipulation: deepfakes and disinformation campaigns.
- Creativity drain: over-reliance can flatten human originality.
- Privacy: sensitive data might leak through generations.
- Mitigate: content verification, transparency, and human review.
Tiny Mind Map
Foundation
Data distribution \(p(\mathbf{x})\) + sampling
Toolkits
VAEs · GANs · Flows · Autoregressive
Guardrails
Metrics · Human eval · Safety checks
A 3-line pep talk for your generator:
"Start from noise, embrace the sway,
Bend the likelihood your way,
Check with humans—ship then play."
Mini Lab: Sample a story
Sample a playful scenario to watch noise morph into meaning; it's a toy version of "implicit generation" (sketch below).
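Here is a toy sketch of the same idea: a noise vector deterministically indexes into word lists, so different seeds "generate" different micro-stories. The word lists and the mapping are, of course, playful assumptions.

```python
# Toy "implicit generation": draw noise, map it to structure (a micro-story).
import numpy as np

heroes = ["a generator", "a shy VAE", "a caffeinated GAN", "a tidy flow"]
places = ["a bar", "latent space", "a GPU farm", "the loss landscape"]
twists = ["sampled a new persona", "minimized the awkwardness",
          "interpolated smoothly away", "converged, eventually"]

def sample_story(seed):
    noise = np.random.default_rng(seed).random(3)  # "draw noise"
    h, p, t = (int(z * len(lst))                   # noise -> list indices
               for z, lst in zip(noise, [heroes, places, twists]))
    return f"{heroes[h].capitalize()} walked into {places[p]} and {twists[t]}."

for seed in range(3):                              # "press go"
    print(sample_story(seed))
```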