Introduction to Generative Modelling
Learn how machines dream up new data, why we care about \(p(\mathbf{x})\), and how to do it responsibly.
"A generator walked into a bar. The barman said, 'We don't serve your type.' The generator replied, 'Let me sample a different persona.'"
What is Generative Modelling?
Generative modelling focuses on creating models that learn the underlying probability distribution of data, \(p(\mathbf{x})\). Once trained, they can sample new data that feels like the real thing—images, audio, text, even strategies. Two big tribes: explicit models that write down densities, and implicit models that learn transformations from simple noise to rich structure.
Explicit Density Models
- Write down \(p(\mathbf{x})\) and learn its parameters by maximizing likelihood.
- Examples: Gaussian Mixture Models, Hidden Markov Models, Normalizing Flows (invertible steps with tractable change-of-variables densities), and autoregressive models such as PixelCNN and WaveNet (see the sketch after this list).
- Variational Autoencoders keep the density explicit but approximate: encode → sample → decode, trained by maximizing a variational lower bound on the likelihood.
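Because an explicit model literally writes down \(p(\mathbf{x})\), you can evaluate log-likelihoods and draw fresh samples from the same object. Here is a minimal sketch using scikit-learn's GaussianMixture; the toy 2-D blobs and the three-component choice are illustrative assumptions, not anything from a real pipeline.

```python
# Explicit density sketch: fit a Gaussian Mixture Model by maximum
# likelihood, then evaluate log p(x) and sample new points from it.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Toy dataset: three blobs standing in for "real" data (an assumption).
data = np.vstack([
    rng.normal(loc=(0, 0), scale=0.5, size=(200, 2)),
    rng.normal(loc=(4, 4), scale=0.5, size=(200, 2)),
    rng.normal(loc=(0, 4), scale=0.5, size=(200, 2)),
])

gmm = GaussianMixture(n_components=3, random_state=0)
gmm.fit(data)                         # EM maximizes the data log-likelihood

log_p = gmm.score_samples(data[:5])   # explicit: we can evaluate log p(x)
samples, _ = gmm.sample(10)           # ...and draw brand-new points
print("log p(x) of first 5 points:", log_p)
print("10 fresh samples:\n", samples)
```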
Implicit Density Models
- Skip the explicit density; learn a transformation from simple noise to data space.
- Examples: GANs, where a generator and a discriminator play an adversarial game; the density is never written down, but sampling stays cheap (a minimal sketch follows this list).
- Sampling feels like a party trick: draw noise, press go, watch structure appear.
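To make "draw noise, press go" concrete, here is a minimal GAN sketch in PyTorch that learns to turn uniform noise into samples from a 1-D Gaussian. Every hyperparameter here (layer sizes, learning rate, step count) is an illustrative guess, not a tuned recipe.

```python
# Minimal GAN: the generator maps noise to samples resembling N(3, 1)
# without ever writing down a density.
import torch
import torch.nn as nn

torch.manual_seed(0)

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))  # noise -> data
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))  # data -> logit

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(2000):
    real = torch.randn(64, 1) + 3.0   # "real" data ~ N(3, 1)
    noise = torch.rand(64, 8)         # simple noise source
    fake = G(noise)

    # Discriminator: push real toward 1, fake toward 0
    d_loss = (bce(D(real), torch.ones(64, 1))
              + bce(D(fake.detach()), torch.zeros(64, 1)))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator: fool D into outputting 1 on fakes
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()

with torch.no_grad():
    samples = G(torch.rand(1000, 8))
print(f"fake mean={samples.mean().item():.2f}, "
      f"std={samples.std().item():.2f}  (target: 3.00, 1.00)")
```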
Where do we use these models?
Medicine (GANs in scrubs)
- Image synthesis to ease data scarcity; MRI → CT translation for easier diagnosis.
- De novo molecular design and drug-target interaction prediction.
- Personalized treatment effects and prognosis prediction via patient-specific generators.
Refs: han2018synthesis, yi2019generative, dar2019image, olivecrona2017molecular, esteban2017real.
Sports (simulators that trash-talk)
- Player trajectory synthesis to explore offensive/defensive strategies.
- Simulated game situations for practice without injuries.
- Performance prediction and virtual coaching with synthetic scenarios.
Refs: lowe2018multi, le2017coordinated, seoane2019human, andrychowicz2017hindsight.
Entertainment (pixels that perform)
- Image, music, and text generation to co-create with artists.
- Style transfer for visuals and video; procedural worlds for games.
- NPC behavior generation for more immersive play.
Refs: goodfellow2014generative, karras2019style, briot2019deep, justesen2019deep.
Education (teachers with turbo mode)
- Automatic question generation and practice sets.
- Essay scoring assistants (with human oversight).
- Personalized learning materials and knowledge tracing.
Refs: du2017learning, taghipour2016neural, piech2015deep.
Datasets that fuel the magic
Text
- Wikipedia (bojanowski2017enriching)
- Common Crawl (najafabadi2015deep)
- Gutenberg (lahiri2014text)
Video
- Kinetics (kay2017kinetics)
- Sports-1M (karpathy2014large)
- HMDB-51 (kuehne2011hmdb)
Audio
- Free Music Archive (de2016fma)
- Speech Commands (warden2018speech)
- UrbanSound8K (salamon2014dataset)
Images
- ImageNet (deng2009imagenet)
- COCO (lin2014microsoft)
- Places365 (zhou2017places)
How do we judge a generator?
Text
Perplexity for likelihood vibes; BLEU for machine translation faithfulness; humans for fluency and coherence.
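Perplexity is just the exponential of the average negative log-likelihood per token. A quick sketch, with made-up token probabilities standing in for a real language model's outputs:

```python
# Perplexity sketch: exp of the mean negative log-likelihood the model
# assigns to held-out tokens. Probabilities below are illustrative stand-ins.
import math

# p(token_t | context) for each token in a tiny held-out sequence
token_probs = [0.20, 0.05, 0.60, 0.10]

nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
perplexity = math.exp(nll)
print(f"perplexity = {perplexity:.2f}")  # lower = the model is less "surprised"
```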
Video
PSNR/SSIM for fidelity; human checks for realism and story flow.
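A sketch of frame-level PSNR and SSIM using scikit-image, with random arrays standing in for a reference frame and a generated one:

```python
# Fidelity metrics sketch: PSNR and SSIM between a reference frame and a
# "generated" frame. The random frames here are placeholders.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

rng = np.random.default_rng(0)
reference = rng.random((64, 64))  # ground-truth frame, values in [0, 1]
generated = np.clip(reference + rng.normal(0, 0.05, (64, 64)), 0, 1)

print("PSNR:", peak_signal_noise_ratio(reference, generated, data_range=1.0))
print("SSIM:", structural_similarity(reference, generated, data_range=1.0))
```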
Audio
MOS for human delight; MSE/STOI for signal quality and intelligibility.
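MOS needs actual humans, but the signal-quality side can be sketched, assuming NumPy for MSE and the third-party pystoi package (pip install pystoi) for STOI; the sine-plus-noise signals and 16 kHz rate are placeholder assumptions:

```python
# Audio metrics sketch: MSE (signal quality) and STOI (intelligibility).
import numpy as np
from pystoi import stoi

fs = 16000
t = np.arange(fs) / fs                # one second of audio
clean = np.sin(2 * np.pi * 440 * t)   # reference signal (illustrative)
degraded = clean + 0.05 * np.random.default_rng(0).normal(size=fs)

mse = np.mean((clean - degraded) ** 2)
intelligibility = stoi(clean, degraded, fs, extended=False)  # ~0..1, higher is better
print(f"MSE = {mse:.4f}, STOI = {intelligibility:.3f}")
```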
Images
FID and Inception Score for distribution match; humans for "does this feel right?".
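FID is the Fréchet distance between two Gaussians fitted to feature statistics of real and generated images. A sketch of the formula itself, with random vectors standing in for the Inception features used in practice:

```python
# FID sketch: ||mu1 - mu2||^2 + Tr(S1 + S2 - 2*sqrtm(S1 @ S2)),
# computed on stand-in features rather than real Inception activations.
import numpy as np
from scipy.linalg import sqrtm

def fid(feats_real, feats_fake):
    mu1, mu2 = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    s1 = np.cov(feats_real, rowvar=False)
    s2 = np.cov(feats_fake, rowvar=False)
    covmean = sqrtm(s1 @ s2)
    if np.iscomplexobj(covmean):   # numerical noise can add tiny imaginary parts
        covmean = covmean.real
    return np.sum((mu1 - mu2) ** 2) + np.trace(s1 + s2 - 2 * covmean)

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1, (500, 16))  # stand-in "Inception" features
fake = rng.normal(0.3, 1, (500, 16))
print(f"FID = {fid(real, fake):.3f}")  # lower = distributions match better
```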
Dangers & Responsible Use
- Misinformation: generators can hallucinate with confidence.
- Ethics: biases in data become biases in outputs.
- Manipulation: deepfakes and disinformation campaigns.
- Creativity drain: over-reliance can flatten human originality.
- Privacy: sensitive data might leak through generations.
- Mitigate: content verification, transparency, and human review.
Tiny Mind Map
Foundation
Data distribution \(p(\mathbf{x})\) + sampling
Toolkits
VAEs · GANs · Flows · Autoregressive
Guardrails
Metrics · Human eval · Safety checks
A 3-line pep talk for your generator:
"Start from noise, embrace the sway,
Bend the likelihood your way,
Check with humans—ship then play."
Mini Lab: Sample a story
Sample a playful scenario to watch noise morph into meaning; it's a toy version of "implicit generation" (sketch below).
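Here is a toy sketch of the same idea: a noise vector deterministically indexes into word lists, so different seeds "generate" different micro-stories. The word lists and the mapping are, of course, playful assumptions.

```python
# Toy "implicit generation": draw noise, map it to structure (a micro-story).
import numpy as np

heroes = ["a generator", "a shy VAE", "a caffeinated GAN", "a tidy flow"]
places = ["a bar", "latent space", "a GPU farm", "the loss landscape"]
twists = ["sampled a new persona", "minimized the awkwardness",
          "interpolated smoothly away", "converged, eventually"]

def sample_story(seed):
    noise = np.random.default_rng(seed).random(3)  # "draw noise"
    h, p, t = (int(z * len(lst))                   # noise -> list indices
               for z, lst in zip(noise, [heroes, places, twists]))
    return f"{heroes[h].capitalize()} walked into {places[p]} and {twists[t]}."

for seed in range(3):                              # "press go"
    print(sample_story(seed))
```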