Introduction to Generative Modelling

Learn how machines dream up new data, why we care about $p(\mathbf{x})$, and how to do it responsibly.

"A generator walked into a bar. The barman said, 'We don't serve your type.' The generator replied, 'Let me sample a different persona.'"

🎨

What is Generative Modelling?

Generative modelling builds models that learn the underlying probability distribution of the data, \(p(\mathbf{x})\). Once trained, they can sample new data that feels like the real thing: images, audio, text, even strategies. Two big tribes: explicit models that write down densities, and implicit models that learn transformations from simple noise to rich structure.

Explicit Density Models

  • Write down \(p(\mathbf{x})\) and learn its parameters by maximizing likelihood.
  • Examples: Gaussian Mixture Models, Hidden Markov Models, Normalizing Flows (invertible steps with exact likelihood), Autoregressive models (PixelCNN, WaveNet).
  • Variational Autoencoders encode → sample → decode, maximizing a variational lower bound on the likelihood.

Implicit Density Models

  • Skip the explicit density; learn a transformation from simple noise to data space.
  • Canonical example: GANs, where a generator and a discriminator compete in an adversarial game.
  • Sampling feels like a party trick: draw noise, press go, watch structure appear.
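The split is easiest to see in a few lines of NumPy. This is a minimal sketch, not any particular model: it fits an explicit density (a single Gaussian) by maximum likelihood, then samples from an "implicit" model by pushing noise through a hand-picked transformation.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=3.0, scale=2.0, size=10_000)

# Explicit: assume data ~ N(mu, sigma^2) and write the density down.
# The maximum-likelihood estimates are just the sample mean and std.
mu_hat, sigma_hat = data.mean(), data.std()
print(f"explicit fit: mu ≈ {mu_hat:.2f}, sigma ≈ {sigma_hat:.2f}")

# Implicit: never write p(x); define sampling as noise -> transformation.
def generator(n):
    z = rng.normal(size=n)      # simple noise
    return 3.0 + 2.0 * z        # a hand-picked (not learned) transformation

samples = generator(10_000)
print(f"implicit samples: mean ≈ {samples.mean():.2f}, std ≈ {samples.std():.2f}")
```

Both routes end at the same distribution; real implicit models replace the hand-picked transformation with a learned neural network.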

🌍

Where do we use these models?

Medicine (GANs in scrubs)

  • Image synthesis for data scarcity; MRI → CT translation for easier diagnosis.
  • De novo molecular design and drug-target interaction prediction.
  • Personalized treatment effects and prognosis prediction via patient-specific generators.

Refs: han2018synthesis, yi2019generative, dar2019image, olivecrona2017molecular, esteban2017real.

Sports (simulators that trash-talk)

  • Player trajectory synthesis to explore offensive/defensive strategies.
  • Simulated game situations for practice without injuries.
  • Performance prediction and virtual coaching with synthetic scenarios.

Refs: lowe2018multi, le2017coordinated, seoane2019human, andrychowicz2017hindsight.

Entertainment (pixels that perform)

  • Image, music, and text generation to co-create with artists.
  • Style transfer for visuals and video; procedural worlds for games.
  • NPC behavior generation for more immersive play.

Refs: goodfellow2014generative, karras2019style, briot2019deep, justesen2019deep.

Education (teachers with turbo mode)

  • Automatic question generation and practice sets.
  • Essay scoring assistants (with human oversight).
  • Personalized learning materials and knowledge tracing.

Refs: du2017learning, taghipour2016neural, piech2015deep.

🧺

Datasets that fuel the magic

Text

  • Wikipedia (bojanowski2017enriching)
  • Common Crawl (najafabadi2015deep)
  • Gutenberg (lahiri2014text)

Video

  • Kinetics (kay2017kinetics)
  • Sports-1M (karpathy2014large)
  • HMDB-51 (kuehne2011hmdb)

Audio

  • Free Music Archive (de2016fma)
  • Speech Commands (warden2018speech)
  • UrbanSound8K (salamon2014dataset)

Images

  • ImageNet (deng2009imagenet)
  • COCO (lin2014microsoft)
  • Places365 (zhou2017places)

🧮

How do we judge a generator?

Text

Perplexity for likelihood vibes; BLEU for machine translation faithfulness; humans for fluency and coherence.
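Perplexity in particular is easy to compute by hand: it is the exponential of the average negative log-likelihood per token. A minimal sketch (the token probabilities here are hypothetical, for illustration only):

```python
import math

def perplexity(log_probs):
    """Perplexity = exp of the average negative log-likelihood per token.

    `log_probs` holds the natural-log probabilities a model assigned to
    each observed token.
    """
    nll = -sum(log_probs) / len(log_probs)
    return math.exp(nll)

# A model that assigns probability 0.25 to every token has perplexity 4:
# it is "as confused as" a uniform choice over 4 options.
print(perplexity([math.log(0.25)] * 10))  # ≈ 4.0
```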

Video

PSNR/SSIM for fidelity; human checks for realism and story flow.
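PSNR is simple enough to compute directly (SSIM is more involved and usually comes from a library such as scikit-image). A sketch for 8-bit frames:

```python
import numpy as np

def psnr(ref, test, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized images."""
    ref = np.asarray(ref, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    mse = np.mean((ref - test) ** 2)
    if mse == 0:
        return float("inf")   # identical images: infinite PSNR
    return 10.0 * np.log10(max_val ** 2 / mse)

ref = np.zeros((8, 8))
noisy = np.full((8, 8), 10.0)   # constant error of 10 per pixel
print(f"PSNR ≈ {psnr(ref, noisy):.2f} dB")  # ≈ 28.13 dB
```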

Audio

MOS for human delight; MSE/STOI for signal quality and intelligibility.
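MOS needs human listeners and STOI a dedicated implementation, but the MSE part is one line. A toy sketch on a synthetic waveform (the signal and noise level are made up for illustration):

```python
import numpy as np

t = np.linspace(0.0, 1.0, 16_000)              # 1 s at 16 kHz
clean = np.sin(2 * np.pi * 440 * t)            # reference 440 Hz tone
noisy = clean + np.random.default_rng(1).normal(scale=0.1, size=t.shape)

mse = np.mean((clean - noisy) ** 2)            # squared error per sample
print(f"MSE ≈ {mse:.4f}")                      # close to the noise variance, 0.01
```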

Images

FID and Inception Score for distribution match; humans for "does this feel right?".
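Full FID runs both image sets through an Inception-v3 network and compares Gaussian fits of the activations, but the distance itself is just a formula. A sketch of that Fréchet distance applied to raw feature vectors (the Inception step is omitted), assuming SciPy is available:

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(x, y):
    """Fréchet distance between Gaussian fits of two feature sets.

    FID applies this to Inception-v3 activations; here we apply it
    directly to rows of raw features for illustration.
    """
    mu_x, mu_y = x.mean(axis=0), y.mean(axis=0)
    cov_x = np.cov(x, rowvar=False)
    cov_y = np.cov(y, rowvar=False)
    covmean = sqrtm(cov_x @ cov_y)
    if np.iscomplexobj(covmean):   # sqrtm can pick up tiny imaginary noise
        covmean = covmean.real
    diff = mu_x - mu_y
    return diff @ diff + np.trace(cov_x + cov_y - 2.0 * covmean)

x = np.random.default_rng(0).normal(size=(500, 4))
print(frechet_distance(x, x))        # identical sets: distance ≈ 0
print(frechet_distance(x, x + 1.0))  # shifted by 1 in each of 4 dims: ≈ 4
```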

⚠️

Dangers & Responsible Use

  • Misinformation: generators can hallucinate with confidence.
  • Ethics: biases in data become biases in outputs.
  • Manipulation: deepfakes and disinformation campaigns.
  • Creativity drain: over-reliance can flatten human originality.
  • Privacy: sensitive data might leak through generations.
  • Mitigate: content verification, transparency, and human review.

🧭

Tiny Mind Map

Foundation

Data distribution \(p(\mathbf{x})\) + sampling

Toolkits

VAEs · GANs · Flows · Autoregressive

Guardrails

Metrics · Human eval · Safety checks

A 3-line pep talk for your generator:

"Start from noise, embrace the sway,
Bend the likelihood your way,
Check with humans—ship then play."

🧪

Mini Lab: Sample a story

A toy version of "implicit generation": draw simple noise, apply a fixed transformation, and watch it morph into meaning.
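The lab fits in a few lines of NumPy: this sketch draws featureless uniform noise and pushes it through a fixed transformation, and a structured shape (a noisy ring) appears. Real implicit models learn the transformation instead of hard-coding it.

```python
import numpy as np

rng = np.random.default_rng(42)

z = rng.uniform(0.0, 2.0 * np.pi, size=500)     # latent noise: just angles
eps = rng.normal(scale=0.05, size=(2, 500))     # small observation noise
x = np.stack([np.cos(z), np.sin(z)]) + eps      # "data" living on a circle

radii = np.linalg.norm(x, axis=0)               # distance of each point from origin
print(f"mean radius ≈ {radii.mean():.3f}")      # close to 1: noise became a ring
```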