πŸ“ˆ

Convex Functions

From Jensen's Inequality to Log-Sum-Exp: The Core of Convex Optimization.

πŸ“˜

1. What is a Convex Function?

Mathematical Definition

A function \( f: \mathbb{R}^n \rightarrow \mathbb{R} \) is convex if \( \textbf{dom } f \) is a convex set and for all \( x, y \in \textbf{dom } f \) and \( 0 \leq \theta \leq 1 \):

\[ f(\theta x + (1 - \theta) y) \leq \theta f(x) + (1-\theta) f(y) \]

Geometric Interpretation

The line segment joining points \( (x, f(x)) \) and \( (y, f(y)) \) lies above the graph of \( f \).

Terminology
  • Strictly Convex: Inequality is strict for \( x \neq y, 0 < \theta < 1 \).
  • Concave: \( -f \) is convex.
  • Strictly Concave: \( -f \) is strictly convex.

Visual Definition

Convex Function Geometric Definition

The line segment (chord) connecting any two points on the graph lies on or above the curve.

1️⃣

2. First Order Condition

Gradient Inequality

Suppose \( f \) is differentiable. Then \( f \) is convex if and only if \( \textbf{dom } f \) is convex and:

\[ f(y) \geq f(x) + \nabla f(x)^T (y - x) \]

for all \( x, y \in \textbf{dom } f \).

Interpretation: The first-order Taylor approximation is a global underestimator of the function. The tangent plane always lies below the graph!

First Order Convexity Condition

The tangent plane (first-order approximation) is always a global underestimator.

2️⃣

3. Second Order Condition

Hessian Condition

Assume \( f \) is twice differentiable. Then \( f \) is convex if and only if its Hessian is positive semidefinite:

\[ \nabla^2 f(x) \succeq 0 \]

For 1D functions, this means \( f''(x) \geq 0 \) (positive curvature).

Second Order Convexity Condition

Positive definite Hessian implies positive curvature in all directions (bowl shape).

Example: Quadratic Function

Consider \( f(x) = \frac{1}{2} x^T P x + q^T x + r \).

The gradient is \( \nabla f(x) = Px + q \).

The Hessian is \( \nabla^2 f(x) = P \).

Conclusion: \( f \) is convex if and only if \( P \succeq 0 \).

⚠️ Important Note

The domain MUST be convex! Example: \( f(x) = 1/x^2 \) has \( f''(x) > 0 \) everywhere, but its domain \( \mathbb{R} \setminus \{0\} \) is not convex, so the function is not convex.

πŸ’‘

4. Famous Examples

Log-Sum-Exp

\[ f(x) = \log\left( \sum_{i=1}^n e^{x_i} \right) \]

This is a smooth approximation of the max function. It is convex.

Log-Determinant

\[ f(X) = \log \det X, \quad X \succ 0 \]

This function is concave on the set of positive definite matrices.

Norms

\[ f(x) = \|x\|_p \]

Every norm is convex due to the triangle inequality.

Negative Entropy

\[ f(x) = x \log x \]

Convex on \( \mathbb{R}_{++} \) (since \( f''(x) = 1/x > 0 \)).

πŸ”οΈ

5. Epigraph and Hypograph

Epigraph ("Above Graph")

The epigraph of a function \( f: \mathbb{R}^n \rightarrow \mathbb{R} \) is the set of points lying on or above its graph:

\[ \textbf{epi } f = \{ (x,t) \mid x \in \textbf{dom } f, f(x) \leq t \} \]

Hypograph ("Below Graph")

The hypograph is the set of points lying on or below the graph:

\[ \textbf{hypo } f = \{ (x,t) \mid x \in \textbf{dom } f, t \leq f(x) \} \]

Key Theorems

  • A function is convex if and only if its epigraph is a convex set.
  • A function is concave if and only if its hypograph is a convex set.
Epigraph Visualization

The epigraph is the shaded region above the curve.

βš–οΈ

6. Jensen's Inequality

Jensen's Inequality Visualization

The Inequality

For a convex function \( f \), the function value of the average is less than or equal to the average of the function values:

\[ f(\theta x + (1-\theta)y) \leq \theta f(x) + (1-\theta)f(y) \]

General Form (Expectation):

\[ f(\mathbb{E}[x]) \leq \mathbb{E}[f(x)] \]

Applications

  • AM-GM Inequality: \( \sqrt{ab} \leq \frac{a+b}{2} \)

    Proof: Use \( f(x) = -\log x \) (convex) and \( \theta = 0.5 \).

  • HΓΆlder's Inequality: Relates \( L_p \) norms.

    Proof uses \( f(x) = -\log x \) with specific weights.

πŸ”§

7. Operations Preserving Convexity

1. Non-Negative Weighted Sum

If \( f_1, \dots, f_m \) are convex and \( w_i \geq 0 \), then \( f = \sum w_i f_i \) is convex.

2. Composition with Affine Map

If \( f \) is convex, then \( g(x) = f(Ax + b) \) is convex.

3. Pointwise Maximum

If \( f_1, f_2 \) are convex, then \( f(x) = \max\{ f_1(x), f_2(x) \} \) is convex.

Example: The max of affine functions \( \max_i (a_i^T x + b_i) \) is convex.

4. Composition Rules

For \( f(x) = h(g(x)) \):

  • \( f \) is convex if \( h \) is convex & non-decreasing, and \( g \) is convex.
  • \( f \) is convex if \( h \) is convex & non-increasing, and \( g \) is concave.
πŸŽ“

8. Advanced Example: Geometric Mean

Theorem: Geometric Mean is Concave

The function \( f(x) = (\prod_{i=1}^n x_i)^{1/n} \) is concave on \( \mathbb{R}_{++}^n \).

Proof Sketch:

Consider the Hessian \( \nabla^2 f(x) \). After some derivation, we find:

\[ v^T \nabla^2 f(x) v = -\frac{\prod x_i^{1/n}}{n^2} \left( n \sum \frac{v_i^2}{x_i^2} - (\sum \frac{v_i}{x_i})^2 \right) \]

By the Cauchy-Schwarz inequality (with vectors \( \mathbf{1} \) and \( v_i/x_i \)), the term in the parentheses is non-negative.

Thus, \( v^T \nabla^2 f(x) v \leq 0 \), so the Hessian is negative semidefinite. Therefore, \( f \) is concave. \(\blacksquare\)

9. Extended Value Extensions

To handle domains implicitly, we define the extended value extension \( \tilde{f} \):

\[ \tilde{f}(x) = \begin{cases} f(x) & x \in \textbf{dom } f \\ \infty & x \notin \textbf{dom } f \end{cases} \]

Indicator Function

For a set \( C \), the indicator function \( I_C(x) \) is 0 if \( x \in C \) and \( \infty \) otherwise.

Fact: A set \( C \) is convex if and only if its indicator function \( I_C \) is convex.

⭐

10. Conjugate Functions

Definition

Let \( f: \mathbb{R}^n \rightarrow \mathbb{R} \). The conjugate function \( f^*: \mathbb{R}^n \rightarrow \mathbb{R} \) is defined as:

\[ f^*(y) = \sup_{x \in \textbf{dom } f} (y^T x - f(x)) \]

The domain of \( f^* \) consists of \( y \) for which this supremum is finite.

Key Property

\( f^* \) is always convex, regardless of whether \( f \) is convex!

Reason: It is the pointwise supremum of a family of affine functions of \( y \).

Conjugate Function Geometric Interpretation

Geometric interpretation: The maximum gap between the linear function \( y^T x \) and \( f(x) \).

Examples of Conjugates

Affine: \( ax + b \)

\( f^*(y) = -b \) if \( y=a \), else \( \infty \)

Neg Log: \( -\log x \)

\( f^*(y) = -1 - \log(-y) \) for \( y < 0 \)

Exp: \( e^x \)

\( f^*(y) = y \log y - y \) for \( y \geq 0 \)

Entropy: \( x \log x \)

\( f^*(y) = e^{y-1} \)

Inverse: \( 1/x \)

\( f^*(y) = -2\sqrt{-y} \) for \( y \leq 0 \)

✍️

11. Test Your Understanding

1. Is the function \( f(x) = x^3 \) convex on \( \mathbb{R} \)?

2. Is the conjugate function \( f^* \) always convex?

πŸ” Spot the Mistake!

A student claims:

"The function \( f(x) = x^2 \) is strictly convex, so its Hessian must be strictly positive definite everywhere (i.e., \( f''(x) > 0 \))."

True. Strictly convex means strictly positive curvature.
False. Counterexample: \( f(x) = x^4 \).
←

Previous Chapter

Analysis

Next Chapter

Convex Optimization

β†’