Convex Functions | Optimization Methods

📘

1. What is a Convex Function?

Mathematical Definition

A function \( f: \mathbb{R}^n \rightarrow \mathbb{R} \) is convex if \( \textbf{dom } f \) is a convex set and for all \( x, y \in \textbf{dom } f \) and \( 0 \leq \theta \leq 1 \):

\[ f(\theta x + (1 - \theta) y) \leq \theta f(x) + (1-\theta) f(y) \]

Geometric Interpretation

The line segment joining points \( (x, f(x)) \) and \( (y, f(y)) \) lies above the graph of \( f \).

Terminology

Strictly Convex: Inequality is strict for \( x \neq y, 0 < \theta < 1 \).
Concave: \( -f \) is convex.
Strictly Concave: \( -f \) is strictly convex.

Visual Definition

The line segment (chord) connecting any two points on the graph lies on or above the curve.

1️⃣

2. First Order Condition

Gradient Inequality

Suppose \( f \) is differentiable. Then \( f \) is convex if and only if \( \textbf{dom } f \) is convex and:

\[ f(y) \geq f(x) + \nabla f(x)^T (y - x) \]

for all \( x, y \in \textbf{dom } f \).

Interpretation: The first-order Taylor approximation is a global underestimator of the function. The tangent plane always lies below the graph!

The tangent plane (first-order approximation) is always a global underestimator.

2️⃣

3. Second Order Condition

Hessian Condition

Assume \( f \) is twice differentiable. Then \( f \) is convex if and only if its Hessian is positive semidefinite:

\[ \nabla^2 f(x) \succeq 0 \]

For 1D functions, this means \( f''(x) \geq 0 \) (positive curvature).

Positive definite Hessian implies positive curvature in all directions (bowl shape).

Example: Quadratic Function

Consider \( f(x) = \frac{1}{2} x^T P x + q^T x + r \).

The gradient is \( \nabla f(x) = Px + q \).

The Hessian is \( \nabla^2 f(x) = P \).

Conclusion: \( f \) is convex if and only if \( P \succeq 0 \).

⚠️ Important Note

The domain MUST be convex! Example: \( f(x) = 1/x^2 \) has \( f''(x) > 0 \) everywhere, but its domain \( \mathbb{R} \setminus \{0\} \) is not convex, so the function is not convex.

💡

4. Famous Examples

Log-Sum-Exp

\[ f(x) = \log\left( \sum_{i=1}^n e^{x_i} \right) \]

This is a smooth approximation of the max function. It is convex.

Log-Determinant

\[ f(X) = \log \det X, \quad X \succ 0 \]

This function is concave on the set of positive definite matrices.

Norms

\[ f(x) = \|x\|_p \]

Every norm is convex due to the triangle inequality.

Negative Entropy

\[ f(x) = x \log x \]

Convex on \( \mathbb{R}_{++} \) (since \( f''(x) = 1/x > 0 \)).

🏔️

5. Epigraph and Hypograph

Epigraph ("Above Graph")

The epigraph of a function \( f: \mathbb{R}^n \rightarrow \mathbb{R} \) is the set of points lying on or above its graph:

\[ \textbf{epi } f = \{ (x,t) \mid x \in \textbf{dom } f, f(x) \leq t \} \]

Hypograph ("Below Graph")

The hypograph is the set of points lying on or below the graph:

\[ \textbf{hypo } f = \{ (x,t) \mid x \in \textbf{dom } f, t \leq f(x) \} \]

Key Theorems

A function is convex if and only if its epigraph is a convex set.
A function is concave if and only if its hypograph is a convex set.

The epigraph is the shaded region above the curve.

⚖️

6. Jensen's Inequality

The Inequality

For a convex function \( f \), the function value of the average is less than or equal to the average of the function values:

\[ f(\theta x + (1-\theta)y) \leq \theta f(x) + (1-\theta)f(y) \]

General Form (Expectation):

\[ f(\mathbb{E}[x]) \leq \mathbb{E}[f(x)] \]

Applications

AM-GM Inequality: \( \sqrt{ab} \leq \frac{a+b}{2} \)
Proof: Use \( f(x) = -\log x \) (convex) and \( \theta = 0.5 \).
Hölder's Inequality: Relates \( L_p \) norms.
Proof uses \( f(x) = -\log x \) with specific weights.

🔧

7. Operations Preserving Convexity

1. Non-Negative Weighted Sum

If \( f_1, \dots, f_m \) are convex and \( w_i \geq 0 \), then \( f = \sum w_i f_i \) is convex.

2. Composition with Affine Map

If \( f \) is convex, then \( g(x) = f(Ax + b) \) is convex.

3. Pointwise Maximum

If \( f_1, f_2 \) are convex, then \( f(x) = \max\{ f_1(x), f_2(x) \} \) is convex.

Example: The max of affine functions \( \max_i (a_i^T x + b_i) \) is convex.

4. Composition Rules

For \( f(x) = h(g(x)) \):

\( f \) is convex if \( h \) is convex & non-decreasing, and \( g \) is convex.
\( f \) is convex if \( h \) is convex & non-increasing, and \( g \) is concave.

🎓

8. Advanced Example: Geometric Mean

Theorem: Geometric Mean is Concave

The function \( f(x) = (\prod_{i=1}^n x_i)^{1/n} \) is concave on \( \mathbb{R}_{++}^n \).

Proof Sketch:

Consider the Hessian \( \nabla^2 f(x) \). After some derivation, we find:

\[ v^T \nabla^2 f(x) v = -\frac{\prod x_i^{1/n}}{n^2} \left( n \sum \frac{v_i^2}{x_i^2} - (\sum \frac{v_i}{x_i})^2 \right) \]

By the Cauchy-Schwarz inequality (with vectors \( \mathbf{1} \) and \( v_i/x_i \)), the term in the parentheses is non-negative.

Thus, \( v^T \nabla^2 f(x) v \leq 0 \), so the Hessian is negative semidefinite. Therefore, \( f \) is concave. \(\blacksquare\)

9. Extended Value Extensions

To handle domains implicitly, we define the extended value extension \( \tilde{f} \):

\[ \tilde{f}(x) = \begin{cases} f(x) & x \in \textbf{dom } f \\ \infty & x \notin \textbf{dom } f \end{cases} \]

Indicator Function

For a set \( C \), the indicator function \( I_C(x) \) is 0 if \( x \in C \) and \( \infty \) otherwise.

Fact: A set \( C \) is convex if and only if its indicator function \( I_C \) is convex.

⭐

10. Conjugate Functions

Definition

Let \( f: \mathbb{R}^n \rightarrow \mathbb{R} \). The conjugate function \( f^*: \mathbb{R}^n \rightarrow \mathbb{R} \) is defined as:

\[ f^*(y) = \sup_{x \in \textbf{dom } f} (y^T x - f(x)) \]

The domain of \( f^* \) consists of \( y \) for which this supremum is finite.

Key Property

\( f^* \) is always convex, regardless of whether \( f \) is convex!

Reason: It is the pointwise supremum of a family of affine functions of \( y \).

Conjugate Function Geometric Interpretation

Geometric interpretation: The maximum gap between the linear function \( y^T x \) and \( f(x) \).

Examples of Conjugates

Affine: \( ax + b \)

\( f^*(y) = -b \) if \( y=a \), else \( \infty \)

Neg Log: \( -\log x \)

\( f^*(y) = -1 - \log(-y) \) for \( y < 0 \)

Exp: \( e^x \)

\( f^*(y) = y \log y - y \) for \( y \geq 0 \)

Entropy: \( x \log x \)

\( f^*(y) = e^{y-1} \)

Inverse: \( 1/x \)

\( f^*(y) = -2\sqrt{-y} \) for \( y \leq 0 \)

✍️

11. Test Your Understanding

1. Is the function \( f(x) = x^3 \) convex on \( \mathbb{R} \)?

2. Is the conjugate function \( f^* \) always convex?

🔍 Spot the Mistake!

A student claims:

"The function \( f(x) = x^2 \) is strictly convex, so its Hessian must be strictly positive definite everywhere (i.e., \( f''(x) > 0 \))."

True. Strictly convex means strictly positive curvature.

False. Counterexample: \( f(x) = x^4 \).