Programming lesson
Mastering Subdifferentials: Convex Analysis Homework Tutorial (Ece490)
Struggling with subdifferential proofs for convex functions? This tutorial breaks down closedness, convexity, and subgradients of L1, L-infinity, and L2 norms with real-world AI and trend examples.
Understanding Subdifferentials in Convex Analysis
Subdifferentials are a cornerstone of convex analysis, essential for optimization in machine learning, finance, and AI. In this tutorial, we'll tackle key concepts from Ece490 homework 5, focusing on proving that the subdifferential of a convex function is closed and convex, and computing subdifferentials for common norm functions. We'll connect these ideas to current trends like AI model training and portfolio optimization.
What is a Subdifferential?
For a convex function f : ℝn → ℝ, the subdifferential at a point x is the set of all subgradients g such that f(y) ≥ f(x) + g·(y – x) for all y. This generalizes the derivative for nonsmooth functions, crucial in applications like training neural networks with ReLU activations or optimizing L1-regularized models (LASSO).
Proving Closedness and Convexity of ∂f(x)
Let f be convex and x ∈ dom(f). We show ∂f(x) is closed and convex.
Closedness
Take a convergent sequence (gk) in ∂f(x) with limit g. For any y, f(y) ≥ f(x) + gk·(y – x). Taking limits, f(y) ≥ f(x) + g·(y – x), so g ∈ ∂f(x). Hence ∂f(x) is closed.
Convexity
Take g1, g2 ∈ ∂f(x) and λ ∈ [0,1]. For any y, f(y) ≥ f(x) + g1·(y – x) and similarly for g2. Multiply by λ and (1-λ) and add: f(y) ≥ f(x) + (λg1 + (1-λ)g2)·(y – x). So λg1+(1-λ)g2 ∈ ∂f(x), proving convexity.
This property is analogous to how a well-trained AI model's parameter updates form a convex set, ensuring stable convergence.
Subdifferentials of Norm Functions
Norms are convex and widely used in regularization. We compute ∂f(x) and directional derivative f'(x; v) for three norms.
(a) ℓ1 Norm: f(x) = ‖x‖1
The subdifferential at x is: ∂f(x) = { g : gi = sign(xi) if xi ≠ 0, gi ∈ [-1,1] if xi = 0 }. Directional derivative: f'(x; v) = ∑i (sign(xi)vi if xi ≠ 0 else |vi|).
Trend connection: In LASSO regression for predicting stock trends, the ℓ1 norm induces sparsity, selecting only the most influential features.
(b) ℓ∞ Norm: f(x) = ‖x‖∞
The subdifferential: ∂f(x) = conv{ sign(xi)ei : |xi| = ‖x‖∞ }. Directional derivative: f'(x; v) = max{ sign(xi)vi : |xi| = ‖x‖∞ }.
Trend connection: In robust optimization for AI systems, the ℓ∞ norm bounds worst-case perturbations, akin to ensuring a self-driving car's sensor errors stay within limits.
(c) ℓ2 (Euclidean) Norm: f(x) = ‖x‖2
If x ≠ 0, ∂f(x) = { x / ‖x‖2 }. If x = 0, ∂f(0) = { g : ‖g‖2 ≤ 1 }. Directional derivative: f'(x; v) = (x·v)/‖x‖2 for x ≠ 0, and ‖v‖2 for x = 0.
Trend connection: In training generative AI models like diffusion models, the ℓ2 norm measures reconstruction error, with subgradients guiding gradient descent.
Subdifferential of a Piecewise-Linear Convex Function
Let f(x) = maxi=1,…,m (ai·x + bi). The subdifferential is: ∂f(x) = conv{ ai : ai·x + bi = f(x) }. This is a convex polytope.
Trend connection: Piecewise-linear functions appear in AI decision trees and boosting algorithms, where subgradients enable efficient optimization.
Subdifferential of a Maximum of Smooth Functions
If f(x) = maxi=1,…,m fi(x) with each fi convex and differentiable, then: ∂f(x) = { ∑i∈I(x) λi ∇fi(x) : λi ≥ 0, ∑λi = 1 }, where I(x) = { i : fi(x) = f(x) }.
Trend connection: In ensemble methods for AI, like random forests, the maximum of multiple models' outputs is used, and subdifferentials help in backpropagation-like updates.
Practical Tips for Homework
- For proofs, always start with the definition of subdifferential.
- Use the convexity inequality to manipulate sequences.
- For norm functions, consider cases where x = 0 separately.
- Check that your subdifferential set is indeed convex and closed as proven.
Conclusion
Understanding subdifferentials is key to advanced optimization in AI, finance, and engineering. This tutorial covered the closedness and convexity of ∂f(x), subdifferentials of ℓ1, ℓ∞, ℓ2 norms, piecewise-linear functions, and maximum of smooth functions. Apply these concepts to your Ece490 homework and beyond.