Lecture 3: Optimization
Advanced Microeconomics I
Yosuke YASUDA
Osaka University, Department of Economics yasuda@econ.osaka-u.ac.jp
October 14, 2014
What and How to Optimize?
Optimization is a set of mathematical procedures to find the optimal value of some function.
We frequently adopt the assumption that an economic agent seeks to maximize or minimize some function, for example:
A consumer maximizes her utility function.
A firm minimizes its cost function.
A seller at an auction maximizes her (expected) revenue function.
The government tries to maximize a social welfare function.
We study (and apply) static optimization.
→ Dynamic optimization, a common tool in modern macroeconomics, will not be covered in our lectures...
Equality Constraints
Consider choosing x1 and x2 to maximize f(x1, x2), when x1 and x2 must satisfy some particular relation to each other, which we write in implicit form as g(x1, x2) = 0.
Formally, we write this problem as follows:

max_{x1,x2} f(x1, x2) subject to g(x1, x2) = 0.
f(x1, x2): objective function. x1 and x2: choice variables. g(x1, x2): constraint.
The set of all (x1, x2) that satisfy the constraint: feasible set.
Q: Does a solution always exist?
Continuity (1)
Intuitively, a function is continuous if a “small movement” in the domain does not cause a “big jump” in the range.
Definition 1
A function f : D (⊂ Rn) → R is called
1 continuous at a point x0 if, for all ε > 0, there exists δ > 0 such that d(x, x0) < δ implies that d(f(x), f(x0)) < ε.
2 continuous if it is continuous at every point in its domain.
3 uniformly continuous if, for all ε > 0, there exists δ > 0 such that for all x, x0 ∈ D, d(x, x0) < δ implies that d(f(x), f(x0)) < ε.
Rm: If a function f is uniformly continuous, then it must be continuous. However, the converse is not true.
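A standard counterexample (not stated in the slides) shows why the converse fails:

```latex
% f(x) = 1/x is continuous on D = (0,1) but NOT uniformly continuous.
% Fix any \delta > 0 and take x_0 = \delta, x = \delta/2; then
% d(x, x_0) = \delta/2 < \delta, yet
% d(f(x), f(x_0)) = \left| \tfrac{2}{\delta} - \tfrac{1}{\delta} \right| = \tfrac{1}{\delta},
% which is arbitrarily large for small \delta. Hence, for a given
% \varepsilon, no single \delta works at every point of the domain:
% the required \delta shrinks as x_0 approaches 0.
f(x) = \frac{1}{x}, \qquad D = (0, 1)
```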
Continuity (2)
Now we consider more general functions.
Suppose D ⊂ Rm and the range lies in Rn, i.e., f : D (⊂ Rm) → Rn.
The image of f is the set of points in the range into which some point in the domain is mapped:
I := {y|y = f (x) for some x ∈ D} ⊂ R
The inverse image of a set of points S ⊂ I is defined as:

f^{-1}(S) := {x | x ∈ D, f(x) ∈ S}
Definition 2 (Def A1.9)
Let D be a subset of Rm, and let f : D → Rn. Then f is
continuous at a point x0 if, for all ε > 0, there exists δ > 0 such that

f(Bδ(x0) ∩ D) ⊂ Bε(f(x0)).
Existence of Solutions
Compact Set
A set S in Rn is called bounded if there exists some ε > 0 such that S ⊂ Bε(x) for some x ∈ Rn.
A set S in Rn is called compact if it is closed and bounded.

Theorem 3 (Weierstrass, Existence of Extreme Values)
Let f : S → R be a continuous real-valued function where S is a non-empty compact subset of Rn. Then f attains its maximum and minimum values on S. That is, there exist vectors x̄ and x̲ in S such that

f(x̲) ≤ f(x) ≤ f(x̄) for all x ∈ S.
Fg: Figure A1.18 (see JR, pp.522)
Most problems in economics have compact domains and continuous objective functions. ⇒ Solutions guaranteed!
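As a numerical illustration (not from JR), a coarse grid search on a compact interval approximates the maximum and minimum whose existence the Weierstrass theorem guarantees; the function x·e^(−x) on [0, 2] is my own choice here.

```python
# Illustration: a continuous function on the compact set S = [0, 2]
# attains its max and min (Weierstrass). Approximate both by
# evaluating f on a fine grid of points in S.
import math

def f(x):
    return x * math.exp(-x)  # continuous on [0, 2]

grid = [i * 2.0 / 10000 for i in range(10001)]
values = [f(x) for x in grid]
fmax, fmin = max(values), min(values)
argmax = grid[values.index(fmax)]

# The true maximizer of x*exp(-x) is x = 1, with value 1/e;
# the minimum on [0, 2] is 0, attained at x = 0.
print(argmax, fmax, fmin)
```

On a non-compact domain such as (0, ∞) with f(x) = 1/x, no minimum exists, which is why compactness matters for the guarantee.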
Partial Derivative
Definition 4
Let {hk} be an (infinite) sequence of real numbers converging to 0, and let f : S (⊂ Rn) → R be a continuous function. Then, we say that f is differentiable with respect to xi at x if

lim_{k→∞} [f(x1, ..., xi + hk, ..., xn) − f(x1, ..., xi, ..., xn)] / hk

always exists and does not depend on the choice of {hk}. We call this limit the partial derivative of f with respect to xi, denoted by ∂f(x)/∂xi or fi(x).
Note that the partial derivative itself is a function of x.
If f is differentiable with respect to every element xi, i = 1, 2, ..., n, at x, then we say that f is differentiable at x. If f is differentiable at all x ∈ S, then we say that f is a differentiable function.
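The limit in Definition 4 can be sketched numerically by fixing one small h in the difference quotient; the helper `partial` and the example function below are illustrative choices, not part of the lecture.

```python
# Sketch: approximate the partial derivative f_i(x) by the
# difference quotient (f(x + h*e_i) - f(x)) / h with a small h.
def partial(f, x, i, h=1e-6):
    """Difference quotient from Definition 4 along coordinate i."""
    xh = list(x)
    xh[i] += h
    return (f(xh) - f(list(x))) / h

# Example function: f(x1, x2) = x1^2 * x2,
# so f_1(x) = 2*x1*x2 and f_2(x) = x1^2.
f = lambda x: x[0] ** 2 * x[1]
print(partial(f, [3.0, 2.0], 0))  # close to 2*3*2 = 12
print(partial(f, [3.0, 2.0], 1))  # close to 3^2 = 9
```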
Total Derivative
Definition 5
For a differentiable function f, the total differential is defined as follows:

df(x) = ∂f(x)/∂x1 dx1 + ∂f(x)/∂x2 dx2 + · · · + ∂f(x)/∂xn dxn.
If f is a differentiable function and fi(x) is a continuous function for all i, then we say f is continuously differentiable.
Lagrange’s Method (1)
Lagrange’s method is a powerful way to solve (equality) constrained optimization problems, which essentially translates them into unconstrained problems.
Again, consider the following problem:

max_{x1,x2} f(x1, x2) subject to g(x1, x2) = 0.
Let us construct a new function L(x1, x2, λ), called the Lagrangian function as follows:
L(x1, x2, λ) = f (x1, x2) + λg(x1, x2).
Lagrange’s Method (2)
Then maximize this Lagrangian function, that is, derive the first order conditions:
∂L/∂x1 = ∂f(x1*, x2*)/∂x1 + λ* ∂g(x1*, x2*)/∂x1 = 0
∂L/∂x2 = ∂f(x1*, x2*)/∂x2 + λ* ∂g(x1*, x2*)/∂x2 = 0
∂L/∂λ = g(x1*, x2*) = 0.
Lagrange’s method asserts that if we find values x1*, x2*, and λ* that solve these three equations simultaneously, then we will have a critical point of f along the constraint g(x1, x2) = 0.
Practice of Lagrange’s Method
Ex: Example A2.11 (see JR, pp.508)
max_{x1,x2} x1 x2 subject to a − 2x1 − 4x2 = 0

Forming the Lagrangian, we get

L = x1 x2 + λ[a − 2x1 − 4x2],

with first order conditions:

∂L/∂x1 = x2 − 2λ = 0,
∂L/∂x2 = x1 − 4λ = 0,
∂L/∂λ = a − 2x1 − 4x2 = 0.

These can be solved to find

x1 = a/4, x2 = a/8, λ = a/16.
Note that the solution of the problem is a function of parameter a.
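The closed-form solution above can be checked numerically; the concrete value a = 8 below is an arbitrary choice for illustration.

```python
# Check Example A2.11 at a = 8: the candidate x1 = a/4, x2 = a/8,
# lambda = a/16 satisfies all three first-order conditions.
a = 8.0
x1, x2, lam = a / 4, a / 8, a / 16

assert abs(x2 - 2 * lam) < 1e-12         # dL/dx1 = 0
assert abs(x1 - 4 * lam) < 1e-12         # dL/dx2 = 0
assert abs(a - 2 * x1 - 4 * x2) < 1e-12  # dL/dlambda = 0 (feasibility)

# A coarse search along the constraint (substitute x2 = (a - 2*x1)/4)
# confirms that x1 = a/4 maximizes x1*x2 among feasible points.
best = max((t * (a - 2 * t) / 4, t)
           for t in [i * a / 2 / 1000 for i in range(1001)])
print(best[1])  # close to x1 = a/4 = 2.0
```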
Envelope Theorem (1)
Consider the following constrained optimization problem P1:

P1 : max_x f(x, a) s.t. g(x, a) = 0,
where x is a vector of choice variables, and a := (a1, ..., am) is a vector of parameters that may enter the objective function, the constraint, or both. Suppose that for each vector a, the solution is unique and denoted by x(a).
A maximum-value function, denoted by M(a), is defined as follows:

M(a) := max_x f(x, a) s.t. g(x, a) = 0,

or equivalently, M(a) := f(x(a), a).
Envelope Theorem (2)
If the objective function, constraint, and the solutions are differentiable in the parameters, there is a very powerful theorem that shows how the solutions vary with the parameters.
Theorem 6 (Envelope Theorem)
Consider P1 and suppose the objective function and constraint are continuously differentiable in a. For each a, let x(a) ≫ 0 uniquely solve P1 and assume that it is also continuously differentiable in the parameters a. Then, the Envelope theorem states that
∂M(a)/∂aj = ∂L/∂aj |_(x(a), λ(a)),   j = 1, ..., m,

where the right-hand side denotes the partial derivative of the Lagrangian function with respect to the parameter aj, evaluated at the point (x(a), λ(a)).
See JR, pp.506-507 for the proof.
Practice of Envelope Theorem
Ex: Example A2.11 (again)
max_{x1,x2} x1 x2 s.t. a − 2x1 − 4x2 = 0.

We form the maximum-value function by substituting the solutions for x1 and x2 into the objective function. Thus,

M(a) = x1(a) x2(a) = (a/4) · (a/8) = a²/32.

Differentiating M(a) with respect to a, we get

dM(a)/da = a/16,

which tells us how the maximized value varies with a. Applying the Envelope theorem, we directly obtain

dM(a)/da = ∂L/∂a |_(x(a), λ(a)) = λ(a).

Since λ(a) = a/16, we have verified that the Envelope theorem works.
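A minimal numerical sketch of this verification (the value a = 8 is an arbitrary choice): a finite difference of M(a) = a²/32 should match the multiplier λ(a) = a/16.

```python
# Envelope theorem check for Example A2.11: dM/da should equal
# lambda(a) = a/16, where M(a) = a^2/32 is the maximum-value function.
def M(a):
    return a * a / 32

a, h = 8.0, 1e-6
dM = (M(a + h) - M(a - h)) / (2 * h)  # central finite difference
lam = a / 16                          # multiplier from Lagrange's method
print(dM, lam)  # both close to 0.5
```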
Intuitions of Lagrange’s Method (1)
Q: Why does Lagrange's method work?

Take the total differential of the Lagrangian:
dL = ∂L/∂x1 dx1 + ∂L/∂x2 dx2 + ∂L/∂λ dλ.

When (x1, x2, λ) = (x1*, x2*, λ*), it can be re-written as follows:

dL = (∂f(x1*, x2*)/∂x1 + λ* ∂g(x1*, x2*)/∂x1) dx1 + (∂f(x1*, x2*)/∂x2 + λ* ∂g(x1*, x2*)/∂x2) dx2 + g(x1*, x2*) dλ = 0.

Since g(x1*, x2*) = 0,

0 = dL = ∂f(x1*, x2*)/∂x1 dx1 + ∂f(x1*, x2*)/∂x2 dx2 + λ* (∂g(x1*, x2*)/∂x1 dx1 + ∂g(x1*, x2*)/∂x2 dx2)

for all dx1 and dx2 that satisfy the constraint g.
Intuitions of Lagrange’s Method (2)
Note that

dg = ∂g(x1*, x2*)/∂x1 dx1 + ∂g(x1*, x2*)/∂x2 dx2 = 0.

So, we can show that

dL = ∂f(x1*, x2*)/∂x1 dx1 + ∂f(x1*, x2*)/∂x2 dx2 = 0
for all dx1 and dx2 that satisfy the constraint g. Thus, (x∗1, x∗2) is indeed a critical point of f given that the variables must satisfy the constraint.
Lagrange’s method is very clever and useful. In effect, it offers us an algorithm for identifying the constrained optima in a wide class of practical problems.