Lecture 3: Optimization
Advanced Microeconomics I
Yosuke YASUDA
National Graduate Institute for Policy Studies
October 10, 2013
What and How to Optimize?
Optimization is a set of mathematical procedures to find the optimal value of some function.
We frequently adopt the assumption that an economic agent seeks to maximize or minimize some function, for example:
◮ Consumer maximizes her utility function.
◮ Firm minimizes its cost function.
◮ Seller at the auction maximizes (expected) revenue function.
◮ Government tries to maximize the social welfare function.
We study (and apply) static optimization.
→ Dynamic optimization, a common tool in modern macroeconomics, will not be covered in our lectures...
Equality Constraints
Consider choosing x1 and x2 to maximize f(x1, x2), when x1 and x2 must satisfy some particular relation to each other that we write in implicit form as g(x1, x2) = 0.
Formally, we write this problem as follows:
max_{x1, x2} f(x1, x2) subject to g(x1, x2) = 0.
◮ f(x1, x2): objective function.
◮ x1 and x2: choice variables.
◮ g(x1, x2): constraint.
◮ The set of all (x1, x2) that satisfy the constraint: feasible set.
Q Does a solution always exist?
Continuity (1)
Intuitively, a function is continuous if a “small movement” in the domain does not cause a “big jump” in the range.
Def A function f : D (⊂ Rn) → R is called
1. continuous at a point x0 if, for all ε > 0, there exists δ > 0 such that d(x, x0) < δ implies that d(f(x), f(x0)) < ε.
2. continuous if it is continuous at every point in its domain.
3. uniformly continuous if, for all ε > 0, there exists δ > 0 such that, for all x, x0 ∈ D, d(x, x0) < δ implies that d(f(x), f(x0)) < ε.
Rm If a function f is uniformly continuous, then it must be continuous: the same δ works at every point, in particular at each one. However, the converse is not true; for example, f(x) = 1/x is continuous but not uniformly continuous on (0, 1).
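To see the difference concretely, the following sketch (a hypothetical numerical illustration, not from the text) looks at f(x) = x² on R: with a fixed ε and a fixed step size, the variation of f over that step grows with x0, so no single δ can serve every point, i.e., f is continuous but not uniformly continuous on R.

```python
# Hypothetical illustration: f(x) = x**2 is continuous on R but not
# uniformly continuous, because the delta needed for a given epsilon
# shrinks as |x0| grows.

def f(x):
    return x * x

eps = 0.1   # target bound on |f(x) - f(x0)|
h = 0.005   # a fixed candidate step |x - x0|

# Near x0 = 1 the fixed step keeps f within eps...
gap_small = abs(f(1 + h) - f(1))        # = 2*1*h + h**2, well below eps
# ...but near x0 = 100 the same step overshoots eps.
gap_large = abs(f(100 + h) - f(100))    # = 2*100*h + h**2, above eps

print(gap_small < eps, gap_large > eps)
```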
Continuity (2)
Now we consider more general functions.
Suppose D ⊂ Rm and the range is R = Rn, i.e., f : D (⊂ Rm) → Rn.
◮ The image of f is the set of points in the range into which some point in the domain is mapped:
I := {y|y = f (x) for some x ∈ D} ⊂ R
◮ The inverse image of a set of points S ⊂ I is defined as: f−1(S) := {x|x ∈ D, f (x) ∈ S}
Def A1.9 Let D be a subset of Rm, and let f : D → Rn. Then f is continuous at a point x0 if, for all ε > 0, there exists δ > 0 such that
f(Bδ(x0) ∩ D) ⊂ Bε(f (x0)).
Existence of Solutions
Compact Set
◮ A set S in Rn is called bounded if there exists some ε > 0 such that S ⊂ Bε(x) for some x ∈ Rn.
◮ A set S in Rn is called compact if it is closed and bounded.
Thm A1.10 (Weierstrass) Existence of Extreme Values
Let f : S → R be a continuous real-valued function, where S is a non-empty compact subset of Rn. Then f attains its maximum and minimum values on S. That is, there exist vectors x* and x̃ in S such that
f(x̃) ≤ f(x) ≤ f(x*) for all x ∈ S.
◮ Fg Figure A1.18 (see JR, pp. 522)
◮ Most problems in economics have compact domains and continuous objective functions. ⇒ Solutions guaranteed!
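The Weierstrass theorem is an existence result; a crude way to see it at work (a sketch with a made-up objective function, not from the text) is a brute-force search over a fine grid, which is feasible precisely because the domain is closed and bounded:

```python
# Sketch: approximate the extreme values of a continuous function on the
# compact set S = [0, 1] by grid search. The example function x*(1 - x)
# is hypothetical, chosen only for illustration.

def f(x):
    return x * (1 - x)   # continuous on the compact set [0, 1]

n = 1000
grid = [i / n for i in range(n + 1)]
values = [f(x) for x in grid]

f_max = max(values)   # attained at an interior point, x = 0.5
f_min = min(values)   # attained at the boundary, x = 0 and x = 1
print(f_min, f_max)
```

On an open or unbounded domain no such guarantee holds: x*(1 - x) on (0, 1) never attains the value 0.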
Partial Derivative
Def Let {hk} be an (infinite) sequence of real numbers converging to 0, and let f : S (⊂ Rn) → R be a continuous function. Then, we say that f is differentiable with respect to xi at x if

lim_{k→∞} [f(x1, ..., xi + hk, ..., xn) − f(x1, ..., xi, ..., xn)] / hk

always exists and does not depend on the choice of {hk}. We call this limit the partial derivative with respect to xi, denoted by ∂f(x)/∂xi or fi(x). Note that the partial derivative itself is a function of x.
◮ If f is differentiable with respect to every element xi, i = 1, 2, ..., n, at x, then we say that f is differentiable at x.
◮ If f is differentiable at all x ∈ S, then we say that f is a differentiable function.
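The limit in the definition can be checked numerically with a small step h standing in for the tail of {hk}; the example function below is hypothetical, chosen so the analytic partial derivatives are easy to verify.

```python
# Sketch: approximate the partial derivatives of f(x1, x2) = x1**2 * x2
# by the difference quotient from the definition.

def f(x1, x2):
    return x1 ** 2 * x2

def partial(f, i, x, h=1e-6):
    """Difference quotient [f(..., xi + h, ...) - f(x)] / h."""
    bumped = list(x)
    bumped[i] += h
    return (f(*bumped) - f(*x)) / h

x = (2.0, 3.0)
d1 = partial(f, 0, x)   # analytic value: 2*x1*x2 = 12
d2 = partial(f, 1, x)   # analytic value: x1**2   = 4
print(d1, d2)
```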
Total Derivative
Def For a differentiable function f, the total differential is defined as follows:

df(x) = (∂f(x)/∂x1) dx1 + (∂f(x)/∂x2) dx2 + · · · + (∂f(x)/∂xn) dxn.
◮ If f is a differentiable function and fi(x) is a continuous function for all i, then we say f is continuously differentiable.
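The total differential approximates the change in f for a small simultaneous move in all coordinates. This sketch (with the same kind of hypothetical example function) compares df with the exact change:

```python
# Sketch: for f(x1, x2) = x1**2 * x2, compare the total differential
# df = f1*dx1 + f2*dx2 with the exact change f(x + dx) - f(x).

def f(x1, x2):
    return x1 ** 2 * x2

x1, x2 = 2.0, 3.0
dx1, dx2 = 1e-4, -2e-4

f1 = 2 * x1 * x2    # partial derivative ∂f/∂x1 = 12
f2 = x1 ** 2        # partial derivative ∂f/∂x2 = 4

df = f1 * dx1 + f2 * dx2                      # linear approximation
actual = f(x1 + dx1, x2 + dx2) - f(x1, x2)    # exact change

print(df, actual)   # the two agree up to second-order terms
```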
Lagrange’s Method (1)
Lagrange’s method is a powerful way to solve (equality) constrained optimization problems, which essentially translates them into unconstrained problems.
Again, consider the following problem: maxx1,x2f(x1, x2) subject to g(x1, x2) = 0.
Let us construct a new function L(x1, x2, λ), called the Lagrangian function as follows:
L(x1, x2, λ) = f (x1, x2) + λg(x1, x2).
Lagrange’s Method (2)
Then maximize this Lagrangian function; that is, derive the first-order conditions:

∂L/∂x1 = ∂f(x1*, x2*)/∂x1 + λ* ∂g(x1*, x2*)/∂x1 = 0
∂L/∂x2 = ∂f(x1*, x2*)/∂x2 + λ* ∂g(x1*, x2*)/∂x2 = 0
∂L/∂λ = g(x1*, x2*) = 0.
◮ Lagrange’s method asserts that if we find values x1*, x2*, and λ* that solve these three equations simultaneously, then we will have a critical point of f along the constraint g(x1, x2) = 0.
Practice of Lagrange’s Method
Ex Example A2.11 (see JR, pp. 508)

max_{x1, x2} x1x2 subject to a − 2x1 − 4x2 = 0

Forming the Lagrangian, we get

L = x1x2 + λ[a − 2x1 − 4x2],

with first-order conditions:

∂L/∂x1 = x2 − 2λ = 0
∂L/∂x2 = x1 − 4λ = 0
∂L/∂λ = a − 2x1 − 4x2 = 0.

These can be solved to find

x1 = a/4, x2 = a/8, λ = a/16.
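The solution can be verified by plugging it back into the three first-order conditions; a quick numerical check for one particular parameter value (a = 8, chosen arbitrarily):

```python
# Sketch: verify the closed-form solution x1 = a/4, x2 = a/8, lam = a/16
# of Example A2.11 by checking all three first-order conditions at a = 8.

a = 8.0
x1, x2, lam = a / 4, a / 8, a / 16

foc1 = x2 - 2 * lam           # ∂L/∂x1 = 0
foc2 = x1 - 4 * lam           # ∂L/∂x2 = 0
foc3 = a - 2 * x1 - 4 * x2    # ∂L/∂λ  = 0 (the constraint)

print(foc1, foc2, foc3)   # all three vanish at the solution
```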
Envelope Theorem (1)
Consider the following constrained optimization problem P1:

P1 : max_x f(x, a) s.t. g(x, a) = 0,

where x is a vector of choice variables, and a := (a1, ..., am) is a vector of parameters that may enter the objective function, the constraint, or both. Suppose that for each vector a, the solution is unique and denoted by x(a).
◮ A maximum-value function, denoted by M(a), is defined as follows:

M(a) := max_x f(x, a) s.t. g(x, a) = 0,

or equivalently, M(a) := f(x(a), a).
Envelope Theorem (2)
If the objective function, constraint, and the solutions are differentiable in the parameters, there is a very powerful theorem that shows how the solutions vary with the parameters.
Thm Envelope Theorem
Consider P1 and suppose the objective function and constraint are continuously differentiable in a. For each a, let x(a) ≫ 0 uniquely solve P1 and assume that it is also continuously differentiable in the parameters a. Then, the Envelope theorem states that

∂M(a)/∂aj = ∂L/∂aj |_{x(a), λ(a)},  j = 1, ..., m,

where the right-hand side denotes the partial derivative of the Lagrangian function with respect to the parameter aj, evaluated at the point (x(a), λ(a)).
Practice of Envelope Theorem
Ex Example A2.11 (again)

max_{x1, x2} x1x2 s.t. a − 2x1 − 4x2 = 0.

We form the maximum-value function by substituting the solutions for x1 and x2 into the objective function. Thus,

M(a) = x1(a)x2(a) = (a/4) · (a/8) = a²/32.

Differentiating M(a) with respect to a, we get

dM(a)/da = a/16,

which tells us how the maximized value varies with a. Applying the Envelope theorem, we directly obtain

dM(a)/da = ∂L/∂a |_{x(a), λ(a)} = λ(a).

Since λ(a) = a/16, we have verified that the Envelope theorem works.
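The theorem can also be confirmed numerically: differentiate M(a) by central differences and compare with the multiplier λ(a) = a/16 (a = 8 is an arbitrary test value):

```python
# Sketch: check the Envelope theorem for Example A2.11 numerically.
# M(a) = f(x1(a), x2(a)) with x1(a) = a/4 and x2(a) = a/8, and the
# theorem predicts dM/da = lambda(a) = a/16.

def M(a):
    return (a / 4) * (a / 8)   # maximum-value function, = a**2 / 32

a, h = 8.0, 1e-5
dM = (M(a + h) - M(a - h)) / (2 * h)   # central-difference derivative
lam = a / 16                            # multiplier at the optimum

print(dM, lam)   # the two coincide, as the theorem predicts
```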
Intuitions of Lagrange’s Method (1)
Q Why does Lagrange’s method work?

Take the total differential of the Lagrangian:

dL = (∂L/∂x1) dx1 + (∂L/∂x2) dx2 + (∂L/∂λ) dλ.

When (x1, x2, λ) = (x1*, x2*, λ*), it can be re-written as follows:

dL = [∂f(x1*, x2*)/∂x1 + λ* ∂g(x1*, x2*)/∂x1] dx1 + [∂f(x1*, x2*)/∂x2 + λ* ∂g(x1*, x2*)/∂x2] dx2 + g(x1*, x2*) dλ = 0.

Since g(x1*, x2*) = 0,

0 = dL = (∂f(x1*, x2*)/∂x1) dx1 + (∂f(x1*, x2*)/∂x2) dx2 + λ* [(∂g(x1*, x2*)/∂x1) dx1 + (∂g(x1*, x2*)/∂x2) dx2].
Intuitions of Lagrange’s Method (2)
Note that, for moves along the constraint,

dg = (∂g(x1*, x2*)/∂x1) dx1 + (∂g(x1*, x2*)/∂x2) dx2 = 0.

So, we can show that

dL = (∂f(x1*, x2*)/∂x1) dx1 + (∂f(x1*, x2*)/∂x2) dx2 = 0

for all dx1 and dx2 that satisfy the constraint g. Thus, (x1*, x2*) is indeed a critical point of f given that the variables must satisfy the constraint.
Lagrange’s method is very clever and useful. In effect, it offers us an algorithm for identifying the constrained optima in a wide class of practical problems.
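The argument says that, at (x1*, x2*), moves (dx1, dx2) that keep g = 0 leave f unchanged to first order. For Example A2.11 with a = 8 the constraint is linear, so we can move exactly along it and watch f change only at second order (a numerical sketch):

```python
# Sketch: at the optimum (x1*, x2*) = (2, 1) of Example A2.11 with a = 8,
# move along the constraint a - 2*x1 - 4*x2 = 0 (direction (2, -1) keeps
# it satisfied exactly) and observe that f = x1*x2 changes only at
# second order in the step size t.

a = 8.0
x1s, x2s = 2.0, 1.0   # the solution x1 = a/4, x2 = a/8

def f(x1, x2):
    return x1 * x2

t = 1e-3
x1, x2 = x1s + 2 * t, x2s - t          # a move along the constraint
on_constraint = a - 2 * x1 - 4 * x2    # remains 0

delta_f = f(x1, x2) - f(x1s, x2s)      # = -2*t**2: no first-order term
print(on_constraint, delta_f)
```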