On global minimization for general $p$-regularized subproblems with $p>2$ (Nonlinear Analysis and Convex Analysis)
全文
(2) 48. order regularization term. The. (p‐RS) where. ,. It is often assumed that. f. is smooth. decrease in the value of. f. ,. global. accepted;. it is. f. enough. convergence. If the. global. takes the. following model. \displaystyle \min_{x\in \mathb {R}^{n} \{g(x)=\frac{1}{2}x^{T}Hx+c^{T}x+\frac{ $\sigma$}{p}\Vert x\Vert^{p}\},. H is the Hessian of. $\sigma$>0, p>2 and. the desired. subproblem. but. at any. iterate, regardless of its definiteness.. to have. a. symmetric Hessian and. minimizer of. rejected. (p‐RS). renders. otherwise with. to obtain. satisfactory. a. increase in. an. $\sigma$. to. enhance the regularization force.. literature, (p‐RS). In most. common. p=3. is known. choice among all others.. due to Griewank convergence and. [9]. and later. and. by. Gould and Toint. Cartis,. regularization. The idea of the cubic. considered. was. the cubic. as. and. [2].. which is the. regularization. many authors with. complexity analysis. See Nesterov. [17];. and Erdmann. with. Polyak [15];. When p=4. first. thorough global. Weiser Deuflhard. (p‐RS). ,. was. reduces to. a. form of the double well potential function which has many applications in solid mechanics and quantum mechanics. general p>2. in. [5, 18]. Gould,. Robinson and Thorne. [8]. studied. (p‐RS). for. a. comparison with the the trust‐region subproblem. (TRS). \displaystyle \min. \displaystyle \frac{1}{2}x^{T}Hx+c^{T}x. \mathrm{s}. \mathrm{t}. \Vert x\Vert^{2}\leq\triangle, x\in \mathbb{R}^{n}. In this. article,. we. characterize. necessary and sufficient. (p‐RS) completely. global optimality conditions for p=3. using the secular function (to be specified later) for p=4 Notations. matrix. Let. v. denote the. P\in \mathbb{R}^{n\times n}, P\succ(\succeq)0. means. P is denoted. by \det(P) whereas. Diag (x). diagonal. $\beta$\in \mathbb{R},. is. a. \displaystyle\mathrm{s}\mathrm{i}\mathrm{g}\mathrm{n}($\beta$)=\frac{$\beta$}{|$\beta$|}. eigenvalue of. P.. the. matrix with. if. p>2 by (i) extending the. for any. $\beta$\neq 0. ,. in. in. [2]. identity. n. diagonal components being. otherwise. For any. (semi)definite.. matrix of order. analysis. by. I. x_{1} ,. .. symmetric. The determinant of For. \cdots. \mathrm{s}\mathrm{i}\mathrm{g}\mathrm{n}( $\beta$)=0 Finally, $\lambda$_{i}(P) .. the. [18].. optimal value of problem. that P is positive. (ii). and. ,. a. x_{n}. .. vector. For. a. x\in \mathbb{R}^{n}, number. is the ith smallest.
(3) 49. 2Main results for We first observe that the. global minimization. objective. function. g(x). (p‐RS). of. is. coercive, i.e.,. \displaystyle \lim g(x)=+\infty.. | x\Vert\rightarrow+\infty. Consequently, the global analysis is the first. minimizer of. (p‐RS) always. exists.. The. starting point of the. order and the second order necessary conditions for any local minimizer. of g.. Lemma 1 Assume that \underline{x} is. a. local minimizer. of (p‐RS), p>2. \nabla g(\underline{x})=(H+ $\sigma$\Vert\underline{x}\Vert^{p-2}I)\underline{x}+c=0. .. It holds that. (1). ,. \nabla^{2}g(\underline{x})=(H+ $\sigma$\Vert\underline{x}\Vert^{p-2}I)+ $\sigma$(p-2)\Vert\underline{x}\Vert^{p-4}\underline{xx}^{T}\succeq 0 where. \nabla g, \nabla^{2}g denote. the. gradient and the Hessian of g(x) respectively. ,. The next theorem shows. $\sigma$\Vert\underline{x}\Vert^{p-2}I\succeq sufficiency. O. The. that,. a. local minimizer \underline{x} becomes. necessity has been shown by Theorem. The point x^{*} is. critical point. satisfy ing \nabla g(x^{*})=0. all the. minimizers. global. If x^{*}=0_{n}. ,. are. then. a. global. minimizer. 2 in. [8].. if and We. and. of (p‐RS) for p>2 if. only. only. and. H+ $\sigma$\Vert x^{*}\Vert^{p-2}I\succeq 0 Moreover, .. if H+. prove the. only if it. the. P_{2}. is. norms. a. of. equal.. $\sigma$\Vert x^{*}\Vert^{p-2}=0. so. that. H=H+ $\sigma$\Vert x^{*}\Vert^{p-2}I\succeq 0 Consequently, x^{T}Hx\geq 0, .. global. global. here.. Theorem 1. Proof. (2). ,. c=-(H+ $\sigma$\Vert x^{*}\Vert^{p-2}I)x^{*}=0. and. \forall x\in \mathbb{R}^{n} It follows that x^{*}=0_{n} is .. minimizer since. g(x)=\displaystyle \frac{1}{2}x^{T}Hx+c^{T}x+\frac{ $\sigma$}{p}\Vert x\Vert^{p}\geq\frac{ $\sigma$}{p}\Vert x\Vert^{p}>0=g(0) , \foral x\neq 0_{n}=x^{*}.. a.
(4) 50. Now. x^{*}\neq 0_{n} i.e., \Vert x^{*}\Vert>. we assume. ,. assumption, Q\succeq 0 Then, for .. O. Define. any x\in \mathbb{R}^{n} and. Q=H+ $\sigma$\Vert x^{*}\Vert^{p-2}I According. to the. .. x\neq x^{*}. ,. it holds that. g(x) = \displaystyle \frac{1}{2}x^{T}Hx+c^{T}x+\frac{ $\sigma$}{p}\Vert x\Vert^{p} = \displaystyle \frac{1}{2}x^{T}Qx+c^{T}x-\frac{1}{2}( $\sigma$\Vert x^{*}\Vert^{p-2})x^{T}x+\frac{ $\sigma$}{p}\Vert x\Vert^{p}. =\displaystyle\frac{1}{2}x^{T}Qx+c^{T_{X} +\frac{$\sigma$}{p}\Vertx^{*}\Vert^{p}( \frac{|x\Vert^{2} {\Vertx^{*}\Vert^{2} )^{2}e-\frac{p}{2}\frac{|x ^{2} {\Vertx^{*}|^{2} ). Define. f(t)=t^{2}2,. p>2. .. It is. strictly. convex. for t>0. .. (3). Therefore,. f(t)=t^{B}2\displaystyle \geq f(1)+f'(1)(t-1)=1+\frac{p}{2}(t-1) , \forall t>0. By substituting. t with. \displaystle\frac{|x\Vert^{2} \Vertx^{*}\Vert^{2}. ,. we. have. (\displaystyle\frac{|x|^{2} {\Vertx^{*}|^{2} )^{2}-\frac{p}{2}\frac{|x|^{2} {\Vertx^{*}|^{2} \geq21-\frac{p}{2}. Then,. g(x) \displaystyle \geq \frac{1}{2}x^{T}Qx+c^{T}x+\frac{ $\sigma$}{p}\Vert x^{*}\Vert^{p}(1-\frac{p}{2}) By Q\succeq 0 quadratic. ,. the lower. in terms of. minimizer of the. x. bounding function of .. convex. Since x^{*} satisfies function in the. g in the. right. (4). .. hand side of. (H+ $\sigma$\Vert x^{*}\Vert^{p-2}I)x^{*}=Qx^{*}=-c,. right. hand side of. (4).. As. a. (4). is. x^{*} is. convex a. global. consequence,. g(x)\displaystyle \geq\frac{1}{2}(x^{*})^{T}Qx^{*}+c^{T}x^{*}+\frac{ $\sigma$}{p}\Vert x^{*}\Vert^{p}(1-\frac{p}{2})=g(x^{*}) and x^{*} is. a. Finally,. global from. \displaystyle \frac{1}{2}x^{T}Qx+c^{T}x Qx=-c. and. Remark 1. found. minimizer of. (3),. and. (p‐RS).. if \hat{x} is also. a. global. minimizer of. (\displaystyle\frac{|x\Vert^{2} {\Vertx^{*}|^{2} )^{2}B-2\Vertx\Vert^{2}. 2\overline{\Vert x^{*}| ^{2} simultaneously.. (p‐RS),. This. can. \hat{x} must minimize both. happen. if and. only. if. can. be. \Vert\hat{x}\Vert=\Vert x^{*}\Vert.. When p=3 , two other. in Theorem 3.1 in. l2J. proofs of the. and Theorem 10 in. necessary and. sufficient. [15J respectively. ,. condition. We notice that the. proof.
(5) 51. in. [2]. is inherited. from. of. that. the necessary and. subproblem l37 and the proof in l15\displaystyle \int highly relies Our. proof is. between. much easier in. g(x). and. g(x^{*}). understanding,. sufficient condition for the trust‐region the special structure. on. based. simply. since it is. on a. of the direct. case. p=3.. comparison. .. To characterize the set of. global. (p‐RS),. minimizers of. we. may. assume. that H is. diagonal, i.e.,. H=\mathrm{D}\mathrm{i}\mathrm{a}\mathrm{g}(\mathrm{a}_{1}, \ldots, $\alpha$_{n}). (5). ,. where. $\alpha$_{1}=\ldots=$\alpha$_{k}<$\alpha$_{k+1}\leq\ldots\leq$\alpha$_{n} and k is the. multiplicity. of the smallest. eigenvalue decomposition of a. diagonal. version of. Theorem 2 The set. (p‐RS). H. .. Let. eigenvalue. y=U^{T}x. .. Let x^{*} be any. that t^{*} is. independent of the. global. minimizers. If H+ $\sigma$ t^{*}I is. choice of x^{*} since the .. .. .. specific. we. obtain. singleton. n. ,. .. is,. k ‐dimensional. t^{*}=\Vert x^{*}\Vert^{p-2}\geq 0. of all the. That. or a. global. Notice. .. minimizers. are. t^{*}\displaystyle \in(\max\{- $\alpha \Delta \sigma$, 0\}, +\infty). define the set of the. minimizer x^{*} is. uniquely. defined. .. global. by (the. that) . on a. norms. satisfying (H+ $\sigma$ t^{*}I)x^{*}=-c. invertible, the global. a. and define. \ell_{2}. x_{i}^{*}=\underline{-c_{i}}i=1 $\alpha$_{i}+ $\sigma$ t^{*} By summing all. is either. (p‐RS). minimizer of. ,. x^{*}. and. be the. in terms of y.. equal. By Theorem 1, $\alpha$_{i}+ $\sigma$ t^{*}\geq 0, \forall i=1 2, Moreover, the solutions. Otherwise, let H=U $\Sigma$ U^{T}. .. \Vert y\Vert=\Vert U^{T}x\Vert=\Vert x\Vert. Then. of global minimizers of (p‐RS). Proof. still unknown t^{*}. $\alpha$_{1}. (x_{i}^{*})^{2}, t^{*}. is. necessarily. a. ,. nonnegative. ... .. ,. n.. root of the. following secular function. open interval:. h(t)=\displaystyle \sum_{i=1}^{n}\frac{c_{i}^{2} {($\alpha$_{i}+ $\sigma$ t)^{2} -t\frac{2}{\mathrm{p}-2}, t\in I_{g}=(\max\{-\frac{$\alpha$_{1} { $\sigma$},0\}, +\infty). .. (6).
(6) 52. \displaystyle \lim_{t\rightar ow\max\{-\lrcorner^{$\alpha$_{0\}} $\sigma$},h(t)>0, \displaystyle \lim_{t\rightar ow+\infty}h(t)=-\infty. Since. secular function. On the other not. can. h(t). has. unique. a. hand, H+ $\sigma$ t^{*}I. root. is. happen for $\alpha$_{1}>0. ) Then,. I_{g}. on. singular. ,. h(t). and. is. strictly decreasing. on. I_{g}. the. ,. which must be t^{*}.. in which. c_{1}^{2}+\ldots+c_{k}^{2}=0. ,. case. t^{*}=\displaystyle \frac{- $\alpha$ 1}{ $\sigma$} (Obviously, .. this. and $\alpha$_{i}+ $\sigma$ t^{*}>0, i=k+1, k+2 ,. case. ... .. ,. n. such that. is a. \displaystyle \hat{x}^{*}=(0, \ldots, 0, \frac{-c_{k+1} {$\alpha$_{k+1}-$\alpha$_{1} , \ldots, \frac{-c_{ $\eta$} {$\alpha$_{n}-$\alpha$_{1} )^{T}. one. (H-$\alpha$_{1}I)x^{*}=-c By summing. trivial solution to. .. all. (\hat{x}_{i}^{*})^{2}. in. (7) (7),. we. again obtain. secular function. \displaystyle\hat{h}(t)=\sum_{i=k+1}^{n}\frac{ _{i}^{2} {($\alpha$_{i}+$\sigma$t)^{2} -t\frac{2}{p-2},t\inI_{\hat{g} =[-\frac{$\alpha$_{1} {$\sigma$},+\infty) Notice that. then. \hat{h}(t). If does. is the. t^{*}=- $\alpha$\lrcorner $\sigma$. minimizer of. is also. strictly decreasing root of. unique. \hat{h}(t). on. on. I_{\hat{g}. and. I_{\overline{g} Thus, .. \displaystyle \lim_{t\rightar ow+\infty}\hat{h}(t)=-\infty. \hat{x}^{*} defined. by (7). (8). .. .. If. is the. \hat{h}. (-\displaystyle \frac{ $\alpha$ 1}{ $\sigma$})=0,. unique global. (p‐RS).. \hat{h}. (-\displaystyle \frac{ $\alpha$ 1}{ $\sigma$})<0 then (8) has no solution and the trivial solution \hat{x}^{*} to (H-$\alpha$_{1}I)x^{*}=-c not satisfy t^{*}=\displaystyle \frac{- $\alpha$ 1}{ $\sigma$}=\Vert\hat{x}^{*}\Vert^{p-2} Then, any x^{*} satisfying ,. .. (x_{1}^{*})^{2}+\displaystyle\ldots+(x_{k}^{*})^{2}+\sum_{i=k+1}^{n}\frac{ _{i}^{2} {($\alpha$_{i}-$\alpha$_{1})^{2} =(\frac{-$\alpha$_{1} {$\sigma$})^{\frac{2}{p-2} is. minimizer of. global. a. dimensional. (p‐RS). Namely,. sphere centered. at. the. global. (9). minimum solution set forms. a. k‐. (0, \displaystyle \cdots, 0, -\frac{\mathrm{C}k+1}{$\alpha$_{k+1- $\alpha$ 1} , \cdots, -\frac{\mathrm{c}_{n} {$\sigma$_{n}- $\sigma$ 1}) with the radius. \sqrt{(\frac{$\alpha$_{1}{$\sigma$})^{\frac{2}p-2} \sum_{i-k+1}^{n}\frac{ _i}^{2}{($\alpha$_{i}-$\alpha$_{1})^{2} . Otherwise, \hat{h} obtain. a. contradiction that. Finally, can. we. show that. be obtained. have. (-\displaystyle \frac{ $\alpha$ 1}{ $\sigma$})>0. ,. then. (p‐RS). (p‐RS). by solving. (8). an. has. has. no. no. possesses. solution and. global. some. equivalently. (9). cannot hold for any x^{*}. minimizer. The. hidden. proof. convexity that. reformulated. convex. its. is thus. .. We. complete.. global minimizer. programming. We first.
(7) 53. Proposition. 1. Suppose. H is. Let x^{*} be any. diagonal.. c_{t}x_{i}^{*}\leq 0, Proof. Comparing. x^{*} with. i=1 ,. .. .. .. ,. global. minimizer. of (p‐RS),. then. n.. \overline{x}=(-x_{1}^{*}, x_{2}^{*}, x_{3}^{*}, \ldots, x_{n}^{*}). ,. we. immediately have. 0\geq g(x^{*})-g(\overline{x})=c_{1}(x_{1}^{*}-\overline{x}_{1})=2c_{1}x_{1}^{*}. A similar argument. to all other. applying. components yields the result. This completes. proof.. the. By Proposition 1, (p‐RS). and. (10). below share the. same. optimal solution. \displaystyle\min\sum_{i=1}^{n}\{-$\alpha$_{2^{X_{i}^{2} $\Delta$+c_{$\eta$}x_{i}\+\frac{$\sigma$}{p}(\sum_{i=1}^{n}x_{i}^{2})^{2}$\epsilon$. \mathrm{s}.. Introducing. the. i=1 ,. c_{t}x_{i}\leq 0,. ... .. ,. (10). n.. the nonlinear one‐to‐one map:. x_{i}=\left\{ begin{ar y}{l \sqrt{z_i},\mathrm{i}\mathrm{f}c_{\dot{$\tau$}\leq0,\ i=1,. n,\ -\sqrt{z_i},\mathrm{i}\mathrm{f}c_{\dot{$\tau$}>0, \end{ar y}\right.. problem (10) becomes the following min \mathrm{s}.. The. \mathrm{t}.. global optimal. mation. set.. \mathrm{t}.. convex. (11). program:. -\displaystyle\sum_{i=1}^{n}|c_{$\eta$}|\sqrt{z_{i}+\frac{1}{2}\sum_{i=1}^{n}$\alpha$_{i}z_{i}+\frac{$\sigma$}{p}(\sum_{i=1}^{n}z_{i})^{2}$\epsilon$. z_{i}\geq 0,. solution of. (12). i=1 , can. ... .. ,. (12). n.. be converted to generate x^{*}. through. (11).. Remark 2 The. first. two derivatives. of the. secular. function h(t). h'(t)=\displaystyle\sum_{i=1}^{n}\frac{-2$\sigma$c_{i}^{2}{($\alpha$_{i}+$\sigma$t)^{3}-\frac{2}{p-2}t^{\frac{4-}{p-}f_{2} and. h'(t)=\displaystyle\sum_{i=1}^{n}\frac{6$\sigma$^{2}c_{i}^{2}{($\alpha$_{i}+$\sigma$t)^{4}-\frac{2(4-p)}{(p-2)^{2}t^{\frac{6-}{\mathrm{p}\text{∽}2-2.. are. the transfor‐.
(8) 54. h(t). strictly decreasing. is. that. ensures. choosing. the. c\neq 0). is. always. convex. convex. a. only for p\geq 4. finite. h(t). $\sigma$,. .. subinterval. can. be made. For. p=3 if H\not\geq 0 (which ,. of I_{g} covering t^{*} by properly ,. convex.. Notice that the secular. though.. Conclusions. solving. regularization subproblem. unconstrained optimization. regularization subproblem on. and. regularization parameter. Since the cubic for. I_{g}. and h is restricted to. function for (TRS). 3. on. the. between the two types of. problems,. is very close to. p‐‐regularized subproblems. for. can. (TRS). be used to it is in. modify Newtons method. generally. believed that the cubic. spirit. Our comprehensive analysis. general p>2 gives. the most detailed. subproblems and confirms that they do share. comparison. many similar. features.. References [1]. D. Bienstock and A.. Michalka, Polynomial Solvability of. Subproblem, ACM‐SIAM Symposium. [2]. [3]. C.. Cartis,. N. I. M.. optimization. Part. Mathematical. Programming, 127 (2011),. A. R.. Conn,. E.. Erkut,. search 46. N. I. M. on. Discrete. Algorithms (2014),. Gould, Ph.. I:. L.. motivation,. pp. 380‐390.. convergence and numerical. The discrete p ‐dispersion pp. 48−60. results,. pp. 245‐295.. Toint, Trust‐Region Methods, Number. optimization, Philadelphia, SIAM,. (1990),. of the Tmst‐Region. Gould, Ph. L. Toint, Adaptive cubic regularisation methods for. unconstrained. SIAM Series. [4]. on. Variants. 01 MPS‐. 2000.. problem, European Journal of Operational Re‐.
(9) 55. [5]. S. C. Fang, D. and Its. 0ptimization. (2014). Mech. Solids. [6]. M. R.. Gao, G.. GAREY,. X.. Lin, R. L. Sheu,. in The. W.. n ‐dimensional. Real. [8]. D. M.. D. S.. 2(1981),. pp. 186‐197.. N. I. M.. Gould, D.. P.. Robinson,. regularised subproblems. [9]. A.. Part. I,. to appear in Math.. JOHNSON, Computers and Intractability: A guide. Gay, Computing optimal locally. 2(2010),. Space‐. arXiv: 1410.5925. of NP‐completeness, Freeman, San Francisco,. [7]. Xing, Double Well Potential Function. in. to the. 1979.. constrained steps, SIAM J. Sci. Stat.. H. Sue. theory. Comput.,. Thorne, On solving trust‐region and other. optimization, Mathematical Programming Computation,. pp. 21‐57.. Griewank,. The. modification of Newton^{\mathrm{Z}}s method for unconstrained optimization. by bounding cubic terms, Technical Report \mathrm{D}\mathrm{A}\mathrm{M}\mathrm{T}\mathrm{P}/\mathrm{N}\mathrm{A}12 Department of Applied ,. Mathematics and Theoretical. [10]. P.. Hansen,. I.. TIMS/ORSA [11]. [12]. [13]. Y.. Hsia, R.. Sheu,. Trust. on. a. network, Presentation. 1981.. at. the. Meeting, Washington, D.C. (1988). Region Subproblem with. a. Fixed Number of Additional. Linear. Inequality Constraints has Polynomial Complexity, arXiv:1312.1398,. R. A.. Horn, C. R. Johnson, Matrix Analysis, Cambridge: Cambridge University. Press,. 1985.. J. M.. Martínez, Local. spheres, SIAM. [14]. Moon, Dispersing facilities. Joint National. L.. Physics, Cambridge University, Cambridge, UK,. J. J.. J.. minimizers. of quadratic function. optimization, 4(1994),. Moré, D. C. Sorensen, Computing. COMpUt., 4(1983),. pp. 553‐572.. on. 2013. Euclidean balls and. pp. 159‐176.. a. trust. region step, SIAM J. Sci. Statist..
(10) 56. [15]. Y.. Nesterov, B. T. Polyak, Cubic regularization of. performance,. [16]. D. C.. M.. Program., 108(2006),. Sorensen, Newtons method with. Numer.. [17]. Math.. Anal., 19(1982),. Weiser,. P.. Deuflhard,. Y.. global. pp. 177‐205.. model trust. region modification, SIAM J.. pp. 409‐426.. B.. nonlinear elastomechanics,. [18]. a. Newton method and its. Erdmann, Affine conjugate adaptive Newton methods for. Optim.. Methods. Softw., 22(2007),. pp. 413‐431.. Xia, S. C. Fang, R. L. Sheu, W. Xing, Double Well Potential Function and Its. optimization Solids. (2014). in The. n ‐dimensional. arXiv: 1404. 1963. Real. Space—. Part. II,. to appear in Math. Mech..
(11)
関連したドキュメント
We present a local convergence analysis for deformed Halley method in order to approximate a solution of a nonlinear equation in a Banach space setting.. Our methods include the
We make use of the Guo-Krasnoselskii fixed point theorem on cones to prove ex- istence of positive solutions to a non local p-Laplacian boundary value problem on time scales arising
We start with some introductory materials to the over-relaxed η- proximal point algorithm based on the notion of maximal η-monotonicity, and recall some investigations on
In Section 3, we establish local integral estimates for Hessian operators (Theorem 3.1), while in Section 4, we establish local L p estimates for k-convex functions and their
W loc 2,p regularity for the solutions of the approximate equation This section is devoted to prove the W 2,p local regularity of the solutions of equations (5) and, as a by-product,
RIMS Summer School (COSS 2018), Kyoto, July 2018.. Discrete Convex
Consider the minimization problem with a convex separable objective function over a feasible region defined by linear equality constraint(s)/linear inequality constraint of the
GENERAL p-CURL SYSTEMS AND DUALITY MAPPINGS ON SOBOLEV SPACES FOR MAXWELL EQUATIONS..