
Kolman & Beck, Ch.1 Eg.1
: Activity Analysis or Product Mix
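Reading the data off the call below, the problem appears to be the product-mix LP (my reconstruction; x_3, x_4 are slack variables):

\max\ 120x_1 + 100x_2 \quad \text{s.t.}\quad 2x_1 + 2x_2 \le 8,\quad 5x_1 + 3x_2 \le 15,\quad x_1, x_2 \ge 0

SeDuMi expects the equality standard form min c'*x s.t. A*x = b, x >= 0, hence the negated objective in c and the two appended slack columns in A.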

>> c = [-120; -100; 0; 0];
>> A = [ 2, 2, 1, 0; 5, 3, 0, 1];
>> b = [8, 15];
>> x= sedumi(A,b,c)
SeDuMi 1.1R3 by AdvOL, 2006 and Jos F. Sturm, 1998-2003.
Alg = 2: xz-corrector, Adaptive Step-Differentiation, theta = 0.250, beta = 0.500
eqs m = 2, order n = 5, dim = 5, blocks = 1
nnz(A) = 6 + 0, nnz(ADA) = 4, nnz(L) = 3
 it :     b*y       gap    delta  rate   t/tP*  t/tD*   feas cg cg  prec
  0 :            4.18E-002 0.000
  1 : -5.02E+002 1.25E-002 0.000 0.2989 0.9000 0.9000   1.77  1  1  3.8E+000
  2 : -4.29E+002 3.65E-003 0.000 0.2927 0.9000 0.9000   2.52  1  1  6.4E-001
  3 : -4.29E+002 2.84E-004 0.000 0.0778 0.9900 0.9900   1.19  1  1  4.5E-002
  4 : -4.30E+002 9.99E-007 0.000 0.0035 0.9990 0.9990   1.03  1  1 
iter seconds digits       c*x               b*y
  4      0.2   Inf -4.3000000000e+002 -4.3000000000e+002
|Ax-b| =  2.3e-015, [Ay-c]_+ =  0.0E+000, |x|= 2.9e+000, |y|= 3.6e+001

Detailed timing (sec)
   Pre          IPM          Post
3.125E-002    1.563E-001    0.000E+000   
Max-norms: ||b||=15, ||c|| = 120,
Cholesky |add|=0, |skip| = 0, ||L.L|| = 2.11427.
x =
   (1,1)       1.5000
   (2,1)       2.5000
>>
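Reading the result: x_1 = 1.5, x_2 = 2.5, i.e. the maximum profit is 120(1.5) + 100(2.5) = 430. SeDuMi reports c*x = b*y = -4.3e+002 because the objective was negated to fit the minimization form.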








Kolman & Beck, Ch.1 Eg.3
: The Transportation Problem
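Reconstructing from the A, b, c below (my reading): x_1..x_3 are shipments from source 1 to destinations 1-3 and x_4..x_6 from source 2; x_7, x_8 are supply slacks and x_9..x_11 demand surpluses. The underlying problem is

\min\ 5x_1 + 7x_2 + 9x_3 + 6x_4 + 7x_5 + 10x_6
\text{s.t.}\quad x_1+x_2+x_3 \le 120,\quad x_4+x_5+x_6 \le 140,\quad x_1+x_4 \ge 100,\quad x_2+x_5 \ge 60,\quad x_3+x_6 \ge 80,\quad x \ge 0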

>> c = [5; 7; 9; 6; 7; 10; 0; 0; 0; 0; 0];
>> A = [1,1,1,0,0,0,1,0,0,0,0; 0,0,0,1,1,1,0,1,0,0,0; 1,0,0,1,0,0,0,0,-1,0,0; 0,1,0,0,1,0,0,0,0,-1,0; 0,0,1,0,0,1,0,0,0,0,-1];
>> b = [120; 140; 100; 60; 80];
>> x = sedumi(A,b,c)
SeDuMi 1.1R3 by AdvOL, 2006 and Jos F. Sturm, 1998-2003.
Alg = 2: xz-corrector, Adaptive Step-Differentiation, theta = 0.250, beta = 0.500
eqs m = 5, order n = 12, dim = 12, blocks = 1
nnz(A) = 17 + 0, nnz(ADA) = 17, nnz(L) = 12
 it :     b*y       gap    delta  rate   t/tP*  t/tD*   feas cg cg  prec
  0 :            3.24E+003 0.000
  1 :  1.16E+003 9.69E+002 0.000 0.2991 0.9000 0.9000   2.37  1  1  1.6E+000
  2 :  1.63E+003 2.05E+002 0.000 0.2114 0.9000 0.9000   1.34  1  1  3.0E-001
  3 :  1.69E+003 3.84E+001 0.000 0.1875 0.9000 0.9000   1.20  1  1  5.1E-002
  4 :  1.70E+003 1.08E+000 0.000 0.0282 0.9900 0.9900   1.03  1  1 
iter seconds digits       c*x               b*y
  4      0.5  15.9  1.7000000000e+003  1.7000000000e+003
|Ax-b| =  5.6e-014, [Ay-c]_+ =  7.6E-016, |x|= 1.1e+002, |y|= 1.4e+001

Detailed timing (sec)
   Pre          IPM          Post
3.125E-001    4.531E-001    1.250E-001   
Max-norms: ||b||=140, ||c|| = 10,
Cholesky |add|=0, |skip| = 0, ||L.L|| = 1.
x =
   (1,1)      68.6083
   (3,1)      51.3917
   (4,1)      31.3917
   (5,1)      60.0000
   (6,1)      28.6083
   (8,1)      20.0000
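The reported optimum c*x = 1700 reads as: source 1 ships 68.61 and 51.39 units to destinations 1 and 3; source 2 ships 31.39, 60, and 28.61 units to destinations 1, 2, and 3; x(8) = 20 is the unused supply slack at source 2.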







posted by maetel
SeDuMi - Home
optimization package over symmetric cones

SeDuMi = Self-Dual-Minimization / optimization over self-dual homogeneous cones
: a MATLAB toolbox developed by Jos F. Sturm


1) Register as a member on the SeDuMi homepage to download the package
2) Download a suitable version from the Download menu
3) Start MATLAB, click File > Set Path > Add Folder in the menu, select the folder containing SeDuMi, and save
4) Type 'help sedumi' in the command window to confirm that the toolbox has been added
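As a quick smoke test (a minimal sketch; the tiny LP and the variable names are mine), solve min -x_1 subject to x_1 + x_2 = 1, x >= 0:

>> A = [1 1]; b = 1; c = [-1; 0];
>> x = sedumi(A, b, c)    % min c'*x s.t. A*x = b, x >= 0; expect x = [1; 0]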


>> help sedumi
                                           [x,y,info] = sedumi(A,b,c,K,pars)
   
SEDUMI  Self-Dual-Minimization/ Optimization over self-dual homogeneous
          cones.
 
  >  X = SEDUMI(A,b,c) yields an optimal solution to the linear program
       MINIMIZE c'*x SUCH THAT A*x = b, x >= 0
       x is a vector of decision variables.
       If size(A,2)==length(b), then it solves the linear program
       MINIMIZE c'*x SUCH THAT A'*x = b, x >= 0
 
  >  [X,Y,INFO] = SEDUMI(A,b,c) also yields a vector of dual multipliers Y,
       and a structure INFO, with the fields INFO.pinf, INFO.dinf and
       INFO.numerr.
 
     (1) INFO.pinf=INFO.dinf=0: x is an optimal solution (as above)
       and y certifies optimality, viz.\ b'*y = c'*x and c - A'*y >= 0.
       Stated otherwise, y is an optimal solution to
       MAXIMIZE b'*y SUCH THAT c-A'*y >= 0.
       If size(A,2)==length(b), then y solves the linear program
       MAXIMIZE b'*y SUCH THAT c-A*y >= 0.
 
     (2) INFO.pinf=1: there cannot be x>=0 with A*x=b, and this is certified
       by y, viz. b'*y > 0 and A'*y <= 0. Thus y is a Farkas solution.
 
     (3) INFO.dinf=1: there cannot be y such that c-A'*y >= 0, and this is
       certified by x, viz. c'*x <0, A*x = 0, x >= 0. Thus x is a Farkas
       solution.
 
     (I)   INFO.numerr = 0: desired accuracy achieved (see PARS.eps).
     (II)  INFO.numerr = 1: numerical problems warning. Results are accurate
           merely to the level of PARS.bigeps.
     (III) INFO.numerr = 2: complete failure due to numerical problems.
 
     INFO.feasratio is the final value of the feasibility indicator. This
     indicator converges to 1 for problems with a complementary solution, and
     to -1 for strongly infeasible problems. If feasratio is somewhere in
     between, the problem may be nasty (e.g. the optimum is not attained),
     if the problem is NOT purely linear (see below). Otherwise, the reason
     must lie in numerical problems: try to rescale the problem.
 
  >  [X,Y,INFO] = SEDUMI(A,b,0) or SEDUMI(A,b) solves the feasibility problem
     FIND x>=0 such that A*x = b
 
  >  [X,Y,INFO] = SEDUMI(A,0,c) or SEDUMI(A,c) solves the feasibility problem
     FIND y such that A'*y <= c
 
  >  [X,Y,INFO] = SEDUMI(A,b,c,K) instead of the constraint "x>=0", this
       restricts x to a self-dual homogeneous cone that you describe in the
       structure K. Up to 5 fields can be used, called K.f, K.l, K.q, K.r and
       K.s, for Free, Linear, Quadratic, Rotated quadratic and Semi-definite.
       In addition, there are fields K.xcomplex, K.scomplex and K.ycomplex
       for complex-variables.
 
     (1) K.f is the number of FREE, i.e. UNRESTRICTED primal components.
       The dual components are restricted to be zero. E.g. if
       K.f = 2 then x(1:2) is unrestricted, and z(1:2)=0.
       These are ALWAYS the first components in x.
 
     (2) K.l is the number of NONNEGATIVE components. E.g. if K.f=2, K.l=8
       then x(3:10) >=0.
 
     (3) K.q lists the dimensions of LORENTZ (quadratic, second-order cone)
       constraints. E.g. if K.l=10 and K.q = [3 7] then
           x(11) >= norm(x(12:13)),
           x(14) >= norm(x(15:20)).
       These components ALWAYS immediately follow the K.l nonnegative ones.
       If the entries in A and/or c are COMPLEX, then the x-components in
       "norm(x(#,#))" take complex-values, whenever that is beneficial.
        Use K.ycomplex to impose constraints on the imaginary part of A*x.
 
     (4) K.r lists the dimensions of Rotated LORENTZ
       constraints. E.g. if K.l=10, K.q = [3 7] and K.r = [4 6], then
           2*x(21)*x(22) >= norm(x(23:24))^2,
           2*x(25)*x(26) >= norm(x(27:30))^2.
       These components ALWAYS immediately follow the K.q ones.
       Just as for the K.q-variables, the variables in "norm(x(#,#))" are
       allowed to be complex, if you provide complex data. Use K.ycomplex
       to impose constraints on the imaginary part of A*x.
 
     (5) K.s lists the dimensions of POSITIVE SEMI-DEFINITE (PSD) constraints
       E.g. if K.l=10, K.q = [3 7] and K.s = [4 3], then
           mat( x(21:36),4 ) is PSD,
           mat( x(37:45),3 ) is PSD.
       These components are ALWAYS the last entries in x.
 
     (a) K.xcomplex lists the components in f,l,q,r blocks that are allowed
      to have nonzero imaginary part in the primal. For f,l blocks, these
     (b) K.scomplex lists the PSD blocks that are Hermitian rather than
       real symmetric.
     (c) Use K.ycomplex to impose constraints on the imaginary part of A*x.
 
     The dual multipliers y have analogous meaning as in the "x>=0" case,
     except that instead of "c-A'*y>=0" resp. "-A'*y>=0", one should read that
     c-A'*y resp. -A'*y are in the cone that is described by K.l, K.q and K.s.
     In the above example, if z = c-A'*y and mat(z(21:36),4) is not symmetric/
     Hermitian, then positive semi-definiteness reflects the symmetric/
     Hermitian parts, i.e. Z + Z' is PSD.
 
     If the model contains COMPLEX data, then you may provide a list
     K.ycomplex, with the following meaning:
       y(i) is complex if ismember(i,K.ycomplex)
       y(i) is real otherwise
     The equality constraints in the primal are then as follows:
           A(i,:)*x = b(i)      if imag(b(i)) ~= 0 or ismember(i,K.ycomplex)
           real(A(i,:)*x) = b(i)  otherwise.
     Thus, equality constraints on both the real and imaginary part
     of A(i,:)*x should be listed in the field K.ycomplex.
 
     You may use EIGK(x,K) and EIGK(c-A'*y,K) to check that x and c-A'*y
     are in the cone K.
 
  >  [X,Y,INFO] = SEDUMI(A,b,c,K,pars) allows you to override the default
       parameter settings, using fields in the structure `pars'.
 
     (1) pars.fid     By default, fid=1. If fid=0, then SeDuMi runs quietly,
       i.e. no screen output. In general, output is written to a device or
       file whose handle is fid. Use fopen to assign a fid to a file.
 
     (2) pars.alg     By default, alg=2. If alg=0, then a first-order wide
       region algorithm is used, not recommended. If alg=1, then SeDuMi uses
       the centering-predictor-corrector algorithm with v-linearization.
       If alg=2, then xz-linearization is used in the corrector, similar
       to Mehrotra's algorithm. The wide-region centering-predictor-corrector
       algorithm was proposed in Chapter 7 of
         J.F. Sturm, Primal-Dual Interior Point Approach to Semidefinite Pro-
         gramming, TIR 156, Thesis Publishers Amsterdam, 1997.
 
     (3) pars.theta, pars.beta   By default, theta=0.25 and beta=0.5. These
       are the wide region and neighborhood parameters. Valid choices are
       0 < theta <= 1 and 0 < beta < 1. Setting theta=1 restricts the iterates
       to follow the central path in an N_2(beta)-neighborhood.
 
     (4) pars.stepdif, pars.w. By default, stepdif = 2 and w=[1 1].
        This implements an adaptive heuristic to control step-differentiation.
        You can enable primal/dual step length differentiation by setting stepdif=1 or 0.
       If so, it weights the rel. primal, dual and gap residuals as
       w(1):w(2):1 in order to find the optimal step differentiation.
 
     (5) pars.eps     The desired accuracy. Setting pars.eps=0 lets SeDuMi run
       as long as it can make progress. By default eps=1e-8.
 
     (6) pars.bigeps  In case the desired accuracy pars.eps cannot be achieved,
      the solution is tagged as info.numerr=1 if it is accurate to pars.bigeps,
      otherwise it yields info.numerr=2.
 
     (7) pars.maxiter Maximum number of iterations, before termination.
 
     (8) pars.stopat  SeDuMi enters debugging mode at the iterations specified in this vector.
 
     (9) pars.cg      Various parameters for controlling the Preconditioned conjugate
      gradient method (CG), which is only used if results from Cholesky are inaccurate.
     (a) cg.maxiter   Maximum number of CG-iterates (per solve). Theoretically needed
           is |add|+2*|skip|, the number of added and skipped pivots in Cholesky.
           (Default 49.)
     (b) cg.restol    Terminates if residual is a "cg.restol" fraction of duality gap.
           Should be smaller than 1 in order to make progress (default 5E-3).
     (c) cg.refine    Number of refinement loops that are allowed. The maximum number
           of actual CG-steps will thus be 1+(1+cg.refine)*cg.maxiter. (default 1)
     (d) cg.stagtol  Terminates if relative function progress less than stagtol (5E-14).
     (e) cg.qprec    Stores cg-iterates in quadruple precision if qprec=1 (default 0).
 
     (10) pars.chol   Various parameters for controlling the Cholesky solve.
      Subfields of the structure pars.chol are:
     (a) chol.canceltol: Rel. tolerance for detecting cancelation during Cholesky (1E-12)
     (b) chol.maxu:   Adds to diagonal if max(abs(L(:,j))) > chol.maxu otherwise (5E5).
     (c) chol.abstol: Skips pivots falling below abstol (1e-20).
     (d) chol.maxuden: pivots in dense-column factorization so that these factors
       satisfy max(abs(Lk)) <= maxuden (default 5E2).
 
     (11) pars.vplot  If this field is 1, then SeDuMi produces a fancy
       v-plot, for research purposes. Default: vplot = 0.
 
     (12) pars.errors  If this field is 1 then SeDuMi outputs some error
     measures as defined in the Seventh DIMACS Challenge. For more details
     see the User Guide.
 
  Bug reports can be submitted at http://sedumi.mcmaster.ca.
 
  See also mat, vec, cellK, eyeK, eigK





posted by maetel

http://en.wikipedia.org/wiki/Linear_programming


1.1 The Linear Programming Problem



general linear programming problem
- objective function
- constraints

standard form
canonical form

Every linear programming problem that has unconstrained variables can be solved by solving a corresponding linear programming problem in which all the variables are constrained to be nonnegative.

Every linear programming problem can be formulated as a corresponding standard linear programming problem or as a corresponding canonical linear programming problem. (p. 53)


Minimization Problem as a Maximization Problem
: To minimize the objective function, we can maximize its negative instead and then change the sign of the answer.

Reversing an Inequality

Changing an Equality to Inequality

Unconstrained Variables
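Written out (standard identities, notation mine), the transformations named above are:

\min\ c^T x = -\max\ (-c^T x); \qquad a^T x \ge b \iff -a^T x \le -b;
a^T x = b \iff a^T x \le b \ \text{and}\ a^T x \ge b; \qquad x \ \text{free} \iff x = x^+ - x^-,\quad x^+ \ge 0,\ x^- \ge 0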

Scaling
to make all coefficients in a linear programming problem approximately the same size



1.2 Matrix Notation


feasible solution
: a vector satisfying the constraints of a linear programming problem

optimal solution
: a feasible solution maximizing or minimizing the objective function of a linear programming problem



1.4




ref.
http://en.wikipedia.org/wiki/Convex_polyhedron
http://mathworld.wolfram.com/ConvexPolyhedron.html
http://en.wikipedia.org/wiki/Extreme_point
http://en.wikipedia.org/wiki/Krein%E2%80%93Milman_theorem
http://library.wolfram.com/infocenter/MathSource/440/
http://www.ifor.math.ethz.ch/~fukuda/fukuda.html



slack & surplus variables in the simplex algorithm
http://www-fp.mcs.anl.gov/otc/Guide/CaseStudies/simplex/standard.html

posted by maetel
By Bernard Kolman, Robert Edward Beck
Contributor Robert Edward Beck

Edition: 2, illustrated
Published by Academic Press, 1995
ISBN 012417910X, 9780124179103
449 pages



Prologue
: Introduction to Operations Research




Definitions

Operations research (OR) is a scientific method for providing a quantitative basis for decision making (that can be used in almost any field of endeavor).

http://en.wikipedia.org/wiki/Operations_research

The techniques of OR give a logical and systematic way of formulating a problem so that the tools of mathematics can be applied to find a solution (when the established way of doing things can be improved).

A central problem in OR is the optimal allocation of scarce resources (including raw materials, labor, capital, energy, and processing time).

eg. T. C. Koopmans and L. V. Kantorovich, the 1975 Nobel Prize awardees



Development

WWII, Great Britain
the United States Air Force
1947, George B. Dantzig, the simplex algorithm (for solving linear programming problems)
programmable digital computer (for solving large-scale linear programming problems)
1952, the National Bureau of Standards SEAC machine



Phases

> Steps
1. Problem definition and formulation
to clarify objectives in undertaking the study
to identify the decision alternatives
to consider the limitations, restrictions, and requirements of the various alternatives
2. Model construction
to develop the appropriate mathematical description of the problem
: limitations, restrictions, and requirements -> constraints
: the goal of the study -> a quantity to be maximized or minimized
: decision alternatives -> variables
3. Solution of the model
: the solution to the model need not be the solution to the original problem
4. Sensitivity analysis
to determine the sensitivity of the solution to changes in the input data
: how the solution will vary with the variation in the input data
5. Model evaluation
to determine whether the answers produced by the model are realistic, acceptable and capable of implementation
6. Implementation of the study



The Structure of Mathematical Models

There are two types of mathematical models: deterministic and probabilistic.

A deterministic model will always yield the same set of output values for a given set of input values, whereas a probabilistic model will typically yield many different sets of output values according to some probability distribution.


Decision variables or unknowns
to describe an optimal allocation of the scarce resources represented by the model
Parameters
: inputs, not adjustable but exactly or approximately known
Constraints
: conditions that limit the values that the decision variables can assume
Objective function
to measure the effectiveness of the system as a function of the decision variables
: The decision variables are to be determined so that the objective function will be optimized.



Mathematical Techniques

mathematical programming

Mathematical models call for finding values of the decision variables that maximize or minimize the objective function subject to a system of inequality and equality constraints.

Linear programming

Integer programming

Stochastic programming

Nonlinear programming

network flow analysis

standard techniques vs. heuristic techniques (improvised for the particular problem and without firm mathematical basis)


Journals

Computer and Information Systems Abstracts Journal
Computer Journal
Decision Sciences
European Journal of Operational Research
IEEE Transactions on Automatic Control
Interfaces
International Abstracts in Operations Research
Journal of Computer and System Sciences
Journal of Research of the National Bureau of Standards
Journal of the ACM
Journal of the Canadian Operational Research Society
Management Science (published by The Institute for Management Science - TIMS)
Mathematical Programming
Mathematics of Operations Research
Naval Research Logistics (published by the Office of Naval Research - ONR)
Operational Research Quarterly
Operations Research (published by the Operations Research Society of America - ORSA)
Operations Research Letters
OR/MS Today
ORSA Journal on Computing
SIAM Journal on Computing
Transportation Science
posted by maetel
4.1 Optimization Problems



eg.4.1


ezplot('x*log(x)')



* Plotting graphs in MATLAB

The plot above came from ezplot('x*log(x)'), but the general approach is as follows:

x = 0:0.1:10;
y = x.*log2(x);   % the dot before * makes the multiplication element-wise
plot(x,y)
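Note: at x = 0, MATLAB evaluates 0.*log2(0) as 0*(-Inf) = NaN, so plot silently drops the first point and the curve effectively starts at x = 0.1.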






http://en.wikipedia.org/wiki/Slack_variable


4.2 Convex Optimization




4.3 Linear Optimization Problems

http://en.wikipedia.org/wiki/Linear_program

 

 

4.4 Quadratic Optimization Problems

http://en.wikipedia.org/wiki/Quadratic_program


Markowitz portfolio optimization
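The classical Markowitz problem, in its usual QP statement (notation mine): choose portfolio weights x to minimize return variance subject to a floor on expected return,

\min\ x^T \Sigma x \quad \text{s.t.}\quad \bar p^{\,T} x \ge r_{\min},\quad \mathbf 1^T x = 1,\quad x \ge 0

where \Sigma is the covariance matrix of the asset returns and \bar p their mean.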

 

4.4.2 Second-order cone programming

http://en.wikipedia.org/wiki/SOCP
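The standard SOCP form, for reference:

\min\ f^T x \quad \text{s.t.}\quad \|A_i x + b_i\|_2 \le c_i^T x + d_i,\quad i = 1,\dots,m

Setting all A_i = 0 recovers an LP, so SOCP strictly generalizes linear programming.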



http://en.wikipedia.org/wiki/Minimal_surface







posted by maetel
lecture 3 video
lecture 4 video
@ Stanford Engineering Everywhere
Linear Systems and Optimization | Convex Optimization I
Instructor: Boyd, Stephen 

EE364a: Convex Optimization I
Professor Stephen Boyd, Stanford University, Winter Quarter 2007–08


3.1 Basic properties and examples


3.1.1 Definition

http://en.wikipedia.org/wiki/Convex_function
a function is convex if and only if its epigraph (the set of points lying on or above the graph) is a convex set

Pictorially, a function is called 'convex' if, for any two points in the interval, the function lies below the straight line segment connecting those two points.
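In symbols (the standard definition): f is convex iff dom f is a convex set and

f(\theta x + (1-\theta) y) \le \theta f(x) + (1-\theta) f(y) \quad \text{for all } x, y \in \operatorname{dom} f,\ 0 \le \theta \le 1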


3.1.2 Extended-value extensions

http://en.wikipedia.org/wiki/Indicator_function



3.1.3 First-order conditions

http://en.wikipedia.org/wiki/Taylor_approximation



3.1.4 Second-order conditions

3.1.5 Examples

3.1.6 Sublevel sets

3.1.7 Epigraph

3.1.8 Jensen's inequality and extensions

3.1.9 Inequalities

3.2 Operations that preserve convexity


3.2.1 Nonnegative weighted sums

3.2.2 Composition with an affine mapping

3.2.3 Pointwise maximum and supremum

3.2.4 Composition

3.2.5 Minimization

3.2.6 Perspective of a function


3.3 The conjugate function

3.3.1 Definition and examples

http://en.wikipedia.org/wiki/Convex_conjugate
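For reference, the standard definition:

f^*(y) = \sup_{x \in \operatorname{dom} f} \big( y^T x - f(x) \big)

As a pointwise supremum of affine functions of y, f^* is always convex, whether or not f is.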



3.3.2 Basic properties



3.4 Quasiconvex functions

3.4.1 Definition and examples

http://en.wikipedia.org/wiki/Quasiconvex_function
A quasiconvex function is a real-valued function defined on an interval or on a convex subset of a real vector space such that the inverse image of any set of the form (−∞, a) is a convex set.


3.4.2 Basic properties

3.4.3 Differentiable quasiconvex functions

3.4.4 Operations that preserve quasiconvexity

3.4.5 Representation via family of convex functions



3.5 Log-concave and log-convex functions

3.5.1 Definition

http://en.wikipedia.org/wiki/Logarithmically_concave_function

http://en.wikipedia.org/wiki/Logarithmically_convex_function

an indicator function or a characteristic function is a function defined on a set X that indicates membership of an element in a subset A of X.

http://en.wikipedia.org/wiki/Geometric_mean


http://en.wikipedia.org/wiki/Bohr%E2%80%93Mollerup_theorem

ref.
http://mathworld.wolfram.com/GammaFunction.html


Convexity in the theory of the gamma function
International Journal of Applied Mathematics & Statistics ,  Nov, 2007   by Milan Merkle

Thu Ngọc Dương, The Gamma Function


Mohamed A. Khamsi, The Gamma Function
 



http://en.wikipedia.org/wiki/Determinant
http://mathworld.wolfram.com/Determinant.html


http://en.wikipedia.org/wiki/Lebesgue_measure


http://en.wikipedia.org/wiki/Laplace_transform
http://mathworld.wolfram.com/LaplaceTransform.html


ref.
Ernesto Schirmacher <Log-Concavity and the Exponential Formula>

András Prékopa, Logarithmic Concave Measures with Applications to Stochastic Programming, 1971
András Prékopa, On logarithmic concave measures and functions, 1973
András Prékopa, Logarithmic Concave Measures and Related Topics, 1980

Ole E. Barndorff-Nielsen,

Bruce E. Sagan, Log Concave Sequences of Symmetric Functions and Analogs of the Jacobi-Trudi Determinants. 1992
Thomas M. Cover and A. Thomas, Determinant Inequalities via Information Theory




3.5.2 Properties

http://en.wikipedia.org/wiki/Cumulative_distribution_function

http://en.wikipedia.org/wiki/Polyhedron



3.6 Convexity with respect to generalized inequalities

3.6.1 Monotonicity with respect to a generalized inequality

3.6.2 Convexity with respect to a generalized inequality


posted by maetel
2.1 Affine and convex sets


2.2 Some important examples


2.3 Operations that preserve convexity



2.4 Generalized inequalities



2.5 Separating and supporting hyperplanes


2.6 Dual cones and generalized inequalities

posted by maetel

http://www.stanford.edu/class/ee364/lectures.html

 

1.1 Mathematical optimization


mathematical optimization problem

A vector of optimization variables is optimal (a solution of the problem)
if it has the smallest objective value among all vectors that satisfy the constraints.

 

objective and constraint functions: linear  -->  optimization problem: linear program
: convex --> convex optimization problem

 

Convexity is more general than linearity.

Convex optimization can be seen as a generalization of linear programming.
 

1.1.1 Applications

optimization variable  <=> choice
objective function 
constraint functions  <=>  requirements / specifications
objective value  <=>  cost
limit / bound
solution  <=>  choice with minimum cost (/ maximum utility)

http://en.wikipedia.org/wiki/Supply_chain_management

data fitting
decision making
system design/analysis/operation

automatic control systems + embedded optimization


1.1.2 Solving optimization problems

A problem is sparse if each constraint function depends on only a small number of the variables.

> effective algorithms 
least-squares problems
linear programs
convex optimization


1.2 Least-squares and linear programming

1.2.1 Least-squares problems

no constraints

http://en.wikipedia.org/wiki/Least_squares
The best fit in the least-squares sense is that instance of the model for which the sum of squared residuals has its least value, a residual being the difference between an observed value and the value given by the model. The method was first described by Carl Friedrich Gauss around 1794.
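In matrix form (standard result, stated here for completeness): the least-squares problem minimizes \|Ax - b\|_2^2, and when A has full column rank the unique solution comes from the normal equations:

A^T A\, x = A^T b \quad\Longrightarrow\quad x = (A^T A)^{-1} A^T b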

http://en.wikipedia.org/wiki/Moore%27s_law

http://en.wikipedia.org/wiki/Regression_analysis
techniques for the modeling and analysis of numerical data consisting of values of a dependent variable (also called response variable or measurement) and of one or more independent variables (also known as explanatory variables or predictors)
The dependent variable in the regression equation is modeled as a function of the independent variables, corresponding parameters ("constants"), and an error term. The error term is treated as a random variable. It represents unexplained variation in the dependent variable. The parameters are estimated so as to give a "best fit" of the data. Most commonly the best fit is evaluated by using the least squares method, but other criteria have also been used.

http://en.wikipedia.org/wiki/Maximum_likelihood_estimation
The modeling of real world data using estimation by maximum likelihood offers a way of tuning the free parameters of the model to provide a good fit. For a fixed set of data and underlying probability model, maximum likelihood picks the values of the model parameters that make the data "more likely" than any other values of the parameters would make them.


standard least-squares
weighted least-squares : linear measurements corrupted by errors with unequal variances

regularization
Regularization involves introducing additional information in order to solve an ill-posed problem or prevent overfitting. This information is usually of the form of a penalty for complexity, such as restrictions for smoothness or bounds on the vector space norm. A theoretical justification for regularization is that it attempts to impose Occam's razor on the solution. From a Bayesian point of view, many regularization techniques correspond to imposing certain prior distributions on model parameters


1.2.2 Linear programming

http://en.wikipedia.org/wiki/Linear_programming

Dantzig's simplex method


Simplex Method: a good tutorial for the simplex method with examples (also two-phase and M-method), Department of Mathematics, CUHK

Interior-point method (barrier methods)
http://en.wikipedia.org/wiki/Interior_point_method


http://en.wikipedia.org/wiki/Chebyshev_approximation#Chebyshev_approximation
expanding the given function in terms of Chebyshev polynomials and then cutting off the expansion at the desired degree

The Chebyshev approximation problem can be reduced to a linear program.
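Sketch of the reduction (the standard epigraph trick): minimizing the maximum absolute residual is an LP in (x, t),

\min_x\ \max_i |a_i^T x - b_i| \quad\Longleftrightarrow\quad \min_{x,\,t}\ t \quad \text{s.t.}\quad -t \le a_i^T x - b_i \le t,\ i = 1,\dots,k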



1.3 Convex optimization

The least-squares problem and linear programming problem are both special cases of the general convex optimization problem.

http://en.wikipedia.org/wiki/Second-order_cone_programming
equivalent to a convex Quadratically constrained quadratic program

http://en.wikipedia.org/wiki/Geometric_programming
Geometric programs are not (in general) convex optimization problems, but they can be transformed to convex problems by a change of variables and a transformation of the objective and constraint functions.


many tricks for transforming convex optimization problems

"recognizing and formulating convex optimization problems"



1.4 Nonlinear optimization

http://en.wikipedia.org/wiki/Nonlinear_programming

1.4.1 Local optimization

to require
- differentiability of the objective and constraint functions
- initial guess (or starting point) for the optimization variable

1.4.2 Global optimization

worst-case analysis or verification of high-value or safety-critical systems

objective function --> utility function
constraints --> prior knowledge about the possible parameter values
If the worst-case value is acceptable, we can certify the system as safe or reliable.
This is worthwhile in cases where the value of certifying the performance is high, or where the cost of being wrong about the reliability or safety is high.

1.4.3 Role of convex optimization in nonconvex problems

Initialization for local optimization

Convex heuristics for nonconvex optimization
- finding a sparse vector
- randomized algorithms

Bounds for global optimization
- relaxation
- Lagrangian relaxation --> Lagrangian dual problem

1.5 Outline

1.6 Notation


posted by maetel
C.1 Matrix structure and algorithm complexity


C.2 Solving linear equations with factored matrices


C.3 LU, Cholesky, and LDL^T factorization


C.4 Block elimination and Schur complements


C.5 Solving underdetermined linear equations

http://en.wikipedia.org/wiki/QR_decomposition

http://mathworld.wolfram.com/QRDecomposition.html


http://en.wikipedia.org/wiki/LU_decomposition

http://mathworld.wolfram.com/LUDecomposition.html



posted by maetel
B.1 Single constraint quadratic optimization


posted by maetel
Analysis (해석학)
: the branch of mathematics that, alongside algebra and geometry, studies the continuity properties of functions on the basis of the concepts of differentiation and integration. It includes calculus, the theory of differential equations, the theory of integral equations, set theory, real function theory, and complex function theory.

http://en.wikipedia.org/wiki/Mathematical_analysis

Algebra (대수학)
: the study of the relations, properties, and rules of computation of numbers, carried out with general symbols that stand for numbers instead of the individual numbers themselves. Today it also covers algebraic systems, i.e., sets on which operations such as addition and multiplication are defined.

http://en.wikipedia.org/wiki/Algebra
the study of structure, relation, and quantity
Linear algebra, in which the specific properties of vector spaces are studied (including matrices);

Linear algebra (선형대수학)
: the study of vectors, matrices, determinants, vector spaces, and linear maps; a branch of algebra.


A.1 Norms
A.1.1 Inner product, Euclidean norm, and angle

In mathematics, the dot product, also known as the scalar product, is an operation which takes two vectors over the real numbers R and returns a real-valued scalar quantity. It is the standard inner product of the orthonormal Euclidean space.
( Inner product spaces generalize Euclidean spaces (in which the inner product is the dot product, also known as the scalar product) to vector spaces of any (possibly infinite) dimension, and are studied in functional analysis. )

http://en.wikipedia.org/wiki/Norm_(mathematics)#Euclidean_norm
a norm is a function that assigns a strictly positive length or size to all vectors in a vector space, other than the zero vector.
The Euclidean norm is often known as the magnitude.

http://en.wikipedia.org/wiki/Lp_space
spaces of p-power integrable functions, and corresponding sequence spaces. They are sometimes called Lebesgue spaces

http://en.wikipedia.org/wiki/Cauchy-Schwarz_inequality

http://en.wikipedia.org/wiki/Frobenius_norm#Frobenius_norm


A.1.2 Norms, distance, and unit ball

http://en.wikipedia.org/wiki/Homogeneous_function
a homogeneous function is a function with multiplicative scaling behaviour: if the argument is multiplied by a factor, then the result is multiplied by some power of this factor.
(In mathematics, science (including computer science), linguistics and engineering, an argument is, generally speaking, an independent variable or input to a function.)

http://en.wikipedia.org/wiki/Triangle_inequality
http://upload.wikimedia.org/wikipedia/en/thumb/c/ca/Triangle_inequality.svg/290px-Triangle_inequality.svg.png
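Collecting the pieces referenced above, f : R^n -> R is a norm iff (standard axioms):

f(x) \ge 0,\ \text{with}\ f(x) = 0 \iff x = 0; \qquad f(tx) = |t|\, f(x); \qquad f(x + y) \le f(x) + f(y)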

http://en.wikipedia.org/wiki/Unit_sphere
a closed unit ball is the set of points of distance less than or equal to 1 from a fixed central point
http://upload.wikimedia.org/wikipedia/commons/thumb/4/4d/Vector_norms.svg/140px-Vector_norms.svg.png

http://en.wikipedia.org/wiki/Closure_(topology)
Let S be a subset of a topological space X. Then x is a point of closure of S if every neighbourhood of x contains a point of S.

A.2 Analysis

A.3 Functions


A.4 Derivatives
A.4.1 Derivative and gradient

http://en.wikipedia.org/wiki/Derivative

int dom f = the interior of the domain of f

http://en.wikipedia.org/wiki/Domain_(mathematics)

In general, the interior of something refers to the space or part inside of it, excluding any kind of wall or boundary around its outside.
In mathematics, the interior of a set S consists of all points of S that are intuitively "not on the edge of S". A point that is in the interior of S is an interior point of S.
http://planetmath.org/?op=getobj&from=objects&id=3123

http://en.wikipedia.org/wiki/Jacobian
Jacobian Matrix = the matrix of all first-order partial derivatives of a vector-valued function. That is, the Jacobian of a function describes the orientation of a tangent plane to the function at a given point.

http://en.wikipedia.org/wiki/Exterior_derivative

http://en.wikipedia.org/wiki/Open_set
a set U is called open if, intuitively speaking, starting from any point x in U one can move by a small amount in any direction and still be in the set U. In other words, the distance between any point x in U and the edge of U is always greater than zero. (The openness property of a set is not topologically intrinsic. Taking a half-interval as a stand-alone space then it is open, since by definition the whole space is both open and closed. Virtually, the term open is applied to subsets.)

http://en.wikipedia.org/wiki/Gradient
the gradient of a scalar field is a vector field which points in the direction of the greatest rate of increase of the scalar field, and whose magnitude is the greatest rate of change.
A generalization of the gradient for functions on a Euclidean space which have values in another Euclidean space is the Jacobian. A further generalization for a function from one Banach space to another is the Fréchet derivative.
 
http://en.wikipedia.org/wiki/Symmetric_matrix
(Using the Jordan normal form, one can prove that every square real matrix can be written as a product of two real symmetric matrices, and every square complex matrix can be written as a product of two complex symmetric matrices. (Bosch, 1986))

http://en.wikipedia.org/wiki/Basis_(linear_algebra)
a basis is a set of vectors that, in a linear combination, can represent every vector in a given vector space or free module, and such that no element of the set can be represented as a linear combination of the others. In other words, a basis is a linearly independent spanning set.

http://en.wikipedia.org/wiki/Basis_(topology)
a base (or basis) B for a topological space X with topology T is a collection of open sets in T such that every open set in T can be written as a union of elements of B. We say that the base generates the topology T.

http://en.wikipedia.org/wiki/Affine_transformation
an affine transformation or affine map or an affinity (from the Latin, affinis, "connected with") between two vector spaces (strictly speaking, two affine spaces) consists of a linear transformation followed by a translation:
x \mapsto A x+ b.

In the finite-dimensional case each affine transformation is given by a matrix A and a vector b, satisfying certain properties described below.


http://en.wikipedia.org/wiki/Transformation_(geometry)
a transformation could be any function from a set X to itself. However, often the set X has some additional algebraic or geometric structure and the term "transformation" refers to a function from X to itself which preserves this structure.

http://en.wikipedia.org/wiki/Trace_(linear_algebra)
the trace of a matrix is the sum of its eigenvalues, making it an invariant with respect to a change of basis.
The trace of the identity matrix is the dimension of the space.

http://en.wikipedia.org/wiki/Jordan_normal_form

http://en.wikipedia.org/wiki/Eigenvalue,_eigenvector_and_eigenspace


A.4.2 Chain rule

http://en.wikipedia.org/wiki/Chain_rule
\frac {dy}{dx} = \frac {dy} {du} \cdot\frac {du}{dx}.

http://en.wikipedia.org/wiki/Directional_derivative
\nabla_{\vec{v}}{f}(\vec{x}) = \lim_{h \rightarrow 0}{\frac{f(\vec{x} + h\vec{v}) - f(\vec{x})}{h}}.
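For differentiable f, this limit is simply an inner product with the gradient (standard identity):

\nabla_{\vec{v}} f(\vec{x}) = \nabla f(\vec{x})^T \vec{v}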



A.4.3 Second derivative

posted by maetel
Contributor: Lieven Vandenberghe
Publisher: Cambridge University Press, 2004
ISBN 0521833787, 9780521833783
posted by maetel