Advanced Lecture on Neural Information Processing Systems (Lecture 02)

Academic year: 2021

(1)

Advanced Lecture on Neural Information Processing Systems (Lecture 02)

Ichiro Takeuchi

Nagoya Institute of Technology

(2)

Computer programs learned by themselves

Learn a computer program

double func(double x1, double x2) {
    double y = ???;
    return y;
}

that satisfies the following input-output relations:

$x_1 = 2,\; x_2 = 4 \qquad y = 5$

$x_1 = 1,\; x_2 = 8 \qquad y = 2$

$x_1 = 6,\; x_2 = 9 \qquad y = 7$

$x_1 = 3,\; x_2 = 3 \qquad y = 4$

(3)

Input-output relations

(4)

Linear models

Consider linear input-output relations:

double func(double x1, double x2) {
    double y = w1*x1 + w2*x2;
    return y;
}

Linear model for d-dimensional input $x \in \mathbb{R}^d$:

$$
y = f(x_1, x_2, \ldots, x_d) = w_1 x_1 + w_2 x_2 + \cdots + w_d x_d
$$
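As a rough illustration, the d-dimensional linear model can be written as a loop over the input components. The function name `linear_model` and the array-based signature are assumptions made for this sketch; they do not appear in the lecture.

```c
#include <stddef.h>

/* Evaluate the linear model y = w[0]*x[0] + ... + w[d-1]*x[d-1].
   The name linear_model and its signature are illustrative only. */
double linear_model(const double *w, const double *x, size_t d) {
    double y = 0.0;
    for (size_t j = 0; j < d; ++j) {
        y += w[j] * x[j];   /* accumulate w_j * x_j */
    }
    return y;
}
```

With d = 2 this reduces to the two-input `func` shown on this slide, with `w1` and `w2` passed in as an array rather than taken from the enclosing scope.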

(5)

Simple linear regression: 1D input case

f(x) = wx

(6)

Examples

Find a function

y = f(x) = wx

that satisfies the following input-output relations:

Example 1

x = 2 y = 1

x = 6 y = 3

x = 8 y = 4

x = 4 y = 2

Example 2

x = 2 y = 1

x = 6 y = 2

x = 8 y = 4

x = 4 y = 3

(7)

Plots

(8)

Minimizing errors

Training data

input $x$    output $y$
$x_1$        $y_1$
$x_2$        $y_2$
$\vdots$     $\vdots$
$x_n$        $y_n$

Minimizing the sum of squared errors:

$$
w^* = \arg\min_{w \in \mathbb{R}} \sum_{i=1}^{n} (y_i - w x_i)^2
$$
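For the one-dimensional model without a constant term, setting the derivative of the sum of squared errors with respect to w to zero gives the closed form w = (sum of x_i y_i) / (sum of x_i^2). The sketch below computes it; the function name `fit_slope` and the convention of returning 0 when all inputs are zero are assumptions made for illustration.

```c
#include <stddef.h>

/* Least-squares slope for the model f(x) = w*x (no constant term).
   Setting dE/dw = -2 * sum_i x_i * (y_i - w * x_i) to zero gives
   w = sum_i x_i*y_i / sum_i x_i^2.
   If every x_i is zero, w is not determined; this sketch then returns 0.0
   (an arbitrary convention, not part of the lecture). */
double fit_slope(const double *x, const double *y, size_t n) {
    double sxy = 0.0;   /* sum of x_i * y_i */
    double sxx = 0.0;   /* sum of x_i^2     */
    for (size_t i = 0; i < n; ++i) {
        sxy += x[i] * y[i];
        sxx += x[i] * x[i];
    }
    return (sxx > 0.0) ? sxy / sxx : 0.0;
}
```

For the data of Example 1 (x = 2, 6, 8, 4 and y = 1, 3, 4, 2) the sums are 60 and 120, so the function returns 0.5, the slope of the line through all four points.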

(9)

Exercise

Compute the optimal $w \in \mathbb{R}$ that minimizes the sum of squared errors:

$$
E := \sum_{i=1}^{n} (y_i - w x_i)^2
$$

where the training set is given as

x = 2 y = 1

x = 6 y = 2

x = 8 y = 4

x = 4 y = 3

(10)

Multiple linear regression

$$
f(x_1, x_2, \ldots, x_d) = w_1 x_1 + w_2 x_2 + \cdots + w_d x_d
$$

(11)

Caution: notation for the training data

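The training data is represented by $X \in \mathbb{R}^{n \times d}$ and $y \in \mathbb{R}^{n}$:

$$
\underset{n \times d}{X} :=
\begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1d} \\
x_{21} & x_{22} & \cdots & x_{2d} \\
\vdots & \vdots & \ddots & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{nd}
\end{bmatrix}
=
\begin{bmatrix}
x_1^\top \\ x_2^\top \\ \vdots \\ x_n^\top
\end{bmatrix},
\qquad
\underset{n \times 1}{y} :=
\begin{bmatrix}
y_1 \\ y_2 \\ \vdots \\ y_n
\end{bmatrix}
$$

$x_{ij} \in \mathbb{R}$: the $j$-th input variable of the $i$-th training instance

$x_i \in \mathbb{R}^d$: the input vector of the $i$-th training instance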

(12)

Least-squares linear regression

Training data:

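$$
\underset{n \times d}{X} :=
\begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1d} \\
x_{21} & x_{22} & \cdots & x_{2d} \\
\vdots & \vdots & \ddots & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{nd}
\end{bmatrix},
\qquad
\underset{n \times 1}{y} :=
\begin{bmatrix}
y_1 \\ y_2 \\ \vdots \\ y_n
\end{bmatrix}
$$

Linear model estimation by LS method:

$$
w^* = \arg\min_{w \in \mathbb{R}^d} \sum_{i=1}^{n} \bigl(y_i - (w_1 x_{i1} + \cdots + w_d x_{id})\bigr)^2
$$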

(13)

LS linear regression in matrix vector form

The sum of squared errors is written as

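$$
\begin{aligned}
E &:= \sum_{i=1}^{n} \bigl(y_i - (w_1 x_{i1} + \cdots + w_d x_{id})\bigr)^2 \\
  &= \sum_{i=1}^{n} (y_i - x_i^\top w)^2 \\
  &= (y - Xw)^\top (y - Xw) \\
  &= w^\top (X^\top X) w - 2 (X^\top y)^\top w + y^\top y
\end{aligned}
$$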
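The equality between the summation form and the expanded quadratic form can be checked numerically. The sketch below evaluates E both ways for a row-major n x d matrix X; the function names `sse_direct` and `sse_expanded` and the row-major storage layout are assumptions made for this illustration.

```c
#include <stddef.h>

/* Direct form: E = sum_i (y_i - x_i^T w)^2.
   X is stored row-major, so x_ij is X[i*d + j]. */
double sse_direct(const double *X, const double *y, const double *w,
                  size_t n, size_t d) {
    double e = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double pred = 0.0;
        for (size_t j = 0; j < d; ++j) pred += X[i * d + j] * w[j];
        double r = y[i] - pred;          /* residual of instance i */
        e += r * r;
    }
    return e;
}

/* Expanded form: E = w^T (X^T X) w - 2 (X^T y)^T w + y^T y. */
double sse_expanded(const double *X, const double *y, const double *w,
                    size_t n, size_t d) {
    double quad = 0.0, lin = 0.0, yy = 0.0;
    for (size_t i = 0; i < n; ++i) yy += y[i] * y[i];       /* y^T y */
    for (size_t j = 0; j < d; ++j) {
        double xty_j = 0.0;                                  /* (X^T y)_j */
        for (size_t i = 0; i < n; ++i) xty_j += X[i * d + j] * y[i];
        lin += xty_j * w[j];
        for (size_t k = 0; k < d; ++k) {
            double xtx_jk = 0.0;                             /* (X^T X)_jk */
            for (size_t i = 0; i < n; ++i)
                xtx_jk += X[i * d + j] * X[i * d + k];
            quad += w[j] * xtx_jk * w[k];
        }
    }
    return quad - 2.0 * lin + yy;
}
```

Both functions should agree up to floating-point rounding for any w. The expanded form is mainly useful for the derivation; the direct form is the more natural way to evaluate E in practice.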

(14)

Optimality conditions

(15)

Minimizing the sum of squared errors

The sum of squared errors:

$$
E = w^\top (X^\top X) w - 2 (X^\top y)^\top w + y^\top y
$$

The optimality condition:

$$
\frac{\partial E}{\partial w} = 2 (X^\top X) w - 2 (X^\top y) = 0
$$

Normal equations:

$$
(X^\top X) w = X^\top y
$$
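One way to turn the normal equations into a program is to form X^T X and X^T y explicitly and solve the resulting d x d linear system. The sketch below uses Gaussian elimination with partial pivoting, which is a common choice but not one the slides prescribe. The function name `solve_normal_equations`, the bound `MAX_D`, the singularity threshold, and the row-major layout of X are all assumptions made for illustration.

```c
#include <stddef.h>

#define MAX_D 8   /* illustrative upper bound on the input dimension d */

static double abs_d(double v) { return v < 0.0 ? -v : v; }

/* Solve (X^T X) w = X^T y for w. X is n x d, row-major.
   Returns 0 on success, -1 if d is out of range or X^T X is
   numerically singular (pivot below an arbitrary 1e-12 threshold). */
int solve_normal_equations(const double *X, const double *y,
                           size_t n, size_t d, double *w) {
    double A[MAX_D][MAX_D + 1];   /* augmented matrix [X^T X | X^T y] */
    if (d == 0 || d > MAX_D) return -1;

    for (size_t j = 0; j < d; ++j) {
        for (size_t k = 0; k < d; ++k) {
            A[j][k] = 0.0;
            for (size_t i = 0; i < n; ++i)
                A[j][k] += X[i * d + j] * X[i * d + k];
        }
        A[j][d] = 0.0;
        for (size_t i = 0; i < n; ++i) A[j][d] += X[i * d + j] * y[i];
    }

    /* Forward elimination with partial pivoting. */
    for (size_t c = 0; c < d; ++c) {
        size_t p = c;
        for (size_t r = c + 1; r < d; ++r)
            if (abs_d(A[r][c]) > abs_d(A[p][c])) p = r;
        if (abs_d(A[p][c]) < 1e-12) return -1;
        for (size_t k = c; k <= d; ++k) {        /* swap rows c and p */
            double t = A[c][k]; A[c][k] = A[p][k]; A[p][k] = t;
        }
        for (size_t r = c + 1; r < d; ++r) {
            double f = A[r][c] / A[c][c];
            for (size_t k = c; k <= d; ++k) A[r][k] -= f * A[c][k];
        }
    }

    /* Back substitution. */
    for (size_t c = d; c-- > 0; ) {
        double s = A[c][d];
        for (size_t k = c + 1; k < d; ++k) s -= A[c][k] * w[k];
        w[c] = s / A[c][c];
    }
    return 0;
}
```

Applied to the four input-output pairs from the opening slide (inputs (2, 4), (1, 8), (6, 9), (3, 3) with outputs 5, 2, 7, 4), X^T X is [[50, 79], [79, 170]] and X^T y is (66, 111), which gives approximately w1 = 1.085 and w2 = 0.149. Forming X^T X explicitly is simple but can be numerically fragile when the columns of X are nearly dependent.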

(16)

Example

(17)

Example

(18)

Final exercise

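Consider a linear regression problem with a constant term:

Training data: $\{(x_i, y_i) \in \mathbb{R} \times \mathbb{R}\}_{i=1}^{n}$

Model: $f(x_i) = w_0 + w_1 x_i, \quad i = 1, \ldots, n$

Problem: $\min_{w_0, w_1 \in \mathbb{R}} \sum_{i=1}^{n} \bigl(y_i - (w_0 + w_1 x_i)\bigr)^2$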

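Show that the solution of the problem is given by the following system of linear equations:

$$
\begin{aligned}
n\, w_0 + \Bigl(\sum_{i=1}^{n} x_i\Bigr) w_1 &= \sum_{i=1}^{n} y_i \\
\Bigl(\sum_{i=1}^{n} x_i\Bigr) w_0 + \Bigl(\sum_{i=1}^{n} x_i^2\Bigr) w_1 &= \sum_{i=1}^{n} x_i y_i
\end{aligned}
$$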
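Once the 2 x 2 system is established, it can be solved directly. The following sketch accumulates the four sums and applies Cramer's rule; it only uses the system and does not replace the derivation the exercise asks for. The function name `fit_line` and the error-code convention are assumptions made for illustration.

```c
#include <stddef.h>

/* Fit f(x) = w0 + w1*x by solving
       n*w0       + (sum x)*w1   = sum y
       (sum x)*w0 + (sum x^2)*w1 = sum x*y
   with Cramer's rule.
   Returns 0 on success, -1 if the determinant is zero
   (for example when n < 2 or all x_i are equal). */
int fit_line(const double *x, const double *y, size_t n,
             double *w0, double *w1) {
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
    for (size_t i = 0; i < n; ++i) {
        sx  += x[i];
        sy  += y[i];
        sxx += x[i] * x[i];
        sxy += x[i] * y[i];
    }
    double det = (double)n * sxx - sx * sx;   /* determinant of the 2x2 matrix */
    if (det == 0.0) return -1;
    *w0 = (sy * sxx - sx * sxy) / det;
    *w1 = ((double)n * sxy - sx * sy) / det;
    return 0;
}
```

For points that lie exactly on a line, such as x = 0, 1, 2, 3 with y = 1, 3, 5, 7, the function recovers w0 = 1 and w1 = 2.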
