lab03_Linear_Regression_and_How_to_minimize_cost

Original hypothesis: $H(x) = Wx + b$

Simplified hypothesis: $H(x) = Wx$


▷ What does cost(W) look like?

| x | y |
|---|---|
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |

(figure: the cost(W) curve for this data)
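
For example, with this data and the $\frac 1m$ form of the cost:

$cost(0) = \frac 13\left((0\cdot1-1)^2+(0\cdot2-2)^2+(0\cdot3-3)^2\right) = \frac{14}{3} \approx 4.67$

$cost(1) = \frac 13\left((1\cdot1-1)^2+(1\cdot2-2)^2+(1\cdot3-3)^2\right) = 0$

$cost(2) = \frac 13\left((2\cdot1-1)^2+(2\cdot2-2)^2+(2\cdot3-3)^2\right) = \frac{14}{3} \approx 4.67$

So $cost(W)$ is a parabola with its minimum at $W = 1$.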


▷ Gradient descent algorithm

How does it work?

(figure: gradient descent)


▷ Formal definition

$cost(W) = \frac 1m \displaystyle\sum_{i=1}^{m}{(Wx_i-y_i)^2}$

$cost(W) = \frac {1}{2m} \displaystyle\sum_{i=1}^{m}{(Wx_i-y_i)^2}$ (the $\frac 12$ is added for convenience; it cancels the 2 that appears after differentiation and does not change the minimizing $W$)


$W := W - α\frac {∂}{∂W} \frac {1}{2m} \displaystyle\sum_{i=1}^{m}{(Wx_i-y_i)^2}$

$W := W - α \frac {1}{2m} \displaystyle\sum_{i=1}^{m}{2(Wx_i-y_i)x_i}$

$W := W - α \frac {1}{m} \displaystyle\sum_{i=1}^{m}{(Wx_i-y_i)x_i}$


$W := W - α\frac {∂}{∂W}cost(W)$

  1. Differentiate the cost with respect to W and multiply the result by α (the learning rate)
  2. Subtract that value from the current W
  3. Assign the result back to W and repeat (the larger the learning rate, the more rapidly W changes); a worked one-step example follows below
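
For example, one update step with the data above, starting from $W = 0$ with $α = 0.1$ (illustrative values):

$\frac 1m \displaystyle\sum_{i=1}^{m}{(Wx_i-y_i)x_i} = \frac 13\left((0-1)\cdot1 + (0-2)\cdot2 + (0-3)\cdot3\right) = -\frac{14}{3}$

$W := 0 - 0.1 \cdot \left(-\frac{14}{3}\right) \approx 0.47$

so W moves toward the minimum at $W = 1$.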

▷ Convex function

(figure: a non-convex cost surface with multiple local minima)

Local minimum != global minimum ▶ we can't apply the gradient descent algorithm

(figure: a convex cost surface)

Local minimum == global minimum ▶ we can apply the gradient descent algorithm

Wherever we start, we are guaranteed to reach the lowest point. The squared-error cost used here is convex, so this guarantee applies.


The implementations below use the simplified hypothesis $H(x) = Wx$.

▷ Cost function in pure Python
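
A minimal sketch in plain Python, using the data from the table above (the grid of W values swept over is an arbitrary choice):

```python
# cost(W) for the simplified hypothesis H(x) = W * x, averaged over the m samples.
x_data = [1, 2, 3]
y_data = [1, 2, 3]

def cost_func(W, x, y):
    total = 0
    for xi, yi in zip(x, y):
        total += (W * xi - yi) ** 2
    return total / len(x)

# Evaluate cost(W) on a grid of W values from -3.0 to 5.0 in steps of 0.1.
for i in range(-30, 51):
    W = i * 0.1
    print(f"{W:6.3f} | {cost_func(W, x_data, y_data):10.5f}")
```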

▷ Cost function in TensorFlow
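
A sketch of the same sweep using TensorFlow ops; TensorFlow 2.x eager execution is assumed, which may differ from the version used in the original lab:

```python
import numpy as np
import tensorflow as tf

x_data = np.array([1, 2, 3], dtype=np.float32)
y_data = np.array([1, 2, 3], dtype=np.float32)

def cost_func(W, x, y):
    hypothesis = W * x                                # simplified hypothesis H(x) = Wx
    return tf.reduce_mean(tf.square(hypothesis - y))  # mean squared error

# Sweep W over a range of values and print cost(W) at each point.
W_values = np.linspace(-3.0, 5.0, num=15)
for W in W_values:
    c = cost_func(W, x_data, y_data).numpy()
    print(f"{W:6.3f} | {c:10.5f}")
```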

▷ Gradient descent
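
A sketch that applies the update rule derived above directly, without a built-in optimizer (TensorFlow 2.x assumed; the starting value, learning rate, and step count are illustrative):

```python
import tensorflow as tf

x_data = tf.constant([1.0, 2.0, 3.0])
y_data = tf.constant([1.0, 2.0, 3.0])

W = tf.Variable(5.0)   # deliberately poor starting value (arbitrary choice)
alpha = 0.1            # learning rate

for step in range(100):
    hypothesis = W * x_data                                    # H(x) = Wx
    cost = tf.reduce_mean(tf.square(hypothesis - y_data))      # cost(W)
    gradient = tf.reduce_mean((hypothesis - y_data) * x_data)  # (1/m) * sum((W*x_i - y_i) * x_i)
    W.assign_sub(alpha * gradient)                             # W := W - alpha * gradient
    if step % 10 == 0:
        print(step, cost.numpy(), W.numpy())
```

With this data, W converges to 1 and the cost goes to 0, matching the minimum found analytically above.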