Binary classification: the output $y$ is a binary variable, so we model a Bernoulli distribution over $y$ conditioned on $x$. Candidate ways to parameterize $p(x) = P(y=1\mid x)$:
- $p(x)=W^Tx+b$ : unbounded, so it is not a valid probability.
- $p(x) = \max\{0, \min\{1, W^Tx+b\}\}$ : cannot capture "diminishing returns", and the gradient is zero wherever the clipping is active, so it is hard to learn by gradient methods!
- $\log{p(x)} = W^Tx+b$ : solving for $p$, this gives $$ p(x;W,b)=e^{W^Tx+b} $$ which is unbounded in the $(+)$ direction.
- $p(x) = \min\{1, e^{W^Tx+b}\}$ : it is still hard to calculate the gradient where the clipping is active!
- $\log{\frac{p(x)}{1-p(x)}} = W^Tx+b$ : bounded; solving for $p$, this gives $$ p(x;W,b) = \frac{e^{W^Tx+b}}{1+e^{W^Tx+b}}=\frac{1}{1+e^{-(W^Tx+b)}} $$
- The decision boundary separating the two predicted classes is the solution of $W^Tx+b=0$, which is a point if $x$ is one-dimensional, a line if it is two-dimensional, etc.
- $p(x)$ is called the logistic function, which is a type of sigmoid function.
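A minimal numerical sketch of the logistic function and its decision boundary in one dimension (the weights `w`, `b` here are illustrative values, not from the notes):

```python
import math

def sigmoid(z):
    # logistic function: maps any real z into (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

# illustrative 1-D parameters
w, b = 2.0, -1.0

# the decision boundary W^T x + b = 0 is a single point here: x = -b/w
x_boundary = -b / w
print(sigmoid(w * x_boundary + b))   # 0.5 exactly on the boundary

# probabilities move toward 1 or 0 as x moves away from the boundary
print(sigmoid(w * 2.0 + b) > 0.5)    # right of boundary -> predict class 1
print(sigmoid(w * (-1.0) + b) < 0.5) # left of boundary -> predict class 0
```

On the boundary the model is maximally uncertain ($p=0.5$); moving away from it pushes $p(x)$ monotonically toward 0 or 1.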
$$ J(\theta) = -\frac{1}{n}\sum_{i=1}^{n} \left[ y_i\log\big(h_\theta(x_i)\big) + (1-y_i)\log\big(1-h_\theta(x_i)\big) \right] $$
$$ \frac{\partial}{\partial\theta_j}J(\theta)=\frac{1}{n}\sum^n_{i=1}\big(h_\theta(x_i) - y_i\big)x_{ij} $$
where $h_\theta(x) = \frac{1}{1+e^{-(W^Tx+b)}}$ is the logistic function.
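The cost $J(\theta)$ and its gradient above can be sketched as a plain gradient-descent loop (NumPy, synthetic data; the learning rate, iteration count, and generating weights are arbitrary illustrative choices):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# synthetic data drawn from a known logistic model
rng = np.random.default_rng(0)
n, d = 200, 2
X = rng.normal(size=(n, d))
true_w, true_b = np.array([2.0, -1.0]), 0.5
y = (sigmoid(X @ true_w + true_b) > rng.uniform(size=n)).astype(float)

# gradient descent on J(theta)
w, b = np.zeros(d), 0.0
lr = 0.5
for _ in range(1000):
    h = sigmoid(X @ w + b)          # h_theta(x_i) for all i
    grad_w = X.T @ (h - y) / n      # (1/n) sum_i (h_theta(x_i) - y_i) x_ij
    grad_b = np.mean(h - y)         # same gradient for the bias (x_ij = 1)
    w -= lr * grad_w
    b -= lr * grad_b

# cross-entropy cost and training accuracy after fitting
J = -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))
acc = np.mean((h > 0.5) == (y == 1))
print(J, acc)
```

Note the gradient has the same form as in linear regression, $(h_\theta(x_i)-y_i)x_{ij}$, only with $h_\theta$ replaced by the logistic function; this is what makes the update so simple.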