# 逻辑回归（一）

###### Classification

Question：
Which of the following statements is true?
A. If linear regression doesn't work on a classification task as in the previous example shown in the video, applying feature scaling may help.
B. If the training set satisfies 0 ≤ y(i) ≤ 1 for every training example (x(i), y(i)), then linear regression's prediction will also satisfy 0 ≤ hθ(x) ≤ 1 for all values of x.
C. If there is a feature x that perfectly predicts y, i.e. y = 1 whenever x ≥ c and y = 0 whenever x < c (for some constant c), then linear regression will obtain zero classification error.
D. None of the above statements are true.

###### Classification

To attempt classification, one method is to use linear regression and map all predictions greater than 0.5 as a 1 and all less than 0.5 as a 0. However, this method doesn't work well because classification is not actually a linear function.

The classification problem is just like the regression problem, except that the values we now want to predict take on only a small number of discrete values. For now, we will focus on the binary classification problem in which y can take on only two values, 0 and 1. (Most of what we say here will also generalize to the multiple-class case.) For instance, if we are trying to build a spam classifier for email, then x(i) may be some features of a piece of email, and y may be 1 if it is a piece of spam mail, and 0 otherwise. Hence, y∈{0,1}. 0 is also called the negative class, and 1 the positive class, and they are sometimes also denoted by the symbols “-” and “+.” Given x(i), the corresponding y(i) is also called the label for the training example.

###### Hypothesis Representation

hθ(x) = P(y=1|x;θ), and P(y=1|x;θ) + P(y=0|x;θ) = 1

###### Hypothesis Representation

We could approach the classification problem ignoring the fact that y is discrete-valued, and use our old linear regression algorithm to try to predict y given x. However, it is easy to construct examples where this method performs very poorly. Intuitively, it also doesn’t make sense for hθ(x) to take values larger than 1 or smaller than 0 when we know that y ∈ {0, 1}. To fix this, let’s change the form for our hypotheses hθ(x) to satisfy 0≤hθ(x)≤1. This is accomplished by plugging θTx into the Logistic Function.

Our new form uses the "Sigmoid Function," also called the "Logistic Function":

hθ(x) = g(θTx), where g(z) = 1 / (1 + e^(−z))

The sigmoid function is an S-shaped curve that approaches 0 as z → −∞ and 1 as z → +∞, crossing 0.5 at z = 0.

The function g(z), shown here, maps any real number to the (0, 1) interval, making it useful for transforming an arbitrary-valued function into a function better suited for classification.

hθ(x) will give us the probability that our output is 1. For example, hθ(x)=0.7 gives us a probability of 70% that our output is 1. Our probability that our prediction is 0 is just the complement of our probability that it is 1 (e.g. if probability that it is 1 is 70%, then the probability that it is 0 is 30%).
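The hypothesis and its complement rule can be sketched in a few lines of plain Python (a minimal illustration; the two-feature θ and x values are made up for the demo):

```python
import math

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z)): maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def h(theta, x):
    """Hypothesis h_theta(x) = g(theta^T x), read as the estimated P(y = 1 | x; theta)."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))

print(sigmoid(0.0))              # 0.5: the midpoint, at z = 0
p1 = h([-3.0, 1.0], [1.0, 5.0])  # theta^T x = -3 + 5 = 2, so p1 > 0.5
print(p1, 1.0 - p1)              # P(y=1|x;theta) and its complement P(y=0|x;theta)
```

As in the text, the probability of y = 0 is simply one minus the probability of y = 1.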

###### Decision Boundary

• When z ≥ 0, i.e. θTx ≥ 0, we have g(z) ≥ 0.5 and predict y = 1
• When z < 0, i.e. θTx < 0, we have g(z) < 0.5 and predict y = 0

###### Decision Boundary

In order to get our discrete 0 or 1 classification, we can translate the output of the hypothesis function as follows:

hθ(x) ≥ 0.5 → y = 1
hθ(x) < 0.5 → y = 0

The way our logistic function g behaves is that when its input is greater than or equal to zero, its output is greater than or equal to 0.5:

g(z) ≥ 0.5 when z ≥ 0

Remember: g(z) = 1 / (1 + e^(−z)), and e^(−z) ≤ 1 whenever z ≥ 0, so g(z) ≥ 0.5 there.

So if our input to g is θTx, then that means:

hθ(x) = g(θTx) ≥ 0.5 when θTx ≥ 0

From these statements we can now say:

θTx ≥ 0 ⇒ y = 1
θTx < 0 ⇒ y = 0

The decision boundary is the line that separates the area where y = 0 and where y = 1. It is created by our hypothesis function.

Example:

In this case, our decision boundary is a straight vertical line placed on the graph where x1=5, and everything to the left of that denotes y = 1, while everything to the right denotes y = 0.
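A thresholded predictor reproducing this example might look like the following sketch, where θ = [5, −1] (with x = [1, x1]) is an assumed parameter vector chosen only so that the boundary lands at x1 = 5:

```python
def predict(theta, x):
    # Predict y = 1 exactly when theta^T x >= 0, i.e. when g(theta^T x) >= 0.5.
    z = sum(t * xi for t, xi in zip(theta, x))
    return 1 if z >= 0 else 0

# Assumed parameters for illustration: theta = [5, -1] with x = [1, x1],
# giving theta^T x = 5 - x1 >= 0 exactly when x1 <= 5 (left of the line x1 = 5).
theta = [5.0, -1.0]
print(predict(theta, [1.0, 3.0]))   # x1 = 3, left of the boundary  -> 1
print(predict(theta, [1.0, 7.0]))   # x1 = 7, right of the boundary -> 0
```

Note that the prediction never needs the sigmoid's actual value: only the sign of θTx matters.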

Again, the input to the sigmoid function g(z) (e.g. θTx) doesn't need to be linear; it could be a function that describes a circle (e.g. z = θ0 + θ1x1² + θ2x2²) or any other shape that fits our data.
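A circular decision boundary of this form can be sketched as below; θ = [−1, 1, 1] is an assumed choice that places the boundary on the unit circle:

```python
def predict_circle(theta, x1, x2):
    # Nonlinear input to the sigmoid: z = theta0 + theta1*x1^2 + theta2*x2^2.
    z = theta[0] + theta[1] * x1 ** 2 + theta[2] * x2 ** 2
    return 1 if z >= 0 else 0

# Assumed theta = [-1, 1, 1] for illustration: z = -1 + x1^2 + x2^2 >= 0
# exactly on or outside the unit circle, so y = 1 outside and y = 0 inside.
theta = [-1.0, 1.0, 1.0]
print(predict_circle(theta, 2.0, 0.0))   # outside the circle -> 1
print(predict_circle(theta, 0.2, 0.2))   # inside the circle  -> 0
```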
