Correctly understanding logits in TensorFlow

[Question] I was going through the TensorFlow API docs here. In the TensorFlow documentation, they use a keyword called logits. What is it? A lot of methods in the API docs are written like

tf.nn.softmax(logits, name=None)

If these logits are just ordinary Tensors, why keep a different name like logits?

Another thing is that there are two methods I could not differentiate:

tf.nn.softmax(logits, name=None)
tf.nn.softmax_cross_entropy_with_logits(logits, labels, name=None)

What are the differences between them? The docs are not clear to me. I know what tf.nn.softmax does, but not the other. An example would be really helpful.

Short version:

Suppose you have two tensors, where y_hat contains computed scores for each class (for example, from y = W*x + b) and y_true contains one-hot encoded true labels.

y_hat  = ... # Predicted scores (logits), e.g. y_hat = tf.matmul(X, W) + b
y_true = ... # True labels, one-hot encoded

If you interpret the scores in y_hat as unnormalized log probabilities, then they are logits.

Additionally, the total cross-entropy loss computed in this manner:

y_hat_softmax = tf.nn.softmax(y_hat)
total_loss = tf.reduce_mean(-tf.reduce_sum(y_true * tf.log(y_hat_softmax), [1]))

is essentially equivalent to the total cross-entropy loss computed with the function softmax_cross_entropy_with_logits():

total_loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(y_hat, y_true))
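
One practical note I'll add (it concerns your TensorFlow version and is not part of the original answer): newer TensorFlow releases may insist on keyword arguments for this function, and passing them by name works across versions, so the safer spelling of the same call is:

# Keyword-argument form of the same call; works whether or not your
# TensorFlow version still accepts the positional form above.
total_loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=y_hat))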

Long version:

In the output layer of your neural network, you will probably compute an array that contains the class scores for each of your training instances, such as from a computation y_hat = W*x + b. To serve as an example, below I've created a y_hat as a 2 x 3 array, where the rows correspond to the training instances and the columns correspond to classes. So here there are 2 training instances and 3 classes.

import tensorflow as tf
import numpy as np

sess = tf.Session()

# Create example y_hat.
y_hat = tf.convert_to_tensor(np.array([[0.5, 1.5, 0.1],[2.2, 1.3, 1.7]]))
sess.run(y_hat)
# array([[ 0.5,  1.5,  0.1],
#        [ 2.2,  1.3,  1.7]])

Note that the values are not normalized (i.e. the rows don't add up to 1). In order to normalize them, we can apply the softmax function, which interprets the input as unnormalized log probabilities (aka logits) and outputs normalized linear probabilities.

y_hat_softmax = tf.nn.softmax(y_hat)
sess.run(y_hat_softmax)
# array([[ 0.227863  ,  0.61939586,  0.15274114],
#        [ 0.49674623,  0.20196195,  0.30129182]])
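
If it helps to see exactly what softmax is doing, the same numbers can be reproduced directly from the definition exp(x) / sum(exp(x)) with plain NumPy (a quick cross-check I've added; it is not part of the original answer):

# Manual softmax: exponentiate each score, then normalize each row to sum to 1.
scores = np.array([[0.5, 1.5, 0.1], [2.2, 1.3, 1.7]])
exp_scores = np.exp(scores)
exp_scores / exp_scores.sum(axis=1, keepdims=True)
# array([[ 0.227863  ,  0.61939586,  0.15274114],
#        [ 0.49674623,  0.20196195,  0.30129182]])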

It's important to fully understand what the softmax output is saying. Below I've shown a table that more clearly represents the output above. It can be seen that, for example, the probability of training instance 1 being "Class 2" is 0.619. The class probabilities for each training instance are normalized, so the sum of each row is 1.0.

                      Pr(Class 1)  Pr(Class 2)  Pr(Class 3)
                    ,--------------------------------------
Training instance 1 | 0.227863   | 0.61939586 | 0.15274114
Training instance 2 | 0.49674623 | 0.20196195 | 0.30129182
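
As a quick sanity check on that last point (again, an addition of mine rather than part of the original answer), the rows do indeed sum to 1:

# Each row of class probabilities sums to 1.0 (up to floating-point rounding).
sess.run(tf.reduce_sum(y_hat_softmax, 1))
# array([ 1.,  1.])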

So now we have class probabilities for each training instance, where we can take the argmax() of each row to generate a final classification. From above, we may predict that training instance 1 belongs to "Class 2" and training instance 2 belongs to "Class 1".
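
For example (a small sketch of mine, not part of the original answer), the argmax of each row gives the predicted class indices:

# Index of the highest-probability class per row: 1 -> "Class 2", 0 -> "Class 1".
sess.run(tf.argmax(y_hat_softmax, 1))
# array([1, 0])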

Are these classifications correct? We need to measure against the true labels from the training set. You will need a one-hot encoded y_true array, where again the rows are training instances and columns are classes. Below I've created an example y_true one-hot array where the true label for training instance 1 is "Class 2" and the true label for training instance 2 is "Class 3".

y_true = tf.convert_to_tensor(np.array([[0.0, 1.0, 0.0],[0.0, 0.0, 1.0]]))
sess.run(y_true)
# array([[ 0.,  1.,  0.],
#        [ 0.,  0.,  1.]])
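
For comparison with the predicted classes above (again just a quick check I've added, not part of the original answer), the true class indices are:

# True class index per row: 1 -> "Class 2", 2 -> "Class 3".
sess.run(tf.argmax(y_true, 1))
# array([1, 2])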

Is the probability distribution in y_hat_softmax close to the probability distribution in y_true? We can use cross-entropy loss to measure the error.

We can compute the cross-entropy loss on a row-wise basis and see the results. Below we can see that training instance 1 has a loss of 0.479, while training instance 2 has a higher loss of 1.200. This result makes sense because in our example above, y_hat_softmax showed that training instance 1's highest probability was for "Class 2", which matches training instance 1 in y_true; however, the prediction for training instance 2 showed a highest probability for "Class 1", which does not match the true class "Class 3".

loss_per_instance_1 = -tf.reduce_sum(y_true * tf.log(y_hat_softmax), reduction_indices=[1])
sess.run(loss_per_instance_1)
# array([ 0.4790107 ,  1.19967598])
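
Because y_true is one-hot, each row's cross-entropy reduces to minus the log of the probability assigned to the true class, which makes the two numbers easy to verify by hand (a check I've added; it is not part of the original answer):

# Instance 1: true class is "Class 2", predicted probability 0.61939586.
-np.log(0.61939586)   # ≈ 0.479
# Instance 2: true class is "Class 3", predicted probability 0.30129182.
-np.log(0.30129182)   # ≈ 1.200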

What we really want is the total loss over all the training instances. So we can compute:

total_loss_1 = tf.reduce_mean(-tf.reduce_sum(y_true * tf.log(y_hat_softmax), reduction_indices=[1]))
sess.run(total_loss_1)
# 0.83934333897877944
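
This is simply the mean of the two per-instance losses above: (0.4790107 + 1.19967598) / 2 ≈ 0.8393.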

Using softmax_cross_entropy_with_logits()

We can instead compute the total cross entropy loss using the tf.nn.softmax_cross_entropy_with_logits() function, as shown below.

loss_per_instance_2 = tf.nn.softmax_cross_entropy_with_logits(y_hat, y_true)
sess.run(loss_per_instance_2)
# array([ 0.4790107 ,  1.19967598])

total_loss_2 = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(y_hat, y_true))
sess.run(total_loss_2)
# 0.83934333897877922

Note that total_loss_1 and total_loss_2 produce essentially equivalent results with some small differences in the very final digits. However, you might as well use the second approach: it takes one less line of code and accumulates less numerical error because the softmax is done for you inside of softmax_cross_entropy_with_logits().

From Stack Overflow: https://stackoverflow.com/questions/34240703/what-is-logits-softmax-and-softmax-cross-entropy-with-logits?noredirect=1&lq=1
