Reference:
https://zhuanlan.zhihu.com/p/20918580
https://zhuanlan.zhihu.com/p/20945670
https://zhuanlan.zhihu.com/p/21102293
Linear Classification
score function
Maps the raw image pixels to a score for each class; the higher a class's score, the more likely the image belongs to that class
Each image is a [D x 1] column vector, where D = width x height x 3 (RGB)
Parameters
Weights
W (weight matrix) = [k x D], where k is the number of classes
Geometrically, W performs a rotation/linear transform in the D-dimensional pixel space
Bias vector
b (bias vector) = [k x 1]
Geometrically, b performs a translation (shift) that does not interact with the data
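A minimal NumPy sketch of the score function f(x) = Wx + b; the dimensions below (D = 4, k = 3) are made up purely for illustration:

import numpy as np

D, k = 4, 3                   # D = flattened pixel dimension, k = number of classes
x = np.random.randn(D, 1)     # one image, flattened into a [D x 1] column vector
W = np.random.randn(k, D)     # weight matrix, [k x D]
b = np.random.randn(k, 1)     # bias vector, [k x 1]

scores = W.dot(x) + b         # [k x 1] class scores; the highest entry is the predicted class
print(scores.ravel())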
Image data preprocessing
The most common step is normalization
Zero-mean centering: shift pixel values from [0, 255] to roughly [-128, 127], then optionally scale them into [-1, 1]
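A small sketch of this centering and scaling, assuming raw pixel values in [0, 255] (subtracting 128 is one common choice):

import numpy as np

pixels = np.array([0, 64, 128, 255], dtype=np.float64)  # raw pixel values in [0, 255]
centered = pixels - 128.0                                # zero-centered, roughly [-128, 127]
scaled = centered / 128.0                                # squashed into roughly [-1, 1]
print(centered, scaled)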
Loss Function
Quantifies how well the predicted class scores agree with the ground-truth labels; also called the Cost Function or the Objective
The more the output of the score function deviates from the true labels, the larger the loss
Multiclass Support Vector Machine Loss
e.g. with three classes the scores are s = [13, -7, 11], Δ = 10, and the first class is the correct one
The first term contributes zero because the correct score 13 exceeds -7 by 20, which is more than the margin Δ = 10 (max(0, -7 - 13 + 10) = 0); the second term gives max(0, 11 - 13 + 10) = 8, so the total loss is 8
The SVM loss for the linear score function: $$L_i = \sum_{j \neq y_i} \max(0, s_j - s_{y_i} + \Delta)$$
hinge loss: the max(0, -) form above
squared hinge loss SVM (i.e. L2-SVM): max(0, -)^2, which penalizes violated margins quadratically
The goal is for the correct class's score to stay above every other class's score by at least the margin Δ (the red region in the original figure); whenever another class enters that region or goes higher, loss is accumulated. What we are ultimately solving for is the weights W.
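A quick NumPy check of the numeric example above (correct class at index 0, Δ = 10):

import numpy as np

s = np.array([13.0, -7.0, 11.0])   # class scores; index 0 is the correct class
delta = 10.0
margins = np.maximum(0, s - s[0] + delta)
margins[0] = 0                     # the correct class contributes no loss
print(margins.sum())               # 0 + 8 = 8, matching the hand calculation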
Regularization
Suppose we have a dataset and a weight set W that classifies every example correctly; there are typically many similar W (e.g. scaled versions of it) that also classify all the data correctly
In other words, rescaling W changes the loss scores, so we want to encode a preference for a particular set of weights
This is done by adding a regularization penalty to the loss function
Regularization penalty R(W), most commonly the L2 penalty: the sum of the squares of all elements of W
The full Multiclass SVM loss L = data loss (the average loss L_i over all examples) + regularization loss
Expanded formula:
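Written out in the standard form (N is the number of training examples and λ the regularization strength):

$$
L = \frac{1}{N}\sum_i \sum_{j \neq y_i} \max\left(0,\; f(x_i; W)_j - f(x_i; W)_{y_i} + \Delta\right) + \lambda \sum_k \sum_l W_{k,l}^2
$$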
Code:
import numpy as np

def L_i(x, y, W):
  """
  unvectorized version. Compute the multiclass svm loss for a single example (x,y)
  - x is a column vector representing an image (e.g. 3073 x 1 in CIFAR-10)
    with an appended bias dimension in the 3073-rd position (i.e. bias trick)
  - y is an integer giving index of correct class (e.g. between 0 and 9 in CIFAR-10)
  - W is the weight matrix (e.g. 10 x 3073 in CIFAR-10)
  """
  delta = 1.0 # see notes about delta later in this section
  scores = W.dot(x) # scores becomes of size 10 x 1, the scores for each class
  correct_class_score = scores[y]
  D = W.shape[0] # number of classes, e.g. 10
  loss_i = 0.0
  for j in range(D): # iterate over all wrong classes
    if j == y:
      # skip for the true class to only loop over incorrect classes
      continue
    # accumulate loss for the i-th example
    loss_i += max(0, scores[j] - correct_class_score + delta)
  return loss_i
def L_i_vectorized(x, y, W):
  """
  A faster half-vectorized implementation. half-vectorized
  refers to the fact that for a single example the implementation contains
  no for loops, but there is still one loop over the examples (outside this function)
  """
  delta = 1.0
  scores = W.dot(x)
  # compute the margins for all classes in one vector operation
  margins = np.maximum(0, scores - scores[y] + delta)
  # on y-th position scores[y] - scores[y] canceled and gave delta. We want
  # to ignore the y-th position and only consider margin on max wrong class
  margins[y] = 0
  loss_i = np.sum(margins)
  return loss_i
def L(X, y, W):
  """
  fully-vectorized implementation :
  - X holds all the training examples as columns (e.g. 3073 x 50,000 in CIFAR-10)
  - y is array of integers specifying correct class (e.g. 50,000-D array)
  - W are weights (e.g. 10 x 3073)
  """
  # evaluate loss over all examples in X without using any for loops
  delta = 1.0
  num_train = X.shape[1]
  scores = W.dot(X)                                  # 10 x 50,000 matrix of class scores
  correct_scores = scores[y, np.arange(num_train)]   # the correct-class score for every example
  margins = np.maximum(0, scores - correct_scores + delta)  # broadcast over all classes at once
  margins[y, np.arange(num_train)] = 0               # the correct class contributes no loss
  return np.sum(margins) / num_train                 # average data loss over all examples
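A hypothetical usage sketch with tiny random data (the shapes just mirror the comments above, scaled down to run quickly):

num_classes, dim, num_train = 10, 3073, 5       # small stand-in for the CIFAR-10 shapes
W = np.random.randn(num_classes, dim) * 0.001   # small random weights
X = np.random.randn(dim, num_train)             # examples as columns, bias trick already applied
y = np.random.randint(num_classes, size=num_train)

# averaging the per-example losses should match the fully-vectorized version
per_example = [L_i_vectorized(X[:, i], y[i], W) for i in range(num_train)]
print(np.mean(per_example), L(X, y, W))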
Softmax classifier
The generalization of the binary Logistic Regression classifier to multiple classes
Formula: $$L_i = -\log\left(\frac{e^{f_{y_i}}}{\sum_j e^{f_j}}\right)$$
or, equivalently, $$L_i = -f_{y_i} + \log \sum_j e^{f_j}$$
Every score is first mapped through $$e^z$$ (exponentiated) before normalization
score function => unchanged, f(x_i; W) = W x_i, but the scores are now interpreted as unnormalized log probabilities for each class
softmax function: squashes the scores so that every element lies between 0 and 1 and they all sum to 1
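Written out (with j indexing the classes), the softmax function is:

$$
\sigma(f)_j = \frac{e^{f_j}}{\sum_k e^{f_k}}
$$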
Probabilistic interpretation:
We are minimizing the negative log probability of the correct class, which can be viewed as performing Maximum Likelihood Estimation (MLE)
A practical trick for numerical stability when implementing softmax is to multiply numerator and denominator by a constant C; a common choice is $$\log C = -\max_j f_j$$
f = np.array([123, 456, 789]) # example with 3 classes, each with a very large score
p = np.exp(f) / np.sum(np.exp(f)) # Bad: numeric range issue, the exponentials may overflow
# instead, shift the values in f so that the highest value is 0:
f -= np.max(f) # f becomes [-666, -333, 0]
p = np.exp(f) / np.sum(np.exp(f)) # safe now, gives the correct result
The Softmax classifier replaces the hinge loss with the cross-entropy loss
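A minimal sketch of the cross-entropy loss for a single example, combining the softmax with the stability shift above; the scores and label here are made up:

import numpy as np

f = np.array([3.2, 5.1, -1.7])        # hypothetical class scores for one example
y = 0                                 # index of the correct class

f = f - np.max(f)                     # shift for numerical stability
p = np.exp(f) / np.sum(np.exp(f))     # softmax probabilities, sum to 1
loss = -np.log(p[y])                  # cross-entropy loss L_i = -log(p_correct)
print(p, loss)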
Comparison between SVM and Softmax
SVM treats the scores as class margins to be satisfied (hinge loss), while Softmax interprets them as unnormalized log probabilities and minimizes the cross-entropy loss
Summary
Both SVM and Softmax compute class scores from the weights W and the bias b
Each defines a Loss Function that measures how well the model predicts, which is then used to find better prediction models
Testing MathJax, though the formulas are rather complex.
$$
L_i = -f_{y_i} + \log \sum_j e^{f_j}
$$
$$
L_i = -\log \frac{e^{f_{y_i}}}{\sum_j e^{f_j}}
$$
TODO