# Machine Learning with Python (Part 2)

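SciPy is a convenient, easy-to-use Python toolkit designed for science and engineering. It includes modules for statistics, optimization, integration, linear algebra, Fourier transforms, signal and image processing, ordinary differential equation solvers, and more.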

The main modules in SciPy are:

Vector quantization / Kmeans: scipy.cluster

Physical and mathematical constants: scipy.constants

Fourier transform: scipy.fftpack

Integration routines: scipy.integrate

Interpolation: scipy.interpolate

Data input and output: scipy.io

Linear algebra routines: scipy.linalg

n-dimensional image package: scipy.ndimage

Orthogonal distance regression: scipy.odr

Optimization: scipy.optimize

Signal processing: scipy.signal

Sparse matrices: scipy.sparse

Spatial data structures and algorithms: scipy.spatial

Special mathematical functions: scipy.special

Statistics: scipy.stats
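
Each entry above is a submodule that is imported by name. As a minimal sketch (the particular constants and functions picked here are illustrative choices, not part of the original list):

```python
import math

from scipy import constants, special

# Physical and mathematical constants live in scipy.constants.
speed_of_light = constants.c   # speed of light in vacuum, in m/s
pi_value = constants.pi        # same value as math.pi

# Special functions live in scipy.special.
# gamma(n) equals (n - 1)! for positive integers n.
gamma_5 = special.gamma(5)

print(speed_of_light, pi_value, gamma_5)
```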

``````
import numpy as np
from scipy import linalg

arr = np.array([[1, 2], [3, 4]])
# Determinant of the matrix
print("Determinant:", linalg.det(arr))
# Inverse of the matrix
print("Inverse:", linalg.inv(arr))
``````
``````
Determinant: -2.0
Inverse: [[-2.   1. ]
 [ 1.5 -0.5]]
``````
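A quick sanity check on an inverse is to multiply it back against the original matrix. The sketch below also shows what happens when the matrix has no inverse; the matrix `singular` is an illustrative assumption, not part of the original example.

```python
import numpy as np
from scipy import linalg

arr = np.array([[1, 2], [3, 4]])
inv_arr = linalg.inv(arr)

# A matrix times its inverse should give the identity, up to rounding error.
is_identity = np.allclose(arr @ inv_arr, np.eye(2))
print("arr @ inv(arr) is the identity:", is_identity)

# The second row is twice the first, so this matrix has no inverse.
singular = np.array([[1, 2], [2, 4]])
try:
    linalg.inv(singular)
    inverse_failed = False
except linalg.LinAlgError:
    inverse_failed = True
print("inverting the singular matrix failed:", inverse_failed)
```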
``````
# Singular value decomposition (SVD)
arr = np.arange(9).reshape((3, 3)) + np.diag([1, 0, 1])
uarr, spec, vharr = linalg.svd(arr)
print(spec)
# Rebuild the matrix from its factors: U @ diag(s) @ Vh
sarr = np.diag(spec)
svd_mat = uarr.dot(sarr).dot(vharr)
print(svd_mat)
np.allclose(arr, svd_mat)
``````
``````
[ 14.88982544   0.45294236   0.29654967]
[[ 1.  1.  2.]
 [ 3.  4.  5.]
 [ 6.  7.  9.]]

True
``````
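One common use of the decomposition is low-rank approximation. The following is a minimal sketch, assuming the same matrix as above; the names `rank1` and `error` are introduced here for illustration only.

```python
import numpy as np
from scipy import linalg

arr = np.arange(9).reshape((3, 3)) + np.diag([1, 0, 1])
uarr, spec, vharr = linalg.svd(arr)

# Singular values are returned in descending order.
is_sorted = bool(np.all(spec[:-1] >= spec[1:]))

# Keep only the largest singular value to build a rank-1 approximation.
rank1 = spec[0] * np.outer(uarr[:, 0], vharr[0, :])

# In the spectral (2-) norm, the approximation error equals the
# largest singular value that was dropped.
error = linalg.norm(arr - rank1, 2)
print(is_sorted, error, spec[1])
```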
``````
# Optimization
import matplotlib.pyplot as plt
from scipy import optimize

def f(x):
    return x**2 + 10*np.sin(x)

x = np.arange(-10, 10, 0.1)
plt.plot(x, f(x))
plt.show()
# BFGS depends on the starting point and may return only a local minimum
optimize.fmin_bfgs(f, 0)
``````
``````
array([-1.30644...])
``````
``````
# Starting from x0 = 3 converges to a local minimum instead
optimize.fmin_bfgs(f, 3)
``````
``````
Optimization terminated successfully.
         Current function value: 8.315586
         Iterations: 6
         Function evaluations: 21

array([ 3.83746709])
``````
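The dependence on the starting point can be checked directly. Below is a minimal sketch using `optimize.minimize` with `method="BFGS"`, a more general interface to the same algorithm. The start points 0 and 3 follow the calls above; the `objective` wrapper is an assumption added here so that the function returns a plain float.

```python
import numpy as np
from scipy import optimize

def f(x):
    return x**2 + 10 * np.sin(x)

def objective(x):
    # minimize passes a 1-element array; return a plain float.
    return float(f(x[0]))

# Run BFGS from two different starting points.
res_from_0 = optimize.minimize(objective, x0=[0.0], method="BFGS")
res_from_3 = optimize.minimize(objective, x0=[3.0], method="BFGS")

# Each result carries the minimizer (.x) and the function value there (.fun).
print("start 0:", res_from_0.x, res_from_0.fun)
print("start 3:", res_from_3.x, res_from_3.fun)

# The run with the lower function value found the better minimum.
best = min(res_from_0, res_from_3, key=lambda r: r.fun)
```

Comparing `.fun` across several starts is a cheap way to notice that one of the runs stopped in a local minimum.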
``````
# Global optimization
optimize.basinhopping(f, 0)
``````
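`basinhopping` returns a result object rather than a bare array. A minimal sketch of reading it, assuming the same function `f` (the `objective` wrapper is again an illustrative assumption):

```python
import numpy as np
from scipy import optimize

def f(x):
    return x**2 + 10 * np.sin(x)

def objective(x):
    # The local minimizer passes a 1-element array; return a plain float.
    return float(f(x[0]))

result = optimize.basinhopping(objective, x0=[0.0])

# result.x is the lowest minimizer found across all hops;
# result.fun is the function value at that point.
print(result.x, result.fun)
```

The random hops differ from run to run, but the lowest point found is kept, so the reported minimum should not get worse as hops are added.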

With an initial guess of 1, `fsolve` finds only one root:

``````
root = optimize.fsolve(f, 1)
root
``````
``````
array([ 0.])
``````
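The function has a second root at about -2.48, which a different initial guess can reach. A minimal sketch, where the guess -2.5 and the bracket [-3, -2] are assumptions chosen by looking at the plot of `f`:

```python
import numpy as np
from scipy import optimize

def f(x):
    return x**2 + 10 * np.sin(x)

# A different initial guess leads fsolve to a different root.
root_a = optimize.fsolve(f, 1)[0]
root_b = optimize.fsolve(f, -2.5)[0]

# brentq needs a bracket where f changes sign: f(-3) > 0 and f(-2) < 0.
root_c = optimize.brentq(f, -3, -2)

print(root_a, root_b, root_c)
```

A bracketing method such as `brentq` is a reasonable alternative when a sign change is known, since it cannot wander off to another root.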
``````
# Curve fitting
xdata = np.linspace(-10, 10, num=20)
ydata = f(xdata) + np.random.randn(xdata.size)

# Assume the data follows the model f2, then solve for a and b
def f2(x, a, b):
    return a*x**2 + b*np.sin(x)

guess = [2, 2]
params, params_covariance = optimize.curve_fit(f2, xdata, ydata, guess)
params
``````
``````
array([  1.00348624,  10.37354547])
``````
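The covariance matrix returned by `curve_fit` can be turned into an uncertainty estimate for each parameter. The following is a sketch under stated assumptions: a seeded generator (the seed 0 is arbitrary) replaces `np.random.randn` so the run is reproducible, and `std_errors` is a name introduced here.

```python
import numpy as np
from scipy import optimize

def f2(x, a, b):
    return a * x**2 + b * np.sin(x)

# Seeded generator so the noisy data is reproducible.
rng = np.random.default_rng(0)
xdata = np.linspace(-10, 10, num=20)
# True parameters are a = 1 and b = 10, plus unit-variance Gaussian noise.
ydata = f2(xdata, 1.0, 10.0) + rng.standard_normal(xdata.size)

params, params_covariance = optimize.curve_fit(f2, xdata, ydata, p0=[2, 2])

# The square roots of the covariance diagonal estimate the
# standard error of each fitted parameter.
std_errors = np.sqrt(np.diag(params_covariance))
print("a, b:", params)
print("standard errors:", std_errors)
```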
``````
# Statistics
from scipy import stats

a = np.random.normal(size=1000)
bins = np.arange(-4, 5)
print(bins)
# density=True normalizes the histogram (it replaces the removed `normed` argument)
histogram = np.histogram(a, bins=bins, density=True)[0]
print(histogram)
# Bin centers
bins = 0.5*(bins[1:] + bins[:-1])
print(bins)
# pdf: probability density function
b = stats.norm.pdf(bins)
print("pdf:", b)
plt.plot(bins, histogram)
plt.plot(bins, b)
plt.show()
# Fit a normal distribution to the sample
loc, std = stats.norm.fit(a)
print("loc:"+str(loc)+"std:"+str(std))
# Median
np.median(a)
``````
``````
[-4 -3 -2 -1  0  1  2  3  4]
[ 0.001  0.025  0.137  0.339  0.34   0.136  0.02   0.002]
[-3.5 -2.5 -1.5 -0.5  0.5  1.5  2.5  3.5]
pdf: [ 0.00087268  0.0175283   0.1295176   0.35206533  0.35206533  0.1295176
  0.0175283   0.00087268]

loc:-0.00549513299797std:1.00725628853

-0.0037246310284498475
``````
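For a normal distribution, the fitted `loc` and `std` are the maximum-likelihood estimates, which coincide with the sample mean and the population standard deviation. A minimal sketch checking this, with a seeded generator (seed 0 is an arbitrary assumption) in place of `np.random.normal`:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
a = rng.normal(size=1000)

# The maximum-likelihood fit of a normal distribution is the sample mean
# and the population standard deviation (ddof=0).
loc, std = stats.norm.fit(a)
matches_moments = np.allclose([loc, std], [a.mean(), a.std()])

# The median is the 50th percentile.
median = np.median(a)
matches_percentile = np.isclose(median, np.percentile(a, 50))

print(loc, std, median)
```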

Code on GitHub

## References

Scipy Lecture Notes