# Important Machine Learning Algorithms

1. Decision tree.

2. SVM.

3. Naive Bayes.

4. KNN.

5. K-means.

6. Random forest.

1. Decision tree:

Python code

    # Import the library (plus any others you need, such as pandas or numpy)
    from sklearn import tree

    # Assumes you have X (predictors) and y (target) for the training set,
    # and x_test (predictors) for the test set

    # Create the tree object for classification. The criterion can be
    # 'gini' (the default) or 'entropy' (information gain)
    model = tree.DecisionTreeClassifier(criterion='gini')
    # For regression, use: model = tree.DecisionTreeRegressor()

    # Train the model on the training set and check the score
    model.fit(X, y)
    model.score(X, y)

    # Predict the output
    predicted = model.predict(x_test)
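
The `criterion` option chooses how the tree measures the impurity of a node. As a rough illustration of what the two choices compute, here is a minimal sketch in plain Python. The helper names `gini` and `entropy` and the two toy nodes are assumptions made for this example, not part of scikit-learn.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: sum of p * log2(1 / p) over the classes."""
    n = len(labels)
    return sum((count / n) * log2(n / count) for count in Counter(labels).values())


# Hypothetical node contents, invented for illustration
pure_node = ["yes", "yes", "yes", "yes"]
mixed_node = ["yes", "yes", "no", "no"]

print(gini(pure_node), entropy(pure_node))    # 0.0 0.0
print(gini(mixed_node), entropy(mixed_node))  # 0.5 1.0
```

Both measures are zero for a node that holds a single class and largest for an even mix, so a split that lowers either one produces purer child nodes.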


2. Support vector machine (SVM):

Python code

    # Import the library
    from sklearn import svm

    # Assumes you have X (predictors) and y (target) for the training set,
    # and x_test (predictors) for the test set

    # Create the SVM classification object. Many options are available;
    # this is the simplest form for classification
    model = svm.SVC()

    # Train the model on the training set and check the score
    model.fit(X, y)
    model.score(X, y)

    # Predict the output
    predicted = model.predict(x_test)
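
The snippet above leaves every option at its default. As a hedged illustration of what changing an option looks like, the sketch below trains `SVC` with two different kernels on synthetic data. The data set, the train/test split, and the choice of kernels are assumptions made for this example only.

```python
from sklearn import svm
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split

# Synthetic two-class data, invented for illustration
X, y = make_blobs(n_samples=200, centers=[[-3, -3], [3, 3]],
                  cluster_std=1.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

scores = {}
for kernel in ("linear", "rbf"):
    # C sets the regularization strength; 1.0 is the library default
    model = svm.SVC(kernel=kernel, C=1.0)
    model.fit(X_train, y_train)
    # Score on held-out data rather than on the training set
    scores[kernel] = model.score(X_test, y_test)

print(scores)
```

Scoring on a held-out split, rather than on the training data, gives a fairer picture of how each kernel generalizes.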


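3. Naive Bayes:

Naive Bayes is built on Bayes' theorem, P(c|x) = P(x|c) P(c) / P(x), where:

- P(c|x) is the posterior probability of the class (target) given the predictor (attribute).

- P(c) is the prior probability of the class.

- P(x|c) is the likelihood, that is, the probability of the predictor given the class.

- P(x) is the prior probability of the predictor.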
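
To see how these four quantities fit together, here is a small worked example in plain Python. It applies Bayes' theorem, posterior = likelihood × prior / P(x), to a single predictor. The class names and all probabilities are hypothetical numbers invented for illustration.

```python
# Hypothetical numbers, invented for illustration
prior = {"spam": 0.3, "ham": 0.7}       # P(c)
likelihood = {"spam": 0.8, "ham": 0.1}  # P(x|c), x = "message contains 'offer'"

# P(x): total probability of the predictor, summed over the classes
evidence = sum(likelihood[c] * prior[c] for c in prior)

# P(c|x) = P(x|c) * P(c) / P(x)
posterior = {c: likelihood[c] * prior[c] / evidence for c in prior}

# The predicted class is the one with the highest posterior
predicted_class = max(posterior, key=posterior.get)
print(posterior, predicted_class)
```

With several predictors, Naive Bayes multiplies the per-predictor likelihoods together, which is where the "naive" assumption of independence between predictors comes in.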

Python code

    # Import the library
    from sklearn.naive_bayes import GaussianNB

    # Assumes you have X (predictors) and y (target) for the training set,
    # and x_test (predictors) for the test set

    # Create the Gaussian Naive Bayes classification object. Other variants
    # exist for other feature distributions, such as multinomial or
    # Bernoulli Naive Bayes
    model = GaussianNB()

    # Train the model on the training set
    model.fit(X, y)

    # Predict the output
    predicted = model.predict(x_test)


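4. KNN (k-nearest neighbors):

KNN maps easily onto real life. If you want to learn about someone you know nothing about, you might look at their close friends and the circles they move in to find out more about them!

KNN is computationally expensive, because each prediction compares the query against the stored training data.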

Python code

    # Import the library
    from sklearn.neighbors import KNeighborsClassifier

    # Assumes you have X (predictors) and y (target) for the training set,
    # and x_test (predictors) for the test set

    # Create the KNeighbors classifier object
    # (the default value for n_neighbors is 5)
    model = KNeighborsClassifier(n_neighbors=6)

    # Train the model on the training set
    model.fit(X, y)

    # Predict the output
    predicted = model.predict(x_test)
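
To make the "ask the neighbors" idea concrete, here is a minimal from-scratch sketch in plain Python. It is a brute-force illustration, not how scikit-learn implements the classifier. The function name `knn_predict` and the toy data are assumptions made for this example.

```python
from collections import Counter
from math import dist


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Every prediction measures the distance to every training point,
    # which is why KNN gets expensive on large training sets
    distances = sorted(
        (dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical toy data, invented for illustration
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (2, 2)))  # a
print(knn_predict(points, labels, (7, 9)))  # b
```

Note that nothing is learned ahead of time: the training data is simply stored, and all the work happens at prediction time.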


5. K-means:

How K-means forms clusters:

K-means picks k points, called centroids, one for each cluster.

Python code

    # Import the library
    from sklearn.cluster import KMeans

    # Assumes you have X (attributes) for the training set
    # and x_test (attributes) for the test set

    # Create the KMeans clustering object
    model = KMeans(n_clusters=3, random_state=0)

    # Fit the model on the training set
    model.fit(X)

    # Predict the cluster of each test point
    predicted = model.predict(x_test)
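
Picking the centroids is only the starting point. In the standard procedure (Lloyd's algorithm), each point is then assigned to its nearest centroid, each centroid moves to the mean of its assigned points, and the two steps repeat. The sketch below shows this in plain Python. The function name `k_means`, the toy data, and the hand-picked starting centroids are simplifying assumptions for illustration, not the way scikit-learn initializes or runs `KMeans`.

```python
from math import dist


def k_means(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate assignment and centroid update steps."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid)
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical toy data and starting centroids, invented for illustration
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centroids, clusters = k_means(points, centroids=[(0, 0), (10, 10)])
print(centroids)
print(clusters)
```

On this toy data the assignments stop changing after the first pass, so the remaining iterations leave the centroids where they are.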


6. Random forest:

Python code

    # Import the library
    from sklearn.ensemble import RandomForestClassifier

    # Assumes you have X (predictors) and y (target) for the training set,
    # and x_test (predictors) for the test set

    # Create the random forest object
    model = RandomForestClassifier()

    # Train the model on the training set
    model.fit(X, y)

    # Predict the output
    predicted = model.predict(x_test)
