C5.1

Example 2.51: Classifying the Iris dataset with a kNN classifier

Textbook page: Section 5.4.1 of the textbook, page 149
Task: Classify the Iris dataset with a kNN classifier
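Python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn import datasets
Iris_ds = datasets.load_iris()
X = Iris_ds.data    # 150 samples x 4 input attributes
y = Iris_ds.target  # class labels 0, 1, 2
print(y)
m = KNeighborsClassifier(n_neighbors=3)  # set the number of neighbors
m.fit(X, y)
pred = m.predict(X)  # predictions on the training samples themselves
print(pred)
# Use np.array here: np.matrix is discouraged by NumPy and rejected by recent scikit-learn releases.
newX = np.array([[4.8, 3.3, 1.5, 0.3], [5.6, 2.9, 3.5, 1.3], [6.8, 3.2, 4.6, 1.4], [6.9, 3.1, 5.2, 2.3]])
print(m.predict(newX))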
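To make explicit what `predict` does with `n_neighbors = 3`, the following is a minimal NumPy sketch of the kNN rule, assuming plain Euclidean distance and an unweighted majority vote. The helper name `knn_predict` is hypothetical and is not part of scikit-learn or orsci; it only illustrates the idea and makes no attempt at efficiency.

```python
import numpy as np
from sklearn import datasets


def knn_predict(train_X, train_y, query_X, k=3):
    """Classify each query row by majority vote among its k nearest training rows."""
    predictions = []
    for q in np.asarray(query_X, dtype=float):
        # Euclidean distance from the query point to every training sample
        dists = np.sqrt(((train_X - q) ** 2).sum(axis=1))
        # Indices of the k closest training samples (stable sort keeps ties in index order)
        nearest = np.argsort(dists, kind="stable")[:k]
        # Count votes per class; argmax picks the smallest label if the vote is tied
        votes = np.bincount(train_y[nearest])
        predictions.append(int(votes.argmax()))
    return np.array(predictions)


iris = datasets.load_iris()
new_X = [[4.8, 3.3, 1.5, 0.3], [5.6, 2.9, 3.5, 1.3],
         [6.8, 3.2, 4.6, 1.4], [6.9, 3.1, 5.2, 2.3]]
manual_pred = knn_predict(iris.data, iris.target, new_X, k=3)
print(manual_pred)
```

For these four points the sketch should agree with the classifier's result in the last line of the listed output.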
C++
#include <iostream>
#include "orsci.h"
#include "orsci_dm.h"
using namespace orsci;
using namespace dm;
int main()
{
    mdouble X = dmt::dataset::iris::iris_X(); // the built-in dataset library loads the 4 input attributes directly
    vint y = dmt::dataset::iris::iris_y();    // class labels
    dmt::classifier::TkNN m; // define the kNN model
    m.train(y, X);
    measure_dist mDist;
    mDist.set_euclid(); // use Euclidean distance
    colint pred = m.predict(X, mDist, 3, 5, 0.5);
    std::cout << pred.T() << std::endl;
    std::cout << "errorCount = " << (pred - y).count() << std::endl;
    // Approach 1: define the test-set matrix
    mdouble newX = "[4.8, 3.3, 1.5, 0.3; 5.6, 2.9, 3.5, 1.3; 6.8, 3.2, 4.6, 1.4; 6.9, 3.1, 5.2, 2.3]";
    std::cout << m.predict(newX, mDist, 3, 5, 0.5) << std::endl;
    return 0;
}
Output

[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 2 1
1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 1 2 2 2 2
2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2]
[0 1 1 2]

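Books: Jiang Wei (姜维). Data Analysis and Data Mining (《数据分析与数据挖掘》) and Data Analysis and Data Mining: Modeling and Tools (《数据分析与数据挖掘建模与工具》), Publishing House of Electronics Industry, 2023, 2024.
Software: Python; C++ (with the add-on orsci package).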