| @@ -33,7 +33,8 @@ | |||
| "4. 选取距离最小的`k`个点;\n", | |||
| "5. 确定前`k`个点所在类别的出现频率;\n", | |||
| "6. 返回前`k`个点中出现频率最高的类别作为测试数据的预测分类。\n", | |||
| "\n" | |||
| "\n", | |||
| "上述的处理过程,难点有哪些?" | |||
| ] | |||
| }, | |||
| { | |||
| @@ -45,7 +46,7 @@ | |||
| }, | |||
| { | |||
| "cell_type": "code", | |||
| "execution_count": 42, | |||
| "execution_count": 1, | |||
| "metadata": {}, | |||
| "outputs": [ | |||
| { | |||
| @@ -121,12 +122,12 @@ | |||
| "cell_type": "markdown", | |||
| "metadata": {}, | |||
| "source": [ | |||
| "## 3. Simple Program" | |||
| "## 3. 最简单的程序实现" | |||
| ] | |||
| }, | |||
| { | |||
| "cell_type": "code", | |||
| "execution_count": 17, | |||
| "execution_count": 4, | |||
| "metadata": {}, | |||
| "outputs": [ | |||
| { | |||
| @@ -171,7 +172,7 @@ | |||
| }, | |||
| { | |||
| "cell_type": "code", | |||
| "execution_count": 24, | |||
| "execution_count": 5, | |||
| "metadata": {}, | |||
| "outputs": [ | |||
| { | |||
| @@ -193,14 +194,14 @@ | |||
| }, | |||
| { | |||
| "cell_type": "code", | |||
| "execution_count": 31, | |||
| "execution_count": 6, | |||
| "metadata": {}, | |||
| "outputs": [ | |||
| { | |||
| "name": "stdout", | |||
| "output_type": "stream", | |||
| "text": [ | |||
| "Test Accuracy: 95.734597%\n" | |||
| "Test Accuracy: 96.682464%\n" | |||
| ] | |||
| } | |||
| ], | |||
| @@ -218,12 +219,12 @@ | |||
| "cell_type": "markdown", | |||
| "metadata": {}, | |||
| "source": [ | |||
| "## 4. Complex Program" | |||
| "## 4. 通过类实现kNN程序" | |||
| ] | |||
| }, | |||
| { | |||
| "cell_type": "code", | |||
| "execution_count": 43, | |||
| "execution_count": 7, | |||
| "metadata": {}, | |||
| "outputs": [], | |||
| "source": [ | |||
| @@ -277,15 +278,15 @@ | |||
| }, | |||
| { | |||
| "cell_type": "code", | |||
| "execution_count": 44, | |||
| "execution_count": 17, | |||
| "metadata": {}, | |||
| "outputs": [ | |||
| { | |||
| "name": "stdout", | |||
| "output_type": "stream", | |||
| "text": [ | |||
| "train accuracy: 0.986\n", | |||
| "test accuracy: 0.967\n" | |||
| "train accuracy: 98.568507 %\n", | |||
| "test accuracy: 96.682464 %\n" | |||
| ] | |||
| } | |||
| ], | |||
| @@ -296,19 +297,18 @@ | |||
| "\n", | |||
| "# knn classifier\n", | |||
| "clf = KNN(k=3)\n", | |||
| "acc = clf.fit(x_train, y_train).score()\n", | |||
| "\n", | |||
| "print('train accuracy: {:.3}'.format(clf.score()))\n", | |||
| "train_acc = clf.fit(x_train, y_train).score() * 100.0\n", | |||
| "test_acc = clf.score(y_test, y_test_pred) * 100.0\n", | |||
| "\n", | |||
| "y_test_pred = clf.predict(x_test)\n", | |||
| "print('test accuracy: {:.3}'.format(clf.score(y_test, y_test_pred)))" | |||
| "print('train accuracy: %f %%' % train_acc)\n", | |||
| "print('test accuracy: %f %%' % test_acc)" | |||
| ] | |||
| }, | |||
| { | |||
| "cell_type": "markdown", | |||
| "metadata": {}, | |||
| "source": [ | |||
| "## 4. sklearn program" | |||
| "## 5. sklearn program" | |||
| ] | |||
| }, | |||
| { | |||
| @@ -426,13 +426,14 @@ | |||
| "cell_type": "markdown", | |||
| "metadata": {}, | |||
| "source": [ | |||
| "## 5. 深入思考\n", | |||
| "## 6. 深入思考\n", | |||
| "\n", | |||
| "* 如果输入的数据非常多,怎么快速进行距离计算?\n", | |||
| " - kd-tree\n", | |||
| " - Fast Library for Approximate Nearest Neighbors (FLANN)\n", | |||
| "* 如何选择最好的`k`?\n", | |||
| " - https://zhuanlan.zhihu.com/p/143092725" | |||
| " - https://zhuanlan.zhihu.com/p/143092725\n", | |||
| "* kNN存在的问题?" | |||
| ] | |||
| }, | |||
| { | |||