Artificial Intelligence: How to compute accuracy and ROC AUC for classification problems with sklearn

1. Setup

import numpy as np

from sklearn import metrics

y_true = np.array([1, 7, 4, 6, 3])

y_prediction = np.array([3, 7, 4, 6, 3])

2. Computing accuracy_score

acc = metrics.accuracy_score(y_true, y_prediction)

This works directly on the 1-D label arrays: 4 of the 5 predictions match, so acc is 0.8.
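As a quick sanity check, accuracy is simply the fraction of positions where the two arrays agree, so it can be reproduced with plain NumPy. A minimal sketch (the name manual_acc is only for illustration):

```python
import numpy as np
from sklearn import metrics

y_true = np.array([1, 7, 4, 6, 3])
y_prediction = np.array([3, 7, 4, 6, 3])

# accuracy_score returns the fraction of exactly matching labels
acc = metrics.accuracy_score(y_true, y_prediction)

# The same quantity by hand: mean of the element-wise comparison
manual_acc = np.mean(y_true == y_prediction)

print(acc, manual_acc)  # 4 of 5 labels match, so both are 0.8
```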

3. Computing roc_auc_score

The binary and multiclass cases expect labels with shape (n_samples,) while the multilabel case expects binary label indicators with shape (n_samples, n_classes).

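Because the predictions here are hard class labels rather than scores, the approach below passes both arrays to metrics.roc_auc_score in label-indicator form: a 2-D array in which each column stands for one class and each row marks, with a 1, the class that sample belongs to.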
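For comparison, the multiclass mode described in the quoted documentation keeps y_true one-dimensional and takes a score matrix of shape (n_samples, n_classes). The sketch below is only an illustration of that call: the score values are hypothetical (0.6 for the predicted class, 0.1 elsewhere), since the example in this post has no real probabilities, and the names y_score and ovr_auc are assumptions.

```python
import numpy as np
from sklearn import metrics

y_true = np.array([1, 7, 4, 6, 3])
y_prediction = np.array([3, 7, 4, 6, 3])

classes = np.unique(y_true)  # sorted labels: [1, 3, 4, 6, 7]

# Hypothetical scores: 0.6 for the predicted class, 0.1 for each other class,
# so every row sums to 1 as the multiclass mode requires.
pred_cols = np.searchsorted(classes, y_prediction)
y_score = np.full((len(y_true), len(classes)), 0.1)
y_score[np.arange(len(y_true)), pred_cols] = 0.6

# y_true stays 1-D; multi_class must be set explicitly for multiclass input
ovr_auc = metrics.roc_auc_score(y_true, y_score, multi_class="ovr", labels=classes)
print(ovr_auc)
```

With real models, y_score would normally come from something like predict_proba rather than being built by hand.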

from sklearn.preprocessing import OneHotEncoder

enc = OneHotEncoder(sparse_output=False)  # on scikit-learn older than 1.2, use sparse=False instead

enc.fit(y_true.reshape(-1, 1))

y_true_onehot = enc.transform(y_true.reshape(-1, 1))

y_predictions_onehot = enc.transform(y_prediction.reshape(-1, 1))

In [201]: y_true_onehot

Out[201]:

array([[1., 0., 0., 0., 0.],

[0., 0., 0., 0., 1.],

[0., 0., 1., 0., 0.],

[0., 0., 0., 1., 0.],

[0., 1., 0., 0., 0.]])

In [202]: y_predictions_onehot

Out[202]:

array([[0., 1., 0., 0., 0.],

[0., 0., 0., 0., 1.],

[0., 0., 1., 0., 0.],

[0., 0., 0., 1., 0.],

[0., 1., 0., 0., 0.]])

In [204]: enc.categories_

Out[204]: [array([1, 3, 4, 6, 7])]

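Combining enc.categories_ with y_true_onehot, the correspondence between y_true and y_true_onehot is as follows:

Class           1  3  4  6  7
true value: 1   1  0  0  0  0
true value: 7   0  0  0  0  1
true value: 4   0  0  1  0  0
true value: 6   0  0  0  1  0
true value: 3   0  1  0  0  0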

Likewise, the correspondence between y_prediction and y_predictions_onehot is:

Class                 1  3  4  6  7
Prediction value: 3   0  1  0  0  0
Prediction value: 7   0  0  0  0  1
Prediction value: 4   0  0  1  0  0
Prediction value: 6   0  0  0  1  0
Prediction value: 3   0  1  0  0  0

This explains the y_true_onehot and y_predictions_onehot outputs shown above.
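The column-to-class mapping can also be checked in code: the position of the 1 in each row indexes into enc.categories_[0]. The following is a small sketch of that round trip. It uses the encoder's default sparse output and converts it with .toarray() so it does not depend on the sparse / sparse_output keyword, and the names true_cols, pred_cols, recovered_true and recovered_pred are only illustrative.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

y_true = np.array([1, 7, 4, 6, 3])
y_prediction = np.array([3, 7, 4, 6, 3])

enc = OneHotEncoder()  # default sparse output, converted to dense below
enc.fit(y_true.reshape(-1, 1))
y_true_onehot = enc.transform(y_true.reshape(-1, 1)).toarray()
y_predictions_onehot = enc.transform(y_prediction.reshape(-1, 1)).toarray()

classes = enc.categories_[0]  # sorted class labels, one per column

# Column index of the 1 in each row
true_cols = y_true_onehot.argmax(axis=1)
pred_cols = y_predictions_onehot.argmax(axis=1)

# Indexing the class list with those columns recovers the original labels
recovered_true = classes[true_cols]
recovered_pred = classes[pred_cols]

print(true_cols, recovered_true)
print(pred_cols, recovered_pred)
```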

ensemble_auc = metrics.roc_auc_score(y_true_onehot, y_predictions_onehot)

In [200]: ensemble_auc

Out[200]: 0.875
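To see where 0.875 comes from, the per-class AUC values can be inspected by passing average=None; the default for indicator input is the macro average, i.e. the unweighted mean over the columns. A minimal sketch, which builds the indicator matrices by NumPy broadcasting instead of OneHotEncoder to stay self-contained (per_class_auc and macro_auc are illustrative names):

```python
import numpy as np
from sklearn import metrics

y_true = np.array([1, 7, 4, 6, 3])
y_prediction = np.array([3, 7, 4, 6, 3])
classes = np.array([1, 3, 4, 6, 7])  # same order as enc.categories_[0]

# Indicator matrices: entry (i, j) is 1.0 when sample i has class classes[j]
y_true_onehot = (y_true[:, None] == classes).astype(float)
y_predictions_onehot = (y_prediction[:, None] == classes).astype(float)

# One AUC per column (class) instead of a single averaged number
per_class_auc = metrics.roc_auc_score(
    y_true_onehot, y_predictions_onehot, average=None
)
macro_auc = per_class_auc.mean()

print(per_class_auc)  # class 1: 0.5, class 3: 0.875, classes 4, 6, 7: 1.0
print(macro_auc)      # 0.875
```

Class 1 is never predicted, so its column of scores is constant and contributes 0.5; class 3 is predicted once wrongly and once correctly, giving 0.875. Since the inputs are hard 0/1 predictions rather than probabilities, this AUC is a coarse summary and may differ from one computed on real scores.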
