sklearn.multiclass.OneVsOneClassifier
- class sklearn.multiclass.OneVsOneClassifier(estimator, n_jobs=1)
One-vs-one multiclass strategy
This strategy consists of fitting one classifier per pair of classes. At prediction time, the class that received the most votes is selected. Since it requires fitting n_classes * (n_classes - 1) / 2 classifiers, this method is usually slower than one-vs-the-rest, due to its O(n_classes^2) complexity. However, it can be advantageous for algorithms such as kernel algorithms which don't scale well with n_samples: each individual learning problem only involves a small subset of the data, whereas with one-vs-the-rest the complete dataset is used n_classes times.
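A minimal usage sketch (the iris dataset and LinearSVC are illustrative stand-ins for any multi-class dataset and base estimator):

>>> from sklearn.datasets import load_iris
>>> from sklearn.multiclass import OneVsOneClassifier
>>> from sklearn.svm import LinearSVC
>>> iris = load_iris()
>>> X, y = iris.data, iris.target
>>> clf = OneVsOneClassifier(LinearSVC()).fit(X, y)
>>> len(clf.estimators_)  # 3 classes -> 3 * 2 / 2 = 3 pairwise classifiers
3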
Parameters: estimator : estimator object
An estimator object implementing fit and one of decision_function or predict_proba.
n_jobs : int, optional, default: 1
The number of jobs to use for the computation. If -1, all CPUs are used. If 1 is given, no parallel computing code is used at all, which is useful for debugging. For n_jobs below -1, (n_cpus + 1 + n_jobs) CPUs are used; thus for n_jobs = -2, all CPUs but one are used.
Attributes: estimators_ : list of n_classes * (n_classes - 1) / 2 estimators
Estimators used for predictions.
classes_ : numpy array of shape [n_classes]
Array containing labels.
Methods
decision_function(X)  Decision function for the OneVsOneClassifier.
fit(X, y)  Fit underlying estimators.
get_params([deep])  Get parameters for this estimator.
predict(X)  Estimate the best class label for each sample in X.
score(X, y[, sample_weight])  Returns the mean accuracy on the given test data and labels.
set_params(**params)  Set the parameters of this estimator.
- decision_function(X)
Decision function for the OneVsOneClassifier.
The decision values for the samples are computed by adding a normalized sum of pairwise classification confidence levels to the votes. This disambiguates between classes when the raw vote counts alone would result in a tie.
Parameters: X : array-like, shape = [n_samples, n_features]
Returns: Y : array-like, shape = [n_samples, n_classes]
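For example, continuing the iris sketch above, the decision values have one column per class, not per class pair:

>>> import numpy as np
>>> scores = clf.decision_function(X[:5])
>>> scores.shape  # (n_samples, n_classes)
(5, 3)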
- fit(X, y)
Fit underlying estimators.
Parameters: X : (sparse) array-like, shape = [n_samples, n_features]
Data.
y : array-like, shape = [n_samples]
Multi-class targets.
Returns: self
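A small sketch showing that fit accepts arbitrary (e.g. string) class labels; the tiny toy dataset is purely illustrative:

>>> from sklearn.multiclass import OneVsOneClassifier
>>> from sklearn.svm import LinearSVC
>>> X_toy = [[0.], [1.], [2.], [3.]]
>>> y_toy = ["a", "a", "b", "c"]
>>> ovo = OneVsOneClassifier(LinearSVC()).fit(X_toy, y_toy)
>>> len(ovo.estimators_)  # one estimator per class pair: (a,b), (a,c), (b,c)
3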
- get_params(deep=True)
Get parameters for this estimator.
Parameters: deep : boolean, optional
If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params : mapping of string to any
Parameter names mapped to their values.
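For instance, with deep=True (the default) the parameters of the wrapped estimator are included under the estimator__ prefix:

>>> ovo = OneVsOneClassifier(LinearSVC(C=2.0))
>>> params = ovo.get_params()
>>> params['n_jobs']
1
>>> params['estimator__C']  # nested parameter of the wrapped LinearSVC
2.0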
- predict(X)
Estimate the best class label for each sample in X.
This is implemented as argmax(decision_function(X), axis=1), which returns the label of the class that accumulated the most votes from the underlying estimators, each of which predicts the outcome of a decision for one possible class pair.
Parameters: X : (sparse) array-like, shape = [n_samples, n_features]
Data.
Returns: y : numpy array of shape [n_samples]
Predicted multi-class targets.
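The documented equivalence can be checked directly (reusing the fitted iris classifier from above):

>>> pred = clf.predict(X[:5])
>>> via_argmax = clf.classes_[np.argmax(clf.decision_function(X[:5]), axis=1)]
>>> np.array_equal(pred, via_argmax)
True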
- score(X, y, sample_weight=None)
Returns the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy, which is a harsh metric since it requires that the entire label set for each sample be predicted correctly.
Parameters: X : array-like, shape = (n_samples, n_features)
Test samples.
y : array-like, shape = (n_samples,) or (n_samples, n_outputs)
True labels for X.
sample_weight : array-like, shape = [n_samples], optional
Sample weights.
Returns: score : float
Mean accuracy of self.predict(X) w.r.t. y.
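A sketch relating score to a manual accuracy computation (train_test_split assumes scikit-learn >= 0.18; on older versions it lives in sklearn.cross_validation):

>>> from sklearn.model_selection import train_test_split
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
>>> clf2 = OneVsOneClassifier(LinearSVC()).fit(X_train, y_train)
>>> clf2.score(X_test, y_test) == np.mean(clf2.predict(X_test) == y_test)
True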
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter>, so that it's possible to update each component of a nested object.
Returns: self
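For example, nested parameters of the wrapped estimator can be updated through the same '__' syntax used by get_params (a sketch; set_params returns the estimator itself, so the result is assigned to _ here):

>>> ovo = OneVsOneClassifier(LinearSVC())
>>> _ = ovo.set_params(n_jobs=2, estimator__C=0.5)
>>> ovo.estimator.C
0.5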