6.13.1. scikits.learn.grid_search.GridSearchCV
- class scikits.learn.grid_search.GridSearchCV(estimator, param_grid, loss_func=None, score_func=None, fit_params={}, n_jobs=1, iid=True, refit=True, cv=None)
Grid search on the parameters of a classifier.
Important members are fit and predict.
GridSearchCV implements a “fit” method and a “predict” method like any classifier, except that the parameters of the classifier used to predict are optimized by cross-validation.
Parameters:
estimator: object type that implements the “fit” and “predict” methods
    An object of that type is instantiated for each grid point.
param_grid: dict
    A dictionary of parameters that are used to generate the grid (see the sketch after this parameter list).
loss_func: callable, optional
    Function that takes 2 arguments and compares them in order to evaluate the performance of prediction (small is good). If None is passed, the score of the estimator is maximized.
score_func: callable, optional
    Function that takes 2 arguments and compares them in order to evaluate the performance of prediction (big is good). If None is passed, the score of the estimator is maximized.
fit_params: dict, optional
    Parameters to pass to the fit method.
n_jobs: int, optional
    Number of jobs to run in parallel (default 1).
iid: boolean, optional
    If True, the data is assumed to be identically distributed across the folds, and the loss minimized is the total loss per sample, not the mean loss across the folds.
cv: cross-validation generator
    See the scikits.learn.cross_val module.
refit: boolean
    Refit the best estimator with the entire dataset.
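As a minimal sketch of what a param_grid can look like (the parameter names 'C' and 'gamma' below are only illustrative and must correspond to constructor parameters of the estimator actually passed in; the assumption is that the Cartesian product of the listed values is searched):

>>> # every combination of the listed values is one grid point,
>>> # so this grid describes 3 x 2 = 6 candidate parameter settings
>>> param_grid = {'C': [1, 10, 100], 'gamma': [0.001, 0.01]}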
Notes
The parameters selected are those that maximize the score on the left-out data, unless an explicit score_func is passed, in which case it is used instead (a sketch of a hand-written score_func follows the example below). If a loss function loss_func is passed, it overrides the score function and is minimized.
Examples
>>> from scikits.learn import svm, grid_search, datasets
>>> iris = datasets.load_iris()
>>> parameters = {'kernel':('linear', 'rbf'), 'C':[1, 10]}
>>> svr = svm.SVR()
>>> clf = grid_search.GridSearchCV(svr, parameters)
>>> clf.fit(iris.data, iris.target)
GridSearchCV(n_jobs=1, fit_params={}, loss_func=None, refit=True, cv=None,
       iid=True,
       estimator=SVR(kernel='rbf', C=1.0, probability=False, ...
       ...
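The score_func hook described above can also be exercised with a hand-written scorer. This is only a sketch: the argument order (true labels first, predictions second) and the use of svm.SVC as the classifier are assumptions, not something stated by this page.

>>> import numpy as np
>>> from scikits.learn import svm, grid_search, datasets
>>> def accuracy(y_true, y_pred):
...     # score_func contract: takes 2 arrays and returns a number where big is good
...     return np.mean(y_true == y_pred)
>>> iris = datasets.load_iris()
>>> clf = grid_search.GridSearchCV(svm.SVC(), {'C': [1, 10]},
...                                score_func=accuracy)
>>> fitted = clf.fit(iris.data, iris.target)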
Methods
fit(X[, y])    Run fit with all sets of parameters
score(X[, y])

- __init__(estimator, param_grid, loss_func=None, score_func=None, fit_params={}, n_jobs=1, iid=True, refit=True, cv=None)
- fit(X, y=None, **params)
Run fit with all sets of parameters
Returns the best classifier
Parameters:
X: array, [n_samples, n_features]
    Training vector, where n_samples is the number of samples and n_features is the number of features.
y: array, [n_samples] or None
    Target vector relative to X; None for unsupervised problems.