
sklearn.cross_validation.KFold

class sklearn.cross_validation.KFold(n, n_folds=3, indices=None, shuffle=False, random_state=None)

K-Folds cross validation iterator.

Provides train/test indices to split data into train and test sets. Splits the dataset into k consecutive folds (without shuffling by default).

Each fold is then used as a validation set once, while the k - 1 remaining folds form the training set.
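The rotation described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's implementation: the helper name `rotate_folds` and the pre-partitioned `folds` list are hypothetical and chosen for the example.

```python
def rotate_folds(folds):
    """Yield (train, test) index lists, using each fold as the test set once.

    `folds` is a list of index lists. Hypothetical helper, for illustration only.
    """
    for i, test in enumerate(folds):
        # The training set is every index from the other k - 1 folds.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, list(test)


folds = [[0, 1], [2, 3], [4, 5]]  # k = 3 consecutive folds over 6 samples
splits = list(rotate_folds(folds))
for train, test in splits:
    print("TRAIN:", train, "TEST:", test)
```

Every index appears in exactly one test set, and each train/test pair together covers all samples.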

Parameters:

n : int

Total number of elements.

n_folds : int, default=3

Number of folds. Must be at least 2.

shuffle : boolean, optional

Whether to shuffle the data before splitting into batches.

random_state : None, int or RandomState

Pseudo-random number generator state used for random sampling. If None, the default numpy RNG is used for shuffling.
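The effect of a fixed random state on shuffling can be sketched as follows. This is a rough illustration, not how KFold itself shuffles: the standard-library `random` module stands in for the numpy RNG mentioned above, and the helper name `shuffled_indices` and its `seed` argument are hypothetical.

```python
import random


def shuffled_indices(n, seed=None):
    """Return a permutation of range(n). Hypothetical helper, for illustration only."""
    indices = list(range(n))
    # A fixed seed gives a reproducible order; seed=None draws from system entropy.
    random.Random(seed).shuffle(indices)
    return indices


a = shuffled_indices(6, seed=0)
b = shuffled_indices(6, seed=0)
print(a == b)  # True: the same seed yields the same order
```

Shuffled indices like these would then be cut into folds in place of the consecutive indices 0..n-1.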

See also

StratifiedKFold
Takes label information into account to avoid building folds with imbalanced class distributions (for binary or multiclass classification tasks).

Notes

The first n % n_folds folds have size n // n_folds + 1, other folds have size n // n_folds.
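As a worked example of the size rule above, the following sketch computes the fold sizes for a given n and n_folds. The helper name `fold_sizes` is hypothetical and not part of the library.

```python
def fold_sizes(n, n_folds):
    """Return the list of fold sizes under the rule stated in the Notes.

    Hypothetical helper, for illustration only.
    """
    base, extra = divmod(n, n_folds)  # base = n // n_folds, extra = n % n_folds
    # The first `extra` folds each get one additional element.
    return [base + 1 if i < extra else base for i in range(n_folds)]


sizes = fold_sizes(10, 3)
print(sizes)       # [4, 3, 3]
print(sum(sizes))  # 10
```

With n = 10 and n_folds = 3, n % n_folds is 1, so only the first fold receives the extra element.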

Examples

>>> import numpy as np
>>> from sklearn import cross_validation
>>> X = np.array([[1, 2], [3, 4], [1, 2], [3, 4]])
>>> y = np.array([1, 2, 3, 4])
>>> kf = cross_validation.KFold(4, n_folds=2)
>>> len(kf)
2
>>> print(kf)  
sklearn.cross_validation.KFold(n=4, n_folds=2, shuffle=False,
                               random_state=None)
>>> for train_index, test_index in kf:
...    print("TRAIN:", train_index, "TEST:", test_index)
...    X_train, X_test = X[train_index], X[test_index]
...    y_train, y_test = y[train_index], y[test_index]
TRAIN: [2 3] TEST: [0 1]
TRAIN: [0 1] TEST: [2 3]