6.5.2. scikits.learn.neighbors.NeighborsRegressor

class scikits.learn.neighbors.NeighborsRegressor(n_neighbors=5, mode='mean', algorithm='auto', window_size=1)

Regression based on the k-Nearest Neighbor algorithm.

The target is predicted by local interpolation of the targets associated with the k nearest neighbors in the training set.

The way the prediction is computed can be set via the parameter mode: ‘barycenter’ will apply the weights that best reconstruct the query point from its neighbors, while ‘mean’ will apply a constant weight to each neighbor.
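The following rough sketch illustrates the two weighting schemes on a toy query with NumPy; it is only an illustration of the idea, not the library's implementation (the small regularization term in the barycenter solve is an assumption to keep the toy system well-posed).

import numpy as np

# Toy setting: query point x = [1.5], its two nearest neighbors and their targets
X_nb = np.array([[1.], [2.]])
y_nb = np.array([0., 1.])
x = np.array([1.5])

# 'mean' mode: every neighbor gets the same constant weight 1/k
w_mean = np.ones(len(X_nb)) / len(X_nb)
print(np.dot(w_mean, y_nb))           # 0.5

# 'barycenter' mode: weights that best reconstruct x from its neighbors,
# constrained to sum to one (LLE-style solve on the local Gram matrix)
Z = X_nb - x                          # neighbors centered on the query point
G = np.dot(Z, Z.T)                    # local Gram matrix
w_bary = np.linalg.solve(G + 1e-3 * np.eye(len(X_nb)), np.ones(len(X_nb)))
w_bary /= w_bary.sum()
print(np.dot(w_bary, y_nb))           # also 0.5 here: x lies midway between its neighbors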

Parameters :

n_neighbors : int, optional

Default number of neighbors. Defaults to 5.

window_size : int, optional

Window size passed to BallTree

mode : {‘mean’, ‘barycenter’}, optional

Weights to apply to labels.

algorithm : {‘auto’, ‘ball_tree’, ‘brute’, ‘brute_inplace’}, optional

Algorithm used to compute the nearest neighbors. ‘ball_tree’ will construct a BallTree, while ‘brute’ and ‘brute_inplace’ will perform brute-force search. ‘auto’ will choose the most appropriate algorithm based on the current dataset.
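As a hedged usage sketch (relying only on the constructor, fit and predict documented on this page), the algorithm choice only affects how neighbors are searched, not what is predicted:

from scikits.learn.neighbors import NeighborsRegressor

# Explicitly request a BallTree-backed search instead of letting 'auto' decide
neigh = NeighborsRegressor(n_neighbors=2, algorithm='ball_tree')
neigh.fit([[0], [1], [2], [3]], [0., 0., 1., 1.])
print(neigh.predict([[2.5]]))   # mean of the targets of the two nearest training points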

Notes

http://en.wikipedia.org/wiki/K-nearest_neighbor_algorithm

Examples

>>> X = [[0], [1], [2], [3]]
>>> y = [0, 0, 1, 1]
>>> from scikits.learn.neighbors import NeighborsRegressor
>>> neigh = NeighborsRegressor(n_neighbors=2)
>>> neigh.fit(X, y)
NeighborsRegressor(n_neighbors=2, window_size=1, mode='mean',
          algorithm='auto')
>>> print neigh.predict([[1.5]])
[ 0.5]

Methods

fit(X, Y, **params) Fit the model using X, y as training data.
kneighbors(data[, return_distance]) Finds the K-neighbors of a point.
predict(X, **params) Predict the target for the provided data.
score(X, y) Returns the mean error rate on the given test data and labels.
__init__(n_neighbors=5, mode='mean', algorithm='auto', window_size=1)

fit(X, Y, **params)

Fit the model using X, y as training data.

Parameters :

X : array-like, shape = [n_samples, n_features]

Training data.

y : array-like, shape = [n_samples]

Target values.

params : keyword arguments, optional

Overwrite keywords passed to __init__.
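A minimal sketch of the keyword-override behaviour described above, assuming fit forwards these keywords exactly as the parameter list states:

from scikits.learn.neighbors import NeighborsRegressor

neigh = NeighborsRegressor(n_neighbors=5)
# n_neighbors passed here is meant to overwrite the value given to __init__
neigh.fit([[0], [1], [2], [3]], [0., 0., 1., 1.], n_neighbors=2)
print(neigh.predict([[1.5]]))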

kneighbors(data, return_distance=True, **params)

Finds the K-neighbors of a point.

Returns indices of and distances to the neighbors of each query point.

Parameters :

data : array-like

The query point or points.

n_neighbors : int

Number of neighbors to get (default is the value passed to the constructor).

return_distance : boolean, optional. Defaults to True.

If False, distances will not be returned

Returns :

dist : array

Array representing the distances to the query point(s), only present if return_distance=True.

ind : array

Indices of the nearest points in the population matrix.

Examples

In the following example, we construct a NeighborsClassifier from an array representing our data set and ask which point is closest to [1, 1, 1]:

>>> samples = [[0., 0., 0.], [0., .5, 0.], [1., 1., .5]]
>>> labels = [0, 0, 1]
>>> from scikits.learn.neighbors import NeighborsClassifier
>>> neigh = NeighborsClassifier(n_neighbors=1)
>>> neigh.fit(samples, labels)
NeighborsClassifier(n_neighbors=1, window_size=1, algorithm='auto')
>>> print neigh.kneighbors([1., 1., 1.])
(array([ 0.5]), array([2]))

As you can see, it returns [0.5] and [2], which means that the nearest element is at distance 0.5 and is the third element of samples (indices start at 0). You can also query for multiple points:

>>> X = [[0., 1., 0.], [1., 0., 1.]]
>>> neigh.kneighbors(X, return_distance=False)
array([[1],
       [2]])

predict(X, **params)

Predict the target for the provided data.

Parameters :

X : array

A 2-D array representing the test data.

n_neighbors : int, optional

Number of neighbors to get (default is the value passed to the constructor).

Returns :

y : array

List of target values (one for each data sample).
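A short, hedged sketch of a multi-point query and of the per-call n_neighbors override listed above (the override is taken from this parameter list, not verified beyond it):

from scikits.learn.neighbors import NeighborsRegressor

neigh = NeighborsRegressor(n_neighbors=2)
neigh.fit([[0], [1], [2], [3]], [0., 0., 1., 1.])
print(neigh.predict([[0.5], [1.5], [2.5]]))   # one predicted target per query row
print(neigh.predict([[1.5]], n_neighbors=3))  # per-call override of n_neighbors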

score(X, y)

Returns the mean error rate on the given test data and labels.

Parameters :

X : array-like, shape = [n_samples, n_features]

Test samples.

y : array-like, shape = [n_samples]

Target values for X.

Returns :

z : float
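A hedged usage sketch for score on held-out data; no particular return value is assumed beyond it being the float z described above:

from scikits.learn.neighbors import NeighborsRegressor

neigh = NeighborsRegressor(n_neighbors=2)
neigh.fit([[0], [1], [2], [3]], [0., 0., 1., 1.])
print(neigh.score([[0.5], [2.5]], [0., 1.]))   # evaluate against known targets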