0.6¶
scikits.learn 0.6 was released on december 2010. It is marked by the inclusion of several new modules and a general renaming of old ones. It is also marked by the inclusion of new example, including applications to real-world datasets.
Changelog¶
- New stochastic gradient descent module by Peter Prettenhofer. The module comes with complete documentation and examples.
- Improved svm module: memory consumption has been reduced by 50%, heuristic to automatically set class weights, possibility to assign weights to samples (see SVM: Weighted samples for an example).
- New Gaussian Processes module by Vincent Dubourg. This module also has great documentation and some very neat examples. See Gaussian Processes regression: basic introductory example or Gaussian Processes classification example: exploiting the probabilistic output for a taste of what can be done.
- It is now possible to use liblinear’s Multi-class SVC (option multi_class in svm.LinearSVC)
- New features and performance improvements of text feature extraction.
- Improved sparse matrix support, both in main classes (grid_search.GridSearchCV) as in modules scikits.learn.svm.sparse and scikits.learn.linear_model.sparse.
- Lots of cool new examples and a new section that uses real-world datasets was created. These include: Faces recognition example using eigenfaces and SVMs, Species distribution modeling, Libsvm GUI, Wikipedia princial eigenvector and others.
- Faster Least Angle Regression algorithm. It is now 2x faster than the R version on worst case and up to 10x times faster on some cases.
- Faster coordinate descent algorithm. In particular, the full path version of lasso (linear_model.lasso_path()) is more than 200x times faster than before.
- It is now possible to get probability estimates from a linear_model.LogisticRegression model.
- module renaming: the glm module has been renamed to linear_model, the gmm module has been included into the more general mixture model and the sgd module has been included in linear_model.
- Lots of bug fixes and documentation improvements.
People¶
People that made this release possible preceeded by number of commits:
- 207 Olivier Grisel
- 167 Fabian Pedregosa
- 97 Peter Prettenhofer
- 68 Alexandre Gramfort
- 59 Mathieu Blondel
- 55 Gael Varoquaux
- 33 Vincent Dubourg
- 21 Ron Weiss
- 9 Bertrand Thirion
- 3 Alexandre Passos
- 3 Anne-Laure Fouque
- 2 Ronan Amicel
- 1 Christian Osendorfer
0.5¶
Changelog¶
New classes¶
- Support for sparse matrices in some classifiers of modules svm and linear_model (see svm.sparse.SVC, svm.sparse.SVR, svm.sparse.LinearSVC, linear_model.sparse.Lasso, linear_model.sparse.ElasticNet)
- New pipeline.Pipeline object to compose different estimators.
- Recursive Feature Elimination routines in module Feature selection.
- Addition of various classes capable of cross validation in the linear_model module (linear_model.LassoCV, linear_model.ElasticNetCV, etc.).
- New, more efficient LARS algorithm implementation. The Lasso variant of the algorithm is also implemented. See linear_model.lars_path, linear_model.LARS and linear_model.LassoLARS.
- New Hidden Markov Models module (see classes hmm.GaussianHMM, hmm.MultinomialHMM, hmm.GMMHMM)
- New module feature_extraction (see class reference)
- New FastICA algorithm in module scikits.learn.fastica
Documentation¶
- Improved documentation for many modules, now separating narrative documentation from the class reference. As an example, see documentation for the SVM module and the complete class reference.
Fixes¶
- API changes: adhere variable names to PEP-8, give more meaningful names.
- Fixes for svm module to run on a shared memory context (multiprocessing).
- It is again possible to generate latex (and thus PDF) from the sphinx docs.
Examples¶
- new examples using some of the mlcomp datasets: Classification of text documents: using a MLComp dataset, example_mlcomp_document_classification.py
- Many more examaples. See here the full list of examples.
External dependencies¶
- Joblib is now a dependencie of this package, although it is shipped with (scikits.learn.externals.joblib).
Removed modules¶
- Module ann (Artificial Neural Networks) has been removed from the distribution. Users wanting this sort of algorithms should take a look into pybrain.
Misc¶
- New sphinx theme for the web page.
Authors¶
The following is a list of authors for this release, preceeded by number of commits:
- 262 Fabian Pedregosa
- 240 Gael Varoquaux
- 149 Alexandre Gramfort
- 116 Olivier Grisel
- 40 Vincent Michel
- 38 Ron Weiss
- 23 Matthieu Perrot
- 10 Bertrand Thirion
- 7 Yaroslav Halchenko
- 9 VirgileFritsch
- 6 Edouard Duchesnay
- 4 Mathieu Blondel
- 1 Ariel Rokem
- 1 Matthieu Brucher
0.4¶
Changelog¶
Major changes in this release include:
- Coordinate Descent algorithm (Lasso, ElasticNet) refactoring & speed improvements (roughly 100x times faster).
- Coordinate Descent Refactoring (and bug fixing) for consistency with R’s package GLMNET.
- New metrics module.
- New GMM module contributed by Ron Weiss.
- Implementation of the LARS algorithm (without Lasso variant for now).
- feature_selection module redesign.
- Migration to GIT as content management system.
- Removal of obsolete attrselect module.
- Rename of private compiled extensions (aded underscore).
- Removal of legacy unmaintained code.
- Documentation improvements (both docstring and rst).
- Improvement of the build system to (optionally) link with MKL.
Also, provide a lite BLAS implementation in case no system-wide BLAS is found.
- Lots of new examples.
- Many, many bug fixes ...
Authors¶
The committer list for this release is the following (preceded by number of commits):
- 143 Fabian Pedregosa
- 35 Alexandre Gramfort
- 34 Olivier Grisel
- 11 Gael Varoquaux
- 5 Yaroslav Halchenko
- 2 Vincent Michel
- 1 Chris Filo Gorgolewski