There are different ways to get scikit-learn installed:
- Install the version of scikit-learn provided by your operating system or Python distribution. This is the quickest option for those who have operating systems that distribute scikit-learn.
- Install an official release. This is the best approach for users who want a stable version number and aren’t concerned about running a slightly older version of scikit-learn.
- Install the latest development version. This is best for users who want the latest-and-greatest features and aren’t afraid of running brand-new code.
If you wish to contribute to the project, it’s recommended you install the latest development version.
Installing an official release¶
Getting the dependencies¶
Installing from source requires you to have installed Python (>= 2.6), NumPy (>= 1.6.1), SciPy (>= 0.9), setuptools, Python development headers and a working C++ compiler. Under Debian-based operating systems, which include Ubuntu, you can install all these requirements by issuing:
sudo apt-get install build-essential python-dev python-setuptools \
                     python-numpy python-scipy \
                     libatlas-dev libatlas3gf-base
On recent Debian and Ubuntu (e.g. Ubuntu 13.04 or later) make sure that ATLAS is used to provide the implementation of the BLAS and LAPACK linear algebra routines:
sudo update-alternatives --set libblas.so.3 \
    /usr/lib/atlas-base/atlas/libblas.so.3
sudo update-alternatives --set liblapack.so.3 \
    /usr/lib/atlas-base/atlas/liblapack.so.3
To build the documentation and run the example code contained in it, you will need matplotlib:
sudo apt-get install python-matplotlib
The above installs the ATLAS implementation of BLAS (the Basic Linear Algebra Subprograms library). Ubuntu 11.10 and later, and recent (testing) versions of Debian, offer an alternative implementation called OpenBLAS.
Using OpenBLAS can give speedups in some scikit-learn modules, but OpenBLAS versions before 0.2.8-4 can cause joblib/multiprocessing to freeze, so using it is not recommended unless you know what you’re doing.
If you do want to use OpenBLAS, then replacing ATLAS only requires a couple of commands. ATLAS has to be removed, otherwise NumPy may not work:
sudo apt-get remove libatlas3gf-base libatlas-dev
sudo apt-get install libopenblas-dev
sudo update-alternatives --set libblas.so.3 \
    /usr/lib/openblas-base/libopenblas.so.0
sudo update-alternatives --set liblapack.so.3 \
    /usr/lib/lapack/liblapack.so.3
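After switching BLAS implementations, it can be useful to look at what NumPy itself reports. The sketch below simply calls numpy.show_config(), which prints the BLAS/LAPACK libraries NumPy was built against. The helper name report_blas is made up for this example, and the build-time information it shows may differ from the library that update-alternatives selects at run time, so treat the output as a hint rather than proof.

```python
def report_blas():
    """Print NumPy's build configuration; return False if NumPy is missing."""
    try:
        import numpy
    except ImportError:
        print("NumPy is not installed")
        return False
    # show_config() prints the BLAS/LAPACK build information known to NumPy.
    numpy.show_config()
    return True


report_blas()
```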
On Red Hat and clones (e.g. CentOS), install the dependencies using:
sudo yum -y install gcc gcc-c++ numpy python-devel scipy
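Once the dependencies are installed, one way to confirm that they meet the minimum versions listed above is a short script such as the following. This is only a sketch: the helper names (version_tuple, meets_minimum, check_dependencies) are invented for this example, and the parsing ignores suffixes such as "rc1". Versions are compared as tuples of integers rather than strings, so that 1.10 is correctly treated as newer than 1.6.1.

```python
import re
import sys

# Minimum versions quoted in the text above.
MINIMUMS = {"numpy": "1.6.1", "scipy": "0.9"}


def version_tuple(version):
    """Turn '1.6.1' or '1.10.0rc1' into a tuple of ints, ignoring suffixes."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, required):
    """Return True if the installed version is at least the required one."""
    return version_tuple(installed) >= version_tuple(required)


def check_dependencies():
    """Return {name: installed version, or None if the import fails}."""
    found = {}
    for name in MINIMUMS:
        try:
            module = __import__(name)
        except ImportError:
            found[name] = None
        else:
            found[name] = module.__version__
    return found


if __name__ == "__main__":
    print("Python %d.%d" % sys.version_info[:2])
    for name, installed in sorted(check_dependencies().items()):
        if installed is None:
            print("%s: not installed" % name)
        else:
            ok = meets_minimum(installed, MINIMUMS[name])
            print("%s %s (>= %s: %s)" % (name, installed, MINIMUMS[name], ok))
```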
Using pip or easy_install is usually the fastest way to install the latest stable release. If you have pip, you can install or update with the command:
pip install -U scikit-learn
or, if you have easy_install:
easy_install -U scikit-learn
Note that you might need root privileges to run these commands.
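After installing, a quick way to check that the package is visible to the interpreter you intend to use is to import it and print its version. The snippet below is a minimal sketch; the helper name installed_version is made up for this example.

```python
def installed_version():
    """Return scikit-learn's version string, or None if it cannot be imported."""
    try:
        import sklearn
    except ImportError:
        return None
    return sklearn.__version__


version = installed_version()
if version is None:
    print("scikit-learn is not importable from this interpreter")
else:
    print("scikit-learn %s" % version)
```

If the import fails right after a seemingly successful install, the installer may have targeted a different Python installation than the one running the script.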
You can download a Windows installer from the downloads section of the project’s web page. Note that you must also have installed the packages numpy and setuptools.
This package is also expected to work with Python(x,y) as of 2.6.5.5.
Installing on Windows 64-bit
To install a 64-bit version of scikit-learn, you can download the binaries from http://www.lfd.uci.edu/~gohlke/pythonlibs/#scikit-learn. Note that this will require compatible versions of numpy, scipy and matplotlib. The easiest option is to download them from the same URL as well.
Building on Windows¶
To build scikit-learn on Windows you will need a C/C++ compiler in addition to numpy, scipy and setuptools. At least MinGW (a port of GCC to Windows) and Microsoft Visual C++ 2008 should work out of the box. To force the use of a particular compiler, create a file named setup.cfg in the source directory with the following content:
[build_ext]
compiler=my_compiler

[build]
compiler=my_compiler
where my_compiler should be one of mingw32 or msvc.
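If you switch compilers often, the file can also be generated from a script. The following is a minimal sketch: the function names setup_cfg_text and write_setup_cfg are invented for this example, and the only compiler names accepted are the two mentioned above.

```python
# Compiler names taken from the text above.
ALLOWED_COMPILERS = ("mingw32", "msvc")

TEMPLATE = (
    "[build_ext]\n"
    "compiler={compiler}\n"
    "\n"
    "[build]\n"
    "compiler={compiler}\n"
)


def setup_cfg_text(compiler):
    """Return the setup.cfg content that selects the given compiler."""
    if compiler not in ALLOWED_COMPILERS:
        raise ValueError(
            "compiler must be one of %s, got %r" % (ALLOWED_COMPILERS, compiler)
        )
    return TEMPLATE.format(compiler=compiler)


def write_setup_cfg(path, compiler):
    """Write the generated content to path (for example 'setup.cfg')."""
    with open(path, "w") as handle:
        handle.write(setup_cfg_text(compiler))


print(setup_cfg_text("mingw32"))
```

Calling write_setup_cfg("setup.cfg", "msvc") from the source directory would overwrite any existing setup.cfg there, so check for other settings in that file first.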
Once the appropriate compiler has been set, and assuming Python is in your PATH (see the Python FAQ for Windows for more details), install by executing the command:
python setup.py install
To build a precompiled package like the ones distributed at the downloads section, the command to execute is:
python setup.py bdist_wininst -b doc/logos/scikit-learn-logo.bmp
This will create an installable binary in the dist/ directory.
Third party distributions of scikit-learn¶
Some third-party distributions are now providing versions of scikit-learn integrated with their package-management systems.
These can make installation and upgrading much easier for users since the integration includes the ability to automatically install dependencies (numpy, scipy) that scikit-learn requires.
The following is an incomplete list of Python and OS distributions that provide their own version of scikit-learn.
Debian and derivatives (Ubuntu)¶
The Debian package is named python-sklearn (formerly python-scikits-learn) and can be installed using the following command:
sudo apt-get install python-sklearn
Additionally, backport builds of the most recent release of scikit-learn for existing releases of Debian and Ubuntu are available from the NeuroDebian repository.
A quick-‘n’-dirty way of rolling your own .deb package is to use stdeb.
The MacPorts package is named py<XY>-scikit-learn, where XY denotes the Python version. It can be installed by typing one of the following commands:
sudo port install py26-scikit-learn
or:
sudo port install py27-scikit-learn
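The XY part of the port name is the major and minor Python version with the dot removed, as in the two commands above. The sketch below derives the name from a version tuple; the function macports_port_name is made up for this example, and it does not check whether MacPorts actually provides a port for the resulting name.

```python
import sys


def macports_port_name(version_info=None):
    """Build the port name, e.g. (2, 7) -> 'py27-scikit-learn'."""
    if version_info is None:
        # Default to the interpreter running this script.
        version_info = sys.version_info
    major, minor = version_info[0], version_info[1]
    return "py%d%d-scikit-learn" % (major, minor)


print("sudo port install %s" % macports_port_name((2, 7)))
```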
The Arch Linux package is provided in the Arch User Repository (AUR) under the name python2-scikit-learn for the latest stable version and python2-scikit-learn-git for building from the git version. If yaourt is available, it can be installed by typing one of the following commands:
sudo yaourt -S python2-scikit-learn
or:
sudo yaourt -S python2-scikit-learn-git
depending on the version of scikit-learn you want to use.
The Fedora package is called python-scikit-learn for the Python 2 version and python3-scikit-learn for the Python 3 version. Both versions can be installed using yum:
$ sudo yum install python-scikit-learn
$ sudo yum install python3-scikit-learn
Testing requires having the nose library. After installation, the package can be tested by executing from outside the source directory:
nosetests sklearn --exe
This should give you a lot of output (and some warnings) but eventually should finish with a message similar to:
Ran 601 tests in 27.920s
OK (SKIP=2)
Alternative testing method
If for some reason the recommended method is failing for you, please try the alternate method:
python -c "import sklearn; sklearn.test()"
This method might display doctest failures because of nosetests issues.
scikit-learn can also be tested without having the package installed. For this you must compile the sources in place from the source directory:
python setup.py build_ext --inplace
Tests can now be run using nosetests:
This is automated by the commands: