.. _example_applications_plot_model_complexity_influence.py:

==========================
Model Complexity Influence
==========================

Demonstrate how model complexity influences both prediction accuracy and
computational performance.

The dataset is the Boston Housing dataset for regression and the
20 Newsgroups dataset for classification.

For each class of models we make the model complexity vary through the
choice of relevant model parameters and measure the influence on both
computational performance (latency) and predictive power (MSE or
Hamming Loss).

.. rst-class:: horizontal

    *

      .. image:: images/plot_model_complexity_influence_001.png
            :scale: 47

    *

      .. image:: images/plot_model_complexity_influence_002.png
            :scale: 47

    *

      .. image:: images/plot_model_complexity_influence_003.png
            :scale: 47

**Script output**::

  Benchmarking SGDClassifier(alpha=0.001, class_weight=None, epsilon=0.1, eta0=0.0,
         fit_intercept=True, l1_ratio=0.25, learning_rate='optimal',
         loss='modified_huber', n_iter=5, n_jobs=1, penalty='elasticnet',
         power_t=0.5, random_state=None, shuffle=False, verbose=0,
         warm_start=False)
  Complexity: 4516 | Hamming Loss (Misclassification Ratio): 0.2562 | Pred. Time: 0.036248s

  Benchmarking SGDClassifier(alpha=0.001, class_weight=None, epsilon=0.1, eta0=0.0,
         fit_intercept=True, l1_ratio=0.5, learning_rate='optimal',
         loss='modified_huber', n_iter=5, n_jobs=1, penalty='elasticnet',
         power_t=0.5, random_state=None, shuffle=False, verbose=0,
         warm_start=False)
  Complexity: 1668 | Hamming Loss (Misclassification Ratio): 0.2939 | Pred. Time: 0.026719s

  Benchmarking SGDClassifier(alpha=0.001, class_weight=None, epsilon=0.1, eta0=0.0,
         fit_intercept=True, l1_ratio=0.75, learning_rate='optimal',
         loss='modified_huber', n_iter=5, n_jobs=1, penalty='elasticnet',
         power_t=0.5, random_state=None, shuffle=False, verbose=0,
         warm_start=False)
  Complexity: 890 | Hamming Loss (Misclassification Ratio): 0.3194 | Pred. Time: 0.021321s

  Benchmarking SGDClassifier(alpha=0.001, class_weight=None, epsilon=0.1, eta0=0.0,
         fit_intercept=True, l1_ratio=0.9, learning_rate='optimal',
         loss='modified_huber', n_iter=5, n_jobs=1, penalty='elasticnet',
         power_t=0.5, random_state=None, shuffle=False, verbose=0,
         warm_start=False)
  Complexity: 669 | Hamming Loss (Misclassification Ratio): 0.3347 | Pred. Time: 0.019336s

  Benchmarking NuSVR(C=1000.0, cache_size=200, coef0=0.0, degree=3,
        gamma=3.0517578125e-05, kernel='rbf', max_iter=-1, nu=0.1,
        probability=False, random_state=None, shrinking=True, tol=0.001,
        verbose=False)
  Complexity: 69 | MSE: 31.8133 | Pred. Time: 0.000538s

  Benchmarking NuSVR(C=1000.0, cache_size=200, coef0=0.0, degree=3,
        gamma=3.0517578125e-05, kernel='rbf', max_iter=-1, nu=0.25,
        probability=False, random_state=None, shrinking=True, tol=0.001,
        verbose=False)
  Complexity: 136 | MSE: 25.6140 | Pred. Time: 0.000998s

  Benchmarking NuSVR(C=1000.0, cache_size=200, coef0=0.0, degree=3,
        gamma=3.0517578125e-05, kernel='rbf', max_iter=-1, nu=0.5,
        probability=False, random_state=None, shrinking=True, tol=0.001,
        verbose=False)
  Complexity: 243 | MSE: 22.3315 | Pred. Time: 0.001719s

  Benchmarking NuSVR(C=1000.0, cache_size=200, coef0=0.0, degree=3,
        gamma=3.0517578125e-05, kernel='rbf', max_iter=-1, nu=0.75,
        probability=False, random_state=None, shrinking=True, tol=0.001,
        verbose=False)
  Complexity: 350 | MSE: 21.3679 | Pred. Time: 0.002448s

  Benchmarking NuSVR(C=1000.0, cache_size=200, coef0=0.0, degree=3,
        gamma=3.0517578125e-05, kernel='rbf', max_iter=-1, nu=0.9,
        probability=False, random_state=None, shrinking=True, tol=0.001,
        verbose=False)
  Complexity: 404 | MSE: 21.0915 | Pred. Time: 0.002822s

  Benchmarking GradientBoostingRegressor(alpha=0.9, init=None, learning_rate=0.1,
               loss='ls', max_depth=3, max_features=None, max_leaf_nodes=None,
               min_samples_leaf=1, min_samples_split=2, n_estimators=10,
               random_state=None, subsample=1.0, verbose=0, warm_start=False)
  Complexity: 10 | MSE: 28.4402 | Pred. Time: 0.000073s

  Benchmarking GradientBoostingRegressor(alpha=0.9, init=None, learning_rate=0.1,
               loss='ls', max_depth=3, max_features=None, max_leaf_nodes=None,
               min_samples_leaf=1, min_samples_split=2, n_estimators=50,
               random_state=None, subsample=1.0, verbose=0, warm_start=False)
  Complexity: 50 | MSE: 7.8822 | Pred. Time: 0.000173s

  Benchmarking GradientBoostingRegressor(alpha=0.9, init=None, learning_rate=0.1,
               loss='ls', max_depth=3, max_features=None, max_leaf_nodes=None,
               min_samples_leaf=1, min_samples_split=2, n_estimators=100,
               random_state=None, subsample=1.0, verbose=0, warm_start=False)
  Complexity: 100 | MSE: 6.6961 | Pred. Time: 0.000280s

  Benchmarking GradientBoostingRegressor(alpha=0.9, init=None, learning_rate=0.1,
               loss='ls', max_depth=3, max_features=None, max_leaf_nodes=None,
               min_samples_leaf=1, min_samples_split=2, n_estimators=200,
               random_state=None, subsample=1.0, verbose=0, warm_start=False)
  Complexity: 200 | MSE: 5.8514 | Pred. Time: 0.000488s

  Benchmarking GradientBoostingRegressor(alpha=0.9, init=None, learning_rate=0.1,
               loss='ls', max_depth=3, max_features=None, max_leaf_nodes=None,
               min_samples_leaf=1, min_samples_split=2, n_estimators=500,
               random_state=None, subsample=1.0, verbose=0, warm_start=False)
  Complexity: 500 | MSE: 6.0121 | Pred. Time: 0.001166s

**Python source code:** :download:`plot_model_complexity_influence.py`

.. literalinclude:: plot_model_complexity_influence.py
    :lines: 16-

**Total running time of the example:** 29.36 seconds
( 0 minutes 29.36 seconds)
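
The full example source is included above. As a minimal, self-contained
sketch of the same measurement loop (this is not the example's actual
code: it substitutes ``sklearn.datasets.make_regression`` for the Boston
Housing data and benchmarks only the ``GradientBoostingRegressor`` case,
so it runs on any recent scikit-learn version), one could write::

    import time

    from sklearn.datasets import make_regression
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.metrics import mean_squared_error
    from sklearn.model_selection import train_test_split

    # Synthetic stand-in for the Boston Housing data (13 features).
    X, y = make_regression(n_samples=506, n_features=13, noise=10.0,
                           random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0)

    # Model complexity is controlled here by n_estimators; for each
    # setting, record the test-set MSE and the wall-clock prediction
    # latency, mirroring the "Complexity | MSE | Pred. Time" lines above.
    for n_estimators in (10, 50, 100, 200, 500):
        model = GradientBoostingRegressor(n_estimators=n_estimators,
                                          random_state=0)
        model.fit(X_train, y_train)

        start = time.time()
        y_pred = model.predict(X_test)
        latency = time.time() - start

        print("Complexity: %d | MSE: %.4f | Pred. Time: %.6fs"
              % (n_estimators, mean_squared_error(y_test, y_pred),
                 latency))

As in the script output above, the error typically drops steeply for the
first increases in complexity and then flattens out (or even worsens),
while prediction latency grows roughly linearly with the number of
estimators.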