.. _example_applications_plot_model_complexity_influence.py:

==========================
Model Complexity Influence
==========================

Demonstrate how model complexity influences both prediction accuracy and
computational performance.

The dataset is the Boston Housing dataset for regression and the
20 Newsgroups dataset for classification. For each class of models we
vary the model complexity through the choice of relevant model parameters
and measure the influence on both computational performance (latency)
and predictive power (MSE or Hamming loss).

.. rst-class:: horizontal

    *

      .. image:: images/plot_model_complexity_influence_001.png
            :scale: 47

    *

      .. image:: images/plot_model_complexity_influence_002.png
            :scale: 47

    *

      .. image:: images/plot_model_complexity_influence_003.png
            :scale: 47
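Each figure above comes from the same measurement pattern: fit an estimator
at a given setting of its complexity parameter, read a complexity value off
the fitted model, then time repeated calls to ``predict`` while scoring the
predictions. The following is a minimal sketch of that loop for ``NuSVR``,
using a synthetic regression problem in place of the Boston Housing data;
the dataset size, train/test split and timing repeat count are illustrative
choices rather than the ones used by the example script (which is included
in full at the bottom of this page)::

    import time

    from sklearn.datasets import make_regression
    from sklearn.metrics import mean_squared_error
    from sklearn.model_selection import train_test_split
    from sklearn.svm import NuSVR

    # Synthetic stand-in for the Boston Housing data (13 features).
    X, y = make_regression(n_samples=500, n_features=13, noise=10.0,
                           random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    for nu in [0.1, 0.25, 0.5, 0.75, 0.9]:
        # C and gamma mirror the values in the script output below
        # (gamma=3.0517578125e-05 is 2 ** -15).
        model = NuSVR(nu=nu, C=1000.0, gamma=2 ** -15).fit(X_train, y_train)
        # Complexity of an SVR model: the number of support vectors it keeps.
        n_sv = model.support_vectors_.shape[0]
        # Latency: average wall-clock time over repeated predict() calls.
        n_repeats = 30
        tick = time.time()
        for _ in range(n_repeats):
            y_pred = model.predict(X_test)
        latency = (time.time() - tick) / n_repeats
        mse = mean_squared_error(y_test, y_pred)
        print("nu=%.2f | complexity: %3d | MSE: %9.4f | pred. time: %.6fs"
              % (nu, n_sv, mse, latency))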
**Script output**::

  Benchmarking SGDClassifier(alpha=0.001, average=False, class_weight=None,
         epsilon=0.1, eta0=0.0, fit_intercept=True, l1_ratio=0.25,
         learning_rate='optimal', loss='modified_huber', n_iter=5, n_jobs=1,
         penalty='elasticnet', power_t=0.5, random_state=None, shuffle=True,
         verbose=0, warm_start=False)
  Complexity: 4488 | Hamming Loss (Misclassification Ratio): 0.2586 | Pred. Time: 0.029275s

  Benchmarking SGDClassifier(alpha=0.001, average=False, class_weight=None,
         epsilon=0.1, eta0=0.0, fit_intercept=True, l1_ratio=0.5,
         learning_rate='optimal', loss='modified_huber', n_iter=5, n_jobs=1,
         penalty='elasticnet', power_t=0.5, random_state=None, shuffle=True,
         verbose=0, warm_start=False)
  Complexity: 1650 | Hamming Loss (Misclassification Ratio): 0.2875 | Pred. Time: 0.020339s

  Benchmarking SGDClassifier(alpha=0.001, average=False, class_weight=None,
         epsilon=0.1, eta0=0.0, fit_intercept=True, l1_ratio=0.75,
         learning_rate='optimal', loss='modified_huber', n_iter=5, n_jobs=1,
         penalty='elasticnet', power_t=0.5, random_state=None, shuffle=True,
         verbose=0, warm_start=False)
  Complexity: 893 | Hamming Loss (Misclassification Ratio): 0.3286 | Pred. Time: 0.016850s

  Benchmarking SGDClassifier(alpha=0.001, average=False, class_weight=None,
         epsilon=0.1, eta0=0.0, fit_intercept=True, l1_ratio=0.9,
         learning_rate='optimal', loss='modified_huber', n_iter=5, n_jobs=1,
         penalty='elasticnet', power_t=0.5, random_state=None, shuffle=True,
         verbose=0, warm_start=False)
  Complexity: 666 | Hamming Loss (Misclassification Ratio): 0.3281 | Pred. Time: 0.014667s

  Benchmarking NuSVR(C=1000.0, cache_size=200, coef0=0.0, degree=3,
         gamma=3.0517578125e-05, kernel='rbf', max_iter=-1, nu=0.1,
         shrinking=True, tol=0.001, verbose=False)
  Complexity: 69 | MSE: 31.8133 | Pred. Time: 0.000412s

  Benchmarking NuSVR(C=1000.0, cache_size=200, coef0=0.0, degree=3,
         gamma=3.0517578125e-05, kernel='rbf', max_iter=-1, nu=0.25,
         shrinking=True, tol=0.001, verbose=False)
  Complexity: 136 | MSE: 25.6140 | Pred. Time: 0.000745s

  Benchmarking NuSVR(C=1000.0, cache_size=200, coef0=0.0, degree=3,
         gamma=3.0517578125e-05, kernel='rbf', max_iter=-1, nu=0.5,
         shrinking=True, tol=0.001, verbose=False)
  Complexity: 243 | MSE: 22.3315 | Pred. Time: 0.001311s

  Benchmarking NuSVR(C=1000.0, cache_size=200, coef0=0.0, degree=3,
         gamma=3.0517578125e-05, kernel='rbf', max_iter=-1, nu=0.75,
         shrinking=True, tol=0.001, verbose=False)
  Complexity: 350 | MSE: 21.3679 | Pred. Time: 0.001847s

  Benchmarking NuSVR(C=1000.0, cache_size=200, coef0=0.0, degree=3,
         gamma=3.0517578125e-05, kernel='rbf', max_iter=-1, nu=0.9,
         shrinking=True, tol=0.001, verbose=False)
  Complexity: 404 | MSE: 21.0915 | Pred. Time: 0.002115s

  Benchmarking GradientBoostingRegressor(alpha=0.9, init=None, learning_rate=0.1,
         loss='ls', max_depth=3, max_features=None, max_leaf_nodes=None,
         min_samples_leaf=1, min_samples_split=2, min_weight_fraction_leaf=0.0,
         n_estimators=10, random_state=None, subsample=1.0, verbose=0,
         warm_start=False)
  Complexity: 10 | MSE: 27.9672 | Pred. Time: 0.000075s

  Benchmarking GradientBoostingRegressor(alpha=0.9, init=None, learning_rate=0.1,
         loss='ls', max_depth=3, max_features=None, max_leaf_nodes=None,
         min_samples_leaf=1, min_samples_split=2, min_weight_fraction_leaf=0.0,
         n_estimators=50, random_state=None, subsample=1.0, verbose=0,
         warm_start=False)
  Complexity: 50 | MSE: 8.0288 | Pred. Time: 0.000161s

  Benchmarking GradientBoostingRegressor(alpha=0.9, init=None, learning_rate=0.1,
         loss='ls', max_depth=3, max_features=None, max_leaf_nodes=None,
         min_samples_leaf=1, min_samples_split=2, min_weight_fraction_leaf=0.0,
         n_estimators=100, random_state=None, subsample=1.0, verbose=0,
         warm_start=False)
  Complexity: 100 | MSE: 6.7578 | Pred. Time: 0.000252s

  Benchmarking GradientBoostingRegressor(alpha=0.9, init=None, learning_rate=0.1,
         loss='ls', max_depth=3, max_features=None, max_leaf_nodes=None,
         min_samples_leaf=1, min_samples_split=2, min_weight_fraction_leaf=0.0,
         n_estimators=200, random_state=None, subsample=1.0, verbose=0,
         warm_start=False)
  Complexity: 200 | MSE: 5.8592 | Pred. Time: 0.000438s

  Benchmarking GradientBoostingRegressor(alpha=0.9, init=None, learning_rate=0.1,
         loss='ls', max_depth=3, max_features=None, max_leaf_nodes=None,
         min_samples_leaf=1, min_samples_split=2, min_weight_fraction_leaf=0.0,
         n_estimators=500, random_state=None, subsample=1.0, verbose=0,
         warm_start=False)
  Complexity: 500 | MSE: 6.0492 | Pred. Time: 0.001019s

**Python source code:** :download:`plot_model_complexity_influence.py <plot_model_complexity_influence.py>`

.. literalinclude:: plot_model_complexity_influence.py
    :lines: 16-

**Total running time of the example:** 22.11 seconds ( 0 minutes 22.11 seconds)
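A note on reading the ``Complexity`` column above: it is computed from an
attribute of the fitted estimator that grows with the tuned parameter.
Judging from the printed values, the natural candidates are the number of
non-zero coefficients for the elastic-net ``SGDClassifier`` (the model gets
sparser as ``l1_ratio`` grows), the number of support vectors for ``NuSVR``
(``nu`` is a lower bound on the fraction of support vectors), and the number
of trees (``n_estimators``) for ``GradientBoostingRegressor``. The attributes
used below are standard scikit-learn API, but mapping them to the column
above is an inference from the output, not a quote from the script::

    import numpy as np

    def model_complexity(estimator):
        """Plausible per-family complexity measure (see caveat above)."""
        if hasattr(estimator, "support_vectors_"):
            # NuSVR: number of support vectors retained by the fit.
            return estimator.support_vectors_.shape[0]
        if hasattr(estimator, "n_estimators"):
            # GradientBoostingRegressor: number of boosting stages (trees).
            return estimator.n_estimators
        # SGDClassifier with elastic-net penalty: non-zero weights.
        return np.count_nonzero(estimator.coef_)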