Let’s build all the models: a logistic regression, a random forest, a neural network, and an XGBoost model. Once they’re built, let’s see if we can tune them to improve the results. After we’ve built and tuned all the models, let’s stack them together and see what sort of results we can get for predicting which mushrooms are less likely to kill you.
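Below is a minimal sketch of that plan using the tidymodels `stacks` package, which is one way to tune and stack these four learners. The data frame `mushrooms`, the outcome column `class` (p = poisonous, e = edible), and the tuning settings are assumptions for illustration, not the exact code behind the output that follows.

```r
library(tidymodels)
library(stacks)

# Assumed data: `mushrooms` with factor outcome `class` (levels "p", "e")
set.seed(2024)
split <- initial_split(mushrooms, strata = class)
train <- training(split)
test  <- testing(split)
folds <- vfold_cv(train, v = 5, strata = class)

rec <- recipe(class ~ ., data = train) %>%
  step_dummy(all_nominal_predictors()) %>%
  step_normalize(all_numeric_predictors())

# Candidate members: logistic regression, random forest, neural network, XGBoost
log_spec <- logistic_reg() %>% set_engine("glm")
rf_spec  <- rand_forest(trees = 500, min_n = tune()) %>%
  set_engine("ranger") %>% set_mode("classification")
nn_spec  <- mlp(hidden_units = tune(), penalty = tune()) %>%
  set_engine("nnet") %>% set_mode("classification")
xgb_spec <- boost_tree(trees = tune(), learn_rate = tune()) %>%
  set_engine("xgboost") %>% set_mode("classification")

# Keep out-of-fold predictions and workflows so the results can be stacked
res_ctrl  <- control_stack_resamples()
grid_ctrl <- control_stack_grid()

log_res <- workflow() %>% add_recipe(rec) %>% add_model(log_spec) %>%
  fit_resamples(resamples = folds, control = res_ctrl)
rf_res  <- workflow() %>% add_recipe(rec) %>% add_model(rf_spec) %>%
  tune_grid(resamples = folds, grid = 5, control = grid_ctrl)
nn_res  <- workflow() %>% add_recipe(rec) %>% add_model(nn_spec) %>%
  tune_grid(resamples = folds, grid = 5, control = grid_ctrl)
xgb_res <- workflow() %>% add_recipe(rec) %>% add_model(xgb_spec) %>%
  tune_grid(resamples = folds, grid = 5, control = grid_ctrl)

# Blend the tuned candidates into a stacked ensemble and fit the chosen members
mushroom_stack <- stacks() %>%
  add_candidates(log_res) %>%
  add_candidates(rf_res) %>%
  add_candidates(nn_res) %>%
  add_candidates(xgb_res) %>%
  blend_predictions() %>%
  fit_members()

predict(mushroom_stack, test) %>% bind_cols(select(test, class))
```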
## *************** Stacking Train & Test: Accuracy & Sensitivity ***************
## Train accuracy: 1
## Train sensitivity: 1
##
## *************** Test parameters ***************
##
## Test accuracy: 1
## Test sensitivity: 1
##
##
## *************** Train parameters ***************
## Confusion Matrix and Statistics
##
## Reference
## Prediction p e
## p 5582 0
## e 0 4418
##
## Accuracy : 1
## 95% CI : (0.9996, 1)
## No Information Rate : 0.5582
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 1
##
## Mcnemar's Test P-Value : NA
##
## Sensitivity : 1.0000
## Specificity : 1.0000
## Pos Pred Value : 1.0000
## Neg Pred Value : 1.0000
## Prevalence : 0.5582
## Detection Rate : 0.5582
## Detection Prevalence : 0.5582
## Balanced Accuracy : 1.0000
##
## 'Positive' Class : p
##
##
##
## *************** Test parameters ***************
## Confusion Matrix and Statistics
##
## Reference
## Prediction p e
## p 5532 0
## e 0 4468
##
## Accuracy : 1
## 95% CI : (0.9996, 1)
## No Information Rate : 0.5532
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 1
##
## Mcnemar's Test P-Value : NA
##
## Sensitivity : 1.0000
## Specificity : 1.0000
## Pos Pred Value : 1.0000
## Neg Pred Value : 1.0000
## Prevalence : 0.5532
## Detection Rate : 0.5532
## Detection Prevalence : 0.5532
## Balanced Accuracy : 1.0000
##
## 'Positive' Class : p
##
## *************** XGBoost Tuned Train & Test: Accuracy & Sensitivity ***************
##
## Train accuracy: 0.9991
## Train sensitivity: 0.9993
##
## Test accuracy: 0.9972
## Test sensitivity: 0.9969
##
## *************** Train parameters ***************
## Confusion Matrix and Statistics
##
## Reference
## Prediction p e
## p 5576 3
## e 6 4415
##
## Accuracy : 0.9991
## 95% CI : (0.9983, 0.9996)
## No Information Rate : 0.5582
## P-Value [Acc > NIR] : <2e-16
##
## Kappa : 0.9982
##
## Mcnemar's Test P-Value : 0.505
##
## Sensitivity : 0.9989
## Specificity : 0.9993
## Pos Pred Value : 0.9995
## Neg Pred Value : 0.9986
## Prevalence : 0.5582
## Detection Rate : 0.5576
## Detection Prevalence : 0.5579
## Balanced Accuracy : 0.9991
##
## 'Positive' Class : p
##
## *************** Test parameters ***************
## Confusion Matrix and Statistics
##
## Reference
## Prediction p e
## p 5518 14
## e 14 4454
##
## Accuracy : 0.9972
## 95% CI : (0.996, 0.9981)
## No Information Rate : 0.5532
## P-Value [Acc > NIR] : <2e-16
##
## Kappa : 0.9943
##
## Mcnemar's Test P-Value : 1
##
## Sensitivity : 0.9975
## Specificity : 0.9969
## Pos Pred Value : 0.9975
## Neg Pred Value : 0.9969
## Prevalence : 0.5532
## Detection Rate : 0.5518
## Detection Prevalence : 0.5532
## Balanced Accuracy : 0.9972
##
## 'Positive' Class : p
##
## ************************* END *************************
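The accuracy and sensitivity blocks above follow the print format of caret’s `confusionMatrix()`, with “p” (poisonous) as the positive class. A small helper like the sketch below shows how such a block can be produced; the function name `report_metrics` and the `pred`/`truth` vectors are hypothetical stand-ins for the actual predictions and labels.

```r
library(caret)

# `pred` and `truth` are assumed to be factor vectors with levels c("p", "e")
report_metrics <- function(pred, truth, label) {
  cm <- confusionMatrix(data = pred, reference = truth, positive = "p")
  cat(sprintf("%s accuracy: %.4f\n", label, cm$overall["Accuracy"]))
  cat(sprintf("%s sensitivity: %.4f\n", label, cm$byClass["Sensitivity"]))
  print(cm)          # full "Confusion Matrix and Statistics" block
  invisible(cm)
}
```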
Most Important Results for the Logistic Regression Model
##
## Call:
## stats::glm(formula = ..y ~ ., family = stats::binomial, data = data)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.7691 -0.4414 0.0000 0.4391 3.9620
##
## Coefficients: (4 not defined because of singularities)
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 1.62434 0.38887 4.177 2.95e-05 ***
## cap.diameter 0.35225 0.09125 3.860 0.000113 ***
## stem.height 0.43189 0.06869 6.288 3.22e-10 ***
## stem.width -0.47976 0.08761 -5.476 4.34e-08 ***
## cap.shape_f -0.25992 0.22535 -1.153 0.248756
## cap.shape_x 0.04587 0.21845 0.210 0.833680
## cap.shape_b -1.53309 0.23198 -6.609 3.88e-11 ***
## cap.shape_o 1.97582 0.38222 5.169 2.35e-07 ***
## cap.shape_p 0.10738 0.28544 0.376 0.706779
## cap.shape_s 0.25510 0.24270 1.051 0.293217
## cap.surface_s 1.53977 0.22815 6.749 1.49e-11 ***
## cap.surface_t 0.17676 0.22211 0.796 0.426133
## cap.surface_X 1.51110 0.20491 7.374 1.65e-13 ***
## cap.surface_y 0.37901 0.24534 1.545 0.122387
## cap.surface_d 1.10451 0.24151 4.573 4.80e-06 ***
## cap.surface_i -1.66615 0.38525 -4.325 1.53e-05 ***
## cap.surface_h 1.18004 0.24508 4.815 1.47e-06 ***
## cap.surface_g 2.22006 0.24590 9.028 < 2e-16 ***
## cap.surface_w 1.63190 0.28009 5.826 5.67e-09 ***
## cap.surface_k -2.64598 0.30175 -8.769 < 2e-16 ***
## cap.surface_l 3.11686 0.33118 9.411 < 2e-16 ***
## cap.color_n 0.76142 0.13378 5.691 1.26e-08 ***
## cap.color_y -0.61947 0.16037 -3.863 0.000112 ***
## cap.color_k -1.16991 0.29631 -3.948 7.87e-05 ***
## cap.color_g 0.37512 0.16842 2.227 0.025927 *
## cap.color_b 0.96017 0.27555 3.485 0.000493 ***
## cap.color_e -1.56162 0.18783 -8.314 < 2e-16 ***
## cap.color_r -1.79680 0.28911 -6.215 5.14e-10 ***
## cap.color_u -0.82855 0.25629 -3.233 0.001226 **
## cap.color_p -0.80301 0.24494 -3.278 0.001044 **
## cap.color_l 3.35544 0.50466 6.649 2.95e-11 ***
## cap.color_o -0.56969 0.19495 -2.922 0.003475 **
## does.bruise.or.bleed_f -1.39383 0.12780 -10.906 < 2e-16 ***
## gill.attachment_s 2.46543 0.20705 11.907 < 2e-16 ***
## gill.attachment_p 5.73818 0.26282 21.833 < 2e-16 ***
## gill.attachment_X 0.71015 0.17938 3.959 7.53e-05 ***
## gill.attachment_a 0.66511 0.13869 4.796 1.62e-06 ***
## gill.attachment_e 3.69266 0.26306 14.037 < 2e-16 ***
## gill.attachment_d -1.01310 0.15658 -6.470 9.79e-11 ***
## gill.attachment_f 1.75825 0.35736 4.920 8.65e-07 ***
## gill.spacing_X -1.56335 0.11496 -13.599 < 2e-16 ***
## gill.spacing_d 0.11031 0.12623 0.874 0.382198
## gill.spacing_f NA NA NA NA
## gill.color_w 0.08977 0.27238 0.330 0.741712
## gill.color_y -1.51407 0.27905 -5.426 5.77e-08 ***
## gill.color_g -0.24698 0.29778 -0.829 0.406867
## gill.color_p 0.32373 0.29194 1.109 0.267478
## gill.color_u -0.48343 0.41555 -1.163 0.244685
## gill.color_b 1.72566 0.41360 4.172 3.02e-05 ***
## gill.color_o -0.16307 0.31414 -0.519 0.603692
## gill.color_e -0.74304 0.38865 -1.912 0.055894 .
## gill.color_n -0.85240 0.28236 -3.019 0.002537 **
## gill.color_f NA NA NA NA
## gill.color_k -0.93742 0.36218 -2.588 0.009647 **
## stem.root_b 0.14770 0.17440 0.847 0.397037
## stem.root_f -27.00056 1160.23428 -0.023 0.981434
## stem.root_s -1.87412 0.21985 -8.524 < 2e-16 ***
## stem.root_r -23.16360 830.31187 -0.028 0.977744
## stem.root_c -25.15828 1348.02469 -0.019 0.985110
## stem.surface_i -0.22117 0.18567 -1.191 0.233589
## stem.surface_s 0.90046 0.14619 6.160 7.29e-10 ***
## stem.surface_f NA NA NA NA
## stem.surface_t 2.44057 0.21171 11.528 < 2e-16 ***
## stem.surface_k 1.25093 0.32526 3.846 0.000120 ***
## stem.surface_y -1.64926 0.22137 -7.450 9.33e-14 ***
## stem.surface_g -22.09622 803.59251 -0.027 0.978063
## stem.surface_h -21.84114 1675.89502 -0.013 0.989602
## stem.color_n -2.97678 0.12771 -23.309 < 2e-16 ***
## stem.color_f NA NA NA NA
## stem.color_y -2.53608 0.15191 -16.694 < 2e-16 ***
## stem.color_e -4.37711 0.23410 -18.697 < 2e-16 ***
## stem.color_g -1.16903 0.18236 -6.411 1.45e-10 ***
## stem.color_u -4.25189 0.29928 -14.207 < 2e-16 ***
## stem.color_o -0.57425 0.22387 -2.565 0.010315 *
## stem.color_p -5.37972 0.42176 -12.755 < 2e-16 ***
## stem.color_r -2.71567 0.38007 -7.145 8.99e-13 ***
## stem.color_k -5.30276 0.62291 -8.513 < 2e-16 ***
## stem.color_l 17.29866 424.02453 0.041 0.967458
## stem.color_b 15.26536 2676.18626 0.006 0.995449
## veil.type_u -31.77645 469.99168 -0.068 0.946096
## veil.color_w 28.88649 469.99151 0.061 0.950991
## veil.color_n -13.21271 845.22919 -0.016 0.987528
## veil.color_y 49.27535 1343.23913 0.037 0.970737
## veil.color_e -15.77012 3044.22581 -0.005 0.995867
## veil.color_u -17.84059 2009.05781 -0.009 0.992915
## veil.color_k 54.00640 2483.25308 0.022 0.982649
## has.ring_t -1.71861 0.32467 -5.293 1.20e-07 ***
## ring.type_z -39.62268 751.28543 -0.053 0.957939
## ring.type_e -4.92193 0.53593 -9.184 < 2e-16 ***
## ring.type_X 2.34261 0.48505 4.830 1.37e-06 ***
## ring.type_l 0.41309 0.44409 0.930 0.352269
## ring.type_r 4.13473 0.48170 8.584 < 2e-16 ***
## ring.type_g 1.23259 0.51979 2.371 0.017725 *
## ring.type_p -1.02519 0.51576 -1.988 0.046841 *
## ring.type_m 24.20408 2653.73433 0.009 0.992723
## spore.print.color_n -9.87996 995.03437 -0.010 0.992078
## spore.print.color_r 24.95274 3050.28086 0.008 0.993473
## spore.print.color_p -2.32143 0.46950 -4.945 7.63e-07 ***
## spore.print.color_k -29.20845 469.99155 -0.062 0.950446
## spore.print.color_w 2.11061 0.29843 7.072 1.52e-12 ***
## spore.print.color_u -31.00718 2164.21842 -0.014 0.988569
## spore.print.color_g 18.13909 1894.94013 0.010 0.992362
## habitat_m 0.12910 0.23624 0.547 0.584718
## habitat_d 0.02806 0.13204 0.213 0.831695
## habitat_h 0.94718 0.22364 4.235 2.28e-05 ***
## habitat_l 0.64185 0.18456 3.478 0.000506 ***
## habitat_w 17.96183 2027.70107 0.009 0.992932
## habitat_p -20.36161 1861.69393 -0.011 0.991274
## habitat_u 20.17074 3568.27154 0.006 0.995490
## season_a 0.10081 0.07091 1.422 0.155117
## season_s 1.12364 0.21391 5.253 1.50e-07 ***
## season_w 1.36034 0.14191 9.586 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 13727.1 on 9999 degrees of freedom
## Residual deviance: 6145.7 on 9892 degrees of freedom
## AIC: 6361.7
##
## Number of Fisher Scoring iterations: 19
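The summary above spans more than a hundred dummy-coded terms, several of which are NA because of singularities. One way to surface the most important coefficients is to tidy the fitted model and keep only the strongly significant terms; this is a hedged sketch in which `log_fit` is a hypothetical name for the fitted glm object (for a parsnip workflow it could first be pulled out with `extract_fit_engine()`).

```r
library(broom)
library(dplyr)

# `log_fit` is assumed to be the fitted glm behind the summary above
tidy(log_fit) %>%
  filter(!is.na(estimate), p.value < 0.001) %>%  # drop aliased terms, keep p < 0.001
  arrange(desc(abs(estimate))) %>%               # largest effects on the log-odds first
  slice_head(n = 15)
```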
The report was completed in 6.9 seconds.
Source: Ellis, Carl McBride. “Tertiary Mushroom: 1 Million More Mushrooms.” Kaggle, 3 Aug. 2024, www.kaggle.com/datasets/carlmcbrideellis/tertiary-mushroom-1-million-more-mushrooms.