Building models:

Let’s build all the models: a logistic regression, a random forest, a neural network, and an XGBoost model. Once the models are built, let’s see if we can tune them to improve the results. After we’ve built and tuned all the models, let’s stack them and see what results we get when predicting which mushroom traits are less likely to kill you.
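The stacking step described above can be sketched in base R. This is only an illustration under loud assumptions: the data are simulated (not the mushroom data), two plain logistic regressions on different features stand in for the four models named above, and the names `toy`, `oof`, and `meta` are hypothetical. The report's actual pipeline is not shown in the source, so this should not be read as the author's method.

```r
set.seed(42)

# Simulated stand-in data (NOT the mushroom dataset): two numeric features, binary label.
n   <- 400
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$y <- rbinom(n, 1, plogis(1.5 * toy$x1 - 1.0 * toy$x2))

# Two simple base learners standing in for the report's four models (assumption).
base_formulas <- list(m1 = y ~ x1, m2 = y ~ x2)

# Out-of-fold predictions: each row is scored by base models that never saw it.
k     <- 5
folds <- sample(rep(1:k, length.out = n))
oof   <- matrix(NA_real_, nrow = n, ncol = length(base_formulas),
                dimnames = list(NULL, names(base_formulas)))

for (f in 1:k) {
  train_rows <- folds != f
  for (m in names(base_formulas)) {
    fit <- glm(base_formulas[[m]], family = binomial, data = toy[train_rows, ])
    oof[!train_rows, m] <- predict(fit, newdata = toy[!train_rows, ],
                                   type = "response")
  }
}

# Meta-learner: a logistic regression on the base models' out-of-fold probabilities.
meta_data <- data.frame(oof, y = toy$y)
meta      <- glm(y ~ m1 + m2, family = binomial, data = meta_data)

# Accuracy of the stacked model on the rows used to fit the meta-learner
# (an optimistic figure; a held-out test set would be needed for a fair estimate).
stacked_class <- as.integer(predict(meta, type = "response") > 0.5)
stacked_acc   <- mean(stacked_class == toy$y)
print(round(stacked_acc, 3))
```

Fitting the meta-learner on out-of-fold predictions, rather than on in-sample predictions, keeps it from simply rewarding whichever base model overfits the most.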

## *************** Stacking Train & Test: Accuracy & Sensitivity ***************
##  Train accuracy:    1 
## Train sensitivity: 1 
## 
##  *************** Test parameters *************** 
## 
## Test  accuracy:    1 
## Test sensitivity: 1
## 
## 
##  *************** Train parameters ***************
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction    p    e
##          p 5582    0
##          e    0 4418
##                                      
##                Accuracy : 1          
##                  95% CI : (0.9996, 1)
##     No Information Rate : 0.5582     
##     P-Value [Acc > NIR] : < 2.2e-16  
##                                      
##                   Kappa : 1          
##                                      
##  Mcnemar's Test P-Value : NA         
##                                      
##             Sensitivity : 1.0000     
##             Specificity : 1.0000     
##          Pos Pred Value : 1.0000     
##          Neg Pred Value : 1.0000     
##              Prevalence : 0.5582     
##          Detection Rate : 0.5582     
##    Detection Prevalence : 0.5582     
##       Balanced Accuracy : 1.0000     
##                                      
##        'Positive' Class : p          
## 
## 
## 
##  *************** Test parameters ***************
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction    p    e
##          p 5532    0
##          e    0 4468
##                                      
##                Accuracy : 1          
##                  95% CI : (0.9996, 1)
##     No Information Rate : 0.5532     
##     P-Value [Acc > NIR] : < 2.2e-16  
##                                      
##                   Kappa : 1          
##                                      
##  Mcnemar's Test P-Value : NA         
##                                      
##             Sensitivity : 1.0000     
##             Specificity : 1.0000     
##          Pos Pred Value : 1.0000     
##          Neg Pred Value : 1.0000     
##              Prevalence : 0.5532     
##          Detection Rate : 0.5532     
##    Detection Prevalence : 0.5532     
##       Balanced Accuracy : 1.0000     
##                                      
##        'Positive' Class : p          
## 
## *************** XGBoost Tuned Train & Test: Accuracy & Sensitivity *************** 
## 
## Train accuracy:    0.9991 
## Train sensitivity: 0.9993 
## 
## Test  accuracy:    0.9972 
## Test sensitivity: 0.9969 
## 
## *************** Train parameters *************** 
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction    p    e
##          p 5576    3
##          e    6 4415
##                                           
##                Accuracy : 0.9991          
##                  95% CI : (0.9983, 0.9996)
##     No Information Rate : 0.5582          
##     P-Value [Acc > NIR] : <2e-16          
##                                           
##                   Kappa : 0.9982          
##                                           
##  Mcnemar's Test P-Value : 0.505           
##                                           
##             Sensitivity : 0.9989          
##             Specificity : 0.9993          
##          Pos Pred Value : 0.9995          
##          Neg Pred Value : 0.9986          
##              Prevalence : 0.5582          
##          Detection Rate : 0.5576          
##    Detection Prevalence : 0.5579          
##       Balanced Accuracy : 0.9991          
##                                           
##        'Positive' Class : p               
##                                           
## *************** Test parameters *************** 
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction    p    e
##          p 5518   14
##          e   14 4454
##                                          
##                Accuracy : 0.9972         
##                  95% CI : (0.996, 0.9981)
##     No Information Rate : 0.5532         
##     P-Value [Acc > NIR] : <2e-16         
##                                          
##                   Kappa : 0.9943         
##                                          
##  Mcnemar's Test P-Value : 1              
##                                          
##             Sensitivity : 0.9975         
##             Specificity : 0.9969         
##          Pos Pred Value : 0.9975         
##          Neg Pred Value : 0.9969         
##              Prevalence : 0.5532         
##          Detection Rate : 0.5518         
##    Detection Prevalence : 0.5532         
##       Balanced Accuracy : 0.9972         
##                                          
##        'Positive' Class : p              
##                                          
## ************************* END *************************
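
The headline figures can be checked directly against the confusion matrices. The sketch below uses base R and the tuned XGBoost test matrix copied from the output above; the variable names are hypothetical, and it assumes, as the output states, that `p` is the positive class.

```r
# Tuned XGBoost test confusion matrix, copied from the output above.
# Rows are predictions, columns are the reference (true) classes.
cm <- matrix(c(5518,   14,
                 14, 4454),
             nrow = 2, byrow = TRUE,
             dimnames = list(Prediction = c("p", "e"),
                             Reference  = c("p", "e")))

accuracy    <- sum(diag(cm)) / sum(cm)         # correct predictions / all rows
sensitivity <- cm["p", "p"] / sum(cm[, "p"])   # true p that were predicted p
specificity <- cm["e", "e"] / sum(cm[, "e"])   # true e that were predicted e

print(round(c(accuracy = accuracy,
              sensitivity = sensitivity,
              specificity = specificity), 4))
```

These reproduce the values in the statistics table (accuracy 0.9972, sensitivity 0.9975, specificity 0.9969). The summary line "Test sensitivity: 0.9969" matches the specificity for class `p`, and the train summary line (0.9993) likewise matches the train specificity. This suggests, though the source does not confirm it, that the summary lines treat `e` as the positive class.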

Most Important Results for Logistic Regression Model

## 
## Call:
## stats::glm(formula = ..y ~ ., family = stats::binomial, data = data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.7691  -0.4414   0.0000   0.4391   3.9620  
## 
## Coefficients: (4 not defined because of singularities)
##                          Estimate Std. Error z value Pr(>|z|)    
## (Intercept)               1.62434    0.38887   4.177 2.95e-05 ***
## cap.diameter              0.35225    0.09125   3.860 0.000113 ***
## stem.height               0.43189    0.06869   6.288 3.22e-10 ***
## stem.width               -0.47976    0.08761  -5.476 4.34e-08 ***
## cap.shape_f              -0.25992    0.22535  -1.153 0.248756    
## cap.shape_x               0.04587    0.21845   0.210 0.833680    
## cap.shape_b              -1.53309    0.23198  -6.609 3.88e-11 ***
## cap.shape_o               1.97582    0.38222   5.169 2.35e-07 ***
## cap.shape_p               0.10738    0.28544   0.376 0.706779    
## cap.shape_s               0.25510    0.24270   1.051 0.293217    
## cap.surface_s             1.53977    0.22815   6.749 1.49e-11 ***
## cap.surface_t             0.17676    0.22211   0.796 0.426133    
## cap.surface_X             1.51110    0.20491   7.374 1.65e-13 ***
## cap.surface_y             0.37901    0.24534   1.545 0.122387    
## cap.surface_d             1.10451    0.24151   4.573 4.80e-06 ***
## cap.surface_i            -1.66615    0.38525  -4.325 1.53e-05 ***
## cap.surface_h             1.18004    0.24508   4.815 1.47e-06 ***
## cap.surface_g             2.22006    0.24590   9.028  < 2e-16 ***
## cap.surface_w             1.63190    0.28009   5.826 5.67e-09 ***
## cap.surface_k            -2.64598    0.30175  -8.769  < 2e-16 ***
## cap.surface_l             3.11686    0.33118   9.411  < 2e-16 ***
## cap.color_n               0.76142    0.13378   5.691 1.26e-08 ***
## cap.color_y              -0.61947    0.16037  -3.863 0.000112 ***
## cap.color_k              -1.16991    0.29631  -3.948 7.87e-05 ***
## cap.color_g               0.37512    0.16842   2.227 0.025927 *  
## cap.color_b               0.96017    0.27555   3.485 0.000493 ***
## cap.color_e              -1.56162    0.18783  -8.314  < 2e-16 ***
## cap.color_r              -1.79680    0.28911  -6.215 5.14e-10 ***
## cap.color_u              -0.82855    0.25629  -3.233 0.001226 ** 
## cap.color_p              -0.80301    0.24494  -3.278 0.001044 ** 
## cap.color_l               3.35544    0.50466   6.649 2.95e-11 ***
## cap.color_o              -0.56969    0.19495  -2.922 0.003475 ** 
## does.bruise.or.bleed_f   -1.39383    0.12780 -10.906  < 2e-16 ***
## gill.attachment_s         2.46543    0.20705  11.907  < 2e-16 ***
## gill.attachment_p         5.73818    0.26282  21.833  < 2e-16 ***
## gill.attachment_X         0.71015    0.17938   3.959 7.53e-05 ***
## gill.attachment_a         0.66511    0.13869   4.796 1.62e-06 ***
## gill.attachment_e         3.69266    0.26306  14.037  < 2e-16 ***
## gill.attachment_d        -1.01310    0.15658  -6.470 9.79e-11 ***
## gill.attachment_f         1.75825    0.35736   4.920 8.65e-07 ***
## gill.spacing_X           -1.56335    0.11496 -13.599  < 2e-16 ***
## gill.spacing_d            0.11031    0.12623   0.874 0.382198    
## gill.spacing_f                 NA         NA      NA       NA    
## gill.color_w              0.08977    0.27238   0.330 0.741712    
## gill.color_y             -1.51407    0.27905  -5.426 5.77e-08 ***
## gill.color_g             -0.24698    0.29778  -0.829 0.406867    
## gill.color_p              0.32373    0.29194   1.109 0.267478    
## gill.color_u             -0.48343    0.41555  -1.163 0.244685    
## gill.color_b              1.72566    0.41360   4.172 3.02e-05 ***
## gill.color_o             -0.16307    0.31414  -0.519 0.603692    
## gill.color_e             -0.74304    0.38865  -1.912 0.055894 .  
## gill.color_n             -0.85240    0.28236  -3.019 0.002537 ** 
## gill.color_f                   NA         NA      NA       NA    
## gill.color_k             -0.93742    0.36218  -2.588 0.009647 ** 
## stem.root_b               0.14770    0.17440   0.847 0.397037    
## stem.root_f             -27.00056 1160.23428  -0.023 0.981434    
## stem.root_s              -1.87412    0.21985  -8.524  < 2e-16 ***
## stem.root_r             -23.16360  830.31187  -0.028 0.977744    
## stem.root_c             -25.15828 1348.02469  -0.019 0.985110    
## stem.surface_i           -0.22117    0.18567  -1.191 0.233589    
## stem.surface_s            0.90046    0.14619   6.160 7.29e-10 ***
## stem.surface_f                 NA         NA      NA       NA    
## stem.surface_t            2.44057    0.21171  11.528  < 2e-16 ***
## stem.surface_k            1.25093    0.32526   3.846 0.000120 ***
## stem.surface_y           -1.64926    0.22137  -7.450 9.33e-14 ***
## stem.surface_g          -22.09622  803.59251  -0.027 0.978063    
## stem.surface_h          -21.84114 1675.89502  -0.013 0.989602    
## stem.color_n             -2.97678    0.12771 -23.309  < 2e-16 ***
## stem.color_f                   NA         NA      NA       NA    
## stem.color_y             -2.53608    0.15191 -16.694  < 2e-16 ***
## stem.color_e             -4.37711    0.23410 -18.697  < 2e-16 ***
## stem.color_g             -1.16903    0.18236  -6.411 1.45e-10 ***
## stem.color_u             -4.25189    0.29928 -14.207  < 2e-16 ***
## stem.color_o             -0.57425    0.22387  -2.565 0.010315 *  
## stem.color_p             -5.37972    0.42176 -12.755  < 2e-16 ***
## stem.color_r             -2.71567    0.38007  -7.145 8.99e-13 ***
## stem.color_k             -5.30276    0.62291  -8.513  < 2e-16 ***
## stem.color_l             17.29866  424.02453   0.041 0.967458    
## stem.color_b             15.26536 2676.18626   0.006 0.995449    
## veil.type_u             -31.77645  469.99168  -0.068 0.946096    
## veil.color_w             28.88649  469.99151   0.061 0.950991    
## veil.color_n            -13.21271  845.22919  -0.016 0.987528    
## veil.color_y             49.27535 1343.23913   0.037 0.970737    
## veil.color_e            -15.77012 3044.22581  -0.005 0.995867    
## veil.color_u            -17.84059 2009.05781  -0.009 0.992915    
## veil.color_k             54.00640 2483.25308   0.022 0.982649    
## has.ring_t               -1.71861    0.32467  -5.293 1.20e-07 ***
## ring.type_z             -39.62268  751.28543  -0.053 0.957939    
## ring.type_e              -4.92193    0.53593  -9.184  < 2e-16 ***
## ring.type_X               2.34261    0.48505   4.830 1.37e-06 ***
## ring.type_l               0.41309    0.44409   0.930 0.352269    
## ring.type_r               4.13473    0.48170   8.584  < 2e-16 ***
## ring.type_g               1.23259    0.51979   2.371 0.017725 *  
## ring.type_p              -1.02519    0.51576  -1.988 0.046841 *  
## ring.type_m              24.20408 2653.73433   0.009 0.992723    
## spore.print.color_n      -9.87996  995.03437  -0.010 0.992078    
## spore.print.color_r      24.95274 3050.28086   0.008 0.993473    
## spore.print.color_p      -2.32143    0.46950  -4.945 7.63e-07 ***
## spore.print.color_k     -29.20845  469.99155  -0.062 0.950446    
## spore.print.color_w       2.11061    0.29843   7.072 1.52e-12 ***
## spore.print.color_u     -31.00718 2164.21842  -0.014 0.988569    
## spore.print.color_g      18.13909 1894.94013   0.010 0.992362    
## habitat_m                 0.12910    0.23624   0.547 0.584718    
## habitat_d                 0.02806    0.13204   0.213 0.831695    
## habitat_h                 0.94718    0.22364   4.235 2.28e-05 ***
## habitat_l                 0.64185    0.18456   3.478 0.000506 ***
## habitat_w                17.96183 2027.70107   0.009 0.992932    
## habitat_p               -20.36161 1861.69393  -0.011 0.991274    
## habitat_u                20.17074 3568.27154   0.006 0.995490    
## season_a                  0.10081    0.07091   1.422 0.155117    
## season_s                  1.12364    0.21391   5.253 1.50e-07 ***
## season_w                  1.36034    0.14191   9.586  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 13727.1  on 9999  degrees of freedom
## Residual deviance:  6145.7  on 9892  degrees of freedom
## AIC: 6361.7
## 
## Number of Fisher Scoring iterations: 19
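
A few of the summary figures above can be cross-checked with simple arithmetic. The base R sketch below uses only numbers copied from the output; the variable names are hypothetical, and McFadden's pseudo R-squared is an added diagnostic, not something the report itself computes.

```r
# Figures copied from the glm summary above.
null_dev     <- 13727.1; null_df  <- 9999
resid_dev    <- 6145.7;  resid_df <- 9892
aic_reported <- 6361.7

# Estimated parameters = drop in degrees of freedom, plus 1 for the intercept.
n_params <- (null_df - resid_df) + 1

# For ungrouped binary data the deviance equals -2 * log-likelihood,
# so AIC = residual deviance + 2 * (number of estimated parameters).
aic_check <- resid_dev + 2 * n_params

# McFadden's pseudo R-squared: share of the null deviance the model removes.
mcfadden_r2 <- 1 - resid_dev / null_dev

# Odds ratio for one well-estimated coefficient (gill.attachment_p = 5.73818).
or_gill_attachment_p <- exp(5.73818)

print(c(n_params = n_params, aic_check = aic_check))
print(round(mcfadden_r2, 3))
print(round(or_gill_attachment_p, 1))
```

The check gives 108 estimated parameters and an AIC of 6361.7, matching the reported value. The odds ratio for `gill.attachment_p` comes out at roughly 310, but its direction depends on which class the model codes as the "success" outcome, which the output does not show. Coefficients with very large standard errors (for example `stem.root_f` at -27.0 with a standard error near 1160), together with 19 Fisher scoring iterations, are typical symptoms of complete or quasi-complete separation, so those estimates should not be interpreted individually.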

The report was completed in 6.9 seconds.

Source: Ellis, Carl McBride. “Tertiary Mushroom: 1 Million More Mushrooms.” Kaggle, 3 Aug. 2024, www.kaggle.com/datasets/carlmcbrideellis/tertiary-mushroom-1-million-more-mushrooms.