Lecture


R content

ROC curves and AUC



> library(pROC)
> library(tidyverse) ## provides tibble() and ggplot(), used for the plot below

## fit a logistic regression to biopsy data
> model <- glm(outcome ~ ., data=biopsy, family=binomial)

## create ROC object
> roc.object <- roc(biopsy$outcome, model$linear.predictors)

## extract AUC
> roc.object$auc
 	Area under the curve: 0.9963
 	
## Plot!
> roc.data <- tibble(x = roc.object$specificities, ### TNR (1-FPR)
                     y = roc.object$sensitivities) ### TPR

> ggplot(roc.data, aes(x = x, y = y)) +
      geom_line() + scale_x_reverse() +
      ylab("Sensitivity") +
      xlab("Specificity")
      

[Figure: ROC curve for the biopsy model, Sensitivity (y) vs. Specificity (x, axis reversed)]
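To see what roc() and $auc are computing, here is a minimal base-R sketch that builds the curve by hand. It uses simulated data (an assumption, not the lecture's biopsy data), and the names x, y, fit, and score are illustrative only. It computes the AUC two ways: by the trapezoid rule under the curve, and as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case.

```r
set.seed(1)

## Simulated data (assumption): one predictor, binary outcome
n <- 200
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(1.5 * x))

fit   <- glm(y ~ x, family = binomial)
score <- fit$linear.predictors

## Sweep the classification threshold from high to low
thresholds <- c(Inf, sort(unique(score), decreasing = TRUE))
tpr <- sapply(thresholds, function(t) mean(score[y == 1] >= t)) ## sensitivity
fpr <- sapply(thresholds, function(t) mean(score[y == 0] >= t)) ## 1 - specificity

## AUC as area under the (FPR, TPR) curve, trapezoid rule
auc_trapezoid <- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)

## AUC as P(positive score > negative score), ties counted as 1/2
pos <- score[y == 1]
neg <- score[y == 0]
auc_rank <- mean(outer(pos, neg, ">") + 0.5 * outer(pos, neg, "=="))

c(trapezoid = auc_trapezoid, rank = auc_rank)
```

The two numbers should agree, which is why an AUC near 1 means the model ranks almost every positive case above almost every negative one.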


Likelihood ratio test

Used to compare nested models, i.e. the null model must be a special case of the alternative model.



> library(broom) ## provides glance() and tidy()
> null_model <- lm(Sepal.Length ~ Petal.Length, data = iris)
> null_glance <- glance(null_model) 

> alt_model <- lm(Sepal.Length ~ Petal.Length + Species, data = iris)
> alt_glance <- glance(alt_model)

#### LRT #####
> D <- 2 * (alt_glance$logLik - null_glance$logLik) ### test statistic 
> df <- alt_glance$df - null_glance$df             ### chisquared degrees of freedom
> 1 - pchisq(D,df)
[1] 2.799982468e-13

With P = 2.80e-13, we reject the null model in favor of the alternative model, i.e. the one with the Species predictor added in.
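The same test can be sketched without broom, using base R's logLik(). This is an alternative route to the same statistic, not the lecture's own code; the names ll_null, ll_alt, df_diff, and p_value are illustrative. The "df" attribute of a logLik object counts all estimated parameters (including the residual variance), so the difference between the two models is still the number of added coefficients.

```r
null_model <- lm(Sepal.Length ~ Petal.Length, data = iris)
alt_model  <- lm(Sepal.Length ~ Petal.Length + Species, data = iris)

ll_null <- logLik(null_model)
ll_alt  <- logLik(alt_model)

## Test statistic: twice the gain in log-likelihood
D <- 2 * (as.numeric(ll_alt) - as.numeric(ll_null))

## Degrees of freedom: number of extra parameters in the alternative model
df_diff <- attr(ll_alt, "df") - attr(ll_null, "df")

## Upper-tail probability; lower.tail = FALSE avoids the rounding error
## of computing 1 - pchisq(...) when the p-value is tiny
p_value <- pchisq(D, df = df_diff, lower.tail = FALSE)

c(D = D, df = df_diff, p = p_value)
```

Species has three levels, so adding it contributes two dummy coefficients, and the statistic is compared against a chi-squared distribution with 2 degrees of freedom.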


Stepwise model selection

The step() function in base R is one of many options for stepwise model selection. By default it uses AIC (penalty k = 2 per parameter); to use BIC instead, set the k argument to log(n), where n is the number of observations. The output of this function is the selected model itself.
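As a quick check on how the penalty works, the sketch below (base R only; the names full_model, aic_fit, and bic_fit are illustrative) runs step() with the default penalty and with k = log(n), then confirms that AIC() evaluated with k = log(n) gives the same number as BIC() for the selected model.

```r
full_model <- lm(Sepal.Length ~ ., data = iris)
n <- nobs(full_model)

## With no scope argument, step() searches backward from the full model
aic_fit <- step(full_model, trace = 0)              ## k = 2 (default): AIC
bic_fit <- step(full_model, trace = 0, k = log(n))  ## k = log(n): BIC

## The only difference between the two criteria is the penalty per parameter
all.equal(AIC(bic_fit, k = log(n)), BIC(bic_fit))

## Compare which predictors each criterion kept
formula(aic_fit)
formula(bic_fit)
```

Because log(150) is larger than 2, BIC charges more for each parameter, so it never selects a larger model than AIC would from the same backward search path.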



> model <- lm(Sepal.Length ~ ., data = iris)


### Selection with AIC
> aic.backwards <- step(model, trace=FALSE) ## trace=FALSE (or trace=0) suppresses the step-by-step output

# To prove to you that the output is a model:
> glance(aic.backwards)
> tidy(aic.backwards)
               term   estimate  std.error statistic      p.value
1       (Intercept)  2.1712663 0.27979415  7.760227 1.429502e-12
2       Sepal.Width  0.4958889 0.08606992  5.761466 4.867516e-08
3      Petal.Length  0.8292439 0.06852765 12.100867 1.073592e-23
4       Petal.Width -0.3151552 0.15119575 -2.084418 3.888826e-02
5 Speciesversicolor -0.7235620 0.24016894 -3.012721 3.059634e-03
6  Speciesvirginica -1.0234978 0.33372630 -3.066878 2.584344e-03


#### Selection with BIC, for example: set the penalty k to log(n)
> bic.backwards <- step(model, trace=FALSE, k = log(nrow(iris)))