Qiu, Derek - An Applied Analysis of High-Dimensional Logistic Regression...

This project has been submitted to the Library for purposes of graduation, but needs to be audited for technical details related to publication in order to be approved for inclusion in the Library collection.
Term: 
Summer 2017
Degree: 
M.Sc.
Degree type: 
Project
Department: 
Department of Statistics and Actuarial Science
Faculty: 
Science
Senior supervisor: 
Richard Lockhart
Thesis title: 
An Applied Analysis of High-Dimensional Logistic Regression
Given Names: 
Derek
Surname: 
Qiu
Abstract: 
In the high-dimensional setting, we investigate common regularization approaches for fitting logistic regression models with binary response variables. A literature review is provided on generalized linear models; regularization approaches, including the lasso, ridge, elastic net, and relaxed lasso; and recent post-selection methods for obtaining p-values of coefficient estimates proposed by Lockhart et al. and Buhlmann et al. We consider varying n, p conditions and assess model performance on several evaluation metrics, such as sparsity, accuracy, and algorithmic time efficiency. Through a simulation study, we find that Buhlmann et al.’s multi sample splitting method performed poorly when selected covariates were highly correlated. When λ was chosen through cross-validation, the elastic net performed similarly to the lasso, but it did not possess the level of sparsity Zou and Hastie have suggested.
Keywords: 
High dimensional; logistic regression; lasso; elastic net; significance test
Total pages: 
85