2016 Presidential Election Analysis

The Data Mining class I took this past Winter quarter culminated in an exploratory statistical analysis of the 2016 presidential election. We preprocessed the voting data by removing NA values, splitting it into federal, state, and county levels, and checking it for consistency. We then plotted the results on a U.S. county map to see which candidate each county voted for and to decide what to investigate further. We chose to compare county voting outcomes by average poverty, to see whether poverty levels differed significantly between counties that voted for Trump and counties that voted for Clinton. Following this, we ran PCA, tree classification, and clustering analysis to determine the most influential factors in the 2016 voting results.
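
As a rough illustration of the poverty comparison described above, here is a minimal Python sketch. It is not the code from the project: the county records are made-up toy values, the helper names (poverty_by_winner, welch_t) are hypothetical, and Welch's t statistic is only one reasonable way to compare two group means, chosen here as an assumption.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical toy records: (county, winning candidate, poverty rate in percent).
# These values are invented for illustration and are NOT real election data.
counties = [
    ("A", "Trump", 18.0),
    ("B", "Trump", 14.0),
    ("C", "Trump", 16.0),
    ("D", "Clinton", 12.0),
    ("E", "Clinton", 20.0),
    ("F", "Clinton", 10.0),
    ("G", "Clinton", None),  # missing value, dropped during grouping
]


def poverty_by_winner(records):
    """Group poverty rates by winning candidate, skipping missing values."""
    groups = {}
    for _county, winner, poverty in records:
        if poverty is None:
            continue
        groups.setdefault(winner, []).append(poverty)
    return groups


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


groups = poverty_by_winner(counties)
means = {winner: mean(values) for winner, values in groups.items()}
t_stat = welch_t(groups["Trump"], groups["Clinton"])

print(means)
print(round(t_stat, 3))
```

With real data, the same grouping step would run over the cleaned county-level table, and the t statistic would be compared against a reference distribution to judge whether the difference in average poverty is significant.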