Thursday, April 9, 2015

Spatial Autocorrelation

Part 1

Figure 1 shows a Pearson correlation matrix for the selected variables of sound level and distance.

The figure above (Figure 1) shows the results of a Pearson correlation between sound level and distance. The correlation is negative, so sound level tends to fall as distance increases. Even though the sign is negative, the coefficient is large in magnitude, which means there is a strong association between the two variables.
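To make the sign and strength of a Pearson coefficient concrete, here is a minimal sketch of the calculation. The readings are invented for illustration and are not the assignment data, and `pearson_r` is a hypothetical helper name rather than the tool used to produce Figure 1.

```python
import math

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    cross = sum(a * b for a, b in zip(dev_x, dev_y))
    ss_x = sum(a * a for a in dev_x)
    ss_y = sum(b * b for b in dev_y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cross / math.sqrt(ss_x * ss_y)

# Invented readings for illustration only (not the assignment data).
distance_m = [10, 20, 30, 40, 50]
sound_db = [92, 85, 80, 74, 71]

r = pearson_r(distance_m, sound_db)
print(f"r = {r:.3f}")
```

The sign of `r` gives the direction of the relationship and its absolute value gives the strength, so a value close to -1 is both negative and strong.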

Part 2

Figure 2 shows a correlation matrix of the data provided by the instructor. A variety of variables related to lifestyle, educational attainment, and race or ethnicity were tested.
The results of the correlation matrix shown above (Figure 2) were very interesting. Many of the variable pairs had a negative correlation. A negative coefficient means that one variable tends to decrease as the other increases, but the matrix alone does not show the shape of the relationship; a scatterplot of the data would be needed for that. Comparing living below the poverty line with both having a bachelor’s degree and not having a high school diploma gives results that are similar in strength but opposite in sign. Having no high school diploma and living below the poverty line are positively correlated, while living below the poverty line and having a bachelor’s degree are negatively correlated. Based on this data, educational attainment appears to be strongly correlated with lifestyle. The direction differs between the two cases, but the correlation is strong in both.

Part 3
Introduction:

This assignment gives the student a background in spatial autocorrelation. The analysis applies spatial autocorrelation to presidential election results for the state of Texas. The data are the percent Democratic vote and the voter turnout for the 1980 and 2008 presidential elections. In addition to the provided data, the percent Hispanic population also needs to be downloaded. The assignment asks the student to analyze the results to see whether there is clustering, or similar voting patterns, for these variables across the state of Texas.

Methods:

To start the spatial autocorrelation analysis, the 2010 Hispanic population percentage shapefile needed to be downloaded from the U.S. Census website. Once the data was downloaded, the student needed to navigate the metadata to find which field within the shapefile held the necessary information. After the field was identified, it was copied into an Excel file with the four provided datasets. Once all the data was combined, it was joined with a shapefile for the state of Texas.
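The join described above was done in Excel and GIS software, but the idea can be sketched in a few lines: match records from two tables on a shared county identifier and flag anything that fails to match. The keys, field names and values below are all hypothetical.

```python
# Hypothetical county records keyed by an identifier shared between tables.
# All keys, field names and numbers are invented for illustration.
election = {
    "A": {"pct_dem_1980": 41.0, "pct_dem_2008": 28.0},
    "B": {"pct_dem_1980": 55.0, "pct_dem_2008": 61.0},
    "C": {"pct_dem_1980": 38.0, "pct_dem_2008": 22.0},
}
hispanic_2010 = {"A": 17.0, "B": 88.0}

def join_on_key(base, extra, field):
    """Attach extra[key] to each base record; report base keys with no match."""
    joined = {}
    unmatched = []
    for key, row in base.items():
        if key in extra:
            joined[key] = {**row, field: extra[key]}
        else:
            unmatched.append(key)
    return joined, unmatched

joined, unmatched = join_on_key(election, hispanic_2010, "pct_hispanic_2010")
print(joined["B"])
print("no match for:", unmatched)
```

Checking the unmatched list before analysis is a cheap way to catch counties whose identifiers were typed or formatted differently in the two sources.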

Once all the tables were joined, the data was ready for analysis. The software used for this analysis is GeoDa. GeoDa only works with shapefiles, which is why all the data had to be converted to that format. After opening the new file in GeoDa, a spatial weights file needed to be created. Rook contiguity, which treats areas that share a border as neighbors, was the specified choice. These weights allow the student both to calculate Moran’s I and to create LISA cluster maps.
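As a rough illustration of what a rook weight means, the sketch below builds neighbor lists on a regular grid, where rook neighbors share an edge and queen neighbors also include cells that touch only at a corner. Real counties are irregular polygons, so this grid is a simplifying assumption, and `grid_neighbors` and `row_standardize` are hypothetical helper names, not GeoDa functions.

```python
def grid_neighbors(n_rows, n_cols, rule="rook"):
    """Map each cell id to the ids of its contiguous neighbors on a grid."""
    rook_steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]       # shared edges
    diagonal_steps = [(-1, -1), (-1, 1), (1, -1), (1, 1)]  # shared corners
    if rule == "rook":
        steps = rook_steps
    elif rule == "queen":
        steps = rook_steps + diagonal_steps
    else:
        raise ValueError("rule must be 'rook' or 'queen'")
    neighbors = {}
    for r in range(n_rows):
        for c in range(n_cols):
            cell = r * n_cols + c
            neighbors[cell] = [
                (r + dr) * n_cols + (c + dc)
                for dr, dc in steps
                if 0 <= r + dr < n_rows and 0 <= c + dc < n_cols
            ]
    return neighbors

def row_standardize(neighbors):
    """Give each neighbor an equal weight so that every row sums to 1."""
    return {i: {j: 1.0 / len(js) for j in js} for i, js in neighbors.items() if js}

rook = grid_neighbors(3, 3, "rook")
queen = grid_neighbors(3, 3, "queen")
weights = row_standardize(rook)
print("rook neighbors of the center cell:", sorted(rook[4]))
print("queen neighbors of the center cell:", sorted(queen[4]))
```

Row-standardizing is a common convention and is assumed here; it makes each area's spatial lag a simple average of its neighbors' values.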

Moran's I and LISA maps were used for different purposes. The Moran's I scatterplot gives a visual representation of where all the data fall on a graph. It shows where there may be clusters of votes and how strong the correlation may be. The LISA cluster maps add the geographical aspect to the data. They show where in the state of Texas these particular voting patterns were taking place. Either method on its own provides valuable information; used together, they allow the user to see exactly what is happening in the state.
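For readers who want to see what the Moran's I number measures, here is a minimal sketch of the global statistic, I = (n / S0) * sum(w_ij * z_i * z_j) / sum(z_i^2), where z is each value's deviation from the mean and S0 is the sum of all weights. The four-unit layout and the values are invented, and this is not GeoDa's implementation.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and weights given as {i: {j: w_ij}}."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(w for row in weights.values() for w in row.values())
    cross = sum(w * z[i] * z[j] for i, row in weights.items() for j, w in row.items())
    ss = sum(d * d for d in z)
    return (n / s0) * cross / ss

# Four hypothetical units in a row: 0-1, 1-2 and 2-3 are neighbors (binary weights).
chain = {
    0: {1: 1.0},
    1: {0: 1.0, 2: 1.0},
    2: {1: 1.0, 3: 1.0},
    3: {2: 1.0},
}

clustered = morans_i([1, 1, 5, 5], chain)    # similar values sit side by side
alternating = morans_i([1, 5, 1, 5], chain)  # neighbors are always dissimilar
print(f"clustered: {clustered:.3f}, alternating: {alternating:.3f}")
```

Positive values indicate that similar values sit next to each other, negative values indicate that neighbors tend to differ, and values near zero suggest no spatial pattern.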

Results:

The results of this testing provided some insight into Democratic voting patterns in the state of Texas. The two figures below (Figure 3 and Figure 4) show the percent Democratic vote for the 1980 and 2008 presidential elections.


Figure 3 shows the Moran's I value for the 2008 percent Democratic vote for president.
Figure 4 shows the Moran's I value for the 1980 percent Democratic vote for president.

The graph on the left shows much more sign of clustering within the data. A high percentage of the points are right around the center of the graph, with only a few outliers.
The graph on the right is much more spread out than the one on the left. Its points are not concentrated around the center of the graph, and there are no obvious outliers because the entire dataset is spread out. The Moran's I value at the top of each graph also shows that the left graph should have more clustering, because of its stronger correlation.


The next two graphs shown below (Figure 5 and Figure 6) show the voter turnout for the 1980 and 2008 presidential elections.

Figure 5 shows the Moran's I value for the percent voter turnout in the 2008 presidential election.
Figure 6 shows the Moran's I value for the percent voter turnout in the 1980 presidential election.

The graph on the left appears to have more clustering than the graph on the right, but it has a weaker correlation. Although more of its points cluster around the center of the graph, they do not cluster along the fitted line. The tighter clustering around the line in the graph on the right shows the stronger correlation. That graph's Moran's I value is also higher than the left graph's, which further supports this finding.

Figure 7 shows the Moran's I value for the 2010 Hispanic population percentage.




The graph to the left (Figure 7) shows the Moran's I chart of Hispanic population percentage in the year 2010. Although the data does not appear to cluster around the center of the graph as much as the previous four figures, it has the highest Moran's I value. This value also tells us that there is a high degree of clustering around the fitted line. 
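The link between the fitted line and the Moran's I value can be checked directly. Assuming row-standardized weights (a common convention, assumed here rather than confirmed by the assignment), the slope of the line fitted through a Moran scatterplot equals Moran's I. The sketch below uses five invented values in a row of hypothetical units, not the Texas data.

```python
import math

def standardize(values):
    """Convert values to z-scores (mean 0, standard deviation 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def spatial_lag(z, weights):
    """Weighted average of each unit's neighbors, using row-standardized weights."""
    return [sum(w * z[j] for j, w in weights[i].items()) for i in range(len(z))]

def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# Five hypothetical units in a row; each unit's weights sum to 1.
weights = {
    0: {1: 1.0},
    1: {0: 0.5, 2: 0.5},
    2: {1: 0.5, 3: 0.5},
    3: {2: 0.5, 4: 0.5},
    4: {3: 1.0},
}
values = [2, 3, 5, 8, 9]  # invented, not the Texas data

z = standardize(values)           # horizontal axis of the scatterplot
lag = spatial_lag(z, weights)     # vertical axis: average of the neighbors
slope = ols_slope(z, lag)
print(f"slope of the fitted line = {slope:.3f}")
```

A steeper positive slope means each unit looks more like the average of its neighbors, which is why a dataset can look spread out on the plot and still have a high Moran's I.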











The next maps are LISA cluster maps created in GeoDa. They show the same information as Figures 3 through 7, but represented geographically. The two figures below (Figure 8 and Figure 9) show the percent Democratic vote in the 1980 and 2008 presidential elections.

Figure 8 shows the percent Democratic vote for the 2008 presidential election.



The figure to the right shows the 2008 percent Democratic vote for president. There appears to be a high percentage of Democratic voting in the southern portion of the state. Low percentages of Democratic voting occurred in the central and northern sections of the state.



Figure 9 shows the percent Democratic vote for the 1980 presidential election.


The figure to the left shows the 1980 percent Democratic vote for president. As in Figure 8, there is a high percentage of Democratic vote in the southern portion of the state, but in 1980 there is also a large percentage in the eastern part of the state. The lower percentages of Democratic vote are in roughly the same places in both maps, within the central and northern sections of the state.






Figure 10 shows the percent voter turnout for the 2008 presidential election.

Figure 11 shows the percent voter turnout for the 1980 presidential election.
The figure to the left (Figure 10) shows the percent voter turnout in the 2008 presidential election. Other than in the southern portion of the state, there does not appear to be a strong connection between voter turnout and percent Democratic vote. The figure to the right (Figure 11) shows the percent voter turnout in the 1980 presidential election. It also suggests that, in the southern portion of the state, there is a connection between voter turnout and a greater percentage of Democratic vote.

Figure 12 shows the 2010 Hispanic population percentage.


The figure to the right (Figure 12) shows the percentage of Hispanic population in the year 2010. As the map shows, there is a high Hispanic population percentage in the southern and western parts of the state, while there are lower percentages in the east-central parts of the state. When comparing this map to Figure 8, it appears that there is a correlation between Hispanic population and percent Democratic vote.
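The high and low clusters on the LISA maps above come from comparing each area's value with the average of its neighbors. The sketch below shows only that classification step, with invented values for five hypothetical units in a row and row-standardized weights. GeoDa also tests each area for statistical significance using permutations, which this sketch leaves out, so it should not be read as GeoDa's implementation.

```python
import math

def standardize(values):
    """Convert values to z-scores (mean 0, standard deviation 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def lisa_quadrants(values, weights):
    """Return (label, local Moran's I) per unit; no significance test is applied."""
    z = standardize(values)
    results = []
    for i, zi in enumerate(z):
        lag = sum(w * z[j] for j, w in weights[i].items())  # neighbor average
        local_i = zi * lag
        if zi >= 0 and lag >= 0:
            label = "High-High"
        elif zi < 0 and lag < 0:
            label = "Low-Low"
        elif zi >= 0:
            label = "High-Low"
        else:
            label = "Low-High"
        results.append((label, local_i))
    return results

# Five hypothetical units in a row; each unit's weights sum to 1.
weights = {
    0: {1: 1.0},
    1: {0: 0.5, 2: 0.5},
    2: {1: 0.5, 3: 0.5},
    3: {2: 0.5, 4: 0.5},
    4: {3: 1.0},
}
pct_dem = [60, 55, 20, 15, 58]  # invented percentages, not the Texas data

quadrants = lisa_quadrants(pct_dem, weights)
labels = [label for label, _ in quadrants]
print(labels)
```

High-High and Low-Low areas have a positive local Moran's I and correspond to clusters, while High-Low and Low-High areas have a negative value and correspond to spatial outliers.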









Conclusion:

The results of this exercise are very interesting. Although Hispanic population data was not part of the provided dataset, there does appear to be a connection between percent Hispanic population and the Democratic presidential vote. Voting patterns have stayed fairly consistent between the 1980 and 2008 elections. Democratic voting has started to shift westward, but has also decreased in other areas of the state. There does not appear to be a great change in voter turnout across the state. Overall, there does not appear to be a great change in the variables used in this assignment.


Sources:

U.S. Census
Professor Weichelt
