Assignment 3_DAT: Generating a Correlation Coefficient

This week, working with my subset of the Marscrater data, I examine a new research question using a Pearson correlation and a linear regression model.

Research Question:

  • Is crater diameter associated with crater depth?

Null Hypothesis (H0):

  • Crater depth is not associated with crater diameter.

Alternative Hypothesis (H1):

  • Crater depth is associated with crater diameter.


I will be working with two quantitative variables from the Marscrater codebook, 'DEPTH_RIMFLOOR_TOPOG' and 'DIAM_CIRCLE_IMAGE', representing crater depth and crater diameter respectively. Both variables are measured in kilometres (km). I performed a linear regression analysis using both the scipy.stats and statsmodels packages in Python. The Python code is shown below.

Python Code: Calculating a Correlation Coefficient

[Image: DAT_Code 4]
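As a rough illustration of this kind of analysis (not the exact code used for this assignment), the sketch below computes a Pearson correlation and a simple linear regression with scipy.stats. The data are synthetic stand-ins generated on the spot, and the names diameter_km and depth_km are assumptions made for the example; with the real data they would be replaced by the 'DIAM_CIRCLE_IMAGE' and 'DEPTH_RIMFLOOR_TOPOG' columns.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in data (NOT the Mars crater dataset): diameters in km and
# depths that loosely follow a straight line plus random noise.
rng = np.random.default_rng(0)
diameter_km = rng.uniform(1.0, 50.0, size=500)
depth_km = 0.04 * diameter_km + 0.15 + rng.normal(0.0, 0.4, size=500)

# Pearson correlation coefficient and its two-sided p value.
r, p_value = stats.pearsonr(diameter_km, depth_km)

# Simple linear regression of depth (response) on diameter (explanatory).
fit = stats.linregress(diameter_km, depth_km)

print(f"r = {r:.3f}, p = {p_value:.3g}")
print(f"depth = {fit.slope:.5f} * diameter + {fit.intercept:.5f}")
print(f"r squared = {fit.rvalue ** 2:.3f}")
```

For a regression with a single explanatory variable, the correlation reported by linregress matches the one from pearsonr, so either call can be used to obtain r.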

Model Interpretation for Correlation Coefficient

[Image: DAT_Code Results 5]


My sample (subsetted data) shows a positive linear relationship between crater diameter and crater depth, indicating that deeper craters tend to have larger diameters. The scatter plot above shows depth increasing rapidly with diameter for craters whose depths are less than 1 km and then increasing more gradually for craters whose depths are greater than 1 km. The correlation coefficient is 0.715, indicating a moderately strong positive relationship. The majority of the observations appear to lie close to the regression line.

The reported p value of 0.0 (a value too small to show at the printed precision) indicates that the relationship is statistically significant. This implies that it is highly unlikely that a relationship of this magnitude would arise by chance alone. So, I reject the null hypothesis and accept the alternative hypothesis, which states that "Crater depth is associated with crater diameter". To help understand the association between the two variables further, I calculated the coefficient of determination (denoted by R squared). An R squared of 0.51 suggests that if we know crater diameter, we can predict 51% of the variability we will see in crater depth, while 49% of the variability is unaccounted for. In other words, diameter accounts for just over half of the variability in depth.
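The R squared figure follows directly from the correlation coefficient, since for a single explanatory variable it is simply r multiplied by itself. A minimal check of that arithmetic, using the reported r of 0.715 (the variable names are chosen for this example only):

```python
# Correlation coefficient reported in the results above.
r = 0.715

# Coefficient of determination: the square of r for a one-predictor model.
r_squared = r ** 2

# Share of the variability in depth explained / not explained by diameter.
explained_pct = round(r_squared * 100)
unexplained_pct = 100 - explained_pct

print(f"R squared = {r_squared:.3f}")
print(f"explained: {explained_pct}%, unexplained: {unexplained_pct}%")
```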

Conclusion: Crater depth has a moderately strong positive association with crater diameter. This increase in crater depth with crater diameter can be represented by the linear model below:

Crater Depth = 0.04443 * Crater Diameter + 0.15275
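To show how the fitted line can be used, here is a small sketch that plugs a few diameters into the equation above. The function name predict_depth_km and the example diameters are hypothetical, chosen only for illustration. Given the curved pattern noted in the scatter plot, such predictions are best treated as rough estimates within the range of diameters in the sample.

```python
# Slope and intercept taken from the fitted linear model above.
SLOPE = 0.04443
INTERCEPT = 0.15275


def predict_depth_km(diameter_km: float) -> float:
    """Estimate crater depth (km) from crater diameter (km) using the fitted line."""
    return SLOPE * diameter_km + INTERCEPT


# Example diameters in km (illustrative values, not taken from the dataset).
for diameter in (5.0, 10.0, 20.0):
    depth = predict_depth_km(diameter)
    print(f"diameter {diameter:5.1f} km -> estimated depth {depth:.3f} km")
```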


Posted on January 17, 2016 by Okechukwu Ossai


Posted in Data Analysis Tools Course.

