60 Important Correlation and Covariance MCQs

This set of MCQs covers the fundamentals of correlation and covariance: the main types (such as Pearson and Spearman), how they are calculated and interpreted, and how they are applied in data analysis. It is intended for students and professionals exploring statistical relationships between variables.

1. What does covariance measure between two variables?

a) The direction and strength of linear relationship
b) How they vary together from their means
c) The difference in their means
d) The squared deviation from mean
Correct Answer: b) How they vary together from their means
📝 Explanation:
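Covariance measures how two variables deviate from their means together: it is positive when they tend to move in the same direction and negative when they move in opposite directions. Its sign gives the direction, but its magnitude is not standardized.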

2. The Pearson correlation coefficient assumes which type of relationship?

a) Non-linear
b) Monotonic
c) Linear
d) Quadratic
Correct Answer: c) Linear
📝 Explanation:
Pearson's r measures the strength and direction of the linear association between two continuous variables.

3. What is the range of values for the correlation coefficient?

a) 0 to 1
b) -1 to 1
c) -∞ to ∞
d) 0 to ∞
Correct Answer: b) -1 to 1
📝 Explanation:
Values close to 1 or -1 indicate strong positive or negative correlation; 0 indicates no linear relationship.

4. Covariance can be positive, negative, or zero, but it is not standardized. What standardizes it to create correlation?

a) Dividing by the product of standard deviations
b) Multiplying by the means
c) Taking the square root
d) Subtracting the variance
Correct Answer: a) Dividing by the product of standard deviations
📝 Explanation:
Pearson's r = Cov(X,Y) / (SD_X * SD_Y), making it unitless and comparable.
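
The formula above can be checked by hand. The following is a minimal Python sketch using made-up data; the helper names `sample_cov` and `pearson_r` are hypothetical and not taken from any library.

```python
from math import sqrt

def sample_cov(xs, ys):
    """Sample covariance with the n - 1 denominator."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def pearson_r(xs, ys):
    """Covariance divided by the product of the sample standard deviations."""
    sd_x = sqrt(sample_cov(xs, xs))  # Cov(X, X) = Var(X)
    sd_y = sqrt(sample_cov(ys, ys))
    return sample_cov(xs, ys) / (sd_x * sd_y)

# Made-up data: hours studied and exam score
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 70, 72]

cov_xy = sample_cov(hours, score)  # has units (hours * points)
r = pearson_r(hours, score)        # unitless, between -1 and 1
print(cov_xy, round(r, 3))
```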

5. Which correlation measure is non-parametric and suitable for ordinal data?

a) Pearson
b) Spearman
c) Kendall
d) Point-biserial
Correct Answer: b) Spearman
📝 Explanation:
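Spearman's rank correlation assesses monotonic relationships without assuming normality. Kendall's tau (option c) is also non-parametric and rank-based, so it is a valid alternative for ordinal data.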

6. A covariance matrix is a square matrix where the diagonal elements represent

a) Variances of variables
b) Covariances between pairs
c) Means of variables
d) Standard deviations
Correct Answer: a) Variances of variables
📝 Explanation:
Off-diagonals show pairwise covariances; diagonals are variances (Cov(X,X) = Var(X)).

7. What does a correlation of -0.85 indicate?

a) Weak positive relationship
b) Strong negative relationship
c) No relationship
d) Perfect positive relationship
Correct Answer: b) Strong negative relationship
📝 Explanation:
By a common rule of thumb, values between -0.7 and -1 suggest a strong inverse linear association.

8. Kendall's tau measures the strength of

a) Linear association
b) Monotonic association
c) Quadratic association
d) Causal relationship
Correct Answer: b) Monotonic association
📝 Explanation:
It is based on counts of concordant and discordant pairs and is often preferred for small samples or data with many ties.

9. The formula for sample covariance is

a) Σ(X_i - μ)(Y_i - ν) / n
b) Σ(X_i - x̄)(Y_i - ȳ) / (n-1)
c) ΣX_i Y_i / n
d) Σ(X_i^2 + Y_i^2) / n
Correct Answer: b) Σ(X_i - x̄)(Y_i - ȳ) / (n-1)
📝 Explanation:
Uses n-1 for unbiased estimate; x̄ and ȳ are sample means.
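
As a rough illustration of the n-1 versus n denominators, here is a small Python sketch. The function name `covariance` and its `ddof` parameter are assumptions made for this example, and the data are made up.

```python
def covariance(xs, ys, ddof=1):
    """Sum of deviation products divided by (n - ddof).

    ddof=1 gives the sample covariance; ddof=0 gives the population form.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return total / (n - ddof)

xs = [2, 4, 6, 8]  # made-up values
ys = [1, 3, 2, 6]

sample_value = covariance(xs, ys)              # 14 / 3
population_value = covariance(xs, ys, ddof=0)  # 14 / 4
print(sample_value, population_value)
```

The sample version is always larger in magnitude than the population version for the same data, and the gap shrinks as n grows.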

10. Correlation does not imply

a) Association
b) Causation
c) Linearity
d) Strength
Correct Answer: b) Causation
📝 Explanation:
Spurious correlations can occur without any causal link; further analysis (for example, a controlled experiment) is needed to establish causation.

11. Which is affected by outliers more: covariance or correlation?

a) Correlation
b) Covariance
c) Both equally
d) Neither
Correct Answer: b) Covariance
📝 Explanation:
Covariance has units and is unbounded, so a single extreme point can change it by an arbitrary amount. Pearson's r is bounded between -1 and 1, but it is still sensitive to outliers.

12. Partial correlation measures the relationship between two variables while controlling for

a) All other variables
b) One or more other variables
c) Outliers
d) Non-linearity
Correct Answer: b) One or more other variables
📝 Explanation:
It removes the effect of confounders to isolate direct association.
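
For a single control variable Z, the partial correlation can be computed from the three pairwise correlations. The sketch below assumes that first-order case; the function name `partial_corr` and the correlation values are hypothetical.

```python
from math import sqrt

def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of X and Y, controlling for Z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))

# Hypothetical pairwise correlations: X and Y both track Z closely
r_xy = 0.80
r_xz = 0.90
r_yz = 0.90

adjusted = partial_corr(r_xy, r_xz, r_yz)
print(round(adjusted, 3))
```

With these made-up numbers, the strong raw correlation between X and Y almost disappears once Z is held fixed.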

13. In a scatter plot, a tight cluster around the line y = x indicates

a) r ≈ 0
b) r ≈ 1
c) r ≈ -1
d) Undefined r
Correct Answer: b) r ≈ 1
📝 Explanation:
Points aligned with positive slope show strong positive linear correlation.

14. The point-biserial correlation is used for

a) Two continuous variables
b) One continuous and one binary variable
c) Two binary variables
d) Ordinal and nominal
Correct Answer: b) One continuous and one binary variable
📝 Explanation:
It's a special case of Pearson for dichotomous predictors.

15. What is the covariance of a variable with itself?

a) Mean
b) Standard deviation
c) Variance
d) Correlation
Correct Answer: c) Variance
📝 Explanation:
Cov(X,X) = Var(X), the spread of the variable.

16. Spearman's rho is calculated using

a) Raw data
b) Ranked data
c) Log-transformed data
d) Z-scores
Correct Answer: b) Ranked data
📝 Explanation:
It applies Pearson correlation to ranks, handling non-normal data.
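
The "Pearson on ranks" idea can be sketched in plain Python as follows. The helper names are hypothetical, ties receive average ranks (one common convention), and the data are made up.

```python
from math import sqrt

def average_ranks(values):
    """Ranks 1..n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def spearman_rho(xs, ys):
    """Pearson's r applied to the ranks of each variable."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # y = x**3: monotonic but not linear

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, round(r, 3))
```

Here the ranks of x and y are identical, so rho reaches 1 while Pearson's r on the raw values stays below 1.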

17. A correlation matrix diagonal is always

a) 1
b) 0
c) Mean
d) Variance
Correct Answer: a) 1
📝 Explanation:
Correlation of a variable with itself is perfect (r=1).

18. Polychoric correlation is appropriate for

a) Continuous variables
b) Ordinal variables assuming latent continuity
c) Binary variables
d) Nominal variables
Correct Answer: b) Ordinal variables assuming latent continuity
📝 Explanation:
It estimates correlation from underlying normal variables.

19. If two variables have zero covariance, their correlation is

a) 1
b) 0
c) -1
d) Undefined
Correct Answer: b) 0
📝 Explanation:
No linear co-variation implies uncorrelated variables.

20. The sign of covariance matches the sign of

a) Correlation
b) Variance
c) Mean
d) Standard deviation
Correct Answer: a) Correlation
📝 Explanation:
Both indicate direction: positive or negative association.

21. Biserial correlation assumes the binary variable is

a) Continuous
b) From an underlying continuous distribution
c) Nominal
d) Ordinal
Correct Answer: b) From an underlying continuous distribution
📝 Explanation:
Used when the dichotomy is artificial, like pass/fail.

22. In hypothesis testing for correlation, the null hypothesis is usually

a) ρ = 0
b) ρ = 1
c) ρ > 0
d) ρ < 0
Correct Answer: a) ρ = 0
📝 Explanation:
Tests if the true population correlation is zero (no association).

23. Tetrachoric correlation is for

a) Two continuous variables
b) Two binary variables
c) Ordinal and continuous
d) Nominal and binary
Correct Answer: b) Two binary variables
📝 Explanation:
Assumes underlying bivariate normal distribution.

24. Covariance is sensitive to

a) Units of measurement
b) Scale invariance
c) Normality only
d) Monotonicity
Correct Answer: a) Units of measurement
📝 Explanation:
Changing units (e.g., cm to m) scales covariance, unlike correlation.
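
A quick numerical check of this, as a sketch with made-up height and weight data and hypothetical helper names:

```python
from math import sqrt

def sample_cov(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def pearson_r(xs, ys):
    return sample_cov(xs, ys) / sqrt(sample_cov(xs, xs) * sample_cov(ys, ys))

heights_cm = [150, 160, 170, 180, 190]  # made-up data
weights_kg = [52, 58, 69, 75, 88]
heights_m = [h / 100 for h in heights_cm]  # same heights, different unit

cov_cm = sample_cov(heights_cm, weights_kg)
cov_m = sample_cov(heights_m, weights_kg)
r_cm = pearson_r(heights_cm, weights_kg)
r_m = pearson_r(heights_m, weights_kg)

# Covariance shrinks by the unit factor (100); correlation does not move
print(cov_cm, cov_m)
print(round(r_cm, 4), round(r_m, 4))
```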

25. Phi coefficient measures correlation for

a) Two continuous variables
b) Two binary variables
c) One binary and one continuous
d) Ordinal variables
Correct Answer: b) Two binary variables
📝 Explanation:
It's Pearson's r applied to 2x2 contingency tables.

26. A high positive covariance means variables tend to be

a) Both high or both low
b) One high, one low
c) Independent
d) Constant
Correct Answer: a) Both high or both low
📝 Explanation:
They deviate from means in the same direction.

27. Multiple correlation coefficient R measures

a) Pairwise correlations
b) Relationship between one variable and multiple others
c) Covariance matrix determinant
d) Partial associations
Correct Answer: b) Relationship between one variable and multiple others
📝 Explanation:
R is the correlation between observed and predicted values in regression.

28. Intraclass correlation assesses

a) Between-group similarity
b) Within-group reliability
c) Linear trends
d) Causal effects
Correct Answer: b) Within-group reliability
📝 Explanation:
Used for clustered data, like inter-rater agreement.

29. If X and Y are independent, their covariance is

a) Positive
b) Negative
c) Zero
d) Variance of X
Correct Answer: c) Zero
📝 Explanation:
Independence implies zero covariance (whenever the covariance exists). The converse does not hold: uncorrelated variables can still be dependent.

30. Cramér's V is a measure of association for

a) Continuous variables
b) Nominal categorical variables
c) Ordinal variables
d) Binary only
Correct Answer: b) Nominal categorical variables
📝 Explanation:
Derived from chi-square, ranges 0-1 for contingency tables.

31. The sample correlation formula uses

a) n in denominator
b) n-1 in denominator
c) n-2 in denominator
d) Population size
Correct Answer: a) n in denominator
📝 Explanation:
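Pearson's r is Σ(X_i - x̄)(Y_i - ȳ) / (n * SD_X * SD_Y) when SD_X and SD_Y are computed with n in the denominator. If the covariance and standard deviations use n-1 instead, the factor becomes n-1; it cancels either way, so r has the same value.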

32. Canonical correlation analyzes relationships between

a) Two sets of variables
b) Single pairs
c) Time series
d) Spatial data
Correct Answer: a) Two sets of variables
📝 Explanation:
Finds linear combinations maximizing correlation between groups.

33. A correlation of 0 does not mean

a) No linear relationship
b) No relationship at all
c) Independence
d) Variables are unrelated
Correct Answer: b) No relationship at all
📝 Explanation:
Non-linear relationships (e.g., quadratic) can exist with r=0.
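
A small Python sketch of the quadratic case, with made-up data chosen to be symmetric around zero (the helper name `pearson_r` is hypothetical):

```python
from math import sqrt

def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

x = [-2, -1, 0, 1, 2]
y = [v ** 2 for v in x]  # y is fully determined by x

r = pearson_r(x, y)
print(r)  # the linear signal cancels even though y depends on x
```

The cancellation relies on the x values being symmetric around their mean; a quadratic sampled only on one side of the vertex would show a clearly non-zero r.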

34. Variance-covariance matrix is used in

a) PCA for dimensionality reduction
b) Only regression
c) Histogram plotting
d) Data cleaning
Correct Answer: a) PCA for dimensionality reduction
📝 Explanation:
Eigen decomposition reveals principal components.

35. Kendall's tau-b accounts for

a) Ties in data
b) Only linear ties
c) No ties
d) Outliers only
Correct Answer: a) Ties in data
📝 Explanation:
Adjusts for tied ranks in ordinal comparisons.

36. Negative covariance indicates

a) Variables move in same direction
b) Variables move in opposite directions
c) No movement
d) Constant values
Correct Answer: b) Variables move in opposite directions
📝 Explanation:
When one variable is above its mean, the other tends to be below its mean.

37. Theil's U is an asymmetric measure for

a) Categorical prediction uncertainty
b) Continuous correlation
c) Ordinal ranks
d) Binary phi
Correct Answer: a) Categorical prediction uncertainty
📝 Explanation:
Measures how one variable reduces uncertainty in another.

38. In time series, autocorrelation is

a) Covariance with lagged self
b) Cross-correlation with another series
c) Static variance
d) Trend removal
Correct Answer: a) Covariance with lagged self
📝 Explanation:
Autocorrelation is the autocovariance at a given lag divided by the variance; it helps detect patterns such as seasonality.

39. Pearson correlation requires

a) Linearity and homoscedasticity
b) Only monotonicity
c) No assumptions
d) Categorical data
Correct Answer: a) Linearity and homoscedasticity
📝 Explanation:
Assumes bivariate normality for significance tests.

40. Cross-correlation measures

a) Similarity between two series at lags
b) Self-similarity
c) Static pairs
d) Spatial links
Correct Answer: a) Similarity between two series at lags
📝 Explanation:
Used in signal processing for time shifts.

41. A value of covariance close to zero suggests

a) Strong association
b) Little linear co-variation
c) Perfect opposition
d) High variance
Correct Answer: b) Little linear co-variation
📝 Explanation:
But magnitude depends on scales; check correlation for standardization.

42. Somers' D is a correction for

a) Ties in Kendall's tau
b) Outliers in Pearson
c) Small samples
d) Non-normality
Correct Answer: a) Ties in Kendall's tau
📝 Explanation:
It is an asymmetric ordinal measure that adjusts Kendall's tau-a for pairs tied on the dependent variable.

43. In multivariate analysis, the correlation between residuals checks for

a) Multicollinearity
b) Independence assumptions
c) Homoscedasticity
d) Normality
Correct Answer: b) Independence assumptions
📝 Explanation:
Non-zero residual correlations violate model assumptions.

44. Covariance in portfolio theory measures

a) Risk diversification
b) Asset co-movement
c) Expected returns
d) Volatility alone
Correct Answer: b) Asset co-movement
📝 Explanation:
Low or negative covariance reduces overall portfolio risk.
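
For two assets, the standard portfolio variance formula makes the role of covariance explicit. The sketch below uses hypothetical weights, variances, and covariances, and the function name `portfolio_variance` is an assumption for this example.

```python
def portfolio_variance(w1, w2, var1, var2, cov12):
    """Variance of a two-asset portfolio: w1^2*var1 + w2^2*var2 + 2*w1*w2*cov12."""
    return w1**2 * var1 + w2**2 * var2 + 2 * w1 * w2 * cov12

# Hypothetical inputs: equal weights, return variances of 0.04 and 0.09
w1, w2 = 0.5, 0.5
var1, var2 = 0.04, 0.09

var_positive = portfolio_variance(w1, w2, var1, var2, cov12=0.03)
var_negative = portfolio_variance(w1, w2, var1, var2, cov12=-0.03)
print(var_positive, var_negative)
```

Only the covariance term changes between the two cases, which is why pairing assets that move against each other lowers total risk.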

45. Gamma coefficient is for

a) Ordinal data, ignoring ties
b) Nominal data
c) Continuous data
d) Binary data
Correct Answer: a) Ordinal data, ignoring ties
📝 Explanation:
Gamma = (C - D) / (C + D), where C and D are the numbers of concordant and discordant pairs; tied pairs are excluded.

46. The determinant of a correlation matrix indicates

a) Multicollinearity severity
b) Overall association strength
c) Average r
d) No meaning
Correct Answer: a) Multicollinearity severity
📝 Explanation:
Near zero suggests singular matrix, high redundancy.
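
To see how the determinant behaves, here is a minimal sketch for three variables. The two matrices are hypothetical, and `det3` is a hand-written helper rather than a library call.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Three uncorrelated variables: the identity matrix
uncorrelated = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

# Three nearly redundant variables: every pairwise r is 0.99
redundant = [
    [1.0, 0.99, 0.99],
    [0.99, 1.0, 0.99],
    [0.99, 0.99, 1.0],
]

det_uncorrelated = det3(uncorrelated)
det_redundant = det3(redundant)
print(det_uncorrelated, det_redundant)
```

The determinant is 1 when the variables are uncorrelated and shrinks toward 0 as they become redundant.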

47. Spearman correlation equals Pearson when data is

a) Ranked
b) Linear
c) Monotonic
d) Normal
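Correct Answer: a) Ranked
📝 Explanation:
Spearman's rho is Pearson's r computed on ranks, so the two coincide when the data are already ranks. For a relationship that is strictly monotonic but not linear, rho is ±1 while r can be smaller in magnitude.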

48. Zero correlation can occur with

a) Perfect linear trend
b) Unrelated variables
c) Quadratic relationship
d) Causal link
Correct Answer: c) Quadratic relationship
📝 Explanation:
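In a symmetric curvilinear pattern, such as y = x² over x values centered on zero, the positive and negative linear contributions cancel, giving r = 0.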

49. In R, cor() function defaults to

a) Spearman
b) Kendall
c) Pearson
d) Polychoric
Correct Answer: c) Pearson
📝 Explanation:
Use method='spearman' for ranks.

50. Covariance stationarity in time series requires constant

a) Mean and variance over time
b) Only mean
c) Only covariance
d) Trends
Correct Answer: a) Mean and variance over time
📝 Explanation:
Covariances depend only on lag, not time.

51. Yule's Q measures association in

a) 2x2 tables
b) Larger contingency
c) Ordinal
d) Continuous
Correct Answer: a) 2x2 tables
📝 Explanation:
Based on odds ratio for dichotomous variables.

52. Bivariate correlation is

a) Univariate analysis
b) Pairwise relationship
c) Multivariate
d) Unconditional
Correct Answer: b) Pairwise relationship
📝 Explanation:
Focuses on two variables at a time.

53. In Excel, CORREL function computes

a) Covariance
b) Pearson correlation
c) Spearman
d) Variance
Correct Answer: b) Pearson correlation
📝 Explanation:
For two ranges of data.

54. Heteroscedasticity affects correlation by

a) No impact
b) Biasing toward zero
c) Inflating values
d) Changing sign
Correct Answer: a) No impact
📝 Explanation:
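Correlation can still be computed as a descriptive statistic, but standard significance tests and confidence intervals for r assume homoscedasticity and can be unreliable without it.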

55. Lambda coefficient is for

a) Asymmetric nominal association
b) Symmetric ordinal
c) Binary symmetric
d) Continuous
Correct Answer: a) Asymmetric nominal association
📝 Explanation:
Reduces prediction error using one variable to predict another.

56. The squared correlation r² represents

a) Variance explained
b) Total variance
c) Residual variance
d) Covariance ratio
Correct Answer: a) Variance explained
📝 Explanation:
Proportion of variance in Y predictable from X.
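
For simple linear regression with one predictor, this can be verified numerically. The sketch below fits a least-squares line by hand on made-up data and compares 1 - SSE/SST with r²; all names are hypothetical.

```python
from math import sqrt

x = [1, 2, 3, 4, 5]  # made-up data
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sxx = sum((a - mean_x) ** 2 for a in x)
syy = sum((b - mean_y) ** 2 for b in y)  # total sum of squares (SST)

# Least-squares line y_hat = intercept + slope * x
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residual sum of squares (SSE)
sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))

r = sxy / sqrt(sxx * syy)
explained = 1 - sse / syy  # share of the variance in y accounted for by x
print(round(r ** 2, 4), round(explained, 4))
```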

57. In Python pandas, df.corr() returns

a) Covariance matrix
b) Correlation matrix
c) Means
d) Variances
Correct Answer: b) Correlation matrix
📝 Explanation:
Default is Pearson; specify method for others.

58. Concordant pairs in Kendall's tau are those where

a) Ranks agree in order
b) Ranks disagree
c) Tied
d) Equal
Correct Answer: a) Ranks agree in order
📝 Explanation:
A pair of observations is concordant when the observation with the higher X also has the higher Y, so the two orderings agree.
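
Pair counting can be sketched directly. The example below computes the simplest variant, tau-a, which makes no adjustment for ties; the function name and data are made up for illustration.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs; ties count as neither."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1  # X and Y order this pair the same way
        elif sign < 0:
            discordant += 1  # X and Y order this pair in opposite ways
    n = len(xs)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs

x = [1, 2, 3, 4]
y = [1, 3, 2, 4]  # only the middle two observations are out of order

tau = kendall_tau_a(x, y)
print(round(tau, 4))  # 5 concordant, 1 discordant, 6 pairs
```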

59. Covariance of standardized variables is

a) Variance
b) Correlation
c) Mean
d) Zero
Correct Answer: b) Correlation
📝 Explanation:
Z-scores have SD=1, so Cov(Zx, Zy) = r.
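
This identity is easy to confirm numerically. The sketch below uses made-up data and hypothetical helper names, with n-1 used consistently for both the standard deviation and the covariance.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def sample_cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def sample_sd(values):
    return sqrt(sample_cov(values, values))

def z_scores(values):
    m, s = mean(values), sample_sd(values)
    return [(v - m) / s for v in values]

x = [3, 7, 8, 12, 15]  # made-up data
y = [10, 14, 13, 20, 22]

r = sample_cov(x, y) / (sample_sd(x) * sample_sd(y))
cov_z = sample_cov(z_scores(x), z_scores(y))
print(round(r, 6), round(cov_z, 6))  # the two values agree
```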

60. Spurious correlation example is

a) Height and weight
b) Ice cream sales and drownings
c) Temperature and plant growth
d) Age and experience
Correct Answer: b) Ice cream sales and drownings
📝 Explanation:
Both rise with summer heat; neither causes the other.

61. In confirmatory factor analysis, correlations inform

a) Factor loadings
b) Latent structure
c) Observed means
d) Error terms
Correct Answer: b) Latent structure
📝 Explanation:
High inter-correlations suggest common factors.

62. The eta coefficient measures

a) Nominal predictor on continuous outcome
b) Ordinal on ordinal
c) Continuous on binary
d) Spatial correlation
Correct Answer: a) Nominal predictor on continuous outcome
📝 Explanation:
Non-linear analog of point-biserial.

63. Autocovariance at lag 0 is

a) Variance
b) Mean
c) Zero
d) Covariance with lag 1
Correct Answer: a) Variance
📝 Explanation:
Cov(Y_t, Y_t) = Var(Y).
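
A minimal sketch of autocovariance and autocorrelation for a short series follows. It assumes the common 1/n normalization, so the lag-0 value equals the population variance; the function names and the series are made up.

```python
def autocovariance(series, lag):
    """Autocovariance at the given lag, using 1/n normalization."""
    n = len(series)
    m = sum(series) / n
    return sum((series[t] - m) * (series[t + lag] - m) for t in range(n - lag)) / n

def autocorrelation(series, lag):
    """Autocovariance at the lag divided by the lag-0 autocovariance."""
    return autocovariance(series, lag) / autocovariance(series, 0)

series = [1, 3, 1, 3, 1, 3]  # made-up series that alternates every step

gamma0 = autocovariance(series, 0)  # equals the variance of the series
acf1 = autocorrelation(series, 1)   # negative: neighbors sit on opposite sides of the mean
acf2 = autocorrelation(series, 2)   # positive: values two steps apart match
print(gamma0, round(acf1, 4), round(acf2, 4))
```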

64. Fisher's z-transformation is used to

a) Stabilize variance of r for inference
b) Compute covariance
c) Rank data
d) Test normality
Correct Answer: a) Stabilize variance of r for inference
📝 Explanation:
z = 0.5 ln((1+r)/(1-r)) for hypothesis tests.
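
One typical use is an approximate confidence interval for r. The sketch below assumes the usual large-sample standard error 1/sqrt(n - 3) for z and a 95% critical value of 1.96; the function names and the inputs (r = 0.6, n = 30) are hypothetical.

```python
from math import atanh, log, sqrt, tanh

def fisher_z(r):
    """Fisher's transformation: z = 0.5 * ln((1 + r) / (1 - r))."""
    return 0.5 * log((1 + r) / (1 - r))

def r_confidence_interval(r, n, z_crit=1.96):
    """Approximate interval: build it on the z scale, then map back with tanh."""
    z = fisher_z(r)
    se = 1 / sqrt(n - 3)
    return tanh(z - z_crit * se), tanh(z + z_crit * se)

low, high = r_confidence_interval(0.6, 30)
print(round(low, 3), round(high, 3))
```

Because the interval is built on the z scale and transformed back, it stays inside (-1, 1) and is not symmetric around r.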
