These 50 MCQs cover fundamental concepts in regression analysis, including linear and multiple regression, assumptions, diagnostics, and interpretation. They are ideal for students and data analysis professionals who want to test their understanding of predictive modeling techniques.
1. What does linear regression model the relationship between?
Correct Answer: b) A dependent variable and one or more independent variables
Explanation:
Linear regression predicts a continuous outcome (Y) as a linear function of predictors (X).
2. The slope coefficient in simple linear regression represents
Correct Answer: b) Change in Y for a one-unit change in X
Explanation:
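β1 = ΔY / ΔX: the expected change in Y for a one-unit increase in X. (In multiple regression, the same reading holds with the other predictors held constant.)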
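As a minimal sketch of how the slope and intercept can be estimated by least squares, the following uses only the Python standard library. The function name `fit_simple_ols` and the data are hypothetical and chosen so that the true line is Y = 1 + 2X.

```python
def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares around the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data lying exactly on y = 1 + 2x
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]
b0, b1 = fit_simple_ols(x, y)
print(b0, b1)  # intercept 1.0, slope 2.0
```

Here each one-unit increase in X raises the fitted Y by the slope, 2.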
3. R-squared measures
Correct Answer: b) Proportion of variance in Y explained by X
Explanation:
Ranges from 0 to 1; higher values indicate better fit.
4. The assumption of linearity in regression means
Correct Answer: b) Relationship between X and Y is linear
Explanation:
Verified via scatter plots or residual plots.
5. Homoscedasticity refers to
Correct Answer: a) Constant variance of residuals
Explanation:
Tested with Breusch-Pagan; violations suggest heteroscedasticity.
6. In multiple regression, multicollinearity is detected using
Correct Answer: a) VIF (Variance Inflation Factor)
Explanation:
A VIF above 5 to 10 (common rule-of-thumb thresholds) indicates high collinearity among predictors.
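The VIF for predictor j is 1 / (1 - R_j²), where R_j² comes from regressing that predictor on the others. With exactly two predictors, R_j² equals their squared Pearson correlation, which allows a short sketch without any regression library. The function names and data below are hypothetical.

```python
import math


def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    saa = sum((ai - mean_a) ** 2 for ai in a)
    sbb = sum((bi - mean_b) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical predictors with correlation 0.8
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
vif = vif_two_predictors(x1, x2)
print(round(vif, 3))  # about 2.778, below the usual 5 to 10 thresholds
```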
7. The intercept in regression is
Correct Answer: a) Expected Y when all X=0
Explanation:
β0; may lack interpretation if X=0 is outside range.
8. Residuals are
Correct Answer: b) Observed minus predicted values
Explanation:
Used for diagnostics; should be randomly distributed.
9. The F-test in regression tests
Correct Answer: a) Overall model significance
Explanation:
H0: all slope coefficients are zero; a low p-value indicates the model explains a significant share of the variance in Y.
10. t-test for coefficients tests
Correct Answer: a) H0: β=0
Explanation:
Significance of individual predictors.
11. Adjusted R-squared accounts for
Correct Answer: c) Both a and b
Explanation:
Penalizes adding irrelevant variables; better for model comparison.
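To make the penalty concrete, here is a small sketch of the usual adjusted R-squared formula, 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the number of observations and k the number of predictors. The function name and the example numbers are hypothetical.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R-squared for n observations and k predictors (intercept excluded)."""
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - k - 1)


# Same raw R-squared, different numbers of predictors
few_predictors = adjusted_r_squared(0.80, n=50, k=3)
many_predictors = adjusted_r_squared(0.80, n=50, k=10)
print(round(few_predictors, 4), round(many_predictors, 4))
```

With the raw R-squared held fixed, the model with more predictors receives the lower adjusted value.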
12. Outliers in regression can be detected using
Correct Answer: a) Leverage and Cook's distance
Explanation:
High values indicate influential points affecting fit.
13. The standard error of the estimate is
Correct Answer: a) Root mean squared error
Explanation:
Measures prediction accuracy; √(SSE/(n-k-1)).
14. Autocorrelation in residuals is tested with
Correct Answer: a) Durbin-Watson test
Explanation:
Values near 2 indicate no serial correlation; common in time series.
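The Durbin-Watson statistic is the sum of squared successive differences of the residuals divided by the sum of squared residuals. The sketch below computes it directly; the residual series are made up to show the two extremes.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for a sequence of time-ordered residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals: sign flips every step (negative autocorrelation)
alternating = [1.0, -1.0, 1.0, -1.0]
# Hypothetical residuals: long runs of the same sign (positive autocorrelation)
persistent = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]

dw_alternating = durbin_watson(alternating)  # above 2
dw_persistent = durbin_watson(persistent)    # below 2
print(dw_alternating, dw_persistent)
```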
15. Logistic regression is used for
Correct Answer: b) Binary or categorical outcomes
Explanation:
Models log-odds; uses sigmoid function.
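The link between log-odds and probability can be sketched as follows. The coefficients `b0` and `b1` are hypothetical values, not estimates from any data.

```python
import math


def sigmoid(z):
    """Map a log-odds value z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def log_odds(p):
    """Inverse of the sigmoid: the logit of a probability p."""
    return math.log(p / (1.0 - p))


# Hypothetical coefficients for a single predictor
b0, b1 = -1.0, 0.5
x = 2.0
z = b0 + b1 * x   # linear predictor on the log-odds scale
p = sigmoid(z)    # predicted P(Y = 1)
print(z, p)
```

A linear predictor of zero corresponds to even odds, so the predicted probability here is 0.5.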
16. In ridge regression, the penalty is on
Correct Answer: a) Sum of squared coefficients
Explanation:
L2 regularization; shrinks coefficients toward zero and stabilizes estimates under multicollinearity.
17. Lasso regression uses
Correct Answer: a) L1 penalty
Explanation:
Sum of absolute coefficients; performs variable selection.
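The difference between the L2 and L1 penalties is easiest to see with one predictor and no intercept. The sketch below assumes centered data, a ridge objective of RSS + λβ², and a lasso objective of ½·RSS + λ|β|; other scalings of λ are common in software, so the numbers are illustrative only.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam, returning exactly 0 inside [-lam, lam]."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def ridge_slope(x, y, lam):
    """Minimizer of sum((y - b*x)^2) + lam * b^2 for one centered predictor."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)


def lasso_slope(x, y, lam):
    """Minimizer of 0.5 * sum((y - b*x)^2) + lam * |b| for one centered predictor."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return soft_threshold(sxy, lam) / sxx


# Hypothetical centered data with OLS slope 2
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-4.0, -2.0, 0.0, 2.0, 4.0]
print(ridge_slope(x, y, 10.0))  # shrunk but nonzero
print(lasso_slope(x, y, 25.0))  # set exactly to zero
```

Ridge only shrinks the slope, while a large enough lasso penalty sets it exactly to zero, which is why lasso performs variable selection.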
18. Polynomial regression extends linear by
Correct Answer: a) Adding higher powers of X
Explanation:
Captures non-linear relationships; risks overfitting.
19. The coefficient of determination is
Correct Answer: a) R-squared
Explanation:
1 - (SS_res / SS_tot).
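A direct translation of this formula, using made-up observed and fitted values:

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed values and model predictions
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
r2 = r_squared(observed, predicted)
print(round(r2, 4))  # SS_res = 0.10, SS_tot = 5.0
```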
20. Normality of residuals is tested with
Correct Answer: a) Q-Q plot or Shapiro-Wilk
Explanation:
Assumption for inference; affects confidence intervals.
21. In OLS regression, the goal is to minimize
Correct Answer: a) Sum of squared residuals
Explanation:
Least squares method; unbiased under assumptions.
22. Heteroscedasticity can be addressed by
Correct Answer: c) Both a and b
Explanation:
Or transformations like log(Y).
23. The Durbin-Watson statistic ranges from
Correct Answer: b) 0 to 4
Explanation:
Around 2 is ideal; values near 0 indicate positive autocorrelation and values near 4 indicate negative autocorrelation.
24. Mallows' Cp selects models by
Correct Answer: a) Bias + variance estimate
Explanation:
Cp ≈ p; for subset selection.
25. In Cox proportional hazards, the assumption is
Correct Answer: a) Constant hazard ratio over time
Explanation:
Tested with Schoenfeld residuals.
26. The RMSPE is
Correct Answer: a) Root mean squared percentage error
Explanation:
Scale-free accuracy measure.
27. Backward elimination starts with
Correct Answer: a) All variables, removes insignificant
Explanation:
Stepwise; based on p-values.
28. The studentized residual is
Correct Answer: a) Residual divided by its SE
Explanation:
A studentized residual with absolute value above about 3 flags a potential outlier.
29. In probit regression, the link is
Correct Answer: a) Cumulative normal
Explanation:
For binary outcomes; P(Y=1) = Φ(βX), so the link function is the inverse normal CDF, Φ⁻¹.
30. The Vuong test compares
Correct Answer: a) Non-nested GLMs
Explanation:
Likelihood-based model selection.
31. Partial regression plots show
Correct Answer: a) Marginal effect of X controlling others
Explanation:
Residuals of Y on other X vs residuals of this X.
32. The mean squared prediction error is
Correct Answer: a) Average (Y - Ŷ)^2 on new data
Explanation:
For out-of-sample performance.
33. Regression with ARIMA errors models
Correct Answer: a) Time-varying errors
Explanation:
Auto-regressive integrated moving average.
34. The concordance correlation coefficient measures
Correct Answer: a) Agreement beyond correlation
Explanation:
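ρ_c = 2ρσ_xσ_y / (σ_x² + σ_y² + (μ_x − μ_y)²); it equals ρ only when the two sets of measurements share the same mean and variance.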
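A minimal sketch of Lin's coefficient in its covariance form, 2·s_xy / (s_x² + s_y² + (x̄ − ȳ)²), using population (divide-by-n) moments as an assumption. The data are hypothetical: the second rater is perfectly correlated with the first but reads one unit higher.

```python
def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient using divide-by-n moments."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x) / n
    var_y = sum((yi - mean_y) ** 2 for yi in y) / n
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / n
    return 2.0 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


# Hypothetical ratings: Pearson correlation is 1, but rater_b is shifted by +1
rater_a = [1.0, 2.0, 3.0, 4.0, 5.0]
rater_b = [2.0, 3.0, 4.0, 5.0, 6.0]
ccc = concordance_correlation(rater_a, rater_b)
print(ccc)  # below 1 because of the systematic shift
```

The constant offset leaves the Pearson correlation at 1 but pulls the concordance coefficient down, which is the sense in which it measures agreement beyond correlation.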
35. Kernel regression is a
Correct Answer: a) Non-parametric method
Explanation:
Nadaraya-Watson; local weighting.
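The Nadaraya-Watson estimate at a point is a weighted average of the observed responses, with weights from a kernel centered at that point. The sketch below assumes a Gaussian kernel and a hand-picked bandwidth; both are illustrative choices, and the data are made up.

```python
import math


def gaussian_kernel(u):
    """Unnormalized Gaussian kernel; the constant cancels in the ratio below."""
    return math.exp(-0.5 * u * u)


def nadaraya_watson(x0, xs, ys, bandwidth):
    """Kernel-weighted average of ys, with weights centered at x0."""
    weights = [gaussian_kernel((x0 - xi) / bandwidth) for xi in xs]
    return sum(w * yi for w, yi in zip(weights, ys)) / sum(weights)


# Hypothetical data following y = x^2
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.0, 1.0, 4.0, 9.0, 16.0]
estimate = nadaraya_watson(2.0, xs, ys, bandwidth=0.5)
print(estimate)
```

A smaller bandwidth tracks the nearest observations more closely, while a larger one smooths toward the overall mean.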
36. The Ramsey RESET test detects
Correct Answer: a) Functional form misspecification
Explanation:
Adds powers of fitted values.
37. In survival regression, the baseline hazard is
Correct Answer: a) Non-parametric in Cox
Explanation:
h(t|X=0); estimated via Breslow.
38. The MAPE is sensitive to
Correct Answer: a) Zero values
Explanation:
Mean absolute percentage error; undefined if Y=0.
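A short sketch of MAPE that makes the zero-value problem explicit by raising an error rather than dividing by zero. The function name and data are hypothetical, and the result is returned as a fraction rather than a percentage.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, returned as a fraction (0.05 means 5%)."""
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical actuals and forecasts: errors of 10%, 10%, and 0%
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]
mape_score = mape(actual, forecast)
print(round(mape_score, 4))
```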
39. Best subset selection evaluates
Correct Answer: a) All possible subsets
Explanation:
2^p models; computationally intensive.
40. The externally studentized residual excludes
Correct Answer: a) The point itself in SE calculation
Explanation:
For detecting outliers influencing fit.
41. Ordered logit is for
Correct Answer: a) Ordinal outcomes
Explanation:
Cumulative logits; proportional odds assumption.
42. The score test in GLM is
Correct Answer: a) Efficient under H0
Explanation:
Gradient of log-likelihood; no full fit needed.
43. In spatial regression, SAR models
Correct Answer: c) Both
Explanation:
Spatial autoregressive; y = ρWy + Xβ + ε, where W is the spatial weights matrix.
44. Theil's U statistic compares forecasts to
Correct Answer: a) Naive method
Explanation:
U < 1 better than no-change forecast.
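Several versions of Theil's U exist. The sketch below assumes the relative-change form often called U2, in which forecast errors and actual changes are both scaled by the previous actual value; the function name and data are hypothetical.

```python
import math


def theils_u2(actual, forecasts):
    """Theil's U2: forecasts[t-1] is the forecast of actual[t], for t = 1..n."""
    num = 0.0
    den = 0.0
    for t in range(1, len(actual)):
        prev = actual[t - 1]
        num += ((forecasts[t - 1] - actual[t]) / prev) ** 2  # model error
        den += ((actual[t] - prev) / prev) ** 2              # no-change error
    return math.sqrt(num / den)


# Hypothetical series and one-step-ahead forecasts for its last two values
actual = [100.0, 110.0, 121.0]
model_forecasts = [108.0, 120.0]
naive_forecasts = actual[:-1]  # no-change forecast: previous actual value

u_model = theils_u2(actual, model_forecasts)
u_naive = theils_u2(actual, naive_forecasts)
print(round(u_model, 3), u_naive)
```

The no-change forecast scores exactly 1 by construction, so a value below 1 means the model beats it on this series.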
45. LOESS regression uses
Correct Answer: a) Local polynomials
Explanation:
Locally estimated scatterplot smoothing.
46. The Link test in regression checks
Correct Answer: a) Omitted variables or specification
Explanation:
Regress Y on the fitted values and their squares; a significant squared term indicates misspecification.
47. Accelerated failure time models assume
Correct Answer: a) Log-linear effect on time
Explanation:
log(T) = βX + σW; parametric survival.
48. The SMAPE averages
Correct Answer: a) Symmetric absolute percentage errors
Explanation:
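Mitigates MAPE's problem with zero actual values, although it is still undefined when the actual and forecast values are both zero.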
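SMAPE has more than one definition in use. The sketch below assumes the variant with denominator (|A| + |F|) / 2, which ranges from 0 to 2, and treats a term where both values are zero as contributing 0; both choices are assumptions and should be checked against whatever tool is being compared.

```python
def smape(actual, forecast):
    """Symmetric MAPE with denominator (|A| + |F|) / 2, as a fraction in [0, 2]."""
    total = 0.0
    for a, f in zip(actual, forecast):
        scale = (abs(a) + abs(f)) / 2.0
        # Assumed convention: a 0/0 term counts as a perfect forecast
        total += 0.0 if scale == 0 else abs(f - a) / scale
    return total / len(actual)


# Hypothetical data with a zero actual value, where plain MAPE would fail
actual = [0.0, 100.0]
forecast = [10.0, 100.0]
smape_score = smape(actual, forecast)
print(smape_score)
```

A zero actual paired with any nonzero forecast always yields the maximum term of 2, so the measure remains defined but is not very informative for such points.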
49. All subset regression is exhaustive but
Correct Answer: a) Computationally expensive
Explanation:
For p>20, use branch and bound.
50. The deleted residual is
Correct Answer: a) Prediction error leaving out i
Explanation:
PRESS component.
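For OLS, the deleted residual does not require refitting the model n times: it equals the ordinary residual divided by (1 - h_ii), where h_ii is the leverage. The sketch below checks that identity for simple regression, where h_ii = 1/n + (x_i - x̄)²/S_xx. Function names and data are hypothetical.

```python
def fit_line(x, y):
    """Least-squares (intercept, slope) for simple regression."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - slope * mx, slope


def deleted_residuals_shortcut(x, y):
    """Deleted residuals via e_i / (1 - h_ii), using a single fit."""
    n = len(x)
    mx = sum(x) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b0, b1 = fit_line(x, y)
    result = []
    for xi, yi in zip(x, y):
        leverage = 1.0 / n + (xi - mx) ** 2 / sxx
        result.append((yi - (b0 + b1 * xi)) / (1.0 - leverage))
    return result


def deleted_residuals_refit(x, y):
    """Deleted residuals by refitting without each point in turn."""
    result = []
    for i in range(len(x)):
        b0, b1 = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        result.append(y[i] - (b0 + b1 * x[i]))
    return result


# Hypothetical data
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 6.3]
shortcut = deleted_residuals_shortcut(x, y)
refit = deleted_residuals_refit(x, y)
press = sum(d ** 2 for d in shortcut)  # PRESS: sum of squared deleted residuals
print(press)
```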
51. Multinomial logit is for
Correct Answer: a) Nominal outcomes >2 categories
Explanation:
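Models the log-odds of each category relative to a baseline category; relies on the independence of irrelevant alternatives (IIA) assumption.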
52. The Wald test is
Correct Answer: a) Asymptotic for large samples
Explanation:
(β-hat / SE)^2 ~ χ².
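For a single coefficient, the Wald statistic is compared with a chi-squared distribution with one degree of freedom. Its upper tail equals the two-sided normal tail, so the p-value can be sketched with `math.erfc` alone. The estimate and standard error below are hypothetical.

```python
import math


def wald_test(beta_hat, standard_error):
    """Return (Wald statistic, p-value) for H0: beta = 0, one coefficient."""
    w = (beta_hat / standard_error) ** 2
    # P(chi2 with 1 df > w) = P(|Z| > sqrt(w)) = erfc(sqrt(w / 2))
    p_value = math.erfc(math.sqrt(w / 2.0))
    return w, p_value


# Hypothetical estimate sitting right at the conventional 5% boundary
w, p_value = wald_test(beta_hat=1.96, standard_error=1.0)
print(round(w, 4), round(p_value, 3))
```

Because the reference distribution is asymptotic, the p-value is only approximate in small samples.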
53. In SEM, path analysis is
Correct Answer: a) Observed variables regression
Explanation:
Special case of SEM without latents.
54. The MASE normalizes errors by
Correct Answer: a) In-sample naive forecast
Explanation:
Mean absolute scaled error; scale-free.
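A sketch of MASE for non-seasonal data, assuming the scaling factor is the in-sample mean absolute error of the one-step naive (previous value) forecast. The function name and data are hypothetical.

```python
def mase(train, actual, forecast):
    """Mean absolute scaled error using the in-sample one-step naive forecast."""
    # In-sample MAE of predicting each training value by the previous one
    naive_mae = sum(
        abs(train[t] - train[t - 1]) for t in range(1, len(train))
    ) / (len(train) - 1)
    forecast_mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    return forecast_mae / naive_mae


# Hypothetical training series, then out-of-sample actuals and forecasts
train = [10.0, 12.0, 11.0, 13.0]
actual = [14.0, 15.0]
forecast = [13.0, 15.0]
mase_score = mase(train, actual, forecast)
print(round(mase_score, 3))
```

A value below 1 means the forecasts have a smaller average error than the naive method had on the training data.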
55. Smoothing splines minimize
Correct Answer: a) RSS + smoothness penalty
Explanation:
λ tunes fit vs smoothness.
56. The CUSUM test detects
Correct Answer: a) Structural breaks
Explanation:
Cumulative sum of residuals.
57. Frailty models account for
Correct Answer: a) Unobserved heterogeneity
Explanation:
Random effects in survival.


