These 50 MCQs cover fundamental concepts in regression analysis, including linear and multiple regression, assumptions, diagnostics, and interpretation. They are ideal for students and professionals in data analysis who want to test their understanding of predictive modeling techniques.
50 Regression Analysis in Data Analysis MCQs
✅ Correct Answer: b) A dependent variable and one or more independent variables
📝 Explanation:
Linear regression predicts a continuous outcome (Y) as a linear function of predictors (X).
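As a minimal illustration of this idea, the slope and intercept of a simple linear regression can be fit by hand with the ordinary least-squares formulas; the data points below are made up for the example.

```python
# Simple linear regression fit by hand (OLS); the data are illustrative.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 4.1, 6.2, 7.9, 10.1]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Slope: covariance(X, Y) / variance(X); intercept from the means.
beta1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
beta0 = mean_y - beta1 * mean_x

def predict(x):
    return beta0 + beta1 * x
```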
✅ Correct Answer: b) Change in Y for a one-unit change in X
📝 Explanation:
β1 = ΔY / ΔX, holding other factors constant.
✅ Correct Answer: b) Proportion of variance in Y explained by X
📝 Explanation:
Ranges from 0 to 1; higher values indicate better fit.
✅ Correct Answer: b) Relationship between X and Y is linear
📝 Explanation:
Verified via scatter plots or residual plots.
✅ Correct Answer: a) Constant variance of residuals
📝 Explanation:
Tested with Breusch-Pagan; violations suggest heteroscedasticity.
✅ Correct Answer: a) VIF (Variance Inflation Factor)
📝 Explanation:
VIF > 5-10 indicates high collinearity among predictors.
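A sketch of how the VIF is computed: regress each predictor on the others and take 1 / (1 − R²). The synthetic data below make x1 and x2 nearly collinear, so their VIFs come out large.

```python
import numpy as np

# Synthetic predictors: x2 is nearly collinear with x1, x3 is independent.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.1 * rng.normal(size=200)
x3 = rng.normal(size=200)
X = np.column_stack([x1, x2, x3])

def vif(X, j):
    """VIF of column j: 1 / (1 - R^2) from regressing X_j on the rest."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(y)), others])  # add intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
    return 1.0 / (1.0 - r2)
```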
✅ Correct Answer: a) Expected Y when all X=0
📝 Explanation:
β0; it may lack a meaningful interpretation if X=0 lies outside the observed range of the data.
✅ Correct Answer: b) Observed minus predicted values
📝 Explanation:
Used for diagnostics; should be randomly distributed.
✅ Correct Answer: a) Overall model significance
📝 Explanation:
H0: all slope coefficients equal 0; a low p-value indicates the model explains significant variance.
✅ Correct Answer: a) H0: β=0
📝 Explanation:
Significance of individual predictors.
✅ Correct Answer: c) Both a and b
📝 Explanation:
Penalizes adding irrelevant variables; better for model comparison.
✅ Correct Answer: a) Leverage and Cook's distance
📝 Explanation:
High values indicate influential points affecting fit.
✅ Correct Answer: a) Root mean squared error
📝 Explanation:
Measures prediction accuracy; the degrees-of-freedom-adjusted form √(SSE/(n−k−1)) is the residual standard error, while plain RMSE uses √(SSE/n).
✅ Correct Answer: a) Durbin-Watson test
📝 Explanation:
Values near 2 indicate no serial correlation; common in time series.
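The Durbin-Watson statistic itself is straightforward to compute from a residual series: the sum of squared successive differences divided by the sum of squared residuals. The example series below are made up to show the two extremes.

```python
# Durbin-Watson statistic: near 2 means no first-order serial correlation,
# near 0 positive, near 4 negative autocorrelation.
def durbin_watson(resid):
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e ** 2 for e in resid)
    return num / den

alternating = [1.0, -1.0] * 5                 # strong negative autocorrelation
persistent = [1.0] * 4 + [-1.0] * 4           # strong positive autocorrelation
durbin_watson(alternating)  # → 3.6, near 4
durbin_watson(persistent)   # → 0.5, near 0
```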
✅ Correct Answer: b) Binary or categorical outcomes
📝 Explanation:
Models log-odds; uses sigmoid function.
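The log-odds/sigmoid relationship can be shown in a few lines; the coefficients here are arbitrary, chosen only for illustration.

```python
import math

# The sigmoid maps a linear predictor (the log-odds) to a probability.
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

beta0, beta1 = -1.0, 2.0   # made-up coefficients

def prob(x):
    """P(Y=1 | x) under a logistic model with the coefficients above."""
    return sigmoid(beta0 + beta1 * x)

# By construction, log(p / (1 - p)) recovers the linear predictor.
```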
✅ Correct Answer: a) Sum of squared coefficients
📝 Explanation:
L2 regularization; reduces multicollinearity.
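Ridge has a closed-form solution, (XᵀX + λI)⁻¹Xᵀy, which makes the shrinkage effect easy to see; the data below are synthetic and the intercept is omitted for simplicity.

```python
import numpy as np

# Synthetic regression data (no intercept, for simplicity).
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, 2.0, -1.0]) + 0.1 * rng.normal(size=100)

def ridge(X, y, lam):
    """Closed-form ridge estimate: (X'X + lam*I)^(-1) X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

beta_ols = ridge(X, y, 0.0)      # lam = 0 reduces to OLS
beta_ridge = ridge(X, y, 10.0)   # larger lam shrinks coefficients toward 0
```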
✅ Correct Answer: a) L1 penalty
📝 Explanation:
Sum of absolute coefficients; performs variable selection.
✅ Correct Answer: a) Adding higher powers of X
📝 Explanation:
Captures non-linear relationships; risks overfitting.
✅ Correct Answer: a) R-squared
📝 Explanation:
1 - (SS_res / SS_tot).
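Computed directly from that formula, with illustrative values:

```python
# R^2 = 1 - SS_res / SS_tot; values below are made up for illustration.
y     = [3.0, 5.0, 7.0, 9.0]
y_hat = [2.8, 5.1, 7.2, 8.9]

mean_y = sum(y) / len(y)
ss_tot = sum((yi - mean_y) ** 2 for yi in y)
ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
r2 = 1 - ss_res / ss_tot   # close to 1 here: fitted values track y well
```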
✅ Correct Answer: a) Q-Q plot or Shapiro-Wilk
📝 Explanation:
Assumption for inference; affects confidence intervals.
✅ Correct Answer: a) Sum of squared residuals
📝 Explanation:
Least squares method; unbiased under assumptions.
✅ Correct Answer: c) Both a and b
📝 Explanation:
Or transformations like log(Y).
✅ Correct Answer: b) 0 to 4
📝 Explanation:
Around 2 is ideal; values toward 0 suggest positive serial correlation, values toward 4 negative.
✅ Correct Answer: a) Bias + variance estimate
📝 Explanation:
A well-specified model has Cp ≈ p (the number of parameters); used for subset selection.
✅ Correct Answer: a) Constant hazard ratio over time
📝 Explanation:
Tested with Schoenfeld residuals.
✅ Correct Answer: a) Root mean squared percentage error
📝 Explanation:
Scale-free accuracy measure.
✅ Correct Answer: a) All variables, removes insignificant
📝 Explanation:
Stepwise; based on p-values.
✅ Correct Answer: a) Residual divided by its SE
📝 Explanation:
|e| > 3 flags outliers.
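A simplified sketch of that rule, scaling each residual by the overall residual standard deviation rather than a point-specific standard error; the residual series is made up, with one planted outlier.

```python
import statistics

# Made-up residuals with one clear outlier at index 3.
resid = [0.2, -0.5, 0.1, 8.0, -0.3, 0.4, -0.2, 0.1, -0.4, 0.3]

s = statistics.stdev(resid)                  # overall residual SD
standardized = [e / s for e in resid]
outliers = [i for i, z in enumerate(standardized) if abs(z) > 3]
```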
✅ Correct Answer: a) Cumulative normal
📝 Explanation:
For binary outcomes; probit uses the standard normal CDF, P(Y=1) = Φ(Xβ), so the link is Φ⁻¹.
✅ Correct Answer: a) Non-nested GLMs
📝 Explanation:
Likelihood-based model selection.
✅ Correct Answer: a) Marginal effect of X controlling others
📝 Explanation:
Residuals of Y on other X vs residuals of this X.
✅ Correct Answer: a) Average (Y - Ŷ)^2 on new data
📝 Explanation:
For out-of-sample performance.
✅ Correct Answer: a) Time-varying errors
📝 Explanation:
Auto-regressive integrated moving average.
✅ Correct Answer: a) Agreement beyond correlation
📝 Explanation:
Lin's concordance correlation: ρ_c = 2ρσ_xσ_y / (σ_x² + σ_y² + (μ_x − μ_y)²); it penalizes both imprecision and location/scale bias.
✅ Correct Answer: a) Non-parametric method
📝 Explanation:
Nadaraya-Watson; local weighting.
✅ Correct Answer: a) Functional form misspecification
📝 Explanation:
Adds powers of fitted values.
✅ Correct Answer: a) Non-parametric in Cox
📝 Explanation:
h(t|X=0); estimated via Breslow.
✅ Correct Answer: a) Zero values
📝 Explanation:
Mean absolute percentage error; undefined if Y=0.
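A minimal MAPE implementation that makes the zero-value failure explicit, with made-up numbers:

```python
# MAPE = 100 * mean(|(y - yhat) / y|); undefined when any actual is zero,
# so this sketch raises rather than silently dividing by zero.
def mape(actual, forecast):
    if any(a == 0 for a in actual):
        raise ValueError("MAPE undefined for zero actual values")
    n = len(actual)
    return 100.0 * sum(abs((a - f) / a)
                       for a, f in zip(actual, forecast)) / n

mape([100.0, 200.0], [110.0, 180.0])  # → 10.0 (10% off on each point)
```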
✅ Correct Answer: a) All possible subsets
📝 Explanation:
2^p models; computationally intensive.
✅ Correct Answer: a) The point itself in SE calculation
📝 Explanation:
For detecting outliers influencing fit.
✅ Correct Answer: a) Ordinal outcomes
📝 Explanation:
Cumulative logits; proportional odds assumption.
✅ Correct Answer: a) Efficient under H0
📝 Explanation:
Gradient of log-likelihood; no full fit needed.
✅ Correct Answer: c) Both
📝 Explanation:
Spatial autoregressive model: y = ρWy + Xβ + ε, where W is the spatial weights matrix.
✅ Correct Answer: a) Naive method
📝 Explanation:
Theil's U < 1 means the forecast beats the naive no-change forecast.
✅ Correct Answer: a) Local polynomials
📝 Explanation:
Locally estimated scatterplot smoothing.
✅ Correct Answer: a) Omitted variables or specification
📝 Explanation:
Augments the regression with powers of the fitted values; if those added terms are significant, the functional form is misspecified.
✅ Correct Answer: a) Log-linear effect on time
📝 Explanation:
log(T) = βX + σW; parametric survival.
✅ Correct Answer: a) Symmetric absolute percentage errors
📝 Explanation:
Handles zero issues in MAPE.
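One common symmetric form divides by the sum of the absolute actual and forecast values, which stays defined when only one of the two is zero; values here are illustrative.

```python
# sMAPE = 100/n * sum(2|f - a| / (|a| + |f|)); the a == f guard also
# covers the case where both values are zero.
def smape(actual, forecast):
    n = len(actual)
    return 100.0 / n * sum(
        0.0 if a == f else 2 * abs(f - a) / (abs(a) + abs(f))
        for a, f in zip(actual, forecast))
```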
✅ Correct Answer: a) Computationally expensive
📝 Explanation:
For p>20, use branch and bound.
✅ Correct Answer: a) Prediction error leaving out i
📝 Explanation:
PRESS component.
✅ Correct Answer: a) Nominal outcomes >2 categories
📝 Explanation:
Logits of each category against a reference category; relies on the IIA assumption.
✅ Correct Answer: a) Asymptotic for large samples
📝 Explanation:
The Wald statistic (β̂ / SE)² is asymptotically χ² with 1 df under H0.
✅ Correct Answer: a) Observed variables regression
📝 Explanation:
Special case of SEM without latents.
✅ Correct Answer: a) In-sample naive forecast
📝 Explanation:
Mean absolute scaled error; scale-free.
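Under that definition, MASE divides the forecast MAE by the in-sample MAE of the one-step naive (previous-value) forecast; the series below are made up.

```python
# MASE = out-of-sample MAE / in-sample MAE of the naive forecast.
# A value below 1 means the forecast beats the naive benchmark in-sample.
def mase(train, actual, forecast):
    naive_mae = sum(abs(train[t] - train[t - 1])
                    for t in range(1, len(train))) / (len(train) - 1)
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    return mae / naive_mae

mase([1.0, 2.0, 3.0, 4.0], [5.0, 6.0], [4.5, 6.5])  # → 0.5
```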
✅ Correct Answer: a) RSS + smoothness penalty
📝 Explanation:
λ tunes fit vs smoothness.
✅ Correct Answer: a) Structural breaks
📝 Explanation:
Cumulative sum of residuals.
✅ Correct Answer: a) Unobserved heterogeneity
📝 Explanation:
Random effects in survival.