100 Descriptive, Inferential, and Time Series Statistics in Data Analysis - MCQs
100 challenging multiple-choice questions on descriptive statistics, inferential methods, and time series analysis, inspired by real data science and analytics interview questions from FAANG companies, consulting firms, and quant roles.
✅ Correct Answer: c) Mean
📝 Explanation:
The mean incorporates every value and is pulled toward extreme scores.
✅ Correct Answer: a) Mean > Median > Mode
📝 Explanation:
The tail on the right pulls the mean highest, followed by median, then mode.
✅ Correct Answer: b) Q3 – Q1
📝 Explanation:
IQR measures spread between the 75th and 25th percentiles.
✅ Correct Answer: d) Median
📝 Explanation:
Median describes location, not spread.
✅ Correct Answer: b) Normally distributed
📝 Explanation:
About 68%, 95%, and 99.7% of values fall within 1, 2, and 3 standard deviations of the mean.
✅ Correct Answer: a) Variance of X or Y is zero
📝 Explanation:
Division by zero standard deviation makes r undefined.
✅ Correct Answer: b) 1.5 × IQR beyond Q1 and Q3
📝 Explanation:
Points outside are flagged as potential outliers.
✅ Correct Answer: b) Relative variability when units or means differ
📝 Explanation:
CV = (σ / μ) × 100% standardizes dispersion.
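A minimal Python sketch of the CV formula (the two small samples are illustrative, chosen so the spreads match while the means differ):

```python
import statistics

def coefficient_of_variation(values):
    """CV = (sigma / mu) * 100%, using the population standard deviation."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    return sigma / mu * 100

# Same absolute spread, different means: CV makes them comparable.
heights_cm = [160, 170, 180]
weights_kg = [60, 70, 80]
print(round(coefficient_of_variation(heights_cm), 2))
print(round(coefficient_of_variation(weights_kg), 2))
```

Both samples have the same standard deviation, yet the weights show the larger relative variability because their mean is smaller.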
✅ Correct Answer: a) (x – μ) / σ
📝 Explanation:
Measures deviations in standard deviation units.
✅ Correct Answer: b) 1 – 1/k²
📝 Explanation:
Applies to any distribution with finite variance.
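Chebyshev's bound is easy to tabulate; a short sketch (the k values are illustrative):

```python
def chebyshev_lower_bound(k):
    """At least 1 - 1/k^2 of any distribution with finite variance
    lies within k standard deviations of the mean."""
    return 1 - 1 / k**2

# For k = 2, at least 75% of values lie within 2 standard deviations,
# whatever the distribution's shape.
print(chebyshev_lower_bound(2))   # 0.75
print(chebyshev_lower_bound(3))
```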
✅ Correct Answer: b) Central Limit Theorem
📝 Explanation:
CLT justifies normality for large samples regardless of population shape.
✅ Correct Answer: b) Reject a true null
📝 Explanation:
False positive; probability equals α.
✅ Correct Answer: b) H₀ is true
📝 Explanation:
Small p-value casts doubt on the null.
✅ Correct Answer: b) ±1.96
📝 Explanation:
5% split equally in both tails.
✅ Correct Answer: b) Sample standard deviation s
📝 Explanation:
Leads to t-distribution with n–1 df.
✅ Correct Answer: a) 1 / √n
📝 Explanation:
Larger samples yield narrower intervals.
✅ Correct Answer: b) Type II error rate
📝 Explanation:
Probability of failing to detect a true effect.
✅ Correct Answer: a) Sample proportion p̂
📝 Explanation:
Wald interval: p̂ ± z√[p̂(1–p̂)/n].
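A sketch of the Wald interval from the formula above (the counts 40 successes out of 100 are illustrative):

```python
import math

def wald_interval(successes, n, z=1.96):
    """Wald CI for a proportion: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

lo, hi = wald_interval(40, 100)   # p_hat = 0.40
print(round(lo, 3), round(hi, 3))
```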
✅ Correct Answer: a) Row and column variables are associated
📝 Explanation:
Compares observed vs expected frequencies.
✅ Correct Answer: b) Three or more population means
📝 Explanation:
F-statistic = MSB / MSW.
✅ Correct Answer: a) Mean, variance, and autocovariance are time-invariant
📝 Explanation:
Strict stationarity requires constant distribution.
✅ Correct Answer: a) Correlation between y_t and y_{t–k}
📝 Explanation:
Helps identify MA order.
✅ Correct Answer: a) Stationarity
📝 Explanation:
For stationarity, the characteristic root must lie outside the unit circle (for AR(1), equivalently |φ| < 1).
✅ Correct Answer: a) Linear trend
📝 Explanation:
Δy_t = y_t – y_{t–1} eliminates a polynomial trend of order 1 (a linear trend).
✅ Correct Answer: a) Lag 1
📝 Explanation:
Partial correlation beyond lag 1 is zero.
✅ Correct Answer: a) Series has a unit root (non-stationary)
📝 Explanation:
The ADF null hypothesis is a unit root; rejecting H₀ implies stationarity.
✅ Correct Answer: a) Seasonal period
📝 Explanation:
Common values: 12 (monthly), 4 (quarterly).
✅ Correct Answer: a) Number of parameters
📝 Explanation:
Lower AIC indicates better balance of fit and complexity.
✅ Correct Answer: a) Zero for all lags > 0
📝 Explanation:
Uncorrelated errors.
✅ Correct Answer: a) Are roughly constant in size
📝 Explanation:
Multiplicative version for proportional seasonality.
✅ Correct Answer: b) 8
📝 Explanation:
Mean = 6; variance = [(−4)² + (−2)² + 0 + 2² + 4²]/5 = 40/5 = 8.
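Assuming the data {2, 4, 6, 8, 10} implied by the deviations above, Python's statistics module reproduces the population variance directly:

```python
import statistics

data = [2, 4, 6, 8, 10]
mu = statistics.mean(data)            # 6
pop_var = statistics.pvariance(data)  # divides by n = 5 (population variance)
print(mu, pop_var)
```

Note `statistics.variance` (sample variance, dividing by n − 1) would instead give 40/4 = 10.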
✅ Correct Answer: b) 3
📝 Explanation:
3 appears twice; others once.
✅ Correct Answer: c) Right-skewed
📝 Explanation:
Longer tail on the positive side.
✅ Correct Answer: b) 50th
📝 Explanation:
Half the data lie below the median.
✅ Correct Answer: a) Averaging rates of change
📝 Explanation:
Handles compounding/multiplicative processes.
✅ Correct Answer: a) Variance
📝 Explanation:
Cov(X,X) = Var(X).
✅ Correct Answer: b) Ranks of the data
📝 Explanation:
Non-parametric measure of monotonic relationship.
✅ Correct Answer: a) 50 ± 4.13
📝 Explanation:
Margin = 2.064 × (10/√25) = 2.064 × 2 ≈ 4.13.
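The margin can be checked in a few lines (the critical value t₀.₀₂₅,₂₄ = 2.064 is taken as given, as from a t-table):

```python
import math

# Given: sample mean 50, s = 10, n = 25, t critical value 2.064 for 24 df.
mean, s, n, t_crit = 50, 10, 25, 2.064
margin = t_crit * s / math.sqrt(n)
print(round(margin, 2))
print(mean - margin, mean + margin)
```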
✅ Correct Answer: b) Fail to reject H0
📝 Explanation:
The test statistic |z| = 2.5 is below the critical value 2.576 (α = 0.01, two-tailed), so we fail to reject H₀.
✅ Correct Answer: a) 1068
📝 Explanation:
n = (1.96² × 0.5 × 0.5) / 0.03² ≈ 1067.11 → 1068.
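The sample-size formula can be verified directly (z = 1.96 and p = 0.5 as in the explanation above):

```python
import math

def sample_size_for_proportion(margin, z=1.96, p=0.5):
    """n = z^2 * p * (1 - p) / margin^2, rounded up to the next integer."""
    return math.ceil(z**2 * p * (1 - p) / margin**2)

print(sample_size_for_proportion(0.03))
```

Using p = 0.5 maximizes p(1 − p), so this n is the conservative (worst-case) requirement.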
✅ Correct Answer: b) n – 1
📝 Explanation:
Based on differences (n pairs).
✅ Correct Answer: a) Two variances
📝 Explanation:
H0: σ₁² = σ₂².
✅ Correct Answer: a) Medians of two independent samples
📝 Explanation:
Non-parametric alternative to two-sample t-test.
✅ Correct Answer: a) One-way ANOVA
📝 Explanation:
Compares three or more independent groups.
✅ Correct Answer: a) No autocorrelation
📝 Explanation:
Range 0–4; 2 is ideal.
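A direct implementation of the Durbin-Watson statistic (the residual series are toy examples):

```python
def durbin_watson(residuals):
    """DW = sum((e_t - e_{t-1})^2) / sum(e_t^2); values near 2 suggest no autocorrelation."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Alternating residuals (negative autocorrelation) push DW toward 4;
# a smooth run of same-sign residuals (positive autocorrelation) pushes it toward 0.
print(round(durbin_watson([1, -1, 1, -1, 1, -1]), 2))
print(round(durbin_watson([1, 1, 1, -1, -1, -1]), 2))
```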
✅ Correct Answer: a) Recent observations
📝 Explanation:
Higher α reacts faster to changes.
✅ Correct Answer: a) Lags 1 and 2
📝 Explanation:
PACF cuts off after p.
✅ Correct Answer: a) ∇₁₂ y_t
📝 Explanation:
∇₁₂ y_t = y_t – y_{t–12}.
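Seasonal differencing in a few lines (a quarterly toy series, so period = 4 rather than 12):

```python
def seasonal_difference(series, period=12):
    """Seasonal difference: y_t - y_{t-period}."""
    return [series[t] - series[t - period] for t in range(period, len(series))]

# A pure period-4 seasonal pattern differences away to zeros:
quarterly = [10, 20, 30, 40] * 3   # three years of an identical quarterly cycle
print(seasonal_difference(quarterly, period=4))
```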
✅ Correct Answer: a) Lack of autocorrelation in residuals
📝 Explanation:
High p-value supports white noise.
✅ Correct Answer: a) Stationarity
📝 Explanation:
Complements ADF; fail to reject ⇒ stationary.
✅ Correct Answer: a) Integrated
📝 Explanation:
Order of differencing to achieve stationarity.
✅ Correct Answer: a) Last observed value
📝 Explanation:
y_{t+1} = y_t + ε_{t+1} ⇒ best forecast = y_t.
✅ Correct Answer: a) Y beyond its own past
📝 Explanation:
Rejects non-causality if X lags are significant in Y equation.
✅ Correct Answer: a) Severe multicollinearity
📝 Explanation:
VIF_j = 1 / (1 – R_j²).
✅ Correct Answer: a) Proportion of variance explained
📝 Explanation:
R² = SSR / SST, the regression sum of squares divided by the total sum of squares.
✅ Correct Answer: a) Cov(X,Y) / Var(X)
📝 Explanation:
Minimizes sum of squared residuals.
✅ Correct Answer: a) σ, the error standard deviation
📝 Explanation:
√(SSE / (n–2)).
✅ Correct Answer: a) Logit
📝 Explanation:
log(p/(1–p)) = β0 + β1x.
✅ Correct Answer: a) 50% increase in odds per unit increase in x
📝 Explanation:
Odds multiply by exp(β).
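A numeric illustration of the odds-ratio reading (the base odds of 0.25 are hypothetical):

```python
import math

beta = math.log(1.5)            # a coefficient whose odds ratio is exactly 1.5
odds_ratio = math.exp(beta)     # exp(beta) multiplies the odds per unit increase in x

base_odds = 0.25                # hypothetical odds at some value of x
odds_after_one_unit = base_odds * odds_ratio
print(round(odds_ratio, 2), round(odds_after_one_unit, 4))
```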
✅ Correct Answer: a) Count data
📝 Explanation:
Mean = variance.
✅ Correct Answer: a) Survival function non-parametrically
📝 Explanation:
Product-limit estimator handles censoring.
✅ Correct Answer: a) Hazard ratios change over time
📝 Explanation:
Check via time-dependent covariates or plots.
✅ Correct Answer: a) Prior × Likelihood → Posterior
📝 Explanation:
Bayes’ theorem: P(θ|data) ∝ P(data|θ) P(θ).
✅ Correct Answer: a) Normal
📝 Explanation:
Posterior remains normal.
✅ Correct Answer: a) Sample from complex posterior distributions
📝 Explanation:
Markov Chain Monte Carlo approximates posteriors.
✅ Correct Answer: a) Resampling with replacement from the data
📝 Explanation:
Non-parametric; empirical distribution.
✅ Correct Answer: a) 2.5th and 97.5th percentiles of bootstrap statistics
📝 Explanation:
For 95% CI from B=1000 replicates.
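A percentile-bootstrap sketch (the data and seed are illustrative, and the percentile indexing is a simplification):

```python
import random
import statistics

def bootstrap_percentile_ci(data, stat=statistics.mean, n_boot=1000, seed=0):
    """95% percentile CI: 2.5th and 97.5th percentiles of the bootstrap statistics."""
    rng = random.Random(seed)
    boots = sorted(stat(rng.choices(data, k=len(data))) for _ in range(n_boot))
    lo = boots[int(0.025 * n_boot)]
    hi = boots[int(0.975 * n_boot) - 1]
    return lo, hi

data = [12, 15, 14, 10, 18, 20, 11, 16, 13, 17]
lo, hi = bootstrap_percentile_ci(data)
print(round(lo, 2), round(hi, 2))
```

Because the statistic here is a mean of resampled values, every bootstrap replicate, and hence the interval, lies within the range of the original data.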
✅ Correct Answer: a) Model generalization error
📝 Explanation:
K-fold CV estimates out-of-sample performance.
✅ Correct Answer: a) Overfit (high variance, low bias)
📝 Explanation:
Capture noise, not just signal.
✅ Correct Answer: a) Variance along projected directions
📝 Explanation:
Eigenvectors of covariance matrix.
✅ Correct Answer: a) Elbow in eigenvalue decline
📝 Explanation:
Retain components before sharp drop.
✅ Correct Answer: a) Within-cluster sum of squares
📝 Explanation:
Iterative assignment and centroid update.
✅ Correct Answer: a) –1 to +1
📝 Explanation:
Higher values indicate better cluster separation.
✅ Correct Answer: a) Chi-square test validity
📝 Explanation:
Approximation to chi-square distribution.
✅ Correct Answer: a) Paired binary data
📝 Explanation:
Tests marginal homogeneity.
✅ Correct Answer: a) Independent and identically distributed
📝 Explanation:
For large n, sample mean ≈ normal.
✅ Correct Answer: a) Welch-Satterthwaite formula
📝 Explanation:
Conservative; avoids equal variance assumption.
✅ Correct Answer: a) Large
📝 Explanation:
Benchmarks: 0.2 small, 0.5 medium, 0.8 large.
✅ Correct Answer: a) Adding irrelevant predictors
📝 Explanation:
Use adjusted R² to penalize extra terms.
✅ Correct Answer: a) Constant variance across predicted values
📝 Explanation:
Breusch-Pagan test checks this.
✅ Correct Answer: a) Strong positive autocorrelation
📝 Explanation:
Values near 0 = positive; near 4 = negative.
✅ Correct Answer: a) Irregular (random) component
📝 Explanation:
Should resemble white noise if model is good.
✅ Correct Answer: a) Seasonal-Trend decomposition using LOESS
📝 Explanation:
Robust, flexible for varying seasonal patterns.
✅ Correct Answer: a) Multiples of 12
📝 Explanation:
Indicates need for seasonal AR/MA terms.
✅ Correct Answer: a) Variance (heteroscedasticity)
📝 Explanation:
y(λ) = (y^λ – 1)/λ or log(y) for λ=0.
✅ Correct Answer: a) MA(1) structure with negative coefficient
📝 Explanation:
ACF shows spike at lag 1, then near zero.
✅ Correct Answer: a) No significant ACF/PACF spikes (white noise)
📝 Explanation:
Ljung-Box p > 0.05 supports adequacy.
✅ Correct Answer: a) Its own lags and lags of all other variables
📝 Explanation:
Vector autoregression for multivariate time series.
✅ Correct Answer: a) Future values of all variables
📝 Explanation:
Shows dynamic responses in VAR.
✅ Correct Answer: a) A stationary linear combination
📝 Explanation:
Long-run equilibrium relationship.
✅ Correct Answer: a) Number of cointegrating relationships
📝 Explanation:
Trace and max-eigenvalue statistics.
✅ Correct Answer: a) Time-varying volatility (heteroscedasticity)
📝 Explanation:
Variance depends on past squared errors.
✅ Correct Answer: a) σ²_t = α₀ + α₁ ε²_{t–1} + β₁ σ²_{t–1}
📝 Explanation:
Combines ARCH and persistence.
✅ Correct Answer: a) Normality of residuals (skewness + kurtosis)
📝 Explanation:
JB statistic ~ χ²(2).
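The JB statistic computed from its definition (sample skewness and kurtosis use the population standard deviation; the data are illustrative):

```python
import statistics

def jarque_bera(x):
    """JB = n/6 * (S^2 + (K - 3)^2 / 4), with skewness S and kurtosis K."""
    n = len(x)
    mu = statistics.mean(x)
    sd = statistics.pstdev(x)
    skew = sum((v - mu) ** 3 for v in x) / (n * sd ** 3)
    kurt = sum((v - mu) ** 4 for v in x) / (n * sd ** 4)
    return n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)

# A symmetric sample has skewness 0, so its JB reflects only excess kurtosis.
print(round(jarque_bera([1, 2, 3, 4, 5]), 3))
```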
✅ Correct Answer: a) Data come from a normal distribution
📝 Explanation:
Powerful for small samples.
✅ Correct Answer: a) Equality of variances across groups
📝 Explanation:
Robust to non-normality.
✅ Correct Answer: a) Dividing by number of tests
📝 Explanation:
Controls family-wise error rate conservatively.
✅ Correct Answer: a) Step-down multiple comparison procedure
📝 Explanation:
Less conservative than Bonferroni.
✅ Correct Answer: a) All pairwise means
📝 Explanation:
Honest Significant Difference; assumes equal variances.
✅ Correct Answer: a) Multiple treatments vs a single control
📝 Explanation:
Fewer comparisons, higher power.
✅ Correct Answer: a) Sample quantiles vs theoretical normal quantiles
📝 Explanation:
Straight line indicates normality.
✅ Correct Answer: a) Distance from mean of X (hat diagonal)
📝 Explanation:
h_ii > 2p/n flags potential leverage.
✅ Correct Answer: a) Influence of an observation on all fitted values
📝 Explanation:
Large values (> 4/n) indicate influential points.
✅ Correct Answer: a) Nested models (reduced vs full)
📝 Explanation:
Tests significance of added predictors.
✅ Correct Answer: a) λ Σ β_j² (L2)
📝 Explanation:
Shrinks coefficients, handles multicollinearity.
✅ Correct Answer: a) Sets some coefficients exactly to zero
📝 Explanation:
L1 penalty promotes sparsity.
✅ Correct Answer: a) Ridge and Lasso penalties
📝 Explanation:
Useful when p > n or high correlation.