Statistics
- Interpret p-values and statistical significance
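A p-value is the probability of observing results at least as extreme as those actually observed, assuming the null hypothesis is true. It is not the probability that the null hypothesis is true. When the p-value falls below the chosen significance level (commonly 0.05), the result is called statistically significant and the null hypothesis is rejected: the data would be unlikely under the null, though chance is not ruled out.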
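As a rough illustration of this definition, the sketch below computes an exact two-sided p-value for a coin-flip experiment. The function name, the 60-heads-in-100-flips scenario, and the fair-coin null hypothesis are all assumptions made for this example, not part of the notes.

```python
from math import comb

def binomial_two_sided_p_value(heads, flips, p_null=0.5):
    """Exact two-sided p-value for a binomial outcome (hypothetical helper).

    Sums the probability, under the null, of every outcome that is
    at least as unlikely as the one actually observed.
    """
    def pmf(k):
        return comb(flips, k) * p_null**k * (1 - p_null) ** (flips - k)

    observed = pmf(heads)
    # Small tolerance so outcomes tied with the observed one are included.
    return sum(pmf(k) for k in range(flips + 1) if pmf(k) <= observed * (1 + 1e-9))

alpha = 0.05  # significance level, chosen before looking at the data
p_value = binomial_two_sided_p_value(heads=60, flips=100)

print(f"p-value = {p_value:.4f}")
print("reject the null" if p_value < alpha else "fail to reject the null")
```

In this made-up scenario the p-value lands slightly above 0.05, so the fair-coin hypothesis is not rejected at that level, even though 60 heads may look lopsided.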
- Recommend evaluation metrics for an imbalanced binary classification problem
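Use metrics such as precision, recall, and F1-score, which focus on the minority (positive) class; accuracy alone is misleading under class imbalance. A precision-recall curve is usually more informative than a ROC curve when positives are rare. Oversampling the minority class is a training technique rather than a metric; if used, apply it only to the training data so that evaluation reflects the real class distribution.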
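To see why accuracy misleads here, consider a minimal sketch with made-up labels: a 1% positive class and a trivial baseline that always predicts the majority class. The data and variable names are illustrative assumptions.

```python
# Hypothetical dataset: 990 negatives and 10 positives (1% positive class).
y_true = [0] * 990 + [1] * 10

# Trivial baseline that always predicts the majority (negative) class.
y_pred = [0] * len(y_true)

# Accuracy: fraction of all predictions that match the true label.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Recall: fraction of actual positives that were predicted positive.
true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
recall = true_positives / sum(y_true)

print(f"accuracy = {accuracy:.2f}, recall = {recall:.2f}")
```

The baseline scores high on accuracy while finding none of the positives, which is the failure that recall-oriented metrics expose.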
- What are Gaussian distributions?
A Gaussian (normal) distribution is a continuous probability distribution fully characterized by two parameters: the mean and the standard deviation. It is ubiquitous in statistics and has a symmetric bell-shaped curve centered on the mean. About 68%, 95%, and 99.7% of values lie within one, two, and three standard deviations of the mean, so data points far out in the tails can be flagged as anomalies.
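One common way to apply this is a z-score rule: estimate the mean and standard deviation from reference data, then flag new points that lie more than a chosen number of standard deviations away. The sketch below assumes the reference data is roughly Gaussian; the function name, sample values, and threshold of 3 are illustrative choices.

```python
from statistics import mean, stdev

def find_anomalies(reference, candidates, threshold=3.0):
    """Return candidates whose z-score magnitude exceeds the threshold.

    Hypothetical helper: assumes `reference` is approximately Gaussian.
    """
    mu = mean(reference)
    sigma = stdev(reference)  # sample standard deviation
    return [x for x in candidates if abs(x - mu) / sigma > threshold]

# Made-up sensor readings clustered around 10.0.
reference = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 9.9]
new_points = [10.05, 9.6, 12.5]

anomalies = find_anomalies(reference, new_points)
print(anomalies)
```

The parameters are estimated from separate reference data rather than from the points being tested, because in a small sample an extreme value inflates the standard deviation and can hide itself.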
- Define precision, recall and F1-score
Precision is the positive predictive value: the fraction of predicted positives that are actually positive, TP / (TP + FP). Recall is the true positive rate, or sensitivity: the fraction of actual positives that the model identifies, TP / (TP + FN). The F1-score is the harmonic mean of precision and recall. Together these metrics give a fuller picture of model performance than accuracy alone, especially for imbalanced classes.
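These definitions can be sketched directly from the confusion counts. The function name and the toy labels are assumptions for illustration, and returning 0.0 when a denominator is zero is one convention among several.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for binary labels (1 = positive)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))

    # Convention assumed here: an undefined ratio is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1

# Toy example: 4 actual positives, 3 predicted positives, 2 of them correct.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision = {precision:.3f}, recall = {recall:.3f}, F1 = {f1:.3f}")
```

Here TP = 2, FP = 1, and FN = 2, so precision is 2/3, recall is 1/2, and F1 is 4/7. The harmonic mean sits closer to the lower of the two values.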