evaluation
Evaluation utilities for probabilistic predictions.
Provides diagnostic tools for assessing the calibration and discrimination of predicted distributions, including PIT-based calibration for regression and the concordance index for survival analysis.
Plot functions require matplotlib (optional dependency).
pit_values
pit_values(
pred_dist: Distribution, y: NDArray[floating]
) -> NDArray[floating]
Probability Integral Transform values.
For a well-calibrated model, PIT values F(y) should follow
a Uniform(0, 1) distribution.
| PARAMETER | DESCRIPTION | TYPE |
|---|---|---|
| `pred_dist` | Predicted distribution instance (one per sample). | `Distribution` |
| `y` | Observed target values, shape | `NDArray[floating]` |

| RETURNS | DESCRIPTION |
|---|---|
| `NDArray[floating]` | PIT values |
Source code in ngboost_lightning/evaluation.py
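The PIT idea can be illustrated without the library. The following is a minimal sketch, assuming each predicted distribution is a Normal represented by the standard library's `statistics.NormalDist`; the helper name `pit_values_sketch` and the sample numbers are hypothetical and are not part of `ngboost_lightning`.

```python
from statistics import NormalDist


def pit_values_sketch(dists, y):
    """Evaluate each sample's predicted CDF at its own observed value.

    Assumption: `dists` is a list of NormalDist objects standing in for
    the library's Distribution (one predicted distribution per sample).
    """
    return [d.cdf(obs) for d, obs in zip(dists, y)]


# Hypothetical predictions for three samples and their observed targets.
dists = [NormalDist(0.0, 1.0), NormalDist(1.0, 2.0), NormalDist(-2.0, 0.5)]
y = [0.0, 3.0, -2.5]

pit = pit_values_sketch(dists, y)
# An observation at the predicted median maps to 0.5; observations in the
# upper or lower tail map to values near 1 or 0 respectively.
print(pit)
```

If the predictions are well calibrated, values computed this way over many samples should look roughly uniform on (0, 1).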
calibration_regression
calibration_regression(
pred_dist: Distribution,
y: NDArray[floating],
bins: int = 11,
) -> tuple[NDArray[floating], NDArray[floating]]
PIT-based calibration curve for regression.
For each quantile level q in linspace(0, 1, bins), computes
the fraction of observations falling below the predicted q-th
quantile. A perfectly calibrated model has
observed_fractions == expected_quantiles.
| PARAMETER | DESCRIPTION | TYPE |
|---|---|---|
| `pred_dist` | Predicted distribution instance (one per sample). | `Distribution` |
| `y` | Observed target values, shape | `NDArray[floating]` |
| `bins` | Number of equally spaced quantile levels (including 0 and 1). | `int` |

| RETURNS | DESCRIPTION |
|---|---|
| `NDArray[floating]` | Tuple |
| `NDArray[floating]` | |
Source code in ngboost_lightning/evaluation.py
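The calibration curve described above can be sketched from PIT values, because an observation falls below its predicted q-th quantile exactly when its PIT value is at most q. This is an illustrative sketch only: `calibration_curve_sketch` is a hypothetical name, `statistics.NormalDist` stands in for the library's `Distribution`, and the use of `<=` rather than `<` at the boundary is an assumption.

```python
from statistics import NormalDist


def calibration_curve_sketch(dists, y, bins=11):
    """Return (expected quantile levels, observed fractions).

    Assumption: `dists` holds one NormalDist per sample; requires bins >= 2.
    """
    # Equally spaced levels from 0 to 1 inclusive, like linspace(0, 1, bins).
    expected = [k / (bins - 1) for k in range(bins)]
    pit = [d.cdf(obs) for d, obs in zip(dists, y)]
    # Fraction of observations at or below each predicted quantile.
    observed = [sum(p <= q for p in pit) / len(pit) for q in expected]
    return expected, observed


# Four samples whose observations sit at the 12.5%, 37.5%, 62.5% and 87.5%
# quantiles of a standard Normal prediction: an evenly spread toy case.
std_normal = NormalDist(0.0, 1.0)
dists = [std_normal] * 4
y = [std_normal.inv_cdf(q) for q in (0.125, 0.375, 0.625, 0.875)]

expected, observed = calibration_curve_sketch(dists, y, bins=5)
print(expected)
print(observed)
```

In this toy case the observed fractions match the expected levels, which is the diagonal a perfectly calibrated model would trace.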
calibration_error
Mean squared calibration error.
| PARAMETER | DESCRIPTION | TYPE |
|---|---|---|
| `expected` | Expected quantile levels, shape | |
| `observed` | Observed fractions, shape | |

| RETURNS | DESCRIPTION |
|---|---|
| `float` | Mean squared error between expected and observed. |
Source code in ngboost_lightning/evaluation.py
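As a small worked illustration, the mean squared error between expected and observed levels can be computed as below. The function name `calibration_error_sketch` and the input values are hypothetical; this is a sketch of the formula, not the library's code.

```python
def calibration_error_sketch(expected, observed):
    """Mean of squared differences between expected and observed levels."""
    if len(expected) != len(observed):
        raise ValueError("expected and observed must have the same length")
    return sum((e - o) ** 2 for e, o in zip(expected, observed)) / len(expected)


# A perfectly calibrated curve lies on the diagonal, so the error is zero.
perfect = calibration_error_sketch([0.0, 0.5, 1.0], [0.0, 0.5, 1.0])

# A flat observed curve is off by 0.5 at both ends: (0.25 + 0.25) / 2.
flat = calibration_error_sketch([0.0, 1.0], [0.5, 0.5])
print(perfect, flat)
```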
concordance_index
concordance_index(
predicted_times: NDArray[floating],
event_times: NDArray[floating],
event_observed: NDArray[bool_],
) -> float
Harrell's concordance index (C-statistic).
Measures the fraction of comparable pairs where the predicted
survival time correctly orders the observed event times.
A pair (i, j) is comparable if the earlier event is uncensored.
| PARAMETER | DESCRIPTION | TYPE |
|---|---|---|
| `predicted_times` | Predicted survival times (e.g. median), shape | `NDArray[floating]` |
| `event_times` | Observed times, shape | `NDArray[floating]` |
| `event_observed` | Boolean event indicator ( | `NDArray[bool_]` |

| RETURNS | DESCRIPTION |
|---|---|
| `float` | C-index in |
| `float` | concordance. |
Source code in ngboost_lightning/evaluation.py
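The pair-counting definition above can be written as a direct O(n²) loop. The sketch below is illustrative: `concordance_index_sketch` is a hypothetical name, and two tie rules are assumptions that the description does not specify (pairs with equal observed times are skipped, and tied predictions count as half-concordant).

```python
def concordance_index_sketch(predicted_times, event_times, event_observed):
    """Harrell-style C-index by brute-force pair counting."""
    concordant = 0.0
    comparable = 0
    n = len(event_times)
    for i in range(n):
        if not event_observed[i]:
            continue  # the earlier member of a pair must be uncensored
        for j in range(n):
            if event_times[i] < event_times[j]:
                comparable += 1
                if predicted_times[i] < predicted_times[j]:
                    concordant += 1.0  # predicted order matches observed order
                elif predicted_times[i] == predicted_times[j]:
                    concordant += 0.5  # assumption: tied predictions get half credit
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


times = [2.0, 4.0, 6.0, 8.0]
observed = [True, True, True, True]

c_perfect = concordance_index_sketch([1.0, 2.0, 3.0, 4.0], times, observed)
c_reversed = concordance_index_sketch([4.0, 3.0, 2.0, 1.0], times, observed)
print(c_perfect, c_reversed)
```

A censored sample can still appear as the later member of a pair, because its true event time is known to exceed the earlier, observed one.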
calibration_survival
calibration_survival(
pred_dist: Distribution,
event_times: NDArray[floating],
event_observed: NDArray[bool_],
bins: int = 10,
) -> tuple[NDArray[floating], NDArray[floating]]
D-calibration for survival models.
Bins samples by their predicted survival probability at the observed time, then compares the predicted probability against the observed event rate within each bin.
| PARAMETER | DESCRIPTION | TYPE |
|---|---|---|
| `pred_dist` | Predicted distribution instance (one per sample). | `Distribution` |
| `event_times` | Observed times, shape | `NDArray[floating]` |
| `event_observed` | Boolean event indicator ( | `NDArray[bool_]` |
| `bins` | Number of bins. | `int` |

| RETURNS | DESCRIPTION |
|---|---|
| `NDArray[floating]` | Tuple |
| `NDArray[floating]` | |
Source code in ngboost_lightning/evaluation.py
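One literal reading of the binning step described above is sketched here. Everything in it is an assumption rather than the library's behavior: the name `calibration_survival_sketch`, the use of `statistics.NormalDist` in place of `Distribution`, equal-width bins on [0, 1], dropping empty bins, and returning the per-bin mean predicted survival probability next to the per-bin fraction of observed events. The actual function may treat censoring and bin edges differently.

```python
from statistics import NormalDist


def calibration_survival_sketch(dists, event_times, event_observed, bins=10):
    """Group samples by S_i(t_i) = 1 - F_i(t_i) and summarize each bin."""
    sums = [0.0] * bins    # total predicted survival probability per bin
    events = [0] * bins    # number of uncensored samples per bin
    counts = [0] * bins    # number of samples per bin
    for d, t, e in zip(dists, event_times, event_observed):
        surv = 1.0 - d.cdf(t)
        k = min(int(surv * bins), bins - 1)  # equal-width bin index on [0, 1]
        sums[k] += surv
        events[k] += 1 if e else 0
        counts[k] += 1
    predicted, observed = [], []
    for k in range(bins):
        if counts[k]:  # assumption: empty bins are dropped
            predicted.append(sums[k] / counts[k])
            observed.append(events[k] / counts[k])
    return predicted, observed


# Two samples observed at the predicted median, one event and one censored.
dists = [NormalDist(0.0, 1.0), NormalDist(0.0, 1.0)]
predicted, observed = calibration_survival_sketch(
    dists, [0.0, 0.0], [True, False], bins=2
)
print(predicted, observed)
```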
plot_pit_histogram
plot_pit_histogram(
pred_dist: Distribution,
y: NDArray[floating],
bins: int = 20,
ax: Any = None,
) -> Any
Histogram of PIT values with uniform reference line.
Requires matplotlib.
| PARAMETER | DESCRIPTION | TYPE |
|---|---|---|
| `pred_dist` | Predicted distribution instance (one per sample). | `Distribution` |
| `y` | Observed target values, shape | `NDArray[floating]` |
| `bins` | Number of histogram bins. | `int` |
| `ax` | Matplotlib axes to plot on. If | `Any` |

| RETURNS | DESCRIPTION |
|---|---|
| `Any` | Matplotlib |
Source code in ngboost_lightning/evaluation.py
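The quantities behind such a histogram can be computed without matplotlib. The sketch below is a plotting-free illustration: `pit_histogram_sketch` is a hypothetical helper, and normalizing bar heights to a density so that the uniform reference sits at 1.0 is an assumption about how the reference line is drawn.

```python
def pit_histogram_sketch(pit, bins=20):
    """Return per-bin counts and densities for PIT values in [0, 1]."""
    counts = [0] * bins
    for p in pit:
        k = min(int(p * bins), bins - 1)  # a PIT value of exactly 1.0 goes in the last bin
        counts[k] += 1
    n = len(pit)
    # Density scaling: a perfectly uniform sample gives 1.0 in every bin.
    density = [c * bins / n for c in counts]
    return counts, density


# Hypothetical PIT values: two near the lower end, two at the upper end.
counts, density = pit_histogram_sketch([0.05, 0.15, 0.95, 1.0], bins=10)
uniform_reference = 1.0
print(counts)
print(density)
```

Bars well above the reference at both ends, as in this toy input, suggest predicted distributions that are too narrow; a hump in the middle suggests they are too wide.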
plot_calibration_curve
plot_calibration_curve(
pred_dist: Distribution,
y: NDArray[floating],
bins: int = 11,
ax: Any = None,
) -> Any
Calibration curve: expected vs observed quantile fractions.
Requires matplotlib.
| PARAMETER | DESCRIPTION | TYPE |
|---|---|---|
| `pred_dist` | Predicted distribution instance (one per sample). | `Distribution` |
| `y` | Observed target values, shape | `NDArray[floating]` |
| `bins` | Number of quantile levels. | `int` |
| `ax` | Matplotlib axes to plot on. If | `Any` |

| RETURNS | DESCRIPTION |
|---|---|
| `Any` | Matplotlib |