under_over_estimation
timecave.validation_strategy_metrics.under_over_estimation(estimated_error_list, test_error_list, metric)
Compute separate validation strategy metrics for underestimation and overestimation instances (for N different experiments).
This function processes the results of a Monte Carlo experiment and outputs two separate
sets of summary statistics: one for the cases where the validation method underestimates the true error,
and another for the cases where it overestimates it.
This can be useful for analysing the performance of a given validation method across several different time series or models.
Users may provide a custom metric if they so desire, but it must have the same function signature as the metrics provided by this package.
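For example, a custom metric could look like the sketch below. This is only an illustration: it assumes the expected signature is `metric(estimated_error, test_error) -> float`, consistent with how `PAE` is used in the examples further down, and `signed_relative_error` is a hypothetical name.

```python
from timecave.validation_strategy_metrics import under_over_estimation

def signed_relative_error(estimated_error: float, test_error: float) -> float:
    """Hypothetical custom metric: signed deviation of the validation error
    from the test error, relative to the test error (assumed signature)."""
    return (estimated_error - test_error) / test_error

true_errors = [10, 30, 10, 50]
validation_errors = [20, 20, 50, 30]

# A custom metric is passed in exactly like the built-in metrics.
under_stats, over_stats = under_over_estimation(validation_errors, true_errors, signed_relative_error)
```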
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `estimated_error_list` | `list[float \| int]` | List of estimated (i.e. validation) errors, one for each experiment / trial. | *required* |
| `test_error_list` | `list[float \| int]` | List of test errors, one for each experiment / trial. | *required* |
| `metric` | `callable` | Validation strategy metric. | *required* |
Returns:
| Type | Description |
|---|---|
| `tuple[dict]` | Separate statistical summaries for the underestimation and overestimation cases. The first dictionary covers the underestimation cases. |
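The two dictionaries can be unpacked directly; a brief illustrative snippet follows (the keys such as `'Mean'`, `'N'` and `'%'` are taken from the example output below).

```python
from timecave.validation_strategy_metrics import under_over_estimation, PAE

true_errors = [10, 30, 10, 50]
validation_errors = [20, 20, 50, 30]

# First dictionary: underestimation cases; second dictionary: overestimation cases.
under_stats, over_stats = under_over_estimation(validation_errors, true_errors, PAE)

# Either dictionary may be empty if no run falls in the corresponding category.
if under_stats:
    print(f"Underestimated in {under_stats['N']} runs ({under_stats['%']}%), mean: {under_stats['Mean']}")
if over_stats:
    print(f"Overestimated in {over_stats['N']} runs ({over_stats['%']}%), mean: {over_stats['Mean']}")
```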
Raises:
| Type | Description |
|---|---|
| `ValueError` | If the estimated error list and the test error list differ in length. |
See also
`MC_metric`: Computes relevant statistics for the whole Monte Carlo experiment (i.e. does not differentiate between overestimation and underestimation).
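For an overall summary that mixes both kinds of cases, `MC_metric` can be used instead. The snippet below assumes it takes the same arguments in the same order as `under_over_estimation`; check its own documentation to confirm.

```python
from timecave.validation_strategy_metrics import MC_metric, under_over_estimation, PAE

true_errors = [10, 30, 10, 50]
validation_errors = [20, 20, 50, 30]

# Overall statistics over all runs (assumed signature).
overall_stats = MC_metric(validation_errors, true_errors, PAE)

# Statistics split into underestimation and overestimation cases.
under_stats, over_stats = under_over_estimation(validation_errors, true_errors, PAE)
```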
Examples:
>>> from timecave.validation_strategy_metrics import under_over_estimation, PAE
>>> true_errors = [10, 30, 10, 50]
>>> validation_errors = [20, 20, 50, 30]
>>> under_over_estimation(validation_errors, true_errors, PAE)
({'Mean': -15.0, 'Median': -15.0, '1st_Quartile': -17.5, '3rd_Quartile': -12.5, 'Minimum': -20.0, 'Maximum': -10.0, 'Standard_deviation': 5.0, 'N': 2, '%': 50.0}, {'Mean': 25.0, 'Median': 25.0, '1st_Quartile': 17.5, '3rd_Quartile': 32.5, 'Minimum': 10.0, 'Maximum': 40.0, 'Standard_deviation': 15.0, 'N': 2, '%': 50.0})
If there are no overestimation or underestimation cases, the respective dictionary will be empty:
>>> under_over_estimation([10, 20, 30], [5, 10, 15], PAE)
No errors were underestimated. Underestimation data dictionary empty.
({}, {'Mean': 10.0, 'Median': 10.0, '1st_Quartile': 7.5, '3rd_Quartile': 12.5, 'Minimum': 5.0, 'Maximum': 15.0, 'Standard_deviation': 4.08248290463863, 'N': 3, '%': 100.0})
If the lengths of the estimated error and test error lists do not match, an exception is raised:
>>> under_over_estimation(validation_errors, [10], PAE)
Traceback (most recent call last):
...
ValueError: The estimated error and test error lists must have the same length.
Source code in timecave/validation_strategy_metrics.py