hv Block Cross-validation method
timecave.validation_methods.CV.hvBlockCV(ts, fs=1, h=0, v=0)
Bases: BaseSplitter
Implements the hv Block Cross-validation method.
This class implements the hv Block Cross-validation method. It is similar to the BlockCV class, but it does not support weight generation. To obtain a weighted version of this method, users must therefore write their own derived class or compute the weights separately.
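Since the class yields unweighted splits, a weighted variant can be mimicked by aggregating the per-fold validation errors manually. A minimal sketch (the error values and weights below are made up for illustration):

```python
import numpy as np

# Hypothetical per-fold validation errors (e.g. MSE on each validation set)
errors = np.array([0.12, 0.10, 0.15, 0.11])

# User-chosen weights, e.g. emphasising the central folds
weights = np.array([1.0, 2.0, 2.0, 1.0])
weights = weights / weights.sum()  # normalise so the weights sum to one

# Weighted estimate of the model's true error
estimate = float(np.dot(weights, errors))
```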
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `ts` | `ndarray` \| `Series` | Univariate time series. | required |
| `fs` | `float` \| `int` | Sampling frequency (Hz). | `1` |
| `h` | `int` | Controls the number of samples removed from the training set: the \(h\) samples immediately preceding and following the validation set are discarded. | `0` |
| `v` | `int` | Controls the size of the validation set. \(2v + 1\) samples will be used for validation. | `0` |
Attributes:
| Name | Type | Description |
|---|---|---|
| `n_splits` | `int` | The number of splits. |
| `sampling_freq` | `int` \| `float` | The series' sampling frequency (Hz). |
Methods:
| Name | Description |
|---|---|
| `split` | Split the time series into training and validation sets. |
| `info` | Provide additional information on the validation method. |
| `statistics` | Compute relevant statistics for both training and validation sets. |
| `plot` | Plot the partitioned time series. |
Raises:
| Type | Description |
|---|---|
| `TypeError` | If either `h` or `v` is not an integer. |
| `ValueError` | If either `h` or `v` is negative. |
| `ValueError` | If the sum of `h` and `v` is too large for the series' length (no samples would be left for training). |
Warning
Being a variant of the leave-one-out CV procedure, this method is computationally intensive.
See also
Block CV: The original Block CV method, which partitions the series into equally sized folds. No training samples are removed.
Notes
The hv Block Cross-validation method is essentially a leave-one-out version of the BlockCV method. There are, however, two nuances: the first one is that the \(h\) samples immediately following and preceding the validation set are removed from the training set; the second one is that more than one sample can be used for validation. More specifically, the validation set comprises \(2v + 1\) samples. Note that, if \(h = v = 0\), the method boils down to the classic leave-one-out cross-validation procedure. The average error on the validation sets is taken as the estimate of the model's true error. This method does not preserve the temporal order of the observations.
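The index arithmetic described above can be sketched as follows. This is an illustrative reimplementation, not timecave's source; `hv_block_indices` is a hypothetical helper whose output matches the `split()` example further down.

```python
import numpy as np

def hv_block_indices(n: int, h: int = 0, v: int = 0):
    """Yield (train, val) index arrays for hv-block CV on a series of length n."""
    idx = np.arange(n)
    for i in range(n):
        # Validation set: 2v + 1 samples centred on i (fewer at the series' edges)
        val = idx[max(0, i - v): i + v + 1]
        # Remove the h samples on each side of the validation set from training
        lo, hi = max(0, i - v - h), i + v + h + 1
        train = np.concatenate((idx[:lo], idx[hi:]))
        yield train, val

# With h = v = 0, this reduces to classic leave-one-out CV
splits = list(hv_block_indices(10, h=2, v=1))
```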
The method was first proposed by Racine [1].
References
[1] Jeff Racine. Consistent cross-validatory model-selection for dependent data: hv-block cross-validation. *Journal of Econometrics*, 99(1):39–61, 2000.
Source code in timecave/validation_methods/CV.py
info()
Provide some basic information on the training and validation sets.
This method displays the number of splits, the values of the h and v
parameters, and the maximum and minimum sizes of both the training and validation sets.
Examples:
>>> import numpy as np
>>> from timecave.validation_methods.CV import hvBlockCV
>>> ts = np.ones(10);
>>> splitter = hvBlockCV(ts, h=2, v=2);
>>> splitter.info();
hv-Block CV method
------------------
Time series size: 10 samples
Number of splits: 10
Minimum training set size: 1 samples (10.0 %)
Maximum training set size: 5 samples (50.0 %)
Minimum validation set size: 3 samples (30.0 %)
Maximum validation set size: 5 samples (50.0 %)
h: 2
v: 2
Source code in timecave/validation_methods/CV.py
plot(height, width)
Plot the partitioned time series.
This method allows the user to plot the partitioned time series. The training and validation sets are plotted using different colours.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `height` | `int` | The figure's height. | required |
| `width` | `int` | The figure's width. | required |
Examples:
>>> import numpy as np
>>> from timecave.validation_methods.CV import hvBlockCV
>>> ts = np.ones(6);
>>> splitter = hvBlockCV(ts, h=1, v=1);
>>> splitter.plot(10, 10);

Source code in timecave/validation_methods/CV.py
split()
Split the time series into training and validation sets.
This method splits the series' indices into disjoint sets containing the training and validation indices.
At every iteration, an array of training indices and another one containing the validation indices are generated.
Note that this method is a generator. To access the indices, use the next() method or a for loop.
Yields:
| Type | Description |
|---|---|
| `ndarray` | Array of training indices. |
| `ndarray` | Array of validation indices. |
| `float` | Weight assigned to the error estimate. |
Examples:
>>> import numpy as np
>>> from timecave.validation_methods.CV import hvBlockCV
>>> ts = np.ones(10);
>>> splitter = hvBlockCV(ts, h=2, v=1); # Use 3 samples for validation; remove 2-4 samples from the training set
>>> for ind, (train, val, _) in enumerate(splitter.split()):
...
... print(f"Iteration {ind+1}");
... print(f"Training set indices: {train}");
... print(f"Validation set indices: {val}");
Iteration 1
Training set indices: [4 5 6 7 8 9]
Validation set indices: [0 1]
Iteration 2
Training set indices: [5 6 7 8 9]
Validation set indices: [0 1 2]
Iteration 3
Training set indices: [6 7 8 9]
Validation set indices: [1 2 3]
Iteration 4
Training set indices: [7 8 9]
Validation set indices: [2 3 4]
Iteration 5
Training set indices: [0 8 9]
Validation set indices: [3 4 5]
Iteration 6
Training set indices: [0 1 9]
Validation set indices: [4 5 6]
Iteration 7
Training set indices: [0 1 2]
Validation set indices: [5 6 7]
Iteration 8
Training set indices: [0 1 2 3]
Validation set indices: [6 7 8]
Iteration 9
Training set indices: [0 1 2 3 4]
Validation set indices: [7 8 9]
Iteration 10
Training set indices: [0 1 2 3 4 5]
Validation set indices: [8 9]
Source code in timecave/validation_methods/CV.py
statistics()
Compute relevant statistics for both training and validation sets.
This method computes relevant time series features, such as mean, strength-of-trend, etc. for both the whole time series, the training set and the validation set. It can and should be used to ensure that the characteristics of both the training and validation sets are, statistically speaking, similar to those of the time series one wishes to forecast. If this is not the case, using the validation method will most likely lead to a poor assessment of the model's performance.
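The idea behind this comparison can be reproduced with plain NumPy/pandas for a single split; the indices below are hard-coded for illustration only, and only two of the features are computed.

```python
import numpy as np
import pandas as pd

ts = np.hstack((np.ones(5), np.zeros(5)))   # step-like series
train_idx = np.arange(4, 10)                 # hypothetical training indices
val_idx = np.arange(0, 2)                    # hypothetical validation indices

# Compare basic statistics of the full series against each partition;
# large discrepancies suggest the split is not representative
summary = pd.DataFrame(
    {
        "series": [ts.mean(), ts.var()],
        "train": [ts[train_idx].mean(), ts[train_idx].var()],
        "val": [ts[val_idx].mean(), ts[val_idx].var()],
    },
    index=["mean", "variance"],
)
```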
Returns:
| Type | Description |
|---|---|
| `DataFrame` | Relevant features for the entire time series. |
| `DataFrame` | Relevant features for the training set. |
| `DataFrame` | Relevant features for the validation set. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If the time series is composed of fewer than three samples. |
| `ValueError` | If the folds comprise fewer than two samples. |
Examples:
>>> import numpy as np
>>> from timecave.validation_methods.CV import hvBlockCV
>>> ts = np.hstack((np.ones(5), np.zeros(5)));
>>> splitter = hvBlockCV(ts, h=2, v=2);
>>> ts_stats, training_stats, validation_stats = splitter.statistics();
Frequency features are only meaningful if the correct sampling frequency is passed to the class.
The training set is too small to compute most meaningful features.
The training set is too small to compute most meaningful features.
>>> ts_stats
Mean Median Min Max Variance P2P_amplitude Trend_slope Spectral_centroid Spectral_rolloff Spectral_entropy Strength_of_trend Mean_crossing_rate Median_crossing_rate
0 0.5 0.5 0.0 1.0 0.25 1.0 -0.151515 0.114058 0.5 0.38717 1.59099 0.111111 0.111111
>>> training_stats
Mean Median Min Max Variance P2P_amplitude Trend_slope Spectral_centroid Spectral_rolloff Spectral_entropy Strength_of_trend Mean_crossing_rate Median_crossing_rate
0 0.0 0.0 0.0 0.0 0.0 0.0 0.000000e+00 0.0 0.0 0.0 inf 0.0 0.0
0 0.0 0.0 0.0 0.0 0.0 0.0 0.000000e+00 0.0 0.0 0.0 inf 0.0 0.0
0 0.0 0.0 0.0 0.0 0.0 0.0 0.000000e+00 0.0 0.0 0.0 inf 0.0 0.0
0 0.0 0.0 0.0 0.0 0.0 0.0 0.000000e+00 0.0 0.0 0.0 inf 0.0 0.0
0 1.0 1.0 1.0 1.0 0.0 0.0 -7.850462e-17 0.0 0.0 0.0 inf 0.0 0.0
0 1.0 1.0 1.0 1.0 0.0 0.0 8.985767e-17 0.0 0.0 0.0 inf 0.0 0.0
0 1.0 1.0 1.0 1.0 0.0 0.0 -8.214890e-17 0.0 0.0 0.0 inf 0.0 0.0
0 1.0 1.0 1.0 1.0 0.0 0.0 -1.050792e-16 0.0 0.0 0.0 inf 0.0 0.0
>>> validation_stats
Mean Median Min Max Variance P2P_amplitude Trend_slope Spectral_centroid Spectral_rolloff Spectral_entropy Strength_of_trend Mean_crossing_rate Median_crossing_rate
0 1.0 1.0 1.0 1.0 0.00 0.0 8.985767e-17 0.000000 0.0 0.000000 inf 0.00 0.00
0 1.0 1.0 1.0 1.0 0.00 0.0 -8.214890e-17 0.000000 0.0 0.000000 inf 0.00 0.00
0 1.0 1.0 1.0 1.0 0.00 0.0 -1.050792e-16 0.000000 0.0 0.000000 inf 0.00 0.00
0 0.8 1.0 0.0 1.0 0.16 1.0 -2.000000e-01 0.100000 0.4 0.630930 0.923760 0.25 0.25
0 0.6 1.0 0.0 1.0 0.24 1.0 -3.000000e-01 0.109017 0.4 0.347041 1.131371 0.25 0.25
0 0.4 0.0 0.0 1.0 0.24 1.0 -3.000000e-01 0.134752 0.4 0.347041 1.131371 0.25 0.25
0 0.2 0.0 0.0 1.0 0.16 1.0 -2.000000e-01 0.200000 0.4 1.000000 0.923760 0.25 0.25
0 0.0 0.0 0.0 0.0 0.00 0.0 0.000000e+00 0.000000 0.0 0.000000 inf 0.00 0.00
0 0.0 0.0 0.0 0.0 0.00 0.0 0.000000e+00 0.000000 0.0 0.000000 inf 0.00 0.00
0 0.0 0.0 0.0 0.0 0.00 0.0 0.000000e+00 0.000000 0.0 0.000000 inf 0.00 0.00
Source code in timecave/validation_methods/CV.py