Causality Tests

The gtime.causality module provides causality tests for time series data.

class gtime.causality.GrangerCausality(target_col: str, x_col: str, max_shift=10, statistics=['ssr_f'])

Class to check for Granger causality between two time series, i.e. to check whether past values of a time series X help to predict a time series Y (X -> Y).

Parameters

target_col : str

The column to use as the reference, i.e. the time series Y.

x_col : str

The column to test for Granger causality, i.e. the time series X.

max_shift : int, optional, default: 10

The maximal number of shifts to check for Granger causality.

statistics : list, optional, default: ['ssr_f']

The statistical test(s) to perform for Granger causality. A list with elements from the set: ‘ssr_f’ (sum squared residuals with F-test), ‘ssr_chi2’ (sum squared residuals with chi square test), ‘likelihood_chi2’ (likelihood ratio test with chi square distribution), ‘zero_F’ (F-test that all lag coefficients of the time series X are zero).

Attributes

results_ : list

A list of pandas dataframes with the results for all the listed tests in ‘statistics’.

Examples

>>> from gtime.causality.granger_causality import GrangerCausality
>>> import numpy as np
>>> import pandas as pd
>>> index = pd.date_range("2000-01-01", periods=1000, freq="s")
>>> data = pd.DataFrame(np.random.randn(1000, 4), index=index, columns=list("ABCD"))
>>> gc = GrangerCausality(target_col='A', x_col='B', max_shift=10, statistics=['ssr_f']).fit(data)
>>> gc.results_[0]
                    ssr F-test
F-value               0.372640
p-value               0.958527
degrees of freedom  969.000000
number of shifts     10.000000
fit(data: DataFrame)

Create a dataframe with the results of the Granger causality test with the specified statistical test(s).

Parameters

data : pd.DataFrame, shape (n_samples, n_time_series), required

The dataframe containing the time series.

Returns

self : object

Returns the instance itself.
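
As a rough illustration of what the 'ssr_f' statistic measures, the sketch below compares a restricted regression (lags of Y only) with an unrestricted one (lags of Y and X) and forms an F-statistic from the two sums of squared residuals. It uses only the standard library; the helper names `solve_linear_system`, `ols_ssr` and `granger_f_stat` are hypothetical and are not part of gtime, whose implementation may differ.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ coef = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, copied so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - tail) / m[r][r]
    return coef


def ols_ssr(rows, targets):
    """Fit ordinary least squares via the normal equations; return the SSR."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    coef = solve_linear_system(xtx, xty)
    return sum(
        (t - sum(c * v for c, v in zip(coef, r))) ** 2
        for r, t in zip(rows, targets)
    )


def granger_f_stat(x, y, shift):
    """Return (F-statistic, denominator degrees of freedom) for X -> Y."""
    targets = y[shift:]
    # Restricted model: intercept plus `shift` lags of y.
    restricted = [
        [1.0] + [y[t - k] for k in range(1, shift + 1)]
        for t in range(shift, len(y))
    ]
    # Unrestricted model: the same regressors plus `shift` lags of x.
    unrestricted = [
        row + [x[t - k] for k in range(1, shift + 1)]
        for row, t in zip(restricted, range(shift, len(y)))
    ]
    ssr_r = ols_ssr(restricted, targets)
    ssr_u = ols_ssr(unrestricted, targets)
    dof = len(targets) - (2 * shift + 1)
    return ((ssr_r - ssr_u) / shift) / (ssr_u / dof), dof


# Synthetic data in which y follows x with a delay of one step.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(300)]
y = [0.0] + [0.8 * x[t - 1] + 0.1 * rng.gauss(0, 1) for t in range(1, 300)]
f_stat, dof = granger_f_stat(x, y, shift=2)
print(f_stat, dof)
```

The degrees of freedom follow the same bookkeeping as the example output above: with 1000 samples and 10 shifts, 990 rows remain after lagging and 21 coefficients are fitted, which leaves 969.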

class gtime.causality.ShiftedLinearCoefficient(min_shift: int = 1, max_shift: int = 10, target_col: Optional[str] = None, dropna: bool = False, bootstrap_iterations: Optional[int] = None, permutation_iterations: Optional[int] = None)

Test the shifted linear fit coefficients between two or more time series.

Parameters

min_shift : int, optional, default: 1

The minimum number of shifts to check for.

max_shift : int, optional, default: 10

The maximum number of shifts to check for.

target_col : str, optional, default: None

The column to use as the reference (i.e., the column which is not shifted).

dropna : bool, optional, default: False

Determines whether the NaN values created by shifting are retained or dropped.

bootstrap_iterations : int, optional, default: None

If not None, compute the p-values of the test by bootstrapping the original data (sampling with replacement).

permutation_iterations : int, optional, default: None

If not None, compute the p-values of the test by permuting the original data.

Examples

>>> from gtime.causality.linear_coefficient import ShiftedLinearCoefficient
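>>> import numpy as np
>>> import pandas as pd
>>> index = pd.date_range("2000-01-01", periods=30, freq="s")
>>> data = pd.DataFrame(np.random.randn(30, 4), index=index, columns=list("ABCD"))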
>>> slc = ShiftedLinearCoefficient(target_col="A")
>>> slc.fit(data)
>>> slc.best_shifts_
y  A  B  C  D
x
A  3  6  8  5
B  9  9  4  1
C  8  2  4  9
D  3  9  4  3
>>> slc.max_corrs_
y         A         B         C         D
x
A  0.460236  0.420005  0.339370  0.267143
B  0.177856  0.300350  0.367150  0.550490
C  0.484860  0.263036  0.456046  0.251342
D  0.580068  0.344688  0.253626  0.256220
fit(data: DataFrame) -> ShiftedLinearCoefficient

Create the DataFrame of shifts of each time series which maximize the shifted linear fit coefficients.

Parameters

data : pd.DataFrame, shape (n_samples, n_time_series), required

The DataFrame containing the time-series on which to compute the shifted linear fit coefficients.

Returns

self : ShiftedLinearCoefficient
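
The idea of searching over shifts for the largest linear fit coefficient can be sketched in a few lines of plain Python. This is a minimal illustration, assuming the coefficient is the least-squares slope (with an intercept) of y[t] on x[t - shift]; the functions `slope` and `best_shift_by_slope` are hypothetical and the exact fit used by gtime may differ.

```python
import random


def slope(xs, ys):
    """Least-squares slope of ys regressed on xs (with an intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var = sum((a - mean_x) ** 2 for a in xs)
    return cov / var


def best_shift_by_slope(x, y, min_shift=1, max_shift=10):
    """Return (shift, slope) maximizing the slope of y[t] on x[t - shift]."""
    scores = {}
    for shift in range(min_shift, max_shift + 1):
        # Pair x[t - shift] with y[t]; the first `shift` targets have no partner.
        scores[shift] = slope(x[: len(x) - shift], y[shift:])
    best = max(scores, key=scores.get)
    return best, scores[best]


# Synthetic data in which y is roughly 2 * x delayed by three steps.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [0.0] * 3 + [2.0 * x[t - 3] + 0.1 * rng.gauss(0, 1) for t in range(3, 200)]
shift, coefficient = best_shift_by_slope(x, y)
print(shift, coefficient)
```

On this synthetic input the search recovers the delay of three steps, with a coefficient close to 2.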

class gtime.causality.ShiftedPearsonCorrelation(min_shift: int = 1, max_shift: int = 10, target_col: Optional[str] = None, dropna: bool = False, bootstrap_iterations: Optional[int] = None, permutation_iterations: Optional[int] = None)

Class responsible for assessing the shifted Pearson correlations (PPMCC) between two or more time series.

Parameters

min_shift : int, optional, default: 1

The minimum number of shifts to check for.

max_shift : int, optional, default: 10

The maximum number of shifts to check for.

target_col : str, optional, default: None

The column to use as the reference (i.e., the column which is not shifted).

dropna : bool, optional, default: False

Determines whether the NaN values created by shifting are retained or dropped.

bootstrap_iterations : int, optional, default: None

If not None, compute the p-values of the test by bootstrapping the original data (sampling with replacement).

permutation_iterations : int, optional, default: None

If not None, compute the p-values of the test by permuting the original data.
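
To make the permutation option more concrete, here is a minimal standard-library sketch of a permutation p-value for a Pearson correlation. It assumes the p-value is the share of shuffled datasets whose absolute correlation is at least as large as the observed one, with an add-one correction; the names `pearson` and `permutation_p_value` are hypothetical, and the procedure gtime actually applies is not specified here.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / (var_x * var_y) ** 0.5


def permutation_p_value(x, y, iterations=200, seed=0):
    """Estimate how often shuffled data matches the observed |correlation|."""
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    shuffled = list(x)
    hits = 0
    for _ in range(iterations):
        # Shuffling x breaks any real association with y.
        rng.shuffle(shuffled)
        if abs(pearson(shuffled, y)) >= observed:
            hits += 1
    # The add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (iterations + 1)


# Strongly related series should give a small p-value.
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(100)]
y = [v + 0.2 * rng.gauss(0, 1) for v in x]
p_value = permutation_p_value(x, y)
print(p_value)
```

Note that plain shuffling also destroys any serial dependence within a series, so this sketch is best read as an illustration of the principle rather than a recipe for autocorrelated data.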

Examples

>>> from gtime.causality.pearson_correlation import ShiftedPearsonCorrelation
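>>> import numpy as np
>>> import pandas as pd
>>> index = pd.date_range("2000-01-01", periods=30, freq="s")
>>> data = pd.DataFrame(np.random.randn(30, 4), index=index, columns=list("ABCD"))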
>>> spc = ShiftedPearsonCorrelation(target_col="A")
>>> spc.fit(data)
>>> spc.best_shifts_
y  A  B  C  D
x
A  8  9  6  5
B  7  4  4  6
C  3  4  9  9
D  7  1  9  1
>>> spc.max_corrs_
y         A         B         C         D
x
A  0.383800  0.260627  0.343628  0.360151
B  0.311608  0.307203  0.255969  0.298523
C  0.373613  0.267335  0.211913  0.140034
D  0.496535  0.204770  0.402473  0.310065
fit(data: DataFrame) -> ShiftedPearsonCorrelation

Create the DataFrame of shifts of each time series which maximize the Pearson correlation (PPMCC).

Parameters

data : pd.DataFrame, shape (n_samples, n_time_series), required

The DataFrame containing the time series on which to compute the shifted correlations.

Returns

self : ShiftedPearsonCorrelation
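
The layout of the best_shifts_ and max_corrs_ tables in the example above (rows labelled x, columns labelled y) can be mimicked with nested dictionaries. The sketch below is an assumption-laden illustration: it treats the row series x as the shifted one and the column series y as the reference, and the helper `shifted_correlation_tables` is hypothetical rather than part of gtime.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / (var_x * var_y) ** 0.5


def shifted_correlation_tables(series, min_shift=1, max_shift=10):
    """Return (best_shifts, max_corrs) as nested dicts keyed [x_name][y_name]."""
    best_shifts, max_corrs = {}, {}
    for x_name, x in series.items():
        best_shifts[x_name], max_corrs[x_name] = {}, {}
        for y_name, y in series.items():
            # Correlate x[t - shift] with y[t] for every candidate shift.
            corrs = {
                shift: pearson(x[: len(x) - shift], y[shift:])
                for shift in range(min_shift, max_shift + 1)
            }
            best = max(corrs, key=corrs.get)
            best_shifts[x_name][y_name] = best
            max_corrs[x_name][y_name] = corrs[best]
    return best_shifts, max_corrs


# Synthetic data in which B repeats A four steps later, plus a little noise.
rng = random.Random(0)
a = [rng.gauss(0, 1) for _ in range(300)]
b = [0.0] * 4 + [a[t - 4] + 0.1 * rng.gauss(0, 1) for t in range(4, 300)]
best_shifts, max_corrs = shifted_correlation_tables({"A": a, "B": b})
print(best_shifts["A"]["B"], max_corrs["A"]["B"])
```

Only the entry for x = A, y = B carries a real signal here; the remaining entries report whichever shift happens to give the largest chance correlation, which is why p-values from bootstrapping or permutation are useful alongside the raw tables.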