TakensEmbedding

class gtda.time_series.TakensEmbedding(time_delay=1, dimension=2, stride=1, flatten=True, ensure_last_value=True)

Point clouds from collections of time series via independent Takens embeddings.

This transformer takes collections of (possibly multivariate) time series as input, applies the Takens embedding algorithm described in SingleTakensEmbedding to each independently, and returns a corresponding collection of point clouds in Euclidean space (or possibly higher-dimensional structures, see flatten).

Parameters
• time_delay (int, optional, default: 1) – Time delay between two consecutive values for constructing one embedded point.

• dimension (int, optional, default: 2) – Dimension of the embedding space (per variable, in the multivariate case).

• stride (int, optional, default: 1) – Stride duration between two consecutive embedded points.

• flatten (bool, optional, default: True) – Only relevant when the input of transform represents a collection of multivariate or tensor-valued time series. If True, ensures that the output is a 3D ndarray or list of 2D arrays. If False, each entry of the input collection leads to an array of dimension one higher than the entry’s dimension. See Examples.

• ensure_last_value (bool, optional, default: True) – Whether the value(s) representing the last measurement(s) must be present in the output as the last coordinate(s) of the last embedding vector(s). If False, the first measurement(s) is (are) present as the 0-th coordinate(s) of the 0-th vector(s) instead.
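The interplay of time_delay, dimension, stride and ensure_last_value can be sketched in plain Python for a single univariate series. The helper delay_embed below is hypothetical and written only to mirror the behaviour documented above; it should not be read as the library's implementation.

```python
def delay_embed(series, time_delay=1, dimension=2, stride=1,
                ensure_last_value=True):
    """Hypothetical helper: delay-embed one univariate series (a list)."""
    n_timestamps = len(series)
    # Number of embedded points, and how many samples are left over
    n_points, offset = divmod(
        n_timestamps - time_delay * (dimension - 1) - 1, stride
    )
    n_points += 1
    if n_points <= 0:
        raise ValueError("Series too short for these parameters.")
    # Shift every point right by the leftover so the last sample is kept
    start = offset if ensure_last_value else 0
    return [
        [series[start + i * stride + j * time_delay] for j in range(dimension)]
        for i in range(n_points)
    ]


series = [0, 1, 2, 3, 4]
anchored_at_end = delay_embed(series, stride=2)
anchored_at_start = delay_embed(series, stride=2, ensure_last_value=False)
print(anchored_at_end)    # [[1, 2], [3, 4]]
print(anchored_at_start)  # [[0, 1], [2, 3]]
```

With five samples and a stride of 2, one sample cannot be covered: ensure_last_value=True drops the first sample, while ensure_last_value=False drops the last.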

Examples

>>> import numpy as np
>>> from gtda.time_series import TakensEmbedding

Two univariate time series of duration 4:

>>> X = np.arange(8).reshape(2, 4)
>>> print(X)
[[0 1 2 3]
 [4 5 6 7]]
>>> TE = TakensEmbedding(time_delay=1, dimension=2)
>>> print(TE.fit_transform(X))
[[[0 1]
  [1 2]
  [2 3]]
 [[4 5]
  [5 6]
  [6 7]]]

Two multivariate time series of duration 4, with 2 variables:

>>> x = np.arange(8).reshape(2, 1, 4)
>>> X = np.concatenate([x, -x], axis=1)
>>> print(X)
[[[ 0  1  2  3]
  [ 0 -1 -2 -3]]
 [[ 4  5  6  7]
  [-4 -5 -6 -7]]]

Pass flatten as True (default):

>>> TE = TakensEmbedding(time_delay=1, dimension=2, flatten=True)
>>> print(TE.fit_transform(X))
[[[ 0  1  0 -1]
  [ 1  2 -1 -2]
  [ 2  3 -2 -3]]
 [[ 4  5 -4 -5]
  [ 5  6 -5 -6]
  [ 6  7 -6 -7]]]

Pass flatten as False:

>>> TE = TakensEmbedding(time_delay=1, dimension=2, flatten=False)
>>> print(TE.fit_transform(X))
[[[[ 0  1]
   [ 1  2]
   [ 2  3]]
  [[ 0 -1]
   [-1 -2]
   [-2 -3]]]
 [[[ 4  5]
   [ 5  6]
   [ 6  7]]
  [[-4 -5]
   [-5 -6]
   [-6 -7]]]]

Notes

To compute the Takens embedding of a single univariate time series in the form of a 1D array or column vector, use SingleTakensEmbedding instead.

Unlike SingleTakensEmbedding, this transformer does not include heuristics to optimize the choice of time delay and embedding dimension. The function takens_embedding_optimal_parameters is specifically dedicated to this task, but only on a single univariate time series.

If dealing with a forecasting problem on a single time series, this transformer can be used after an instance of SlidingWindow and before an instance of a homology transformer, to produce topological features from sliding windows over the time series.
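The data flow described in the last note can be sketched without the library: sliding windows turn one series into a collection of shorter series, and each window is then delay-embedded into its own point cloud. Both helpers below (sliding_windows and delay_embed) are hypothetical stand-ins in plain Python, not SlidingWindow or TakensEmbedding themselves, and the left-aligned window placement is an assumption made for simplicity.

```python
def sliding_windows(series, size, stride=1):
    """Hypothetical stand-in for a windowing step: left-aligned windows."""
    return [series[start:start + size]
            for start in range(0, len(series) - size + 1, stride)]


def delay_embed(window, time_delay=1, dimension=2):
    """Hypothetical stand-in for a Takens embedding with stride 1."""
    n_points = len(window) - time_delay * (dimension - 1)
    return [[window[i + j * time_delay] for j in range(dimension)]
            for i in range(n_points)]


series = list(range(10))
windows = sliding_windows(series, size=6, stride=2)
point_clouds = [delay_embed(w, time_delay=2, dimension=3) for w in windows]

print(len(windows))     # 3 windows of 6 samples each
print(point_clouds[0])  # [[0, 2, 4], [1, 3, 5]]
```

Each element of point_clouds plays the role of one sample that a homology transformer would then summarise into topological features.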

__init__(time_delay=1, dimension=2, stride=1, flatten=True, ensure_last_value=True)

Initialize self. See help(type(self)) for accurate signature.

fit(X, y=None)

Do nothing and return the estimator unchanged.

This method is here to implement the usual scikit-learn API and hence work in pipelines.

Parameters
• X (ndarray or list of length n_samples) – Input collection of time series. A 2D array or list of 1D arrays is interpreted as a collection of univariate time series. A 3D array or list of 2D arrays is interpreted as a collection of multivariate time series, each with shape (n_variables, n_timestamps). More generally, $$N$$-dimensional arrays or lists of $$(N-1)$$-dimensional arrays ($$N \geq 3$$) are interpreted as collections of tensor-valued time series, each with time indexed by the last axis.

• y (None) – There is no need for a target, yet the pipeline API requires this parameter.

Returns

self

Return type

object

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters
• X (ndarray or list of length n_samples) – Input collection of time series. A 2D array or list of 1D arrays is interpreted as a collection of univariate time series. A 3D array or list of 2D arrays is interpreted as a collection of multivariate time series, each with shape (n_variables, n_timestamps). More generally, $$N$$-dimensional arrays or lists of $$(N-1)$$-dimensional arrays ($$N \geq 3$$) are interpreted as collections of tensor-valued time series, each with time indexed by the last axis.

• y (None) – There is no need for a target, yet the pipeline API requires this parameter.

Returns

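Xt – The result of performing a Takens embedding of each entry in X with the given parameters. If X is a 2D array or a list of 1D arrays, Xt is a 3D array or a list of 2D arrays (respectively), each entry of which has shape (n_points, dimension) where n_points = (n_timestamps - time_delay * (dimension - 1) - 1) // stride + 1. If X is an $$N$$-dimensional array or a list of $$(N-1)$$-dimensional arrays ($$N \geq 3$$), the output shapes depend on the flatten parameter:

• if flatten is True, Xt is still a 3D array or a list of 2D arrays (respectively), each entry of which has shape (n_points, dimension * n_variables) where n_points is as above and n_variables is the product of the sizes of all axes in said entry except the last.

• if flatten is False, Xt is an $$(N+1)$$-dimensional array or list of $$N$$-dimensional arrays.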

Return type

ndarray or list of length n_samples
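As a worked instance of the n_points formula, take the illustrative values n_timestamps = 10, time_delay = 2, dimension = 3 and stride = 2 (chosen here for the arithmetic only):

```latex
\text{n\_points}
  = \left\lfloor \frac{10 - 2 \cdot (3 - 1) - 1}{2} \right\rfloor + 1
  = \left\lfloor \frac{5}{2} \right\rfloor + 1
  = 3
```

Under these assumptions each embedded univariate series would have shape (3, 3).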

fit_transform_plot(X, y=None, sample=0, **plot_params)

Fit to data, then apply transform_plot.

Parameters
• X (ndarray of shape (n_samples, ...)) – Input data.

• y (ndarray of shape (n_samples,) or None) – Target values for supervised problems.

• sample (int) – Sample to be plotted.

• **plot_params – Optional plotting parameters.

Returns

Xt – Transformed one-sample slice from the input.

Return type

ndarray of shape (1, ...)

get_params(deep=True)

Get parameters for this estimator.

Parameters

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params – Parameter names mapped to their values.

Return type

mapping of string to any

static plot(Xt, sample=0, plotly_params=None)

Plot a sample from a collection of Takens embeddings of time series, as a point cloud in 2D or 3D. If the embedded points have more than three dimensions, only the first three are plotted.

Parameters
• Xt (ndarray or list of length n_samples) – Collection of point clouds, such as returned by transform.

• sample (int, optional, default: 0) – Index of the sample in Xt to be plotted.

• plotly_params (dict or None, optional, default: None) – Custom parameters to configure the plotly figure. Allowed keys are "trace" and "layout", and the corresponding values should be dictionaries containing keyword arguments as would be fed to the update_traces and update_layout methods of plotly.graph_objects.Figure.

Returns

fig – Plotly figure.

Return type

plotly.graph_objects.Figure object
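A plotly_params dictionary of the shape described above might look as follows. The specific keyword values (marker_size, opacity, title, width, height) are illustrative choices, assuming the figure contains scatter-type traces that accept marker settings.

```python
# Keyword arguments intended for Figure.update_traces ("trace")
# and Figure.update_layout ("layout"); values are illustrative only.
plotly_params = {
    "trace": {"marker_size": 3, "opacity": 0.8},
    "layout": {
        "title": "Takens embedding of sample 0",
        "width": 600,
        "height": 600,
    },
}

# Only "trace" and "layout" are allowed as top-level keys
allowed_keys = {"trace", "layout"}
unexpected_keys = set(plotly_params) - allowed_keys
print(sorted(unexpected_keys))  # []
```

Assuming Xt is the output of transform, the dictionary would then be passed as TakensEmbedding.plot(Xt, sample=0, plotly_params=plotly_params).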

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters

**params (dict) – Estimator parameters.

Returns

self – Estimator instance.

Return type

object

transform(X, y=None)

Compute the Takens embedding of each entry in X.

Parameters
• X (ndarray or list of length n_samples) – Input collection of time series. A 2D array or list of 1D arrays is interpreted as a collection of univariate time series. A 3D array or list of 2D arrays is interpreted as a collection of multivariate time series, each with shape (n_variables, n_timestamps). More generally, $$N$$-dimensional arrays or lists of $$(N-1)$$-dimensional arrays ($$N \geq 3$$) are interpreted as collections of tensor-valued time series, each with time indexed by the last axis.

• y (None) – Ignored.

Returns

Xt – The result of performing a Takens embedding of each entry in X with the given parameters. If X is a 2D array or a list of 1D arrays, Xt is a 3D array or a list of 2D arrays (respectively), each entry of which has shape (n_points, dimension) where n_points = (n_timestamps - time_delay * (dimension - 1) - 1) // stride + 1. If X is an $$N$$-dimensional array or a list of $$(N-1)$$-dimensional arrays ($$N \geq 3$$), the output shapes depend on the flatten parameter:

• if flatten is True, Xt is still a 3D array or a list of 2D arrays (respectively), each entry of which has shape (n_points, dimension * n_variables) where n_points is as above and n_variables is the product of the sizes of all axes in said entry except the last.

• if flatten is False, Xt is an $$(N+1)$$-dimensional array or list of $$N$$-dimensional arrays.

Return type

ndarray or list of length n_samples
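The shape rules above can be checked with a small plain-Python sketch. The helper embedded_entry_shape is hypothetical (it is not part of the library) and only restates the documented shapes for a single entry of the collection, with time on the last axis.

```python
from math import prod


def embedded_entry_shape(entry_shape, time_delay=1, dimension=2, stride=1,
                         flatten=True):
    """Hypothetical helper: shape of one embedded entry of the collection.

    `entry_shape` is the shape of a single time series, e.g.
    (n_timestamps,) or (n_variables, n_timestamps).
    """
    *other_axes, n_timestamps = entry_shape
    n_points = (n_timestamps - time_delay * (dimension - 1) - 1) // stride + 1
    if n_points <= 0:
        raise ValueError("Series too short for these parameters.")
    if flatten:
        # All non-time axes are merged into the coordinate axis
        return (n_points, dimension * prod(other_axes))
    # Non-time axes are kept; two trailing axes index points and coordinates
    return (*other_axes, n_points, dimension)


univariate = embedded_entry_shape((4,))
flat = embedded_entry_shape((2, 4), flatten=True)
nested = embedded_entry_shape((2, 4), flatten=False)
print(univariate)  # (3, 2)
print(flat)        # (3, 4)
print(nested)      # (2, 3, 2)
```

These three shapes agree with the per-entry shapes of the outputs printed in the Examples section for series of duration 4.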

transform_plot(X, sample=0, **plot_params)

Take a one-sample slice from the input collection and transform it. Before returning the transformed object, plot the transformed sample.

Parameters
• X (ndarray of shape (n_samples, ...)) – Input data.

• sample (int) – Sample to be plotted.

• **plot_params – Optional plotting parameters.

Returns

Xt – Transformed one-sample slice from the input.

Return type

ndarray of shape (1, ...)