SingleTakensEmbedding
- 
class gtda.time_series.SingleTakensEmbedding(parameters_type='search', time_delay=1, dimension=5, stride=1, n_jobs=None)
- Representation of a single univariate time series as a point cloud.
- Based on a time-delay embedding technique named after F. Takens [1, 2]. Given a discrete time series \((X_0, X_1, \ldots)\) and a sequence of evenly sampled times \(t_0, t_1, \ldots\), one extracts a set of \(d\)-dimensional vectors of the form \((X_{t_i}, X_{t_i + \tau}, \ldots , X_{t_i + (d-1)\tau})\) for \(i = 0, 1, \ldots\). This set is called the Takens embedding of the time series and can be interpreted as a point cloud.
- The difference between \(t_{i+1}\) and \(t_i\) is called the stride, \(\tau\) is called the time delay, and \(d\) is called the (embedding) dimension.
- If \(d\) and \(\tau\) are not explicitly set, suitable values are searched for during fit [3, 4].
- To compute time-delay embeddings of several time series simultaneously, use TakensEmbedding instead.
- Parameters
- parameters_type ('search' | 'fixed', optional, default: 'search') – If set to 'fixed', the values of time_delay and dimension are used directly in transform. If set to 'search', takens_embedding_optimal_parameter is run in fit to estimate optimal values for these quantities and store them as time_delay_ and dimension_.
- time_delay (int, optional, default: 1) – Time delay between two consecutive values for constructing one embedded point. If parameters_type is 'search', it corresponds to the maximum time delay that will be considered.
- dimension (int, optional, default: 5) – Dimension of the embedding space. If parameters_type is 'search', it corresponds to the maximum embedding dimension that will be considered.
- stride (int, optional, default: 1) – Stride duration between two consecutive embedded points. It defaults to 1 as this is the usual value in the statement of Takens's embedding theorem.
- n_jobs (int or None, optional, default: None) – The number of jobs to use for the computation. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.
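The roles of the time delay, dimension and stride can be made concrete with a small dependency-free sketch. The function `takens_embedding_sketch` below is a hypothetical helper written for illustration only; it is not the library's implementation, and it assumes the conventions stated on this page (the last embedded vector ends at the last sample, and the number of points follows the `n_points` formula given under `transform`).

```python
def takens_embedding_sketch(x, time_delay=1, dimension=2, stride=1):
    """Pure-Python time-delay embedding (illustrative, not gtda's code)."""
    x = list(x)
    n_samples = len(x)
    # Number of samples spanned by one embedded vector, minus one.
    span = time_delay * (dimension - 1)
    n_points = (n_samples - span - 1) // stride + 1
    if n_points < 1:
        raise ValueError("time series too short for these parameters")
    # Anchor the last window at the end of the series, then step backwards
    # by `stride`, so that any unused samples are dropped from the start.
    last_start = n_samples - 1 - span
    starts = [last_start - stride * k for k in range(n_points - 1, -1, -1)]
    return [[x[s + j * time_delay] for j in range(dimension)] for s in starts]


points = takens_embedding_sketch(range(10), time_delay=2, dimension=3, stride=2)
print(points)  # [[1, 3, 5], [3, 5, 7], [5, 7, 9]]
```

With a stride of 2 the first sample (0) does not appear in any vector, while the last coordinate of the last vector is the last sample (9).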
 
 - 
time_delay_
- Actual time delay used to embed. If parameters_type is 'search', it is the calculated optimal time delay and is less than or equal to time_delay. Otherwise it is equal to time_delay.
- Type
- int 
 
 - 
dimension_
- Actual embedding dimension used to embed. If parameters_type is 'search', it is the calculated optimal embedding dimension and is less than or equal to dimension. Otherwise it is equal to dimension.
- Type
- int 
 
- Examples
>>> import numpy as np
>>> from gtda.time_series import SingleTakensEmbedding
>>> # Create a noisy signal
>>> rng = np.random.default_rng()
>>> n_samples = 10000
>>> signal = np.asarray([np.sin(x / 50) + 0.5 * rng.random()
...                      for x in range(n_samples)])
>>> # Set up the transformer
>>> STE = SingleTakensEmbedding(parameters_type='search', dimension=5,
...                             time_delay=5, n_jobs=-1)
>>> # Fit and transform
>>> signal_embedded = STE.fit_transform(signal)
>>> print('Optimal time delay based on mutual information:',
...       STE.time_delay_)
Optimal time delay based on mutual information: 5
>>> print('Optimal embedding dimension based on false nearest neighbors:',
...       STE.dimension_)
Optimal embedding dimension based on false nearest neighbors: 2
>>> print(signal_embedded.shape)
(9995, 2)
- Notes
- The current implementation favours the last value over the first one, in the sense that the last coordinate of the last vector in a Takens embedded time series always equals the last value in the original time series. Hence, a number of initial values (equal to the remainder of the division of n_samples - time_delay * (dimension - 1) - 1 by the stride) may be lost.
- References
- 1
- F. Takens, “Detecting strange attractors in turbulence”. In: Rand D., Young LS. (eds) Dynamical Systems and Turbulence, Warwick 1980. Lecture Notes in Mathematics, vol. 898. Springer, 1981; DOI: 10.1007/BFb0091924. 
- 2
- J. A. Perea and J. Harer, “Sliding Windows and Persistence: An Application of Topological Methods to Signal Analysis”; Foundations of Computational Mathematics, 15, pp. 799–838; DOI: 10.1007/s10208-014-9206-z. 
- 3
- M. B. Kennel, R. Brown, and H. D. I. Abarbanel, “Determining embedding dimension for phase-space reconstruction using a geometrical construction”; Phys. Rev. A 45, pp. 3403–3411, 1992; DOI: 10.1103/PhysRevA.45.3403. 
- 4
- N. Sanderson, “Topological Data Analysis of Time Series using Witness Complexes”; PhD thesis, University of Colorado at Boulder, 2018; https://scholar.colorado.edu/math_gradetds/67. 
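The arithmetic in the Notes can be checked with a short sketch. The helper `embedding_size_sketch` is hypothetical and only restates the `n_points` formula from this page; it assumes the series is long enough that `n_samples - time_delay * (dimension - 1) - 1` is non-negative.

```python
def embedding_size_sketch(n_samples, time_delay, dimension, stride):
    """Return (n_points, n_dropped) for the given embedding parameters."""
    # Index of the first coordinate of the last embedded vector.
    span = n_samples - time_delay * (dimension - 1) - 1
    n_points = span // stride + 1
    # Initial samples that fall before the first embedded vector.
    n_dropped = span % stride
    return n_points, n_dropped


# Values from the example above: 10000 samples, time delay 5, dimension 2.
print(embedding_size_sketch(10000, 5, 2, 1))  # (9995, 0)
# Same series with a larger stride: two initial samples are not used.
print(embedding_size_sketch(10000, 5, 2, 4))  # (2499, 2)
```

The first result agrees with the shape `(9995, 2)` printed in the example.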
 - 
__init__(parameters_type='search', time_delay=1, dimension=5, stride=1, n_jobs=None)
- Initialize self. See help(type(self)) for accurate signature. 
 - 
fit(X, y=None)
- If necessary, compute the optimal time delay and embedding dimension. Then, return the estimator.
- This method is here to implement the usual scikit-learn API and hence work in pipelines.
- Parameters
- X (ndarray of shape (n_samples,) or (n_samples, 1)) – Input data. 
- y (None) – There is no need for a target, yet the pipeline API requires this parameter. 
 
- Returns
- self 
- Return type
- object 
 
 - 
fit_transform(X, y=None, **fit_params)
- Fit to data, then transform it.
- Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
- Parameters
- X (ndarray of shape (n_samples,) or (n_samples, 1)) – Input data. 
- y (None) – There is no need for a target, yet the pipeline API requires this parameter. 
 
- Returns
- Xt – Output point cloud in Euclidean space of dimension given by dimension_. n_points = (n_samples - time_delay * (dimension - 1) - 1) // stride + 1.
- Return type
- ndarray of shape (n_points, n_dimensions) 
 
 - 
fit_transform_resample(X, y, **fit_params)
- Fit to data, then transform the input and resample the target.
- Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X and a resampled version of y.
- Parameters
- X (ndarray of shape (n_samples, …)) – Input data.
- y (ndarray of shape (n_samples,)) – Target data. 
 
- Returns
- Xt (ndarray of shape (n_samples, …)) – Transformed input. 
- yr (ndarray of shape (n_samples, …)) – Resampled target. 
 
 
 - 
get_params(deep=True)
- Get parameters for this estimator.
- Parameters
- deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators. 
- Returns
- params – Parameter names mapped to their values. 
- Return type
- mapping of string to any 
 
 - 
resample(y, X=None)
- Resample y so that, for any i > 0, the i-th entry from the end of the resampled vector corresponds in time to the last coordinate of the i-th vector from the end of the embedding produced by transform.
- Parameters
- y (ndarray of shape (n_samples,)) – Target. 
- X (None) – There is no need for input data, yet the pipeline API requires this parameter. 
 
- Returns
- yr – The resampled target. n_samples_new = (n_samples - time_delay * (dimension - 1) - 1) // stride + 1.
- Return type
- ndarray of shape (n_samples_new,) 
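The alignment described for `resample` can be sketched as follows. `resample_sketch` is a hypothetical, dependency-free helper and not the library's code; it assumes only the alignment rule and the `n_samples_new` formula stated above, and picks the entries of `y` that sit at the time of the last coordinate of each embedded vector.

```python
def resample_sketch(y, time_delay, dimension, stride):
    """Keep the targets aligned with the last coordinate of each vector."""
    y = list(y)
    n_samples = len(y)
    n_new = (n_samples - time_delay * (dimension - 1) - 1) // stride + 1
    if n_new < 1:
        raise ValueError("target too short for these parameters")
    # Count back from the last sample in steps of `stride`, then return the
    # selected targets in chronological order.
    return [y[n_samples - 1 - stride * k] for k in range(n_new - 1, -1, -1)]


print(resample_sketch(range(10), time_delay=2, dimension=3, stride=2))
# [5, 7, 9]
```

For a series of 10 samples with time delay 2, dimension 3 and stride 2, the embedded vectors end at samples 5, 7 and 9, which are exactly the targets kept here.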
 
 - 
set_params(**params)
- Set the parameters of this estimator.
- The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.
- Parameters
- **params (dict) – Estimator parameters. 
- Returns
- self – Estimator instance. 
- Return type
- object 
 
 - 
transform(X, y=None)
- Compute the Takens embedding of X.
- Parameters
- X (ndarray of shape (n_samples,) or (n_samples, 1)) – Input data. 
- y (None) – Ignored. 
 
- Returns
- Xt – Output point cloud in Euclidean space of dimension given by dimension_. n_points = (n_samples - time_delay * (dimension - 1) - 1) // stride + 1.
- Return type
- ndarray of shape (n_points, n_dimensions) 
 
 - 
transform_resample(X, y)
- Transform the input and resample the target.
- Returns a transformed version of X and a resampled version of y.
- Parameters
- X (ndarray of shape (n_samples, …)) – Input data.
- y (ndarray of shape (n_samples,)) – Target data. 
 
- Returns
- Xt (ndarray of shape (n_samples, …)) – Transformed input. 
- yr (ndarray of shape (n_samples, …)) – Resampled target.