ConsistentRescaling
class gtda.point_clouds.ConsistentRescaling(metric='euclidean', metric_params=None, neighbor_rank=1, n_jobs=None)
- Rescaling of distances between pairs of points by the geometric mean of the distances to the respective \(k\)-th nearest neighbours. Based on ideas in [1].
- The computation during transform depends on the nature of the array X. If each entry in X along axis 0 represents a distance matrix \(D\), then the corresponding entry in the transformed array is the distance matrix \(D'_{i,j} = D_{i,j}/\sqrt{D_{i,k_i}D_{j,k_j}}\), where \(k_i\) is the index of the \(k\)-th nearest neighbour of point \(i\), i.e. of the \(k\)-th smallest value in row \(i\) not counting the zero diagonal entry (and similarly for \(j\)), with \(k\) given by neighbor_rank. If the entries in X represent point clouds, their distance matrices are first computed, and then rescaled according to the same formula.
- Parameters
- metric (string or callable, optional, default: 'euclidean') – If set to 'precomputed', each entry in X along axis 0 is interpreted to be a distance matrix. Otherwise, entries are interpreted as feature arrays, and metric determines a rule with which to calculate distances between pairs of instances (i.e. rows) in these arrays. If metric is a string, it must be one of the options allowed by scipy.spatial.distance.pdist for its metric parameter, or a metric listed in sklearn.metrics.pairwise.PAIRWISE_DISTANCE_FUNCTIONS, including “euclidean”, “manhattan” or “cosine”. If metric is a callable function, it is called on each pair of instances and the resulting value recorded. The callable should take two arrays (rows of an entry in X) as input, and return a value indicating the distance between them.
- metric_params (dict or None, optional, default: None) – Additional keyword arguments for the metric function.
- neighbor_rank (int, optional, default: 1) – Rank of the neighbors used to modify the metric structure according to the “consistent rescaling” procedure.
- n_jobs (int or None, optional, default: None) – The number of jobs to use for the computation. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.
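The rescaling formula in the class description can be checked by hand on a small point cloud. The following is a minimal sketch in plain Python (standard library only, so the arithmetic stays visible); the helper names pairwise_distances and consistent_rescaling are hypothetical and are not part of gtda. It assumes all points are distinct, so that no nearest-neighbour distance is zero.

```python
import math


def pairwise_distances(points):
    """Euclidean distance matrix of a list of points (hypothetical helper)."""
    return [[math.dist(p, q) for q in points] for p in points]


def consistent_rescaling(D, neighbor_rank=1):
    """Divide D[i][j] by sqrt(d_k(i) * d_k(j)).

    d_k(i) is the distance from point i to its neighbor_rank-th nearest
    neighbour. Assumes distinct points, so that d_k(i) > 0.
    """
    # Each row sorted in ascending order starts with the diagonal zero,
    # so position `neighbor_rank` holds the k-th nearest-neighbour distance.
    d_k = [sorted(row)[neighbor_rank] for row in D]
    n = len(D)
    return [
        [D[i][j] / math.sqrt(d_k[i] * d_k[j]) for j in range(n)]
        for i in range(n)
    ]


points = [[0, 0], [1, 2], [5, 6]]
D = pairwise_distances(points)
D_rescaled = consistent_rescaling(D, neighbor_rank=1)

# Points 0 and 1 are each other's nearest neighbour, so their rescaled
# distance is approximately 1.
print(D_rescaled[0][1])
```

With neighbor_rank=1, any two points that are mutually nearest neighbours end up at rescaled distance close to 1, which is one way to see how the procedure evens out differences in local density.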
 
effective_metric_params_
- Dictionary containing all information present in metric_params. If metric_params is None, it is set to the empty dictionary.
- Type
- dict 
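As a rough illustration of why an empty dictionary is a convenient default, the sketch below normalises metric_params and unpacks the result as keyword arguments of a metric. Both effective_metric_params and minkowski are hypothetical stand-ins written for this example; they are not gtda functions, and the copy made here is an assumption about sensible behaviour rather than a statement about the library's internals.

```python
def effective_metric_params(metric_params=None):
    """Return a dict that is always safe to unpack with ** (hypothetical helper)."""
    # None becomes an empty dict; otherwise work on a shallow copy so the
    # caller's dictionary is never modified.
    return {} if metric_params is None else dict(metric_params)


def minkowski(u, v, p=2):
    """Toy Minkowski distance used as the metric in this sketch."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1 / p)


# No extra parameters: the metric falls back to its own default p=2.
d_default = minkowski([0, 0], [3, 4], **effective_metric_params(None))

# Extra parameters are forwarded unchanged to the metric.
d_p1 = minkowski([0, 0], [3, 4], **effective_metric_params({"p": 1}))

print(d_default, d_p1)  # 5.0 7.0
```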
 
- Examples
>>> import numpy as np
>>> from gtda.point_clouds import ConsistentRescaling
>>> X = np.array([[[0, 0], [1, 2], [5, 6]]])
>>> cr = ConsistentRescaling()
>>> X_rescaled = cr.fit_transform(X)
>>> print(X_rescaled.shape)
(1, 3, 3)
- References
- [1] T. Berry and T. Sauer, “Consistent manifold representation for topological data analysis”; Foundations of Data Science 1, pp. 1–38, 2019; doi: 10.3934/fods.2019001.
__init__(metric='euclidean', metric_params=None, neighbor_rank=1, n_jobs=None)
- Initialize self. See help(type(self)) for accurate signature. 
fit(X, y=None)
- Calculate effective_metric_params_. Then, return the estimator.
- This method is here to implement the usual scikit-learn API and hence work in pipelines.
- Parameters
- X (ndarray of shape (n_samples, n_points, n_points) or (n_samples, n_points, n_dimensions)) – Input data. If metric == 'precomputed', the input should be an ndarray in which each entry along axis 0 is a distance matrix of shape (n_points, n_points). Otherwise, each such entry will be interpreted as an array of n_points row vectors in n_dimensions-dimensional space.
- y (None) – There is no need for a target in a transformer, yet the pipeline API requires this parameter. 
 
- Returns
- self 
- Return type
- object 
 
fit_transform(X, y=None, **fit_params)
- Fit to data, then transform it.
- Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
- Parameters
- X (ndarray of shape (n_samples, n_points, n_points) or (n_samples, n_points, n_dimensions)) – Input data. If metric == 'precomputed', the input should be an ndarray in which each entry along axis 0 is a distance matrix of shape (n_points, n_points). Otherwise, each such entry will be interpreted as an array of n_points row vectors in n_dimensions-dimensional space.
- y (None) – There is no need for a target in a transformer, yet the pipeline API requires this parameter. 
 
- Returns
- Xt – Array containing (as entries along axis 0) the distance matrices after consistent rescaling. 
- Return type
- ndarray of shape (n_samples, n_points, n_points) 
 
fit_transform_plot(X, y=None, sample=0, **plot_params)
- Fit to data, then apply transform_plot.
- Parameters
- X (ndarray of shape (n_samples, ..)) – Input data. 
- y (ndarray of shape (n_samples,) or None) – Target values for supervised problems. 
- sample (int) – Sample to be plotted. 
- **plot_params – Optional plotting parameters. 
 
- Returns
- Xt – Transformed one-sample slice from the input. 
- Return type
- ndarray of shape (1, ..) 
 
get_params(deep=True)
- Get parameters for this estimator.
- Parameters
- deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators. 
- Returns
- params – Parameter names mapped to their values. 
- Return type
- mapping of string to any 
 
static plot(Xt, sample=0, colorscale='blues')
- Plot a sample from a collection of distance matrices.
- Parameters
- Xt (ndarray of shape (n_samples, n_points, n_points)) – Collection of distance matrices, such as returned by transform.
- sample (int, optional, default: 0) – Index of the sample to be plotted.
- colorscale (str, optional, default: 'blues') – Color scale to be used in the heat map. Can be anything allowed by plotly.graph_objects.Heatmap.
 
 
set_params(**params)
- Set the parameters of this estimator.
- The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
- **params (dict) – Estimator parameters. 
- Returns
- self – Estimator instance. 
- Return type
- object 
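The <component>__<parameter> naming convention can be pictured as splitting each key at its first double underscore. The sketch below only illustrates that convention; split_nested_params is a hypothetical helper, "rescaling" is a made-up pipeline step name, and the code is not scikit-learn's implementation.

```python
def split_nested_params(params):
    """Separate plain keys from '<component>__<parameter>' keys (hypothetical)."""
    own, nested = {}, {}
    for key, value in params.items():
        # partition splits at the first "__" only, so deeper nesting such as
        # "a__b__c" is handed on to component "a" as the key "b__c".
        name, delimiter, sub_key = key.partition("__")
        if delimiter:
            nested.setdefault(name, {})[sub_key] = value
        else:
            own[name] = value
    return own, nested


own, nested = split_nested_params({
    "n_jobs": 2,                        # parameter of the object itself
    "rescaling__neighbor_rank": 3,      # parameter of the step "rescaling"
    "rescaling__metric": "manhattan",
})

print(own)     # {'n_jobs': 2}
print(nested)  # {'rescaling': {'neighbor_rank': 3, 'metric': 'manhattan'}}
```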
 
transform(X, y=None)
- For each entry in the input data array X, find the metric structure after consistent rescaling and encode it as a distance matrix.
- Parameters
- X (ndarray of shape (n_samples, n_points, n_points) or (n_samples, n_points, n_dimensions)) – Input data. If metric == 'precomputed', the input should be an ndarray in which each entry along axis 0 is a distance matrix of shape (n_points, n_points). Otherwise, each such entry will be interpreted as an array of n_points row vectors in n_dimensions-dimensional space.
- y (None) – There is no need for a target in a transformer, yet the pipeline API requires this parameter. 
 
- Returns
- Xt – Array containing (as entries along axis 0) the distance matrices after consistent rescaling. 
- Return type
- ndarray of shape (n_samples, n_points, n_points) 
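To make the per-sample behaviour and the shape change concrete, here is a small sketch using nested lists in place of ndarrays and a callable metric. The functions manhattan, distance_matrix and rescale are hypothetical helpers written for this illustration, not gtda code, and distinct points are assumed so that no division by zero occurs.

```python
import math


def manhattan(u, v):
    """Callable metric: takes two rows of a point cloud, returns a distance."""
    return sum(abs(a - b) for a, b in zip(u, v))


def distance_matrix(points, metric):
    """Distance matrix of one point cloud under the given callable metric."""
    return [[metric(p, q) for q in points] for p in points]


def rescale(D, neighbor_rank=1):
    """Consistent rescaling of one distance matrix (assumes distinct points)."""
    d_k = [sorted(row)[neighbor_rank] for row in D]
    n = len(D)
    return [
        [D[i][j] / math.sqrt(d_k[i] * d_k[j]) for j in range(n)]
        for i in range(n)
    ]


# Two point clouds of three 2D points each: shape (2, 3, 2).
X = [
    [[0, 0], [1, 2], [5, 6]],
    [[0, 1], [2, 2], [3, 7]],
]

# Step 1: one distance matrix per entry along axis 0, shape (2, 3, 3).
X_dist = [distance_matrix(cloud, manhattan) for cloud in X]

# Step 2: each distance matrix is rescaled independently of the others.
Xt = [rescale(D) for D in X_dist]

print(len(Xt), len(Xt[0]), len(Xt[0][0]))  # 2 3 3
```

Feeding X_dist directly to the rescaling step corresponds to the 'precomputed' case described above: the first step is skipped and the output shape is unchanged.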
 
transform_plot(X, sample=0, **plot_params)
- Take a one-sample slice from the input collection and transform it. Before returning the transformed object, plot the transformed sample.
- Parameters
- X (ndarray of shape (n_samples, ..)) – Input data. 
- sample (int) – Sample to be plotted. 
- plot_params (dict) – Optional plotting parameters. 
 
- Returns
- Xt – Transformed one-sample slice from the input. 
- Return type
- ndarray of shape (1, ..)
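The leading dimension of 1 in the return shape follows naturally if the sample is selected by slicing rather than by indexing. The sketch below shows the difference with nested lists standing in for an ndarray; one_sample_slice is a hypothetical helper, and whether the library selects the sample in exactly this way is an assumption.

```python
def one_sample_slice(X, sample=0):
    """Return a length-1 collection containing only the requested sample."""
    # Slicing keeps the leading axis; X[sample] alone would drop it.
    return X[sample:sample + 1]


# Three samples, each a cloud of two 2D points: shape (3, 2, 2).
X = [
    [[0, 0], [1, 2]],
    [[3, 3], [4, 5]],
    [[6, 6], [7, 8]],
]

X_slice = one_sample_slice(X, sample=1)

print(len(X_slice))  # 1
print(X_slice[0])    # [[3, 3], [4, 5]]
```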