ConsecutiveRescaling
-
class gtda.point_clouds.ConsecutiveRescaling(metric='euclidean', metric_params=None, factor=0.0, n_jobs=None)
Rescaling of distances between consecutive pairs of points by a fixed factor.
The computation during transform depends on the nature of the array X. If each entry in X along axis 0 represents a distance matrix \(D\), then the corresponding entry in the transformed array is the distance matrix \(D'\) with \(D'_{i,i+1} = \alpha D_{i,i+1}\), where the positive factor \(\alpha\) is given by factor. If the entries in X represent point clouds, their distance matrices are first computed, and then rescaled according to the same formula.
- Parameters
metric (string or callable, optional, default: 'euclidean') – If set to 'precomputed', each entry in X along axis 0 is interpreted to be a distance matrix. Otherwise, entries are interpreted as feature arrays, and metric determines a rule with which to calculate distances between pairs of instances (i.e. rows) in these arrays. If metric is a string, it must be one of the options allowed by scipy.spatial.distance.pdist for its metric parameter, or a metric listed in sklearn.metrics.pairwise.PAIRWISE_DISTANCE_FUNCTIONS, including “euclidean”, “manhattan” or “cosine”. If metric is a callable function, it is called on each pair of instances and the resulting value recorded. The callable should take two arrays from the entry in X as input, and return a value indicating the distance between them.
metric_params (dict or None, optional, default: None) – Additional keyword arguments for the metric function.
factor (float, optional, default: 0.) – Factor by which to multiply the distance between consecutive points.
n_jobs (int or None, optional, default: None) – The number of jobs to use for the computation. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.
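As a minimal sketch of the callable contract described for metric: the function receives two rows of an entry of X and returns a single number. The names `chebyshev_like` and `pairwise` below are hypothetical and written in plain NumPy for illustration only; they are not part of gtda.

```python
import numpy as np

def chebyshev_like(u, v):
    """Hypothetical custom metric: largest absolute coordinate difference."""
    return float(np.max(np.abs(u - v)))

def pairwise(points, metric):
    """Illustrative helper (not a gtda function): call `metric` on each pair of rows."""
    n = len(points)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = metric(points[i], points[j])
    return D

points = np.array([[0, 0], [1, 2], [5, 6]])
D = pairwise(points, chebyshev_like)
```

Going by the parameter description above, a function with this two-argument signature is what would be supplied as `metric=chebyshev_like`.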
-
effective_metric_params_
Dictionary containing all information present in metric_params. If metric_params is None, it is set to the empty dictionary.
- Type
dict
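The rule for this attribute can be sketched in one line. The helper name `resolve_metric_params` is hypothetical, and copying the dictionary is an assumption made here for safety rather than documented behaviour.

```python
def resolve_metric_params(metric_params):
    """Hypothetical helper mirroring the documented rule for effective_metric_params_."""
    # None becomes an empty dict; otherwise copy (an assumption) so the caller's dict is untouched
    return {} if metric_params is None else dict(metric_params)

print(resolve_metric_params(None))      # {}
print(resolve_metric_params({"p": 3}))  # {'p': 3}
```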
Examples
>>> import numpy as np
>>> from gtda.point_clouds import ConsecutiveRescaling
>>> X = np.array([[[0, 0], [1, 2], [5, 6]]])
>>> cr = ConsecutiveRescaling()
>>> X_rescaled = cr.fit_transform(X)
>>> print(X_rescaled.shape)
(1, 3, 3)
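To make the formula \(D'_{i,i+1} = \alpha D_{i,i+1}\) concrete, the following NumPy-only sketch computes the Euclidean distance matrix of the same three points and multiplies the entries between consecutive points by a factor. It illustrates the formula and is not the library's implementation: the helper name `rescale_consecutive`, the value `alpha=0.5`, and the choice to rescale both \(D_{i,i+1}\) and \(D_{i+1,i}\) so that the result stays symmetric are all assumptions.

```python
import numpy as np

def rescale_consecutive(D, alpha):
    """Hypothetical helper: scale distances between consecutive points by alpha."""
    D_new = np.array(D, dtype=float)  # copy so the input is left unchanged
    idx = np.arange(D_new.shape[0] - 1)
    D_new[idx, idx + 1] *= alpha  # entries (i, i+1)
    D_new[idx + 1, idx] *= alpha  # entries (i+1, i), assumed rescaled symmetrically
    return D_new

points = np.array([[0, 0], [1, 2], [5, 6]], dtype=float)

# Euclidean distance matrix via broadcasting
diff = points[:, None, :] - points[None, :, :]
D = np.sqrt((diff ** 2).sum(axis=-1))

D_rescaled = rescale_consecutive(D, alpha=0.5)
```

In this sketch only the distances between points 0 and 1 and between points 1 and 2 change; the distance between points 0 and 2 is left as computed.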
See also
-
__init__(metric='euclidean', metric_params=None, factor=0.0, n_jobs=None)
Initialize self. See help(type(self)) for accurate signature.
-
fit(X, y=None)
Calculate effective_metric_params_. Then, return the estimator.
This method is here to implement the usual scikit-learn API and hence work in pipelines.
- Parameters
X (ndarray of shape (n_samples, n_points, n_points) or (n_samples, n_points, n_dimensions)) – Input data. If metric == 'precomputed', the input should be an ndarray in which each entry along axis 0 is a distance matrix of shape (n_points, n_points). Otherwise, each such entry will be interpreted as an array of n_points row vectors in n_dimensions-dimensional space.
y (None) – There is no need for a target in a transformer, yet the pipeline API requires this parameter.
- Returns
self
- Return type
object
-
fit_transform(X, y=None, **fit_params)
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
- Parameters
X (ndarray of shape (n_samples, n_points, n_points) or (n_samples, n_points, n_dimensions)) – Input data. If metric == 'precomputed', the input should be an ndarray in which each entry along axis 0 is a distance matrix of shape (n_points, n_points). Otherwise, each such entry will be interpreted as an array of n_points row vectors in n_dimensions-dimensional space.
y (None) – There is no need for a target in a transformer, yet the pipeline API requires this parameter.
- Returns
Xt – Array containing (as entries along axis 0) the distance matrices after consecutive rescaling.
- Return type
ndarray of shape (n_samples, n_points, n_points)
-
fit_transform_plot(X, y=None, sample=0, **plot_params)
Fit to data, then apply transform_plot.
- Parameters
X (ndarray of shape (n_samples, ..)) – Input data.
y (ndarray of shape (n_samples,) or None) – Target values for supervised problems.
sample (int) – Sample to be plotted.
**plot_params – Optional plotting parameters.
- Returns
Xt – Transformed one-sample slice from the input.
- Return type
ndarray of shape (1, ..)
-
get_params(deep=True)
Get parameters for this estimator.
- Parameters
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns
params – Parameter names mapped to their values.
- Return type
mapping of string to any
-
static plot(Xt, sample=0, colorscale='blues', plotly_params=None)
Plot a sample from a collection of distance matrices.
- Parameters
Xt (ndarray of shape (n_samples, n_points, n_points)) – Collection of distance matrices, such as returned by transform.
sample (int, optional, default: 0) – Index of the sample to be plotted.
colorscale (str, optional, default: 'blues') – Color scale to be used in the heat map. Can be anything allowed by plotly.graph_objects.Heatmap.
plotly_params (dict or None, optional, default: None) – Custom parameters to configure the plotly figure. Allowed keys are "trace" and "layout", and the corresponding values should be dictionaries containing keyword arguments as would be fed to the update_traces and update_layout methods of plotly.graph_objects.Figure.
- Returns
fig – Plotly figure.
- Return type
plotly.graph_objects.Figure object
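A plotly_params dictionary with the two allowed keys might look like the sketch below. Only the top-level keys "trace" and "layout" come from the description above; the inner keyword arguments (`showscale`, `title`) and their values are illustrative choices.

```python
# Hypothetical values; only the two top-level keys come from the documentation
plotly_params = {
    "trace": {"showscale": False},                    # keyword arguments for Figure.update_traces
    "layout": {"title": "Rescaled distance matrix"},  # keyword arguments for Figure.update_layout
}
```

Such a dictionary would then be passed as `ConsecutiveRescaling.plot(Xt, sample=0, plotly_params=plotly_params)`.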
-
set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
object
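The `<component>__<parameter>` naming can be illustrated without any library by splitting a parameter name at the first double underscore. The helper `split_param` and the step name `rescaling` are hypothetical; this sketches the naming convention only, not scikit-learn's own code.

```python
def split_param(name):
    """Hypothetical helper: split 'component__parameter' at the first '__'."""
    component, sep, parameter = name.partition("__")
    # Without a '__' separator the name refers to the estimator itself
    return (component, parameter) if sep else (None, component)

print(split_param("rescaling__factor"))  # ('rescaling', 'factor')
print(split_param("factor"))             # (None, 'factor')
```

On a standalone estimator the plain form applies, for example `cr.set_params(factor=2.0)`.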
-
transform(X, y=None)
For each entry in the input data array X, find the metric structure after consecutive rescaling and encode it as a distance matrix.
- Parameters
X (ndarray of shape (n_samples, n_points, n_points) or (n_samples, n_points, n_dimensions)) – Input data. If metric == 'precomputed', the input should be an ndarray in which each entry along axis 0 is a distance matrix of shape (n_points, n_points). Otherwise, each such entry will be interpreted as an array of n_points row vectors in n_dimensions-dimensional space.
y (None) – There is no need for a target in a transformer, yet the pipeline API requires this parameter.
- Returns
Xt – Array containing (as entries along axis 0) the distance matrices after consecutive rescaling.
- Return type
ndarray of shape (n_samples, n_points, n_points)
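For the 'precomputed' case, the shape contract can be sketched on a small batch of distance matrices using NumPy alone. The input values and `alpha` are made up for illustration, and rescaling the mirrored entries \((i+1, i)\) as well is an assumption made to keep each matrix symmetric; this is not the library's code.

```python
import numpy as np

# Two hypothetical precomputed 3x3 distance matrices, stacked along axis 0
X = np.array([
    [[0., 1., 2.], [1., 0., 3.], [2., 3., 0.]],
    [[0., 4., 5.], [4., 0., 6.], [5., 6., 0.]],
])
alpha = 2.0  # illustrative factor

Xt = X.copy()
i = np.arange(X.shape[1] - 1)
Xt[:, i, i + 1] *= alpha  # consecutive pairs (i, i+1) in every sample
Xt[:, i + 1, i] *= alpha  # mirrored entries, assumed rescaled as well

print(Xt.shape)  # (2, 3, 3)
```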
-
transform_plot(X, sample=0, **plot_params)
Take a one-sample slice from the input collection and transform it. Before returning the transformed object, plot the transformed sample.
- Parameters
X (ndarray of shape (n_samples, ..)) – Input data.
sample (int) – Sample to be plotted.
**plot_params – Optional plotting parameters.
- Returns
Xt – Transformed one-sample slice from the input.
- Return type
ndarray of shape (1, ..)
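The "one-sample slice" is assumed here to be a slice of length one along axis 0, which is consistent with the documented return shape (1, ..). The array below is a hypothetical input used only to show the difference between slicing and indexing.

```python
import numpy as np

X = np.zeros((4, 3, 2))  # 4 hypothetical point clouds of 3 points in 2D
sample = 2

X_slice = X[sample:sample + 1]  # slicing (not indexing) keeps axis 0
print(X_slice.shape)            # (1, 3, 2)
print(X[sample].shape)          # (3, 2)
```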