Scaler
-
class gtda.diagrams.Scaler(metric='bottleneck', metric_params=None, function=<function amax>, n_jobs=None)
Linear scaling of persistence diagrams.
A positive scale factor scale_ is calculated during fit by considering all available persistence diagrams partitioned according to homology dimensions. During transform, all birth-death pairs are divided by scale_.
The value of scale_ depends on two things:
- A way of computing, for each homology dimension, the amplitude in that dimension of a persistence diagram consisting of birth-death-dimension triples [b, d, q]. Together, metric and metric_params define this in the same way as in Amplitude.
- A scalar-valued function which is applied to the resulting two-dimensional array of amplitudes (one per diagram and homology dimension) to obtain scale_.
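The two ingredients above can be illustrated with a minimal NumPy sketch. It does not call gtda. It assumes that the 'bottleneck' amplitude in a homology dimension is half the largest lifetime in that dimension, and the helper name bottleneck_amplitudes and the toy array X are made up for illustration.

```python
import numpy as np


def bottleneck_amplitudes(X, homology_dimensions):
    """Half the largest lifetime per diagram and homology dimension (assumed)."""
    amplitudes = np.zeros((len(X), len(homology_dimensions)))
    for j, q in enumerate(homology_dimensions):
        for i, diagram in enumerate(X):
            points = diagram[diagram[:, 2] == q]
            lifetimes = points[:, 1] - points[:, 0]
            amplitudes[i, j] = lifetimes.max() / 2 if len(lifetimes) else 0.0
    return amplitudes


# Two toy diagrams, each with two points in dimension 0 and one in dimension 1.
X = np.array([
    [[0.0, 1.0, 0], [0.0, 3.0, 0], [1.0, 2.0, 1]],
    [[0.0, 0.5, 0], [0.0, 2.0, 0], [0.5, 4.5, 1]],
])

homology_dimensions = sorted(set(X[0, :, 2]))
# One amplitude per diagram (rows) and homology dimension (columns).
amplitudes = bottleneck_amplitudes(X, homology_dimensions)
# The scalar-valued function turns the 2D array into a single scale factor.
scale = np.max(amplitudes)

# Divide births and deaths by the scale; the dimension column is left alone.
Xs = X.copy()
Xs[:, :, :2] /= scale
```

Under that assumption the longest lifetime in X is 4, so the scale is 2 and the longest lifetime in Xs is 2.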
Important note: Input collections of persistence diagrams for this transformer must satisfy certain requirements, see e.g. fit.
- Parameters
metric ('bottleneck' | 'wasserstein' | 'betti' | 'landscape' | 'silhouette' | 'heat' | 'persistence_image', optional, default: 'bottleneck') – See the corresponding parameter in Amplitude.
metric_params (dict or None, optional, default: None) – See the corresponding parameter in Amplitude.
function (callable, optional, default: numpy.max) – Function used to extract a positive scalar from the collection of amplitude vectors in fit. Must map 2D arrays to scalars.
n_jobs (int or None, optional, default: None) – The number of jobs to use for the computation. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.
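As a rough illustration of what a valid function argument looks like, the sketch below applies three candidates to a toy 2D array of amplitudes. The array values and the helpers largest_mean_amplitude and returns_scalar are hypothetical and are not part of gtda.

```python
import numpy as np

# Toy amplitudes: rows are diagrams, columns are homology dimensions.
amplitudes = np.array([[1.0, 0.5],
                       [3.0, 1.5]])


def largest_mean_amplitude(a):
    """Average over diagrams within each dimension, then take the largest."""
    return np.max(np.mean(a, axis=0))


def returns_scalar(function, a):
    """Hypothetical check that `function` maps a 2D array to a scalar."""
    return np.ndim(function(a)) == 0


scale_max = np.max(amplitudes)                     # the default choice
scale_mean = np.mean(amplitudes)                   # average of all entries
scale_custom = largest_mean_amplitude(amplitudes)  # custom callable

# A reduction along one axis only returns an array, so it does not qualify.
per_dimension_max = lambda a: np.max(a, axis=0)
```

Any callable of the first kind could be passed as function, for example Scaler(function=largest_mean_amplitude).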
-
effective_metric_params_
Dictionary containing all information present in metric_params as well as relevant quantities computed in fit.
- Type
dict
-
scale_
Value by which to rescale diagrams.
- Type
float
Notes
When metric is 'bottleneck' and function is numpy.max, fit_transform has the effect of making the lifetime of the most persistent point across all diagrams and homology dimensions equal to 2.
To compute scaling factors without first splitting the computation between different homology dimensions, data should be first transformed by an instance of ForgetDimension.
-
__init__(metric='bottleneck', metric_params=None, function=<function amax>, n_jobs=None)
Initialize self. See help(type(self)) for accurate signature.
-
fit(X, y=None)
Store all observed homology dimensions in homology_dimensions_ and compute scale_. Then, return the estimator.
- Parameters
X (ndarray of shape (n_samples, n_features, 3)) – Input data. Array of persistence diagrams, each a collection of triples [b, d, q] representing persistent topological features through their birth (b), death (d) and homology dimension (q). It is important that, for each possible homology dimension, the number of triples for which q equals that homology dimension is constant across the entries of X.
y (None) – There is no need for a target in a transformer, yet the pipeline API requires this parameter.
- Returns
self
- Return type
object
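The requirement on X stated above can be checked before fitting. The following is a minimal NumPy sketch of such a check; the helper names dimension_counts and has_constant_dimension_counts are hypothetical and are not provided by gtda.

```python
import numpy as np


def dimension_counts(diagram):
    """Map each homology dimension q to the number of triples carrying it."""
    dims, counts = np.unique(diagram[:, 2], return_counts=True)
    return dict(zip(dims.tolist(), counts.tolist()))


def has_constant_dimension_counts(X):
    """True if every diagram in X has the same number of triples per dimension."""
    reference = dimension_counts(X[0])
    return all(dimension_counts(diagram) == reference for diagram in X[1:])


# Both diagrams have two triples in dimension 0 and one in dimension 1.
X_good = np.array([
    [[0.0, 1.0, 0], [0.0, 2.0, 0], [1.0, 2.0, 1]],
    [[0.0, 0.5, 0], [0.0, 3.0, 0], [0.5, 1.5, 1]],
])

# Same overall shape, but the second diagram splits its triples differently.
X_bad = np.array([
    [[0.0, 1.0, 0], [0.0, 2.0, 0], [1.0, 2.0, 1]],
    [[0.0, 0.5, 0], [0.5, 1.5, 1], [1.0, 3.0, 1]],
])
```

Note that X_bad still has a valid shape (n_samples, n_features, 3), so a shape check alone would not catch the problem.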
-
fit_transform(X, y=None, **fit_params)
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
- Parameters
X (ndarray of shape (n_samples, n_features, 3)) – Input data. Array of persistence diagrams, each a collection of triples [b, d, q] representing persistent topological features through their birth (b), death (d) and homology dimension (q). It is important that, for each possible homology dimension, the number of triples for which q equals that homology dimension is constant across the entries of X.
y (None) – There is no need for a target in a transformer, yet the pipeline API requires this parameter.
- Returns
Xs – Rescaled diagrams.
- Return type
ndarray of shape (n_samples, n_features, 3)
-
fit_transform_plot(X, y=None, sample=0, **plot_params)
Fit to data, then apply transform_plot.
- Parameters
X (ndarray of shape (n_samples, ..)) – Input data.
y (ndarray of shape (n_samples,) or None) – Target values for supervised problems.
sample (int) – Sample to be plotted.
**plot_params – Optional plotting parameters.
- Returns
Xt – Transformed one-sample slice from the input.
- Return type
ndarray of shape (1, ..)
-
get_params(deep=True)
Get parameters for this estimator.
- Parameters
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns
params – Parameter names mapped to their values.
- Return type
mapping of string to any
-
inverse_transform(X)
Scale back the data to the original representation. Multiplies by the scale found in fit.
- Parameters
X (ndarray of shape (n_samples, n_features, 3)) – Data to apply the inverse transform to, cf. transform.
- Returns
Xs – Rescaled diagrams.
- Return type
ndarray of shape (n_samples, n_features, 3)
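The relationship between transform and inverse_transform can be sketched in plain NumPy as a division followed by a multiplication on the birth and death columns. The value of scale and the helper rescale below are stand-ins chosen for illustration, not gtda code.

```python
import numpy as np

scale = 2.0  # stands in for a fitted scale_

# One toy diagram with a point in dimension 0 and a point in dimension 1.
X = np.array([[[0.0, 3.0, 0], [0.5, 4.5, 1]]])


def rescale(X, factor):
    """Multiply births and deaths by `factor`, leaving dimensions untouched."""
    Xs = X.copy()
    Xs[:, :, :2] *= factor
    return Xs


Xs = rescale(X, 1 / scale)   # mirrors transform: divide by the scale
X_back = rescale(Xs, scale)  # mirrors inverse_transform: multiply back
```

Up to floating-point rounding, X_back matches X, and the homology dimension column is the same in all three arrays.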
-
plot(Xt, sample=0, homology_dimensions=None, plotly_params=None)
Plot a sample from a collection of persistence diagrams, with homology in multiple dimensions.
- Parameters
Xt (ndarray of shape (n_samples, n_points, 3)) – Collection of persistence diagrams, such as returned by transform.
sample (int, optional, default: 0) – Index of the sample in Xt to be plotted.
homology_dimensions (list, tuple or None, optional, default: None) – Which homology dimensions to include in the plot. None is equivalent to passing homology_dimensions_.
plotly_params (dict or None, optional, default: None) – Custom parameters to configure the plotly figure. Allowed keys are "traces" and "layout", and the corresponding values should be dictionaries containing keyword arguments as would be fed to the update_traces and update_layout methods of plotly.graph_objects.Figure.
- Returns
fig – Plotly figure.
- Return type
plotly.graph_objects.Figure object
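A plotly_params dictionary of the shape described above might look like the following sketch. The specific keyword arguments (marker_size, title) are only examples of what update_traces and update_layout accept, and the variable names scaler and Xt in the commented call are hypothetical.

```python
# Only the two documented keys are used; each maps to a dict of keyword
# arguments for the corresponding plotly Figure method.
plotly_params = {
    "traces": {"marker_size": 8},              # forwarded to update_traces
    "layout": {"title": "Rescaled diagram"},   # forwarded to update_layout
}

# With a fitted transformer `scaler` and transformed diagrams `Xt`
# (both assumed to exist), the call would look like:
# fig = scaler.plot(Xt, sample=0, plotly_params=plotly_params)
```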
-
set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
object
-
transform(X, y=None)
Divide all birth and death values in X by scale_.
- Parameters
X (ndarray of shape (n_samples, n_features, 3)) – Input data. Array of persistence diagrams, each a collection of triples [b, d, q] representing persistent topological features through their birth (b), death (d) and homology dimension (q). It is important that, for each possible homology dimension, the number of triples for which q equals that homology dimension is constant across the entries of X.
y (None) – There is no need for a target in a transformer, yet the pipeline API requires this parameter.
- Returns
Xs – Rescaled diagrams.
- Return type
ndarray of shape (n_samples, n_features, 3)
-
transform_plot(X, sample=0, **plot_params)
Take a one-sample slice from the input collection and transform it. Before returning the transformed object, plot the transformed sample.
- Parameters
X (ndarray of shape (n_samples, ..)) – Input data.
sample (int) – Sample to be plotted.
**plot_params – Optional plotting parameters.
- Returns
Xt – Transformed one-sample slice from the input.
- Return type
ndarray of shape (1, ..)