# Filtering

class gtda.diagrams.Filtering(homology_dimensions=None, epsilon=0.01)

Filtering of persistence diagrams.

Filtering a diagram means discarding all points [b, d, q] representing non-trivial topological features whose lifetime d - b is less than or equal to a cutoff value. Points on the diagonal (i.e. for which b and d are equal) may still appear in the output for padding purposes, but carry no information.
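The cutoff rule above can be sketched in plain Python. This is a minimal illustration and not the library's implementation: the helper name `filter_diagram` is hypothetical, diagrams are represented as lists of `[b, d, q]` triples rather than arrays, and no padding is applied.

```python
def filter_diagram(diagram, epsilon=0.01):
    """Keep only the triples [b, d, q] whose lifetime d - b exceeds epsilon.

    Hypothetical helper for illustration; points with lifetime less than
    or equal to epsilon are discarded, as described in the text.
    """
    return [[b, d, q] for b, d, q in diagram if d - b > epsilon]


diagram = [
    [0.0, 1.0, 0],    # lifetime 1.0
    [0.5, 0.625, 0],  # lifetime 0.125
    [1.0, 1.25, 1],   # lifetime 0.25, equal to the cutoff used below
    [0.25, 0.75, 1],  # lifetime 0.5
]

filtered = filter_diagram(diagram, epsilon=0.25)
```

With a cutoff of 0.25, the second and third points are dropped: the comparison is "less than or equal to", so a point whose lifetime equals the cutoff does not survive.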

Important note:

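• Input collections of persistence diagrams for this transformer must satisfy the requirements described under fit: for each homology dimension, the number of points in that dimension must be constant across the diagrams in the collection.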

Parameters
• homology_dimensions (list, tuple, or None, optional, default: None) – When set to None, subdiagrams corresponding to all homology dimensions seen in fit will be filtered. Otherwise, it contains the homology dimensions (as non-negative integers) at which filtering should occur.

• epsilon (float, optional, default: 0.01) – The cutoff value controlling the amount of filtering.

homology_dimensions_

If homology_dimensions is set to None, contains the homology dimensions seen in fit, sorted in ascending order. Otherwise, it is a similarly sorted version of homology_dimensions.

Type

tuple
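The rule for homology_dimensions_ can be sketched as follows. The function name `resolve_homology_dimensions` is hypothetical and the diagrams are plain lists of `[b, d, q]` triples; the sketch only mirrors the behaviour described above and is not taken from the library source.

```python
def resolve_homology_dimensions(X, homology_dimensions=None):
    """Return the sorted tuple of dimensions that would be filtered.

    Hypothetical helper: with None, collect every dimension q seen in X;
    otherwise sort the dimensions supplied by the user.
    """
    if homology_dimensions is None:
        seen = {int(q) for diagram in X for _, _, q in diagram}
        return tuple(sorted(seen))
    return tuple(sorted(homology_dimensions))


X = [
    [[0.0, 1.0, 1], [0.0, 2.0, 0]],
    [[0.0, 1.0, 0], [0.0, 3.0, 1]],
]

all_dims = resolve_homology_dimensions(X)          # dimensions seen in X
some_dims = resolve_homology_dimensions(X, (2, 0)) # user-supplied, sorted
```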

__init__(homology_dimensions=None, epsilon=0.01)

Initialize self. See help(type(self)) for accurate signature.

fit(X, y=None)

Store relevant homology dimensions in homology_dimensions_. Then, return the estimator.

This method is here to implement the usual scikit-learn API and hence work in pipelines.

Parameters
• X (ndarray of shape (n_samples, n_features, 3)) – Input data. Array of persistence diagrams, each a collection of triples [b, d, q] representing persistent topological features through their birth (b), death (d) and homology dimension (q). It is important that, for each possible homology dimension, the number of triples for which q equals that homology dimension is constant across the entries of X.

• y (None) – There is no need for a target in a transformer, yet the pipeline API requires this parameter.

Returns

self

Return type

object
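The input requirement on X (the same number of triples per homology dimension in every diagram) can be checked before fitting. The sketch below is an assumption about how such a check might look; `has_consistent_dimensions` is a hypothetical helper, not part of the library, and it works on lists of `[b, d, q]` triples.

```python
from collections import Counter


def counts_per_dimension(diagram):
    """Count how many triples [b, d, q] a diagram has for each q."""
    return Counter(q for _, _, q in diagram)


def has_consistent_dimensions(X):
    """Return True if every diagram in X has the same per-dimension counts."""
    if not X:
        return True
    reference = counts_per_dimension(X[0])
    return all(counts_per_dimension(diagram) == reference for diagram in X[1:])


valid = [
    [[0.0, 1.0, 0], [0.0, 2.0, 0], [0.5, 1.0, 1]],
    [[0.0, 3.0, 0], [0.1, 0.2, 0], [0.0, 0.0, 1]],  # diagonal point as padding
]
invalid = [
    [[0.0, 1.0, 0], [0.0, 2.0, 0], [0.5, 1.0, 1]],
    [[0.0, 3.0, 0], [0.1, 0.2, 1], [0.3, 0.4, 1]],  # one point in dim 0, two in dim 1
]
```

Here `valid` satisfies the requirement because both diagrams hold two points in dimension 0 and one in dimension 1, while `invalid` does not.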

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters
• X (ndarray of shape (n_samples, n_features, 3)) – Input data. Array of persistence diagrams, each a collection of triples [b, d, q] representing persistent topological features through their birth (b), death (d) and homology dimension (q). It is important that, for each possible homology dimension, the number of triples for which q equals that homology dimension is constant across the entries of X.

• y (None) – There is no need for a target in a transformer, yet the pipeline API requires this parameter.

Returns

Xt – Filtered persistence diagrams. Only the subdiagrams corresponding to dimensions in homology_dimensions_ are filtered. n_features_filtered is less than or equal to n_features.

Return type

ndarray of shape (n_samples, n_features_filtered, 3)
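Why n_features_filtered can be smaller than n_features, and why diagonal points may remain, can be illustrated with a rough sketch. Everything here is an assumption made for illustration: the helper `filter_and_pad` is hypothetical, it filters every dimension rather than only those in homology_dimensions_, and it pads with `[0.0, 0.0, q]`, which need not be the padding value the library uses.

```python
def filter_and_pad(X, epsilon=0.01, pad_value=0.0):
    """Filter each diagram, then pad so all outputs have equal length.

    Hypothetical sketch: for each dimension q, every sample is padded with
    diagonal points up to the largest number of survivors in that dimension.
    """
    dims = sorted({q for diagram in X for _, _, q in diagram})
    # survivors[i][q]: points of sample i in dimension q with lifetime > epsilon
    survivors = [
        {q: [p for p in diagram if p[2] == q and p[1] - p[0] > epsilon]
         for q in dims}
        for diagram in X
    ]
    Xt = []
    for per_dim in survivors:
        row = []
        for q in dims:
            width = max(len(s[q]) for s in survivors)
            missing = width - len(per_dim[q])
            padding = [[pad_value, pad_value, q] for _ in range(missing)]
            row.extend(per_dim[q] + padding)
        Xt.append(row)
    return Xt


X = [
    [[0.0, 1.0, 0], [0.0, 0.125, 0], [0.0, 0.5, 1]],
    [[0.0, 0.125, 0], [0.0, 0.125, 0], [0.0, 0.75, 1]],
]

Xt = filter_and_pad(X, epsilon=0.25)
```

In this example each input diagram has three points and each output diagram has two. The second sample has no surviving point in dimension 0, so it receives one diagonal point to keep the output rectangular.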

fit_transform_plot(X, y=None, sample=0, **plot_params)

Fit to data, then apply transform_plot.

Parameters
• X (ndarray of shape (n_samples, ...)) – Input data.

• y (ndarray of shape (n_samples,) or None) – Target values for supervised problems.

• sample (int) – Sample to be plotted.

• **plot_params – Optional plotting parameters.

Returns

Xt – Transformed one-sample slice from the input.

Return type

ndarray of shape (1, ...)

get_params(deep=True)

Get parameters for this estimator.

Parameters

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params – Parameter names mapped to their values.

Return type

mapping of string to any

plot(Xt, sample=0, homology_dimensions=None, plotly_params=None)

Plot a sample from a collection of persistence diagrams, with homology in multiple dimensions.

Parameters
• Xt (ndarray of shape (n_samples, n_points, 3)) – Collection of persistence diagrams, such as returned by transform.

• sample (int, optional, default: 0) – Index of the sample in Xt to be plotted.

• homology_dimensions (list, tuple or None, optional, default: None) – Which homology dimensions to include in the plot. None is equivalent to passing homology_dimensions_.

• plotly_params (dict or None, optional, default: None) – Custom parameters to configure the plotly figure. Allowed keys are "traces" and "layout", and the corresponding values should be dictionaries containing keyword arguments as would be fed to the update_traces and update_layout methods of plotly.graph_objects.Figure.

Returns

fig – Plotly figure.

Return type

plotly.graph_objects.Figure object
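As an illustration of the plotly_params structure described above, the dictionary below uses only the two allowed keys. The specific styling values are arbitrary examples, and `check_plotly_params` is a hypothetical helper added here to make the constraint explicit; it is not part of the library.

```python
ALLOWED_KEYS = {"traces", "layout"}


def check_plotly_params(plotly_params):
    """Return True if plotly_params is None or uses only the allowed keys."""
    if plotly_params is None:
        return True
    return set(plotly_params) <= ALLOWED_KEYS


plotly_params = {
    # Keyword arguments as would be passed to Figure.update_traces
    "traces": {"marker": {"size": 8}},
    # Keyword arguments as would be passed to Figure.update_layout
    "layout": {"title": "Filtered persistence diagram", "width": 600},
}

# Intended usage, assuming a fitted Filtering instance named `filtering`:
# fig = filtering.plot(Xt, sample=0, plotly_params=plotly_params)
```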

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters

**params (dict) – Estimator parameters.

Returns

self – Estimator instance.

Return type

object
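The `<component>__<parameter>` naming convention can be illustrated with a small parser. This is a sketch of the convention only; `split_param_name` is a hypothetical helper, and the step name `filtering` is an assumed pipeline step name chosen for the example.

```python
def split_param_name(name):
    """Split 'component__parameter' into (component, parameter).

    Names without a double underscore refer to the estimator itself,
    so the component is returned as None.
    """
    component, separator, parameter = name.partition("__")
    if not separator:
        return None, name
    return component, parameter


# A parameter of a pipeline step assumed to be named "filtering"
nested = split_param_name("filtering__epsilon")
# A parameter set directly on a Filtering instance
direct = split_param_name("epsilon")
```

On a standalone transformer the direct form applies, for example `set_params(epsilon=0.1)`. Inside a pipeline the nested form targets one step, for example `set_params(filtering__epsilon=0.1)` under the step name assumed above.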

transform(X, y=None)

Filter all relevant persistence subdiagrams.

Parameters
• X (ndarray of shape (n_samples, n_features, 3)) – Input data. Array of persistence diagrams, each a collection of triples [b, d, q] representing persistent topological features through their birth (b), death (d) and homology dimension (q). It is important that, for each possible homology dimension, the number of triples for which q equals that homology dimension is constant across the entries of X.

• y (None) – There is no need for a target in a transformer, yet the pipeline API requires this parameter.

Returns

Xt – Filtered persistence diagrams. Only the subdiagrams corresponding to dimensions in homology_dimensions_ are filtered. n_features_filtered is less than or equal to n_features.

Return type

ndarray of shape (n_samples, n_features_filtered, 3)

transform_plot(X, sample=0, **plot_params)

Take a one-sample slice from the input collection and transform it. Before returning the transformed object, plot the transformed sample.

Parameters
• X (ndarray of shape (n_samples, ...)) – Input data.

• sample (int) – Sample to be plotted.

• **plot_params – Optional plotting parameters.

Returns

Xt – Transformed one-sample slice from the input.

Return type

ndarray of shape (1, ...)
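The "one-sample slice" returned by transform_plot keeps the leading sample axis, which is why the return shape starts with 1. A minimal sketch with nested lists standing in for the array (the helper `one_sample_slice` is hypothetical):

```python
def one_sample_slice(X, sample=0):
    """Return a collection containing only the requested sample.

    Slicing (rather than indexing) preserves the outer sample axis,
    so the result still looks like a collection of diagrams.
    """
    return X[sample:sample + 1]


X = [
    [[0.0, 1.0, 0], [0.5, 1.0, 1]],
    [[0.0, 2.0, 0], [0.2, 0.9, 1]],
    [[0.0, 3.0, 0], [0.1, 0.4, 1]],
]

X_slice = one_sample_slice(X, sample=1)
```

The slice has length 1 along the sample axis and contains the second diagram unchanged. Slicing a NumPy array the same way gives the analogous result with shape `(1, ...)`.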