GraphGeodesicDistance

class gtda.graphs.GraphGeodesicDistance(n_jobs=None)[source]

Distance matrices arising from geodesic distances on graphs.

For each (possibly weighted and/or directed) graph in a collection, this transformer calculates the length of the shortest (directed or undirected) path between any two of its vertices, setting it to numpy.inf when two vertices cannot be connected by a path.

The graphs are encoded as sparse adjacency matrices, while the outputs are dense distance matrices of variable size.
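The conventions described above (directed versus undirected paths, and numpy.inf for unreachable pairs) can be illustrated without the transformer. The sketch below uses scipy.sparse.csgraph.shortest_path as a stand-in; it is not claimed to be the routine this class calls, and the three-vertex graph is made up for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import shortest_path

# Hypothetical weighted directed graph on 3 vertices:
# a single edge 0 -> 1 of weight 2.0; vertex 2 is isolated.
adjacency = csr_matrix(np.array([[0.0, 2.0, 0.0],
                                 [0.0, 0.0, 0.0],
                                 [0.0, 0.0, 0.0]]))

# Directed paths: 1 cannot reach 0, and nothing reaches vertex 2.
directed = shortest_path(adjacency, directed=True)

# Undirected paths: the edge 0 -> 1 may be traversed both ways.
undirected = shortest_path(adjacency, directed=False)

print(directed)
print(undirected)
```

In both results the entries involving vertex 2 are infinite off the diagonal, matching the numpy.inf convention stated above.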

Parameters

n_jobs (int or None, optional, default: None) – The number of jobs to use for the computation. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.

Examples

>>> import numpy as np
>>> from gtda.graphs import TransitionGraph, GraphGeodesicDistance
>>> X = np.arange(4).reshape(1, -1, 1)
>>> tg = TransitionGraph(func=None).fit_transform(X)
>>> print(tg[0].toarray())
[[False  True False False]
 [ True False  True False]
 [False  True False  True]
 [False False  True False]]
>>> ggd = GraphGeodesicDistance().fit_transform(tg)
>>> print(ggd[0])
[[0. 1. 2. 3.]
 [1. 0. 1. 2.]
 [2. 1. 0. 1.]
 [3. 2. 1. 0.]]

__init__(n_jobs=None)[source]

Initialize the transformer. See help(type(self)) for the accurate signature.

fit(X, y=None)[source]

Do nothing and return the estimator unchanged.

This method is here to implement the usual scikit-learn API and hence work in pipelines.

Parameters
  • X (ndarray of shape (n_samples,) or (n_samples, n_vertices, n_vertices)) – Input data, i.e. a collection of adjacency matrices of graphs. Each adjacency matrix may be a dense or a sparse array.

  • y (None) – There is no need for a target in a transformer, yet the pipeline API requires this parameter.

Returns

self

Return type

object

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters
  • X (ndarray of shape (n_samples,) or (n_samples, n_vertices, n_vertices)) – Input data, i.e. a collection of adjacency matrices of graphs. Each adjacency matrix may be a dense or a sparse array.

  • y (None) – There is no need for a target in a transformer, yet the pipeline API requires this parameter.

Returns

Xt – Array of distance matrices. If the distance matrices have variable size across samples, Xt is a one-dimensional array of dense arrays.

Return type

ndarray of shape (n_samples,) or (n_samples, n_vertices, n_vertices)
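The two possible return shapes can be pictured with plain NumPy. This is a minimal sketch of what "a one-dimensional array of dense arrays" looks like; the matrices are invented for illustration, and the construction shown is an assumption about how such an array could be built, not a description of the library's internals.

```python
import numpy as np

# Two hypothetical distance matrices of different sizes (2x2 and 3x3).
small = np.array([[0.0, 1.0],
                  [1.0, 0.0]])
large = np.array([[0.0, 1.0, 2.0],
                  [1.0, 0.0, 1.0],
                  [2.0, 1.0, 0.0]])

# Variable sizes: a 1D object array whose entries are dense 2D arrays.
# Pre-allocating with dtype=object stops NumPy from trying to merge the
# matrices into one rectangular array.
ragged = np.empty(2, dtype=object)
ragged[0] = small
ragged[1] = large

# Equal sizes: the matrices fit in a regular 3D array instead.
stacked = np.stack([small, small])

print(ragged.shape, ragged[1].shape)  # shape (n_samples,), then one entry
print(stacked.shape)                  # shape (n_samples, n_vertices, n_vertices)
```

Code consuming Xt may therefore need to iterate over samples rather than assume a three-dimensional array.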

fit_transform_plot(X, y=None, sample=0, **plot_params)

Fit to data, then apply transform_plot.

Parameters
  • X (ndarray of shape (n_samples, ...)) – Input data.

  • y (ndarray of shape (n_samples,) or None) – Target values for supervised problems.

  • sample (int) – Sample to be plotted.

  • **plot_params – Optional plotting parameters.

Returns

Xt – Transformed one-sample slice from the input.

Return type

ndarray of shape (1, ...)

get_params(deep=True)

Get parameters for this estimator.

Parameters

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params – Parameter names mapped to their values.

Return type

mapping of string to any

static plot(Xt, sample=0, colorscale='blues')[source]

Plot a sample from a collection of distance matrices.

Parameters
  • Xt (ndarray of shape (n_samples, n_points, n_points)) – Collection of distance matrices, such as returned by transform.

  • sample (int, optional, default: 0) – Index of the sample to be plotted.

  • colorscale (str, optional, default: 'blues') – Color scale to be used in the heat map. Can be anything allowed by plotly.graph_objects.Heatmap.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters

**params (dict) – Estimator parameters.

Returns

self – Estimator instance.

Return type

object
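The <component>__<parameter> convention can be seen on any scikit-learn pipeline. The sketch below uses StandardScaler purely as a stand-in component, and the step name "scaler" is an arbitrary choice; the same pattern should apply to a pipeline step holding this transformer, assuming it follows the usual scikit-learn estimator interface.

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# A one-step pipeline; "scaler" is the component name used in the
# <component>__<parameter> syntax below.
pipeline = Pipeline([("scaler", StandardScaler())])

# Update a parameter of the nested component through the pipeline.
returned = pipeline.set_params(scaler__with_mean=False)

# get_params(deep=True) exposes nested parameters under the same names.
params = pipeline.get_params(deep=True)
print(params["scaler__with_mean"])
```

Because set_params returns the estimator itself, calls can be chained.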

transform(X, y=None)[source]

Compute the lengths of the shortest paths between any two vertices of each graph, using sklearn.utils.graph_shortest_path.graph_shortest_path.

Parameters
  • X (ndarray of shape (n_samples,) or (n_samples, n_vertices, n_vertices)) – Input data, i.e. a collection of adjacency matrices of graphs. Each adjacency matrix may be a dense or sparse array.

  • y (None) – Ignored.

Returns

Xt – Array of distance matrices. If the distance matrices have variable size across samples, Xt is a one-dimensional array of dense arrays.

Return type

ndarray of shape (n_samples,) or (n_samples, n_vertices, n_vertices)
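To make the shortest-path computation concrete, here is a minimal NumPy sketch of the Floyd-Warshall algorithm applied to the path graph from the Examples section. The helper geodesic_distances is hypothetical and is not the routine transform calls; it also assumes that a zero entry in the adjacency matrix means "no edge".

```python
import numpy as np


def geodesic_distances(adjacency):
    """All-pairs shortest path lengths via Floyd-Warshall (illustrative helper)."""
    weights = np.asarray(adjacency, dtype=float)
    # Assumption: zero entries mean "no edge", so they start at infinity.
    dist = np.where(weights != 0, weights, np.inf)
    np.fill_diagonal(dist, 0.0)
    for k in range(dist.shape[0]):
        # Allow vertex k as an intermediate stop on every path i -> j.
        dist = np.minimum(dist, dist[:, [k]] + dist[[k], :])
    return dist


# Adjacency matrix of the path graph 0 - 1 - 2 - 3 from the Examples section.
path_graph = np.array([[0, 1, 0, 0],
                       [1, 0, 1, 0],
                       [0, 1, 0, 1],
                       [0, 0, 1, 0]])

distances = geodesic_distances(path_graph)
print(distances)
```

The result agrees with the distance matrix printed in the Examples section; pairs of vertices with no connecting path would keep the value numpy.inf.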

transform_plot(X, sample=0, **plot_params)

Take a one-sample slice from the input collection and transform it. Before returning the transformed object, plot the transformed sample.

Parameters
  • X (ndarray of shape (n_samples, ...)) – Input data.

  • sample (int) – Sample to be plotted.

  • **plot_params – Optional plotting parameters.

Returns

Xt – Transformed one-sample slice from the input.

Return type

ndarray of shape (1, ...)