KNeighborsGraph

class gtda.graphs.KNeighborsGraph(n_neighbors=4, mode='connectivity', metric='euclidean', p=2, metric_params=None, n_jobs=None)

Adjacency matrices of \(k\)-nearest neighbor graphs.

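Given a two-dimensional array of row vectors seen as points in high-dimensional space, the corresponding \(k\)NN graph is a directed graph with a vertex for every vector in the array, and a directed edge from vertex \(i\) to vertex \(j \neq i\) whenever vector \(j\) is among the \(k\) nearest neighbors of vector \(i\).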
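The definition above can be sketched in plain NumPy. This is only an illustration of the directed kNN construction with the Euclidean metric, not the library's implementation; the helper name `knn_adjacency` is hypothetical, and the point cloud is the one used in the Examples section.

```python
import numpy as np


def knn_adjacency(points, k):
    """Hypothetical helper: entry (i, j) is 1 when j is among the k nearest neighbors of i."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    # Pairwise Euclidean distances between all row vectors.
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # A point is never counted as its own neighbor.
    np.fill_diagonal(dists, np.inf)
    # Column indices of the k closest points for every row.
    nearest = np.argsort(dists, axis=1, kind="stable")[:, :k]
    adjacency = np.zeros((n, n))
    adjacency[np.arange(n)[:, None], nearest] = 1.0
    return adjacency


cloud = [[0, 1, 3, 0, 0],
         [1, 0, 5, 0, 0],
         [3, 5, 0, 4, 0],
         [0, 0, 4, 0, 0]]
A = knn_adjacency(cloud, k=2)
```

The resulting matrix is generally not symmetric: vertex 0 is one of the two nearest neighbors of vertex 2, but vertex 2 is not one of the two nearest neighbors of vertex 0.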

Parameters
  • n_neighbors (int, optional, default: 4) – Number of neighbors to use. A point is not considered its own neighbor.

  • mode ('connectivity' | 'distance', optional, default: 'connectivity') – Type of returned matrices: 'connectivity' will return the 0-1 connectivity matrices, and 'distance' will return the distances between neighbors according to the given metric.

  • metric (string or callable, optional, default: 'euclidean') – The distance metric to use. See the documentation of sklearn.neighbors.DistanceMetric for a list of available metrics. If set to 'precomputed', input data is interpreted as a collection of distance matrices.

  • p (int, optional, default: 2) – Parameter for the Minkowski (i.e. \(\ell^p\)) metric from sklearn.metrics.pairwise.pairwise_distances. Only relevant when metric is 'minkowski'. p = 1 is the Manhattan distance, and p = 2 reduces to the Euclidean distance.

  • metric_params (dict or None, optional, default: None) – Additional keyword arguments for the metric function.

  • n_jobs (int or None, optional, default: None) – The number of jobs to use for the computation. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.

Examples

>>> import numpy as np
>>> from gtda.graphs import KNeighborsGraph
>>> X = np.array([[[0, 1, 3, 0, 0],
...                [1, 0, 5, 0, 0],
...                [3, 5, 0, 4, 0],
...                [0, 0, 4, 0, 0]]])
>>> kng = KNeighborsGraph(n_neighbors=2)
>>> Xg = kng.fit_transform(X)
>>> print(Xg[0].toarray())
[[0. 1. 0. 1.]
 [1. 0. 0. 1.]
 [1. 0. 0. 1.]
 [1. 1. 0. 0.]]

Notes

sklearn.neighbors.kneighbors_graph is used to compute the adjacency matrices of kNN graphs.
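As a rough sketch of what this means in practice, the function can be called directly on each point cloud. The loop below assumes one call per point cloud with `include_self=False` (in line with the statement that a point is not its own neighbor); the actual internals of the transformer may differ. The variable names are illustrative only.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.neighbors import kneighbors_graph

# One point cloud of 4 points in 5 dimensions, as in the Examples section.
X = np.array([[[0, 1, 3, 0, 0],
               [1, 0, 5, 0, 0],
               [3, 5, 0, 4, 0],
               [0, 0, 4, 0, 0]]], dtype=float)

# mode='connectivity': 0-1 entries, one sparse CSR matrix per point cloud.
connectivity = [
    kneighbors_graph(cloud, n_neighbors=2, mode="connectivity",
                     metric="euclidean", include_self=False)
    for cloud in X
]

# mode='distance': the nonzero entries hold distances to the neighbors.
distance = [
    kneighbors_graph(cloud, n_neighbors=2, mode="distance",
                     metric="euclidean", include_self=False)
    for cloud in X
]

# metric='precomputed': the input is a square distance matrix instead.
square = squareform(pdist(X[0]))
precomputed = kneighbors_graph(square, n_neighbors=2, mode="connectivity",
                               metric="precomputed", include_self=False)
```

In this sketch, `connectivity[0]` and `precomputed` describe the same graph, and `distance[0]` has the same sparsity pattern with Euclidean distances in place of ones.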

__init__(n_neighbors=4, mode='connectivity', metric='euclidean', p=2, metric_params=None, n_jobs=None)

Initialize self. See help(type(self)) for accurate signature.

fit(X, y=None)

Do nothing and return the estimator unchanged.

This method is here to implement the usual scikit-learn API and hence work in pipelines.

Parameters
  • X (list of length n_samples, or ndarray of shape (n_samples, n_points, n_dimensions) or (n_samples, n_points, n_points)) – Input data representing a collection of point clouds. Each entry in X is a 2D array of shape (n_points, n_dimensions) if metric is not 'precomputed', or a 2D array or sparse matrix of shape (n_points, n_points) if metric is 'precomputed'.

  • y (None) – There is no need for a target in a transformer, yet the pipeline API requires this parameter.

Returns

self

Return type

object

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters
  • X (list of length n_samples, or ndarray of shape (n_samples, n_points, n_dimensions) or (n_samples, n_points, n_points)) – Input data representing a collection of point clouds. Each entry in X is a 2D array of shape (n_points, n_dimensions) if metric is not 'precomputed', or a 2D array or sparse matrix of shape (n_points, n_points) if metric is 'precomputed'.

  • y (None) – There is no need for a target in a transformer, yet the pipeline API requires this parameter.

Returns

Xt – Adjacency matrices of kNN graphs, in sparse CSR format. The matrices contain ones and zeros if mode is 'connectivity', and floats representing distances according to metric if mode is 'distance'.

Return type

list of length n_samples

get_params(deep=True)

Get parameters for this estimator.

Parameters

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params – Parameter names mapped to their values.

Return type

mapping of string to any

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters

**params (dict) – Estimator parameters.

Returns

self – Estimator instance.

Return type

object
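The `<component>__<parameter>` convention can be illustrated with a small stand-in estimator. `StandInGraph` below is hypothetical: it only mirrors two of the constructor arguments so that the sketch runs with scikit-learn alone, and the step name `"graph"` is likewise an arbitrary choice.

```python
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline


class StandInGraph(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in that mirrors two constructor arguments only."""

    def __init__(self, n_neighbors=4, mode="connectivity"):
        self.n_neighbors = n_neighbors
        self.mode = mode

    def fit(self, X, y=None):
        return self

    def transform(self, X, y=None):
        return X


# Simple estimator: parameters are addressed by their plain names.
graph = StandInGraph()
graph.set_params(n_neighbors=2)
params = graph.get_params()

# Nested object: parameters are addressed as <component>__<parameter>.
pipe = Pipeline([("graph", StandInGraph())])
pipe.set_params(graph__mode="distance")
nested_mode = pipe.get_params()["graph__mode"]
```

Because `set_params` returns the estimator itself, calls can be chained, which is what makes the same mechanism usable by parameter search utilities.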

transform(X, y=None)

Compute kNN graphs and return their adjacency matrices in sparse format.

Parameters
  • X (list of length n_samples, or ndarray of shape (n_samples, n_points, n_dimensions) or (n_samples, n_points, n_points)) – Input data representing a collection of point clouds. Each entry in X is a 2D array of shape (n_points, n_dimensions) if metric is not 'precomputed', or a 2D array or sparse matrix of shape (n_points, n_points) if metric is 'precomputed'.

  • y (None) – There is no need for a target in a transformer, yet the pipeline API requires this parameter.

Returns

Xt – Adjacency matrices of kNN graphs, in sparse CSR format. The matrices contain ones and zeros if mode is 'connectivity', and floats representing distances according to metric if mode is 'distance'.

Return type

list of length n_samples
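Each entry of the returned list is a SciPy sparse matrix, so the usual sparse tooling applies. The sketch below builds a stand-in for one entry by hand (the connectivity matrix printed in the Examples section) and shows two common follow-up steps; the variable names are illustrative.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Stand-in for one entry of Xt, copied from the Examples section output.
adjacency = csr_matrix(np.array([[0., 1., 0., 1.],
                                 [1., 0., 0., 1.],
                                 [1., 0., 0., 1.],
                                 [1., 1., 0., 0.]]))

# Directed edges (i, j): j is one of the nearest neighbors of i.
rows, cols = adjacency.nonzero()
edges = sorted(zip(rows.tolist(), cols.tolist()))

# kNN graphs are directed; the elementwise maximum with the transpose
# yields an undirected (symmetric) version of the same graph.
undirected = adjacency.maximum(adjacency.T)
```

Symmetrizing in this way keeps an edge whenever either endpoint lists the other as a neighbor; other conventions (for example keeping only mutual neighbors) are equally possible.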