plot_static_mapper_graph

gtda.mapper.plot_static_mapper_graph(pipeline, data, color_data=None, color_features=None, node_color_statistic=None, layout='kamada_kawai', layout_dim=2, clone_pipeline=True, n_sig_figs=3, node_scale=12, plotly_params=None)

Plot Mapper graphs without interactivity on pipeline parameters.

The output graph is a rendition of the igraph.Graph object computed by calling the fit_transform method of the MapperPipeline instance pipeline on the input data. The graph’s nodes correspond to subsets of elements (rows) in data; these subsets are clusters in larger portions of data called “pullback (cover) sets”, which are computed by means of the pipeline’s “filter function” and “cover” and correspond to the differently-colored portions in this diagram. Two clusters from different pullback cover sets can overlap; if they do, an edge between the corresponding nodes in the graph may be drawn.
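
The relationship between clusters, nodes and edges described above can be sketched in plain Python. This is only an illustration under stated assumptions: the cluster contents are made up, the names clusters, node_ids and edges are hypothetical, and the actual pipeline may apply further criteria before an edge is drawn.

```python
from itertools import combinations

# Hypothetical clusters, keyed by (pullback set label, cluster label within
# that set); each value is the set of row indices of data in the cluster.
clusters = {
    (0, 0): {0, 1, 2},
    (0, 1): {3, 4},
    (1, 0): {2, 3, 5},
    (2, 0): {6, 7},
}

# One graph node per cluster, numbered in insertion order.
node_ids = {key: i for i, key in enumerate(clusters)}

# Candidate edge: two clusters from different pullback sets that share
# at least one row index.
edges = []
for (key_a, rows_a), (key_b, rows_b) in combinations(clusters.items(), 2):
    if key_a[0] != key_b[0] and rows_a & rows_b:
        edges.append((node_ids[key_a], node_ids[key_b]))

# Node sizes follow the number of elements each node represents.
node_sizes = [len(rows) for rows in clusters.values()]

print(edges)
print(node_sizes)
```

Clusters inside the same pullback set are never joined here, because they partition that set and so cannot share rows.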

Nodes are colored according to color_features and node_color_statistic and are sized according to the number of elements they represent. The hovertext on each node displays, in this order:

  • a globally unique ID for the node, which can be used to retrieve node information from the igraph.Graph object, see Nerve;

  • the label of the pullback (cover) set which the node’s elements form a cluster in;

  • a label identifying the node as a cluster within that pullback set;

  • the number of elements of data associated with the node;

  • the value of the summary statistic which determines the node’s color.

Parameters
  • pipeline (MapperPipeline object) – Mapper pipeline to apply to data.

  • data (array-like of shape (n_samples, n_features)) – Data used to generate the Mapper graph. Can be a pandas dataframe.

  • color_data (array-like of length n_samples, or None, optional, default: None) – Data to be used to construct node colors in the Mapper graph (according to color_features and node_color_statistic). Must have the same length as data. None is the same as passing numpy.arange(len(data)).

  • color_features (object or None, optional, default: None) –

    Specifies one or more features of interest from color_data to be used, together with node_color_statistic, to determine node colors. Ignored if node_color_statistic is a numpy array.

    1. None is equivalent to passing color_data.

    2. If an object implementing transform or fit_transform, or a callable, it is applied to color_data to generate the features of interest.

    3. If an index or string, or list of indices/strings, it is equivalent to selecting a column or subset of columns from color_data.

  • node_color_statistic (None, callable, or ndarray of shape (n_nodes,) or (n_nodes, 1), optional, default: None) – If a callable, node colors will be computed as summary statistics from the feature array y determined by color_data and color_features. Let y have n columns (note: 1d feature arrays are converted to column vectors). Then, for a node representing a list I of row indices, there will be n colors, each computed as node_color_statistic(y[I, i]) for i = 0, ..., n - 1. None is equivalent to passing numpy.mean. If a numpy array, its length must equal the number of nodes in the Mapper graph, and its values are used directly as node colors (color_features is ignored).

  • layout (None, str or callable, optional, default: "kamada_kawai") – Layout algorithm for the graph. Can be any accepted value for the layout parameter in the layout method of igraph.Graph [1].

  • layout_dim (int, default: 2) – The number of dimensions for the layout. Can be 2 or 3.

  • clone_pipeline (bool, optional, default: True) – If True, the input pipeline is cloned before computing the Mapper graph to prevent unexpected side effects from in-place parameter updates.

  • n_sig_figs (int or None, optional, default: 3) – If not None, number of significant figures to which to round node summary statistics. If None, no rounding is performed.

  • node_scale (int or float, optional, default: 12) – Sets the scale factor used to determine the rendered size of the nodes. Increase for larger nodes. Implements a formula in the Plotly documentation.

  • plotly_params (dict or None, optional, default: None) – Custom parameters to configure the plotly figure. Allowed keys are "node_trace", "edge_trace" and "layout", and the corresponding values should be dictionaries containing keyword arguments as would be fed to the update_traces and update_layout methods of plotly.graph_objects.Figure.
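
As a rough illustration of how color_data, color_features, node_color_statistic and n_sig_figs combine, the following NumPy-only sketch computes one color per node and per feature column, then rounds each value to three significant figures. The node memberships in node_elements are invented for the example, and round_sig is a hypothetical helper that is not part of giotto-tda; the library's internal implementation may differ.

```python
import numpy as np

# Toy color_data with 6 rows and 2 columns (made-up values).
color_data = np.array([
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 30.0],
    [4.0, 40.0],
    [5.0, 50.0],
    [6.0, 60.0],
])

# color_features given as a list of column indices: select those columns.
color_features = [1]
y = color_data[:, color_features]  # feature array of shape (6, 1)

# Hypothetical node memberships: row indices represented by each node.
node_elements = [np.array([0, 1, 2]), np.array([2, 3]), np.array([4, 5])]

# node_color_statistic=None is documented as equivalent to numpy.mean.
node_color_statistic = np.mean


def round_sig(x, n_sig_figs):
    """Hypothetical helper: round x to n_sig_figs significant figures."""
    return float(f"{x:.{n_sig_figs}g}")


# One value per node and per column i of y: node_color_statistic(y[I, i]).
node_colors = np.array([
    [round_sig(node_color_statistic(y[I, i]), 3) for i in range(y.shape[1])]
    for I in node_elements
])

print(node_colors)
```

Passing a precomputed array of this length as node_color_statistic would instead bypass color_features, as described above.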

Returns

fig – Figure representing the Mapper graph with appropriate node colouring and size.

Return type

plotly.graph_objects.FigureWidget object

Examples

Setting a colorscale different from the default one:

>>> import numpy as np
>>> np.random.seed(1)
>>> from gtda.mapper import make_mapper_pipeline, plot_static_mapper_graph
>>> pipeline = make_mapper_pipeline()
>>> data = np.random.random((100, 3))
>>> plotly_params = {"node_trace": {"marker_colorscale": "Blues"}}
>>> fig = plot_static_mapper_graph(pipeline, data,
...                                plotly_params=plotly_params)

Inspect the composition of a node with “Node ID” displayed as 0 in the hovertext:

>>> graph = pipeline.fit_transform(data)
>>> graph.vs[0]["node_elements"]
array([70])

Write the figure to a file using Plotly:

>>> fname = "current_figure"
>>> fig.write_html(fname + ".html")
>>> fig.write_image(fname + ".svg")  # Requires psutil

References

[1] igraph.Graph.layout documentation.