Data
- class gdeep.data.AbstractPreprocessing(*args, **kwds)
- class gdeep.data.DatasetFactory
Dataset factory class for the tori dataset and torchvision datasets using the factory design pattern
Examples:
# Create a dataset for the tori dataset
dataset = get_dataset("Tori", name="DoubleTori", n_points=100)
# Create the MNIST dataset
dataset = get_dataset("Torchvision", name="MNIST")
- build(key: str, **kwargs) Any
This method returns the DataLoader builder corresponding to the input key.
- Args:
- key:
the name of the dataset
- register_builder(key: str, builder: Any)
This method adds new dataloader builders to the internal builders dictionary.
- exception gdeep.data.MissingVocabularyError
Exception raised by the tokenizers when the vocabulary is missing.
- class gdeep.data.PreprocessingPipeline(preprocessors: Iterable[AbstractPreprocessing[Any, Any]])
Pipeline to fit non-fitted preprocessors to a dataset in a sequential manner. The fitted preprocessing transform can be attached to a dataset using the attach_transform_to_dataset method. The intended use case is to fit the preprocessors to the training dataset and then attach the fitted transform to the training, validation and test datasets.
The transform is only applied to the data and not the labels.
Examples:
from gdeep.data.preprocessors import PreprocessingPipeline, Normalization, PreprocessImageClassification
from gdeep.data.datasets import DatasetImageClassificationFromFiles

image_dataset = DatasetImageClassificationFromFiles(
    os.path.join(file_path, "img_data"),
    os.path.join(file_path, "img_data", "labels.csv"))
preprocessing_pipeline = PreprocessingPipeline((PreprocessImageClassification((32, 32)),
                                                Normalization()))
preprocessing_pipeline.fit_to_dataset(image_dataset)  # this will not change the image_dataset
preprocessed_dataset = preprocessing_pipeline.attach_transform_to_dataset(image_dataset)
- class gdeep.data.TransformingDataset(dataset: Dataset[R], transform: Callable[[R], S])
This class is the base class for all the Datasets that need to be transformed via preprocessors. This base class expects to get data from Dataset.
- Args:
- dataset :
The source dataset for this class.
- transform :
This is either a function defined by the user or a fitted preprocessor. Preprocessors inherit from AbstractPreprocessing.
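A minimal usage sketch, assuming a tensor dataset built with FromArray (documented further down); the lambda is a stand-in for a user-defined transform or a fitted preprocessor.
import torch
from gdeep.data import TransformingDataset
from gdeep.data.datasets import FromArray

# illustrative source dataset of random tensors and binary labels
source = FromArray(torch.rand(16, 3), torch.randint(0, 2, (16,)))

# any callable can act as the transform; here a simple user-defined rescaling
rescaled = TransformingDataset(source, lambda item: item * 2.0)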
Preprocessors
- class gdeep.data.preprocessors.FilterPersistenceDiagramByHomologyDimension(homology_dimensions_to_filter: List[int])
This class filters the persistence diagrams of a dataset by their homology dimension.
Here we assume that the dataset is a tuple of (persistence diagram, label) and that the points in the diagram are sorted by ascending lifetime. This is an invariant of the OneHotEncodedPersistenceDiagram class but could go wrong if the diagrams are modified in a way that breaks this invariant.
- Args:
- homology_dimensions_to_filter:
The homology dimensions of the points in the diagram that should be kept.
- fit_to_dataset(dataset: Dataset[Tuple[OneHotEncodedPersistenceDiagram, T]]) None
This method does nothing.
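A hedged usage sketch: the filter is constructed with the dimensions to keep, fitted (a documented no-op), and then attached as a transform in the same way as the other preprocessors; the folder path and the use of PersistenceDiagramFromFiles are assumptions made only for illustration.
from gdeep.data import TransformingDataset
from gdeep.data.datasets import PersistenceDiagramFromFiles
from gdeep.data.preprocessors import FilterPersistenceDiagramByHomologyDimension

# hypothetical folder of stored persistence diagrams, yielding (diagram, label) pairs
diagram_dataset = PersistenceDiagramFromFiles("diagrams_folder")

# keep only points belonging to homology dimensions 0 and 1
filter_h01 = FilterPersistenceDiagramByHomologyDimension([0, 1])
filter_h01.fit_to_dataset(diagram_dataset)   # no-op, kept for uniformity with other preprocessors
filtered_ds = TransformingDataset(diagram_dataset, filter_h01)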
- class gdeep.data.preprocessors.FilterPersistenceDiagramByLifetime(min_lifetime: float, max_lifetime: float)
This class filters the persistence diagrams of a dataset by their lifetime, i.e. the difference between the birth and death coordinates.
Here we assume that the dataset is a tuple of (persistence diagram, label) and that the points in the diagram are sorted by ascending lifetime. This is an invariant of the OneHotEncodedPersistenceDiagram class but could go wrong if the diagrams are modified in a way that breaks this invariant.
- Args:
- min_lifetime:
The minimum lifetime of the points in the diagram.
- max_lifetime:
The maximum lifetime of the points in the diagram.
- fit_to_dataset(dataset: Dataset[Tuple[OneHotEncodedPersistenceDiagram, T]]) None
This method does nothing.
- class gdeep.data.preprocessors.MinMaxScalarPersistenceDiagram
This class runs the standard min-max normalisation on the birth and death times of the persistence diagrams. The transformation is: X_std = (X - X.min()) / (X.max() - X.min()); X_scaled = X_std * (max - min) + min.
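For reference, the sketch below spells out this rescaling on a plain tensor of birth and death values; it illustrates the formula only and is not the class's internal code.
import torch

# illustrative birth-death values (shape: num_points x 2)
x = torch.tensor([[0.07, 0.10], [0.09, 0.22], [0.08, 0.13]])

range_min, range_max = 0.0, 1.0             # target range (min, max)
x_std = (x - x.min()) / (x.max() - x.min())
x_scaled = x_std * (range_max - range_min) + range_min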
- class gdeep.data.preprocessors.Normalization
This class runs the standard normalisation on all the dimensions of the tensors of a dataset. For example, in the case of images where each item has shape (C, H, W), the mean and the standard deviation will be tensors of shape (C, H, W).
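A hedged usage sketch, assuming Normalization follows the same fit-then-attach pattern as the PreprocessingPipeline example above; the image tensors are purely illustrative.
import torch
from gdeep.data import TransformingDataset
from gdeep.data.datasets import FromArray
from gdeep.data.preprocessors import Normalization

# illustrative dataset of fake (C, H, W) images with integer labels
images = FromArray(torch.rand(32, 3, 28, 28), torch.randint(0, 10, (32,)))

normalization = Normalization()
normalization.fit_to_dataset(images)              # estimates the entry-wise mean and std
normalized = TransformingDataset(images, normalization)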
- class gdeep.data.preprocessors.NormalizationPersistenceDiagram(num_homology_dimensions: int)
This class runs the standard normalisation on the birth and death coordinates of the persistence diagrams of a dataset across all the homology dimensions.
The one-hot encoded persistence diagrams are kept as is.
- class gdeep.data.preprocessors.ToTensorImage(size: int | List[int])
Class to preprocess image files for classification tasks
- Args:
- size :
Desired output size. If size is a sequence like (h, w), the output size will be matched to it. If size is an int, the smaller edge of the image will be matched to this number, i.e. if height > width, the image will be rescaled to (size * height / width, size).
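A short sketch, assuming ToTensorImage is attached to an image dataset in the same way as the other preprocessors; the folder paths are placeholders.
from gdeep.data import TransformingDataset
from gdeep.data.datasets import ImageClassificationFromFiles
from gdeep.data.preprocessors import ToTensorImage

# placeholder paths: images in "img_data/" and labels in "img_data/labels.csv"
img_ds = ImageClassificationFromFiles("img_data", "img_data/labels.csv")

# resize each image (smaller edge matched to 32) and convert it to a tensor
tensor_ds = TransformingDataset(img_ds, ToTensorImage(32))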
- class gdeep.data.preprocessors.TokenizerQA(vocabulary: Vocab | None = None, tokenizer: partial | None = None)
Class to preprocess text dataloaders for Q&A tasks. The dataset type is assumed to be of the form (string, string, list[string], list[string]).
- Args:
- vocabulary:
the torch vocabulary
- tokenizer :
the tokenizer of the source text
Examples:
from gdeep.data import TorchDataLoader
from gdeep.data import TransformingDataset
from gdeep.data.preprocessors import TokenizerQA

dl = TorchDataLoader(name="SQuAD2", convert_to_map_dataset=True)
dl_tr, dl_ts = dl.build_dataloaders()
textds = TransformingDataset(dl_tr.dataset, TokenizerQA())
- fit_to_dataset(dataset: Dataset[Tuple[str, str, List[str], List[int]]]) None
Method to fit the vocabulary to the input text
- Args:
- dataset:
the dataset to fit to
- class gdeep.data.preprocessors.TokenizerTextClassification(tokenizer: partial | None = None, vocabulary: Vocab | None = None)
Preprocessing class. This class is useful to convert the data format (label, text) into the proper tensor format (word_embedding, label). The labels should be integers; if they are strings, they will be converted.
- Args:
- tokenizer :
the tokenizer of the source text
- vocabulary :
the vocabulary; it can be built or it can be given.
- fit_to_dataset(dataset: Dataset[Tuple[Any, str]]) None
Method to extract global data, like the length of the sentences, to be able to pad.
- Args:
- dataset :
the data in the format (label, text)
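A hedged sketch in the spirit of the TokenizerQA example above; the dataset name "AG_NEWS" and the build call are illustrative assumptions.
from gdeep.data import TransformingDataset
from gdeep.data.datasets import DatasetBuilder
from gdeep.data.preprocessors import TokenizerTextClassification

# build a (label, text) dataset; "AG_NEWS" is only an illustrative choice
db = DatasetBuilder(name="AG_NEWS", convert_to_map_dataset=True)
ds_tr, ds_val, ds_ts = db.build()

tokenizer = TokenizerTextClassification()
tokenizer.fit_to_dataset(ds_tr)                  # builds the vocabulary and padding length
textds = TransformingDataset(ds_tr, tokenizer)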
- class gdeep.data.preprocessors.TokenizerTranslation(vocabulary: Dict[str, int] | None = None, vocabulary_target: Dict[str, int] | None = None, tokenizer: Callable[[str], List[str]] | None = None, tokenizer_target: Callable[[str], List[str]] | None = None)
Class to preprocess text dataloaders for translation tasks. The dataset type is supposed to be (string, string). The padding item is supposed to have index 0.
- Args:
- vocabulary :
the vocabulary of the source text; it can be built automatically or it can be given.
- vocabulary_target :
the vocabulary of the target text; it can be built automatically or it can be given.
- tokenizer:
the tokenizer of the source text
- tokenizer_target:
the tokenizer of the target text
Examples:
from gdeep.data import DatasetBuilder
from gdeep.data import TransformingDataset
from gdeep.data.preprocessors import TokenizerTranslation

db = DatasetBuilder(name="Multi30k", convert_to_map_dataset=True)
ds_tr, ds_val, _ = db.build()
textds = TransformingDataset(ds_tr, TokenizerTranslation())
Datasets
- class gdeep.data.datasets.AbstractDataLoaderBuilder
The abstract class to interface the Giotto dataloaders
- class gdeep.data.datasets.DataLoaderBuilder(tuple_of_datasets: List[Dataset[Any]])
This class builds, out of a tuple of datasets, the corresponding dataloaders. Note that this class uses the same parameters for all the datasets; you can use different parameters for each dataset by passing a list of dictionaries to the build method.
- Args:
- tuple_of_datasets :
Tuple consisting of the training, validation and test datasets. One or two elements are also acceptable: they will be interpreted as the training dataset first and the validation dataset second.
- Example:
>>> import torch
>>> from gdeep.data.datasets import DataLoaderBuilder, FromArray
>>> x, y = torch.rand(10, 3, 32, 32), torch.randint(0, 1, (10,))
>>> x_train, y_train = x[:8], y[:8]
>>> x_val, y_val = x[8:], y[8:]
>>> train_dataset = FromArray(x_train, y_train)
>>> val_dataset = FromArray(x_val, y_val)
>>> dataloader_builder = DataLoaderBuilder([train_dataset, val_dataset])
>>> train_loader, val_loader = dataloader_builder.build()
- build(tuple_of_kwargs: List[Dict[str, Any]] | DataLoaderParamsTuples | None = None) List[DataLoader[Any]]
This method accepts the arguments of the torch Dataloader and applies them when creating the tuple. If the tuple of kwargs is a list of dictionaries, then the first dictionary will be applied to the training dataset, the second to the validation dataset and the third to the test dataset. If the tuple of kwargs is a DataLoaderParamsTuples, then the parameters will be applied to the corresponding dataset.
- Args:
- tuple_of_kwargs:
Tuple consisting of the keyword arguments for the training, validation and test dataloaders. One or two elements are also acceptable: they will be applied to the training dataloader first and the validation dataloader second.
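Continuing the example above, one dictionary of torch DataLoader keyword arguments can be passed per dataset (the batch sizes here are arbitrary):
train_loader, val_loader = dataloader_builder.build(
    [{"batch_size": 4, "shuffle": True},    # applied to the training dataset
     {"batch_size": 8, "shuffle": False}])  # applied to the validation dataset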
- class gdeep.data.datasets.DataLoaderKwargs(*, train_kwargs, val_kwargs, test_kwargs)
Object to store keyword arguments for train, val, and test dataloaders
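A minimal construction sketch; the stored keyword arguments are the usual torch DataLoader ones and are consumed, for instance, by the OrbitsGenerator methods documented below.
from gdeep.data.datasets import DataLoaderKwargs

loader_kwargs = DataLoaderKwargs(
    train_kwargs={"batch_size": 32, "shuffle": True},
    val_kwargs={"batch_size": 64},
    test_kwargs={"batch_size": 64})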
- class gdeep.data.datasets.DatasetBuilder(name: str = 'MNIST', convert_to_map_dataset: bool = False)
Class to obtain Datasets from the classical datasets available in PyTorch. The torus dataset and all its variations can also be found here.
- Args:
- name:
check the available datasets at https://pytorch.org/vision/stable/datasets.html and https://pytorch.org/text/stable/datasets.html
- convert_to_map_dataset:
whether to convert to a MapDataset or to keep the IterableDataset
- build(**kwargs) Tuple[Dataset[Any], Dataset[Any] | None, Dataset[Any] | None]
Method that returns the dataset.
- Args:
- kwargs:
the arguments to pass to the dataset builder. For example, you may want to use the options split=("train", "dev") or split=("train", "test")
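A short usage sketch; any keyword arguments passed to build are forwarded to the underlying dataset (the MNIST choice is illustrative).
from gdeep.data.datasets import DatasetBuilder

db = DatasetBuilder(name="MNIST", convert_to_map_dataset=True)
ds_tr, ds_val, ds_ts = db.build()   # returns (train, validation, test); some entries may be None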
- class gdeep.data.datasets.DatasetCloud(dataset_name: str, bucket_name: str = 'adversarial_attack', download_directory: None | str = None, use_public_access: bool = True, path_to_credentials: None | str = None, make_public: bool = True)
DatasetCloud class to handle the download and upload of datasets to the DataCloud. If the download_directory does not exist, it will be created, and if a folder with the same name as the dataset already exists in the download directory, the dataset will not be downloaded again. If a folder with the same name as the dataset does not exist locally, it will be created when downloading the dataset.
- Args:
- dataset_name (str):
Name of the dataset to be downloaded or uploaded.
- bucket_name (str, optional):
Name of the bucket in the DataCloud. Defaults to DATASET_BUCKET_NAME.
- download_directory (Union[None, str], optional):
Directory where the dataset will be downloaded to. Defaults to DEFAULT_DOWNLOAD_DIR.
- use_public_access (bool, optional):
If True, the dataset will be downloaded via a public URL. Defaults to True.
- path_to_credentials (Union[None, str], optional):
Path to the credentials file. Only used if use_public_access is False and credentials are not provided. Defaults to None.
- make_public (bool, optional):
If True, the dataset will be made public
- Raises:
- ValueError:
Dataset does not exist in the cloud.
- Returns:
None
- download() None
Download a dataset from the DataCloud. If the dataset does not exist in the cloud, an exception will be raised. If the dataset exists locally in the download directory, the dataset will not be downloaded again.
- Raises:
- ValueError:
Dataset does not exist in the cloud.
- ValueError:
Dataset exists locally but checksums do not match.
- get_existing_datasets() List[str]
Returns a list of datasets in the cloud.
- Returns:
- List[str]:
List of datasets in the cloud.
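A hedged sketch of listing and downloading a dataset via public access; the dataset name is a placeholder to be replaced by one of the names returned by get_existing_datasets.
from gdeep.data.datasets import DatasetCloud

dataset_cloud = DatasetCloud(
    "my_dataset",                        # placeholder name
    download_directory="downloads",
    use_public_access=True)
print(dataset_cloud.get_existing_datasets())   # names available in the cloud bucket
dataset_cloud.download()                       # skipped if already present locally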
- class gdeep.data.datasets.DlBuilderFromDataCloud(dataset_name: str, download_directory: str, use_public_access: bool = True, path_to_credentials: None | str = None)
Class that loads data from Google Cloud Storage
This class is useful to build dataloaders from a dataset stored in the GDeep Dataset Cloud on Google Cloud Storage.
The constructor takes the name of a dataset as a string, and a string for the download directory. The constructor will download the dataset to the download directory. The dataset is downloaded in the version used by Datasets Cloud, which may be different from the version used by the dataset’s original developers.
- Args:
- dataset_name (str):
The name of the dataset.
- download_dir (str):
The directory where the dataset will be downloaded.
- use_public_access (bool):
Whether to use public access. If you want to use the Google Cloud Storage API, you must set this to True. Please make sure you have the appropriate credentials.
- path_to_credentials (str):
Path to the credentials file. Only used if use_public_access is False and credentials are not provided. Defaults to None.
- Returns:
torch.utils.data.DataLoader: The dataloader for the dataset.
- Raises:
- ValueError:
If the dataset_name is not a valid dataset that exists in Datasets Cloud.
- ValueError:
If the download_directory is not a valid directory.
- build(tuple_of_kwargs: List[Dict[str, Any]]) Tuple[DataLoader, DataLoader, DataLoader]
Builds the dataloaders for the dataset.
- Args:
- tuple_of_kwargs:
Arguments for the dataloader builder.
- Returns:
- Tuple[DataLoader, DataLoader, DataLoader]:
The dataloaders for the dataset (train, validation, test).
- get_metadata() Dict[str, Any]
Returns the metadata of the dataset.
- Returns:
- Dict[str, Any]:
The metadata of the dataset.
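A hedged sketch of building dataloaders from a cloud-hosted dataset; the name is again a placeholder and the per-loader keyword arguments follow the build signature above.
from gdeep.data.datasets import DlBuilderFromDataCloud

dl_builder = DlBuilderFromDataCloud("my_dataset", download_directory="downloads")
print(dl_builder.get_metadata())               # inspect the downloaded dataset

dl_tr, dl_val, dl_ts = dl_builder.build(
    [{"batch_size": 32}, {"batch_size": 32}, {"batch_size": 32}])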
- class gdeep.data.datasets.FromArray(x: Tensor | ndarray, y: Tensor | ndarray)
This class is useful to build datasets from arrays of X and y. Tensors are also supported.
- Args:
- X :
The data. The first dimension is the datum index
- y :
The labels, need to match the first dimension with the data
- class gdeep.data.datasets.ImageClassificationFromFiles(img_folder: str = '.', labels_file: str = 'labels.csv')
This class is useful to build a dataset directly from image files
- Args:
- img_folder (string):
The path to the folder where the training images are located
- labels_file (string):
The path and file name of the labels. It shall be a .csv file with two columns: the first column contains the name of the image and the second one contains the label value.
- transform (AbstractPreprocessing):
the instance of the preprocessing class; it inherits from AbstractPreprocessing
- target_transform (AbstractPreprocessing):
the instance of the preprocessing class; it inherits from AbstractPreprocessing
- class gdeep.data.datasets.OrbitsGenerator(parameters: Sequence[float] = (2.5, 3.5, 4.0, 4.1, 4.3), num_orbits_per_class: int = 1000, num_pts_per_orbit: int = 1000, homology_dimensions: Sequence[int] = (0, 1), validation_percentage: float = 0.0, test_percentage: float = 0.0, dynamical_system: str = 'classical_convention', n_jobs: int = 1, dtype: str = 'float32', arbitrary_precision=False)
Generate the orbit dataset, consisting of orbits defined by the dynamical system
x[n+1] = (x[n] + r * y[n] * (1 - y[n])) % 1
y[n+1] = (y[n] + r * x[n+1] * (1 - x[n+1])) % 1
Note that the updated value x[n+1] appears in the second equation (see the sketch after this class entry). The parameter r is a hyperparameter and the classification task is to predict it given the orbit. By default r is chosen from (2.5, 3.5, 4.0, 4.1, 4.3).
- Args:
- parameters (Tuple[float]):
Hyperparameter of the dynamical systems.
- num_orbits_per_class (int):
number of orbits per class.
- num_pts_per_orbit (int):
number of points per orbit.
- homology_dimensions (Sequence[int]):
homology dimension of the persistence diagrams.
- validation_percentage (float, optional):
Percentage of the validation dataset. Defaults to 0.0.
- test_percentage (float, optional):
Percentage of the test dataset. Defaults to 0.0.
- dynamical_system (str, optional):
either use the persistence paths convention 'pp_convention' or the classical convention 'classical_convention'. Defaults to 'classical_convention'.
- n_jobs (int, optional):
number of cpus to run the computation on. Defaults to 1.
- get_dataloader_combined(dataloaders_kwargs: DataLoaderKwargs) Tuple[DataLoader, DataLoader, DataLoader]
Generates dataloaders from the orbits dataset and the persistence diagrams.
- Returns:
- Tuple[DataLoader, DataLoader, DataLoader]:
Dataloaders of orbits and persistence diagrams
- get_dataloader_orbits(dataloaders_kwargs: DataLoaderKwargs) Tuple[DataLoader, DataLoader, DataLoader]
Generates dataloaders from the orbits dataset.
- Returns:
- Tuple[DataLoader, DataLoader, DataLoader]:
Dataloaders of orbits
- get_dataloader_persistence_diagrams(dataloaders_kwargs: DataLoaderKwargs) Tuple[DataLoader, DataLoader, DataLoader]
Generates dataloaders from the persistence diagrams dataset.
- Returns:
- Tuple[DataLoader, DataLoader, DataLoader]:
Dataloaders of persistence diagrams
- get_orbits() None | ndarray
Returns the orbits as an ndarray of shape (num_classes * num_orbits_per_class, num_pts_per_orbit, 2).
- Returns:
- np.ndarray:
Orbits
- get_persistence_diagrams() None | ndarray
Returns the persistence diagrams as an ndarray of shape (num_classes * num_orbits_per_class, num_topological_features, 3).
- Returns:
- np.ndarray:
Persistence diagrams
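The sketch below first writes out the recurrence from the class description for a single orbit, then lets the generator build dataloaders; the explicit loop is illustrative and not the class's internal implementation.
import numpy as np
from gdeep.data.datasets import OrbitsGenerator, DataLoaderKwargs

# one orbit of the dynamical system, written out explicitly (r is the class label)
def single_orbit(r: float, num_pts: int = 1000) -> np.ndarray:
    pts = np.zeros((num_pts, 2))
    pts[0] = np.random.rand(2)
    for n in range(num_pts - 1):
        x, y = pts[n]
        x_next = (x + r * y * (1.0 - y)) % 1.0
        y_next = (y + r * x_next * (1.0 - x_next)) % 1.0
        pts[n + 1] = (x_next, y_next)
    return pts

# the generator does the same for every parameter r and can also compute persistence diagrams
og = OrbitsGenerator(num_orbits_per_class=10, num_pts_per_orbit=200,
                     validation_percentage=0.1, test_percentage=0.1)
loader_kwargs = DataLoaderKwargs(train_kwargs={"batch_size": 8},
                                 val_kwargs={"batch_size": 8},
                                 test_kwargs={"batch_size": 8})
dl_tr, dl_val, dl_ts = og.get_dataloader_orbits(loader_kwargs)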
- class gdeep.data.datasets.PersistenceDiagramFromFiles(file_path: str)
- class gdeep.data.datasets.Rotation(axis_0: int, axis_1: int, angle: float)
Class for rotations
- class gdeep.data.datasets.ToriDataset(name: str, **kwargs)
This class is used to generate data loaders for the family of tori-datasets
- Args:
- name:
name of the torus dataset to generate
- gdeep.data.datasets.create_pd_orbits(orbits, num_classes, homology_dimensions=(0, 1), n_jobs=2) Tensor
Computes the weak alpha persistence of the orbit data clouds.
- Args:
- orbits (np.array):
Orbits of shape [n_points, 2]
- homology_dimensions (tuple, optional):
Dimensions to compute the persistence diagrams. Defaults to (0, 1).
- n_jobs (int, optional):
Number of cpus to use for parallel computation. Defaults to 2.
- Returns:
- np.array:
Array of persistence diagrams of shape [num_classes, num_orbits, num_persistence_points, 3]. In the last dimension the first two values are the coordinates of the points in the persistence diagrams and the third is the homology dimension.
- gdeep.data.datasets.generate_orbit_parallel(num_classes, num_orbits, num_pts_per_orbit: int = 100, parameters: List[float] = [1.0]) ndarray
Generate sequence of points of a dynamical system in a parallel manner.
- Args:
- num_classes (int):
number of classes of dynamical systems.
- num_orbits (int):
number of orbits of dynamical system per class.
- num_pts_per_orbit (int, optional):
Number of points to generate. Defaults to 100.
- parameters (List[float], optional):
List of parameters of the dynamical system. Defaults to [1.0].
- Returns:
- np.ndarray:
Array of sampled points of the dynamical system.
Persistence Diagrams
- class gdeep.data.persistence_diagrams.OneHotEncodedPersistenceDiagram(data: Tensor, homology_dimension_names: List[str] | None = None)
This class represents a single one-hot encoded persistence diagram.
- Args:
- data:
The data of the persistence diagram. The data must be a tensor of shape (num_points, 2 + num_homology_dimensions) and the last dimension must be the concatenation of the birth-death-coordinates and the one-hot encoded homology dimension. The invariants of the persistence diagram are checked in the constructor.
- homology_dimension_names:
The names of the homology dimensions. If None, the names are set to H_0, H_1, …
- Example:
pd = torch.tensor([[0.0928, 0.0995, 0.0000, 0.0000, 1.0000, 0.0000],
                   [0.0916, 0.1025, 1.0000, 0.0000, 0.0000, 0.0000],
                   [0.0978, 0.1147, 1.0000, 0.0000, 0.0000, 0.0000],
                   [0.0978, 0.1147, 0.0000, 0.0000, 1.0000, 0.0000],
                   [0.0916, 0.1162, 0.0000, 0.0000, 0.0000, 1.0000],
                   [0.0740, 0.0995, 1.0000, 0.0000, 0.0000, 0.0000],
                   [0.0728, 0.0995, 1.0000, 0.0000, 0.0000, 0.0000],
                   [0.0740, 0.1162, 0.0000, 0.0000, 0.0000, 1.0000],
                   [0.0728, 0.1162, 0.0000, 0.0000, 1.0000, 0.0000],
                   [0.0719, 0.1343, 0.0000, 0.0000, 0.0000, 1.0000],
                   [0.0830, 0.2194, 1.0000, 0.0000, 0.0000, 0.0000],
                   [0.0830, 0.2194, 1.0000, 0.0000, 0.0000, 0.0000],
                   [0.0719, 0.2194, 0.0000, 1.0000, 0.0000, 0.0000]])
names = ["Ord0", "Ext0", "Rel1", "Ext1"]
pd = OneHotEncodedPersistenceDiagram(pd, names)
- all_close(other: OneHotEncodedPersistenceDiagram, atol: float = 1e-07) bool
This method checks if the persistence diagrams are close.
- filter_by_lifetime(min_lifetime: float, max_lifetime: float) OneHotEncodedPersistenceDiagram
This method filters the persistence diagram by lifetime.
- Args:
- min_lifetime:
The minimum lifetime of the remaining points.
- max_lifetime:
The maximum lifetime of the remaining points.
- static from_numpy(data: ndarray) OneHotEncodedPersistenceDiagram
This method creates a persistence diagram from a numpy array.
- get_all_points_in_homology_dimension(homology_dimension: int) OneHotEncodedPersistenceDiagram
This method returns all points in a given homology dimension.
- get_lifetimes() Tensor
This method returns the lifetimes of the points.
- get_num_homology_dimensions() int
This method returns the number of homology dimensions.
- get_num_points() int
This method returns the number of points.
- get_points_in_homology_dimension(homology_dimension: int) OneHotEncodedPersistenceDiagram
This method returns all points in a given homology dimension.
- get_raw_data() Tensor
This method returns the raw data of the persistence diagram. This function should not be used to change the data.
- static load(path: str) OneHotEncodedPersistenceDiagram
This method loads a persistence diagram from a file.
- plot(names: List[str] | None = None) Figure
This method plots the persistence diagram.
- Args:
- names:
The names of the homology dimensions.
Examples:
pd = torch.tensor([[0.0928, 0.0995, 0.0000, 0.0000, 1.0000, 0.0000],
                   [0.0916, 0.1025, 1.0000, 0.0000, 0.0000, 0.0000],
                   [0.0978, 0.1147, 1.0000, 0.0000, 0.0000, 0.0000],
                   [0.0978, 0.1147, 0.0000, 0.0000, 1.0000, 0.0000],
                   [0.0916, 0.1162, 0.0000, 0.0000, 0.0000, 1.0000],
                   [0.0740, 0.0995, 1.0000, 0.0000, 0.0000, 0.0000],
                   [0.0728, 0.0995, 1.0000, 0.0000, 0.0000, 0.0000],
                   [0.0740, 0.1162, 0.0000, 0.0000, 0.0000, 1.0000],
                   [0.0728, 0.1162, 0.0000, 0.0000, 1.0000, 0.0000],
                   [0.0719, 0.1343, 0.0000, 0.0000, 0.0000, 1.0000],
                   [0.0830, 0.2194, 1.0000, 0.0000, 0.0000, 0.0000],
                   [0.0830, 0.2194, 1.0000, 0.0000, 0.0000, 0.0000],
                   [0.0719, 0.2194, 0.0000, 1.0000, 0.0000, 0.0000]])
names = ["Ord0", "Ext0", "Rel1", "Ext1"]
pd = OneHotEncodedPersistenceDiagram(pd, names)
pd.plot()
- save(path: str) None
This method saves the persistence diagram to a file.
- set_homology_dimension_names(homology_dimension_names: List[str]) None
This method sets the homology dimension names.
- gdeep.data.persistence_diagrams.collate_fn_persistence_diagrams(batch: List[Tuple[OneHotEncodedPersistenceDiagram, int]]) Tuple[List[Tensor], Tensor]
This function collates the data for the persistence diagram by padding the data, converting the data to tensors, converting the labels to tensors and generating masks for the valid entries.
The input is a list of tuples of the form (persistence diagram, label).
- Args:
- batch:
The list of tuples of the form (persistence diagram, label).
- Returns:
The data, the labels and the masks.
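A hedged sketch of plugging the collate function into a torch DataLoader; diagram_dataset stands for any dataset yielding (OneHotEncodedPersistenceDiagram, label) pairs, e.g. a PersistenceDiagramFromFiles instance.
from torch.utils.data import DataLoader
from gdeep.data.persistence_diagrams import collate_fn_persistence_diagrams

loader = DataLoader(
    diagram_dataset,                              # yields (diagram, label) pairs
    batch_size=16,
    collate_fn=collate_fn_persistence_diagrams)   # pads the diagrams and builds the masks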
- gdeep.data.persistence_diagrams.get_one_hot_encoded_persistence_diagram_from_gtda(persistence_diagram: ndarray) OneHotEncodedPersistenceDiagram
This function takes a single persistence diagram from giotto-tda and returns a one-hot encoded persistence diagram.
- Args:
- persistence_diagram:
An array of shape (num_points, 3) where the first two columns represent the coordinates of the points and the third column represents the index of the homology dimension.
- Returns:
- OneHotEncodedPersistenceDiagram:
A one-hot encoded persistence diagram. If the persistence diagram has only one homology dimension, the third column will be filled with ones.
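A small conversion sketch; the array values are made up, with the third column holding the homology dimension index as stated above.
import numpy as np
from gdeep.data.persistence_diagrams import (
    get_one_hot_encoded_persistence_diagram_from_gtda,
)

# (num_points, 3): birth, death, homology dimension index
gtda_diagram = np.array([[0.0, 0.3, 0.0],
                         [0.1, 0.5, 0.0],
                         [0.2, 0.4, 1.0]])
one_hot_pd = get_one_hot_encoded_persistence_diagram_from_gtda(gtda_diagram)
print(one_hot_pd.get_num_homology_dimensions())  # 2 in this illustrative case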
- gdeep.data.persistence_diagrams.get_one_hot_encoded_persistence_diagram_from_gudhi_extended(diagram: Tuple[ndarray, ndarray, ndarray, ndarray]) OneHotEncodedPersistenceDiagram
Convert an extended persistence diagram of a single graph to an array with one-hot encoded homology type.
- Args:
- diagram (Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]):
The diagram of an extended persistence of a single graph.
- Returns:
- np.ndarray:
The diagram in one-hot encoded homology type of size (num_points, 6).