DiverseSelector.base

class DiverseSelector.base.SelectionBase

Base class for selecting a subset of sample points.

Methods

select(arr, size[, labels])

Return indices representing a subset of sample points.

select_from_cluster(arr, size[, labels])

Return indices representing a subset of sample points from one cluster.

select(arr: ndarray, size: int, labels: Optional[ndarray] = None) → ndarray

Return indices representing a subset of sample points.

Parameters:
  • arr (np.ndarray) – Array of features (columns) for each sample (rows) if fun_distance is provided. Otherwise, arr is treated as a pairwise distance matrix.

  • size (int) – Number of sample points to select (i.e. size of the subset).

  • labels (np.ndarray, optional) – Array of integers or strings giving the label of the cluster that each sample belongs to. If None, all samples are treated as one cluster. If labels are provided, selection is made from each cluster.

Returns:

selected – Indices of the selected sample points.

Return type:

list
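
The effect of labels can be illustrated with a minimal sketch. It is not the library's implementation: select_by_label and first_k are hypothetical names, arr is assumed to be a feature array (rows are samples) rather than a distance matrix, and splitting size evenly across clusters is an assumption made only for this example.

```python
import numpy as np


def first_k(arr, size):
    """Hypothetical one-cluster selector: take the first `size` rows."""
    return list(range(min(size, len(arr))))


def select_by_label(arr, size, labels=None, pick=first_k):
    """Sketch: return `size` indices overall, drawing from every cluster."""
    arr = np.asarray(arr)
    if labels is None:
        # No labels: all samples are treated as a single cluster.
        return list(pick(arr, size))

    labels = np.asarray(labels)
    unique = np.unique(labels)
    # Assumption: divide the budget as evenly as possible across clusters.
    base, extra = divmod(size, len(unique))

    selected = []
    for i, label in enumerate(unique):
        members = np.flatnonzero(labels == label)  # global indices of this cluster
        quota = base + (1 if i < extra else 0)
        local = pick(arr[members], quota)  # indices local to the cluster
        # Map cluster-local indices back to indices into the full array.
        selected.extend(int(members[j]) for j in local)
    return selected


features = np.arange(12).reshape(6, 2)
cluster_labels = np.array([0, 1, 0, 1, 0, 1])
print(select_by_label(features, 4, cluster_labels))  # [0, 2, 1, 3]
```

The returned indices refer to rows of the full array, not to positions within a cluster, which is why each cluster-local pick is mapped back through members.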

abstract select_from_cluster(arr: ndarray, size: int, labels: Optional[ndarray] = None) → ndarray

Return indices representing a subset of sample points from one cluster.

Parameters:
  • arr (np.ndarray) – Array of features (columns) for each sample (rows). If fun_distance is None, arr is treated as a pairwise distance matrix.

  • size (int) – Number of sample points to select (i.e. size of the subset).

  • labels (np.ndarray, optional) – Array of integers or strings giving the label of the cluster that each sample belongs to. If None, all samples are treated as one cluster. If labels are provided, selection is made from each cluster.

Returns:

selected – Indices of the selected sample points.

Return type:

list
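
Because select_from_cluster is abstract, a concrete selector has to supply it. The sketch below shows one possible shape under stated assumptions: SelectorSketch is a hypothetical stand-in that mirrors the documented signature (it is not DiverseSelector.base.SelectionBase), arr is taken to be a pairwise distance matrix, labels is ignored, and the greedy max-min rule is chosen purely for illustration.

```python
from abc import ABC, abstractmethod

import numpy as np


class SelectorSketch(ABC):
    """Hypothetical stand-in for the documented interface, not the library class."""

    @abstractmethod
    def select_from_cluster(self, arr, size, labels=None):
        """Return indices of `size` points chosen from one cluster."""


class MaxMinSketch(SelectorSketch):
    """Greedy max-min picker over a pairwise distance matrix (illustrative)."""

    def select_from_cluster(self, arr, size, labels=None):
        dist = np.asarray(arr, dtype=float)
        if size > len(dist):
            raise ValueError("size exceeds the number of samples")
        if size == 0:
            return []
        # Start from the point with the largest total distance to all others.
        selected = [int(np.argmax(dist.sum(axis=0)))]
        while len(selected) < size:
            # Distance from each point to its nearest already-selected point.
            nearest = dist[selected].min(axis=0)
            nearest[selected] = -np.inf  # never pick the same point twice
            selected.append(int(np.argmax(nearest)))
        return selected


# Four points on a line; distances are absolute differences.
x = np.array([0.0, 1.0, 2.0, 10.0])
dist = np.abs(x[:, None] - x[None, :])
print(MaxMinSketch().select_from_cluster(dist, 3))  # [3, 0, 2]
```

In this toy input the outlier at 10.0 is chosen first, then the point farthest from it, then the point farthest from both.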