find_ridge
- crispy.scms.find_ridge(X, G, D=3, h=1, d=1, eps=1e-06, maxT=1000, weights=None, converge_frac=99, ncpu=None, return_unconverged=True, f_h=5)
Identify density ridges in data using the Subspace Constrained Mean Shift (SCMS) algorithm.
This function iteratively shifts walkers towards the density ridges of the input data by computing local density estimates and projecting onto the subspace of interest.
- Parameters:
X (ndarray) – Coordinates of the data points, shape (n, D, 1).
G (ndarray) – Initial coordinates of the walkers, shape (m, D, 1).
D (int, optional, default=3) – Dimensionality of the data points.
h (float, optional, default=1) – Smoothing bandwidth for the Gaussian kernel.
d (int, optional, default=1) – Number of dimensions to retain in the ridge subspace.
eps (float, optional, default=1e-6) – Convergence criterion. The maximum allowable error for a walker to be considered converged.
maxT (int, optional, default=1000) – Maximum number of iterations.
weights (ndarray, optional) – Weights for the data points. If None, all points are equally weighted.
converge_frac (float, optional, default=99) – Percentage of walkers that must converge for the algorithm to terminate (for example, 99 means 99% of walkers).
ncpu (int, optional) – Number of CPUs to use for parallel processing. Defaults to the number of available CPUs.
return_unconverged (bool, optional, default=True) – If True, returns both converged and unconverged walkers. Otherwise, only converged walkers are returned.
f_h (float, optional, default=5) – Factor for filtering data points based on their distance to walkers. Points farther than f_h * h are excluded to reduce computational overhead.
- Returns:
G_converged (ndarray) – Coordinates of the converged walkers.
G_unconverged (ndarray, optional) – Coordinates of unconverged walkers, returned only if return_unconverged=True.
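The f_h cutoff listed under Parameters can be pictured with a short NumPy sketch. This is only an illustration under stated assumptions, not crispy's internal code: the helper name filter_points is hypothetical, the arrays are plain (n, D) and (m, D) rather than the (n, D, 1) layout find_ridge expects, and it assumes a point is kept when it lies within f_h * h of at least one walker. The source does not say whether filtering is done per walker or globally.

```python
import numpy as np

def filter_points(X, G, h=1.0, f_h=5.0):
    """Keep data points within f_h * h of at least one walker.

    Hypothetical helper for illustration; X has shape (n, D), G has shape (m, D).
    """
    # Squared distance from every data point to every walker, shape (n, m)
    d2 = ((X[:, None, :] - G[None, :, :]) ** 2).sum(axis=-1)
    # A point survives if any walker lies within the cutoff radius
    keep = (d2 <= (f_h * h) ** 2).any(axis=1)
    return X[keep], keep

X_demo = np.array([[0.0, 0.0], [10.0, 0.0], [0.5, 0.0]])
G_demo = np.array([[0.0, 0.0]])
X_near, keep_mask = filter_points(X_demo, G_demo, h=1.0, f_h=5.0)
```

With the default f_h=5, a discarded point would have carried a Gaussian weight of at most exp(-12.5), roughly 4e-6 of the peak weight, which is why dropping it changes the density estimate very little.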
Notes
The algorithm stops when either the fraction of converged walkers meets converge_frac or the maximum number of iterations (maxT) is reached.
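The stopping rule can be sketched as follows. This is a minimal illustration, not the library's implementation: the function name should_stop and the per-walker errors array are hypothetical, and it assumes that "meets" means greater than or equal to and that a walker counts as converged when its error is below eps.

```python
import numpy as np

def should_stop(errors, t, eps=1e-6, converge_frac=99.0, maxT=1000):
    """Return (stop, converged_mask) for the current iteration t.

    Hypothetical helper; errors holds one non-negative error per walker.
    """
    converged = errors < eps
    # Percentage of walkers that have converged so far
    pct = 100.0 * np.count_nonzero(converged) / converged.size
    stop = bool(pct >= converge_frac or t >= maxT)
    return stop, converged

# 99 of 100 walkers below eps: the 99% threshold is met
errs = np.array([1e-8] * 99 + [1e-3])
stop_now, mask = should_stop(errs, t=10)
```

Walkers left with mask equal to False at termination correspond to what find_ridge reports as G_unconverged when return_unconverged=True.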
The SCMS algorithm leverages Gaussian kernels to compute local density estimates and iteratively shifts walkers toward regions of high density.
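To make the update concrete, here is a textbook-style sketch of a single SCMS step for one walker. It is an assumption-laden illustration rather than crispy's code: the name scms_step is hypothetical, arrays are plain (n, D) instead of (n, D, 1), the data are unweighted, and the normal subspace is taken from the D - d smallest eigenvalues of the Hessian of the log-density, which is one common formulation of SCMS.

```python
import numpy as np

def scms_step(x, X, h=1.0, d=1):
    """One SCMS update for a walker x (shape (D,)) given data X (shape (n, D)).

    Hypothetical helper for illustration only.
    """
    D = X.shape[1]
    diff = X - x                                          # (n, D)
    c = np.exp(-0.5 * np.sum(diff ** 2, axis=1) / h ** 2)  # Gaussian weights
    p = c.sum()                                           # unnormalised density
    # Ordinary mean-shift vector: weighted mean of the data minus x
    m = (c[:, None] * diff).sum(axis=0) / p
    # Gradient and Hessian of the unnormalised kernel density estimate
    g = p * m / h ** 2
    H = (diff.T * c) @ diff / h ** 4 - p * np.eye(D) / h ** 2
    # Hessian of the log-density
    H_log = H / p - np.outer(g, g) / p ** 2
    # eigh returns eigenvalues in ascending order; the first D - d
    # eigenvectors span the directions of strongest negative curvature
    _, evecs = np.linalg.eigh(H_log)
    V = evecs[:, : D - d]
    # Move only along the directions normal to the ridge
    return x + V @ (V.T @ m)

# Synthetic 2D data scattered tightly around the x-axis (the ridge)
rng = np.random.default_rng(0)
X_line = np.column_stack([np.linspace(-2, 2, 200), 0.05 * rng.normal(size=200)])
x_final = np.array([0.1, 0.3])
for _ in range(50):
    x_final = scms_step(x_final, X_line, h=0.3, d=1)
```

The only difference from plain mean shift is the projection V Vᵀ m: the walker is pulled toward the ridge but is not allowed to slide along it toward a density peak.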
Examples
Find ridges in a dataset with 3D coordinates:
>>> import numpy as np
>>> from crispy import scms
>>> data = np.random.random((100, 3, 1))  # Random 3D data
>>> walkers = np.random.random((20, 3, 1))  # Random walker positions
>>> ridges, unconverged = scms.find_ridge(data, walkers, h=0.5, maxT=500)