Convex clustering is a popular clustering model without requiring the number of clusters as prior knowledge. It can generate a clustering path by continuously solving the model with a sequence of regularization parameter values. PyClustrPath is a convex clustering implementation on PyTorch, aiming to solve large-scale convex clustering problems with GPU acceleration. Several optimization algorithms have been designed for solving the convex clustering model, including:
- Semismooth Newton-CG Augmented Lagrangian Method (SSNAL) [1] [2]
- Alternating Minimization Algorithm (AMA) and fast AMA [3]
- Alternating Direction Method of Multipliers (ADMM) [3]
We summarize the main contributions of PyClustrPath as follows:
-
We design and develop a new Python package for efficiently generating clustering paths by solving convex clustering model, which supports both CPU and GPU computation platforms simultaneously.
-
We implement three popular and efficient algorithms in PyClustrPath: ADMM, fast AMA, and SSNAL.
-
We design the package PyClustrPath in a well-structured modular manner, making it flexible and scalable to incorporate new algorithms in the future.
-
We develop a visualization module in PyClustrPath, providing a user-friendly way to visualize the generated clustering path.
-
We extensively test PyClustrPath's numerical performance for generating clustering paths for several popular benchmark datasets. The results demonstrate the efficiency and robustness of PyClustrPath.
Our package visualizes the clustering path results, as shown in the following figures.
![]() LUNG dataset |
![]() LIBRAS-6 dataset |
![]() COIL-20 dataset |
The following figures show the performance comparison between GPU and CPU on the COIL-20 dataset. The first figure demonstrates the GPU's superior performance for various tensor operations. For example, on the COIL-20 dataset (with dimensions 1024*1440), the speedup for sparse matrix multiplication and Cholesky decomposition for solving systems of linear equations is approximately 40 times and 12 times, respectively. The other figure shows that when solving 100 problems, the CPU can only solve less than 65% of the problems within 10 times the time of the GPU. The overall GPU speedup exceeds tenfold.
![]() Operation performance on GPU and CPU |
![]() Profile of different Algorithms and Platforms |
Our paper is available: Hongfei Wu, Yancheng Yuan, "PyClustrPath: An efficient Python package for generating clustering paths with GPU acceleration", 2025.
If you use PyClustrPath in your research, we would appreciate the citation to the following paper:
@misc{wu2025pyclustrpathefficientpythonpackage,
title={PyClustrPath: An efficient Python package for generating clustering paths with GPU acceleration},
author={Hongfei Wu and Yancheng Yuan},
year={2025},
eprint={2501.15964},
archivePrefix={arXiv},
primaryClass={math.OC},
url={https://arxiv.org/abs/2501.15964},
}- CPU: Intel(R) Xeon(R) Platinum 8480C
- GPU: NVIDIA GeForce RTX 4090
- Ubuntu 22.04
- CUDA 12.1
- Python 3.10.14
- Pytorch 2.2.2
-
/pyclustrpath: Convex clustering solver/solvers: Convex clustering solver in different methods__init__.py: Initialization file'base.py: Base class of convex clustering solverama.py: AMA method and Fast AMA methodadmm.py: ADMM methodpdhg.py: PDHG methodssnal.py: SSNAL methodssncg.py: SSNCG method in SSNAL
__init__.py: Initialization filedata_processor.py: Data preprocessing, including weight vector and matrix calculation, data mapping, etc.global_variable.py: Global variables, including log file setting, the arguments from the command line, etc.utils.py: Utility functions, including Projects, Proximal operators,Incomplete Cholesky Decompositionvisualize.py: Visualization functions, including the clustering path visualization/IChol_CUDA: Incomplete Cholesky Decomposition with CUDA Extension
-
/demo: Demo files/data/real_data: Real datasets/results: Clustering path visualization and the comparison between GPU and CPU/cluster_path_pictures: Clustering path visualization results
demo_lung.py: Demo file for the LUNG datasetdemo_libras6.py: Demo file for the LIBRAS-6 datasetdemo_libras.py: Demo file for the LIBRAS datasetdemo_COIL20.py: Demo file for the COIL20 datasetdemo_MNIST.py: Demo file for the MNIST dataset
Note on PyTorch Installation :
PyClustrPath depends on PyTorch . Please install it from the above link.
You also need to install the following packages before running the code.
This package is used to calculate the K-nearest neighbors (KNN) on GPU.
$ pip install --upgrade https://github.com/unlimblue/KNN_CUDA/releases/download/0.2/KNN_CUDA-0.2-py3-none-any.whl
And then, make sure ninja has been installed:
$ wget -P /usr/bin https://github.com/unlimblue/KNN_CUDA/raw/master/ninja
Reference: https://github.com/unlimblue/KNN_CUDA
This package is used to calculate the Cholesky decomposition for sparse matrix on both CPU and GPU.
$ pip install cholespy
Reference: https://github.com/rgl-epfl/cholespy
For PCG method, we implement the PyTorch extension with CUDA for the incomplete Cholesky decomposition.
$ python IChol_CUDA/setup.py install
Note that you need to keep your Torch version consistent with the CUDA version. Else, it would raise an error in the compilation process.
For detailed instructions, please refer to the files in the /demo folder. The following is an example of running the code on the LUNG dataset using the SSNAL method.
In the function get_args(), you can set the parameters by change the default values,
including the solver algorithm, stopping tolerance, maximum number of iterations, number of neighbors, and whether to use the KKT condition to stop the solver.
import numpy as np
import torch
import argparse
import scipy.io as sio
from pyclustrpath.utils import str2bool
from pyclustrpath import gv, visualize_clustering_results
device = 'cuda' if torch.cuda.is_available() else 'cpu' # Set the device
def get_args():
# define the argument parser
parser = argparse.ArgumentParser()
parser.add_argument('--solver', default='ssnal', type=str, choices=['admm', 'ama', 'fast_ama', 'ssnal', 'pdhg'],
help='The solver to use: admm, fast_ama, ssnal, or pdhg')
parser.add_argument('--stop_tol', default=1e-6, type=float, help='Stopping tolerance for the solver')
parser.add_argument('--max_iter', default=20000, type=int, help='Maximum number of iterations for the solver')
parser.add_argument('--k_neighbor', default=10, type=int, help='Number of neighbors to consider in the data processor')
parser.add_argument('--use_kkt', default=False, type=str2bool, help='Whether to use the KKT condition to stop the solver')
parser.add_argument('--data', default='lung', type=str,
help='The dataset to use: libras6, libras, COIL20, lung, MNIST_test')
parser.add_argument('--device', default=device, type=str, help='The device to use: cpu or cuda')
return parser.parse_args() def main():
# get the arguments
args = get_args()
gv._init(args)
data = sio.loadmat('data/real_data/lung.mat') # Load the LUNG data
tensor_data = torch.Tensor(data['X']).double()
data_label = np.squeeze(data['col_assign'])
gamma_upper = 0.93 # Set the list of gamma
gamma_lower = 0.057
gamma_num = 100
gamma_step = (gamma_upper - gamma_lower) / (gamma_num - 1)
gamma_list = np.arange(gamma_upper, gamma_lower - gamma_step, -gamma_step)
from pyclustrpath import DataProcessor
data_processor = DataProcessor(X=tensor_data, k=args.k_neighbor, device=device) # Process the data
from pyclustrpath import ConvexClusterSolver
solver = ConvexClusterSolver(method=args.solver, gamma_list=gamma_list, # Load the model
stop_tol=args.stop_tol, max_iter=args.max_iter,
use_kkt=args.use_kkt, device=device)
solutions = solver.solve(data_processor) # Solve the problems
visualize_clustering_results(data=tensor_data, solution=solutions, # Visualize the clustering path
gamma_list=gamma_list, label=data_label)Then you can check the clustering path visualization results in the same folder as the demo file.
And the log file will be saved in the /log_file folder.
[1] D.F. Sun, K.C. Toh, and Y.C. Yuan, Convex clustering: model, theoretical guarantee and efficient algorithm, Journal of Machine Learning Research, 22(9), 2021.
[2] Y.C. Yuan, D.F. Sun, and K.C. Toh, An efficient semismooth Newton based algorithm for convex clustering, ICML 2018.
[3] Chi E C, Lange K. Splitting methods for convex clustering[J]. Journal of Computational and Graphical Statistics, 2015, 24(4): 994-1013.
This project is licensed under the BSD 2-Clause License. See the LICENSE file for details.




