Volume 36, Issue 3
Reduced-Rank Modeling for High-Dimensional Model-Based Clustering

Lei Yang, Junhui Wang & Shiqian Ma

J. Comp. Math., 36 (2018), pp. 426-440.

Published online: 2018-06

  • Abstract

Model-based clustering is widely used in the statistical literature and typically models the data with a Gaussian mixture model. As a consequence, it requires estimating a large number of parameters, especially when the data dimension is relatively large. In this paper, a reduced-rank model and group-sparsity regularization are incorporated into model-based clustering; together they substantially reduce the number of parameters and thus enable high-dimensional clustering and variable selection simultaneously. We propose an EM algorithm for this task, in which the M-step is solved by alternating minimization. One of the alternating steps involves both a nonsmooth function and a nonconvex constraint, so we propose a linearized alternating direction method of multipliers (ADMM) to solve it. This leads to an efficient algorithm whose subproblems are all easy to solve. In addition, a model selection criterion based on the concept of clustering stability is developed for tuning the clustering model. The effectiveness of the proposed method is supported by a variety of simulated and real examples, as well as by its asymptotic estimation and selection consistency.
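To illustrate what an "easy to solve" subproblem under group-sparsity regularization can look like, here is a minimal sketch of group soft-thresholding, the standard proximal operator of a group-Lasso penalty. It is not taken from the paper: the function name `group_soft_threshold`, the threshold `tau`, and the toy matrix `means` are assumptions made purely for illustration, and the authors' actual ADMM updates may differ.

```python
import math


def group_soft_threshold(rows, tau):
    """Row-wise proximal operator of tau * sum_j ||row_j||_2 (group Lasso).

    Illustrative sketch only; each row is treated as one variable's group.
    """
    result = []
    for row in rows:
        norm = math.sqrt(sum(x * x for x in row))
        # Rows whose norm is at most tau are set to zero (the variable is dropped);
        # all other rows are shrunk toward zero by the factor (1 - tau / norm).
        scale = 0.0 if norm <= tau else 1.0 - tau / norm
        result.append([scale * x for x in row])
    return result


# Hypothetical parameter matrix: 3 variables (rows), 2 columns each.
means = [[3.0, 4.0], [0.3, 0.4], [0.0, 0.0]]
shrunk = group_soft_threshold(means, tau=1.0)
print(shrunk)
```

In this toy example the first row (norm 5) is shrunk but kept, while the second row (norm 0.5) is zeroed out entirely, which is how a group penalty can perform variable selection.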

  • AMS Subject Headings

62-07, 90C30

  • Copyright

COPYRIGHT: © Global Science Press

  • Email address

ly888@nyu.edu (Lei Yang)

j.h.wang@cityu.edu.hk (Junhui Wang)

  • BibTex

@Article{JCM-36-426,
  author = {Yang, Lei and Wang, Junhui and Ma, Shiqian},
  title = {Reduced-Rank Modeling for High-Dimensional Model-Based Clustering},
  journal = {Journal of Computational Mathematics},
  year = {2018},
  volume = {36},
  number = {3},
  pages = {426--440},
  issn = {1991-7139},
  doi = {10.4208/jcm.1708-m2016-0830},
  url = {http://global-sci.org/intro/article_detail/jcm/12269.html}
}

  • RIS

TY  - JOUR
T1  - Reduced-Rank Modeling for High-Dimensional Model-Based Clustering
AU  - Yang, Lei
AU  - Wang, Junhui
AU  - Ma, Shiqian
JO  - Journal of Computational Mathematics
VL  - 36
IS  - 3
SP  - 426
EP  - 440
PY  - 2018
DA  - 2018/06
SN  - 1991-7139
DO  - 10.4208/jcm.1708-m2016-0830
UR  - https://global-sci.org/intro/article_detail/jcm/12269.html
KW  - Clustering
KW  - Gaussian mixture model
KW  - Group Lasso
KW  - ADMM
KW  - Reduced-rank model
ER  -

  • TXT

Yang, Lei, Wang, Junhui and Ma, Shiqian. (2018). Reduced-Rank Modeling for High-Dimensional Model-Based Clustering. Journal of Computational Mathematics. 36 (3). 426-440. doi:10.4208/jcm.1708-m2016-0830