Maximum triangle rule and semi-supervision based k-means algorithm for large scale data
Posted: 2019-03-11
Paper type:
Journal article
First author:
Feng J.
Co-authors:
Lu Z., Zhang Z.
Publication date:
2015-01-01
Journal:
ICIC Express Letters
Indexed in:
EI, Scopus
Document type:
J
Volume:
9
Issue:
6
Page range:
1553-1558
ISSN:
1881-803X
Abstract:
Clustering algorithms that must repeatedly scan the whole data set do not scale well to large data sets. In addition, the quality of the results that some of them produce is limited by their sensitivity to initialization parameters and to the data distribution. To address these problems, this paper proposes a Maximum Triangle Rule and Semi-Supervision based k-means algorithm (MTRSSKM). MTRSSKM applies the maximum triangle rule to choose the initial cluster centers for k-means, and uses a small number of labels retained in memory to supervise and guide the clustering process. MTRSSKM needs to scan the original data set only once. Clustering quality is improved, and the single scan accelerates the clustering process. An experiment on the 1998KDD data set shows the effectiveness of MTRSSKM. © 2015 ICIC International.
Translation:
No