Paper
Visual analytics for the clustering capability of data
Release time: 2019-03-09
Indexed by: Journal article
First Author: Lu ZhiMao
Correspondence Author: Lu, ZM (reprint author), Harbin Engn Univ, Pattern Recognit & Nat Computat Lab, Harbin 150001, Peoples R China.
Co-author: Liu Chen, Zhang Qi, Zhang ChunXiang, Fan DongMei, Yang Peng
Date of Publication: 2013-05-01
Journal: SCIENCE CHINA-INFORMATION SCIENCES
Included Journals: SCIE, EI
Document Type: J
Volume: 56
Issue: 5
Page Number: 1-14
ISSN No.: 1674-733X
Key Words: data mining; clustering analysis; visual analysis; minimum distance spectrum; nearest neighbor spectrum; outliers
Abstract: Clustering analysis is an unsupervised method for finding hidden structures in datasets and has been widely used in various fields. However, it is difficult for users to understand, evaluate, and explain clustering results in spaces of more than three dimensions. Although high-dimensional visualization techniques can express clustering results well, they still have significant limitations. In this paper, a visual cluster analysis method based on the minimum distance spectrum (MinDS) is proposed, aimed at reducing the problems of clustering multidimensional datasets. First, the concept of MinDS is defined based on the distance between high-dimensional data points. MinDS can map any dataset from high-dimensional space to a lower dimension to determine whether the dataset is separable. Next, a clustering method that can automatically determine the number of categories is designed based on MinDS. This method can cluster not only datasets with clear boundaries but also datasets with fuzzy boundaries, through an edge corrosion strategy based on the energy of each data point. In addition, strategies for removing noise and identifying outliers are designed to clean datasets according to the characteristics of MinDS. The experimental results validate the feasibility and effectiveness of the proposed schemes and show that the proposed approach is simple, stable, and efficient, and can achieve multidimensional visual cluster analysis of complex datasets.
Translation or Not: no