Auto-detection of Hot Topics in Mass Chinese Internet Information

CHUN-HUI DENG; YUN LUO; HUI-FANG DENG

doi:10.12783/dtcse/cmsms2018/25261

Abstract

To overcome the weaknesses of traditional clustering strategies for topic detection and to discover hot topics automatically, we re-examined density-based clustering algorithms and propose a sub-cluster relation-based, multi-resolution density clustering algorithm (SRBMRClustering) that considers both the adjacency information of sub-clusters and the concept of relative density. To reduce computational complexity, we also propose a Web structure-based text feature weight calculation method and a concept-feature extraction method, and we use a feature-based vector representation of news text to improve the textual representation and shrink the dimension of the feature space. Finally, we verified the algorithm on a Chinese news corpus from June-July 2012. The experimental results show that the algorithm's performance and clustering quality are notably improved.
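The abstract does not spell out the Web structure-based weighting, so the following is only a minimal sketch of the general idea, under the assumption that terms appearing in structurally prominent parts of a page (such as the title) count more than terms in the body. The zone names, the weight values in `ZONE_WEIGHTS`, and the helper names `weighted_vector` and `cosine` are all hypothetical and are not taken from the paper. Tokens are assumed to have been produced already by a Chinese word segmenter.

```python
import math
from collections import Counter

# Hypothetical zone weights: the abstract does not give the actual values.
ZONE_WEIGHTS = {"title": 3.0, "body": 1.0}


def weighted_vector(zones):
    """Build a sparse term-weight vector from a {zone_name: [tokens]} mapping.

    Each occurrence of a token adds the weight of the zone it appears in;
    zones not listed in ZONE_WEIGHTS fall back to a weight of 1.0.
    """
    vec = Counter()
    for zone, tokens in zones.items():
        weight = ZONE_WEIGHTS.get(zone, 1.0)
        for token in tokens:
            vec[token] += weight
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Usage: two pre-segmented news items that share a title term.
doc1 = weighted_vector({"title": ["earthquake"], "body": ["rescue", "earthquake"]})
doc2 = weighted_vector({"title": ["earthquake"], "body": ["aid"]})
similarity = cosine(doc1, doc2)
```

A similarity of this kind could then serve as the distance input to a density-based clustering step; how SRBMRClustering itself combines sub-cluster adjacency and relative density is not described in enough detail here to sketch.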

Keywords

Hot topic auto-detection, Text preprocessing, Text clustering, Web structure-based text feature weight calculation, Concept feature extraction, Multi-resolution density clustering, Sub-cluster relation-based clustering, Feature-based news text vector representation, Feature space, Advanced Chinese word segmentation ICTCLAS system

COMPUTER SCIENCE and ENGINEERING
