A Text Classification Algorithm Based on PCA

Jian-lin LI

doi:10.12783/dtcse/cst2017/12555

Abstract

This paper studies feature extraction algorithms for Web text. Building on an analysis of the mutual information (MI), document frequency (DF), information gain (IG), and χ² statistic (CHI) methods, it exploits their complementary strengths and proposes a multiple combination feature extraction algorithm based on principal component analysis (PCA-MCFEA). First, the orthogonal transformation of PCA is used to quickly reduce the dimensionality of the text feature space. Then, the multiple combination feature extraction algorithm is applied in the lower-dimensional feature space to quickly extract the more representative features and filter out weakly representative feature items. Finally, an SVM classifier is used to classify the text. The experimental results show that the PCA-MCFEA algorithm can effectively improve text classification accuracy and running efficiency.
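The four scoring measures named in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so the code below assumes the common textbook definitions of DF, MI, IG, and CHI computed from a 2x2 term/class contingency table; the function names `entropy` and `term_scores` and the count names `a`, `b`, `c`, `d` are hypothetical and not taken from the paper. The rule PCA-MCFEA uses to combine the scores, and its PCA and SVM stages, are not specified in the abstract and are not shown.

```python
import math


def entropy(counts):
    """Shannon entropy (in bits) of a list of non-negative counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def term_scores(a, b, c, d):
    """Score one term against one class from document counts.

    Assumed (hypothetical) count layout, with a + b + c + d > 0:
      a: documents in the class that contain the term
      b: documents outside the class that contain the term
      c: documents in the class that lack the term
      d: documents outside the class that lack the term
    """
    n = a + b + c + d

    # Document frequency: number of documents containing the term.
    df = a + b

    # Pointwise mutual information between term and class.
    mi = math.log2(a * n / ((a + c) * (a + b))) if a > 0 else float("-inf")

    # Chi-square statistic of the 2x2 table.
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    chi = n * (a * d - c * b) ** 2 / denom if denom else 0.0

    # Information gain: class entropy minus the entropy that remains
    # after splitting the documents on presence/absence of the term.
    ig = (
        entropy([a + c, b + d])
        - ((a + b) / n) * entropy([a, b])
        - ((c + d) / n) * entropy([c, d])
    )

    return {"df": df, "mi": mi, "ig": ig, "chi": chi}
```

With these definitions, a term that is statistically independent of the class (for example `term_scores(10, 10, 10, 10)`) scores zero on MI, IG, and CHI, while a term that occurs in exactly the documents of the class scores high on all three. DF ignores the class entirely, which is one reason the measures can complement each other.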

Keywords

PCA-MCFEA, Feature extraction, Text classification

COMPUTER SCIENCE and ENGINEERING
