Specific Textual Information Detection for Chinese Micro-blog
Abstract
Detecting specific textual information over the long term is an interesting research problem. Micro-blog data is an abundant source for detecting and analyzing such information. As a universal concept, specific information can be information about any entity, such as movies or journeys. If long-term specific information can be collected and analyzed, hidden value in the data may emerge. Because the context of specific textual information in the micro-blog space tends to change, batch-processing methods usually retrain a classifier periodically on different training sets to maintain its performance. In this paper, we present an incremental learning method based on SVM to detect long-term specific information efficiently. In addition, topic words about the specific information are extracted for different time periods. To test our ideas, we manually create a labeled data set about weight-loss products from Chinese Sina micro-blog, spanning one year, with the help of a semi-supervised text classifier. Experiments show that our algorithm maintains detection performance well and finds strongly related topic words in different time periods.
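To make the contrast with periodic batch retraining concrete, the following is a minimal sketch of incremental learning with a linear SVM. It is not the algorithm from the paper, whose details the abstract does not give. It assumes a linear SVM trained by stochastic sub-gradient descent on the L2-regularized hinge loss, posts that are already segmented into whitespace-separated tokens (Chinese text would need a word segmenter first), and illustrative hyperparameters. The class name `IncrementalLinearSVM`, its methods, and the toy posts are all hypothetical.

```python
import random


class IncrementalLinearSVM:
    """Linear SVM updated batch by batch (hypothetical illustration).

    Trained by stochastic sub-gradient descent on the L2-regularized
    hinge loss. Weights persist across calls to partial_fit, so each
    new time period refines the model instead of retraining it.
    """

    def __init__(self, lam=0.01, eta=0.1, seed=0):
        self.lam = lam    # L2 regularization strength (assumed value)
        self.eta = eta    # constant learning rate (assumed value)
        self.w = {}       # sparse weight vector: token -> weight
        self.rng = random.Random(seed)

    @staticmethod
    def featurize(text):
        # Bag-of-words counts over whitespace tokens, plus a bias feature.
        feats = {"__bias__": 1.0}
        for tok in text.split():
            feats[tok] = feats.get(tok, 0.0) + 1.0
        return feats

    def score(self, feats):
        return sum(self.w.get(f, 0.0) * v for f, v in feats.items())

    def partial_fit(self, texts, labels, epochs=10):
        # labels are +1 (specific information) or -1 (other posts).
        data = [(self.featurize(t), y) for t, y in zip(texts, labels)]
        for _ in range(epochs):
            self.rng.shuffle(data)
            for feats, y in data:
                margin = y * self.score(feats)
                # Gradient step on the regularizer: shrink all weights.
                shrink = 1.0 - self.eta * self.lam
                for f in self.w:
                    self.w[f] *= shrink
                # Hinge-loss sub-gradient step if the margin is violated.
                if margin < 1.0:
                    for f, v in feats.items():
                        self.w[f] = self.w.get(f, 0.0) + self.eta * y * v
        return self

    def predict(self, text):
        return 1 if self.score(self.featurize(text)) >= 0.0 else -1


# Toy data for two time periods; the vocabulary shifts between them.
period1 = (["diet pill slim", "slim tea diet",
            "movie ticket tonight", "travel beach photo"], [1, 1, -1, -1])
period2 = (["keto shake plan", "keto plan fasting",
            "football match score", "match ticket score"], [1, 1, -1, -1])

model = IncrementalLinearSVM()
model.partial_fit(*period1)   # initial training
model.partial_fit(*period2)   # incremental update, no retraining from scratch
```

After the second call, the model has weights for the vocabulary of both periods, which is the property that lets an incremental learner follow a drifting context without storing and reprocessing all earlier data.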
DOI
10.12783/dtcse/itms2016/9457