Cross-modal Retrieval of Chinese-CQA Based on CCA Algorithm
Abstract
With the development of Chinese community question answering (CQA), a large number of question-answer pairs have been accumulated. These pairs may contain text, images, audio, video, and other multi-modal data, so a key problem for Chinese CQA platforms is how to match questions with the most appropriate answers using cross-modal information such as text and images. In this paper, we propose a question-answer retrieval model based on the Canonical Correlation Analysis (CCA) cross-modal retrieval algorithm. First, LDA is used to represent Chinese text features; image features are then extracted with a convolutional neural network and quantized with K-means clustering. Finally, CCA is used to retrieve between images and text: it bridges the heterogeneity of the underlying multimedia data while retaining the correlation between the variables, yielding cross-modal retrieval results for questions and answers. Once the correlation between the two modalities is established, image and text features are mapped into the same feature space, where the similarity of feature vectors can be measured directly, implementing multi-modal retrieval alongside document retrieval. Experimental results show that the CCA-based cross-modal retrieval method can improve the accuracy of answer retrieval in Chinese CQA communities.
Keywords
Chinese CQA, Cross-modal retrieval, CCA, Semantic abstraction.
DOI
10.12783/dtcse/cmsms2018/25219