Detection of child exploiting chats from a mixed chat dataset as text classification task

Date

2011

Publisher

Australian Language Technology Association

Abstract

Detection of child exploitation in Internet chatting is an important issue for the protection of children from prospective online paedophiles. This paper investigates the effectiveness of text classifiers in identifying Child Exploitation (CE) in chatting. As chatting occurs between two or more users typing text, the text of the chat messages can be used as the data to be analysed by text classifiers. The problem of identifying CE chats can therefore be framed as a text classification problem: categorizing the chat logs into predefined CE types. Along with three traditional text categorization techniques, a new approach is taken to accomplish the task. Psychometric and categorical information from LIWC (Linguistic Inquiry and Word Count) is used, and an improvement in performance is found for some classifiers. For the experiments in the current research, the chat logs are collected from various websites open to the public. Classification-via-Regression, J48 Decision Tree and Naïve Bayes classifiers are used. A comparison of the performance of the classifiers is shown in the results. (Author Abstract)

Keywords

child abuse, cyber solicitation, online solicitation, grooming, investigation, International Resources, research

Citation

Rahman Miah, M. W., Yearwood, J., & Kulkarni, S. (2011, December). Detection of child exploiting chats from a mixed chat dataset as text classification task. In Proceedings of the Australian Language Technology Association Workshop (pp. 157-165).
