Abstract:
The volume of child sexual abuse materials (CSAM) created and shared daily on both
surface web platforms such as Twitter and dark web forums is very high ([1]). Given this
volume, it is not viable for human experts to intercept or identify CSAM manually.
However, automatically detecting and analysing child sexual abuse language in online
text is challenging and time-intensive, mostly due to the variety of data formats and
privacy constraints of hosting platforms. We propose a CSAM detection intelligence
algorithm based on natural language processing and machine learning techniques ([2]).
Our CSAM detection model can be used not only to remove CSAM from online platforms, but
also to help determine perpetrator behaviours, provide evidence, and extract new
knowledge for hotlines, child agencies, education programs and policy makers.