Online child grooming detection has recently attracted intensive research interest from both the machine
learning and digital forensics communities due to
its great social impact. Existing data-driven approaches
usually face two challenges: a lack of training data and
uncertainty of class membership around the classification or decision
boundary. This paper proposes a grooming detection approach in
an effort to address such uncertainty, using a data set derived
from a publicly available profiling data set. In particular, the
approach first applies a conventional text feature extraction
method to identify the most significant words in the data
set. This is followed by a fuzzy-rough feature
selection step that reduces the high dimensionality of the selected
words for fast processing while at the same time addressing
the uncertainty of class boundaries. The experimental results
demonstrate the efficiency and efficacy of the proposed approach
in detecting child grooming.
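
The two-stage pipeline described above can be illustrated with a minimal sketch. It is not the implementation used in this work: it assumes TF-IDF as the conventional text feature extraction step, and a greedy QuickReduct-style search over a fuzzy-rough dependency degree (similarity 1 - |a(x) - a(y)|, minimum t-norm, crisp class labels) as the feature selection step. The toy messages, labels, and the names `tfidf_matrix`, `dependency`, and `quickreduct` are hypothetical and exist only for illustration.

```python
import math
from collections import Counter


def tfidf_matrix(docs, top_k):
    """Keep the top_k words by mean TF-IDF; return (vocab, scaled matrix)."""
    tokenised = [d.lower().split() for d in docs]
    n = len(tokenised)
    df = Counter(w for toks in tokenised for w in set(toks))
    idf = {w: math.log(n / df[w]) + 1.0 for w in df}
    rows = []
    for toks in tokenised:
        tf = Counter(toks)
        rows.append({w: (c / len(toks)) * idf[w] for w, c in tf.items()})
    # Rank words by their mean TF-IDF weight (ties broken alphabetically).
    score = {w: sum(r.get(w, 0.0) for r in rows) / n for w in df}
    vocab = sorted(score, key=lambda w: (-score[w], w))[:top_k]
    X = [[r.get(w, 0.0) for w in vocab] for r in rows]
    # Min-max scale each column to [0, 1] so similarities are comparable.
    for j in range(len(vocab)):
        col = [row[j] for row in X]
        lo, hi = min(col), max(col)
        span = hi - lo
        for row in X:
            row[j] = (row[j] - lo) / span if span else 0.0
    return vocab, X


def dependency(X, y, subset):
    """Fuzzy-rough dependency degree of the class labels on a feature subset."""
    if not subset:
        return 0.0
    n = len(X)
    total = 0.0
    for i in range(n):
        # Membership of object i in the fuzzy positive region: one minus its
        # highest similarity to any object of a different class.
        pos = 1.0
        for k in range(n):
            if y[k] != y[i]:
                sim = min(1.0 - abs(X[i][a] - X[k][a]) for a in subset)
                pos = min(pos, 1.0 - sim)
        total += pos
    return total / n


def quickreduct(X, y):
    """Greedily add the feature that most increases the dependency degree."""
    all_features = list(range(len(X[0])))
    target = dependency(X, y, all_features)
    chosen, best = [], 0.0
    while best < target - 1e-12:
        gains = [(dependency(X, y, chosen + [a]), a)
                 for a in all_features if a not in chosen]
        gamma, a = max(gains, key=lambda t: (t[0], -t[1]))
        if gamma <= best:  # no remaining feature improves the dependency
            break
        chosen.append(a)
        best = gamma
    return chosen, best


# Hypothetical toy data: 1 = suspicious message, 0 = ordinary message.
docs = [
    "keep this secret from your parents",
    "do not tell anyone this is our secret",
    "the homework is due tomorrow morning",
    "see you at football practice tomorrow",
]
labels = [1, 1, 0, 0]

vocab, X = tfidf_matrix(docs, top_k=8)
chosen, gamma = quickreduct(X, labels)
print("selected words:", [vocab[a] for a in chosen], "dependency:", gamma)
```

The selected subset is the one that the greedy search finds to preserve the dependency degree of the full word set; on real data the similarity relation, t-norm, and stopping rule would need to match the fuzzy-rough variant actually adopted.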