Social networking sites let users submit a spam report for a given message, indicating that the message is inappropriate. In this paper we present a framework that uses these user spam reports for spam detection. The framework is based on the HITS web link analysis algorithm and is instantiated in three models. The models successively introduce propagation between messages reported by the same user, messages authored by the same user, and messages with similar content. Each of the models can also be converted to a simple semi-supervised scheme. We test our models on data from a popular social network and compare them to two baselines, based on message content and raw report counts. We find that our models outperform both baselines and that each of the additions (reporters, authors, and similar messages) further improves the performance of the framework.
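
To make the HITS idea concrete, here is a minimal sketch of mutual reinforcement between reporters and messages: a message scores higher when it is reported by high-scoring reporters, and a reporter scores higher when they report high-scoring messages. This is plain HITS on a bipartite report graph, not the paper's exact models. The function name `hits_spam_scores`, the input format (pairs of reporter ID and message ID), and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict
from math import sqrt


def _normalize(scores):
    """Scale a score dict to unit Euclidean length (no-op if all zero)."""
    norm = sqrt(sum(v * v for v in scores.values())) or 1.0
    return {k: v / norm for k, v in scores.items()}


def hits_spam_scores(reports, iterations=50):
    """Illustrative HITS-style scoring (hypothetical helper, not from the paper).

    reports: iterable of (reporter_id, message_id) pairs.
    Returns (message_score, reporter_score) dictionaries.
    """
    by_reporter = defaultdict(set)  # reporter -> messages they reported
    by_message = defaultdict(set)   # message -> reporters who reported it
    for reporter, message in reports:
        by_reporter[reporter].add(message)
        by_message[message].add(reporter)

    reporter_score = {r: 1.0 for r in by_reporter}
    message_score = {m: 1.0 for m in by_message}

    for _ in range(iterations):
        # A message's score is the sum of the scores of its reporters.
        message_score = _normalize(
            {m: sum(reporter_score[r] for r in rs) for m, rs in by_message.items()}
        )
        # A reporter's score is the sum of the scores of the messages they reported.
        reporter_score = _normalize(
            {r: sum(message_score[m] for m in ms) for r, ms in by_reporter.items()}
        )
    return message_score, reporter_score


# Toy data (made up): m2 and m3 each have a single report, but m2's reporter
# also reported the heavily reported m1.
reports = [("u1", "m1"), ("u2", "m1"), ("u3", "m1"), ("u1", "m2"), ("u4", "m3")]
message_score, reporter_score = hits_spam_scores(reports)
ranking = sorted(message_score, key=message_score.get, reverse=True)
print(ranking)
```

On this toy input, a raw report count cannot separate `m2` from `m3`, since each has exactly one report. The propagated scores rank `m2` above `m3` because its reporter is corroborated by other users. Extending the graph with author and content-similarity links, as the abstract describes, would require additional edge types that this sketch does not attempt.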

  • M. Bosma, E. Meij, and W. Weerkamp, “A Framework for Unsupervised Spam Detection in Social Networking Sites,” in ECIR ’12, 2012.
    BibTeX:
    @inproceedings{ECIR:2012:bosma,
      Author = {Bosma, Maarten and Meij, Edgar and Weerkamp, Wouter},
      Booktitle = {ECIR '12},
      Date-Added = {2011-11-23 18:10:33 +0100},
      Date-Modified = {2011-11-23 18:12:12 +0100},
      Title = {A Framework for Unsupervised Spam Detection in Social Networking Sites},
      Year = {2012}}