Yuan Niu
Department of Computer Science
University of California, Davis
Davis, CA 95616
http://midgard.cs.ucdavis.edu/~niu
A Look at Web Spam

Problem Overview
Web spam is a prevalent problem. Publishing and contributing to the Web is increasingly easy, and spammers, motivated by money, want to attract users to their sites. High visibility in search results facilitates this and lends a false air of legitimacy to the spammer's page. To gain that visibility, spammers create many doorway pages hosted by free services and publicize the URLs of these doorway pages through comment spam. Their goal in doing so is not to trick people, but rather to defeat search engine ranking algorithms. In some cases, spammers use these techniques to lead visitors to malicious websites that try to exploit vulnerabilities on the visitor's machine.

Types of Spam on the Web

Splog
Splog = Spam Blog. These are fake blogs set up by spammers to fool search engines. Many of them act as doorway URLs that redirect visitors to another website, which may either continue the redirection or be the page being promoted. Splogs, like other spam pages, may also be used to form link farms.
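The redirection chain described above can be illustrated with a minimal sketch. It assumes the redirections have already been observed and stored in a dictionary; the function name `follow_redirects`, the `redirects` mapping, and the `max_hops` limit are hypothetical names chosen for illustration, and a real crawler would have to record each hop from HTTP responses or script-driven navigation.

```python
from typing import Dict, List


def follow_redirects(start: str, redirects: Dict[str, str], max_hops: int = 10) -> List[str]:
    """Return the list of URLs visited, beginning with `start`.

    `redirects` is a hypothetical stand-in for observed redirections
    (doorway URL -> next URL). No network access is performed here.
    """
    chain = [start]
    seen = {start}
    current = start
    for _ in range(max_hops):
        nxt = redirects.get(current)
        if nxt is None or nxt in seen:
            # Either the current page does not redirect (it may be the
            # promoted page), or the chain loops back on itself.
            break
        chain.append(nxt)
        seen.add(nxt)
        current = nxt
    return chain
```

The last element of the returned list is the page where the chain stopped, which may be the page being promoted or simply the point where the hop limit was reached.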
Sping
Sping = Spam Ping. These are pings from a spammer's site, usually a splog, to an aggregator informing it of updates to the spammer's site.
Comment Spam
These are comments left in forums, guestbooks, and blogs that contain URLs promoting the spammer's page. Sometimes these comments contain scripts that redirect a visitor automatically, in a sense hijacking the comment page. Comment spam can be human or machine generated.
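As a rough illustration of the two features mentioned above (promoted URLs and redirection scripts), the following sketch scans a comment's HTML for anchor links and script tags using Python's standard `html.parser` module. The names `CommentScanner` and `scan_comment` are hypothetical, and the presence of a link or script is only a signal worth inspecting, not proof that a comment is spam.

```python
from html.parser import HTMLParser


class CommentScanner(HTMLParser):
    """Collects anchor hrefs and notes whether a <script> tag appears."""

    def __init__(self):
        super().__init__()
        self.has_script = False
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before this call.
        if tag == "script":
            self.has_script = True
        elif tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def scan_comment(comment_html):
    """Return the links and script flag found in one comment (hypothetical helper)."""
    scanner = CommentScanner()
    scanner.feed(comment_html)
    scanner.close()
    return {"has_script": scanner.has_script, "links": scanner.links}
```

A forum could run something like `scan_comment` on each submitted comment and hold flagged comments for review rather than rejecting them outright.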
Trackback Spam
Trackbacks are usually machine generated, though forms do exist for manual trackbacks. A trackback is a URL, specified by user B at the time of posting, that points to a page on site A. Trackbacks rely on HTTP requests to generate what looks like a comment on site A containing a link to person B's webpage. This function has been heavily abused by spammers.
Pingback Spam
Pingbacks are also machine generated. However, unlike trackbacks, a pingback from site A to site B requires that a link to site B exist on site A. Some suggest the exclusive use of pingbacks as a way to deter trackback spam.
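The link requirement can be sketched as a check that site B might run before accepting a pingback. This is a minimal sketch under stated assumptions: the source page's HTML is passed in as a string (a real receiver would first fetch it from site A), the names `LinkCollector` and `source_links_to_target` are hypothetical, and exact string comparison of URLs is a simplification.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Gathers the href value of every anchor tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def source_links_to_target(source_html, target_url):
    """Return True if the source page contains a link to target_url.

    Assumption: URLs are compared as exact strings; a real check would
    likely normalize them first.
    """
    collector = LinkCollector()
    collector.feed(source_html)
    collector.close()
    return target_url in collector.hrefs
```

A pingback whose source page fails this check would be discarded, which is the property that makes pingbacks harder to abuse than trackbacks.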
Wiki Spam
Spammers take advantage of the open nature of wikis and create pages often containing redirection scripts as well as

Publications
A Quantitative Study of Forum Spamming Using Context-based Analysis.
Yuan Niu, Yi-min Wang, Hao Chen, Ming Ma, Francis Hsu.
Proceedings of the 14th Annual Network and Distributed System Security Symposium (NDSS2007).

Spam Double-Funnel: Connecting Web Spammers with Advertisers.
Yi-min Wang, Ming Ma, Yuan Niu, and Hao Chen.
Proceedings of the 16th International World Wide Web Conference (WWW2007).

iPhish: Phishing Vulnerabilities on Consumer Electronic Devices.
Yuan Niu, Francis Hsu, and Hao Chen.
Proceedings of the Usability, Psychology, and Security Workshop (UPSEC '08).