Tuesday, December 29, 2009

TwiGUARD update

Several months ago we announced TwiGUARD, a project researching how hackers spread malware and spam via Twitter. We believe that defenses like SafeBrowse (Google's Safe Browsing feature, which warns you when a URL is malicious) react too slowly. Today we are starting an experiment to test this; we'll post the results in two months.

Social networking sites are the new front in the computer virus war. Previously, users would check a webpage (such as CNN or Slashdot) only once a day. Now, users check Twitter or Facebook several times an hour. (I am a good example of this, checking Twitter every ten minutes throughout the day). This means a piece of malware can spread quickly among Twitter users, faster than a security mechanism (like SafeBrowse, or updates to virus signatures) can respond.

Google's SafeBrowse is based on its search-engine spider. When the spider comes across a site distributing malware, Google adds that site to a blacklist. Browsers like Firefox download blacklist updates every 30 minutes. When a user innocently clicks a link to one of the blacklisted sites, Firefox displays a warning instead of loading the page.
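A rough sketch of that client-side check, in Python. This is illustrative, not the real protocol: the actual Safe Browsing client canonicalizes URLs and compares hash prefixes against a locally cached list, then confirms hits with the server before warning the user. The class and URLs below are made up for the example.

```python
import hashlib

def url_prefix(url):
    """4-byte hash prefix of a URL (canonicalization omitted for brevity)."""
    return hashlib.sha256(url.encode("utf-8")).digest()[:4]

class BlacklistChecker:
    """Hypothetical local blacklist of hash prefixes, refreshed periodically."""

    def __init__(self, bad_urls):
        self.prefixes = {url_prefix(u) for u in bad_urls}

    def is_suspicious(self, url):
        # A prefix hit only means "possibly bad"; the real client asks the
        # server for full hashes before actually showing the warning page.
        return url_prefix(url) in self.prefixes

checker = BlacklistChecker(["http://evil.example/malware.exe"])
print(checker.is_suspicious("http://evil.example/malware.exe"))  # True
print(checker.is_suspicious("http://example.com/"))              # False
```

Storing only short hash prefixes keeps the downloaded list small, which matters when every browser is fetching updates every half hour.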

There is a race between how fast hackers can distribute malware on Twitter, and how fast Google's spider can find the malicious sites, update the blacklist, and distribute that list to browsers.

We have devised an experiment to test this speed. We downloaded all the tweets from yesterday (December 28, 2009) that contained URLs, and saved them to a file. This file contains half a million (504,489) URLs.
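As a sketch of that collection step, pulling the URLs out of tweet text is a one-regex job (the sample tweets and the simple matcher here are illustrative; real tweets mostly carry shortened links through services like bit.ly or tinyurl):

```python
import re

# Simple matcher: anything starting http:// or https:// up to whitespace.
URL_RE = re.compile(r"https?://\S+")

def extract_urls(tweets):
    """Pull every URL out of a list of tweet texts, in order."""
    urls = []
    for text in tweets:
        urls.extend(URL_RE.findall(text))
    return urls

tweets = [
    "check this out http://bit.ly/abc123",
    "two links: http://tinyurl.com/xyz and http://example.com/page",
    "no link here",
]
urls = extract_urls(tweets)
print(len(urls))  # 3
```

Run over a day's worth of tweets, the same loop yields the half-million-URL file described above.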

After downloading the list, we ran it through Google's SafeBrowse. It flagged about a thousand (1,250) of those URLs as bad.

Next we are going to wait a week and run the same list of URLs through SafeBrowse again. We expect that Google will have found more of them to be bad, doubling or tripling the count. We will run the December 28 list through SafeBrowse every week for the next two months, and we should see a steady rise in the number of URLs that SafeBrowse flags as bad.
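The weekly re-scan boils down to a small bookkeeping loop. In this sketch the checker is stubbed out with a growing in-memory blacklist so the shape is clear; in the real run, `is_bad` would be a SafeBrowse lookup against the same frozen December 28 list each week:

```python
def count_flagged(urls, is_bad):
    """Run the fixed URL list through a checker and count the hits."""
    return sum(1 for u in urls if is_bad(u))

# Frozen list of URLs collected on day one (placeholder strings here).
frozen_list = ["u%d" % i for i in range(100)]

# Stub: pretend the blacklist grows each week. Real runs would query
# SafeBrowse instead of this local set.
weekly_counts = []
for blacklist_size in (10, 20, 35):
    blacklist = set(frozen_list[:blacklist_size])
    weekly_counts.append(count_flagged(frozen_list, lambda u: u in blacklist))

print(weekly_counts)  # [10, 20, 35]
```

The point of holding the URL list fixed is that any rise in the weekly count measures only how long SafeBrowse takes to catch up, not changes in what is being tweeted.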

While we have done this informally in the past, this is the first time we are tracking the results. We'll post them in two months.


Matthew Wollenweber said...

I'm curious: how do you download all Twitter messages? Do you have to crawl the whole Twitter site, or is there a simpler mechanism?

Anonymous said...

What if URLs truly were safe at the start of the test period, but were 'infected' before the end?

What if Google makes improvements to SafeBrowse during the test period, resulting in newly detectable malicious URLs? (That would be a function of detection capability, not of time, which is what you are testing.)

These things might skew your data. What about a shorter test period to mitigate these possibilities?

Anonymous said...

Any chance these would be run through other checks like Firefox, IE or McAfee SiteAdvisor? I'm not sure if these other checks can be automated.