Spam strikes again (Re-spam raid)


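▲ This is the new breed of spam
This time a kind of spam that sends a fixed-form message has appeared. I have named it "prize spam".
Its hallmarks: it sends text accepted by the regular expression below, it follows no one, and it uses the default icon.
At first I matched it with ^\@(.+?) Hey \1\., but partway through the spammers changed their format and started addressing people by name instead...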

m|^\@\w+? Hey .+?\. You have been chosen for a prize! http://tinyurl\.com/\w{7}$|
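To see why the earlier backreference pattern stopped working, here is a minimal sketch that runs both patterns against a few sample tweets. The sample texts and the helper name `check` are made up for illustration and are not taken from real spam; the dot in `tinyurl\.com` is escaped here so that it only matches a literal dot.

```perl
use strict;
use warnings;

# Early pattern: the greeting repeats the mentioned screen name (backreference \1).
my $early = qr|^\@(.+?) Hey \1\.|;
# Later pattern: the greeting may be any name, followed by the fixed sentence.
my $later = qr|^\@\w+? Hey .+?\. You have been chosen for a prize! http://tinyurl\.com/\w{7}$|;

# Hypothetical sample tweets, invented for this example.
my %samples = (
    id_greeting   => '@alice Hey alice. You have been chosen for a prize! http://tinyurl.com/abc1234',
    name_greeting => '@alice Hey Alice Smith. You have been chosen for a prize! http://tinyurl.com/abc1234',
    innocent      => '@alice Hey, I won a prize! at the fair',
);

# Returns a hash ref saying which of the two patterns matched the text.
sub check {
    my ($text) = @_;
    return {
        early => ($text =~ $early ? 1 : 0),
        later => ($text =~ $later ? 1 : 0),
    };
}

for my $name (sort keys %samples) {
    my $r = check($samples{$name});
    printf "%-14s early=%d later=%d\n", $name, $r->{early}, $r->{later};
}
```

The sample that greets by screen name matches both patterns, the one that greets by display name matches only the later pattern, and the unrelated tweet matches neither.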

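Because the text is fixed-form, I guessed that the Stream API could handle it.
When I tried the Stream API with "prize!", almost everything it returned was posts related to this spam. So:

  1. Capture tweets with the Stream API's track parameter set to "prize!"
  2. Check whether each tweet matches either of two regular expressions
  3. If it matches, report the account for spam and tweet about it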

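This should reduce the damage done by the spam.
The nice thing about this approach is that it can pick up spam from posts all over the world.

Future work: capture retweeted spam more flexibly and block it; support OAuth authentication so this can be offered as a service; port it to Erlang for better fault tolerance.

The code is below. Run it as ./spam.pl TwitterID Password
When it catches a spammer, it is designed to tweet "report spamしました^q^" ("reported for spam ^q^"). (heh)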

~~~ English from here ~~~

This time, spam accounts that send fixed-form messages have appeared on Twitter. I named them 'prize spam'.
Their characteristics: their tweets can be captured by the regular expression I wrote, they do not follow anyone, and they use the default icon.
At first my regular expression was "^\@(.+?) Hey \1\.", but the spammers later changed their format to address users by name rather than by ID, so that pattern stopped working.
Because their tweets are fixed-form, I expected that I could capture them with the Stream API.
When I used the Stream API with the track word "prize!", most of the tweets I got were about this spam. Thus...

  1. Capture tweets containing "prize!" with the Twitter Streaming API (track parameter).
  2. Check whether each captured tweet matches either of two regular expressions.
  3. If it matches, report the account for spam and post a tweet saying so.
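Steps 2 and 3 hinge on deciding which account to report. The following is a rough sketch of that decision as a pure function, separated from the network calls so it can be tried offline. The function name `spammer_for` and the sample tweets are hypothetical; the two patterns are the ones used in spam.pl, with the dot in `tinyurl\.com` escaped.

```perl
use strict;
use warnings;

# Returns the screen name to report, or nothing if the text is not prize spam.
# $author is the screen name of the account that posted the tweet.
sub spammer_for {
    my ($author, $text) = @_;

    # Pattern 1: the tweet itself is the spam, so its author is the spammer.
    return $author
        if $text =~ m|^\@\w+? Hey .+?\. You have been chosen for a prize! http://tinyurl\.com/\w{7}$|;

    # Pattern 2: someone retweeted (RT/QT) the spam; the quoted account,
    # captured in $2, is the spammer rather than the retweeter.
    return $2
        if $text =~ m%(RT|QT) \@(\w+?):? \@\w+? Hey .+?\. You have been chosen for a prize!%;

    return;
}

# Hypothetical examples, invented for illustration.
my $direct = spammer_for('spam123',
    '@alice Hey Alice. You have been chosen for a prize! http://tinyurl.com/abc1234');
my $retweet = spammer_for('carol',
    'RT @spam123: @alice Hey Alice. You have been chosen for a prize! http://tinyurl.com/abc1234');
my $clean = spammer_for('dave', 'I won a prize! at the raffle');

print "direct:  $direct\n";
print "retweet: $retweet\n";
print "clean:   ", (defined $clean ? $clean : '(not spam)'), "\n";
```

Keeping the matching apart from the reporting makes it possible to check the patterns against saved tweets before pointing the script at a live account.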

This should reduce the damage caused by the spam.
The advantage of this approach is that it can capture spam from tweets all over the world.

Future tasks: capture retweeted spam more flexibly and block the spam accounts, support OAuth so this can be offered as a service, and port it to Erlang to improve fault tolerance.
The code follows.
Usage: ./spam.pl TwitterID Password

spam.pl

#!/usr/bin/perl
use strict;
use warnings;
use Jcode;
use utf8;
use Net::Twitter;
use AnyEvent::Twitter::Stream;

binmode STDOUT, ":utf8";

# Set bot information
my($user, $password) = @ARGV;
my $twit = Net::Twitter->new(
		username => $user,
		password => $password
	);
my $done = AnyEvent->condvar;
my %args = (track => 'prize!');

my $streamer = AnyEvent::Twitter::Stream->new(
	username => $user,
	password => $password,
	method   => 'filter', %args,
	on_tweet => \&got_tweet,
	on_error => sub {
		my $error = shift;
		warn "ERROR: $error";
		$done->send;
	},
	on_eof   => sub {
		$done->send;
	},
);
$done->recv;

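# Called for each tweet delivered by the Streaming API
sub got_tweet {
	my $tweet = shift;
	# Skip any stream message that has no text
	return unless defined $tweet->{text};
	print "$tweet->{user}{screen_name}: $tweet->{text}\n";
	if($tweet->{text} =~ m|^\@\w+? Hey .+?\. You have been chosen for a prize! http://tinyurl\.com/\w{7}$|) {
		# The tweet itself is the spam, so report its author
		print "$tweet->{user}{screen_name} is SPAM !!!!!!!!!\n";
		# eval keeps a failed API call from killing the stream.
		# The tweet text means "reported @user for spam ^q^".
		eval {
			$twit->update("\@$tweet->{user}{screen_name} をreport for spamしました^q^");
			$twit->report_spam($tweet->{'user'}{'id'});
			1;
		} or warn "report failed: $@";
	} elsif ($tweet->{text} =~ m%(RT|QT) \@(\w+?):? \@\w+? Hey .+?\. You have been chosen for a prize!%) {
		# A retweet (RT/QT) of the spam: report the quoted account, not the retweeter
		my $spammer = $2;
		print "$spammer is SPAM !!!!!!!!!\n";
		eval {
			$twit->update("$spammer をreport for spamしました^q^");
			$twit->report_spam($spammer);
			1;
		} or warn "report failed: $@";
	}
}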