Trackback Spam

Someone is attempting to spam my blog pretty heavily through trackbacks. This is stupid. It’s never going to work since I have so few comments here anyhow that I moderate all of them. It only serves to annoy me by generating an email saying that I have comments to moderate. My original thought was to block inbound packets from the offenders. It would be pretty simple. Add a table to my firewall config and then use a little shell script magic (grep, awk etc) to pull the addresses from the log an load them into the table. But a quick pass through sort confirmed that these attacks were coming from mostly different addresses. For now I’ve disabled trackbacks by disabling the php file that supports them. Next up change my mail filter to quarantine the emails appropriately. Sigh.

Keeping up to date with FreeBSD

I use FreeBSD for nearly anything that needs a server. It’s got quite a bit to offer. Anyone who actually knows FreeBSD knows that it’s dead simple keep up to date. I’ve used this basic technique for several years.  The steps are pretty simple and can be found here with additional instructions for dealing with multiple machines here.  I’ve pretty much followed this method for years including updating a machine located in a remote close with an NFS mounted /usr/{src,obj} over an IPSEC link. I recently added a new wrinkle that I think is pretty cool. My build box is an HP/Compaq DL360 with hardware RAID. I’ve pretty much standardized on this hardware. A client clued me into this simple technique. Long story short he had to move a data center from Chicago to CT and he chose to do it by stocking up on spare RAID drives. He cloned a server by pulling a working drive from a working server and replacing it with a spare. He shipped the pulled drive via courier to the new data center. Installed it in the correct slot on the same kind of server chassis booted the new clone up. At this point the clone server now saw it’s drive array missing the other drive. He was obviously mirroring. On insertion of the new drive the clone server hardware did an automatic rebuild. I apply the same technique to FreeBSD. I build a new server from the lastest snapshot then go through the source update process. Next I pull one of the drives and put the pull aside replacing it with a spare.  Et voila. Now I have a save point on that Drive. I can install it in the server alone an reboot. and I’m right at the point where I have built and installed the world and have just finished running mergemaster. This is an excellent starting point for building a fresh server.

Should I use SPF?

Should I use SPF? What is SPF? Will SPF reduce the amount of spam that I get to my domain? There’s a lot of talk about SPF as a means of preventing spam these days and though it was originally designed to do that I’d have to put it down as a miserable failure at spam prevention. Does that mean that you shouldn’t use it? The juries still out on that one.

SPF stands for Sender Policy Framework. It’s a means of specifying where mail from a specific domain will come from. The implementation is an ugly kludge that overloads DNS TXT records. Through the overloaded records a remote server which is receiving a mail that claims to be from your domain can determine if it is real or a forgery… …maybe. It turns out that mailing lists that forward mail without rewriting the envelope headers as well as older mailers like mutt which still have a bounce forwarding feature will most likely false positive (e.g. receive an SPF fail).

SPF probably won’t reduce the amount of spam you receive. In fact don’t be surprised if you start to receive or are currently receive quite a bit of spam that passes the SPF tests. Many of the “more reputable” spammers, the ones sending spam using real mailservers and not hijacked windows machines on a botnet, use SPF to fool your spam filter into thinking that a piece of spam is a legitimate mail.

Should you use SPF? That depends on what you want. If you want a spam free inbox then look elsewhere. Under no circumstances should you ever assume that a piece of mail that is marked SPF fail is spam. In fact you are better off ignoring it when you test to determine the level of spammyness of your inbound mail stream. SPF does have one big positive and this is big enough that I recommend that people use it if they can.

It turns out that if you do enable SPF on your domain then spammers will no longer be able to forge messages from you as spam. To the postmaster of the domain this means that a large chunk of the those bounce messages from tens of thousands of people who don’t exist don’t get sent to you. You know what I mean the mail from that says that mail from to failed because there is no drlizardo128341. If you enable SPF the smarter spammers will not put your domain into the from field of an outgoing spam since then people can tell it’s a forgery.

Lil’ Bobby Tables

A while back I found xkcd. I found it through a mailing list but this post grabbed my attention. Here’s a recent comic which made me think about the way that popular web applications use SQL databases. If you don’t get it the
Mom in the comic executed a classic SQL injection attack against her son’s school. (BTW I didn’t want to spoil the joke. You can see the previous sentence by highlighting it with your mouse.) In any case this is a common attack method against many current web applications and it shows just how naively many of the programmers are of SQL in general. There are two practical ways to defend against this attack. The most obvious is to validate your SQL before you pass it to the database engine. One that requires a little more thought would be to disallow the web database user from being able to DROP TABLES in the first place. Any real web application should expect a database with at least two users, root, or dba and webuser, or www. Root should be allowed to do anything to the database but his credentials need to be protected. If your web application has grown to the point where you’ve split your database server from your webserver for performance purposes. Allowing the root or dba level of access from localhost only is a good start. Webuser should be able to SELECT on your applications tables. He should be able to INSERT, UPDATE, and DELETE on a limited subset of tables as your database will allow. He may need to be able to CREATE a temporary table and possible DROP the same but that’s a job that’s really better done by a stored procedure. E.g. create a stored procedure that does the needed manipulation and then allow the webuser the privilege to call the stored. Obviously what you can do here depends on your database. I know that Postgresql can grant these very fine grained security settings. If I recall correctly MySQL is a little more course but is still workable.

Hylafax: Ugh

I tried to setup hylafax today. I had it going a few years ago. I even had a neat hack where I would have it take all inbound faxes, convert them into pdf and store them in directory accessible from the web. It was pretty cool. I figured I’d re-create that and maybe add some Python-Fu to have an outbound directory but alas it wasn’t meant to be. I ran around in circles for three hours trying to eliminate the problems but got no-where until I installed efax and right off the bat the fax just worked. That eliminated the problems of (The modem broke between last time and now, the modem doesn’t like the VoIP line, and my new HP A-I-O doesn’t like the fax modem) leaving Hylafax is misconfigured. Here we go again, another mailing list……

More stuff about Postgresql that should be obvious.

I’ve been scratching my head on this one for far too long. I have a query under Postgresql which retrieves the distance between too points given the knowledge of their zipcodes. This work because I have an incomplete table of mileages between arbitrary three digit zipcode pairs. Each time I use this table my queries take a long time and I could never understand why. It has to do with the type that postgresql assigns to computed text fields. I was doing something like this:

SELECT * FROM worklist INNER JOIN partial_zipcode_mileage ON SUBSTRING(worklist.origin_zipcode, 1, 3) = partial_zipcode_mileage.origin_partial_zipcode...

The issue here is the type of the expression: SUBSTRING(worklist.origin_zipcode, 1, 3) as compared to the type of the field partial_zipcode_mileage.origin_partial_zipcode. The latter is a SQL CHAR(3) since it will always hold 3 characters. Postgresql assignes the first expression a type of TEXT since it has know way to know how bit a field you actually want. This prevents the postgresql engine from using the index and this, my query takes along time. Substitute SUBSTRING(worklist.origin_zipcode, 1, 3)::char(3) in the statement and all is happy.

Why math is important.

While I was sleeping we seemed to forget how to do math. This guy was quoted a rate of 0.002 cents per KB to use his Verizon Wireless Data Card while roaming in Canada. When he got the bill they charged him 0.002 dollars/KB. His story is here. What makes it sad is that the verizon customer service people don’t understand the difference and continue to quote him the lower rate while insisting that the charge on the bill is correct. All of this would be a non-issue if the marketing weasels at Verizon would just fess up to the fact that their price for roaming data is $2.05 / MB.

Exception trapping in pl/pgsql.

Learned a new trick with postgresql stored procedures today. This will probably appear to be obvious but it’s new to me. You can do exception trapping in pl/pgsql by you can also ignore some errors. The form is:

WHEN undefined_table THEN NULL;
RAISE NOTICE, ('Notice: ' || SQLSTATE || ' - ' || SQLERRM);

Most MTA’s should offer SSL/TLS

Most MTA’s should offer Opportunistic TLS by default

I think that the time has come for most SMTP MTA servers to offer STARTTLS session protection by default. I see two reasons for doing this. Firstly, it takes a short amount of extra time and a little more CPU horsepower and that’s a resource that spammers cannot control. Secondly, opportunistic TLS brings email security a little more in line with the security model that most users expect.


The majority of spammers out there are relying on stealing CPU time on machines that they don’t own. I don’t see them moving to TLS at the client side anytime soon. On the other hand legitimate email senders usually aren’t sending mail in such bulk that the cost of encrypting the session would be an onorous penalty. The practical end result of this would be differentiation of mail at the inbox. We would get mail from servers that used TLS to encrypt the session and mail from servers that didn’t. Assuming that the MTA server flagged the mail on this axis by adding a header, the end result is a hook for a statistical spam filter to use.

Users expectation

The second advantage would be a little added security in email during the transport from client MTA to server MTA. If everyone adopted opportunistic TLS encryption of the wire then sending email would better approximate the users expectations for security. Compared to physical mail email without TLS is like sending a postcard. No one sends postcards where security is a requirement because it’s obvious that everyone between the point where you drop the mail in the postbox and  the point of delivery  can just read the mail. Most users don’t expect that this is the case with email right now.

The advantage of opportunisticaly encrypting the mail is that we have a situation that we can grow into. If some server doesn’t do TLS in transport the mail still gets delivered.

Greylisting + MS Exchange 2003

I’m using greylisting to filter spam. It works quite well. If you aren’t of the technique this is how it works. Greylisting filters spam by testing the RFC compliance of the server that is trying to send mail to you. RFCs 2821 and 821 describe the meat and potatoes of sending email on the internet. The RFCs both specify that the receiver may tell the sender to queue the message and retry later because the receiver is temporarily out of resources. Greylisting exploits this to sift spam from legitimate email because many Spam sending programs cannot queue mail. As a method of spam detection Greylisting is great because it takes almost no resources on the receiving side to filter. Other methods of filtering are not so resource friendly. I find that Greylisting is rejecting over my inbound 90% of the spam. I used to say that it did this with 0 false positives but after reading these two threads I’m not so sure:

Leave it to Microsoft to rain on the parade.

I’m not going to stop Greylisting. It’s just been too effective at spam removal for me to even consider going without. I’m also aware of several people who are using Exchange to contact me who have not run across this problem. For me the solution to this potential problem is to contact some of the people who I know that are running Exchange and see what their awareness is on this problem.