Protecting your website from spam

Several decades ago, a certain group of well-rounded British comedians coined an unforgettable catchphrase as part of their repertoire of surreal humour – “Spam”.

Back firmly in the present, the unsolicited emails that bear the same name cause a serious loss of business productivity: employees, directors, the self-employed and IT support teams all have to devote valuable time to cleaning out their inboxes and safeguarding them from the barrage of unwanted messages they receive daily.

If you’ve just registered a shiny new domain name, and are looking to develop (or have developed for you) a website to communicate your message to a global audience, you should be aware that one of the key battlegrounds in the “war on spam” lies in the methods employed on that website which allow your visitors to contact you.

A bit of history…

In the early days of the Web, designers and site authors quickly set about offering “mail me” links in their pages as a means of getting their visitors to contact them. Clicking one of these opened the visitor’s mail client software with the delivery address already entered (a subject line could be specified in the same way). These links used the “mailto:” scheme in the href attribute of the <a> (or “anchor”) HTML tag (HTML stands for HyperText Markup Language, and is the lingua franca of website design). A typical link, shown here with a placeholder address, looked like this: <a href="mailto:you@example.com?subject=Enquiry">Mail me</a>

In the page itself, visitors would see only the clickable link text, “Mail me”.

Unfortunately, these “naked” addresses sat in plain view in the source of every page, and unscrupulous individuals and organisations realised that if they could collect them, they would hold huge databases of email addresses. These could then be bombarded with unsolicited messages attempting to lure recipients into buying everything from pirated software to medication of a very dubious nature, or into acting as go-betweens for “transfers” of huge amounts of money from the supposed former leaders of developing world nations.

Their chosen method of achieving this was to unleash legions of “harvesters” around the Web – these were (and still very much are) automated scripts that would jump from website to website scanning for instances of “mailto” appearing in them, and then reporting back to their masters with details of the email addresses they contained. Once the spammers had collected addresses in this way, they would begin to send out their messages to them. Some unwary recipients attempted to stop the flow of unwanted emails coming to them by replying to the messages, but in many instances this only served to confirm to the senders that the address was “live”, and so fair game for more!
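To see how little effort such a scan takes, here is a minimal sketch in Python, framed as an audit you could run against your own pages to find exposed addresses before a harvester does. The function name find_mailto_addresses and the sample markup are illustrative assumptions, not code taken from any real tool.

```python
import re
from urllib.parse import unquote

# Matches href="mailto:..." and captures the address part, stopping
# before any ?subject=... query string or the closing quote.
MAILTO_PATTERN = re.compile(r'href\s*=\s*["\']mailto:([^"\'?]+)', re.IGNORECASE)


def find_mailto_addresses(html: str) -> list[str]:
    """Return the unique addresses exposed in mailto links, in page order."""
    found: list[str] = []
    for raw in MAILTO_PATTERN.findall(html):
        # A mailto link may list several comma-separated recipients.
        for address in unquote(raw).split(","):
            address = address.strip().lower()
            if address and address not in found:
                found.append(address)
    return found


# Hypothetical page fragment with two exposed addresses.
sample_page = '''
<p>Questions? <a href="mailto:Info@example.com?subject=Hello">Mail me</a></p>
<p>Sales: <a href='mailto:sales@example.com'>sales</a></p>
'''
exposed = find_mailto_addresses(sample_page)
```

Any address this kind of scan can find in your own markup is one a harvester can find just as easily.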

Contact forms – the solution to spam?

Eventually, with the arrival of dynamic Web technologies such as CGI (Common Gateway Interface), ASP (Active Server Pages) and PHP (PHP: Hypertext Preprocessor), many webmasters moved away from the practice of using “mailto” links, and instead set up the now-widespread contact forms on their pages. A typical contact form asks for a name, an email address and a message, and has a button to send them.

Forms allowed the delivery email address to be stored on the web server, and therefore not directly accessible to the spammers and their harvesters. This was a major step forward, but by no means a panacea for the woes of unsolicited mail: it simply forced spammers everywhere to re-evaluate their nefarious strategies.

To begin with, webmasters tried to ensure that the correct information was submitted via contact forms using “validation” routines written in the “client-side” language JavaScript. In theory, error messages would be displayed in the visitor’s browser if there were problems with what they were trying to send (such as fields left blank, or alphabetical characters inserted into telephone number fields), and the message would not be sent to its final destination until the script was satisfied that everything was “present and correct”.

Unfortunately, client-side validation routines could be easily circumvented by the spammers and by the automated “bot” processes they used to hijack the contact forms they came across, since a bot can submit data directly to the server without ever running the page’s JavaScript. A solution to this escalation in the spam “arms race” came in the form of server-side validation, written using technologies such as ASP or PHP. In this instance, the details submitted via contact forms could be analysed on the web server itself, and bad entries could be blocked, provided the correct coding techniques were employed.
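As a rough illustration of server-side validation, here is a minimal sketch in Python rather than ASP or PHP. The field names (name, email, phone, message), the length limit and the function name validate_contact_form are assumptions made for this example, and the patterns are deliberately loose.

```python
import re

# Deliberately loose pattern: something@something.tld, no spaces.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
# Digits, spaces and common phone punctuation only.
PHONE_PATTERN = re.compile(r"^[0-9+() \-]{6,20}$")


def validate_contact_form(fields: dict[str, str]) -> list[str]:
    """Return a list of error messages; an empty list means the form passed."""
    errors = []
    name = fields.get("name", "").strip()
    email = fields.get("email", "").strip()
    phone = fields.get("phone", "").strip()
    message = fields.get("message", "").strip()

    if not name:
        errors.append("Name is required.")
    if not EMAIL_PATTERN.match(email):
        errors.append("A valid email address is required.")
    # The phone number is optional, but must look like one if supplied.
    if phone and not PHONE_PATTERN.match(phone):
        errors.append("Telephone number contains invalid characters.")
    if not message:
        errors.append("Message is required.")
    elif len(message) > 5000:  # assumed limit for this sketch
        errors.append("Message is too long.")
    return errors
```

Because these checks run on the server, they apply to every submission, including those sent by a bot that never loaded the page.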

Server-side technologies could also be harnessed to record the submitter’s IP (Internet Protocol) address, a numeric address that identifies a device, or the network connection it is using, on the Internet. Where bogus information was submitted, the IP address could then be used to assist in tracing the sender.
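Recording the address can be as simple as the Python sketch below. It assumes the web server or framework supplies the sender’s address as a string (in CGI this is the REMOTE_ADDR variable); the function name build_submission_record and the record layout are hypothetical.

```python
import ipaddress
from datetime import datetime, timezone


def build_submission_record(remote_addr: str, form_name: str) -> dict[str, str]:
    """Return a log record tying a form submission to the sender's IP address."""
    # ip_address() raises ValueError if the string is not a valid
    # IPv4 or IPv6 address, so malformed values never reach the log.
    ip = ipaddress.ip_address(remote_addr.strip())
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "ip": str(ip),
        "ip_version": f"IPv{ip.version}",
        "form": form_name,
    }
```

Bear in mind that where a visitor connects through a proxy or shared connection, the address recorded may be that of the intermediary rather than of the visitor’s own device.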

The trouble with HTML (and BBCode)

One common type of spamming “attack” on contact forms is the insertion of HTML and other “tags” (such as those used in BBCode, or Bulletin Board Code, as found on Web-based discussion forums) into their fields. These often come in the guise of links to sites of a dubious nature. An HTML insertion might read <a href="http://example.com/">cheap pills</a>, and its BBCode equivalent [url=http://example.com/]cheap pills[/url]; both are illustrative examples using a placeholder domain.

Thankfully, a correctly coded server-side validation script can block this kind of thing at source. There are other things server-side scripts should look to block. One is spambots giving a made-up sender address at the form’s own domain name, e.g. zyqqqhhs@mydomain.com. Another is carriage returns and new lines inserted into single-line fields such as the name or subject, a trick used to smuggle extra email headers into the message the form sends.
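The checks described above might be sketched in Python as follows. The field names, the list of BBCode tags and the function name find_spam_signs are assumptions for illustration, and the patterns are blunt heuristics rather than a complete filter.

```python
import re

# Anything that looks like an HTML tag: <a href=...>, </a>, <img ...>.
HTML_TAG = re.compile(r"<\s*/?\s*[a-zA-Z][^>]*>")
# BBCode tags such as [url=http://...], [/url], [img].
BBCODE_TAG = re.compile(
    r"\[\s*/?\s*(url|img|link|b|i|u|color|size)\b[^\]]*\]", re.IGNORECASE
)


def find_spam_signs(fields: dict[str, str], site_domain: str) -> list[str]:
    """Return reasons to reject a submission; an empty list means none were found."""
    reasons = []
    for name, value in fields.items():
        if HTML_TAG.search(value):
            reasons.append(f"{name}: contains HTML tags")
        if BBCODE_TAG.search(value):
            reasons.append(f"{name}: contains BBCode tags")
        # Only the free-text message may span several lines. A line break in
        # any other field is a common way to smuggle in extra email headers.
        if name != "message" and ("\r" in value or "\n" in value):
            reasons.append(f"{name}: contains a line break")
    # Real visitors rarely write from an address at the site's own domain.
    sender = fields.get("email", "").strip().lower()
    if sender.endswith("@" + site_domain.lower()):
        reasons.append("email: sender uses the site's own domain")
    return reasons
```

A submission that returns any reasons can be rejected outright or held for manual review, depending on how many genuine messages you are prepared to risk losing.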

Blacklists – another line of defence

Over the years, as the problem of spam grew, “blacklists” of spam-sending IP addresses, email addresses and message subject lines were set up as a centralised method of trying to contain this technological menace. Server administrators could then use dedicated software to filter messages before they reached their intended destinations, based on their likelihood of containing unwanted content. If you are thinking of setting up a website, you should certainly ask your hosting provider whether they make use of such systems.
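In outline, a blacklist check compares each incoming message against lists of known offenders. The Python sketch below uses small hard-coded lists purely for illustration; the entries and the function name is_blacklisted are hypothetical, and real filtering software draws on centrally maintained lists instead.

```python
import ipaddress

# Hypothetical blocklist entries; real deployments load these from a
# maintained source rather than hard-coding them.
BLOCKED_NETWORKS = [ipaddress.ip_network("198.51.100.0/24")]
BLOCKED_SENDERS = {"spammer@example.net"}
BLOCKED_SUBJECT_PHRASES = ["cheap software", "money transfer"]


def is_blacklisted(ip: str, sender: str, subject: str) -> bool:
    """Return True if the IP, sender address or subject matches a blocklist entry."""
    address = ipaddress.ip_address(ip)
    if any(address in network for network in BLOCKED_NETWORKS):
        return True
    if sender.strip().lower() in BLOCKED_SENDERS:
        return True
    lowered = subject.lower()
    return any(phrase in lowered for phrase in BLOCKED_SUBJECT_PHRASES)
```

Matching on subject phrases is the crudest of the three checks and the most likely to catch genuine messages, so it is usually treated as one signal among several rather than grounds for rejection on its own.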

Finally, Web designers and server administrators are constantly involved in a game of cat and mouse with the spammers, who are growing increasingly sophisticated in their methods of operation. This article is by no means exhaustive, but it should at least highlight the main points to be aware of when setting up a new website.

To summarise

Do

Use server-side validation on your contact forms
Ensure that spam protection is enabled on your server
Record the IP addresses of visitors using a contact form
Be careful if you post your email address on websites you do not directly control

Don’t

Leave “naked” email addresses on your web pages
Reply to any spam messages you may receive
Rely on client-side validation alone (such as using JavaScript)

How 247 Creative can help you

You can count on our extensive experience in web development to assist in minimising the unwanted email messages you receive from your website. We offer industry best-practice methods of containing this modern menace to our clients. Contact us now to find out how you can keep your inbox as free as possible of time-wasting spam.