Jen Heilemann
DESIGN*DEVELOP

Mail Mashr
A pair of PHP and jQuery scripts to keep your email addresses public, but not too public.

Skip the explanation, I want the code.

For a long time, I have used a classic and favorite approach to hiding my email addresses from thieving bots and spiders: Javascript obfuscation. You know the drill: take your favorite email address, break it up into four or five parts, then use a Javascript function to put it back together, as confusingly as possible.
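For anyone who has not seen the drill, here is a minimal sketch of that classic approach. The address and the function name assembleAddress are made up for illustration; versions found in the wild tend to be far more convoluted.

```javascript
// Hypothetical example of classic obfuscation: the address never appears
// whole in the source, only in pieces that a function stitches together.
function assembleAddress() {
    var parts = ['example', 'joe', 'com', '@', '.'];
    // Reassemble in a deliberately confusing order: joe + @ + example + . + com
    return parts[1] + parts[3] + parts[0] + parts[4] + parts[2];
}

var address = assembleAddress();
// In a page, the result would then be written into a link, for instance:
// document.write('<a href="mai' + 'lto:' + address + '">' + address + '</a>');
```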

It gets the job done, but it is a pain when you want to update an email address, and I always miss something when putting all the pieces back together. Anyone without Javascript turned on is out of luck, or ends up seeing something like “joe {**AT**} example {**dot**} com.” That’s hard to read, hard to understand if you’re using a screen reader, impossible to copy & paste, and it does not protect your email address very well anyway. Not to mention the code nightmare if I need to put multiple email addresses on the page.

What Are The Issues?

I looked into other methods when I started redesigning this site, but the options seemed dismal. Most of them involve some sort of Javascript obfuscation, with little or no provision for users without it, and the rest either don’t work (Hex values or ASCII characters) or they force all users to go out of their way (as in “Joe at this domain”). I’m not one to think my users or readers are dumb, but not everyone is going to understand references like that. I want to figure out something that will work on every site I work on — and not every site is going to have users that are very web–savvy.

So, I have certain goals to achieve:

  1. Accessibility: no images and no confusing text
  2. Works without Javascript: it should still make sense and be a click–able link
  3. It should be a "mailto:" link, and the user should be able to copy & paste (in a perfect world)
  4. It must be easy to change the email address(es)
  5. It must work for multiple email addresses on a page.

Hide Those Email Addresses (Without Javascript)

The first step was to find a server-side script that could somehow help hide or obfuscate email addresses. I came across PHP-mailto, a collection of PHP functions by jodrell. As he (or she) describes it, “When called, it prints some JavaScript which prints some more JavaScript that prints some HTML containing the mailto: link.”

For the most part, this gets us right back to where we started (more Javascript!), but it does have one important feature: a couple of its functions together can ‘read’ through an entire file, identify “mailto” links, and edit each one. That takes care of Nos. 4 and 5 in one fell swoop. To begin with, I can put this at the very beginning of a file:

<?php
function ob_mailto($html) {
    $patt = "#href=['\"]mailto:([A-Z0-9-_\.\+]+)@(((?:[\w-]+\.)+)([A-Z]{2,4}))['\"]#i";
    preg_match_all($patt, $html, $addresses);
    if (count($addresses[0]) > 0) {
        for ($i = 0 ; $i < count($addresses[0]) ; $i++) {
            // Edit the mailto links here
        }
    }
    return $html;
}
ob_start('ob_mailto');
?>

Then I can write the rest of the page as usual with regular mailto links.
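To see what that pattern captures, here is a rough Javascript port of the matching step. The PHP above is what actually runs on the server; the function name findMailtoLinks and the sample HTML are assumptions made only for illustration. Like preg_match_all, it collects the whole href attribute, the part before the @, and the full domain for every mailto link it finds.

```javascript
// Illustrative port of the matching step in ob_mailto, not part of Mail Mashr.
// The character class is reordered so the hyphen sits last; it accepts the
// same characters as the PHP pattern.
var MAILTO_PATTERN = /href=['"]mailto:([A-Z0-9._+-]+)@((?:[\w-]+\.)+[A-Z]{2,4})['"]/gi;

function findMailtoLinks(html) {
    var found = [];
    var match;
    MAILTO_PATTERN.lastIndex = 0; // reset, because a global regex keeps state
    while ((match = MAILTO_PATTERN.exec(html)) !== null) {
        found.push({ attribute: match[0], user: match[1], domain: match[2] });
    }
    return found;
}

var sample = '<a href="mailto:joe@example.com">Email Joe</a> or ' +
             "<a href='mailto:Jane.Doe+web@mail.example.org'>Jane</a>";
var links = findMailtoLinks(sample);
// links[0].user is "joe" and links[0].domain is "example.com"
```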

But there is still the problem of that middle line: how should each mailto link be transformed?

HTTP Redirect?

One of the options Sarven points out (in his long list of options) is to use the PHP header() function to redirect the browser to open an email program. Most people who mentioned it were hard-coding email addresses into a PHP file, then linking to the file from another page, as you can see in Sarven’s example.

This method has all kinds of problems. Most browsers will leave a blank page open, leaving the user lost. It would be hard to maintain — can you imagine if you needed a list of click–able email addresses on a page? And there’s no clue on the referring page that the link is any different than any other link. However, this method has promise, as it is the only one that uses server-side scripting to hide the mailto link.

I’ve been reading lately, and came across a clever script that forces a download dialog box. David Powers hard-codes a $_GET request into a link, then uses the variables to define an HTTP redirect. Why not pass the information for an email address like that?

ob_start('ob_mailto');
// Where $_GET['u'] is our "username" and $_GET['d'] is our "domain name"
if (isset($_GET['u']) && isset($_GET['d'])) {
    header("Location: mailto:" . $_GET['u'] . '@' . $_GET['d']);
}
?>

As far as I understand it, there's no need to escape the values in our $_GET variables, as any maliciously entered code would simply be sent straight back to the user. I may be mistaken on that; I’ll have to do a little more research. Let me know if you think that could be a problem.
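One thing that may be worth guarding against: the values come straight from the URL, so anyone can build a link to your page with extra text tacked on (a “?subject=…” after the domain, say), and the visitor's mail program could open with it already filled in. A cautious option is to accept only values that look like the two halves of an address, using the same character rules as the ob_mailto pattern. The check is sketched here in Javascript purely to show the logic; on the server it would have to be an equivalent preg_match test before the call to header(). The name buildMailtoTarget is hypothetical.

```javascript
// Sketch only: the same checks would need to be written in PHP on the server.
var USER_PATTERN = /^[A-Z0-9._+-]+$/i;
var DOMAIN_PATTERN = /^(?:[\w-]+\.)+[A-Z]{2,4}$/i;

// Returns the redirect target, or null when either value looks suspicious.
function buildMailtoTarget(user, domain) {
    if (typeof user !== 'string' || typeof domain !== 'string') {
        return null;
    }
    if (!USER_PATTERN.test(user) || !DOMAIN_PATTERN.test(domain)) {
        return null;
    }
    return 'mailto:' + user + '@' + domain;
}

var ok = buildMailtoTarget('joe', 'example.com');
var rejected = buildMailtoTarget('joe', 'example.com?subject=Buy%20now');
// ok is "mailto:joe@example.com"; rejected is null, so no redirect is sent
```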

If I include the http redirect function in the file that calls it, I also avoid the blank page that was mentioned before. In fact, it’s not even necessary to name the page the user is currently on for the href to work: the only information needed in the anchor is the $_GET variables. So an example anchor code would look like this:

<a href="?u=joe&amp;d=example.com">Email Joe</a>

And the resulting link would look like this:

Email Joe
If you have Javascript turned on, check the source code.

Because it doesn't have a filename or path, the query information is simply appended to the current page and the page is reloaded. This can force a “jump” to the top of the page if you have a long page, but it means your email address is hidden & the user gets a click–able link. If they are very web–savvy, they might even deduce the email address from the link text. To make it more understandable, I usually add a title attribute with some explanation.
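To make the first point concrete, the way a query-only href resolves can be checked with the standard URL class, which is available in modern browsers and in Node. The page address below is made up.

```javascript
// A link that consists only of a query string keeps the current page's path
// and replaces the query. The page address here is a hypothetical example.
var currentPage = 'http://www.example.com/about/contact.php';
var resolved = new URL('?u=joe&d=example.com', currentPage);

var target = resolved.href;
// target is "http://www.example.com/about/contact.php?u=joe&d=example.com"

// The two values the PHP script would then read from $_GET:
var user = resolved.searchParams.get('u');   // "joe"
var domain = resolved.searchParams.get('d'); // "example.com"
```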

Translating Mailto links into $_GET Requests

So now that we know what we want our mailto links to look like, we just have to insert two lines in the middle of the ob_mailto function:

<?php
function ob_mailto($html) {
    $patt = "#href=['\"]mailto:([A-Z0-9-_\.\+]+)@(((?:[\w-]+\.)+)([A-Z]{2,4}))['\"]#i";
    preg_match_all($patt, $html, $addresses);
    if (count($addresses[0]) > 0) {
        for ($i = 0 ; $i < count($addresses[0]) ; $i++) {
            // Edit the mailto links here!!
            $newAddr = "href=\"?u=" . $addresses[1][$i] . "&amp;d=" . $addresses[2][$i] . "\"";
            $html = str_replace($addresses[0][$i], $newAddr, $html);
        }
    }
    return $html;
}
ob_start('ob_mailto');
if (isset($_GET['u']) && isset($_GET['d'])) {
    header("Location: mailto:" . $_GET['u'] . '@' . $_GET['d']);
}
?>

When prepended to a file, this chunk of code will auto-magically grab any mailto links it sees and transform them into $_GET requests, then translate them back into a mailto: header when the link is clicked. This takes care of #2, #4, and #5 in my concerns. It's not perfect, but it doesn’t leave all the non-Javascript people out in the cold.
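The rewrite step itself is small. As a rough illustration, here is the same search-and-replace written as a Javascript function. The name rewriteMailtoLinks is made up, and the real work is still done by the PHP above.

```javascript
// Illustrative port of the replacement inside ob_mailto, not part of Mail Mashr.
function rewriteMailtoLinks(html) {
    var pattern = /href=['"]mailto:([A-Z0-9._+-]+)@((?:[\w-]+\.)+[A-Z]{2,4})['"]/gi;
    // Swap each mailto href for a query-only href carrying the two halves.
    return html.replace(pattern, function (whole, user, domain) {
        return 'href="?u=' + user + '&amp;d=' + domain + '"';
    });
}

var before = '<a href="mailto:joe@example.com">Email Joe</a>';
var after = rewriteMailtoLinks(before);
// after is '<a href="?u=joe&amp;d=example.com">Email Joe</a>'
```

Links that are not mailto links pass through untouched, which is why the function can safely be run over a whole page.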

Prettifying and Perfecting

I realize that I seem to be down on Javascript. I think we as developers often depend on it too much for functionality, when there are often better (if more complicated) options. But I am all for using it to improve the experience of most users. So I am going to address the final two concerns (no confusing text, and the link should be a real “mailto:” link) with my favorite Javascript library: jQuery.

If you don’t usually use jQuery, you can include it in your page with this line:

<script src="//ajax.googleapis.com/ajax/libs/jquery/1.5.2/jquery.min.js" type="text/javascript"></script>

After including the jQuery library, there are just a few more lines:

<script type="text/javascript">
$(document).ready(function() {
    $('a[href*="?u="][href*="&d="]').attr('href', function(i, ref) {
        var h = ref.replace( /\?u=([A-Z0-9-_\.\+]+)&d=(((?:[\w-]+\.)+)([A-Z]{2,4}))/i, "$1@$2");
        return "mai"+"lto:"+ h; });
});
</script>

Essentially these few lines are telling the browser, “Find <a> tags with href values which include both ‘?u=’ and ‘&d=’. Edit each of those href values, grabbing the text after ‘?u=’ and the text after ‘&d=’ and replace the href with ‘mailto:username@domain.’ ”
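The replacement at the heart of that script can be tried on its own, without jQuery or a browser. This is only a sketch: queryToMailto is a hypothetical helper, and it assumes it receives the href as the script sees it, with the &amp;amp; from the HTML source already decoded to a plain &amp;.

```javascript
// Hypothetical standalone version of the href rewrite done in the jQuery callback.
function queryToMailto(href) {
    var pattern = /\?u=([A-Z0-9._+-]+)&d=((?:[\w-]+\.)+[A-Z]{2,4})/i;
    // $1 is the text after "?u=" and $2 is the text after "&d=".
    var address = href.replace(pattern, '$1@$2');
    // Splitting the scheme keeps the literal word out of the script source.
    return 'mai' + 'lto:' + address;
}

var result = queryToMailto('?u=joe&d=example.com');
// result is "mailto:joe@example.com"
```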

Suddenly it all comes back to the beginning. The Javascript reassembles that ugly link into something readable, and you can copy–paste. Even better, if you keep your scripts in a separate file, then there are zero references to "mailto" in the source code, and a bot would have to both parse your javascript files and parse the entire jQuery library and apply it to the source code to even figure out what’s going on.

Putting it all together

I’m trying to keep this as “best practice” as possible, so in the end you’ll have three files to work with: your original file, mail_mashr.php, and mail_mashr.js.

Of course if you use them, I’d suggest just including the PHP and Javascript scripts in your standard config and script files, respectively.

Your original file

At the very top, before anything else:

<?php
include('/path/to/includes/mail_mashr.php');
ob_start('ob_mailto');
?>

The rest of your page as usual, then just before the closing body tag:

<script src="//ajax.googleapis.com/ajax/libs/jquery/1.5.2/jquery.min.js" type="text/javascript"></script>
<script src="/path/to/js/mail_mashr.js" type="text/javascript"></script>

mail_mashr.php

<?php
function ob_mailto($html) {
    $patt = "#href=['\"]mailto:([A-Z0-9-_\.\+]+)@(((?:[\w-]+\.)+)([A-Z]{2,4}))['\"]#i";
    preg_match_all($patt, $html, $addresses);
    if (count($addresses[0]) > 0) {
        for ($i = 0 ; $i < count($addresses[0]) ; $i++) {
            $newAddr = "href=\"?u=" . $addresses[1][$i] . "&amp;d=" . $addresses[2][$i] . "\"";
            $html = str_replace($addresses[0][$i], $newAddr, $html);
        }
    }
    return $html;
}
if (isset($_GET['u']) && isset($_GET['d'])) {
    header("Location: mailto:" . $_GET['u'] . '@' . $_GET['d']);
}

mail_mashr.js

$(document).ready(function() {
    $('a[href*="?u="][href*="&d="]').attr('href', function(i, ref) {
        var h = ref.replace( /\?u=([A-Z0-9-_\.\+]+)&d=(((?:[\w-]+\.)+)([A-Z]{2,4}))/i, "$1@$2");
        return "mai"+"lto:"+ h; });
});

There you have it!

Final Notes

This is hardly the end-all and be-all of email-hiding scripts. It certainly is not as secure as it could be. If you are tinkering, I think that changing the names of the $_GET variables to something random would be a first step; encrypting the parts of the email address might be something to try (though the downside is that the no-Javascript link will no longer be human-readable).

If you do include a title with an explanation for non-Javascript users, you can add to the end of line 4 of the mail_mashr.js script to make the title go away when someone has Javascript turned on:

return "mai"+"lto:"+ h; }).attr('title','');

Hopefully this helps you and makes your coding a little easier. Feel free to use these files in any of your projects, non-profit and commercial. I would appreciate a credit and a link back to this page (http://jenheilemann.com/article/mail_masher.php) in your Javascript file.

Download Mail Mashr 1.3

Originally published on Friday, April 15, 2011. Last updated on May 5, 2011.