Sniff browser history for improved user experience

The social web has filled our websites with too much third-party clutter as we figure out the best way to integrate content with the favorite sites and preferences of our visitors. Intelligent websites should tune in to the content preferences of their visitors, tailoring a specific experience to each visitor’s favorite sites and services across the social web. In this post I will show you how to mine the rich trove of personalization data sitting inside your visitors’ browser history to deliver deeply personalized experiences.

I first blogged about this technique almost two years ago, but in this post I will provide even more details and example implementations.

  1. Evaluate links on a page
  2. Test a known set of links
  3. Live demos and examples
    1. Online aggregators
    2. Social bookmarks
    3. OpenID providers
    4. Mapping services
  4. Summary

Web browsers store a list of visited web pages in local history for about a week by default. Your browsing history improves your browsing experience by autocompleting URLs in the address bar, helping you search for previously viewed content, and coloring previously visited links on a page. Link coloring, or more generally the application of special CSS properties to a :visited link, is DOM-accessible page state, making it a useful way to compare a known set of links against a visitor’s browser history for improved user experience.

  • New site
  • Visited site

A web browser such as Firefox or Internet Explorer loads the current user’s browsing history into memory and compares each link (anchor) on the page against it. Previously visited links match the special CSS pseudo-class :visited and may receive special styling.

<style type="text/css">
ul#test li a:visited{color:green !important}
</style>
<ul id="test">
  <li><a href="http://example.com/">Example</a></li>
</ul>

The example above defines a list of test links and applies custom CSS to any visited link within the set. Your site’s JavaScript code can request each link within the test unordered list and evaluate its visited state.

Any website can test a known set of links against the current visitor’s browser history using standard JavaScript; a sketch of the full routine follows the steps below.

  1. Place your set of links on the page at load time or insert them dynamically using DOM access methods.
  2. Attach a special color to each visited link in your test set using finely scoped CSS.
  3. Walk the evaluated DOM for each link in your test set, comparing the link’s color style against your previously defined value.
  4. Record each link that matches the expected value.
  5. Customize content based on this new information (optional).
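
Here is a minimal sketch of those five steps, assuming the ul#test list and the green :visited rule from the earlier example. Firefox and Safari expose the evaluated style through getComputedStyle, while Internet Explorer offers the proprietary currentStyle property.

<script type="text/javascript">
// Minimal sketch: walk the test list and record each link whose
// computed color matches the a:visited rule defined earlier.
function findVisitedLinks() {
  var visited = [];
  var links = document.getElementById('test').getElementsByTagName('a');
  for (var i = 0; i < links.length; i++) {
    var link = links[i];
    // getComputedStyle in Firefox/Safari; currentStyle in Internet Explorer
    var color = window.getComputedStyle ?
      window.getComputedStyle(link, null).getPropertyValue('color') :
      link.currentStyle.color;
    // "green" computes to rgb(0, 128, 0); IE reports the keyword itself
    if (color === 'rgb(0, 128, 0)' || color === 'green') {
      visited.push(link.href); // record the match
    }
  }
  return visited; // customize page content based on this list (optional)
}
// Usage: var knownSites = findVisitedLinks();
</script>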

Each link needs to be explicitly specified and evaluated. The standard rules of URL structure still apply, which means we are evaluating a distinct combination of scheme, host, and path. We do not have access to wildcard or regex definitions of a linked resource.

In less geeky terms, we need to take into account all the different ways a particular resource might be referenced. We might need to check both the http and https versions of a page, with and without a www prefix, to more thoroughly evaluate active use of a particular website and its pages.

I group my tests into sets of URLs, with the most likely matches placed at the beginning of each set. I evaluate each link in the set until I find a match, so the scan exhausts every positive indicator of site activity while testing the most likely candidates first.
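
One possible shape for these prioritized test sets is sketched below. The URL variants and service names are illustrative, and isVisited() dynamically appends each candidate link to the ul#test list before applying the same computed-style check shown above.

<script type="text/javascript">
// Illustrative test sets: each service maps to URL variants ordered
// from most to least likely, and the scan stops at the first match.
var testSets = {
  'del.icio.us': [
    'http://del.icio.us/post/',  // strong indicator of active bookmarking
    'http://del.icio.us/',
    'https://del.icio.us/'
  ],
  'digg': [
    'http://digg.com/',
    'http://www.digg.com/'
  ]
};

// Append a candidate link to the test list and evaluate its visited state.
function isVisited(url) {
  var li = document.createElement('li');
  var a = document.createElement('a');
  a.href = url;
  li.appendChild(a);
  document.getElementById('test').appendChild(li);
  var color = window.getComputedStyle ?
    window.getComputedStyle(a, null).getPropertyValue('color') :
    a.currentStyle.color;
  return color === 'rgb(0, 128, 0)' || color === 'green';
}

function detectActiveServices(sets) {
  var active = [];
  for (var service in sets) {
    for (var i = 0; i < sets[service].length; i++) {
      if (isVisited(sets[service][i])) {
        active.push(service);
        break; // first positive match ends this service's scan
      }
    }
  }
  return active;
}
// Usage: var active = detectActiveServices(testSets);
</script>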

Live demos and examples

Sniffing a visitor’s browser history has good and evil implications. An advertiser can determine if you visited Audi’s website lately, drill down on exact Audi models, and offer related information without ever placing code on the Audi website. I have been scanning the browser history of my site visitors for the past few months and I have coded a few examples to show benevolent uses for improved user experience.

Online aggregators

Web feed subscription buttons

Clusters of feed subscription buttons clutter our websites, displaying tiny banner ads for online aggregators of little use to most of our site visitors. My blog checks a known list of online aggregators against the current visitor’s browser history and adds a targeted feed subscription button for increased conversion. A Google Reader user will see an “Add to Google” button and a Netvibes user will see an “Add to Netvibes” button, without cluttering up the interface. I insert direct links to each site’s feed handlers to help convert the current visitor into a long-term subscriber.

Once I match a particular service I could also check whether the current visitor is already subscribed to my feed. I would simply need to run a second test against the feed’s data retrieval URL, such as one containing feedid=1234, to match web traffic with subscriber numbers.
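
A sketch of the targeted button swap appears below, assuming a service name returned by the history scan. The handler endpoints, feed URL, and "subscribe" container element are illustrative assumptions; check each aggregator’s documented feed-handler URL format before relying on them.

<script type="text/javascript">
// Map each detected aggregator to a branded subscription link.
// Endpoints are illustrative examples, not verified documentation.
var feedUrl = 'http://example.com/feed.atom'; // this site's feed
var subscribeHandlers = {
  'google':   { label: 'Add to Google',   url: 'http://fusion.google.com/add?feedurl=' },
  'netvibes': { label: 'Add to Netvibes', url: 'http://www.netvibes.com/subscribe.php?url=' }
};

function showSubscribeButton(service) {
  var handler = subscribeHandlers[service];
  if (!handler) { return; } // no targeted button for unrecognized services
  var a = document.createElement('a');
  a.href = handler.url + encodeURIComponent(feedUrl);
  a.appendChild(document.createTextNode(handler.label));
  // "subscribe" is a hypothetical container element on the page
  document.getElementById('subscribe').appendChild(a);
}
</script>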

Visit my live example of link scanning popular online feed aggregators for a demo and the applicable code.

Social bookmarks

Social bookmarking buttons

I like to see my latest blog posts spread all over the web thanks to social bookmarking sites and other methods of content filtering and annotation. Most sites spray a group of tiny service icons near their blog posts and hope a visitor recognizes the 16-pixel square and takes action. Suck. There has to be a better way.

I can scan a current visitor’s browser history to determine an active presence on one or more bookmarking sites. Once I determine the current visitor is also a Digg user, I can show live data from Digg.com to prompt a specific action such as submitting a story or voting for content. I can create a much better user experience for the three services I know my visitor actively uses instead of spraying 50 sites across the page.
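
Building on the detectActiveServices() sketch above, one hypothetical way to render those targeted actions is to cap the scan results at three services and emit an action link for each. The submit URLs and the "bookmarks" container element are illustrative placeholders.

<script type="text/javascript">
// Render bookmarking actions only for services the visitor actively
// uses, capped at three instead of spraying dozens of icons.
var bookmarkActions = {
  'digg':        'http://digg.com/submit?url=',
  'del.icio.us': 'http://del.icio.us/post?url=',
  'reddit':      'http://reddit.com/submit?url='
};

function showBookmarkActions(activeServices, pageUrl) {
  var container = document.getElementById('bookmarks'); // hypothetical element
  var shown = activeServices.slice(0, 3); // at most three targeted services
  for (var i = 0; i < shown.length; i++) {
    var action = bookmarkActions[shown[i]];
    if (!action) { continue; } // no known action URL for this service
    var a = document.createElement('a');
    a.href = action + encodeURIComponent(pageUrl);
    a.appendChild(document.createTextNode('Share on ' + shown[i]));
    container.appendChild(a);
  }
}
</script>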

Visit my live example of link scanning popular social bookmarking sites for a demo and the applicable code.

OpenID providers

Pibb OpenID signin buttons

OpenID is an increasingly popular single sign-on method and centralized identity service. OpenID lets a member of your site sign in using a username and password from a growing list of OpenID providers, including your instant messenger, web portal, blog host, or telephone company account. Visitors signing up for your site or service shouldn’t have to know anything about OpenID, federated identities, or other geeky things, but should be able to easily discover they can sign in with a service they already use and trust every day.

I can scan a list of sign-in endpoints for a list of OpenID providers and only present my site visitor with options actually relevant to their everyday web usage. Prompting a user to sign in to your service with their WordPress.com account should be much more effective than an input field sporting an OpenID icon. Link scanning for active usage should increase new member sign-ups, reduce the support costs of yet another username and password, and make your members happy.

Visit my live example of link scanning current OpenID providers for a demo and applicable code.

Mapping services

Facebook map selector

Online mapping services have changed the way we interact with location data. Need to get to 123 Main Street? Not a problem, I’ll just send that data over to your favorite mapping service to help you find your way.

I can scan a visitor’s browser history to determine their favorite mapping service. Perhaps she is most comfortable with MapQuest, Google Maps, or Yahoo. Or maybe she uses a Garmin GPS unit and would prefer a direct sync with that specialized service. Determining my visitors’ favorite mapping tool helps me deliver a valuable visualization or link I know they prefer.

Visit my live example of link scanning map API providers for a demo and applicable code.

Summary

Websites should take advantage of the full capabilities of modern browsers to deliver a compelling user experience. A built-in capability such as XMLHttpRequest took years of implementation before finding its asynchronous groove in data-heavy websites. I hope we can similarly probe other useful latent features to improve the social web through more personalized and responsive experiences.

I have been scanning the browser history of my website visitors for the past few months to gracefully suggest adding my Atom feed to their favorite feed reader. Easily recognized branding such as “Add to My Yahoo” has yielded much higher conversion rates than a simple Atom link, with a minimal effect on page load performance. Dynamically checking for active usage of 50 or so aggregators allows me to extend my total test list and promote an obscure tool that might never make the cut for permanent on-screen real estate.

How will your site utilize your visitors’ browser history for a more custom user experience? How will you connect data in new ways once you have concrete knowledge of which features will be most useful to your visitors’ online lifestyles?


Commentary on "Sniff browser history for improved user experience":

  1. Ted Rheingold wrote:

    clever. You may as well crow about your WYSIWYG comment tool. It’s the Good. If you’re into formatting comments ;> It’s actually a much nicer way of saying what HTML is allowed.

  2. Dave wrote:

    Would be nice if you could tab into it from the URL field though ;) (I had the same problem with TinyMCE and our editor) Anyway, great article, really nice lateral thinking there. I’ll have to remember this one :)

  3. Pascal Van Hecke wrote:

    Just this bit of feedback: it might be useful to not just sniff on the homepage. Heavy users of e.g. del.icio.us might not go to the homepage at all, but just work in their own “space”. While it’s impossible to guess URLs that contain their username, some URLs like http://del.icio.us/logout and http://del.icio.us/post/ might be more likely to have been visited recently than the homepage.

    • Niall Kennedy wrote:

      Pascal, I agree, each service has multiple possible indicators of activity. I have assigned a set of test URLs to each service in an attempt to more thoroughly identify usage, but only display one URL in my test page HTML output for the sake of a concise example. I tested for the del.icio.us post page (/post) in my example as a strong indicator of active del.icio.us bookmarking.

  4. Ian McKellar wrote:

    Oh wow! I’d only ever thought of moderately sinister uses for these techniques before. This is brilliant! Ian

  5. drew olanoff wrote:

    Wow. Fantastic post and so many implications. Personalization on the web is the cry for ’08, and this got my brain moving. Great stuff Niall.

  6. Brian Breslin wrote:

    Niall, very clever. I am impressed. These little details can go a long way to improving a user experience.

  7. Kris Arnold wrote:

    Interesting and potentially helpful to users. But please be mindful that some people would consider these to be techniques that could lead to privacy violations if the user’s data was communicated back to the server. Details are here. SafeHistory and SafeCache are Firefox extensions written to protect against techniques like these.

  8. Tim Trautmann wrote:

    Niall, mindblowing and very useful stuff! Thanks for sharing! -T

  9. Eric Gonzalez wrote:

    It’s a neat idea and nice job on the examples, Niall. I’m not sure I’d buy into this though, because it seems to me that this is a rich source for unwanted solicitation/facebooking. For example, if I have hit BestBuy.com and purchased a TV, it would be pretty annoying to hit amazon.com and have them send a popup window with the same television I just purchased. This is before we get to awkward moments with people who browse umm.. prurient interest sites!

  10. Scott Schiller wrote:

    Jeremiah Grossman demoed the less-nice side of this in “I Know Where You’ve Been”.

    For fun, I made a “Web 2.0 Awareness Test” (sense of humour required).

    Generally, I don’t like the idea of browsers being able to arbitrarily test and verify that I’ve been to other domains; that’s none of their business. Some Firefox extensions exist which fix this behaviour under that browser.

  11. Patrick wrote:

    Very cool, Niall. I think sniffing browser history can produce very compelling user personalization. However, in the case of generating links to social service providers, I think some way of using cookies for stored profiles might be the better approach. It’s a bit less overhead and it also allows the user to customize what links to show, i.e. define the links they want to see vs. the ones you think they want to see. How best to let the user select and de-select the social service links in a way that’s easy and seamless would be the challenge.

    • Niall Kennedy wrote:

      Patrick, I set a 30-day cookie within my own site’s scope after the first scan. I check for the cookie’s presence before scanning, and interpret stored variables such as “google” into meaningful on-page markup. Using cookies or other forms of locally cached data for stored profiles across sites would only work if the browser supported DOM storage scoped to a TLD (.com, .net, etc.). The current Firefox 2+ DOM storage implementation does not support these wildcards for security reasons. I don’t think it is possible to share such information without using a browser plugin/extension. I could look for the del.icio.us extension, for example, or a shared DOM-accessible store I know has such data independent of the currently viewed website, but no such implementation is available.

  12. AngusM wrote:

    There’s room here for browser manufacturers to support some kind of voluntary disclosure. In the same way that we can set preferred languages in our browser, it ought to be possible to say “I use these sites, so I’d appreciate seeing appropriate submit buttons” and have savvy sites react accordingly.

    Many social networking addicts use too many sites to make sending the complete list back as a request header practical. It would make sense to use an indirection: send only the URL of a page where a site can fetch personalization/preference information (and the information available could go beyond just a list of sites, including other information that would affect UI/presentation options). For privacy purposes, the URL might need to change dynamically in a non-predictable fashion, so that sites couldn’t use this to track visitors.

    The question is whether the gain to the user would be enough to justify it, as it would require both browser support and the setting up of ‘preference proxies’. That’s a lot of infrastructure for a limited user experience enhancement.

  13. Kyle Bennett wrote:

    I didn’t get it at first, because the sample pages showed nothing whatsoever. Then I realized that NoScript was blocking them. Nice try, but except for sites I visit very regularly and have unblocked for other reasons, I’ll never see this in any of the uses you suggest.

    That’s the flaw in this kind of idea: script blocking is becoming mainstream, and nobody is going to unblock a site just to get more advertising-ish stuff (meaning stuff that is of more benefit to the author than to the reader) – and in particular not to expose personal data, even for moderately benevolent reasons. Regular readers might, but I assume they’re a small, almost incidental, segment of the target audience.

    • Niall Kennedy wrote:

      Kyle, I disagree with your claim that script blocking is becoming mainstream. You decided to download a browser that likely did not ship with your computer and visit that third-party browser’s add-on page to select extensions that may work best for you. You are also willing to check your status bar every time a site does not seem to function properly, and perhaps white-list that site’s JavaScript. If the calendar date selector is not working when you try to book a plane ticket, you know you can only blame yourself. I believe the mainstream would become frustrated at the lost functionality when they try using a mainstream service such as Yahoo! Mail or United Airlines with JavaScript disabled by default. My attempt to gracefully enhance a page for My Yahoo! users who have no idea what to do with RSS or Atom is squarely targeted at mainstream users. They use JavaScript, which is exactly how their iGoogle and My Yahoo! pages are able to function.

  14. Todd Sampson wrote:

    You are an evil genius, Niall. Great stuff. - Todd

  15. Zach Heaton wrote:

    This technique definitely crosses the line for me as a user – page history is not something that I expect my browser to expose to the world, and I’m going to have second thoughts about visiting a site I discover using something like this. Fortunately, the fix in this case is a one-line user-defined CSS file – “a:visited {color:blue !important}” breaks all of the test cases quite nicely.

  16. nikki wrote:

    The problem with this type of sniffing for social bookmarking is that many of us use Google Reader, or another tool, for our RSS feeds, and thus don’t visit the homepage of the site daily. I regularly read several of the feeds that were tested, but none of them showed up until I clicked the links to go to the main page. Plus, honestly, I find it a bit invasive. Not bothered enough to mess with my link styles, but I can’t say I find it nice at all. Personalization should be an opt-in, not something that people do under the table, if that makes sense.

  17. joshnunn wrote:

    I’m not sure I’m OK with this as a user – but I can’t deny that it’d make the overload of ID managers/bookmark services/sharing sites a whole lot less cluttered. This is the kind of thing that every big web company wants to do, but can’t get away with – but wouldn’t our online lives be simpler if we sometimes got over our paranoia and let them? Maybe you should implement (and advocate) a privacy policy to go with the use of this script – something that says “I use this data only to enhance the user experience and no data is stored on my servers”. I think that might go a way to alleviate your visitors’ concerns about their privacy.

    And Kyle Bennett: I think you’re ahead of yourself. I had NoScript turned on for 6 months about a year ago, and found that too much of the web was suddenly cut off for me by default. I’m a fairly security savvy user, and I had to turn it off – Ma and Pa aren’t even going to try it in the first place.

  18. Otis Gospodnetic wrote:

    Hi Niall – this is useful, thanks. Since you’ve been sniffing things for a few months, that means you also might have some stats. Got any stats on readers/bookmark services/etc.? I’d love to see it.

  19. nok wrote:

    really really great post – and it’s great that the previous two comments suggest a privacy policy and then request statistic sharing. i’m curious as to the author’s thoughts. i do think that the average user would occasionally clear their history, so the stats would probably only show the use for that session, as was my experience. regardless, this does have fantastic implications. thank you.

  20. Joe wrote:

    I think you have come up with some cool uses for scanning browser history; glad to see these powers being used for good instead of the evil that marketing sites employ them for. It would be cool if there were a way to do a wildcard search, such that any history URLs that contain digg could bring up that link. I visit digg daily, but never through the dot com, always a sub page found via my netvibes reader. I could see this being more valuable with some sort of wildcard sniffing.

  21. Paul Elia wrote:

    I’m over a year late to this blog party but I wanted to say great job, Niall. This is a very clever way to learn more about your visitor and to have an option to do something based on the discoveries made.
    Regarding JavaScript, we are moving towards 100% JavaScript-required websites in our work. We have been testing, and the numbers are in the upper 90% range for JavaScript support. A much more rewarding experience can be had by all with JavaScript-enabled user agents, and the coding is so much simpler for what we do. We no longer spend the extra 90% effort to pick up the last 2% of the visitors, at least for the websites that are time-limited in nature, which is most of what we do. For ones that are supposed to last indefinitely then, yes, some effort is made to support user agents that do not support JavaScript.
    Regarding the privacy advocates, hey, I get it, but to a point. When I physically walk into a brick and mortar store and the sales person sees me he/she is making adjustments for how to interact with me based on judgments and assessments of my sex, race, dress, mannerisms, etc. Maybe I’m wearing a T-shirt with some message on it that may or may not reveal what I really think about the topic. There are appropriate and inappropriate ways the sales person could make use of the information and his/her interpretation of it. If he/she wants to make a sale then the appropriate ones will be used to better match what I am seeking with what he/she has to offer.
    A benevolent use of the technique you described is to likewise enhance online user experiences.

  22. Adrian Scott wrote:

    Clever work, nicely done! ;)

  23. Robin Wilton wrote:

    Interesting. I can see that, used “benevolently”, the techniques you describe might indeed result in a more ‘personalised’ user experience, but I can see two potential issues:

    1 – it seems to me that you blur the distinction (or at least, are happy to let the blurring persist) between ‘personalized according to what the user would choose if they were able’ and ‘personalized according to the interests of the website owner’ – which are two entirely different things.

    2 – I’m not at all clear how the techniques you describe differ from ‘obtaining unauthorized access to computer systems’ – which as far as I know is an offense in most US states and many countries elsewhere. Even basic Data Protection principles stipulate that data collected for one purpose may not be used for another – and yet here you are encouraging people to do exactly that, and showing them how.

    Still – I have a certain admiration for your frankness in publishing all this.

    You have probably raised awareness of some of the risks, and done a lot to encourage some users to disable JavaScript in their browsers…