Recently in Social Category

Social software used to connect people and their interests. Includes social networks, recommendation systems, and other software benefiting from the knowledge of the crowds.

  1. Mar16

    Create enhanced results on Yahoo! and Facebook with Share markup

    Yahoo! announced support for enhanced search results last week based on Facebook Share and RDFa markup. Website owners can add a few meta tags to their pages to boost click-throughs from a more visual Yahoo! Search result and ease the process of sharing a link on Facebook at the same time. In this post I will cover the major categories of enhanced share types -- audio, images, video, news, blogs, games, documents, and multimedia -- and walk through how site owners can stand out on shareable platforms. Yahoo! and Facebook are just the first two platforms to collaborate on this effort. Expect more announcements from a variety of activity stream and URL share providers in the future.

    Why add special markup?

    Yahoo Search enhanced result The Daily Show

    Shared pages and search results contain, at a minimum, a title and a description extracted from your site's <head>. You can enhance your search and share capabilities by explicitly specifying a thumbnail image or playback data you would like included directly alongside mentions of your pages.

    <meta name="title" content="Apple iPhone 3G" />
    <meta name="description" content="iPhone 3G combines three products in one..." />
    <link rel="image_src" type="image/jpeg" href="http://www.apple.com/iphone/iphone_thumb.jpg" />

    Thumbnails appear alongside Yahoo! search results in a 54 pixel high by 98 pixel wide image window. Thumbnails might be overlaid with a call to action for supported mediums and their embedded players.

    Supported mediums

    The Facebook Share API supports annotating a page as audio, video, generic multimedia, an image, blog content, or news content. Publishers simply add a single line to their pages to help Facebook classify the type of page shared by their users and apply the appropriate specialized handling.

    <meta name="medium" content="blog" />

    Yahoo! Search currently supports enhanced search results for video, games, and documents (all wrapped in a Flash player). Playback SWF files must be white-listed by Yahoo! before they are supported for inline playback (Facebook also white-lists). Yahoo! currently supports the following websites' embedded Flash players:

    Video
    Hulu, YouTube, Yahoo! Video, Metacafe
    Games
    Playcrafter
    Documents
    Scribd, SlideShare

    Inline audio

    Facebook supports track title, artist name, album name, and a direct link to the audio through its share interface. Facebook users should be able to play back MP3s and other popular audio formats from their social news feed.

    <meta name="medium" content="audio" />
    <meta name="title" content="Pearl Jam - Black" />
    <meta name="description" content="Sheets of empty canvas, untouched sheets of clay..." />
    <link rel="image_src" type="image/jpeg" href="http://pearljam.com/ten.jpg" title="Ten Album cover" />
    <link rel="audio_src" type="audio/mpeg" href="http://pearljam.com/black.mp3" />
    <meta name="audio_type" content="audio/mpeg" />
    <meta name="audio_title" content="Black" />
    <meta name="audio_artist" content="Pearl Jam" />
    <meta name="audio_album" content="Ten" />
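
    A publisher could generate these tags from track data instead of hand-writing them on every page. The following is a minimal Python sketch under stated assumptions: the function name `audio_share_tags` and its parameters are hypothetical, the thumbnail is assumed to be a JPEG, and only the tag names and attributes come from the example above.

```python
from html import escape


def audio_share_tags(title, description, image_url, audio_url,
                     track, artist, album, audio_type="audio/mpeg"):
    """Return the share markup for one audio page as a list of HTML lines."""

    def meta(name, content):
        # Escape values so quotes or ampersands cannot break the attribute.
        return '<meta name="%s" content="%s" />' % (name, escape(content, quote=True))

    def link(rel, mime, href):
        return '<link rel="%s" type="%s" href="%s" />' % (rel, mime, escape(href, quote=True))

    return [
        meta("medium", "audio"),
        meta("title", title),
        meta("description", description),
        link("image_src", "image/jpeg", image_url),  # assumes a JPEG thumbnail
        link("audio_src", audio_type, audio_url),
        meta("audio_type", audio_type),
        meta("audio_title", track),
        meta("audio_artist", artist),
        meta("audio_album", album),
    ]
```

    Joining the returned list with newlines inside a page template's <head> reproduces the block above for any track.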

    Inline video

    Video sites should specify their embedded player and its desired height and width on each video page. White-listed players will be played back directly in a social news feed.

    <meta name="medium" content="video" />
    <meta name="title" content="The Daily Show with Jon Stewart: Thu, Mar 12, 2009" />
    <meta name="description" content="Jon Stewart and CNBC's Jim Cramer go face to face in the studio." />
    <link rel="image_src" type="image/jpeg" href="http://thumbnails.hulu.com/73511_145x80.jpg" />
    <link rel="video_src" type="application/x-shockwave-flash" href="http://www.hulu.com/embed/_3TIApx3ymwKbAfZnz-MKA" />
    <meta name="video_width" content="512" />
    <meta name="video_height" content="296" />
    <meta name="video_type" content="application/x-shockwave-flash" />
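
    To see how a consuming service might read this markup, here is a small Python sketch built on the standard library's `html.parser`. It is an illustration only, not how Yahoo! or Facebook actually parse pages; the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class ShareTagParser(HTMLParser):
    """Collect <meta name=... content=...> and <link rel="*_src" href=...> values."""

    def __init__(self):
        super().__init__()
        self.share = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names; values may be None.
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name"):
            self.share[attrs["name"]] = attrs.get("content") or ""
        elif tag == "link" and (attrs.get("rel") or "").endswith("_src"):
            self.share[attrs["rel"]] = attrs.get("href") or ""


def parse_share_tags(html):
    """Return a dict of share metadata found in an HTML document."""
    parser = ShareTagParser()
    parser.feed(html)
    return parser.share
```

    Feeding the video example above through `parse_share_tags` yields a dictionary keyed by `medium`, `title`, `video_src`, `video_width`, and so on, which a feed renderer could then check against its white-list.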

    Summary

    Search results pages and social news feeds provide new sources of referrals for connected sites. Webmasters can spice up their pages with a few extra meta tags to stand out any time a machine or a human decides to share their content with others. Adding a thumbnail to a page, even something as basic as a user profile picture in a social context, is a simple step that will help your content stand out from the crowd.

    Publishers supporting embedded Flash players should consider the impact of off-site inline playback on their business. If you've already embraced spreading your SWFs far and wide you should jump on the white-list queue early to gain a competitive advantage through partners such as Yahoo! and Facebook.

  2. Jan03

    Facebook v. Power Ventures

    Facebook filed eight legal complaints in United States federal court against Power Ventures, operators of social aggregator Power.com (story via NYT Bits blog). Facebook claims Power collected Facebook usernames and passwords, stored Facebook data on their servers, used the Facebook trademark without license, sent e-mails posing as Facebook, and knowingly circumvented Facebook's attempts to block access. The lawsuit, filed on December 30th in San Jose, comes one month after Facebook initially contacted Power.com regarding its violation and attempted to transition Power to an acceptable method of access: Facebook Connect.

    Power.com is headquartered in Rio de Janeiro, Brazil with additional offices in San Francisco and Hyderabad, India. Power raised $8 million from Draper Fisher Jurvetson, DFJ affiliate FIR Capital, Esther Dyson, and other investors. Facebook is seeking triple damages for willful violation including all revenue generated by Power.com in the month of December. Facebook may be able to claim $10,000 for each Facebook account accessed by Power under California Penal Code section 502 due to repeat violations.

    1. The password anti-pattern
    2. Social data distribution
    3. Dispute timeline
    4. Tips for business partnerships
    5. Summary

    The password anti-pattern

    Facebook login

    Collecting Facebook usernames and passwords is at the heart of the dispute. Power.com impersonates a Facebook user after collecting their username and password. The site imports friends lists from Facebook and other social providers to create a meta profile for its over-networked members trying to keep their many personas in sync. Facebook Connect, announced in May and available for beta testing shortly after, provides account linking between Facebook and other sites, SSL transport, and friend imports. Facebook Connect limits the data flow of Facebook user data in ways a direct login would not. Power.com assumed full user powers as a remote agent of a Facebook user instead of an authorized proxy to accomplish its own goals and violated Facebook terms of service in the process.

    I covered some of these data portability issues and best practices in my Data Portability, Authentication, and Authorization post last year.

    Social data distribution

    [T]he sole end for which mankind are warranted, individually or collectively, in interfering with the liberty of action of any of their number, is self-protection. That the only purpose for which power can be rightfully exercised over any member of a civilized community, against his will, is to prevent harm to others. His own good, either physical or moral, is not a sufficient warrant...In the part which merely concerns himself, his independence is, of right, absolute. Over himself, over his own body and mind, the individual is sovereign.

    John Stuart Mill, On Liberty

    Modern society mostly allows people to commit self-harm as long as that action does not also harm others. Facebook restricts access to another person's member data beyond the original intent of that person's sharing. New data uses must explicitly receive permission before shared data travels beyond the walls of Facebook.com (you may invite me into this new context but I am not automatically imported). Data is shared within a friend context on Facebook with the understanding such information is protected and may be limited to only a group of approved friends. Once that friend data starts propagating outside its initial use (by a Facebook member or Facebook itself) the trust associated with sharing data is violated. If you have ever thought twice about posting an e-mail address on a web page out of fear of automated data harvesters, you have experienced the difference between communicating with a known community of site visitors and exposing data to other uses. Facebook wants to be an identity hub of real data about real people and takes certain steps to protect that data exchange.

    Power.com knowingly violated the Facebook Terms of Service and encouraged Facebook members to do the same.

    Dispute timeline

    Power.com launched to a United States audience on December 1, 2008. The site previously focused on the Brazilian market with support for Flogão and Google-owned Orkut since launching in August. Facebook contacted Power.com on December 1, according to the lawsuit, notifying the team of their terms of service violation.

    Power Ventures CEO Steven Vachani responded to the Facebook inquiry on December 12 (11 days later) promising to delete all existing Facebook data stored on Power.com servers and implement Facebook Connect as a replacement by December 26. The next business day Facebook acknowledged the e-mail and waited for confirmation of data deletion and Connect switch-over. Vachani confirmed the transition progress on December 22 (4 days before the supposed switch).

    Vachani e-mailed Facebook legal counsel after the close of business on December 26 and communicated a "business decision" not to comply with Facebook's request to stop collecting and storing Facebook logins on Power.com. Vachani claimed the site would implement Facebook Connect but such integration would take over 5 weeks to complete. Power.com kicked off a "launch promotion" that same day with a $100 reward for the Facebook user who invited the most friends to join Power using their Facebook credentials. Facebook implemented an IP-address block against Power.com servers on the evening of December 26 to prevent further abuse.

    Power.com circumvented the IP block by Facebook and continued its marketing campaigns. Power set up a Facebook event page to promote its $100 signup give-away and used the existing Facebook accounts in its system to send event invites to friends lists.

    Facebook took legal action against Power Ventures on December 30, one business day after the Christmas holiday weekend, to prevent further abuse after civil discussions obviously broke down. Facebook accused Power of trespassing on Facebook servers in San Jose (a modern form of ToS violation), spamming Facebook members (violation of CAN-SPAM), knowingly circumventing data protections (DMCA), and using the Facebook trademark without a license.

    Tips for business partnerships

    Power Ventures could take proactive steps to look like a legitimate, responsible business in the eyes of potential business partners such as Facebook.

    Create a meaningful WHOIS record

    Power.com domain data currently lists "DiscountDomainRegistry" as a technical contact. "Power Assist Inc" is listed as a registrant and "Leigh Power" is listed as an administrative contact. Not good identity management.

    Add SSL

    If you are going to collect member login credentials from other sites you should at least use an SSL certificate for more secure data transfer. Self-sign if you must, but $30 will buy you a certificate recognized by major browsers. If you can afford extended validation certificates and the verification process that entails, even better.

    Register your company with the partner website

    Facebook allows its members to join one or more corporate networks. Register your company on Facebook and at least associate executive and developer accounts. This additional verification step helps Facebook identify your employees. Other social networks have similar verification and associations.

    Power Ventures is not listed in the Facebook corporate network directory.

    Summary

    Power.com violated Facebook terms of service by accessing and storing Facebook member data on its servers. Facebook immediately contacted Power regarding this violation and attempted to work with the site as they transitioned to the official data API, Facebook Connect. Power reneged on their agreement hours before promised delivery and immediately launched a marketing campaign to financially reward further violations. Facebook decided enough is enough and blocked Power through technical measures followed by legal measures when the site did not comply.

    I have little sympathy for Power and its actions. I hope other sites violated by Power.com such as Google, Microsoft, MySpace, and Hi5 put a stop to websites like Power harvesting user data instead of using permitted access methods such as OAuth. Locating your business in Brazil with servers in Canada and development in India does not shield companies from the consequences of abusive practices.

  3. Dec10

    Looking to join smart, challenging team to change your Web

    I have decided to once again seek full-time corporate employment. I miss being surrounded by smart people every day and the new synapses that light up in such an engaging environment. I have started talking to a few companies about working on difficult Web problems full-time as they strengthen their online business. It will be fun to fully invest in a single product once again in a specialized role.

    I like to think of my work over the past two years as rethinking front-end development. In the social, distributed web our front-end exists beyond the borders of our own websites. It's inside web feeds, widgets, and specialized applications that deliver the right content at the right time to the right audience inside their consumption environment of choice. Web development is headed in this direction, causing us to rethink the differences between a website and a web application while delivering more context-specific experiences to our audiences. I have covered some of these changes at three Widget Summit conferences, taught computer science classes in China, rewired consumer electronics devices for the Web age, restructured large media companies to embrace syndication beyond their walls, and basically changed the way data is rendered and consumed. Fun stuff and I would like more challenges every single day. Many websites still need to better embrace the syndicated web and I'd like to dedicate myself to that type of front-end excellence every day.

    So that's what I'm up to. You'll continue to see new product and demonstrative web apps released here over the coming weeks. The quiet of the holidays typically yields interesting hacks such as creating visitor profiles based on browser history, analyzing Google's new indexing techniques, or reverse-engineering Google Reader data formats.

    I am looking for tough new challenges on a full-time basis among other bright, driven people. If you're building cool new things among a good group of people and we're not already talking, contact me and perhaps we'll create some excellent products together. Official résumé also available, with the concise bullet-point summaries such a document requires.

  4. Dec09

    OpenSocial REST for social data interchange

    OpenSocial is best known for its social applications: canvas and profile views powered by JavaScript and Flash. Applications and widgets are just one part of the full OpenSocial offering. Over the past few months the OpenSocial spec has grown to include JSON, Atom, and XML outputs over a RESTful interface. OpenSocial containers MySpace, LinkedIn, and Plaxo already expose social data over these protocols, with additional support from large networks such as 51.com and Yahoo! expected in the near future.

    Sharing and Accessing Social Data in OpenSocial

    Exposing data over OpenSocial REST formats is not limited to widget containers. Social web apps such as Flickr, Twitter, or even Facebook could support OpenSocial data standards without ever adding OpenSocial application support to their web pages. Last week I turned TwitterFE into an OpenSocial RESTful container, opening up Twitter data for OpenSocial clients. OpenSocial 0.9, scheduled for release on December 19, will help solidify these new protocols across containers (I found many errata in 0.81 and I am pushing for changes in 0.9). In this blog post I will provide a brief overview of the OpenSocial RESTful protocols and their data implementation for any website interested in standardized descriptors of social data.

    1. OpenSocial background
    2. People
    3. Activities
    4. Advertising OpenSocial support
    5. Summary

    OpenSocial background

    OpenSocial applications request and interpret data via JavaScript calls. An application might request profile data on the logged-in member viewing the app, write a new story to a member's social news feed, or store custom data such as a member's favorite color. These data objects are called Person, Activity, and AppData respectively. Each of these data objects contains a minimal set of required information and a long list of optional data that varies by implementation. Most social applications store a member ID, username, profile photo, and a profile URL, for example, but specified views on romance or religion are less common.

    Yet OpenSocial isn't just for widget containers. Social web apps can export and import user data via anonymous and/or authenticated requests. You just have to speak the language of OpenSocial data to achieve fluid data interchange between servers. Data requests may occur with or without a login but additional data may be exposed to requesters with proper OAuth credentials for a particular account.

    People

    People are the center of a social network experience, connecting us to new data and interactions. OpenSocial maps common profile components across containers including e-mail addresses, profile pictures, location, and member bios. Friends lists are a collection of people objects mapped to a particular owner.

    The only required person data from a container are a display name and a container-specific identifier such as the numeric auto-increment id you are storing in your users table. Websites need to stick to a specific Person vocabulary to ensure compatibility across sites.

    Portable Contacts and OpenSocial RESTful Person objects are wire-compatible formats. The specs are currently aligned, so you might hear the two names used interchangeably in conversations.

    Example

    <person xmlns="http://ns.opensocial.org/2008/opensocial">
      <id>mysite.com:1234</id>
      <displayName>John Smith</displayName>
      <name>
        <givenName>John</givenName>
        <familyName>Smith</familyName>
        <formatted>John Smith</formatted>
      </name>
      <gender>male</gender>
      <emails>
        <primary>true</primary>
        <type>work</type>
        <value>john@example.org</value>
      </emails>
      <ims>
        <primary>true</primary>
        <type>aim</type>
        <value>johnsmith</value>
      </ims>
      <account>
        <domain>mysite.com</domain>
        <username>johnsmith</username>
        <userid>1234</userid>
      </account>
    </person>

    In the above example I've defined some basic data about a fictional user of MySite.com using the OpenSocial Person vocabulary in an XML format. Consuming agents can write a single interpreter for multiple OpenSocial containers and easily display, export, or annotate profile and friend data over this interface.
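
    As a rough illustration of the "single interpreter" idea, the Python sketch below reads the required fields and any e-mail addresses out of a Person document like the one above. The function name `parse_person` is hypothetical, and the sketch assumes the XML layout shown in the example rather than every variation a container might emit.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the Person example above.
OS_NS = {"os": "http://ns.opensocial.org/2008/opensocial"}


def parse_person(xml_text):
    """Return the id, display name, and e-mail addresses from a Person document."""
    person = ET.fromstring(xml_text)
    return {
        "id": person.findtext("os:id", namespaces=OS_NS),
        "displayName": person.findtext("os:displayName", namespaces=OS_NS),
        "emails": [e.text for e in person.findall("os:emails/os:value", OS_NS)],
    }
```

    Optional fields that a container leaves out simply come back as `None` or an empty list, which is how a consumer can tolerate the variation between implementations.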

    Activities

    Activities are small application updates usually posted to a social news feed. When a member adds an event, posts a status update, uploads a photo, or takes some other action websites usually write a new activity into the member's feed. These actions are normalized in the OpenSocial context into a specific Activity vocabulary.

    Example

    <activity xmlns="http://ns.opensocial.org/2008/opensocial">
      <title>Updating my MySite account</title>
      <url>http://mysite.com/johnsmith/status/123456789</url>
      <id>mysite.com:1234/activity:123456789</id>
      <userid>mysite.com:1234</userid>
    </activity>

    The above example normalizes a text-based status update into an OpenSocial activity expressed in XML. I can post this message into an OpenSocial activity stream, open up export capabilities for members, or interface with a wider array of applications (desktop, mobile, etc.) that already support activity stream display.
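
    A producing site could build that document with a few lines of Python. This is a sketch under assumptions: the function name `build_activity` and its parameters are hypothetical, and the `site:user/activity:status` id scheme is copied from the example above rather than mandated by the spec.

```python
import xml.etree.ElementTree as ET

OS_NS = "http://ns.opensocial.org/2008/opensocial"


def build_activity(site, user_id, status_id, title, url):
    """Serialize one status update as an OpenSocial Activity XML string."""
    # Emit the OpenSocial namespace as the default (unprefixed) namespace.
    ET.register_namespace("", OS_NS)
    activity = ET.Element("{%s}activity" % OS_NS)
    fields = [
        ("title", title),
        ("url", url),
        ("id", "%s:%s/activity:%s" % (site, user_id, status_id)),
        ("userid", "%s:%s" % (site, user_id)),
    ]
    for name, value in fields:
        ET.SubElement(activity, "{%s}%s" % (OS_NS, name)).text = value
    return ET.tostring(activity, encoding="unicode")
```

    Calling `build_activity("mysite.com", 1234, 123456789, "Updating my MySite account", "http://mysite.com/johnsmith/status/123456789")` produces the same four elements as the hand-written example.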

    Advertising OpenSocial support

    OpenSocial RESTful resources are described using XRDS-Simple. If you have used OpenID or OAuth you're likely already familiar with this markup and discovery process. Agents can probe possible supporting containers for application/xrds+xml response support to receive a full descriptor set.

    OpenSocial REST containers advertise supported data objects by the object's name. A type of http://ns.opensocial.org/* advertises RESTful support, data objects available (person, activity, etc.), possible query types, and a hint of specversion (currently "2008"). You might choose to support some or all of the OpenSocial data objects and your XRDS document will serve as the central discovery resource for such data.
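
    On the consuming side, discovery boils down to fetching the XRDS document and picking out the services whose type starts with the OpenSocial namespace. The Python sketch below shows that filtering step only. It assumes the usual XRDS layout of Service elements containing Type and URI children, matches on element names regardless of XML namespace, and uses hypothetical function names; the exact type URIs a container advertises depend on the spec version, so only the http://ns.opensocial.org/ prefix is checked.

```python
import xml.etree.ElementTree as ET

OPENSOCIAL_PREFIX = "http://ns.opensocial.org/"


def local_name(tag):
    """Strip the '{namespace}' part ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def opensocial_services(xrds_text):
    """Map each advertised OpenSocial type URI to its first endpoint URI (or None)."""
    services = {}
    for element in ET.fromstring(xrds_text).iter():
        if local_name(element.tag) != "Service":
            continue
        types = [c.text.strip() for c in element
                 if local_name(c.tag) == "Type" and c.text]
        uris = [c.text.strip() for c in element
                if local_name(c.tag) == "URI" and c.text]
        for type_uri in types:
            if type_uri.startswith(OPENSOCIAL_PREFIX):
                services[type_uri] = uris[0] if uris else None
    return services
```

    Services for other protocols listed in the same document, such as OpenID or OAuth endpoints, are skipped because their types fall outside the prefix.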

    Summary

    OpenSocial is about more than just widgets and applications rendered in a web browser. The project exposes standardized interfaces and object descriptors for social web components while offering interoperability with very large social networks around the world. Any social website can allow public and private access to member data using OpenSocial RESTful protocols and responses. You will open up new API opportunities, allow import and export of data between sites, and even expose more granular data to crawlers such as Google (if you choose). Interesting stuff that's just getting started.

  5. Nov19

    Rewriting Twitter for web best practices

    TwitterFE.com screenshot

    Last week I decided to rewrite the Twitter.com front-end on Google App Engine to incorporate modern front-end programming best practices, exceptional performance, and establish a solid platform for further development. TwitterFE.com is a fully-functional read-only clone of Twitter.com designed to make your web browser sing. I created the site as an example of web development best practices anyone can integrate into their web presence.

    The new web front-end on TwitterFE.com features localized templates, expressive markup, distinct URL structures, integrated site search, geo-distributed dynamic and static servers, and more available features than Twitter.com. In this post I will outline some of the changes I've applied to the Twitter front-end reproduction as they apply to general front-end web development.

    1. Global audience
    2. Unique usage models
    3. Consolidate URLs
    4. Expressive markup
    5. Split the page load
    6. Know your cache settings
    7. Review access controls
    8. Expose site search
    9. Summary

    Global audience

    Twitter.com top countries - Google Trends

    I added a localization framework to the Twitter front-end to deliver site content in multiple languages. According to Google Trends, Twitter's top regions speak English, Portuguese, Japanese, Chinese, German, and Spanish. I isolated the site's template strings and translated all key phrases into Spanish. Visitors with an Accept-Language header of es will receive template strings in Spanish.
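
    The language selection step can be sketched in a few lines of Python. This is not the TwitterFE implementation, just a minimal illustration of Accept-Language negotiation; the function names are hypothetical and the supported set of English plus Spanish is assumed from the description above.

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, best match first."""
    weighted = []
    for position, part in enumerate(header.split(",")):
        pieces = part.strip().split(";")
        tag = pieces[0].strip().lower()
        if not tag:
            continue
        quality = 1.0  # a missing q parameter means full preference
        for param in pieces[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:
            # Sort by descending quality, then by original header order.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def choose_language(header, supported=("en", "es"), default="en"):
    """Pick the first preferred language we have template strings for."""
    for tag in parse_accept_language(header):
        if tag in supported:
            return tag
        primary = tag.split("-")[0]  # "es-mx" falls back to "es"
        if primary in supported:
            return primary
    return default
```

    A visitor sending `es-MX,es;q=0.8,en;q=0.5` gets Spanish templates, while a visitor sending only unsupported languages falls back to the default.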

    Twitter profile about box Spanish - Evan Williams

    The most difficult part of localization is isolating your template strings and choosing common wording across the site. Twitter uses the terms "person," "user," "account," and more to reference a profile owner. Websites need to pick common concepts to explain their site interactions before requesting translations.

    Modern websites rely on crowd-sourcing to translate a site into new languages. Porting a web application to your native language is a point of pride for many communities. Something as simple as "favourites" instead of "favorites" for the Brits could help create identity around a product in other countries. Facebook Translations and Google in Your Language are just two examples of large localization efforts led by an engaged community.

    Unique usage models

    Twitter.com has three main visitor interactions: anonymous visitor, annotator, and author. The current Twitter website loads resources for all possible interaction types, weighing down the page and interfering with the intended experience. I tore the site down and started from scratch, building up each interaction model starting with the anonymous visitor.

    Anonymous visitor

    Twitter logged out view - Evan Williams

    The anonymous, public visitor is anyone browsing Twitter content while signed out. This audience typically makes up the majority of site traffic and includes both humans and search engines. Websites need to clearly and concisely communicate content to this new audience, who are likely inexperienced with your site, while quickly and smartly driving business objectives such as member signups or advertising.

    Annotator

    The annotator is a logged-in member browsing site content with an opportunity for annotation. They may want to discover new social network friends, mark content as a favorite, or otherwise engage with existing content. Annotations are typically short and asynchronous, posting new associations between an account holder and a unique content identifier. The majority of pages presented to logged-in users on Twitter.com follow this annotation interaction model.

    Author

    Twitter is a message authoring platform. Logged-in users may publish new text updates from their homepage while browsing other subscribed content. Authors type into a text area and receive real-time feedback on authoring limitations while they type. The author commits new content to Twitter's servers after hitting a submit button, and the service responds with a confidence indicator for accepted updates.

    Consolidate URLs

    How many possible URLs represent the same content on your website? Websites should avoid duplicate content spread across multiple paths, subdomains, and protocols. There should be one strong public-facing match for your distinct content.

    Which one of the following URLs represents the profile page of Twitter CEO Evan Williams (username ev)?

    • http://twitter.com/ev
    • http://twitter.com/ev/
    • https://twitter.com/ev
    • http://explore.twitter.com/ev
    • http://m.twitter.com/ev
    • and more...

    Websites need to pick a winner and funnel visits into that best representation of content. Search engines are crawling multiple versions of Twitter right now and splitting authority between many different options.

    Be aware of URL propagation when you introduce new subdomains and schemes with relative URLs inherited from a common template. You might be buying new servers to keep up with a crawl load across your millions of pages as a result.
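
    One way to "pick a winner" is to normalize every incoming URL and issue a permanent redirect when it differs from the canonical form. The Python sketch below is illustrative only: choosing http://twitter.com as the winner and treating every listed host as an alias are assumptions, and a real mobile subdomain may deserve its own handling.

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_SCHEME = "http"
CANONICAL_HOST = "twitter.com"
# Hosts assumed to serve the same profile content as the canonical host.
ALIAS_HOSTS = {"twitter.com", "explore.twitter.com", "m.twitter.com"}


def canonical_url(url):
    """Return the single preferred URL for a page, or None for unknown hosts."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in ALIAS_HOSTS:
        return None
    # Drop trailing slashes so /ev and /ev/ collapse to one path.
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((CANONICAL_SCHEME, CANONICAL_HOST, path, parts.query, ""))


def redirect_for(url):
    """Return (301, target) when a request should be redirected, else None."""
    target = canonical_url(url)
    if target is None or target == url:
        return None
    return (301, target)
```

    All five example URLs above collapse to a single address, so crawlers following the redirects consolidate authority on one page instead of splitting it.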

    Expressive markup

    TwitterFE uses an xHTML vocabulary to express content, CSS for positioning and styling, and JavaScript for progressive enhancement and interactions. Gone are the table-based layouts of Twitter.com and its heavy DOM footprint. Resources are split into dynamic and static content and served from geographically-distributed datacenters for optimal performance. Twitter currently stores static assets such as profile pictures on Amazon S3 and does not use a distributed CDN to address speed of light issues.

    Comparing a user such as Al Gore on TwitterFE vs. Al Gore on Twitter.com shows a 41% difference in required resources sent over the wire (89 vs. 152 KB). The new site also reduces the total DOM footprint for faster parsing, layout, rendering, and addressing.

    Microformats expose unique structured objects within each page such as people, relationships, and feed mappings. Search engines such as Yahoo! tap into microformat content to expose deeper information about a page. I cleaned up microformat support on Twitter pages and added support for Internet Explorer 8 Web Slices.

    Expressive markup helps web browsers and search engines better understand the content within your pages. Sites can fully utilize xHTML vocabulary sets independent of default styling to best define the content and the rendered display of each page.

    Split the page load

    Web pages should respond quickly with progressive enhancement added after the main page content renders on the page. We might, for example, load a page and then apply search field listeners, autocomplete, or menu expansions as a second wave. Splitting our pages into "must have" and "nice to have" segments helps us deliver core content quickly while still providing the on-page interactions and magic sprinkles that thrill our visitors.

    Twitter's profile pages load 36 of the profile owner's following list onto each page. That's 36 tiny little 700 byte profile images all waiting in line for a remote connection and display on page. I tripled the total number of displayed member pics but loaded the list asynchronously after the rest of the page finished loading. I can pre-fetch these components into cache on the original profile request and respond very quickly to the async request after page load.

    Know your cache settings

    How long should browsers and other requesting agents hold on to a piece of content before requesting a fresh copy? A frequently changing profile page might expire its HTML every 5 minutes or so while static assets such as a site logo or icon should be kept in browser cache for a long period of time instead of requested with each page. In some cases Twitter sets image Expires headers 5 minutes into the future, slowing down pages and increasing bandwidth costs for the company and its visitors.
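
    A simple policy table makes these decisions explicit. The Python sketch below is a hedged example, not Twitter's or TwitterFE's configuration: the five-minute HTML lifetime comes from the paragraph above, while the one-year lifetime for static assets and the function name `cache_headers` are assumptions.

```python
from email.utils import formatdate

ONE_YEAR = 365 * 24 * 60 * 60

# Lifetimes in seconds, keyed by content type.
CACHE_LIFETIMES = {
    "text/html": 5 * 60,      # frequently changing profile pages
    "image/png": ONE_YEAR,    # assumed "long period" for static assets
    "image/jpeg": ONE_YEAR,
    "text/css": ONE_YEAR,
}


def cache_headers(content_type, now):
    """Return Cache-Control and Expires headers for a response.

    `now` is a Unix timestamp. Content types without a configured lifetime
    are marked no-cache so agents revalidate them on every use.
    """
    lifetime = CACHE_LIFETIMES.get(content_type, 0)
    if lifetime == 0:
        return {"Cache-Control": "no-cache"}
    return {
        "Cache-Control": "public, max-age=%d" % lifetime,
        "Expires": formatdate(now + lifetime, usegmt=True),
    }
```

    Long lifetimes for static assets work best when the asset URL changes whenever the file changes, for example by including a version number in the path.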

    Review access controls

    Some sites split pages into public-facing and login-required access models. Twitter places pages such as a following list or a full-sized profile picture behind a login screen while exposing the same data over their APIs without such restrictions. Twitter.com is losing search engine exposure and logged-out user browsing capabilities due to these inconsistencies in implementation, not policy.

    Expose site search

    Firefox OpenSearch Twitter

    The OpenSearch format exposes site search options to web browsers and search engines alike. If your website offers site search you should be lighting up the browser chrome with new search options for the given page. Twitter acquired a search company in July but has not exposed available search hooks in their main website's front-end. Think about how you might want to scope a search to the currently viewed user account as well as an expanded site-wide option.
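
    An OpenSearch description is a small XML document plus a link tag in each page's head. The Python sketch below builds both. The function names and any URL templates passed in are hypothetical; the element names, the `{searchTerms}` placeholder, and the `application/opensearchdescription+xml` media type follow the OpenSearch 1.1 format.

```python
import xml.etree.ElementTree as ET
from html import escape

OPENSEARCH_NS = "http://a9.com/-/spec/opensearch/1.1/"


def opensearch_description(short_name, description, url_template):
    """Build an OpenSearch 1.1 description document as an XML string."""
    ET.register_namespace("", OPENSEARCH_NS)
    root = ET.Element("{%s}OpenSearchDescription" % OPENSEARCH_NS)
    ET.SubElement(root, "{%s}ShortName" % OPENSEARCH_NS).text = short_name
    ET.SubElement(root, "{%s}Description" % OPENSEARCH_NS).text = description
    # The template contains {searchTerms}, which the browser fills in.
    ET.SubElement(root, "{%s}Url" % OPENSEARCH_NS,
                  {"type": "text/html", "template": url_template})
    return ET.tostring(root, encoding="unicode")


def search_link_tag(title, href):
    """Return the <link> tag that advertises a description document to browsers."""
    return ('<link rel="search" type="application/opensearchdescription+xml" '
            'title="%s" href="%s" />' % (escape(title, quote=True),
                                         escape(href, quote=True)))
```

    A profile page could emit two link tags, one pointing at a site-wide description and one at a description whose template scopes the query to the viewed account.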

    Summary

    TwitterFE is a read-only clone of Twitter's front-end that fixes many of my frustrations with the site's front-end engineering and creates a new platform for future third-party development. Any site could roll these types of improvements back into their core services. Twitter APIs are full-featured enough I can clone the Twitter front-end without creating yet another stand-alone Twitter-like site.

    There is a difference between a website or widget rendering in a browser and having the same site perform exceptionally well. Established web teams should revisit their web content to optimize experiences.

    TwitterFE.com is the result of one person working part-time for a week to re-write the front end of a website serving millions of monthly visitors. Similar lessons apply throughout the Web world.

    I now have a new platform to develop features beyond what's currently offered on Twitter.com. If you're an iPhone developer in need of a headless Twitter API proxy for push updates let me know.

    What other front-end features do you wish established websites would invest time and effort to improve?

  6. Feb05

    Sniff browser history for improved user experience

    The social web has filled our websites with too much third-party clutter as we figure out the best way to integrate content with the favorite sites and preferences of our visitors. Intelligent websites should tune in to the content preferences of their visitors, tailoring a specific experience based on each visitor's favorite sites and services across the social web. In this post I will teach you how to mine the rich treasure trove of personalization data sitting inside your visitors' browser history for deep personalization experiences.

    I first blogged about this technique almost two years ago but I will now provide even more details and example implementations.

    1. Evaluate links on a page
    2. Test a known set of links
    3. Live demos and examples
      1. Online aggregators
      2. Social bookmarks
      3. OpenID providers
      4. Mapping services
    4. Summary

    Web browsers store a list of web pages in local history for about a week by default. Your browsing history improves your browsing experience by autocompleting a URL in your address bar, helping you search for previously viewed content, or coloring previously visited links on a page. Link coloring, or more generally applying special CSS properties to a :visited link, is a DOM-accessible page state and a useful method of comparing a known set of links against a visitor's browser history for improved user experience.

    • New Site
    • Visited site

    A web browser such as Firefox or Internet Explorer will load the current user's browser history into memory and compare each link (anchor) on the page against the user's previous history. Previously visited links receive a special CSS pseudo-class distinction of :visited and may receive special styling.

    <style type="text/css">
    ul#test li a:visited{color:green !important}
    </style>
    <ul id="test">
      <li><a href="http://example.com/">Example</a></li>
    </ul>
    

    The example above defines a list of test links and applies custom CSS to any visited link within the set. Your site's JavaScript code can request each link within the test unordered list and evaluate its visited state.

    Any website can test a known set of links against the current visitor's browser history using standard JavaScript.

    1. Place your set of links on the page at load or dynamically using the DOM access methods.
    2. Attach a special color to each visited link in your test set using finely scoped CSS.
    3. Walk the evaluated DOM for each link in your test set, comparing the link's color style against your previously defined value.
    4. Record each link that matches the expected value.
    5. Customize content based on this new information (optional).
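
    The steps above can be sketched in a few lines. This assumes a browser that exposes :visited styling through computed style, as browsers did when I wrote this. Current browsers report the unvisited style instead. The function and parameter names are mine, and the color check is injected so the comparison logic stands on its own.

```javascript
// Return the href of every link whose color matches the :visited color.
// `links` is an array-like of anchors; `getColor` reads one link's color.
function findVisitedLinks(links, getColor, visitedColor) {
  const visited = [];
  for (const link of Array.from(links)) {
    if (getColor(link) === visitedColor) {
      visited.push(link.href);
    }
  }
  return visited;
}

// Browser wiring for the ul#test markup shown earlier. Assumption: the
// CSS keyword green is reported by computed style as "rgb(0, 128, 0)".
//
// findVisitedLinks(
//   document.querySelectorAll("ul#test li a"),
//   (a) => window.getComputedStyle(a).color,
//   "rgb(0, 128, 0)"
// );
```

    Scoping the CSS rule to a single test list keeps the probe color from leaking into the rest of the page's link styling.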

    Each link needs to be explicitly specified and evaluated. The standard rules of URL structure still apply, which means we are evaluating a distinct combination of scheme, host, and path. We do not have access to wildcard or regex definitions of a linked resource.

    In less geeky terms we need to take into account all the different ways a particular resource might be referenced. We might need to check the http and https versions of the page, with and without a www. prefix to more thoroughly evaluate active use of a particular website and its pages.
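
    A small helper can generate those combinations for a given host and path. This is only a sketch. The function name is hypothetical, and it covers just the scheme and www. variations mentioned above.

```javascript
// Expand one host and path into the scheme and www. combinations to test.
function urlVariants(host, path = "/") {
  const bareHost = host.replace(/^www\./, "");
  const variants = [];
  for (const scheme of ["http", "https"]) {
    for (const prefix of ["www.", ""]) {
      variants.push(scheme + "://" + prefix + bareHost + path);
    }
  }
  return variants;
}

// urlVariants("example.com") returns:
//   http://www.example.com/
//   http://example.com/
//   https://www.example.com/
//   https://example.com/
```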

    I group my tests into sets of URLs with the most likely matches placed at the beginning of the set. I evaluate each link in the set until I find a match thereby exhausting positive indicators of site activity while prioritizing the data scan.
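
    Here is one way that grouping might look in code. It is a sketch, not my production script. The isVisited predicate is a hypothetical stand-in for the per-link color check.

```javascript
// `sites` maps a site name to its candidate URLs, most likely match first.
// `isVisited` is a hypothetical predicate that tests a single URL.
function detectActiveSites(sites, isVisited) {
  const active = [];
  for (const [name, urls] of Object.entries(sites)) {
    // Array.prototype.some stops at the first URL that matches, so a hit
    // near the front of the set skips the remaining checks for that site.
    if (urls.some((url) => isVisited(url))) {
      active.push(name);
    }
  }
  return active;
}
```

    Ordering each set by likelihood means an active user of a site usually costs a single check, while only non-users pay for the full set.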

    Live demos and examples

    Sniffing a visitor's browser history has good and evil implications. An advertiser can determine if you visited Audi's website lately, drill down on exact Audi models, and offer related information without ever placing code on the Audi website. I have been scanning the browser history of my site visitors for the past few months and I have coded a few examples to show benevolent uses for improved user experience.

    Online aggregators

    Feed aggregator button grid

    Clusters of feed subscription buttons clutter our websites, displaying tiny banner ads for online aggregators of little use to most of our site visitors. My blog checks a known list of online aggregators against the current visitor's browser history and adds a targeted feed subscription button for increased conversion. A Google Reader user will see an "Add to Google" button and a Netvibes user will see an "Add to Netvibes" button without cluttering up the interface. I insert direct links to each site's feed handlers to help convert the current visitor into a long-term subscriber.

    Once I match a particular service I could also check to see if the current visitor is already subscribed to my feed. I would simply need to run a second test against the data retrieval URL, such as feedid=1234, to match web traffic with subscriber numbers.

    Visit my live example of link scanning popular online feed aggregators for a demo and the applicable code.

    Social Bookmarks

    Social bookmark button sample

    I like to see my latest blog posts spread all over the web thanks to social bookmarking sites and other methods of content filtering and annotation. Most sites spray a group of tiny service icons near their blog posts and hope a visitor recognizes the 16 pixel square and takes action. Suck. There has to be a better way.

    I can scan a current visitor's browser history to determine an active presence on one or more bookmarking sites. Once I determine the current visitor is also a Digg user I can show live data from Digg.com to prompt a specific action such as submitting a story or voting for content. I can create a much better user experience for 3 services I know my visitor actively uses instead of spraying 50 sites across the page.

    Visit my live example of link scanning popular social bookmarking sites for a demo and the applicable code.

    OpenID providers

    Pibb OpenID signin

    OpenID is an increasingly popular single sign-on method and centralized identity service. OpenID lets a member of your site sign on using a username and password from a growing list of OpenID providers including your instant messenger, web portal, blog host, or telephone company account. Visitors signing up for your site or service shouldn't have to know anything about OpenID, federated identities, or other geeky things, but should be able to easily discover they can sign in with a service they already use and trust every day.

    I can scan a list of sign-in endpoints for a list of OpenID providers and only present my site visitor with options actually relevant to their everyday web usage. Prompting a user to sign in to your service with their WordPress.com account should be much more effective than an input field sporting an OpenID icon. Link scanning for active usage should increase new member sign-ups, reduce support costs due to yet another username and password, and make your members happy.

    Visit my live example of link scanning current OpenID providers for a demo and applicable code.

    Mapping services

    Facebook map drop-down

    Online mapping services have changed the way we interact with location data. Need to get to 123 Main Street? Not a problem, I'll just send that data over to your favorite mapping service to help you find your way.

    I can scan a visitor's browser history to determine their favorite mapping service. Perhaps she is most comfortable with MapQuest, Google Maps, or Yahoo. Or maybe she uses a Garmin GPS unit and would prefer a direct sync with that specialized service. Determining my visitors' favorite mapping tool helps me deliver a valuable visualization or link I know they prefer.

    Visit my live example of link scanning map API providers for a demo and applicable code.

    Summary

    Websites should take advantage of the full capabilities of modern browsers to deliver a compelling user experience. Built-in capabilities such as XMLHttpRequest took years of implementation before finding their asynchronous groove in data-heavy websites. I hope we can similarly probe other latent useful features to improve the social web through more personalized and responsive experiences.

    I have been scanning the browser history of my website visitors for the past few months to gracefully enhance adding my Atom feed to their favorite feed reader. Easily recognized branding such as "Add to My Yahoo" has yielded much higher conversion rates than a simple Atom link with a minimal effect on page load performance. Dynamically checking for active usage of 50 or so aggregators allows me to extend my total test list and promote an obscure tool that might never make the cut for permanent on-screen real estate.

    How will your site utilize your visitor's browser history for a more custom user experience? How will you connect data in new ways once you have concrete knowledge of the new feature developments that will be most useful to your visitors' online lifestyle?

  7. Jan29

    Data interchange for the social web

    Data portability is only useful if outside systems can comprehend the exported data. Well-described and interoperable data sets open new possibilities for context-aware social applications, importing your friends, photos, or genetic markup from an existing system into your current tool of choice. In this post I will discuss website best practices for exporting portable, descriptive data sets in the name of data portability. This post builds upon user authorization concepts covered in my last post.

    Expressing data between two unrelated systems is difficult at best. You need a shared set of vocabulary to explain even the basic data points (time, person, etc.). Good data exports will want to represent as much data as possible with the least probable data loss.

    Voyager golden record cover

    NASA launched the Voyager 1 spacecraft into space in September 1977 with a set of golden records onboard. These records communicate small pieces of human knowledge to any intelligent life that may discover our small explorer. The graphic above is humanity's attempt at data interoperability, teaching alien explorers the proper positioning of an included stylus over a record rotating once every 3.6 seconds (time is expressed as the fundamental transition of the hydrogen atom). Thankfully web developers do not have to worry about interoperability with so many unknown measures, but your data could just as easily be lost and never played back for other worlds to hear.

    Identify exportable data

    The first step in data export is identifying the unique pieces of information you would like to package and ship outside your walls. What information might be useful to a user seeking to backup or otherwise export his or her data? How would you like to import such data back into your own website?

    Google Mail message listing sample

    Pictured above is a list of messages stored in Gmail. One message is part of a continuing conversation or thread, another message is flagged, and two messages have custom labels. A typical e-mail system might just export a list of raw messages but could possibly lose key data such as a flagged state or labels/tags.

    Research existing data standards

    Data interoperability is not a new concept and your current challenges may be easily solved by existing certified and de-facto standards. Standards increase the chances your data will be consumed, processed, and understood by others. You could invent an entirely new dialect and vocabulary to describe your information but you will be much more successful at disseminating data if you are easily interpreted.

    Standards organizations have spent years analyzing the essential elements and interoperability requirements of many common forms of data. Below are just a few standard data formats for elements of the social web.

    People, Places, and Things
      vCard
      xNAL
      KML
      LDAP
    Events
      iCalendar
    News articles
      Atom Syndication Format
      News Industry Text Format
    Human DNA
      NCBI Homo sapiens genome build 36.2, FASTA

    Each data markup has a specific set of required data intended for a specific audience or interpreter. Google Maps prefers a feed of business listings and locations in xNAL while Google Earth prefers KML for example. Bloggers output news articles in Atom for consumption by a specific set of tools, while mainstream publications mark up their stories in a news industry format for increased granularity. Some formats may not be applicable if your product does not store all the required types of data (i.e. you know their name but not their hometown). Your company will need to select a target output format based on expected external use and how your information might map onto a format's required elements.

    Extend where appropriate

    Each format supports extended namespaces for custom data not covered by the base vocabulary. A member's favorite food or soccer club is not an essential component of an international standards body but can easily be extended with your own custom namespace where appropriate.

    The same rules of data loss apply to custom namespaces: custom definitions are more likely to be missed while common namespaces are more easily understood. Extended namespaces may already be in active use by a big company or a coalition, increasing your chances of data visibility. An AOL Instant Messenger screenname is defined as "X-AIM" in a vCard context for example, where the X- represents an extension element.
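
    As a rough illustration, here is a sketch that exports a member record as a vCard 3.0 string with the X-AIM extension. The member field names are hypothetical. The property names (FN, N, EMAIL) and the escaping rules follow vCard 3.0 as I understand it.

```javascript
// Escape backslashes, commas, semicolons, and newlines in a text value.
function escapeVCardText(value) {
  return String(value)
    .replace(/\\/g, "\\\\")
    .replace(/([,;])/g, "\\$1")
    .replace(/\r?\n/g, "\\n");
}

// Export a member record as a vCard 3.0 string.
// The `member` field names are hypothetical; X-AIM is an extension property.
function toVCard(member) {
  const lines = [
    "BEGIN:VCARD",
    "VERSION:3.0",
    "FN:" + escapeVCardText(member.fullName),
    "N:" + escapeVCardText(member.familyName) + ";" +
      escapeVCardText(member.givenName) + ";;;",
  ];
  if (member.email) {
    lines.push("EMAIL;TYPE=INTERNET:" + escapeVCardText(member.email));
  }
  if (member.aim) {
    // Consumers that do not know X-AIM can skip it without losing the card.
    lines.push("X-AIM:" + escapeVCardText(member.aim));
  }
  lines.push("END:VCARD");
  // vCard lines are terminated with CRLF.
  return lines.join("\r\n") + "\r\n";
}
```

    The base properties survive in any vCard reader, while the extension line only carries meaning for consumers that recognize it.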

    Summary

    Data portability and interoperability on the social web continue to be a hot topic. While there are PR benefits for first-movers I expect there will not be widespread adoption until portable data has a remote consumer. Startups with limited resources will need to see a possible consuming service for their exported data before carving out part of their product cycle for the new feature. I think data portability is a great project for this summer's interns, providing deep exposure to data complexity and the industry as a whole while balancing proper authentication and privacy concerns.

  8. Jan21

    Data Portability, Authentication, and Authorization

    The social web is booming, signing up new users and generating new pieces of unique content at a steady clip. A recurring theme of the social web is "data portability," the ability to change providers without leaving behind accumulated contacts and content. Most nodes of the social web agree data portability is a good thing, but the exact process of authentication, authorization, and transport of a given user and his or her data is still up in the air. In this post I will take a deeper look at the current best practices of the social Web from the point of view of its major data hubs. We will take a detailed look at the right and wrong ways to request user data from social hubs large and small, and outline some action items for developers and business people interested in data portability and interoperability done right.

    General issues

    Friends, photographs, and other objects of meaning are essential parts of the social web. We're much more inclined to physically move from one city to the next if our friends, furniture, and clothes come along with us. The interconnectedness of the digitized social web makes the moving process much simpler: we can lift our friends from one location into another, clone our digital photographs, and match our blog or diary entries to the structure of our new social home. Each of these digital movers represents what we generally call "social network portability" or, more generically, "data portability."

    Social networks accelerate interactions and your general sense of happiness in your new home through automated pieces of software designed to help you move data, or simply mine its content, from some of the most popular sites and services on the Web. These access paths are roughly equivalent to a new physical location setting up easy transit routes between some of the largest cities to help fuel new growth.

    Facebook Friend Finder e-mail import

    Your e-mail inbox is currently the most popular way to construct social context in an entirely new location. Sites such as Facebook request your login credentials for a large online hub such as Google, Yahoo!, or Microsoft to impersonate you on each network and read all data which may be relevant to the social network such as a list of e-mail correspondents. Every day social network users hand over working user names and passwords for other websites and hope the new service does the right thing with such sensitive information. Trusted brands don't like external sites collecting sensitive login information from their users and want to prevent a repeat of the phishing scams faced by PayPal and others. There is a better way to request sensitive data on behalf of a user, limited to a specific task, and with established forms of trust and identity.

    1. Use the front door
    2. Identify yourself
    3. State your intentions
    4. Provide secure transport

    Use the front door

    Google, Yahoo!, and Microsoft all support web-based authentication by third parties requesting data on behalf of an active user. The Google Authentication Proxy interface (AuthSub), Yahoo! Browser-Based Authentication, and Microsoft's Windows Live ID Web Authentication issue a security token to third-party requesters once a user has approved data access. This token can allow one-time or repeated access and is the preferred method of interaction for today's large data hubs. The OAuth project is a similar concept to web-based third-party authentication systems of the large Internet portals, and may be a common form of third-party access in the future.

    Google Accounts Access example

    Supporting websites provide limited account access to a registered entity after receiving authorization from a specific user. The user can typically view a list of previously authorized third parties and revoke access at any time. The third party retains access to a particular account even after the user changes his or her password.

    Imagine if you could give your local grocery store access to just your kitchen, but not hand over the keys to your entire house. A delivery person would be automatically scanned upon arrival, compared against a registry, and granted access to the kitchen if you previously assigned them access. You could revoke their access to your kitchen at any time, but they never have access to your jewelry box or other non-essential functions within your house.
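
    The registry in that analogy can be modeled in a few lines. This is a toy sketch with hypothetical names, not how any of the portals implement it. A real service would issue random, unguessable tokens rather than the counter used here.

```javascript
// A toy registry of per-user, per-scope grants to third-party apps.
class AccessGrants {
  constructor() {
    this.grants = new Map(); // token -> { user, app, scope }
    this.nextId = 1;
  }

  // The user approves `app` for one `scope`. The app receives a token
  // and never sees the user's password.
  grant(user, app, scope) {
    // Assumption for illustration only: real tokens must be unguessable.
    const token = "token-" + this.nextId++;
    this.grants.set(token, { user, app, scope });
    return token;
  }

  // A token opens only the scope it was issued for.
  allows(token, scope) {
    const grant = this.grants.get(token);
    return grant !== undefined && grant.scope === scope;
  }

  // List the third parties a user has authorized.
  authorizedApps(user) {
    return Array.from(this.grants.values())
      .filter((grant) => grant.user === user)
      .map((grant) => grant.app);
  }

  // The user can withdraw access at any time.
  revoke(token) {
    return this.grants.delete(token);
  }
}
```

    Grants live apart from the password, which is why a password change leaves them intact and why revoking one app does not disturb the others.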

    Identify yourself

    Third-party applications requesting access should first register with the target service for accurate identification and tracking. Applications receive an identification key for future communications, tied to the base set of permissions required to accomplish their task (e.g. read only or read/write). A registered application can complete a few extra steps for added user trust and fewer user-facing warning messages.

    State your intentions

    Your application or web service should focus on a specific task such as retrieving a list of contacts from an online address book. Your authentication requests should specify this scope and required permissions (e.g. read only) when you request a user's permission to access his or her data.

    Google services with Gmail highlighted

    An application declaring scope lets users know you are only interested in a single scan of their e-mail and you will not have access to their credit card preferences, stored home address, or the ability to send e-mails from their account. Not requesting full account access in the form of a username and a password creates better trust from the user and the user's existing service(s).
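
    A scoped request usually boils down to sending the user to the provider with a few declared parameters. Below is a sketch of that idea. The endpoint and the parameter names (app_id, scope, access, next) are hypothetical. Each provider defines its own names, so check its documentation for the real ones.

```javascript
// Build the URL a user visits to approve a scoped, token-based request.
// All parameter names here are hypothetical placeholders.
function buildAuthorizationUrl(endpoint, options) {
  if (options.access !== "read" && options.access !== "read-write") {
    throw new Error("access must be 'read' or 'read-write'");
  }
  const url = new URL(endpoint);
  url.searchParams.set("app_id", options.appId);      // who is asking
  url.searchParams.set("scope", options.scope);       // which data set
  url.searchParams.set("access", options.access);     // what they may do
  url.searchParams.set("next", options.returnUrl);    // where to send the token
  return url.toString();
}
```

    Note what is absent: the request never carries the user's username or password, only the application's identity and the narrow task it wants approved.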

    Provide secure transport

    Armored Truck

    How will you transport my user's data back to your servers? Did you bring an armored car with your company's logo prominently displayed on the side or will my data sit in the back of your borrowed pick-up truck? Requesting applications should transport user data over secure communications channels to prevent eavesdropping and forged messages. Registered and verified secured communications will result in fewer user-facing warning messages of mistrust, and secure certificates are relatively inexpensive. Large portals such as Google or Microsoft will bump your communications (and privileges) to mutual authentication if you are capable.

    Twitter SSL certificate Firefox view

    Register an SSL/TLS certificate for your website to enable secure transport and further identify yourself. Certificates vary in cost and complexity from a free self-signed cert to paid certificates from a major provider with extended validation and server-gated cryptography. Google and Yahoo! use 256-bit keys. Windows Live and Facebook use 128-bit keys.
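
    On the requesting side, a simple guard can keep user data off insecure channels. This is a minimal sketch with a hypothetical function name. It checks only the URL scheme and does nothing to validate the certificate itself.

```javascript
// Refuse to send user data to anything other than an https: endpoint.
// Returns the parsed URL so callers can pass it on to their HTTP client.
function requireSecureEndpoint(endpoint) {
  const url = new URL(endpoint);
  if (url.protocol !== "https:") {
    throw new Error("Refusing to send user data over " + url.protocol + "//" + url.host);
  }
  return url;
}
```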

    Summary

    Data authorization is the first step in data portability. Emerging standards such as OAuth combined with established access methods from Internet giants provide specialized access for third-parties acting on behalf of another user. Sites interested in importing data from other services should take note of these best practices and prepare their services for intelligent interchange.

  9. Dec01

    Facebook cleanses Pages of supposed fakesters

    Facebook Pages icon

    Facebook is proactively deleting Pages and other content from its site in an attempt to limit fake listings created by unauthorized entities. The new enforcement procedures started on Thursday night with many Facebook users receiving notifications of their deleted pages and an apparent violation of the Facebook Pages Terms of Service. The main issue seems to be individuals on Facebook proving they have the authority to create a Page for their company, band, or product. Facebook is now requesting a yet unspecified amount of documentation from each user before they create a Facebook Page to avoid future deletions.

    Yesterday morning many brands sat down in front of their computers and learned Facebook had deleted their pages from the social network. Facebook called into question the identity of athletes, startup company founders, web hosting companies, video game development houses, and television networks maintaining pages on Facebook to connect with their fans. Electronic Arts was no longer engaging fans of its video games on Facebook, and Fox television shows disappeared. Athletes such as David Beckham were no longer kicking it with their fans. The new proactive fake Pages sweep by Facebook has isolated brands and celebrities currently evaluating the Facebook social network and its benefits. This early negative experience will likely harm Facebook's attempts to reach out to brands for advertising and promotional opportunities on Facebook.

    We need a document showing that you (or your company, which we will need proof of affiliation with, as well) have the rights to represent these companies and individuals. A document on the company letterhead would be a good start. You can email all documentation to advertise@facebook.com, and we will then be able to assist you.

    Facebook is asking each person creating a new Facebook Page to first contact Facebook's advertising department via e-mail with documentation stating you have the right to create the page. If your request is approved Facebook staff will "make a note on your account so that [the page] won't be removed" according to an e-mail I received from Facebook's sales staff. I have submitted requests for Facebook to reinstate pages for two products I own, Startup Search and Widget Summit. I have also submitted a copyright counter notification just in case someone filed a copyright infringement notice against one of my own sites and logos although I received no notification of copyright violation from Facebook for any pages, only statements questioning my authority to create such content.

    What is a Facebook Page?

    Facebook Page Blockbuster

    Facebook Pages were introduced on November 6 as part of the new Facebook Ads suite of products. Facebook profiles correspond to an individual person while Facebook Pages are a specialized type of Facebook profile created for a local business, brand, product, non-profit organization, celebrity figure, or other entity. Facebook Pages are administered by one or more Facebook members and allow anyone to become a "fan" of a product, company, or service.

    Brands such as Blockbuster might create a page on Facebook to connect fans of its video rental service and reach new customers. Blockbuster can distribute its custom Facebook application through this channel and help these new fans share their video rental history throughout the social graph. Blockbuster can also accelerate its fan growth through targeted advertisements on Facebook.

    A band or celebrity might create a Facebook Page to let its marketing staff promote and measure their online identity. David Beckham could have an individual profile on Facebook, but the friend requests would become overwhelming and lose all personal meaning. Instead Beckham can create a Facebook Page with multiple administrators and managers experimenting with how to best connect with fans online.

    The Fakester Problem

    Social networks have always been plagued by fake accounts, popularized under the "Fakester" term during the Friendster era. Anyone can create spammy social network profiles for a popular Christmas toy, pharmaceutical drug, or an actress in the news. These pages take advantage of the celebrity or brand power of another entity for personal gain and may confuse online visitors or hurt an online reputation.

    Trademarks and service marks are one way we protect online brands and establish authorized uses. Companies often take control of domain names from domain squatters through the legal enforcement afforded to trademark owners.

    Some social networks choose to play a continual game of Whac-A-Mole with fake profile creators, trimming the unwanted parts of their user base to keep a clean community of real people conducting meaningful interactions. Most social networks choose not to review new profiles and instead wait for a reported violation from a brand or copyright owner through established channels such as the Digital Millennium Copyright Act and its safe harbor provisions. Google and Second Life regularly deal with reported fakesters, trademark infringements, and copyright violations and have established well-documented processes to deal with each request.

    Permission-based inclusion

    Facebook Pages terms of service state "Facebook does not review Facebook Pages to determine if they were created by an appropriate party" yet recent actions and statements by the company indicate a Page is considered fake and subject to removal by Facebook unless written documentation is filed with the company to assure your authorization to create such content on behalf of all involved entities. There are a few big problems with this approach that will likely cause companies to walk away instead of submitting papers for each employee and service provider.

    Local business authorization is expensive or impossible

    Facebook would like to create local listings Pages covering bars, restaurants, bookstores, and other places of business in your local town. The owner of my favorite cafe uses a Gmail address and my uncle runs his auto repair shop from Comcast services. They both have filed the appropriate fictitious business name forms (doing business as) with their local county governments but the hassle of locating or reissuing these documents for a social network such as Facebook is a prohibitive barrier to entry. VeriSign will walk into your startup's office, verify your existence, and issue an extended validation web browser certificate for $1500 if you have the money and the patience to endure that level of verification.

    Big brand, many managers

    Companies have teams of individuals working on a product or service both inside and outside the company. A smaller website such as RockYou might have a marketing consultant, public relations firm, and a product team interested in engaging a larger audience on Facebook or other networks. A video game such as Rock Band is developed by Harmonix, published by MTV, distributed by Electronic Arts, marketed to the press by a public relations firm, and marketed to an online audience by a specialized social marketing firm.

    The number of people involved in a particular brand, product, or service makes verifying each business and its employees a prohibitive cost of engagement. Like the $1500 browser certification, companies will only go through the hassle if they have a significant opportunity for large returns on their investments. Facebook and other social networks are still in the experimental stage at many large brands.

    The techie friend

    Less technically skilled businesses and brands rely on the help of a techie friend for many tasks in the techie world. My local cafe might not know how to log-in to a website, claim their business, and input their hours of operation. Traditionally the techie friend role for local listings was the phone company calling every business for yellow page listings and perhaps a few ads. Earlier this year Google started paying $10 for each local business listing created by its users going door to door snapping photographs and collecting local business data.

    It is difficult to verify the techie friend since they are helping out someone who was not Internet or social network savvy enough to fill out the appropriate online forms or mailers. I had not heard of any restaurant or cafe listings removed from a local database until Facebook's recent cleaning spree.

    Protecting your company and brand

    There are a few things your company and staff can do to decrease the chance Facebook might remove your employees, products, or advertisements from its site.

    1. Create a Facebook Network for your company and its employees.

      Facebook Networks associate Facebook members with their current or past places of employment. Company membership will be validated by e-mail address, allowing anyone with a "yourcompany.com" e-mail address to join your corporate Facebook network.

      Web hosting company Joyent recently had their Facebook Page deleted even though the company had an established business partnership with Facebook. Joyent's CTO and VP of Marketing were Page administrators yet there was no Joyent company network and therefore no established link to help their case. The executives had no official Facebook validation and verifications on their account and were listed as regular members of the San Francisco network.

    2. Submit a documentation request to Facebook's Advertising department, asserting your ability to create a Facebook page for your brand. Name each of your employees or outside partners you would like to grant explicit permission to administer or contribute to your Facebook Pages or content.

      Facebook requires documentation from "all companies involved" showing each Page administrator has been authorized to represent the product. The exact documentation and assertion required by Facebook is still unclear, but a proactive approach should help protect your Pages from deletion.

      Facebook also offers people-friendly URLs for some verified pages. Your page could have a URL of facebook.com/blockbuster instead of Facebook page 5973937214 after verification.

    3. Create a Facebook ad.

      Paying Facebook a few dollars might help ensure the long-term existence of your Page. Fraudsters might be less likely to spend money promoting a false page or entering identifying information such as a credit card in Facebook's system. An advertisement history associated with your page may be a positive signal indicating your engaged interest in the success of your Page.

      Facebook advertisements have a minimum cost of a penny per click and a minimum daily ad budget of $5. If you would like to associate an ad with your Page for little to no money, target an obscure group for a 1-day ad campaign, such as Texas vegetarians who prefer Kobe beef, so your fees never approach even $5.
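
      The arithmetic behind that advice can be sketched in a few lines of Python. The figures come from the paragraph above; the function names, and the assumptions that every click is billed at the one-cent minimum and that spend stops at the daily budget, are mine rather than Facebook's documented billing rules.

```python
MIN_COST_PER_CLICK_CENTS = 1   # "a penny per click"
DAILY_BUDGET_CENTS = 500       # "$5" daily ad budget


def campaign_cost_cents(clicks: int, days: int = 1) -> int:
    """Estimated spend in cents, assuming minimum-price clicks and a hard budget cap."""
    spend = clicks * MIN_COST_PER_CLICK_CENTS
    cap = DAILY_BUDGET_CENTS * days
    return min(spend, cap)


def clicks_to_exhaust_daily_budget() -> int:
    """Number of minimum-price clicks needed to use up one day's budget."""
    return DAILY_BUDGET_CENTS // MIN_COST_PER_CLICK_CENTS


print(clicks_to_exhaust_daily_budget())   # 500
print(campaign_cost_cents(clicks=12))     # 12
```

      Under these assumptions an obscure audience that produces a dozen clicks costs 12 cents, and it would take 500 clicks in a day to reach the full $5.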

    Summary

    Facebook is a closed network and the company reserves all rights to determine when, where, or how you or your brand might exist on its site. It's shaky ground to enter, but the promise of millions of eagerly waiting customers acts as a siren call to brands entering the wild west world of social networking and user-generated media. Large brands such as Coca-Cola are pulling back from their planned involvement and instead opting for a "wait and see" attitude, and other brands may follow their lead as Facebook works out the kinks of each new system.

  10. Apr30

    Podcast: Social media trends with Charlene Li

    Social computing has changed the way we interact with the Web. Our information consumption and production benefit from the participation of the crowd in its various forms, creating niche audiences and new types of curators independent of space and time. We're connected to local experts on hiking, cooking, parenting, programming, and much more. Yet social media extends beyond the realm of content creators, bolstered by the masses who comment, rate, rank, share, and read, helping us find the content we seek.

    Forrester Research released a report last week, Social Technographics, detailing levels of social media participation among 10,000 adults and youth. Their sample panel provides new insights and statistics into how users are currently engaging in social media activities, and the motivations which might drive such participation.

    Last Friday I sat down with Forrester Research analyst Charlene Li to discuss her report's findings and their implications for business on the Web. You can read more about the topics of our social media trends and engagement discussion on my podcast site, and view select results from the research report. Our podcast on social media trends is 20 minutes in length, a 9 MB download.

    Tip: The full Social Technographics research report costs $279 and is available as a downloadable PDF. The accompanying free PowerPoint slide deck contains key statistics and other summary data you might find useful while keeping your wallet in your pocket.

Niall Kennedy is a web technologist in San Francisco, California in the United States. I am very interested in the world of... MORE »
