Data Portability, Authentication, and Authorization

The social web is booming, signing up new users and generating new pieces of unique content at a steady clip. A recurring theme of the social web is “data portability,” the ability to change providers without leaving behind accumulated contacts and content. Most nodes of the social web agree data portability is a good thing, but the exact process of authentication, authorization, and transport of a given user and his or her data is still up in the air. In this post I will take a deeper look at the current best practices of the social Web from the point of view of its major data hubs. We will take a detailed look at the right and wrong ways to request user data from social hubs large and small, and outline some action items for developers and business people interested in data portability and interoperability done right.

General issues

Friends, photographs, and other objects of meaning are essential parts of the social web. We’re much more inclined to physically move from one city to the next if our friends, furniture, and clothes come along with us. The interconnectedness of the digitized social web makes the moving process much simpler: we can lift friends from one location into another, clone your digital photographs, and match your blog or diary entries to the structure of your new social home. Each of these digital movers represent what we generally call “social network portability” or, more generically, “data portability.”

Social networks accelerate interactions and your general sense of happiness in your new home through automated pieces of software designed to help you move data, or simply mine its content, from some of the most popular sites and services on the Web. These access paths are roughly equivalent to a new physical location setting up easy transit routes between some of the largest cities to help fuel new growth.

Facebook Friend Finder find by email

Your e-mail inbox is currently the most popular way to construct social context in an entirely new location. Site such as Facebook request your login credentials for a large online hub such as Google, Yahoo!, or Microsoft to impersonate you on each network and read all data which may be relevant to the social network such as a list of e-mail correspondents. Every day social network users hand over working user names and passwords for other websites and hope the new service does the right thing with such sensitive information. Trusted brands don’t like external sites collecting sensitive login information from their users and want to prevent a repeat of the phishing scams faced by PayPal and others. There is a better way to request sensitive data on behalf of a user, limited to a specific task, and with established forms of trust and identity.

  1. Use the front door
  2. Identify yourself
  3. State your intentions
  4. Provide secure transport

Use the front door

Google, Yahoo!, and Microsoft all support web-based authentication by third parties requesting data on behalf of an active user. The Google Authentication Proxy interface (AuthSub), Yahoo! Browser-Based Authentication, and Microsoft’s Windows Live ID Web Authentication issue a security token to third-party requesters once a user has approved data access. This token can allow one-time or repeated access and is the preferred method of interaction for today’s large data hubs. The OAuth project is a similar concept to web-based third-party authentication systems of the large Internet portals, and may be a common form of third-party access in the future.

Google Accounts Access example

Supporting websites provide limited account access to a registered entity after receiving authorization from a specific user. The user can typically view a list of previously authorized third parties and revoke access at any time. The third-party retains access to a particular account even after the user changes his or her password.

Imagine if you could give your local grocery store access to just your kitchen, but not hand over the keys to your entire house. A delivery person would be automatically scanned upon arrival, compared against a registry, and granted access to the kitchen if yo previously assigned them access. You could revoke their access to your kitchen at any time, but they never have access to your jewelry box or other non-essential functions within your house.

Identify yourself

Third-party applications requesting access should first register with the target service for accurate identification and tracking. Applications receive an identification key for future communications connected to a base set of permissions required to accomplish your task (e.g. read only or read/write). A registered application can complete a few extra steps for added user trust and less user-facing warning messages.

State your intentions

Your application or web service should focus on a specific task such as retrieving a list of contacts from an online address book. Your authentication requests should specify this scope and required permissions (e.g. read only) when you request a user’s permission to access his or her data.

Google services with Gmail highlighted

An application declaring scope lets users know you are only interested in a single scan of their e-mail and you will not have access to their credit card preferences, stored home address, or the ability to send e-mails from their account. Not requesting full account access in the form of a username and a password creates better trust from the user and the user’s existing service(s).

Provide secure transport

Armored Truck How will you transport my user’s data back to your servers? Did you bring an armored car with your company’s logo prominently displayed on the side or will my data sit in the back of your borrowed pick-up truck? Requesting applications should transport user data over secure communications channels to prevent eavesdropping and forged messages. Registered and verified secured communications will result in less user-facing warning messages of mistrust, and secure certificates are relatively inexpensive. Large portals such as Google or Microsoft will bump your communications (and privileges) to mutual authentication if you are capable.

Twitter SSL certificate Firefox view

Register an SSL/TLS certificate for your website to enable secure transport and further identify yourself. Certificates vary in cost and complexity from a free self-signed cert to paid certificates from a major provider with extended validation and server-gated cryptography. Google and Yahoo! use 256-bit keys. Windows Live and Facebook use 128-bit keys.

Summary

Data authorization is the first step in data portability. Emerging standards such as OAuth combined with established access methods from Internet giants provide specialized access for third-parties acting on behalf of another user. Sites interested in importing data from other services should take note of these best practices and prepare their services for intelligent interchange.

  • Posted
  • Updated at
  • Comments [2]

2 comments

Commentary on "Data Portability, Authentication, and Authorization":

  1. Uno de Waal on wrote:

    Hey Niall,this is a great post on understanding data portability. I’m definitely sending this around!

  2. Pratik Desai on wrote:

    Excellent analogies for explaining the concepts.Not a big deal but i think the exact order should have been authorization, authentication and transport in the introductory paragraph:”…….but the exact process of authentication, authorization, and transport
    of a given user and his or her data is still up in the air……..”Really great article anyways!!