Rewriting Digg feeds using Atom 1.0

Digg logo with feed icon

Digg currently uses RSS 2.0 as a lightweight API, adding its own namespaced elements to describe Digg-specific values within the XML. The current Digg feed reinvents some elements (digg:category???) that I feel could be better marked up with existing standards and namespaces. I’ll use Digg’s data in this post to show how some complex data and relationships can be expressed using Atom 1.0.

Simplifying drives adoption

It’s important to express your data inside pre-defined elements and attributes when possible for easy parsing by the many feed libraries used by developers all over the web. PHP developers don’t write their own parsers; they use something like Magpie instead. Python developers might use Universal Feed Parser. Windows developers might use the Windows RSS Platform. Each abstracted view of your feed might hide your proprietary namespaced data or at least make it more difficult for a programmer to access your one-off namespace.

Feed-level Identifier

<id>tag:digg.com,2006:technology</id>

Globally unique identifiers are a good thing. They help aggregators figure out when they have seen a particular resource in the past, and store or display that information accordingly. You can use a URL as your identifier, but URLs tend to change and may not represent the same resource over time. The tag URI scheme, RFC 4151, is another way to create an unchanging, globally unique URI, as in the example above. See Mark Pilgrim’s How to make a good ID in Atom for more information.
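
Here is a minimal Python sketch of how a publisher might mint tag URIs like the one above. The function name make_tag_uri and its parameters are my own invention for illustration, not part of any library. It relies on the RFC 4151 rule that the date may be shortened to a year, or a year and month, when the omitted parts would be 01.

```python
from datetime import date


def make_tag_uri(authority, minted, specific):
    """Build a tag URI of the form tag:<authority>,<date>:<specific>."""
    # RFC 4151 lets the date be shortened: an omitted month or day means 01.
    if minted.month == 1 and minted.day == 1:
        stamp = f"{minted.year:04d}"
    elif minted.day == 1:
        stamp = f"{minted.year:04d}-{minted.month:02d}"
    else:
        stamp = minted.isoformat()
    return f"tag:{authority},{stamp}:{specific}"


# Hypothetical usage: a date on which digg.com controlled its domain name.
print(make_tag_uri("digg.com", date(2006, 1, 1), "technology"))
```

The date only needs to be a day on which the publisher actually owned the authority name, so it never has to change once the identifier is minted.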

Simple List Extensions

<cf:treatAs>list</cf:treatAs>
<cf:sort ns="http://digg.com/docs/diggrss" element="diggCount" label="Digg Count" data-type="number" />

Digg’s feed is an ordered list and therefore a good candidate for Microsoft’s Simple List Extensions namespace. The first line excerpted above defines Digg’s feed as a list. The second line defines a sort option that may be rendered in a user interface such as Internet Explorer’s feed view, allowing someone to sort by the number of “diggs” received by any one item.
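
To make the sort option concrete, here is a rough Python sketch of what a client could do with it, using only the standard library. The sample feed, its titles and digg counts, and the function name sorted_titles are made up for illustration. The cf namespace URI and the cf:listinfo wrapper reflect my understanding of the Simple List Extensions spec, so treat them as assumptions.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
CF = "http://www.microsoft.com/schemas/rss/core/2005"  # assumed SLE namespace

# Invented sample data; only the cf:sort attributes come from the post.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:cf="http://www.microsoft.com/schemas/rss/core/2005"
      xmlns:digg="http://digg.com/docs/diggrss">
  <cf:treatAs>list</cf:treatAs>
  <cf:listinfo>
    <cf:sort ns="http://digg.com/docs/diggrss" element="diggCount"
             label="Digg Count" data-type="number" />
  </cf:listinfo>
  <entry><title>Story A</title><digg:diggCount>42</digg:diggCount></entry>
  <entry><title>Story B</title><digg:diggCount>310</digg:diggCount></entry>
  <entry><title>Story C</title><digg:diggCount>7</digg:diggCount></entry>
</feed>"""


def sorted_titles(feed_xml):
    """Return entry titles ordered by the field named in cf:sort, highest first."""
    root = ET.fromstring(feed_xml)
    sort = root.find(f".//{{{CF}}}sort")
    # The ns and element attributes together name the element to sort on.
    key_tag = f"{{{sort.get('ns')}}}{sort.get('element')}"
    numeric = sort.get("data-type") == "number"

    def sort_key(entry):
        raw = entry.findtext(key_tag, default="0")
        return float(raw) if numeric else raw

    entries = sorted(root.findall(f"{{{ATOM}}}entry"), key=sort_key, reverse=True)
    return [entry.findtext(f"{{{ATOM}}}title") for entry in entries]


print(sorted_titles(SAMPLE))
```

The client never needs to know what a "digg" is. The feed tells it which element to sort on and that the values are numbers.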

Multiple link relations

<link rel="via" type="text/html" href="http://example.com/news.html" />

A Digg story page is the appropriate HTML link alternate for the feed, but it is possible to provide additional meanings and links for the individual story. The via value signifies the source of information for the entry, which in this case is the URL originally submitted to Digg.
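
As a sketch of how a consumer might tell these links apart, the Python below groups an entry’s links by their rel value. The entry markup, the Digg story URL, and the function name links_by_rel are hypothetical. The one rule taken from the Atom spec is that a link with no rel attribute is treated as alternate.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"

# Hypothetical entry: the story URL is invented for illustration.
ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Example story</title>
  <link type="text/html" href="http://digg.com/technology/example_story" />
  <link rel="via" type="text/html" href="http://example.com/news.html" />
</entry>"""


def links_by_rel(entry_xml):
    """Group an entry's link hrefs by relation type."""
    entry = ET.fromstring(entry_xml)
    links = {}
    for link in entry.findall(f"{{{ATOM}}}link"):
        # A missing rel attribute means rel="alternate".
        rel = link.get("rel", "alternate")
        links.setdefault(rel, []).append(link.get("href"))
    return links


print(links_by_rel(ENTRY))
```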

Published vs. Updated

A Digg story is originally published when a user submits information for the first time to Digg’s servers. The story is continually updated as members leave comments and “digg” the story over time. The Atom 1.0 specification defines updated as “modified in a way the publisher considers significant” which in this case could mean new comments, new diggs, or being significantly buried by the user base.
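
One way a publisher might derive the two dates is sketched below in Python. It assumes every comment or digg counts as a significant change, which is a publisher’s choice rather than anything Atom requires. The function names and the sample timestamps are invented for illustration.

```python
from datetime import datetime, timezone


def atom_timestamp(moment):
    """Format a timezone-aware datetime as a UTC timestamp for Atom."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def entry_dates(submitted, activity):
    """published is the submission time; updated is the latest activity."""
    # Assumption: every comment or digg is treated as a significant change.
    latest = max([submitted, *activity])
    return {
        "published": atom_timestamp(submitted),
        "updated": atom_timestamp(latest),
    }


# Invented sample times for a story and two later actions on it.
submitted = datetime(2006, 8, 15, 9, 30, 0, tzinfo=timezone.utc)
activity = [
    datetime(2006, 8, 16, 10, 0, 0, tzinfo=timezone.utc),
    datetime(2006, 8, 16, 22, 12, 48, tzinfo=timezone.utc),
]
print(entry_dates(submitted, activity))
```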

Categories

<category scheme="http://digg.com/" label="Tech Industry News" term="tech_news" />

Categories can be limited in scope and apply to a certain scheme. A category value is defined by the term attribute, and the label makes it a bit prettier for your readers. Some aggregators might append the term value to your defined scheme, giving readers a way to dive into a particular topic right away.
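
The scheme-plus-term behavior could look something like this in Python. This is an aggregator convention rather than something Atom defines, and the function name category_link is my own. The resulting URL only works if the publisher chooses a scheme where it resolves.

```python
from urllib.parse import quote


def category_link(scheme, term):
    """Join a category scheme and term into a browsable URL."""
    # Percent-encode the term so spaces or slashes cannot break the path.
    return scheme.rstrip("/") + "/" + quote(term, safe="")


print(category_link("http://digg.com/", "tech_news"))
```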

Comment information

<link rel="replies" type="text/html" href="[digg comments url]" thr:count="101" thr:updated="2006-08-16T22:12:48Z" />

The Atom Threading Extensions help publishers define information about comment counts and the location of comments about the entry, among other things. The example above defines where users can read comments about the entry, the number of comments available at last update, and when the last comment was submitted for a given story.
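
Reading those values back out is straightforward. The Python sketch below pulls the comment location, count, and last-updated time from a replies link. The entry markup, the example.com comments URL, and the function name reply_info are placeholders of my own. The thr namespace URI is the one defined for the Atom Threading Extensions.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

ATOM = "http://www.w3.org/2005/Atom"
THR = "http://purl.org/syndication/thread/1.0"

# Placeholder entry: the comments URL stands in for the real Digg URL.
ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:thr="http://purl.org/syndication/thread/1.0">
  <title>Example story</title>
  <link rel="replies" type="text/html" href="http://example.com/comments"
        thr:count="101" thr:updated="2006-08-16T22:12:48Z" />
</entry>"""


def reply_info(entry_xml):
    """Return comment details from the first replies link, or None."""
    entry = ET.fromstring(entry_xml)
    for link in entry.findall(f"{{{ATOM}}}link"):
        if link.get("rel") != "replies":
            continue
        # Both threading attributes are optional hints, so allow for absence.
        count = link.get(f"{{{THR}}}count")
        updated = link.get(f"{{{THR}}}updated")
        if updated is not None:
            updated = datetime.strptime(updated, "%Y-%m-%dT%H:%M:%SZ")
            updated = updated.replace(tzinfo=timezone.utc)
        return {
            "href": link.get("href"),
            "count": int(count) if count is not None else None,
            "updated": updated,
        }
    return None


print(reply_info(ENTRY))
```

The timestamp parsing here only handles the UTC "Z" form used in the example. A real client would want a fuller RFC 3339 parser.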

Citing the source

I defined the Digg submitter using the source element, including the username, profile picture, profile web page, a feed of all submissions by that user, and his last submission.

Conclusion

There are many ways to express data and take advantage of deployed feed aggregators in the market today. The Atom 1.0 IETF standard is about 9 months old and introduces new ways of describing data that a wide range of deployed feed parsers and interpreters can understand. Digg is just one example of translating data described in a format such as HTML into easily digestible individual entries in Atom.

Let your users kick it old school

I moved off Blogger shortly after Pyra Labs was acquired by Google, but I tried out the new Blogger today anyways, to see what had changed. It was a walk down memory lane best experienced with a Blogger sweatshirt, my reward for being a Blogger Pro member for a few months before they sold.

Google Blogger sweatshirt

Wouldn’t it be cool if Blogger Pro users had some sort of special badging in their templates to let people know they are an old school G? I might even be less likely to change blogging platforms because I like the way that special piece of flair looks on my page and the way “member since 2002” commands respect from the n00bs.

These little touches create folklore and connection to products. If I move off of Movable Type I’ll lose the recently updated key I received years ago for a $20 donation to Ben’s personal PayPal account, back when I bugged Mena daily until I had a rotating spot on the Movable Type homepage and the donors page. There’s a little bit of history every time I see that text input box in my blog configuration.

I’d like to see more recognition of long-term members in online software. Let us show our hipness and experience with our favorite web applications by giving us a special piece of bling to share with the world. Because I’m a long-time customer and proud of it.

Google WiFi is live, local coupons for everyone

Google switched on its public WiFi network in Mountain View less than an hour ago, its first experiment in location-based services covering 11 miles of tech-savvy real estate in the heart of Silicon Valley. Each WiFi user must have a Google Account, and Google can pinpoint your location on the network based on your current access point.

Every Google account is GTalk enabled, including the just-announced GTalk client with free file transfer and voicemail. Every resident of Mountain View can now have free Internet access, email, voicemail, and much more.

Google also announced its first killer app for location-based connectivity: merchant coupons. Businesses can create and offer coupons through Google’s Local Business Center, prompting user action when viewing a Google Map or local listings. Businesses verify their information through an automated phone call for future click-to-call opportunities or a walk-in prompted by local listings.

Combine the two announcements and you get something really cool: local merchants getting online and interacting with the local community through accurate information and perhaps a few freebies and discounts. Google will have a chance to play with the technology in its own backyard, soliciting feedback from local merchants and residents as they traverse the always-on mesh surrounding them. Life at Starbucks with their $10 Internet day passes may never be the same.

Facebook Developer API

Social networking site Facebook has opened up access to its service via a set of RESTian APIs, giving developers access to a user’s profile data, friend list, inbox, calendar, photos, blog posts, and more. The API could be used to create a desktop or mobile version of a user’s Facebook data or easily migrate them to a new social network, photo site, or calendar. (via GigaOm)

Web applications are limited to 100,000 requests a day and desktop applications are restricted to 5 requests per second.

AOL acquires Userplane

AOL acquired Userplane today, a 12-person startup in Los Angeles powering text, voice, and video chat for MySpace, IGN, Honda, Date.com, and others. Userplane is the first startup to be featured at SF Tech Sessions and later acquired, but I know they won’t be the last. Attendees of April’s tech session had a sneak peek at the MySpace instant messaging integration a week before it was formally announced at Digital Hollywood.

Userplane is a small, privately funded team and will continue to deploy white-label messaging solutions to sites around the Web. The network will now federate with AIM’s network, allowing custom accounts and identities on a particular service — car enthusiast by day, dating profile by night — while not requiring a separate open tool for each network.

OS X Leopard includes feed platform?

The developer preview of OS X 10.5 “Leopard” includes an integrated feed syndication platform with Bonjour integration according to a message board posting from a WWDC attendee. The software was distributed to all conference attendees and should be available on your favorite file-sharing network shortly.

A new framework is included for publishing and subscribing to RSS and Atom feeds, including complete RSS parsing and generation. Local feeds can be shared over Bonjour zero-configuration sharing and discovery.

The new framework would provide an easy interface for Mac developers to include feed syndication features inside their products and share user data across applications. Bonjour integration could create a grid network of feed subscribers, allowing a user to grab the latest BoingBoing entry or NPR podcast from their local network instead of initiating a new external network request.

Open Source Lab Rackathon

The Open Source Lab at Oregon State is having a fundraising drive to continue their existing work and take on a few new projects. OSL serves up hosting and download space for Apache, Debian, Drupal, Eclipse, GNOME, KDE, Mozilla, OpenOffice, PHP, PostgreSQL, and many others.

OSL rack

A $20 donation places your name in 10-point font on a piece of paper attached to the server cage for a whole year! Each additional $20 donated increases your font size by 1 point. OSL sysadmins will have to look at your name for a full year and you’ll be helping a good cause. You can even watch sysadmins at work through the OSL webcam.

Leaving Microsoft

I am leaving Microsoft to start my own company. My last day at Microsoft is next Friday, August 18. It’s uncertain whether Microsoft will continue the feed platform work I started, but it’s some good stuff so I hope they do.

Ray Ozzie

RSS is the internet’s answer to the notification scenarios we’ve discussed and worked on for some time, and is filling a role as “the UNIX pipe of the internet” as people use it to connect data and systems in unanticipated ways.

I joined Microsoft in April excited to change the world and build an Internet-scale feed platform to power the experience of Microsoft’s hundreds of millions of users and to open up the feed experience to outside developers to leverage in their own applications. The opportunity presented to me was unique, a way to change how the world interacts with syndication technologies such as RSS, RDF, and Atom. The launch of Windows Live and Ray Ozzie’s vision of Internet services disruption made me believe Microsoft was serious about the space and would not be left behind in yet another emerging industry as they had been with the web browser and search.

The Windows Live initiative got off to a huge start, with lots of new services created and an “invest to win” strategy in the new division. There were so many new programs created and so much headcount opening up that Microsoft told Wall Street it would be spending $2 billion more than anticipated in the short term to cover these new costs, including over 10,000 new hires over the last fiscal year.

Microsoft stock price April 2006 - August 2006

The stock plummeted on the announcement that Microsoft did not have its costs under control. Microsoft’s market cap lost close to $59 billion in the six weeks after I joined and second quarter financials were released, more than the GDP of Ecuador and over half the market cap of Google. What do you do when the market responds to your 6-month-old online services strategy by reducing your valuation by 1.5 Yahoos? Windows Live is under some heavy change, reorganization, pullback, and general paralysis, and unfortunately my ability to perform, hire, and execute was completely frozen as well.

I’m happy with what I was able to accomplish as a team of one attached to the Windows Live Alerts group. If we had the resources I truly believe we could have tackled the number of users Hotmail, Messenger, Spaces, or even Internet Explorer might supply, and then asked for more by opening up the platform to the world. I was able to borrow resources here and there, but there was no team being built around the platform in the foreseeable future. I could have stayed at Microsoft, waited for the other 85% of the company to ship their products, and then hoped support for my group might be back on track again, but I didn’t want to sit around doing little to nothing until Vista, Office, and Exchange ship. It’s easier to get funding outside Microsoft than inside at the moment, so I am stepping out and doing my own thing.

So what’s next? I had a few startup ideas before joining Microsoft and those never went away. I want to change the way the world thinks about personal data, publishing, and search and I might have the right opportunity to do just that. The product(s) will hopefully be profitable in under a year and not rely on advertising revenue to get there. I fully own my IP rights again on August 19, so I won’t be talking much about past inventions until then to limit legal hassles (I invented this before Microsoft, but still playing it safe).

I’d also like to help out my friends with startups a bit more, and make sure they have everything they need to succeed. It was great to see Automattic engage the WordPress community last weekend at WordCamp and I’m proud of the work Om is doing with his new media empire. As long as I have a successful business paying the mortgage I’d love to continue helping out local startups in various ways without the many conflicts of interest that come with being part of a big company. On a similar note, I’ve received a good response from people wanting to work together on a new venture and can see the tremendous opportunity ahead from many talented people building smart, small, agile businesses focused on thrilling users.

I’m driven by the many opportunities ahead to develop new user-centered products. I’ll be writing lots of Python in the coming weeks and months and I have a few good blog posts on feed syndication planned in the next week as I wind down at Microsoft. My personal contact information remains the same.