Recently in Web services Category

Web services and web programming. Includes REST and SOAP APIs, script enhancements, and web platform development.

  1. Jan29

    Data interchange for the social web

    Data portability is only useful if outside systems can comprehend the exported data. Well-described and interoperable data sets open new possibilities for context-aware social applications, importing your friends, photos, or genetic markup from an existing system into your current tool of choice. In this post I will discuss website best practices for exporting portable, descriptive data sets in the name of data portability. This post builds upon user authorization concepts covered in my last post.

    Expressing data between two unrelated systems is difficult at best. You need a shared set of vocabulary to explain even the basic data points (time, person, etc.). Good data exports will want to represent as much data as possible with the least probable data loss.

    Voyager golden record cover

    NASA launched the Voyager 1 spacecraft into space in September 1977 with a set of golden records onboard. These records communicate small pieces of human knowledge to any intelligent life that may discover our small explorer. The graphic above is humanity's attempt at data interoperability, teaching alien explorers the proper positioning of an included stylus over a record rotating once every 3.6 seconds (time is expressed as the fundamental transition of the hydrogen atom). Thankfully web developers do not have to worry about interoperability with so many unknown measures, but your data could just as easily be lost and never played back for other worlds to hear.

    Identify exportable data

    The first step in data export is identifying the unique pieces of information you would like to package and ship outside your walls. What information might be useful to a user seeking to back up or otherwise export his or her data? How would you like to import such data back into your own website?

    Google Mail message listing sample

    Pictured above is a list of messages stored in Gmail. One message is part of a continuing conversation or thread, another message is flagged, and two messages have custom labels. A typical e-mail system might just export a list of raw messages but could possibly lose key data such as a flagged state or labels/tags.
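
    As a rough sketch of that difference, the JavaScript below contrasts a raw-message export with one that keeps thread, flag, and label state. The record shape and field names (threadId, flagged, labels) are hypothetical and are not Gmail's actual export format.

```javascript
// Hypothetical internal message records; field names are illustrative only.
const messages = [
  { id: 'm1', subject: 'Lunch?', body: 'Noon works.', threadId: 't1', flagged: false, labels: [] },
  { id: 'm2', subject: 'Re: Lunch?', body: 'See you then.', threadId: 't1', flagged: true, labels: ['friends'] },
];

// A lossy export keeps only the raw message content.
function exportRaw(message) {
  return { id: message.id, subject: message.subject, body: message.body };
}

// A descriptive export also carries thread membership, flag state, and labels.
function exportDescriptive(message) {
  return {
    ...exportRaw(message),
    threadId: message.threadId,
    flagged: message.flagged,
    labels: [...message.labels],
  };
}

// List the fields a given export would drop, so data loss is visible up front.
function lostFields(message, exported) {
  return Object.keys(message).filter((key) => !(key in exported));
}
```

    Running lostFields against both exports of the same message is a quick way to see what a re-import could never recover.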

    Research existing data standards

    Data interoperability is not a new concept and your current challenges may be easily solved by existing certified and de-facto standards. Standards increase the chances your data will be consumed, processed, and understood by others. You could invent an entirely new dialect and vocabulary to describe your information but you will be much more successful at disseminating data if you are easily interpreted.

    Standards organizations have spent years analyzing the essential elements and interoperability requirements of many common forms of data. Below are just a few standard data formats for elements of the social web.

    People, Places, and Things
        vCard
        xNAL
        KML
        LDAP
    Events
        iCalendar
    News articles
        Atom Syndication Format
        News Industry Text Format
    Human DNA
        NCBI Homo sapiens genome build 36.2, FASTA

    Each data markup has a specific set of required data intended for a specific audience or interpreter. Google Maps prefers a feed of business listings and locations in xNAL while Google Earth prefers KML, for example. Bloggers output news articles in Atom for consumption by a specific set of tools, while mainstream publications mark up their stories in a news industry format for increased granularity. Some formats may not be applicable if your product does not store all the required types of data (e.g. you know a member's name but not his or her hometown). Your company will need to select a target output format based on expected external use and how your information might map onto a format's required elements.

    Extend where appropriate

    Each format supports extended namespaces for custom data not covered by the base vocabulary. A member's favorite food or soccer club is not an essential component of an international standards body but can easily be extended with your own custom namespace where appropriate.

    The same rules of data loss apply to custom namespaces: custom definitions are more likely to be missed while common namespaces are more easily understood. Extended namespaces may already be in active use by a big company or a coalition, increasing your chances of data visibility. An AOL Instant Messenger screenname is defined as "X-AIM" in a vCard context for example, where the X- represents an extension element.
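
    To make the extension idea concrete, here is a minimal sketch that writes a small vCard 3.0 record with the X-AIM element mentioned above. The input object shape and the X-EXAMPLE-FAVORITE-FOOD property are my own hypothetical illustrations, not part of any standard or shared namespace.

```javascript
// Escape characters that are special in vCard text values
// (backslash, comma, semicolon, newline).
function escapeText(value) {
  return String(value)
    .replace(/\\/g, '\\\\')
    .replace(/,/g, '\\,')
    .replace(/;/g, '\\;')
    .replace(/\n/g, '\\n');
}

// Build a small vCard 3.0 record. `person` is a hypothetical internal shape.
function toVCard(person) {
  const lines = [
    'BEGIN:VCARD',
    'VERSION:3.0',
    `FN:${escapeText(person.given + ' ' + person.family)}`,
    `N:${escapeText(person.family)};${escapeText(person.given)};;;`,
  ];
  if (person.email) lines.push(`EMAIL:${escapeText(person.email)}`);
  // Extension elements use the X- prefix; X-AIM is the one named in the text.
  if (person.aim) lines.push(`X-AIM:${escapeText(person.aim)}`);
  // Hypothetical custom property: consumers that do not know it will likely skip it.
  if (person.favoriteFood) lines.push(`X-EXAMPLE-FAVORITE-FOOD:${escapeText(person.favoriteFood)}`);
  lines.push('END:VCARD');
  return lines.join('\r\n') + '\r\n';
}
```

    A consumer that understands X-AIM keeps the screenname, while the invented favorite-food property is the kind of data most importers would silently drop.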

    Summary

    Data portability and interoperability on the social web continue to be a hot topic. While there are PR benefits for first-movers, I expect there will not be widespread adoption until portable data has a remote consumer. Startups with limited resources will need to see a possible consuming service for their exported data before carving out part of their product cycle for the new feature. I think data portability is a great project for this summer's interns, providing deep exposure to data complexity and the industry as a whole while balancing proper authentication and privacy concerns.

  2. Jan21

    Data Portability, Authentication, and Authorization

    The social web is booming, signing up new users and generating new pieces of unique content at a steady clip. A recurring theme of the social web is "data portability," the ability to change providers without leaving behind accumulated contacts and content. Most nodes of the social web agree data portability is a good thing, but the exact process of authentication, authorization, and transport of a given user and his or her data is still up in the air. In this post I will take a deeper look at the current best practices of the social Web from the point of view of its major data hubs. We will take a detailed look at the right and wrong ways to request user data from social hubs large and small, and outline some action items for developers and business people interested in data portability and interoperability done right.

    General issues

    Friends, photographs, and other objects of meaning are essential parts of the social web. We're much more inclined to physically move from one city to the next if our friends, furniture, and clothes come along with us. The interconnectedness of the digitized social web makes the moving process much simpler: we can lift our friends from one location into another, clone our digital photographs, and match our blog or diary entries to the structure of our new social home. Each of these digital movers represents what we generally call "social network portability" or, more generically, "data portability."

    Social networks accelerate interactions and your general sense of happiness in your new home through automated pieces of software designed to help you move data, or simply mine its content, from some of the most popular sites and services on the Web. These access paths are roughly equivalent to a new physical location setting up easy transit routes between some of the largest cities to help fuel new growth.

    Facebook Friend Finder e-mail import

    Your e-mail inbox is currently the most popular way to construct social context in an entirely new location. Sites such as Facebook request your login credentials for a large online hub such as Google, Yahoo!, or Microsoft to impersonate you on each network and read any data that may be relevant to the social network, such as a list of e-mail correspondents. Every day social network users hand over working user names and passwords for other websites and hope the new service does the right thing with such sensitive information. Trusted brands don't like external sites collecting sensitive login information from their users and want to prevent a repeat of the phishing scams faced by PayPal and others. There is a better way to request sensitive data on behalf of a user, limited to a specific task, and with established forms of trust and identity.

    1. Use the front door
    2. Identify yourself
    3. State your intentions
    4. Provide secure transport

    Use the front door

    Google, Yahoo!, and Microsoft all support web-based authentication by third parties requesting data on behalf of an active user. The Google Authentication Proxy interface (AuthSub), Yahoo! Browser-Based Authentication, and Microsoft's Windows Live ID Web Authentication issue a security token to third-party requesters once a user has approved data access. This token can allow one-time or repeated access and is the preferred method of interaction for today's large data hubs. The OAuth project is a similar concept to web-based third-party authentication systems of the large Internet portals, and may be a common form of third-party access in the future.

    Google Accounts Access example

    Supporting websites provide limited account access to a registered entity after receiving authorization from a specific user. The user can typically view a list of previously authorized third parties and revoke access at any time. The third-party retains access to a particular account even after the user changes his or her password.

    Imagine if you could give your local grocery store access to just your kitchen, but not hand over the keys to your entire house. A delivery person would be automatically scanned upon arrival, compared against a registry, and granted access to the kitchen if you previously assigned them access. You could revoke their access to your kitchen at any time, but they never have access to your jewelry box or other non-essential functions within your house.
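
    The registry in that analogy can be modeled in a few lines. This is only an in-memory sketch with hypothetical names; it is not how AuthSub, BBAuth, or Windows Live ID store grants. It does show why a grant survives a password change: the password is never part of the grant.

```javascript
// Minimal in-memory model of scoped access grants. All names are hypothetical.
class GrantRegistry {
  constructor() {
    this.grants = new Map(); // token -> { user, app, scope }
    this.nextId = 1;
  }

  // The user approves one application for one scope (e.g. "contacts.read").
  grant(user, app, scope) {
    const token = `token-${this.nextId++}`;
    this.grants.set(token, { user, app, scope });
    return token;
  }

  // A request succeeds only if the token exists and covers the requested scope.
  isAllowed(token, scope) {
    const entry = this.grants.get(token);
    return entry !== undefined && entry.scope === scope;
  }

  // What the user would see on an "authorized applications" page.
  listFor(user) {
    return [...this.grants.values()]
      .filter((g) => g.user === user)
      .map((g) => `${g.app}: ${g.scope}`);
  }

  // The user can revoke access at any time.
  revoke(token) {
    return this.grants.delete(token);
  }
}
```

    The grocery store holds a token for the kitchen scope only, so a request for any other scope fails even before the user revokes anything.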

    Identify yourself

    Third-party applications requesting access should first register with the target service for accurate identification and tracking. Applications receive an identification key for future communications connected to a base set of permissions required to accomplish your task (e.g. read only or read/write). A registered application can complete a few extra steps for added user trust and fewer user-facing warning messages.

    State your intentions

    Your application or web service should focus on a specific task such as retrieving a list of contacts from an online address book. Your authentication requests should specify this scope and required permissions (e.g. read only) when you request a user's permission to access his or her data.

    Google services with Gmail highlighted

    An application declaring scope lets users know you are only interested in a single scan of their e-mail and you will not have access to their credit card preferences, stored home address, or the ability to send e-mails from their account. Not requesting full account access in the form of a username and a password creates better trust from the user and the user's existing service(s).
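
    A scoped request might look something like the sketch below. The endpoint and the parameter names (app_id, scope, next) are hypothetical placeholders; each hub defines its own URL and parameters, so check the provider's documentation before relying on any of them.

```javascript
// Build a browser redirect URL asking the user to approve one narrowly scoped task.
// The endpoint and parameter names are hypothetical placeholders.
function buildAuthorizationUrl({ endpoint, appId, scope, returnUrl }) {
  if (!scope) throw new Error('Declare a scope; do not ask for the whole account.');
  const url = new URL(endpoint);
  url.searchParams.set('app_id', appId);
  url.searchParams.set('scope', scope);
  url.searchParams.set('next', returnUrl);
  return url.toString();
}

const authUrl = buildAuthorizationUrl({
  endpoint: 'https://hub.example.com/authorize',
  appId: 'my-registered-app',
  scope: 'contacts.read',
  returnUrl: 'https://myapp.example.com/import',
});
```

    The user's password never appears in the request; the application names itself, the one thing it wants, and where to send the user afterward.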

    Provide secure transport

    Armored Truck

    How will you transport my user's data back to your servers? Did you bring an armored car with your company's logo prominently displayed on the side or will my data sit in the back of your borrowed pick-up truck? Requesting applications should transport user data over secure communications channels to prevent eavesdropping and forged messages. Registered and verified secured communications will result in fewer user-facing warning messages of mistrust, and secure certificates are relatively inexpensive. Large portals such as Google or Microsoft will bump your communications (and privileges) to mutual authentication if you are capable.

    Twitter SSL certificate Firefox view

    Register an SSL/TLS certificate for your website to enable secure transport and further identify yourself. Certificates vary in cost and complexity from a free self-signed cert to paid certificates from a major provider with extended validation and server-gated cryptography. Google and Yahoo! use 256-bit keys. Windows Live and Facebook use 128-bit keys.
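
    On the requesting side, one small safeguard is to refuse any data transfer that is not addressed to an HTTPS URL. The helper below is a sketch with a hypothetical name; it checks only the URL scheme and does not validate certificates, which the HTTP client or browser handles.

```javascript
// Refuse to move user data over anything but HTTPS.
function requireSecureUrl(rawUrl) {
  const url = new URL(rawUrl);
  if (url.protocol !== 'https:') {
    throw new Error(`Refusing insecure transport: ${url.protocol}//${url.host}`);
  }
  return url;
}
```

    Calling requireSecureUrl before every request that carries a token or user data turns an accidental http:// link into a loud failure instead of a silent leak.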

    Summary

    Data authorization is the first step in data portability. Emerging standards such as OAuth combined with established access methods from Internet giants provide specialized access for third parties acting on behalf of another user. Sites interested in importing data from other services should take note of these best practices and prepare their services for intelligent interchange.

  3. Aug03

    JavaScript Map API comparison

    Mapping APIs are some of the most popular data services on the web today. If you've ever visited a restaurant website, taken your car in for service or browsed a mash-up you've probably come across interactive and static map images powered by Google, Yahoo, Microsoft, or AOL. In this post I'll compare performance and developer friendliness of JavaScript mapping APIs and dissect choices made by each platform that may affect your website.

    Maps API comparison

    I chose my office location in San Francisco's South Park district as a basis of comparison. The area has large buildings, small alleys, one-way streets, and newly constructed roads and bridges. Each map is a 400-pixel square centered on my office using lat/long coordinates and marked with each service's default marker. I specified a small zoom control and no marker events for quick and minimal interaction across platforms. This simple test is my attempt to recreate what a flower shop, mechanic, or other business might place on their website to help visitors add context to an address.

    Performance

    Every time you add a widget or other external feature to your website you are handing over part of your web page experience and performance to a third party. The responsiveness and total page load of these external services is one way obsessive geeks tune and tweak their sites for optimal performance. Let's take a look at how the major mapping players performed.

    Map API performance
    Provider    Render (s)   Size (KB)   Files   Memory (MB)
    Yahoo!      3.05         207         24      13.43
    Google      3.18         223         27      19.8
    Microsoft   3.65         236         20      13.9

    Yahoo! offered the fastest performance, the smallest total download, and the smallest memory footprint in my tests. Microsoft's maps took a half-second longer to download and display than either of its competitors. Google had the highest total file count due to its mapping tile behavior.
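
    For anyone who wants to rerun the comparison with their own numbers, here is a short sketch that derives the same conclusions from the table above. The property and function names are my own.

```javascript
// Measurements copied from the table above.
const results = [
  { provider: 'Yahoo!', renderSeconds: 3.05, sizeKB: 207, files: 24, memoryMB: 13.43 },
  { provider: 'Google', renderSeconds: 3.18, sizeKB: 223, files: 27, memoryMB: 19.8 },
  { provider: 'Microsoft', renderSeconds: 3.65, sizeKB: 236, files: 20, memoryMB: 13.9 },
];

// Return the provider with the lowest value for one metric.
function best(metric) {
  return results.reduce((a, b) => (b[metric] < a[metric] ? b : a)).provider;
}

// Render-time gap between two providers, rounded to hundredths of a second.
function renderGap(slower, faster) {
  const find = (name) => results.find((r) => r.provider === name).renderSeconds;
  return Number((find(slower) - find(faster)).toFixed(2));
}
```

    With these figures, best returns Yahoo! for render time, download size, and memory, and renderGap puts Microsoft 0.6 seconds behind Yahoo! and 0.47 seconds behind Google.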

    Measurement Method

    Map API performance was measured using Mozilla Firefox and Firebug network monitoring using a first-generation Apple MacBook Pro connected to the Internet over cafe Wi-Fi in San Francisco. I restarted Firefox for each of the three performance tests and requested each service with a clean cache.

    Closer look

    If you take a closer look at each of the APIs the performance impacts and general API design begin to make a lot more sense. Increasing your domain count allows more parallel downloads but will also require a new DNS lookup for each new domain.
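
    To see that trade-off in your own pages, you can count the distinct hostnames behind a list of requests, for example one exported from a network monitor. The sketch below uses a made-up request log; the hostnames are placeholders and not the ones any provider actually used.

```javascript
// Count the distinct hostnames behind a list of requested URLs.
// Each distinct hostname may cost one DNS lookup, but also opens more parallel downloads.
function countDomains(urls) {
  const hosts = new Map();
  for (const raw of urls) {
    const host = new URL(raw).hostname;
    hosts.set(host, (hosts.get(host) || 0) + 1);
  }
  return hosts;
}

// Hypothetical request log; these are not hostnames any provider actually used.
const requests = [
  'http://tiles1.example.com/0/0.png',
  'http://tiles2.example.com/0/1.png',
  'http://tiles1.example.com/1/0.png',
  'http://api.example.com/maps.js',
];
const byHost = countDomains(requests);
```

    Here byHost.size is 3: four files, three hostnames, so up to three DNS lookups on a cold cache.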

    Google Maps API 2.84

    Google served its 27 files from six separate domains. Google placed 16 map files in my browser, covering a 4096 pixel square to deliver my 400 pixel square window, or 10 times the total requested area. If a site visitor clicks and drags your map they will not have to load another tile as they explore the immediate vicinity.

    Yahoo! Maps API 3.7

    Yahoo! Maps utilize the YUI JavaScript library to interact with the page and handle new events. They are using old beta versions of YUI 2.0 libraries DOM, Event, Drag and Drop, and Animation.

    Yahoo! served files from 4 domains including 9 map tiles. Yahoo! adds a map scale to each page, adding a few more images and functions to your total page load.

    Windows Live Local Search Maps Virtual Earth API 5.0

    Microsoft uses its ASP.net Ajax ("Atlas") JavaScript libraries inside of your map view. The Virtual Earth API adds files and features you never asked for such as CSS files for traffic conditions and extra images typically used in its Dashboard interface.

    Microsoft rendered its map using only 6 tiles covering a 1500 pixel square, or about 4 times the requested map window.

    Quick notes

    My mapping test was a quick experiment to measure the strengths and weaknesses of various developer platforms and services. You can view my full test suite and alter each comparison for your own needs.

    All of the mapping API providers offered similar street data but differentiated themselves with add-ons and overlays.

    Best developer site
    Virtual Earth Interactive SDK

    Microsoft's Virtual Earth Interactive SDK was the best-designed site for new map developers. I was able to easily browse examples, view each example on a live map, view source code, and dive deeper into each method's full documentation when needed. Microsoft helps upgraders quickly figure out what's new with a special call-out at the bottom of their dashboard.

    Best use of color

    Yahoo! Maps include color-coded neighborhoods, green parks, and dark blue water. The bright red freeways draw attention away from my points of interest. Yahoo! color-codes some building outlines but it's unclear how someone might decipher a pink office building or orange hotel.

    Best buildings
    ATT Park Virtual Earth

    Microsoft provided the best identification of major buildings in my tests. Their map showed the outline of AT&T Park, a baseball stadium home to the San Francisco Giants, and not just the plot of land.

    Best map design

    I prefer Google's map markers over the talk bubbles, pushpins, kites, and stars used by other services. Google's placement and size of street names was the easiest to read in my tests. Street names are repeated every few blocks so you're never too far away from the proper context.

    I tried the MapQuest Open API beta but found the documentation and learning process extremely frustrating. The MapQuest API provided no live drag events in my testing and seemed to prefer surrounding your map in a colored navigation frame and web form. MapQuest may be the current mapping leader, but its API is far behind the large web portals.

    I like the ideals behind OpenStreetMap but its United States coverage is almost nonexistent.

    Summary

    Google Maps continues to dominate the mapping API space but there are some viable alternatives. Yahoo! and Microsoft may have an advantage bundling maps and other APIs with their freely available JavaScript, Flash, and Silverlight libraries and frameworks while Google adds advanced features such as street-level views and Google Gadget info windows.

    I'm glad big Internet portals are spending millions of dollars for online mapping supremacy I can easily integrate in my web applications. The mapping API space changes at least once every 6 months but I am still satisfied with my choice of Google Maps on Startup Search.

  4. Apr23

    Podcast: Taking Ajax offline

    Rich Internet applications are stepping out of the web browser and onto the desktop, helped along by a new set of toolkits. Web developers are able to code against desktop resources using familiar languages and toolkits such as JavaScript, Ruby on Rails, or HTTP interactions. Offline access for web applications is about much more than planes, trains, and automobiles -- it can accelerate performance and integrate with established desktop interactions as well.

    Offline web applications are a hot topic, but often misunderstood. In this week's podcast I step beyond the myths of offline web applications with special guest Brad Neuberg. Brad has spent years digging into reliable storage methods available within a browser environment, and most recently developed the Dojo Offline Toolkit for complete offline access. You can directly download the Offline Web Applications podcast or head on over to the podcast blog post to read more about discussed topics.

    Beyond disconnect

    Offline web application capabilities are about more than a missing Internet connection. Application data is stored on a local hard drive instead of a far away datacenter, boosting your load times. Web applications become searchable components of the local operating system, displayed inside a Windows Vista Search result or Mac OS X Spotlight. Your application data might become fully integrated with desktop calendar, address book, or web feed platforms, exposed to any requesting application including mobile phone synchronization or personal backups.

    Summary

    The offline web application space is a hot topic of discussion which may or may not apply to your product. Is offline access a graceful enhancement on top of your existing application? Are customers clamoring for it? Will you take your application offline using Adobe Apollo, Firefox 3, Joyent Slingshot, XULRunner, or Zimbra Offline? Those are just a few of the toolkits we know about this month, yet more are coming.

    It's time to demystify. I hope you enjoy my podcast with Brad Neuberg, one of the experts in the space of offline access for web applications, as a quick way to get your head around some of these larger issues in the future of web application development.

  5. Feb26

    Universality of the web widget

    Netvibes announced a "Universal Widget API" at last week's Future of Web Apps conference in London, promising a write-once run anywhere widget environment using an open-source widget runtime. The new widget system encourages publishers to author widgets using the Netvibes API and extend the reach of their content beyond the Netvibes user base through an adaptable wrapper. In this post I'll walk through some of the differences between widget deployment endpoints from the publisher's point of view, explaining just a few ways a widget must adjust its dialect and structure to adapt and optimize in different widget environments.

    1. Manifests
    2. Inline and collapsed widgets
    3. Storing local variables
    4. Platform look and feel
    5. Requesting remote data
    6. Internationalization and localization
    7. Libraries
    8. Summary

    The manifest

    Each widget platform requires some sort of manifest file, a definition of a widget's contents, file, and author information for interpretation by the widget platform and its widget directory. A manifest file will commonly collect pointers to files within the widget package (CSS, JavaScript, widget preview image), a description of the widget (title, summary, author), and any libraries or non-baseline features you would like to include with each widget render.

    The widget description is easily standardized, collecting an author's name, e-mail, and URL, the widget's homepage on the web, and a description of the widget's function. The widget title is a bit more difficult, as some publishers would like to dynamically update the title to display a total number of unread mail messages, the current weather, or an active search term.
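
    As an illustration only, a platform-neutral manifest could be held as a plain object and checked before submission. The field names below are hypothetical; each real platform defines its own manifest format and required fields.

```javascript
// A hypothetical, platform-neutral manifest; real platforms each define their own format.
const manifest = {
  title: 'Giants News',
  summary: 'Latest San Francisco Giants headlines',
  author: { name: 'Example Author', email: 'author@example.com', url: 'http://example.com/' },
  homepage: 'http://example.com/widgets/giants',
  files: { script: 'widget.js', style: 'widget.css', preview: 'preview.png' },
  libraries: ['tabs'],
};

// Report which descriptive fields are missing before submitting to a directory.
function missingManifestFields(m) {
  const required = ['title', 'summary', 'homepage'];
  const missing = required.filter((key) => !m[key]);
  for (const key of ['name', 'email', 'url']) {
    if (!m.author || !m.author[key]) missing.push(`author.${key}`);
  }
  return missing;
}
```

    A dynamic title, such as an unread message count, would still need platform-specific handling, since a static manifest can only carry the default.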

    Inline and collapsed widgets

    Most web widget engines offer an "inline" option, placing your widget code directly into the page instead of inside of the default placement within a sub-page (iframe). Inline widgets can change the page's background color, interact with other widget content on the page, create a dynamic widget title, and more. Permissioning inline access varies by platform but can be a useful option for your widget.

    Collapsed Microsoft gadget weather

    Microsoft gadgets support a collapsed display mode for inline gadgets, displaying extra information about a gadget's content within the gadget title bar. In the example above I've collapsed the Windows Live weather gadget, displaying a weather icon and today's temperature highs and lows instead of a multi-day forecast. A collapsed mode allows more widget content in the same available space.

    Storing local variables

    Apple Dashboard widget weather location

    Your widget may require some configuration to function at its best. You might ask the user to input a ZIP Code, a user ID, or a favorite baseball team to deliver the appropriate content in your widget. You might also allow customization of colors within the widget frame or text, the total number of items to display (3 days of weather, the latest 5 San Francisco Giants news stories, etc.), or whatever else might be applicable to your widget's display and customization.

    Widget variables and configurations are stored on the widget platform's servers, letting users input their preferences once and have the same configuration across multiple computers. The default method of preference input is usually a text input field, but you can provide a drop-down list of possible options, radio buttons, checkboxes, and many other choices.
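
    A rough sketch of how a widget might combine its own defaults with whatever the platform has stored for a user follows. The preference definitions and the resolvePreferences helper are hypothetical and do not correspond to any particular platform's preference API.

```javascript
// Hypothetical preference definitions: each has an input type and a default.
const preferenceSpec = {
  zip: { type: 'text', default: '94107' },
  days: { type: 'select', options: [1, 3, 5], default: 3 },
  metric: { type: 'checkbox', default: false },
};

// Merge the values a platform has stored for a user with the widget's defaults,
// falling back to the default when a stored select value is not a listed option.
function resolvePreferences(spec, stored) {
  const resolved = {};
  for (const [name, def] of Object.entries(spec)) {
    const hasValue = Object.prototype.hasOwnProperty.call(stored, name);
    const value = hasValue ? stored[name] : def.default;
    const valid = def.type !== 'select' || def.options.includes(value);
    resolved[name] = valid ? value : def.default;
  }
  return resolved;
}
```

    Constraining a preference to a drop-down list, as with days here, means the widget never has to render an out-of-range value that a free text field would have accepted.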

    Platform look and feel

    Each widget platform typically has its own design, styling a widget window, creating a specially styled button, or generally matching the look and feel of the surrounding environment. Windows Vista Sidebar will appreciate an Aero Glass look-and-feel, Mac users will look for Aqua-styled buttons, Windows Live gadgets might display a FancyButton, etc. Each widget can adopt the default settings of the web browser, explicitly bind with the target platform's UI, or package a custom CSS file.

    Requesting remote data

    Web widgets typically request data through the hosting platform's proxy, supplying a remote URL to a specialized data type handler and specifying a preferred response such as XML or JSON. The platform caches each request by default for a period of time (usually about 30 minutes) to increase performance and lighten the load of all the Google Personalized Homepage users subscribed to BoingBoing's RSS feed or requesting earthquake updates for northern California.
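
    The caching behavior can be approximated with a small wrapper around any fetch function. This is a sketch of the general technique, not any platform's actual proxy; fetchFn and now are injected so it runs without a network connection or a real clock.

```javascript
// Wrap a fetch function with a time-based cache, similar in spirit to a platform proxy.
// `fetchFn` and `now` are injected so the sketch runs without a network or a real clock.
function createCachingProxy(fetchFn, ttlMs, now = Date.now) {
  const cache = new Map(); // url -> { value, expires }
  return function get(url) {
    const hit = cache.get(url);
    if (hit && hit.expires > now()) return hit.value;
    const value = fetchFn(url);
    cache.set(url, { value, expires: now() + ttlMs });
    return value;
  };
}
```

    With a 30-minute lifetime, thousands of users subscribed to the same feed produce one upstream request per half hour instead of one request per page view.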

    Here's an example of how developers can request a feed on three different personalized web homepages:

    Google
    _IG_FetchFeedAsJSON('http://example.com/rss.xml', CallbackFn, 5)
    Netvibes
    Ajax.Request(NV_XML_REQUEST_URL + 'url?' + escape('http://example.com/rss.xml'), {method:'get', onSuccess:CallbackFn})
    Windows Live
    Web.Network.createRequest(Web.Network.Type.XML, 'http://example.com/rss.xml', {proxy:'rss', numItems:5}, CallbackFn)

    In the example above I specified a web feed URL, a callback function, the total number of feed items I would like returned, and a request that the data be returned to my callback function as a JSON object. Each platform also has request proxies for generic data, XML, JSON, etc. Support for proxied GET, POST, PUT, and DELETE requests may vary by platform.
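
    A universal wrapper would need to hide those three dialects behind one call. The sketch below is one way that might look, using only the three calls shown above; the detection checks and the env parameter (a stand-in for the page's global object) are my own assumptions and not part of any platform's API.

```javascript
// Pick a feed-fetching strategy based on which platform globals are present.
// `env` stands in for the page's global object so the sketch can run anywhere.
function createFeedFetcher(env) {
  if (typeof env._IG_FetchFeedAsJSON === 'function') {
    // Google: URL, callback, number of items.
    return (url, callback, count) => env._IG_FetchFeedAsJSON(url, callback, count);
  }
  if (env.Web && env.Web.Network) {
    // Windows Live: request type, URL, proxy options, callback.
    return (url, callback, count) =>
      env.Web.Network.createRequest(env.Web.Network.Type.XML, url, { proxy: 'rss', numItems: count }, callback);
  }
  if (env.Ajax && env.NV_XML_REQUEST_URL) {
    // Netvibes: the proxy URL takes the escaped feed URL; no item count in this call.
    return (url, callback) =>
      env.Ajax.Request(env.NV_XML_REQUEST_URL + 'url?' + escape(url), { method: 'get', onSuccess: callback });
  }
  throw new Error('No known widget platform detected');
}
```

    The widget author then writes one fetchFeed(url, callback, count) call, though the callback still receives whatever shape each platform returns, so a real wrapper would also have to normalize the response.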

    Internationalization and localization

    Hello world. Hola mundo. G'day mate.

    The newest web products are launching to a global audience, finding popularity beyond English-speaking borders and becoming the hot thing in France, Poland, or Japan. Some widget platforms support internationalization and localization bundles, allowing a widget author to define unique phrases sent to English, French, Polish, or Japanese users. The widget platform interprets the language requested by each user's browser combined with the international domain being accessed and passes this information along to any widget willing to listen. A Google Canada user might view your widget and request the content in French for example. Google Personalized homepage currently supports 15 languages throughout its widget platform.
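
    Inside the widget, a localization bundle usually comes down to a lookup with fallbacks. The sketch below assumes a hypothetical bundle layout keyed by language tag; real platforms ship their own bundle formats and lookup functions.

```javascript
// Hypothetical message bundles keyed by language tag.
const bundles = {
  en: { greeting: 'Hello world' },
  es: { greeting: 'Hola mundo' },
  'en-AU': { greeting: "G'day mate" },
};

// Look up a phrase, trying the full tag ("en-AU"), then the base language ("en"),
// then a default language. Returns the key itself if nothing matches.
function translate(key, languageTag, fallback = 'en') {
  const candidates = [languageTag, languageTag.split('-')[0], fallback];
  for (const tag of candidates) {
    if (bundles[tag] && key in bundles[tag]) return bundles[tag][key];
  }
  return key;
}
```

    A French-speaking Google Canada user would arrive with a tag such as fr-CA; with no French bundle defined above, this sketch falls back to English rather than showing a blank label.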

    Libraries

    Widget platforms can contain a set of basic features available to each widget but expand their capabilities through additional libraries. Google gadgets can request an analytics module to track widget views, add tabs, and more. My Yahoo! widgets might tie into already loaded YUI libraries. Windows Live Gadgets even have a preferred method of string concatenation (StringBuilder).

    A universal widget could load the appropriate library at each destination on demand, or provide a generalized runtime to load similar functionality through its own modules.

    Summary

    The immediate concern with any write-once run anywhere solution is an abstracted non-native feel for each deployment. Like the promises of Java and wxWidgets for desktop applications, cross-platform solutions can sometimes feel a bit alien and heavy when compared to a native application. Developers can extend the reach of their applications quickly and easily, trading off a native look-and-feel, performance and memory utilization, and more for a broader list of compatible platforms.

    A universal widget might lose out on advertising opportunities available in each widget platform's gallery. Your widget may not be listed in the Apple Dashboard gallery, the Google Gadgets directory, Windows Live Gadgets Gallery, etc. Users of each platform typically are directed to these directories to find new pieces of content and it's a free form of advertising for your widget and your brand.

    I think universal widgets combined with a good statistics tool will help new widget developers better understand their deployed audience and tweak their offerings over time to deliver the best possible experience for each platform. Widget developers and publishing sites will want to better understand the parent environment of their widget (Google homepage, Live.com, Netvibes, etc.), web browser and version, language, country, and much more. Netvibes will likely see an increase in widgets available for their platform and abstract some of the work needed to stay on top of the changes at each widget platform.

    There are many more differences between platforms than I've covered in this post, but I hope it gives you a good idea of what it means to integrate well across multiple points of deployment.

  6. Feb08

    Netvibes module developer collects web credentials, personal content

    A French security blogger gained access to private user data on personal homepage service Netvibes last weekend, exposing stored usernames and passwords for popular integrated web services as well as user content loaded in the page. The blogger's account has since been deleted from Blog*Spot (currently cached on Yahoo!), but he provided extended details to French blog Le blog de ¥€$ (English translation). Netvibes has since claimed to patch "a security vulnerability in webnotes" exploited by this developer. I alluded to some of these issues with stored user information, phishing, and general brand confusion in a post two months ago about the popularity of available widgets regardless of their makers.

    Netvibes sample modules Gmail eBay webnote

    An external developer created a Netvibes module and submitted it for inclusion in the Netvibes Ecosystem module directory. A Netvibes employee examined and approved the submitted module for inclusion in the directory. The remotely-hosted module was then altered by the developer to retrieve stored preferences from other configured modules and store information from other modules loaded in the page such as the contents of a webnote, the user's latest Gmail messages, upcoming appointments and contacts, etc. The developer stored this data in a remote database and later examined his collected findings.

    Each Netvibes module is rendered inline, meshing the markup generated by the module with the rest of the page's content. A module developer is encouraged to access only their own module's content using a special Netvibes variable, but any developer can request other content on the page through standard JavaScript or the Prototype JavaScript framework.

    A developer can choose to store and retrieve small pieces of data through the Netvibes servers such as a ZIP Code, color preference, or the username and password to remote web services. This personal data is stored in the Netvibes database and authenticated using a token stored in a Netvibes.com cookie.

    This external developer was able to access other rendered content on the page, including content stored in other modules such as a user's latest e-mails or text stored in a webnote. A Netvibes employee stored his login credentials for an internal development wiki inside of a text note on his homepage, and the third-party developer was able to read this information and access data stored inside Netvibes development servers. The developer did not access the Netvibes.com storage methods directly, but was able to gain access through the internal database.

    Other web widget homepages place modules inside of an iframe by default, creating a page within a page with restricted access to other content. It's possible to create inline content on services such as Microsoft's Live.com or Google Personalized Homepage but it raises an extra user warning when granting this higher level of access. A widget directory might also store an approved snapshot of the developer's module code on their own controlled domain for quick, dependable access and reliability. Typically you want to separate the widget homepage and the widget storage into separate domains to restrict access to cookies and other information bound by a domain name.
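
    The domain separation described above can be illustrated with a small sketch of cookie scoping. The host names are hypothetical and the matching rule is a simplified version of what browsers apply, not code from Netvibes or any other widget platform.

```python
def cookie_is_sent(request_host: str, cookie_domain: str) -> bool:
    """Simplified check: is a cookie scoped to cookie_domain sent to request_host?"""
    host = request_host.lower().rstrip(".")
    domain = cookie_domain.lower().lstrip(".")
    # Sent on an exact match, or when the host is a subdomain of the cookie's domain.
    return host == domain or host.endswith("." + domain)


# Hypothetical layout: the homepage and its session cookie live on one
# domain, while approved widget snapshots are served from another.
HOMEPAGE_COOKIE_DOMAIN = "homepage.example"

print(cookie_is_sent("www.homepage.example", HOMEPAGE_COOKIE_DOMAIN))      # True
print(cookie_is_sent("widgets.homepage.example", HOMEPAGE_COOKIE_DOMAIN))  # True
print(cookie_is_sent("widgets-cdn.example", HOMEPAGE_COOKIE_DOMAIN))       # False
```

    Hosting widget code on a subdomain of the homepage would still expose the homepage cookie to it, which is why a fully separate domain is the safer choice.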

    Update 2/10: Netvibes will roll out a new widget system over the next month to deal with these types of security issues according to Netvibes lead API developer François Hodierne.

    Tips for installing new widgets and modules

    Gmail widget prompt

    The exposure of this Netvibes user data is a reminder of the tradeoffs between the demand for content and the trustworthiness of the mini-applications we add to our websites and desktops. We might be eager to unblock our PayPal account when we receive a supposed e-mail alert, so eager that we might not even recognize the unfamiliar URL requesting that data with ill will. Widget users (and toolmakers) need to apply similar caution when adding special 200 pixel squares to their homepages and blogs, as they are allowing a new publisher to access both the data on the page and the data the user configures within the module. Web developers should use authentication proxies such as Google AuthSub or Yahoo! BBAuth instead of creating their own input boxes for user credentials on those networks.

    Netvibes module install warning

    The best bet for end-users lies in the widget directories for their platform of choice or on the websites of already trusted brands. A Gmail module produced by Netvibes or Google is likely to be more secure than a third-party module and provide trusted storage of credentials and secure over-the-wire direct access to your remote data.

  7. Feb07

    Yahoo! Pipes remixes the syndicated web

    Yahoo! released Yahoo! Pipes tonight, a visual editing interface for web feed manipulation and reconstruction. The 5-person Pipes team, part of the Yahoo! TechDev incubation group, spent about 5 months developing the product to help people better remix the syndicated content they find online.

    Yahoo! Pipes lets any Yahoo! registered user enter a set of data inputs and filter their results. You might splice a feed of your latest bookmarks on del.icio.us with the latest posts from your blog and your latest photographs posted to Flickr. You might automatically translate your favorite news sources to your native language, or only receive the 1 out of 20 news stories from your local paper that reference your town or local schools. A traditional web feed lets you select your news from a set menu, while tools like Yahoo! Pipes let you build your own dish with only the ingredients you care about.

    Yahoo Pipes sample edit interface

    The editing interface connects pre-configured modules and their options, creating a new feed accessible as RSS, Atom, or JSON. Anyone can share their modules, or clone the work of others to tinker with a few things and enable their own customizations.
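
    The splice-and-filter idea can be sketched in a few lines of Python. This is only an illustration of the concept, not the Pipes engine or its API; the item fields (title, summary, published) and the sample data are assumptions made for the example.

```python
from datetime import datetime


def splice(*feeds):
    """Merge several feeds into one list, newest item first."""
    merged = [item for feed in feeds for item in feed]
    return sorted(merged, key=lambda item: item["published"], reverse=True)


def keep_matching(items, keywords):
    """Keep only the items whose title or summary mentions any keyword."""
    wanted = [word.lower() for word in keywords]
    kept = []
    for item in items:
        text = (item["title"] + " " + item["summary"]).lower()
        if any(word in text for word in wanted):
            kept.append(item)
    return kept


# Hypothetical input feeds, already parsed into dictionaries.
bookmarks = [
    {"title": "Feed parsing notes", "summary": "Bookmark about Atom",
     "published": datetime(2007, 2, 5)},
]
blog = [
    {"title": "School board vote", "summary": "Springfield schools budget",
     "published": datetime(2007, 2, 7)},
    {"title": "Widget security", "summary": "Netvibes modules",
     "published": datetime(2007, 2, 6)},
]

pipe_output = keep_matching(splice(bookmarks, blog), ["springfield", "atom"])
titles = [item["title"] for item in pipe_output]
print(titles)  # ['School board vote', 'Feed parsing notes']
```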

    Yahoo! Pipes opens up some interesting possibilities for feed aggregators, letting users filter out unwanted content affecting their experience. Pipes opens up a few feeds that were not practical for a human to read in the past, either due to a high volume or possibly a foreign language. My favorite operator is the location extractor which analyzes an item's text attempting to identify addresses, locations, or the URLs of popular mapping services.

    Publisher Concerns

    Yahoo! Pipes has implications for web publishers, changing the reliability of delivered content, the relationship with the end user, and the polling frequency of a mashup that may or may not be actively utilized.

    Yahoo! Pipes makes it easy to remove advertising from feeds or otherwise reformat your content. I already know a few publishers who hold back from publishing the full content of their posts for fear of easy resyndication and brand dilution, and if Pipes becomes popular, publishers might hold back a bit further or ban Yahoo! Pipes outright. A Yahoo! Mail user searching for a new feed subscription will likely choose an identical feed labeled "No Ads!!!" associated with their favorite brands.

    One of Yahoo!'s sample pipes, Aggregated News Alerts, uses the Technorati search API and republishes a key issued to an individual user. A site such as Technorati can increase that user's allowed queries per day, but they lose control over the issued unique key and its use.

    The Pipes troubleshooting section lists three ways of blocking the tool from using your feeds: modify your Apache settings to block User-Agent "Yahoo Pipes", add a new element to your feed, or send Yahoo! an e-mail asking them to manually add your URLs to a blocked list and verify your authority to make such a request. The suggested meta element added to your XML creates invalid feed markup and might cause your feed to stop appearing in some strict renderers.
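
    For publishers who serve feeds from application code rather than Apache, the User-Agent option might look like the following sketch. The only detail taken from the troubleshooting notes is the "Yahoo Pipes" token; the function name, the status codes chosen, and the sample agent strings are assumptions.

```python
BLOCKED_AGENT_TOKENS = ("Yahoo Pipes",)


def status_for_feed_request(user_agent: str) -> int:
    """Return 403 for a blocked User-Agent and 200 for everyone else."""
    agent = (user_agent or "").lower()
    if any(token.lower() in agent for token in BLOCKED_AGENT_TOKENS):
        return 403
    return 200


# Hypothetical User-Agent header values.
print(status_for_feed_request("Yahoo Pipes 1.0"))                       # 403
print(status_for_feed_request("Mozilla/5.0 (Windows; U) Firefox/2.0"))  # 200
```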

    Summary

    Overall I really like Yahoo! Pipes, its intuitive interface, and its "View Source" approach to building your own web services. I think a lot of people will build interesting new things using the service, and it ties in nicely with services such as Yahoo! Alerts. It's a pretty solid product from the Advanced Products group, leveraging web feeds as a simple web service.

  8. Jan29

    Boost Ajax performance using local storage

    The migration of popular computing applications to the Web has changed the way we view the web browser. Some of our most frequently used applications now exist within a tab of Firefox or Internet Explorer, constantly polling a remote server on our behalf and presenting the results in a rich interface powered by the latest features of JavaScript and/or the Flash Player plugin. These "live" web applications have pushed the browser to its limits (and sometimes beyond), consuming increasing amounts of memory and network bandwidth as our browser terminal remains connected to the data cloud. Storing data and preferences directly on the user's machine is one way to speed up a web application and even offer some offline capabilities, connecting to data stored on a local hard drive instead of relying on a remote server. In this post I'll walk through some of the ways web application developers take advantage of local storage to speed up applications, persist user preferences, and enable features for "occasionally connected" users.

    Storing web application data within a local cache opens up new possibilities for a future class of web applications by storing and loading user data directly on a user's hard drive. The future of asynchronous JavaScript and XML (Ajax) will extend its reach beyond client-server interactions and into local XML storage addressed from any web page and interpreted by the client web browser. A web application can rely on local storage options when disconnected from the Internet, saving changes locally and synchronizing results whenever an active Internet connection is available.

    Imagine a personal finance site storing your stock portfolio and historical prices locally, creating quick access to charting and planning tools powered by pre-loaded data. Your favorite blogging tool might already use local storage to automatically save drafts of your blog posts, checking for spelling and grammar mistakes based on locally stored individual preferences. A personalized homepage might store your selected widgets and their content locally, quickly loading your information dashboard with or without an available Internet connection.

    Local storage of client preferences and data is nothing new but, like DHTML, is being rediscovered as web applications squeeze as much as they can out of currently deployed browsers and popular plugins. Just like other web technologies such as JavaScript and CSS, support for local data files addressable from a web page varies by browser. JavaScript libraries such as Dojo Storage abstract each storage method into a single JavaScript call with appropriate storage based on available resources (thanks Brad Neuberg!), but it's useful to take a look at the low-level options and their respective limitations.

    Web browser local storage options
    Browser                        Storage                    Max Size
    Any                            Cookies                    4 KB
    Flash Player 6 and above       Flash local Shared Object  100 KB
    Internet Explorer 5 and above  userData                   1 MB
    Firefox 2 and above            DOM Storage                5 MB
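
    The idea of a storage abstraction that picks a backend from whatever the browser offers can be sketched using the limits in the table above. This is not Dojo Storage's API or its actual selection logic; the capability flags and function name are hypothetical.

```python
KB = 1024
MB = 1024 * KB

# (provider name, hypothetical capability flag, max bytes per the table),
# ordered from largest to smallest capacity.
PROVIDERS = [
    ("DOM Storage", "dom_storage", 5 * MB),
    ("userData", "user_data", 1 * MB),
    ("Flash local Shared Object", "flash6", 100 * KB),
    ("Cookies", "cookies", 4 * KB),
]


def choose_provider(capabilities, needed_bytes):
    """Return the first available provider able to hold needed_bytes, else None."""
    for name, flag, limit in PROVIDERS:
        if flag in capabilities and needed_bytes <= limit:
            return name
    return None


print(choose_provider({"cookies", "flash6"}, 50 * KB))    # Flash local Shared Object
print(choose_provider({"cookies", "dom_storage"}, 2 * KB))  # DOM Storage
print(choose_provider({"cookies", "user_data"}, 2 * MB))  # None
```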

    An HTTP cookie persists user data in a single browser across multiple browsing sessions, allowing a website to track items placed in a shopping cart, recognize a logged-on user, save a site preference, and more.

    Cookies are limited to 4 KB of storage per domain and are a good way to persist user data for convenience or tracking. Modern web browsers contain cookie and privacy management features that wipe away stored cookies and their data, so cookies have limited utility for continued persistence. Cookies are sent along with every request on a given domain, adding extra weight to every message exchanged between an end-user's browser and your site, even if the cookie data is only occasionally utilized.
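
    A rough calculation shows how that extra weight adds up. The workload numbers below (requests per page, page views) are assumptions picked for illustration, not measurements.

```python
def cookie_overhead_bytes(cookie_bytes, requests_per_page, page_views):
    """Bytes of cookie data uploaded when every request carries the cookie."""
    return cookie_bytes * requests_per_page * page_views


# Assumed workload: a full 4 KB cookie, 30 same-domain requests per page,
# and 10 page views in a session.
total = cookie_overhead_bytes(4 * 1024, 30, 10)
print(total)                             # 1228800
print(round(total / (1024 * 1024), 2))   # 1.17 (MB)
```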

    Browser cookies are the most common form of persisting data across multiple website visits but their limited size, common deletion, and added weight limit the usefulness of this time-tested storage method.

    Flash local Shared Object

    Websites can take advantage of Adobe's ubiquitous Flash Player to store data as a local shared object or Flash cookie. Flash storage objects are available in Flash Player 6 or above, reaching 96% of web users in mature markets as of September 2006.

    Flash Player can store up to 100 KB per domain without any user interaction. Storage limits can be increased by prompting the user for a larger allocation. Stored data is accessible across the user's Flash Player instances, loading stored data into Internet Explorer, Firefox, and even Flash apps such as Apollo.

    Adobe Flash storage settings

    It's possible to view, delete, and change Flash cookies stored on your computer through the Flash settings manager, but most storage will occur seamlessly behind the scenes without involving the user.

    Flash local shared objects provide a reasonable amount of storage across multiple browsers and applications. The Flash Player plugin requires some additional allocated system resources at runtime for a single function, but you can limit its use to only those pages on your domain requiring a local storage component.

    Internet Explorer userData

    Internet Explorer 5 and above supports data persistence using a userData behavior. Per-document and per-domain storage restrictions vary based on a site's security zone.

    userData storage limits by security zone
    Security Zone  Page storage  Domain storage
    Intranet       512 KB        10 MB
    Internet       128 KB        1 MB
    Restricted     64 KB         640 KB

    An enterprise application has access to up to 10 MB of storage for each internal domain and Internet applications can take advantage of up to 1 MB of storage per domain, or 128 KB on every page view. These XML files reside in the user's settings folder and will not be removed when the user clears out cookies, temporary files, or autocomplete settings in Internet Explorer.
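
    The zone limits in the table can be expressed as a simple pre-flight check before writing data. This is a sketch in Python for illustration only; it does not call the userData behavior itself, and the function name is hypothetical.

```python
KB = 1024
MB = 1024 * KB

# Per-page and per-domain limits by security zone, taken from the table.
USER_DATA_LIMITS = {
    "Intranet": {"page": 512 * KB, "domain": 10 * MB},
    "Internet": {"page": 128 * KB, "domain": 1 * MB},
    "Restricted": {"page": 64 * KB, "domain": 640 * KB},
}


def fits_user_data(zone, page_bytes, domain_bytes_used):
    """Check a page's data against both the page and the domain quota."""
    limits = USER_DATA_LIMITS[zone]
    within_page = page_bytes <= limits["page"]
    within_domain = domain_bytes_used + page_bytes <= limits["domain"]
    return within_page and within_domain


print(fits_user_data("Internet", 100 * KB, 900 * KB))   # True
print(fits_user_data("Internet", 200 * KB, 0))          # False
print(fits_user_data("Intranet", 200 * KB, 0))          # True
print(fits_user_data("Restricted", 64 * KB, 600 * KB))  # False
```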

    Internet Explorer exposes a relatively large local storage component for web applications to query when needed. It's especially useful in corporate environments, creating up to 10 MB of fast-access data for each user.

    Firefox DOM Storage

    Firefox 2 supports local storage based on the WHATWG DOM storage method, simply referenced as DOM Storage in the Mozilla context.

    Current versions of Firefox 2 allow unlimited storage through the DOM Storage feature but future Firefox releases (post-2.0.0.1) will restrict usage to 5 MB per-domain. A website can access not only data within its own subdomain or domain, but within a given top-level domain (.gov, .com, etc.) or any requesting page, creating some interesting opportunities for shared data namespaces.
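
    The scoping rules described above can be modeled as the list of storage namespaces visible to a given host. This is only a model of the behavior written in Python, not the browser API; the host name is hypothetical and the empty string is used here to stand for the namespace shared by every page.

```python
def accessible_storage_domains(host):
    """List the storage namespaces a page on host could address, most specific first."""
    labels = host.lower().strip(".").split(".")
    # Every suffix of the host name: subdomain, domain, then top-level domain.
    scopes = [".".join(labels[i:]) for i in range(len(labels))]
    # The empty string models the namespace open to any requesting page.
    return scopes + [""]


print(accessible_storage_domains("mail.example.com"))
# ['mail.example.com', 'example.com', 'com', '']
```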

    DOM Storage can queue an alert event when a browser connects to or disconnects from the network, prompting a data sync once a user's local changes are able to talk to a remote server.
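
    The save-locally-then-sync pattern can be sketched as a small queue that holds changes while offline and flushes them when a connection event arrives. The class and method names are hypothetical, and a Python list stands in for the remote server.

```python
class OfflineQueue:
    """Hold changes locally while offline and send them once back online."""

    def __init__(self, send):
        self._send = send    # callable that pushes one change to the server
        self._pending = []   # changes saved locally while disconnected
        self.online = False

    def save(self, change):
        if self.online:
            self._send(change)
        else:
            self._pending.append(change)

    def on_online(self):
        """Handle a 'connected' event: flush pending changes in order."""
        self.online = True
        while self._pending:
            self._send(self._pending.pop(0))

    def on_offline(self):
        """Handle a 'disconnected' event: start queueing again."""
        self.online = False


sent = []  # stands in for the remote server
queue = OfflineQueue(sent.append)
queue.save("draft 1")
queue.save("draft 2")
print(sent)  # []
queue.on_online()
print(sent)  # ['draft 1', 'draft 2']
```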

    The standardization process behind WHATWG DOM Storage for web applications holds promise for future implementations of browser-based storage from other working group members such as WebKit/Safari and even Google. These storage methods are very new and I expect many implementation details will become solidified in the Firefox 3 development process.

    Conclusion

    Client-side storage addressable from any web page has the potential to change the way we build web pages and the division of labor between client and server. Just as CSS and JavaScript created new ways to style and interact with a page, the client-side storage capabilities of modern browsers will create a new concept of a web application runtime. It's yet another step in the progression of web applications trying to create the best possible experience using the latest widely deployed web browsers and browser plugins.

    Web applications using these latest technologies can deploy an upgrade on-the-fly, initializing a new set of libraries and web page templates after examining a user's browser and bandwidth for compatibility. Web applications such as Google Calendar might store your appointments locally, exposing this data to Google Maps or other mapping applications to plan the route to your next appointment without submitting a new server request for the same data. Your webmail will be downloaded locally, quickly loaded even if you are on a plane.

    I'm excited to see more applications start to use client-side storage available in modern browsers such as Internet Explorer, Firefox, and the Flash plugin. I'll happily give up the space of an MP3 file in exchange for a better experience in my favorite web application. I think we'll hear a lot more about client-side storage for web applications in the coming year.

  9. Jan23

    Google and Microsoft gadget developer setup compared

    Modern web APIs embrace the self-publishing tinkerer, making integration an easy step for a variety of web publishers. A few lines of HTML and a quick copy and paste of some JavaScript might be all a publisher needs to add new functionality to their site or roll out a completely new feature. I think the most successful developer programs will offer resources for the tinkerers as well as the developers, extending their reach and developer base beyond those with a knowledge of post versus get.

    Google Gadget scratch pad

    The Google Gadgets getting started guide walks would-be gadget makers through the process of creating a gadget, introducing a few choices to consider depending on their implementation. At the bottom of the page Google includes a "scratch pad," letting visitors tweak a few existing gadgets and preview the results in a separate tab view. This view source development process is a proven learning tool letting people experiment with a familiar (and functioning) page, tinkering and changing a few pieces to see what happens.

    Contrast Google's process with Microsoft's competing Windows Live Gadget SDK.

    Windows Live Gadget first steps

    Microsoft greets potential new gadget developers with a set of web server install tips, caching configurations in Internet Explorer, and setting up Visual Studio. On first read my reaction is "I don't have that" and I walk away.

    Microsoft currently lists 477 total web gadgets in its directory. Google does not display a total number but I was able to page through over 1000 homepage modules in the Google directory. Google appears to have a big lead in gadget implementations and it's easy to see why.

    Summary

    Identifying your audience and knocking down any barriers to entry should help accelerate any developer network. I believe the API implementer can be an eager amateur as well as an experienced developer, and companies trying to extend their reach should embrace both the tweakers and the coders.

    How do you sell your API? Help your visitors visualize the end result as they attempt to gauge the amount of work and expertise needed to implement. The first step is letting the potential customer try customizing your product and visualize their own use before they personally dive in deeper or hire an integrator. Keep it simple.

  10. Dec19

    del.icio.us API for URL top tags, bookmark count

    Social bookmarking site del.icio.us has exposed a new API providing the top tags and total number of bookmarks for any URL in its system. Yahoo's Developer Network provided a short preview earlier tonight of a soon-to-be-released del.icio.us web badge, but currently anyone can request data from the open API. It's a useful feature to provide additional context for a URL, suggest tags, or measure one aspect of a site's popularity.

    Endpoint: http://badges.del.icio.us/feeds/json/url/blogbadge
    Parameter: hash

    Simply submit a request to the above API endpoint with a hex MD5 hash of the URL of interest as your hash parameter value. Del.icio.us returns results in JSON key-value pairs. Data includes the total number of del.icio.us users who have tagged the given URL and the top 11 tags (and tag count) used to describe its content.

    You can check out a few examples such as the response for del.icio.us, the response for apple.com, or the response for niallkennedy.com/blog. If you need help constructing an MD5 hash you can use Paul Johnson's implementation (del.icio.us uses the same script). You may specify a callback function using the callback parameter.
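
    Building a request for the endpoint described above takes only a few lines. The sketch below constructs the URL without fetching it; the endpoint, hash parameter, and callback parameter come from this post, while the function name, the sample URL form (with trailing slash), and the showTags callback name are assumptions.

```python
import hashlib
from typing import Optional
from urllib.parse import urlencode

ENDPOINT = "http://badges.del.icio.us/feeds/json/url/blogbadge"


def badge_request_url(page_url: str, callback: Optional[str] = None) -> str:
    """Build the request URL: hex MD5 of the page URL plus an optional callback."""
    digest = hashlib.md5(page_url.encode("utf-8")).hexdigest()
    params = {"hash": digest}
    if callback:
        params["callback"] = callback
    return ENDPOINT + "?" + urlencode(params)


# "showTags" is a hypothetical JavaScript function name on the calling page.
print(badge_request_url("http://www.apple.com/", callback="showTags"))
```

    The exact string that gets hashed matters: a URL with and without a trailing slash produces two different hashes, so it is worth trying the form del.icio.us users are most likely to have bookmarked.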

    The API is officially unreleased, may be shut down if not used in full Yahoo-constructed blog sidebar badge form, and may be subject to further terms of service. Hopefully the new set of del.icio.us servers can keep up with demand.

    Update: Del.icio.us officially announced Tagometer badges as well as a JSON feed of URL data about 16 hours after this post was published.

Niall Kennedy is a web technologist in San Francisco, California in the United States. I am very interested in the world of...
