October 2006 Archives

  1. Oct31

    Google collaborative appliance on the way?

    Google Mini

    Google's moves into application bundles and collaboration software are setting it up for a bigger enterprise play, taking on Microsoft in an area that consistently feeds their R&D. Maybe you read about the JotSpot acquisition this morning on the Google enterprise blog.

    We look forward to putting those wikis to work.

    Google currently searches the enterprise through its search appliance, a brightly colored box you place in your rack and configure to crawl behind the firewall. Just one application on this box seems like a waste of space and could perhaps open up some more applications for small to medium sized businesses.

    Google Apps for Your Domain currently bundles Gmail, Talk, Calendar, and Page Creator for a specific group of users, all hosted on Google's server. Stick it in the box.

    I expect the new version of Blogger, Google Docs, and JotBox will eventually be integrated into the Google appliance. Google Groups to manage your work groups, Google Desktop to add to the local index what the appliance might have missed, and a special version of Toolbar to tie it all together.

    A quick browse through Google job openings leads me to an Engineering Manager/Director for CRM although that could just be internal.

    Just a theory on the progression of Google Apps bundles but it looks like the pieces are falling in place. Failure to compete in this area means Microsoft maintains a large percentage of the enterprise search business through Sharepoint and other offerings, funding more research and blocking Google from being the preferred search provider of enterprise knowledge.

  2. Oct27

    Bookmarking and social sharing trends

    The ability to save a URL has been around since Mosaic 0.2 but is currently experiencing a transformation as we learn more about the pages and content behind the pointers and share our findings with others through social networks. Hotlists, bookmarks, and favorites are changing and this month's SF Tech Sessions next Monday will take a look at a few new companies changing the way we think about sharing bookmarks.

    The inspiration for this month's SF Tech Sessions came out of a conversation with Jeff Weiner and Joshua Schachter of Yahoo! earlier this month. We talked about different ways people share data on del.icio.us, Yahoo! My Web, and Yahoo! Shopping as well as within smaller communities of interest.

    The bookmarking space continues to change, driven by changes in desktop software as well as modern web usage such as bookmarklets, extensions, and social sharing, but it's clear we're just getting started. Let's take a look at current methods of bookmarking a web page, and how individuals choose to share personal and social browsing behavior.

    Local bookmarks

    NCSA Mosaic Advanced Hotlist Manager

    Local bookmarks are stored in our web browser profiles and are often used the same way we might dog-ear a book. Local bookmarks can list a frequently visited site, an article you want to be sure to revisit later, or a decision in progress such as choosing a vacation or shopping for a new couch.

    You might bookmark the news page of your son's school to stay up-to-date on snow closures, events, and other relevant news. Local bookmarks might be relevant only to you, enabling shortcuts for frequent activities.

    Further reading: Internet Explorer Favorites, Mozilla Firefox.

    Live bookmarks

    Some bookmarks contain trackable updated content, expressed as a web feed, calendar data, or simply a file modification. It's possible to subscribe to a web page, displaying updated content within the bookmark listings or simply noting the page has changed in some way.

    Mozilla Firefox Live bookmark

    A live bookmark lets users quickly glance over changing data, and track the updates of many sites at once. In this case a bookmark is more like a subscription, creating a shortcut for visiting a page, identifying new content, and then visiting the location of the new content.

    Bookmark clusters

    Adding a bookmark used to mean saving the location of the current browser window. Today's modern browsers consist of multiple tabs, placing multiple web pages inside each window. These tabs might be organized collections, storing items you would like to recall on a regular basis or save as a collection.

    Mozilla Firefox tabbed interface

    Browser tabs form natural groupings and an easily saved state. I expect we'll see more bookmark collections in the future as tabs become common browsing tools in Internet Explorer, Firefox, and Opera. A user can save a group of bookmarks for a topic such as trip planning, home improvement, or baby names.

    Synchronized bookmarks

    Bookmarks were one of the first pieces of local data to travel into the cloud, offering synchronization across multiple computers or web access when you are on the go. Synchronization may occur through a browser toolbar or plugin, operating behind-the-scenes while connecting to a backend such as Yahoo! or Google.

    Yahoo! Toolbar bookmarks

    Google and Yahoo! account for more than 95% of toolbar searches in the U.S. and I expect many of those users automatically sync their bookmark data.

    Bookmarking in public

    Sites such as del.icio.us or Furl allow you to sync and share your bookmarks, exposing your web pages of interest to other site members or the entire Web. Your descriptive behavior may change as you add a title, description, and tag for your own use, for discoverability by others, or both.

    del.icio.us add bookmark

    The integration of social bookmarking content in blog sidebars, spliced feeds, and site browsing has made bookmarking a substitute for a full blog post and commentary. Private bookmarks are a fairly recent addition to del.icio.us, showing the default nature of the site's users.

    Bookmarking for another individual

    Del.icio.us users can share a bookmark with a specific person, placing the pointer within the target person's bookmark stream. This bookmarking behavior is a virtual tap on the shoulder, suggesting new content of potential interest.

    A person's link behavior might be tied into a user account network, tracking the bookmarks of a group of people at once, and suggesting those same people as possible share points.

    Bookmarking for an affinity group

    Groups form in online communities, joining together people interested in squared circles, social networking or web design. Submitting links to a group creates a shared resource with a defined audience interest. Your work is archived, allowing new members to discover the group's past activity.

    ma.gnolia Identity 2.0 group

    Bookmark groups may also launch further conversation, either in real-time through a chat or through comments on the original submission. Adding a link to a new article in a trade publication might spark some debate, or a link to a corporate document might initiate further analysis.

    Try it: Ma.gnolia, Mugshot

    Shared collections

    Shared bookmark collections are a useful way of sharing research and soliciting input from others on multiple resources. Users can share their own personal resources such as the best coffee in San Francisco, waterfall hikes in Oregon, or the hottest prom dresses of the season.

    Kaboodle bridal collection

    Once a collection is shared it might be edited or commented upon by a group, enabling the wisdom of the crowd. Shared collections are an opportunity for revenue sharing, rewarding the recommendations and expert opinions of others while completing a purchase or displaying an advertisement.

    Check out: Amazon Guides, Kaboodle.

    Additional data collection and display

    A bookmarked URL can contain more information than just a URL text string. You can identify a bookmarked resource as an image, audio, or video and display the full content or a preview within your application. You can also recognize content from known structured sources such as Amazon, Flickr, and YouTube, pulling in additional data about the linked resource.

    Amazon product information

    Recognizing an Amazon URL and the ASIN within, a service could gather price, availability, product images, reviews, and more from the web page HTML or through available APIs. A Flickr or YouTube URL could be similarly recognized and additional data gathered and URL normalized based on the service's proprietary identifier and URL structures. I expect more social bookmarking services will build these specialized data displays as they seek to grow vertically and make their pages a bit less boring.
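
    Here is a rough sketch of the URL recognition step for Amazon links. The path shapes in the pattern (`/dp/`, `/gp/product/`, `/exec/obidos/ASIN/`) and the normalized URL form are my assumptions about common Amazon product URLs, and the function names are made up for illustration. A real service would likely confirm the identifier through an API lookup.

```python
import re
from urllib.parse import urlparse

# Assumed product path shapes; Amazon may use others not covered here.
ASIN_PATTERN = re.compile(r"/(?:dp|gp/product|exec/obidos/ASIN)/([A-Z0-9]{10})(?:/|$)")


def extract_asin(url):
    """Return the 10-character ASIN found in an Amazon product URL, or None."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    if not (host == "amazon.com" or host.endswith(".amazon.com")):
        return None
    match = ASIN_PATTERN.search(parsed.path)
    return match.group(1) if match else None


def normalize_amazon_url(url):
    """Collapse different URL shapes for one product into a single form (assumed)."""
    asin = extract_asin(url)
    return f"http://www.amazon.com/dp/{asin}" if asin else None
```

    Two bookmarks pointing at the same product through different paths normalize to the same string, so a bookmarking service could count them as one resource before fetching price or availability.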

    Frequently visited non bookmarks

    You might frequently visit a site by typing some keywords into a search engine and clicking on the top result. I sometimes conduct the same search for a resource multiple times a month, visiting the top result. I consider these actions a type of soft bookmark. It's easier to initiate a search than save it, but my repeat visits are useful information to the search engine as it tries to shape personal search and social search preferences.

    Amazon product information

    The example search above shows a search for "NSI whois" on Google, my way of calling up Whois data for a domain when I want the data within a few seconds.

    Bookmarks are searchable locally and inside of an online service, contributing strong signals about user preferences to the search process. A search for "digital camera" becomes more useful when you are reminded about previously bookmarked cameras. Search can serve as a recall for yourself and a filter for yourself and others, creating better results out of the millions of possible matches to your query. Your friend's guide to waterfall hikes is more valuable to you than a random publisher, and search engines with bookmarking abilities will continue to integrate your saved items, visited results, and more into your personalized search results.

    Summary

    There are many different approaches to bookmarking and recent changes in web browsers, add-ons, and a web of participation will continue to fuel growth in the sector. There's still a lot of work to be done in terms of search and service integration and creating compelling reasons to generate useful content, connecting users with the information they care about.

    If you made it this far and you live in the San Francisco Bay area you might want to check out SF Tech Sessions next Monday, October 30, from 7-9 p.m. at CNET to learn more from the people behind current social bookmarking products.

  3. Oct26

    Google Alerts for blog content

    Google Alerts now supports blog search content. If you subscribe to a Google News alert for your brand or topic of interest you can now receive the same style alerts for content in the Google Blog Search index.

    Google alerts configuration

    Google Alerts tracks news, blogs, Usenet groups, and Google Groups discussions using the same search syntax found on their respective websites. The new blog search e-mail notification will be an easy extension of vertical search for existing users. Advanced users can set up an advanced search, or choose to receive general updates via web feeds and critical updates via e-mail. I expect Google will add Google Talk integration soon, delivering alerts over IM.

    Disclosure: I own a small piece of Technorati, a competitor to Google Blog Search.

  4. Oct25

    Planning a small conference

    I like small, focused events especially in the early days of an industry. Om and I organized the Widgets Live! conference to bring together the major players creating widgets, gadgets, and modules and the major endpoints of deployment. Many decisions were made along the way, and I'll share just a few in this post.

    When?

    We knew lots of people from around the web industry would be in San Francisco for O'Reilly Media's Web 2.0 conference November 7-9. Scheduling the widgets conference adjacent to the Web 2.0 conference creates a convenient opportunity for a few people to extend their visit to San Francisco from France, Ireland, Japan, Seattle, and many places in between.

    Hosting a conference the day before the Web 2.0 conference allows smaller companies and products to launch at our widgets-specific conference before the downpour of press releases and announcements that usually happen at the bigger conferences.

    Rough schedule

    What products and companies do we want to be sure are represented at the conference? Om and I listed some of the different widget sectors (desktop, social network, homepage, etc) and the key players we would like to see represented from each sector.

    How many attendees?

    The speakers and their coworkers created a base attendee level, and we knew more people would like to hear them speak and meet the other people in the room. But how many? We guessed there would be a total attendance of between 150 and 200 people. The total number of attendees limits available venues and rooms available to host the group so I would rather sell out the available space and have a more intimate setting than restrict available rooms to something like a big ballroom.

    Room configuration

    Conferences are typically set up with either a theater or classroom seating arrangement. A theater arrangement consists of rows of chairs facing the stage. A classroom arrangement adds tables to each row, allowing attendees to place a laptop or notepad on a flat surface. I like the classroom setup a lot better, and restricted venue searches to rooms that can hold 150-200 people in this configuration with either 18" or 30" tables for laptops and notepads.

    I hate crappy WiFi

    There's nothing like crappy WiFi to ruin an otherwise good conference. Working WiFi allows attendees to stay connected with their office colleagues, write e-mails, post to a blog, and connect with other attendees. The widgets conference needed to have working WiFi access for all the laptop-toting attendees, but would networks at possible event venues be able to handle the load? Could I trust the sales person who assures me they can?

    Clay-Jones and Sutro towers

    Fixed wireless access might help solve the issue, beaming microwaves of bandwidth from the hills of San Francisco. If the event venue has line-of-sight to a point of presence there may be hope. Two big towers in San Francisco are Sutro Tower on Twin Peaks and the Clay-Jones building on Nob Hill.

    Picking the venue

    We ended up choosing a locally owned and operated non-profit as our event venue. The Marines' Memorial Club provides discounted accommodations to visiting military personnel and recently renovated their event space. I like the high ceilings, vintage look, and a small venue for a small, focused conference.

    The event rooms on the 10th floor also happen to have a great view of the Clay-Jones building just a few blocks away on the top of Nob Hill. TowerStream happens to have a point of presence on Nob Hill, boosting available bandwidth at least 10-fold.

    Separating speakers and sponsors

    I wanted to select the best possible speakers regardless of their company's sponsorship role. I handle speaker selection for the conference and Om handles sponsors. This separation of duties is an attempt to balance the best possible attendee experience with a good level of sponsor participation.

    We rented an additional room to provide exhibit space for sponsors and allow attendees to interact with the products and staff talked about at the conference. Maybe you've never seen Windows Vista or a tricked out MySpace page, or you want a hands-on experience with some widget hardware. Setting up a physical space of focused interaction creates a better experience for both sponsors and attendees.

    I think sponsors get a better value at a focused event, setting more focused objectives and getting their name and product in front of the appropriate community. I can reevaluate the perceived ROI in about two weeks.

    Summary

    Conferences are a lot of work, but I hope to see more small events in the future. There are definitely expediencies learned with experience and I'm open to sharing implementation details with anyone thinking of doing their own event.

  5. Oct23

    Conference industry basics

    I've been busy over the past few weeks organizing the Widgets Live! conference. I've talked to lots of people interested in various aspects of the conference industry, so I'll summarize a few logistics in this post.

    Venue costs

    Event venues typically charge a room rental fee combined with a minimum catering expense. All prices quoted are usually a "list price" and negotiable depending on factors such as the length of the conference, number of rooms booked at the hotel, your total catering spend, and your repeat business if you host multiple conferences a year. Many venues will waive the room rental fees based on a catering minimum, so be sure to ask.

    Catering

    hotel catering sample listing

    Catering fees in no way match what you might expect to pay at a sandwich shop or local restaurant. In my experience looking at San Francisco hotel catering menus a continental breakfast consisting of coffee, orange juice, and muffins might cost around $25 a person. You'll have to add a service charge (typically around 20%) and sales tax to quoted prices. The sample listing in the picture amounts to $13 for each cup of coffee and about $80 for a dozen petite donuts.

    You can get a gallon of Starbucks coffee, about 12 cups, for about $13 and a dozen Krispy Kreme donuts for about $6, but you're paying for the atmosphere and the service environment during your important event.

    Audio-visual

    You'll need microphones, a mixer, speakers, a projector, and a projection surface for the event. Many venues partner with an outside audio-visual consultant and you can rent equipment and perhaps even hire a technician to make sure all the equipment runs smoothly during the event.

    Typical projectors available are either SVGA (800 × 600) or XGA (1024 × 768). More lumens means a brighter picture, which can make a big difference in a room with a lot of sunlight.

    WiFi

    Typically WiFi is provided as an a la carte item for conference organizers. You will most likely have access to a T1, which in theory could handle a symmetric 1.536 Mbit/s. I've typically seen T1 data access listed as "up to 50 users" by venue sales staff.
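
    To put the "up to 50 users" claim in perspective, here is the back-of-the-envelope division. It assumes every user is active at the same time and the line is shared evenly, which overstates contention for a typical crowd but is not far off for a room full of laptops.

```python
T1_KBITS_PER_SECOND = 1536  # the 1.536 Mbit/s T1 figure expressed in kbit/s


def per_user_kbits(total_kbits, active_users):
    """Evenly split the line among simultaneously active users (an assumption)."""
    return total_kbits / active_users


quoted = per_user_kbits(T1_KBITS_PER_SECOND, 50)      # the venue's "up to 50 users"
conference = per_user_kbits(T1_KBITS_PER_SECOND, 150)  # a 150-person tech crowd

print(f"50 users:  {quoted:.2f} kbit/s each")
print(f"150 users: {conference:.2f} kbit/s each")
```

    At 50 users each person gets roughly 31 kbit/s, less than a dial-up modem, and at 150 users about 10 kbit/s.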

    A hotel might host a breakfast meeting for the local investors club, a wedding, and occasionally a technology conference. The network is typically not set up to handle the thrashing of a tech conference crowd.

    You can boost your bandwidth through the hotel if they are set up for extra capacity, or you could drop in a fixed point microwave connection if you have line-of-sight to a fixed wireless provider.

    Summary

    The first steps for a successful event are securing a good date, location, venue, room setup, and nourishment of the food, bandwidth, and power varieties. I'll address some of the decisions we made for the Widgets Live! conference in a separate post.

  6. Oct17

    New Googlebot controls for webmasters

    Google has added new features to its tools for webmasters, allowing us to request Google index our site faster and more thoroughly than before. Crank it up!

    Control Googlebot crawl rate

    Google Webmaster tools crawl rate request

    Webmasters can now ask Googlebot to slow down or speed up its crawl of their site for the next 90 days. Your choice may affect your total bandwidth usage but the tradeoff is possibly more frequent visits from Google's discovery and indexing tools.

    Enhanced image search

    Google Webmaster tools enhanced image search preference

    Webmasters can now opt in to Google enhanced image search. If you opt in Google may use tools such as Google Image Labeler to associate your images with crowdsourced keywords.

    Summary

    I'm a big fan of Google Webmaster Tools to maintain a better relationship with what can be my biggest source of referrals. The new tweaks from Google allow Google to learn more about my site and possibly send a few more referrals my way using current and experimental methods. Not everyone uses Google's Webmaster Tools, so I feel like I have a bit of an edge, however artificial that hope may be.

  7. Oct17

    The current state of video search

    When I lived in L.A. it seemed like everyone wanted to be a movie star. The Starbucks barista waiting to be discovered as he pronounced "Frappuccino," friends scheming to be placed on a reality show and win a trip to a tropical island, and the many writers trying to get their latest script into the hands of Steven Spielberg. The recent boom in online video and its associated capture hardware has created a new class of stars. The next American Idol might submit a cover song to YouTube and videos of a child's first steps are uploaded to the Web for the world to see. How can search engines discover these new sources of video, extract relevant information, and successfully handle user queries? In this post I will take a look at the present state of video search, how machines make sense out of movies, and take a peek inside the state of the art.

    Eric Rice Show camera

    A multiplexed video file contains a set of sequenced pictures with an accompanying audio track. Digital video often adds a header describing the recorded work to assist in playback and location. Due to a video's composition many technologies from image search and audio search still apply, but with a few optimizations to take advantage of a larger amount of correlated data.

    File identification

    A general search engine contains links from all over the Web, including links to video files. A specialized video index may be formed by combing through a link index looking for links adhering to known file extensions:

    avi
    Audio Video Interleave, an older format popular on Windows machines.
    mov
    qt
    QuickTime container, popular on Apple computers.
    mp4
    m4v
    MPEG-4 Part 14 files. M4V is a popular extension used by Apple's iTunes.
    wmv
    Windows Media video.
    asf
    Streaming video using Microsoft technologies. The Advanced Streaming Format is not exclusive to video as it may contain streaming audio.
    flv
    Adobe Flash videos.
    divx
    DivX Media Format
    3gp
    3g2
    3G mobile phone format
    rm
    RealVideo format by RealNetworks.
    mpg
    mpeg
    MPEG-1 or MPEG-2 video file.
    ogm
    Theora video format.

    Windows Live Search allows users to restrict searches to pages containing links to files containing one or more file extensions, such as a search for mov, wmv, or m4v files on pages mentioning "dance." Bookmarking site del.icio.us uses this method to identify video bookmarked by its users.
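
    The extension matching described above can be sketched in a few lines. The mapping mirrors the list in this post, and the function names are hypothetical. Treat it as a first-pass filter: an extension is only a hint about what is actually on the other end of the link.

```python
import posixpath
from urllib.parse import urlparse

# Extension-to-format map taken from the list in this post.
VIDEO_EXTENSIONS = {
    "avi": "Audio Video Interleave",
    "mov": "QuickTime container",
    "qt": "QuickTime container",
    "mp4": "MPEG-4 Part 14",
    "m4v": "MPEG-4 Part 14",
    "wmv": "Windows Media video",
    "asf": "Advanced Streaming Format",
    "flv": "Adobe Flash video",
    "divx": "DivX Media Format",
    "3gp": "3G mobile phone format",
    "3g2": "3G mobile phone format",
    "rm": "RealVideo",
    "mpg": "MPEG-1 or MPEG-2",
    "mpeg": "MPEG-1 or MPEG-2",
    "ogm": "Theora video",
}


def video_format(url):
    """Return the format name for a URL with a known video extension, or None."""
    path = urlparse(url).path  # ignore query strings and fragments
    extension = posixpath.splitext(path)[1].lstrip(".").lower()
    return VIDEO_EXTENSIONS.get(extension)


def video_links(urls):
    """Filter a link index down to (url, format) pairs that look like video."""
    matches = []
    for url in urls:
        fmt = video_format(url)
        if fmt is not None:
            matches.append((url, fmt))
    return matches
```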

    HTML markup

    Videos found in the wild are often described and referenced within HTML pages. Here's an example of how a video file might be described within a web page link:

    
    <a href="firststeps.mov"
     type="video/quicktime"
     hreflang="en-us"
     title="A longer description of the target video">
     A short description</a>
    

    The href attribute points to the location of the video file. The video/quicktime type value provides a hint for user agents about the type of file on the other end of the link. The hreflang attribute communicates the base language of the linked content. The title attribute provides more information about the linked resource, and may be displayed as a tooltip in some browsers. The element value, "A short description," is the linked text on the page.
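
    Pulling these attributes out of a page takes little code. Below is a minimal sketch using Python's standard-library HTML parser. The class and function names are my own, and a real crawler would need to cope with far messier markup than this handles.

```python
from html.parser import HTMLParser


class VideoLinkParser(HTMLParser):
    """Collect href, type, hreflang, title, and link text for each <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current = None  # the link being read, if we are inside an <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attributes = dict(attrs)
            self._current = {
                "href": attributes.get("href"),
                "type": attributes.get("type"),
                "hreflang": attributes.get("hreflang"),
                "title": attributes.get("title"),
                "text": "",
            }

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            # Collapse whitespace left over from line breaks in the markup.
            self._current["text"] = " ".join(self._current["text"].split())
            self.links.append(self._current)
            self._current = None


def parse_links(html):
    parser = VideoLinkParser()
    parser.feed(html)
    parser.close()
    return parser.links
```

    Feeding the example link through `parse_links` yields one entry with all four attributes plus the link text. Most links in the wild would come back with `None` for everything except `href`.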

    It's not very likely publishers will produce more data than the functional effort of href. Title is a semi-visible attribute and therefore more likely to be included in the description, but still uncommon. It's possible to identify video by a given MIME type such as video/quicktime but few sites provide the advisory hint of type in their HTML markup. Collecting a file's MIME type requires "touching" the remote file, and will most likely return default values of popular hosting applications such as Apache or IIS, so a search engine is likely better off relying on a local list of mapped extensions and helper application behaviors.

    Embeds

    Some videos are embedded in the page, complete with plugin handler descriptions that allow a webpage viewer to play back the video file directly from its page context. This content may take the form of an object or an embed in the page markup. The old-style embed element seems to be preferred by the autogenerated HTML of popular video sites, presumably for backwards compatibility with more web browsers. Embedded content often specifies a preferred handler plugin and possibly a "movie" parameter, but it's difficult to tell from the markup if the referenced file is a video.

    A search engine may apply special handling to embeds from well known video hosts to gather link data for resource discovery and ranking. A YouTube video embed references the same identifier used to construct the URL of the full web page, and could be counted towards that page's total citations.
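
    A sketch of that special handling follows. It assumes YouTube embeds point at `http://www.youtube.com/v/<id>` and full pages live at `http://www.youtube.com/watch?v=<id>`. Both URL shapes are my reading of the site's current markup and could change, and the function names are made up for the example.

```python
import re
from collections import Counter

# Assumed embed URL shape: http://www.youtube.com/v/<id>, optionally followed by parameters.
EMBED_PATTERN = re.compile(r"^https?://(?:www\.)?youtube\.com/v/([A-Za-z0-9_-]+)")


def canonical_page(embed_src):
    """Map an embed source URL to the watch page it corresponds to, or None."""
    match = EMBED_PATTERN.match(embed_src)
    if match is None:
        return None
    return f"http://www.youtube.com/watch?v={match.group(1)}"


def count_citations(embed_sources):
    """Credit each recognized embed to its canonical watch page."""
    counts = Counter()
    for src in embed_sources:
        page = canonical_page(src)
        if page is not None:
            counts[page] += 1
    return counts
```

    Embeds found on three different blogs then add up to three citations for one page, instead of three unrelated references to a Flash movie.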

    Syndication formats

    It is possible for a publisher to provide more information about a video item and its alternate formats using a syndication format namespace extension such as Yahoo! Media RSS. Details such as bitrate, framerate, audio channels, rating, thumbnail, total duration, and even acting credits can be applied to information about the remote resource without actually "touching" the file. This method is currently used by large publishers such as CNN to provide Yahoo! with constant updates for its sites.
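
    Here is a small sketch of reading those details from a feed without fetching the video itself. The sample feed is invented for illustration, and only a handful of the available Media RSS attributes are read. The namespace URI and attribute names are, as far as I know, the ones the Media RSS extension defines.

```python
import xml.etree.ElementTree as ET

MEDIA_NS = {"media": "http://search.yahoo.com/mrss/"}

# A made-up feed with a single item, for illustration only.
SAMPLE_FEED = """<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <title>Example video feed</title>
    <item>
      <title>First steps</title>
      <media:content url="http://example.com/firststeps.mov"
                     type="video/quicktime" bitrate="500"
                     framerate="25" duration="42" />
      <media:thumbnail url="http://example.com/firststeps.jpg" />
    </item>
  </channel>
</rss>"""


def media_items(feed_xml):
    """Return a list of dicts describing each item that carries media:content."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        content = item.find("media:content", MEDIA_NS)
        if content is None:
            continue
        thumbnail = item.find("media:thumbnail", MEDIA_NS)
        items.append({
            "title": item.findtext("title"),
            "url": content.get("url"),
            "type": content.get("type"),
            "duration": int(content.get("duration", 0)),  # seconds
            "thumbnail": thumbnail.get("url") if thumbnail is not None else None,
        })
    return items
```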

    Producers of Quicktime, MPEG-4, or H.264 video may provide more information about their content using Apple's podcasting namespace. Extra information such as subtitle, total duration, rating, thumbnail, and keywords may be associated with video content using this namespace. This data is displayed in the iTunes Store and by other compatible applications.

    Video metadata

    MPEG4 container
    MPEG-4 video (drawing by Apple)

    Video files are packaged in specialized containers containing header data and video content encoded in what could be multiple different codecs per container type. The drawing above is an example of the multiple components of MPEG-4 from descriptive elements to the audio and video tracks, and the synchronization to bring it all together. Information such as title, description, author and copyright are common and similar to an MP3's ID3 information. Additional data such as encoding format, frame rate, duration, height and width, and language may be included.

    Descriptors such as MPEG-7 can be applied to the entire file, or applied to just the audio or video track. A publisher may also describe a sub-section of a video with more information, such as a nightly news report containing descriptors for each individual segment.

    The Library of Congress maintains a directory on video formats aimed at preserving digital moving images and their descriptions throughout time. It's an interesting browse if you're into that sort of thing.

    Subtitles

    A video may contain timed text, otherwise known as subtitles. This information can be described using 3GPP Timed Text, for hearing-impaired viewers, language translation, karaoke, or many other uses. Search engines may use this data to easily gather more information about the track.
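
    Once the timed text has been decoded, indexing it is straightforward. The sketch below does not parse 3GPP Timed Text. It assumes the cues have already been extracted into simple (start time in seconds, text) pairs, a structure I made up for the example.

```python
import re
from collections import defaultdict


def index_cues(cues):
    """Map each word to the start times of the cues that contain it."""
    index = defaultdict(list)
    for start, text in cues:
        # Use a set so a word repeated within one cue is recorded once.
        for word in set(re.findall(r"[a-z0-9']+", text.lower())):
            index[word].append(start)
    return dict(index)


cues = [
    (0.0, "Welcome to the show"),
    (4.5, "The show starts now"),
]
index = index_cues(cues)
```

    A query for "show" can then return both the video and the offsets where the word appears, letting a player jump straight to the relevant moment.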

    Hosted video

    File size and bandwidth constraints of individual web hosts make specialized video hosting an attractive (and often free) option. Google will host your video files on Google Video or YouTube, Yahoo! hosts video at Yahoo! Video, and Microsoft has MSN Soapbox. Hosted video standardizes video formats for easy playback, extracts metadata at the time of upload, and collects ranking data such as popularity and derivative works through its user communities.

    Video hosting handles many of the current limitations of video sharing. Encoding is normalized and optimized with little noticeable difference to the casual user. Flash Video is a common hosted playback method thanks to the ubiquity of Adobe's Flash player, but hosts will use higher-quality video where appropriate such as Windows Media Video on MSN Soapbox or DivX on Stage6.

    A hosted video contains its own web page with additional captured (and public) data such as author, page views, category, tags, ratings, and comments. Ratings and other commentary are especially interesting because they allow a site to construct a social network around a particular publisher, learning about their likes and dislikes.

    Watching the movie

    Stumptown Coffee Roasters drink menu

    A movie is a series of still frames in sequence. A sampling of frames reveals context, such as recognizing the actors in a particular scene, the backdrop, or when that Coke bottle appeared during a television show. Image analysis outlined in my image search post can be applied to videos and used to better determine context combined with other available information such as audio.

    Parsing spoken word

    Video indexers can listen to the audio track and parse the spoken word in much the same way as stand-alone audio search. The presence of images provides more context than pure audio and provides an extended yet focused vocabulary for comparison. Matching your pronunciation of "cappuccino" to the visual cues and sounds of a cafe assists in speech recognition. Similarly, the presence of a football on screen provides better context for the word "goal" during your family's weekend match.

    Tracking video citations

    Fingerprinting

    Professional videos are often "fingerprinted" with information about the work. A video producer might include frames that are ignored by humans viewing 24-60 frames per second, but identifiable by machines watching for the data.

    Television shows often have frames of text at their beginning to communicate the show title, episode number, year produced, and other data. Special frames may be used before and after a commercial to easily denote a switch from syndicated content to locally inserted media. Techniques from professional video production may find their way into more web videos, especially as amateurs begin using tools previously only within the reach of the pros.

    It is also possible to fingerprint a file based on its description data, length, and other factors. A video site could "roll up" these different references and track the original source by discovery date or direct reference where available.

    As videos are copied and redistributed their digital fingerprint will often remain intact, allowing indexers to recognize and attribute the piece to its original source.
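
    The description-based fingerprint and "roll up" can be sketched as follows. The choice of fields (title plus duration rounded to the nearest second), the SHA-1 hash, and the sighting structure are all assumptions made for illustration. A production system would need fuzzier matching than an exact hash allows.

```python
import hashlib
from collections import defaultdict


def fingerprint(title, duration_seconds):
    """Hash a normalized title and rounded duration into a comparison key."""
    normalized = " ".join(title.lower().split())
    key = f"{normalized}|{round(duration_seconds)}"
    return hashlib.sha1(key.encode("utf-8")).hexdigest()


def roll_up(sightings):
    """Group sightings by fingerprint and pick the earliest-discovered URL as the source.

    Each sighting is a dict with "url", "title", "duration", and an ISO-format
    "discovered" date string, which sorts correctly as plain text.
    """
    groups = defaultdict(list)
    for sighting in sightings:
        groups[fingerprint(sighting["title"], sighting["duration"])].append(sighting)
    return {
        fp: min(group, key=lambda s: s["discovered"])["url"]
        for fp, group in groups.items()
    }
```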

    Videos within videos

    Videos sometimes contain citations and references to other videos. The nightly news referencing the President's State of the Union address will use a single source of video provided by the U.S. government. References to "education reform" may then be applied and ranked based on these video citations and history of heavy citations of government videos, similar to PageRank and other methods used today for other publicly addressable resources.
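
    To make the PageRank comparison concrete, here is a simplified power-iteration sketch over a tiny, invented citation graph. It is a textbook-style approximation with an assumed damping factor of 0.85, not a description of how any search engine actually ranks video.

```python
def rank(citations, damping=0.85, iterations=50):
    """Score videos by citations; `citations` maps a video to the videos it cites."""
    nodes = set(citations)
    for targets in citations.values():
        nodes.update(targets)
    count = len(nodes)
    scores = {node: 1.0 / count for node in nodes}

    for _ in range(iterations):
        updated = {node: (1 - damping) / count for node in nodes}
        for node in nodes:
            targets = citations.get(node, [])
            if targets:
                # Split this video's score among the videos it cites.
                share = damping * scores[node] / len(targets)
                for target in targets:
                    updated[target] += share
            else:
                # A video citing nothing spreads its score evenly.
                share = damping * scores[node] / count
                for target in nodes:
                    updated[target] += share
        scores = updated
    return scores


# Hypothetical graph: two newscasts cite the address, a blog clip cites one newscast.
citations = {
    "news_a": ["state_of_union"],
    "news_b": ["state_of_union"],
    "blog_clip": ["news_a"],
}
scores = rank(citations)
```

    The address ends up with the highest score because both newscasts cite it, and the newscast cited by the blog clip ranks above the one nobody cites.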

    Summary

    Video is a busy space and I feel like I've only scratched the surface with this long post. Expect more companies with expertise in image and audio search to get involved in video search. The image technologies of recent Google acquisition Neven Vision have already been applied to video feeds from security cameras. The audio search technology used by BBN Technologies is now being used by PodZinger to search a video's audio content.

    You can expect to see even more technologies making their way from the security sector into consumer use, as we've already seen happen in image and audio search. Sequence neutral processing may eventually be applied to the space, replacing the multiple serialized analysis passes we have today.

    Video is booming and is not going away anytime soon. Video capabilities are becoming more common in mobile phones, our capture quality continues to increase, and easy-to-use editing tools on the desktop such as iMovie put better tools in the hands of the average user. Video sharing used to involve recording to a VHS or DVD for sharing with friends but is now as easy as a menu option within an editing application or uploading via a web form on a popular hosting site. The growth of media-hungry sites MySpace and YouTube has proved there is a built-in audience waiting for new content. The cat videos, karaoke, short films, and breaking news reports will continue to roll in, creating a need for better search and discovery. Hopefully the search industry is up to the challenge and will continue to surface new and relevant information to an eager audience.

  8. Oct15

    The current state of audio search

    Online audio is definitely on an upswing, fueled by the iPod revolution, improved online playback, and broadband penetration. Audio search is keeping up with demand for new content, thanks in part to national security spending in the Cold War and beyond. In this post I will outline the current state of audio search, and how machines make sense of spoken word, progressing from easy to difficult.

    Mic

    First, let's define the space. I'm interested in how a search engine might index content with non-professionally produced metadata. The President's weekly radio address contains a full transcript. Music catalogs are available for purchase from Muze and others to provide structured data about Bob Dylan and what he's saying. A voicemail message or a podcast might not be as thoroughly described.

    Let's take a look at audio files a search engine might discover during a web crawl and current methods of understanding the content.

    Filetype identification

    Audio content can be broken down into a few unique file extensions that hint at the remote audio container.

    wav
    The waveform audio format is a common form of uncompressed audio on Windows PCs.
    aiff
    The Audio Interchange File Format is a common form of uncompressed audio on Apple computers.
    mp3
    MPEG-1 Audio Layer 3 is a popular form of distribution for compressed audio files.
    wma
    Windows Media Audio, popular on Windows machines.
    asf
    Advanced Systems Format, a container for streaming audio and video commonly used by Microsoft products.
    m4a
    MPEG 4 audio files, most likely Advanced Audio Coding compressed audio created by Apple software.
    ra
    RealAudio format by Real Networks
    ogg
    Ogg Vorbis open source compression format.
    flac
    The Free Lossless Audio Codec is a compressed format used by audioheads and for archival purposes.

    A web search engine can take a look at all of the links in its index and identify possible audio files based on these file extensions without retrieving any file information from the host server. You can search Google for URLs containing "MP3" and referencing "Bob Dylan." Audio files are not currently supported in Google's file type operator. del.icio.us exposes bookmarked audio through the system:media:audio tag.
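
    Here's a minimal sketch of that extension check, working only from the URL without contacting the host. The lookup table mirrors the list above; the function name and the example URLs in the usage note are made up for illustration.

```python
import posixpath
from urllib.parse import urlparse

# Extension-to-format table taken from the list of file types above.
AUDIO_EXTENSIONS = {
    "wav": "Waveform audio",
    "aiff": "Audio Interchange File Format",
    "mp3": "MPEG-1 Audio Layer 3",
    "wma": "Windows Media Audio",
    "asf": "Advanced Systems Format",
    "m4a": "MPEG 4 audio",
    "ra": "RealAudio",
    "ogg": "Ogg Vorbis",
    "flac": "Free Lossless Audio Codec",
}


def audio_format_for(url):
    """Return the likely audio format for a URL, or None if it is not audio."""
    path = urlparse(url).path  # drops the query string and fragment
    extension = posixpath.splitext(path)[1].lower().lstrip(".")
    return AUDIO_EXTENSIONS.get(extension)
```

    Calling audio_format_for("http://example.com/media/Speech.MP3?play=1") should identify an MP3, while a link to index.html returns None.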

    HTML markup

    Audio files found in the wild are often described and referenced from within HTML pages. Here's an example of how an audio file might be described within a web page link:

    
    <a href="speech.mp3"
     type="audio/mpeg"
     hreflang="en-us"
     title="A longer description of the target audio">
     A short description</a>
    

    The href attribute points to the location of the audio file. The audio/mpeg type value provides a hint for user agents about the type of file on the other end of the link. The hreflang attribute communicates the base language of the linked content. The title attribute provides more information about the linked resource, and may be displayed as a tooltip in some browsers. The element value, "A short description," is the linked text on the page.

    It's not very likely publishers will produce more data than the functional effort of href. Title is a semi-visible attribute and therefore more likely to be included in the description, but still uncommon. It's possible to identify audio by a given MIME type such as audio/mpeg but few sites provide the advisory hint of type in their HTML markup. Collecting a file's MIME type requires "touching" the remote file, and will most likely return default values of popular hosting applications such as Apache or IIS, so a search engine is likely better off relying on a local list of mapped extensions and helper application behaviors.

    Syndication formats

    It is possible for a publisher to include more information about a file using a syndication feed combined with a specialized namespace such as the iTunes podcasting spec or Yahoo! Media RSS. A search engine may parse these feeds to gather more information about a particular audio item such as title, description, and length, which often provides a closer correlation than an audio link present on a web page.
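
    As a rough sketch, a crawler could pull those fields out of an RSS 2.0 feed using Python's standard XML parser. The sample feed below is invented for the example; the only namespace handled is the iTunes one, and a production parser would need to cope with many more variations.

```python
import xml.etree.ElementTree as ET

# Namespace URI used by the iTunes podcasting extensions.
ITUNES = "{http://www.itunes.com/dtds/podcast-1.0.dtd}"

# A made-up feed used only for illustration.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd">
  <channel>
    <title>Example podcast</title>
    <item>
      <title>Episode 1</title>
      <description>A made-up episode.</description>
      <enclosure url="http://example.com/ep1.mp3" type="audio/mpeg" length="1234567"/>
      <itunes:duration>12:34</itunes:duration>
    </item>
  </channel>
</rss>"""


def audio_items(feed_xml):
    """Collect title, description, and length data for each enclosure."""
    items = []
    for item in ET.fromstring(feed_xml).iter("item"):
        enclosure = item.find("enclosure")
        if enclosure is None:
            continue  # an item without an attached media file
        items.append({
            "title": item.findtext("title"),
            "description": item.findtext("description"),
            "url": enclosure.get("url"),
            "type": enclosure.get("type"),
            "duration": item.findtext(ITUNES + "duration"),
        })
    return items
```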

    Hosted audio

    Large search engines such as Google, Yahoo!, and Microsoft have not created the same sort of hosted audio community for user-generated content as is present in images or video. Sites such as the Internet Archive host audio such as a Grateful Dead concert complete with data such as artist, title, performance date, equipment used, and audio editors.

    Apple's GarageBand software is one example of integrated recording, compression, descriptive markup, and remote hosting.

    Metadata containers

    Once a search engine reaches out and "touches" the audio file it can discover more descriptive information embedded within. An ID3 tag describes the track title, artist, album, genre, and other information provided by the publisher. The metadata descriptor might contain additional information such as album art, lyrics, or descriptions of a specific segment of the audio file, known as "chapters." An audio metadata parser takes a look at each frame it knows how to read to extract the associated descriptive data.

    ID3 tags often occur at the beginning of the file to assist streaming applications and a metadata indexer might not grab the entire audio file, opting instead to only look for data in those first bytes.
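
    To illustrate reading only those first bytes, here is a small sketch that decodes the 10-byte ID3v2 header: the "ID3" marker, two version bytes, a flags byte, and a four-byte size in which each byte carries 7 bits. The function name is made up, and the sketch stops at the header rather than parsing individual frames.

```python
def read_id3v2_header(first_bytes):
    """Decode an ID3v2 header from the start of a file, or return None."""
    if len(first_bytes) < 10 or first_bytes[:3] != b"ID3":
        return None
    major, revision, flags = first_bytes[3], first_bytes[4], first_bytes[5]
    size = 0
    for byte in first_bytes[6:10]:
        if byte & 0x80:
            return None  # size bytes must leave the high bit clear
        size = (size << 7) | byte
    # tag_size counts the bytes after the 10-byte header itself.
    return {"version": f"2.{major}.{revision}", "flags": flags, "tag_size": size}
```

    An indexer could request the first 10 bytes, read tag_size, and then fetch only that many additional bytes instead of the whole audio file.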

    Parsing spoken word

    Speech recognition has enjoyed rapid improvement over the last decade, thanks in part to the large budgets of national security indexing spoken words captured through ECHELON and other methods. Similar technology is now being applied to medical and legal transcriptions and creating more searchable content for each podcast.

    AVOKE ATX Speech processing

    Speech-to-text software such as AVOKE from BBN Technologies is used to create transcripts of phone calls to call centers, the nightly news, and government surveillance. The system utilizes known vocabularies by language applied over a continuous density hidden Markov model to analyze speech phonemes in various contexts. The system uses multiple passes to determine context and associative clustering of words and phrases.

    Spoken word analysis is utilized in consumer search engine PodZinger to track a search term and jump to the appropriate marker within the file containing the given term. You can search for audio containing mentions of the Athletics and Tigers and view your results in the context of the file with direct links to that segment of the audio program.

    Summary

    Online audio content will only continue to get bigger, as more content makes its way online and into the ears of consumers on a PC, iPod, or other listening device. The maturity of online audio and the current business feasibility should consolidate audio format offerings into audio understood by dominant market players in the desktop, portable, and home theater markets.

    I expect even more speech-to-text work in the future as CPUs, memory, and disk space continue to become cheaper. Perhaps we might even see client-side analysis of content similar to analysis work being conducted on images. Windows Media Player and iTunes are just two examples of popular media players that connect to the Internet to retrieve more information about your media files, from album art to recorded year. In the future such applications might also query data services such as Last.fm, MusicBrainz, or the Music Genome Project to apply more data to each file based on a purchased database, collective intelligence, or expert analysis.

    Creating new sources of audio content is becoming easier. The popularity of VoIP will place new value on microphones connected to our PCs, gaming systems, and other connected electronics devices. Voice will become an integrated feature, allowing you to easily save a compressed audio file of a recent planning call or your Halo trash-talking session.

    I think many search engines have looked past audio search due to the litigious nature of the RIAA and others evidenced by last year's MGM vs. Grokster Supreme Court ruling. Google's recent $1.65 billion purchase of YouTube is perhaps a sign that search technology will continue to advance, challenging any emergent legal roadblocks along the way.

    As with most search sectors, audio search is still in very early stages. Expect known vocabularies and relationship mappings to increase over time, providing more insight not only into each word, but also speaker identification, tone, and possibly even relationships between events such as a power outage's correlation to customer service calls. We'll keep talking and publishing and search will attempt to keep up with our rate of speech, accents, and methods of describing our creations.

  9. Oct14

    The current state of image search

    A picture is worth a thousand words, especially to search engines trying to match a brief search query to a set of appropriate visual results. How can a web search engine collect enough data about a particular image to provide a user with relevant results? In this post I will outline image search concepts, the current state of the art, and some of the challenges with still image search.

    Image on your website

    Yoda statue

    You might recognize the depiction above as Yoda, a popular character in the Star Wars movie series. More specifically this is a picture of a Yoda statue perched on top of a fountain at Lucasfilm's headquarters in San Francisco. Here's what Yoda might look like expressed on a web page.

    
    <img src="yoda.jpg"
     alt="Yoda statue"
     longdesc="yodainfo.html"
     width="195" height="240"
     xml:lang="en-US" />
    

    The above markup communicates a few attributes of the image the publisher would like to display using the img element of (x)HTML. A publisher will specify the location of the file but the other attributes are often not used to add further information about the image.

    I've provided a few extra pieces of data in my example. The alt attribute provides a brief description and is used by browsers as a placeholder while the image is retrieved, if it can be retrieved at all. The longdesc attribute links to a URL with a longer description of the image. The width and height of the image are described in pixels, and all values are provided in English. This extra data is uncommonly used, although XHTML requires both the location of the file (src) and a brief description (alt).

    Most search engines utilize the file name as an approximate descriptor of the image. A digital still camera will create serial file names such as DSC001.jpg, making things much worse!

    Hosted image libraries

    How do image hosting sites provided by major search engines change the ability to search your latest still image? Yahoo!'s Flickr and Google's Picasa encourage users to add extra descriptors to images to enable better discoverability and sharing. The description data is more visible than standard HTML markup, making that DSC0001.jpg image title look pretty ugly. Any Flickr-hosted photo displayed in another web page must also include a link to the full Flickr photo page, thereby creating a long description of the image for all search engines.

    My Flickr photo page of Yoda contains a short description, long description, a set of keywords provided by me and/or other site users, and metadata extracted from the image file such as the date and time reading on the digital still image device on capture. Other data such as geographic coordinates may be extracted and displayed on this page, or I might take a few extra steps to manually add the metadata.

    The popularity of a particular photo measured by the hosting site complements other ranking factors such as page- and author-level link ranks. Data gathering possibilities are defined by the manual data input of each user as well as the information present at time of capture and edit. It's very easy for a search service such as Yahoo! or Google to reach out and "touch" these images stored just a short fiber down the rack.

    External citations

    The image may also be described by links from other websites to the hosted page or the image itself. In this case image search can use similar citation analysis as traditional web search to note how other publishers reference a particular resource.

    Touching the image

    A few more pieces of data are available to indexers once they take a peek inside the actual image file. Date and time of capture, camera settings, location, and copyright data may be described in formats such as Exif or XMP, adding even more context.

    Date and time

    Most digital image capture devices include a clock and timestamp their photos. The time available on a mobile phone syncs over-the-air and is generally more reliable than that of a typical digital camera, which requires additional setup and menu navigation.

    Geolocation

    Where in the world did you take that photo? Mobile phones are delivering better location-aware services with each new release, fueled by government demand for better emergency services for mobile customers and the industry's desire to capitalize on location-aware service offerings. Some phones include an actual GPS receiver while others rely on the same cell tower triangulation that helps deliver a call to your handset.

    A standalone GPS can synchronize its coordinates with a stand-alone digital camera based on the timestamp on each device. A bicyclist with a GPS receiver and a stand-alone digital point-and-shoot can combine data from each gadget and plot their entire bike ride complete with pictures.
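
    A rough sketch of that timestamp matching: given a GPS track sorted by time, find the recorded position closest to the moment a photo was taken. The track format and the camera_offset parameter (how far the camera's clock runs ahead of the GPS clock) are assumptions made for the example.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def nearest_fix(track, taken_at, camera_offset=timedelta(0)):
    """Return the (time, lat, lon) fix closest to the photo's timestamp.

    track is a list of (datetime, latitude, longitude) tuples sorted by time.
    """
    if not track:
        return None
    corrected = taken_at - camera_offset  # translate camera time to GPS time
    times = [fix[0] for fix in track]
    i = bisect_left(times, corrected)
    # Only the fixes just before and just after the photo can be nearest.
    candidates = track[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda fix: abs(fix[0] - corrected))
```

    Running every photo from the bike ride through nearest_fix yields a coordinate pair per picture, ready to plot along the route.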

    A WiFi-enabled camera can ping nearby access points to approximate its current location using location data provided by the access point or by comparing the access point's digital fingerprint against a mapped database such as Microsoft Virtual Earth.

    Copyright data

    A publisher may describe a photo's copyright in plain text or by pointing to a URL with more information. If that URL is Creative Commons it's pretty easy to parse license terms.
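
    For example, Creative Commons license URLs follow a /licenses/<codes>/<version>/ pattern, so the terms can be read straight from the path. The sketch below assumes that pattern and handles only the four common license codes; the function name and return format are invented for the example.

```python
from urllib.parse import urlparse

# The four license elements that appear in Creative Commons license codes.
TERMS = {
    "by": "attribution",
    "nc": "noncommercial",
    "nd": "no derivatives",
    "sa": "share alike",
}


def parse_cc_license(url):
    """Return the terms and version named in a Creative Commons license URL."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host != "creativecommons.org" and not host.endswith(".creativecommons.org"):
        return None
    parts = [part for part in parsed.path.split("/") if part]
    if len(parts) < 2 or parts[0] != "licenses":
        return None
    codes = parts[1].split("-")
    if not all(code in TERMS for code in codes):
        return None  # an unfamiliar code; leave it to a human
    version = parts[2] if len(parts) > 2 else None
    return {"terms": [TERMS[code] for code in codes], "version": version}
```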

    Machine viewing

    Text

    Stumptown Coffee Roasters drink menu

    An indexer might take a look at the photo and try to analyze its depictions. Pictured above is the drink menu from Stumptown Coffee Roasters in Portland, Oregon with lots of text. If a machine could recognize the words "espresso" and "latte" in the picture it could build a richer data set for this image. The same technology is useful for decoding image headers found in web pages and for testing CAPTCHA images designed to be parsed by humans, not machines.

    People, places, things

    Alex B. giggles

    Facial-recognition technology can identify the same photo subject across multiple photo captures by analyzing patterns across common facial attributes. The technology is used by security systems, such as comparing World Cup attendees against a list of known troublemakers. A photo publisher can install software on their desktop computer to analyze each photograph looking for people, places, and things familiar to that person or the software's larger community. The software can then identify a previously tagged person such as the boy pictured above, a picture of the Eiffel Tower already identified by other users of the software, or a Coke bottle present in a photo.

    Google acquired Neven Vision in August to boost their ability to extract information from image depictions. Riya is working on image recognition technology applied to image search.

    Summary

    A search engine has a variety of data available when trying to make sense of a particular image. The most reliable data comes from auto-configured machines, but humans can supplement and correct this data if they choose to involve themselves in the process. Advances in capture hardware and software will continue to add more valuable metadata surrounding the photo, allowing a search engine to better understand the image with less text from the publisher.

    The biggest area for search advancement currently lies in image analysis for text, people, places, and things. National security budgets are currently funding advanced research in this area that will hopefully trickle down to the consumer sector to help us better identify our family photo collections without repetitive data input.

  10. Oct13

    Bluetooth transfer of webpage data

    TransSend button

    The Bluetooth SIG announced wireless transfer of contact, calendar, and notes information via what it's calling TransSend. It implements the OBEX standard you may have used to "beam" someone your contact information in the past. You can send vCard, vCal, vNote, plain text, or image files from a PC to supporting handsets.

    An ActiveX plugin for Internet Explorer lets users send content from a web page to their mobile phone. You can send driving directions, your contact info, or event data.

    Sounds cool, but it looks like the plugin is relying on proprietary markup for recognition instead of using something like microformat markup. According to Phone Scoop not many U.S. handsets have an open OBEX profile, further limiting the usefulness of the extra markup.

  11. Oct10

    Widgets Live! conference in San Francisco on November 6

    The first ever conference dedicated to widgets, gadgets, and modules will take place on Monday, November 6, in San Francisco. The one-day conference will capture and summarize the emerging widget economy and allow developers, business leaders, and content producers to collaborate and better understand how they might participate in syndication at the edge of the network.

    Widget endpoints

    A small web loosely joined.

    I am organizing a conference named Widgets Live! next month in partnership with Om Malik to capture the emerging webspace of widgets. There's so much happening in the fast-moving widget space right now it's a bit difficult to keep track of it all. Feed your Chia Pet on your desktop. Let your blog visitors play Hangman using popular words of the day. Consult your calendar from your homepage. Check the weather from your coffee maker. There is so much activity in the customizable web powered by widgets we felt it was time to bring together the major players for a one-day industry overview and tutorial. We hope you can join us.

    Tickets are only $100 and available now. I'll blog more details about what it's like to plan a conference and the decisions made by organizers at a later date as everyone I've talked to so far has been intrigued at the behind-the-scenes operations of the industry, from $13 for a cup of coffee to the real reasons why conference WiFi is often horrible. The faster the conference sells out the more time I'll have for those posts! ;)

  12. Oct10

    Moderating video search panel tonight in Mountain View

    I am co-moderating tonight's Search SIG event on the video ecosystem with Om Malik. Speakers include founders of VideoEgg, CastTV, Dabble, and POSTroller. I'll do my part to make sure the topic stays on search and discoverability, and hopefully we won't get too caught up in billion dollar buyouts.

    The event takes place at Microsoft's Silicon Valley campus starting at 6:30 p.m. If you're in the area and have an interest in search or video come by and check out the crowd and presentations.

    I'll be down in Silicon Valley for most of the day, visiting Google for lunch and settling into a cafe (most likely Barefoot Coffee Roasters) for the afternoon.

  13. Oct10

    Google acquires YouTube for $1.65 billion

    Google shocked the online world this weekend with its acquisition of leading video site YouTube for $1.65 billion in Google stock. YouTube will maintain its brand and site, and move into its new San Bruno offices this week as planned. Hitwise estimates YouTube's market share in September at 46%, with an even stronger share in Europe. Google Video had an estimated market share of 11% in the same period.

    The $1.65 billion acquisition places YouTube at about the same purchase price adjusted for inflation as eBay's acquisition of PayPal in 2002 for $1.5 billion. I'm sure the similarities are not lost on the founders and board members formerly of PayPal. I think the acquisition price is absolutely nuts.

    Online video traffic

    While Google Video has been a popular destination for copyrighted videos such as NBA basketball or Nickelodeon cartoons, YouTube has become the hub for all things video. Users create videos specifically for YouTube and its audience, and publishers cross-post their work just to tap into the sheer volume of YouTube watchers. The most popular videos from other sites often make their way to YouTube, including copyrighted works such as Daily Show clips or the latest soccer highlights.

    YouTube is a mass-market play for Google, differentiated from the paid distribution system highlighted by Google Video. The acquisition delivers millions of pageviews with high CPMs delivered through Google's targeting abilities and video advertising inventory. According to the Wall Street Journal, Google's Eric Schmidt and Advertising VP Tim Armstrong will meet with News Corp. Chairman Rupert Murdoch, President Peter Chernin, and Fox Interactive head Ross Levinsohn later this week to discuss Google integration with MySpace, and I'm sure YouTube will be a hot topic of discussion.

    YouTube launched on December 15, 2005, ten months after the co-founders registered the domain name. Chad Hurley was a former designer at PayPal and Steve Chen was still employed at eBay as a software engineer. The PayPal alumni network served them well, leading to two venture rounds from former PayPal CFO Roelof Botha at Sequoia Capital. Hedge fund Artis Capital Management and Wilson Sonsini Goodrich & Rosati lawyer Stephen Welles also participated in the B round.

    Timeline

    I've put together an interactive timeline detailing the history of YouTube from the day the first founder quit PayPal until today's acquisition. Each event is clickable and a few contain links to more information. Enjoy!

    YouTube timeline
  14. Oct08

    Movable Type turns 5

    Movable Type logo 2001

    Five years ago today Benjamin Trott and Mena Grabowski Trott released Movable Type 1.0. About 100 copies of the blogging software were downloaded within the first hour of availability, and over 500 people had requested notification of each release.

    We've never claimed to be the best.

    We've never presented MOVABLE TYPE as the program that will revolutionize weblogging.

    We're just developing a system with a lot of the features that we've heard users are looking for.

    Luckily, we've received a lot of good word of mouth. People are hoping that MT will be THE program and THE solution.

    A brief history

    Movable Type launched Six Apart, a company that originally made money through paid custom installs, donations, and commercial licenses. The company later hosted its own version of Movable Type named TypePad, selling monthly subscriptions and licensing the hosted group blogging software to companies around the world. Six Apart bought LiveJournal in January 2005. Six Apart has recently been working on Vox, its first blogging software written from scratch with the resources of a 125-person company.

    Happy birthday Movable Type and Six Apart. Five years seems like such a long time looking back before the multiple VC rounds, women baring their breasts in protest outside the office, Christmas parties, and over 100 employees around the world.

  15. Oct06

    Preparing your feeds for IE7

    IE7 thumbnail

    Internet Explorer 7 will be released in just a few weeks, pushed to Windows XP users as a critical update. The Windows RSS Platform ships as part of IE7 and will likely become the most popular desktop aggregator by the end of the year. Are you ready for the switchover?

    There are changes to CSS and JavaScript handling and an OpenSearch search box you should probably code against if you would like quick and easy access to your site and its archive. I'm mainly interested in the changes in feed syndication so I'll walk through some areas that might trip you up as a publisher.

    IE7 feed view

    Valid XML only

    Is your feed valid XML? If you or your customers are outputting content with invalid characters, an undefined namespace, or a non-breaking space (&nbsp;) the Windows RSS Platform will disregard your feed updates. A snapshot of Google Reader's subscriptions last December found about 7% of the feeds it indexes are not well-formed XML.
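
    You can run a quick well-formedness check yourself before IE7 does it for you. The sketch below uses Python's standard XML parser, which is not the parser inside the Windows RSS Platform, so treat a passing result as a first sanity check rather than a guarantee. The function name is made up for the example.

```python
import xml.etree.ElementTree as ET


def well_formed_error(feed_xml):
    """Return None for well-formed XML, or the parser's error message."""
    try:
        ET.fromstring(feed_xml)
    except ET.ParseError as error:
        return str(error)
    return None
```

    A title containing &nbsp; or an element using an undeclared namespace prefix produces an error message, while the numeric reference &#160; passes.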

    Use modern feed formats

    The platform includes support for feed formats RSS 2.0, RSS 1.0, and Atom 1.0. If you are still outputting in RSS 0.91, RSS 0.92, or Atom 0.3 IE7 will still support the format, but you are encouraged to upgrade to a more recent feed format for the best support. Feeds that reference a DTD are considered a potential security issue and the feed parser will reject the feed and display an error message.

    Auto-discovery

    Feed button IE7

    Can web browsers easily find your feeds? Internet Explorer 7 tries to auto-discover feeds referenced as a link alternate in your HTML. IE7 mimics Firefox's auto-discovery behavior, so if you notice your feed(s) lighting chicklets in Firefox you should be all set. Your web server can help identify feeds by serving the correct MIME types for each feed type such as application/atom+xml or text/xml. Browsers take a number of steps when trying to identify your feed. If you produce better output the browser does less work!
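
    Here's a simplified sketch of what auto-discovery looks for, using Python's built-in HTML parser. It only approximates browser behavior: the set of accepted MIME types (including application/rss+xml, which is not mentioned above) and the class and function names are assumptions for the example.

```python
from html.parser import HTMLParser

# Feed MIME types this sketch accepts; real browsers check more cases.
FEED_TYPES = {"application/atom+xml", "application/rss+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect <link rel="alternate"> elements that advertise a feed."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        feed_type = (attributes.get("type") or "").lower()
        href = attributes.get("href")
        if "alternate" in rel and feed_type in FEED_TYPES and href:
            self.feeds.append({"href": href, "type": feed_type,
                               "title": attributes.get("title")})


def discover_feeds(html):
    """Return the feeds referenced as link alternates in an HTML page."""
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feeds
```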

    Check for valid feed names

    Do you have a valid feed name? The Windows RSS Platform supports feed names between 1 and 120 characters in length. A name may not contain a back-slash ("\") or Unicode control characters in range 0-31.
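
    Those rules are simple enough to check before publishing. The sketch below encodes only the constraints listed above; the platform may enforce others not covered here, and the function name is made up.

```python
def is_valid_feed_name(name):
    """Check a feed name against the length and character rules above."""
    if not 1 <= len(name) <= 120:
        return False
    # Reject a back-slash or any control character in the range 0-31.
    return not any(ch == "\\" or ord(ch) <= 31 for ch in name)
```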

    Check for valid feed markup

    The Feed Validator can help you find more issues in your feeds that might cause problems for feed parsers. The feed validator project is open-source and you can run your own local copy using Python.

  16. Oct05

    Google Blog Search adds ping beacon, changes.xml

    Google blog ping

    Google Blog Search is now accepting pings and republishing the updates it observes. You can submit an update using XML-RPC or REST, similar to other blog services and easily added to your weblog's ping configuration.

    RPC endpoint: http://blogsearch.google.com/ping/RPC2

    Google publishes the last 5 minutes of ping activity in its changes.xml file. It is possible to receive pings of different recency by adding the last parameter to your request with a number of seconds between 1 and 300.
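
    Here's a hedged sketch of both pieces using Python's standard library. It assumes Google accepts the weblogUpdates.ping method name used by other ping services, which is not spelled out above, and it only builds the request body and query string rather than sending anything over the network.

```python
import xmlrpc.client
from urllib.parse import urlencode

PING_ENDPOINT = "http://blogsearch.google.com/ping/RPC2"  # from the post above


def build_ping_payload(blog_name, blog_url):
    """Build the XML-RPC request body announcing an updated weblog.

    Assumes the conventional weblogUpdates.ping method name.
    """
    return xmlrpc.client.dumps((blog_name, blog_url),
                               methodname="weblogUpdates.ping")


def changes_query(last_seconds):
    """Build the query string asking for the most recent N seconds of pings."""
    if not 1 <= last_seconds <= 300:
        raise ValueError("last must be between 1 and 300 seconds")
    return urlencode({"last": last_seconds})
```

    To send a real ping you would POST the payload to PING_ENDPOINT, or let xmlrpc.client.ServerProxy handle the transport for you.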

    The new service is a change in Google's view of the web, accepting the value of fresh index content within minutes instead of waiting for the regular polling schedule.

  17. Oct04

    Google Code Search

    Google Code Search

    Google has a new search product focused on source code. It peeks inside tarballs and other recognized formats, allowing you to search the index by regex, license, or language. It's pretty easy to see how many projects are using a given library (such as feedparser or magpie), and people keep inventing new ways to explore software.

    You can access the code search engine through a GData Atom feed for easy integration wherever you choose.

    I find Google Code Search is easier to use than Koders, and may come in handy when looking for different ways of approaching a particular programming problem or library.

  18. Oct04

    NetRatings finds 40% of online Britons use news feeds

    A study by Nielsen//NetRatings found 40% of Britons receive automatic news feeds to their browser or desktop but 69% had never heard of Really Simple Syndication. About 15% of the people surveyed have heard of an iPod but are not sure what it is. A three-page PDF summarizing the study is available from NetRatings.

    British knowledge of technology terms

    The use of acronyms caused a marked drop in user knowledge. 29% of those surveyed knew what "IM" meant but 86% knew the term "instant messaging." The average online Briton now owns 4-5 digital or networked devices. 3G mobile phones were more common than iPods and DVRs were only slightly less popular than a gaming console.

  19. Oct03

    Google Gadgets on your webpage

    Google "Universal" Gadgets are now available for blogs and pages around the web. A single Google gadget can now be deployed on Google Personalized Homepage, Google Desktop, Google Page Creator, or via a JavaScript embed on any editable webpage.

    You can add PacMan to your blog sidebar, display photos uploaded to Picasa on your MySpace page, or add a Google Reader viewer anywhere.

    Google's support for webpage embeds brings the Google Reader story full circle. The team originally envisioned an RSS widget available on a blog sidebar and the project grew into much more. You can now access the web application in many different versions in various states of privacy including your shared items marked up in HTML or Atom, a full feed aggregator, or a widget for your homepage, desktop, or blog.

    Create your own borders

    Google is using a new domain, gmodules.com, to serve the embedded gadgets. The embeds reference a Google-styled border preference (http://gmodules.com/ig/images/) but you can create your own custom border of GIF files if you stick to Google's naming convention. You can also define your preferred border using CSS.

    Custom GIFs

    tl
    top left corner
    tt
    top top
    tr
    top right corner
    l
    left side
    r
    right side
    bl
    bottom left corner
    b
    bottom bottom
    br
    bottom right corner
  20. Oct02

    Open Hack Day helps build YDN from the inside

    Yahoo! hosted a public hack day last weekend, inviting 400 developers to learn more about the company, web development best practices, and how to use Yahoo! services in their own products and projects. The Yahoo! open hack day was the first big effort by a newly formed team seeking to gather support inside and outside Yahoo! as the programming world begins to embrace connected services in the data cloud. In this post I will provide some background on the team behind the event and present some of the direct and indirect benefits obtained within Yahoo! for their hard work.

    Background

    The Yahoo! Developer Network and its parent product group have been reborn under new staff and management over the last three months. In June Bradley Horowitz stepped into a new role as VP of Product Strategy leading a product group that includes a few new initiatives and staff. Scott Gatz joined the group to work on project incubations and Caterina Fake is another recent addition in the technology development group.

    The Yahoo! Developer Network has been pretty busy the last few months under the new leadership of Chad Dickerson. In January YDN manager Toni Schneider left Yahoo! to join startup Automattic. A few other staff members left or were fired, and new management put in place under a new organizational structure under Bradley Horowitz. Yahoo! employees Jeremy Zawodny, Kent Brewster, and Matt McAlister replaced the empty headcount and a new team was formed

    Hack Day

    Yahoo! Hack Day Q3 2006 poster

    One of the first tasks of the newly assembled team was a mad dash to put together a public Yahoo! hack day a few weeks after the latest internal hackathon. It was a time to get to know teammates and pitch in during what would be a defining moment for the team both inside and outside of Yahoo!

    Ryan Kennedy of Yahoo! Mail, Douglas Crockford, Kent Brewster, Tenni Theurer

    A physical event with over 400 attendees expecting food, drinks, Internet connectivity, a space on the lawn, and perhaps help with personal hygiene such as showers takes a lot of work. The event pushed internal teams to expedite their API development work; Flickr's JSON support, for example, was completed in a day. The amount of external attention focused on Yahoo! over one weekend was a great catalyst for internal developers to put in a little extra work and see the results of their labor first-hand, meeting developers face-to-face and fixing bugs in API frontends in near real-time.

    The connection Yahoo! API developers and product teams made with the customer experience this weekend will strengthen support for YDN moving forward. YDN doesn't have an easy job and needs all the internal support it can get to be successful. It relies on the development time and servers of individual teams within Yahoo! to remain a success internally as well as in the marketplace. The continued public exposure of its work to top executives such as Jeff Weiner, Ash Patel, and David Filo will help create continued support for a group without direct revenue.

    Summary

    Overall the open hack day was a big success for the newly formed team and helped solidify its identity within the company. The developer event benefited from first-mover advantage as other large Internet companies look for new ways to embrace web development efforts through open APIs and developer relations programs.

    In a future post I will write about how other companies might create an open culture of participation by hosting their own events actively engaging both focused and broad communities.

  21. Oct01

    Microsoft awards three Windows Live MVPs

    Microsoft MVP logo

    Microsoft has awarded its Most Valuable Professional status to three web developers. The MVP program is Microsoft's way of recognizing the work and contributions of independent developers, and these individuals are rewarded with a fast track to product feedback teams among other benefits. The first three recipients include a consultant in Australia who maintains the Via Virtual Earth community and a consultant in Washington D.C. who creates Microsoft gadgets.

    Recognition of independent third-party developers and community leaders will play a significant role in the rollout of web-as-a-platform strategies from big Internet companies. How do you reward exceptional contributors to your development community and keep them loyal? Independent developers want to be heard, and on a small merit-based scale it's achievable to receive high-quality feedback from the people who tweak your product every day. I would like to see more companies introduce MVP-like programs to reward outstanding external contributors.

Niall Kennedy is a web technologist in San Francisco, California in the United States. I am very interested in the world of...
