Igor Jablokov interview on multimodal search

Igor Jablokov

Last Monday night I sat down with Igor Jablokov, an IBM program director working on new methods of multimodal search using open standards, to do a podcast. Multimodal search adds voice commands to a visual display to allow easy access to a long list of commands and contextual information. The technology is currently used in web browsers, mobile phones, and automobile computing systems. I also recorded a presentation by Igor on mobile search at Mobile Monday in April.

IBM is one of the contributors to the VoiceXML proposed standard. Opera and Motorola are also active contributors. IBM promotes a voice-activated system that combines XHTML, VoiceXML, and XML Events. The open software works across many server and client platforms and includes an Eclipse-based environment for creating voice-enabled content.

Igor showed off a Samsung phone running Windows Mobile with a prototype of the WebSphere multimodal browser. The browser accepts search queries for Yahoo! Local and returns voice-enabled results using Yahoo!’s web service APIs.

We discussed dynamic grammars, a new development in mobile search that creates acceptable grammars specific to a returned data set. If you are in your car waiting for an urgent e-mail, you can ask your car to retrieve all new e-mails with an urgent status, and the system can then build a grammar based on the senders in the returned data set.
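A minimal sketch of the dynamic grammar idea, written in Python. It assumes the returned data set is a list of dictionaries with a `sender` key and emits a grammar in JSGF, the format raised in the interview questions. The function name `build_sender_grammar`, the grammar name, and the message structure are hypothetical and were not described in the interview.

```python
import re


def build_sender_grammar(messages, grammar_name="urgent_senders"):
    """Return a JSGF grammar whose alternatives are the senders found in `messages`.

    `messages` is assumed to be a list of dicts with a "sender" key
    (a hypothetical shape chosen for this sketch).
    """
    senders = []
    for message in messages:
        # Keep only lowercase letters, digits, and spaces so each name is a
        # plain sequence of JSGF tokens.
        name = re.sub(r"[^a-z0-9 ]+", " ", message["sender"].lower())
        name = " ".join(name.split())
        # Skip empty names and duplicates while preserving first-seen order.
        if name and name not in senders:
            senders.append(name)
    if not senders:
        raise ValueError("no senders to build a grammar from")
    alternatives = " | ".join(senders)
    return (
        "#JSGF V1.0;\n"
        f"grammar {grammar_name};\n"
        f"public <sender> = {alternatives};\n"
    )


if __name__ == "__main__":
    urgent = [
        {"sender": "Alice Smith"},
        {"sender": "Bob Jones"},
        {"sender": "alice smith"},
    ]
    print(build_sender_grammar(urgent))
```

The recognizer only has to choose among the names that actually appear in the result set, which is the point of generating the grammar after the query rather than shipping one large static grammar.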

Igor is tasked with building for the future. Many of the technologies we discussed are not expected to be mainstream until 2008 or 2010. Companies involved in creating these voice-enabled interfaces are already planning for 2015.

Thanks to Igor for requesting this interview and Text100 for making all the arrangements.

My audio interview with Igor Jablokov is available in MP3 format. The 28-minute interview is a 12.9 MB download.

Interview questions

  1. What are some of the biggest obstacles in mobile search today?
  2. What is the XHTML+Voice proposal?
  3. What devices and software support the service today?
  4. What companies are outputting content in this format?
  5. What is IBM’s involvement? What other companies are involved?
  6. How is it being used in the car?
  7. How can you accommodate a variety of accents and dialects? A thick Irish accent is supposed to be very difficult to compute.
  8. You brought a new mobile prototype with you today. What’s exciting about this advancement?
  9. Tell me about mixed initiatives. What are the current use cases and implementations?
  10. I’ve used voice software in the past and I felt the need to slow down and enunciate. How has voice recognition improved?
  11. Tell me about JSGF.
  12. How can you create dynamically generated grammars?
  13. Why should I, as a small company, be interested in X+V? Where is the ROI?
  14. What are some ways we can voice-enable our site? What changes do we need to make?
  15. What are some of the largest grammar implementations right now and what sort of hardware is needed to deal with that?
  16. What are some competing standards and implementations? Microsoft Speech?
  17. What are some of the tools I need to get started?
  18. What’s coming next? How can I build an application for the next generation of devices and standards?


WordPress Germany and Google Maps

The team over at Deutsche WordPress just added the ability to browse WordPress blogs in Germany using Google Maps. Deutsche WordPress maintains a searchable directory of WordPress-powered blogs in Germany, and each directory level contains a new map overlay of locations. Check out WordPress blogs in Germany focused on soccer, for example. Google Maps currently provides no map coverage of Germany, so the site is using only the country outline at the moment.


Matt Mullenweg on VeriSign’s move into the blog space

Matt Mullenweg recently posted his views on VeriSign moving into the blog space with its acquisition of Weblogs.com. Matt is a lead developer of WordPress, an open source blogging tool, and one of the developers of Ping-o-Matic, a ping relay service that currently forwards a blog update ping to over 20 destinations. Matt has some first-hand experience with the team at VeriSign.

We should have been better prepared for this. Earlier in the year Verisign had the Boston Consulting Group calling people in the space trying to pick their brains, while at the same time refusing to reveal who they were working for. (Shady.) The “real time web” group also took me to dinner at one point and outlined their view for a “value-added” ping ecosystem (with Verisign in the middle, of course). Every major content producer and every company relying on the ping stream should be very worried about this move.

I think the blogosphere is currently waiting for VeriSign to unveil more of its intentions for these services, but so far the company seems to be off to a shaky start.


VeriSign acquires Moreover

Moreover Technologies has been acquired by VeriSign for between $25 million and $30 million. The acquisition is VeriSign’s second announced move in the blogging space in the past week, following its acquisition of Weblogs.com. Moreover founders David Galbraith and Nick Denton confirmed the deal on their own personal blogs. Rafat Ali reports Google came in with a higher bid a little too late.

Moreover currently powers services such as the My MSN feed modules, and Microsoft has to be questioning its relationship with Moreover if it was not already. It’s interesting to hear Google attempted to make a play for the company, considering Google has taken a very do-it-yourself approach to content and search.


Yahoo! Blog Search

Yahoo! launched its blog search product tonight as a sub-property of Yahoo! News. Here, for example, is a results page on Yahoo! Blog Search for “Bush.” The content is also exposed on Yahoo! News Search result pages. Here is a results page on Yahoo! News Search for “Bush” showing four blog results in the right sidebar.

Yahoo! branded the new search as “Blog search” but it is obvious from the results that Yahoo! is currently focused on one file format: RSS. Every search result in my test searches includes a link to the source RSS feed.

The Yahoo! Search blog notes the index “contains content from a subset of blogs” but says the team would like to expand the index to include sources pinging blo.gs (currently tracking 9,730,011 blogs). My guess is the index contains content from My Yahoo!, a source that is already well structured. News.com mentions the index currently includes only “hundreds of thousands of blogs,” which seems very small. Search Engine Watch confirms the index source is My Yahoo!.

My Yahoo! feed count

The index seems limited to recent posts. A search for the last Super Bowl (Super Bowl XXXIX) returns only one result, and it’s from a site selling merchandise. Yahoo! might argue that, as a news property, blog search is only interested in timely data from the last month.

The search engine result pages (SERPs) sort by “relevance” by default, applying Yahoo!’s secret sauce to bring popular items to the top. You can sort your results by date with an additional click.

Individual Yahoo! RSS Search results include channel title, channel link, item title, publication date, and what appears to be a non-contextual excerpt. What’s non-contextual? Yahoo! quotes the beginning of the description instead of focusing your attention on the occurrence of your search term.
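To make the distinction concrete, here is a small Python sketch of both excerpt styles. It is an illustration only and says nothing about how Yahoo! builds its excerpts; the function names and the `radius` parameter are made up for this example.

```python
def leading_excerpt(description, length=60):
    """Non-contextual: quote the start of the description, ignoring the query."""
    return description[:length]


def contextual_excerpt(description, term, radius=30):
    """Contextual: quote a window of text centered on the first match of `term`."""
    position = description.lower().find(term.lower())
    if position == -1:
        # No match in the description, so fall back to the leading text.
        return description[: 2 * radius]
    start = max(0, position - radius)
    end = min(len(description), position + len(term) + radius)
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(description) else ""
    return prefix + description[start:end] + suffix


if __name__ == "__main__":
    text = (
        "Welcome back to the weekly roundup of links and commentary from around "
        "the web. Today President Bush spoke about energy policy in a televised "
        "address, and several bloggers responded within the hour."
    )
    print(leading_excerpt(text))
    print(contextual_excerpt(text, "bush"))
```

With the sample text, the leading excerpt never reaches the search term, while the contextual excerpt is built around it.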

The SERPs include tagged photographs from Flickr on the right sidebar complete with the photograph’s title, thumbnail, and author.

The previously exposed alpha version of Yahoo! RSS Search included the ability to search by time, relevance, and popularity. The final version of Yahoo! RSS Search combines popularity and relevance into one sort function.

You can subscribe to any search as an RSS feed. Yahoo! promotes its own My Yahoo! property of course, but you can also follow the page’s alternate link or the white-on-orange XML button.
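Feed readers usually find that alternate link by scanning the page for `link` elements with `rel="alternate"` and a feed MIME type. The Python sketch below shows the general technique using only the standard library; the sample markup and the example.com URL are placeholders, not Yahoo!’s actual page source.

```python
from html.parser import HTMLParser

# MIME types commonly used to advertise RSS and Atom feeds.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect href values from <link rel="alternate"> elements that point to feeds."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        mime_type = (attributes.get("type") or "").lower()
        href = attributes.get("href")
        if "alternate" in rel_values and mime_type in FEED_TYPES and href:
            self.feeds.append(href)


def find_feeds(html):
    """Return the feed URLs advertised in an HTML document, in document order."""
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feeds


if __name__ == "__main__":
    # Placeholder markup for a search results page.
    page = """
    <html><head>
      <link rel="stylesheet" type="text/css" href="/style.css">
      <link rel="alternate" type="application/rss+xml"
            title="Search results" href="http://example.com/search.rss">
    </head><body></body></html>
    """
    print(find_feeds(page))
```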

I am still digging deeper while attending a mobile search event at Google. More later.

Blogger adds inbound links to posts

BackLinks configuration

Blogger blogs can now easily add a listing of inbound links to any blog entry. The new feature uses Google Blog Search’s URL search feature to display a list of links on the individual post page.

Each view of the individual post page results in a dynamic JavaScript call to generate an array containing a link URL, title, excerpt, author, and time. Each link uses the “nofollow” attribute value. Users without JavaScript see a link to Google Blog Search for the post URL.
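Blogger’s own implementation is JavaScript running in the page, and its code is not reproduced here. As a rough illustration of the data shape described above, this Python sketch renders a list of inbound links with the five listed fields and marks every anchor `nofollow`. The `Backlink` class, the `render_backlinks` function, and the output markup are all hypothetical.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Backlink:
    """One inbound link, using the five fields listed in the post."""
    url: str
    title: str
    excerpt: str
    author: str
    time: str


def render_backlinks(links):
    """Render inbound links as an HTML list; every anchor carries rel="nofollow"."""
    if not links:
        return ""
    items = []
    for link in links:
        # Escape every field so untrusted text cannot break out of the markup.
        items.append(
            '<li><a href="{url}" rel="nofollow">{title}</a> '
            "by {author} at {time}: {excerpt}</li>".format(
                url=escape(link.url),
                title=escape(link.title),
                author=escape(link.author),
                time=escape(link.time),
                excerpt=escape(link.excerpt),
            )
        )
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


if __name__ == "__main__":
    sample = [
        Backlink(
            url="http://example.com/post?id=1&ref=2",
            title="Tom & Jerry on backlinks",
            excerpt="A short quote from the linking post",
            author="Pat",
            time="2005-10-12 09:00",
        )
    ]
    print(render_backlinks(sample))
```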

Blog authors can enable the inbound links display across their entire blog and alter the setting on a per-post basis. Authors can delete any of the listed links for their blog post. The feature is disabled by default.

Google supports comment notification via e-mail but I am not sure if it is currently possible to receive link alerts via e-mail as well.

The integration makes a lot of sense and is a good showcase for tracking links to your blog.


Google Reader

Google just released its web-based feed reader, Google Reader. Users can log in using their Google account and track web feeds in a two-column layout with a default sort of “relevance.”

Google is not currently pulling from its archive of past blog entries and only displays the items currently present in the feed. The first column displays a list of subscribed feeds and switches to a list of posts when you click an item. Posts are displayed in the sidebar drawer using title and publication date only. The complete entry display contains the entry title, author, publication date, link, and content. Google Reader recognizes audio and video enclosures, adding a link to “original audio source” or “original video source” when one is found.
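One plausible way to make that audio or video decision is to look at the MIME type on each RSS 2.0 `enclosure` element. The Python sketch below does this with the standard library. It is a guess at the general technique rather than Google Reader’s code, and the function name and sample item are invented for the example.

```python
import xml.etree.ElementTree as ET


def enclosure_labels(item_xml):
    """Return (label, url) pairs for audio and video enclosures in one RSS item."""
    item = ET.fromstring(item_xml)
    labels = []
    for enclosure in item.findall("enclosure"):
        mime_type = (enclosure.get("type") or "").lower()
        url = enclosure.get("url")
        if mime_type.startswith("audio/"):
            labels.append(("original audio source", url))
        elif mime_type.startswith("video/"):
            labels.append(("original video source", url))
        # Other enclosure types (images, documents) get no media link here.
    return labels


if __name__ == "__main__":
    # Placeholder item; the URL is not a real file.
    sample_item = """
    <item>
      <title>Interview podcast</title>
      <enclosure url="http://example.com/interview.mp3"
                 length="12900000" type="audio/mpeg"/>
    </item>
    """
    print(enclosure_labels(sample_item))
```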

Each account can add a star to a post, similar to the Gmail interface. Google Reader also supports “labels” for feeds and posts, which work just like tagging under a different name.

The JavaScript makes obvious use of Gmail code, right down to variable names such as “_MSG_GMAIL.”

Google is using a User-Agent of “FeedFetcher-Google; (+http://www.google.com/feedfetcher.html)” to power its reader, but unlike Bloglines or My Yahoo!, Google does not currently communicate the total number of readers or viewers of your content.

It looks like Google Reader plans to integrate a Flash-based audio player into the reading interface. Judging from the source code, Google Reader also plans to add support for author tags.

Yahoo! RSS awareness whitepaper

Yahoo! surveyed over 4,000 Internet users in August in an attempt to quantify the ubiquity of RSS, and has released a whitepaper covering some of the findings.

Findings

Only 12% of those surveyed were aware of RSS, and only 4% have knowingly used RSS. 27% of respondents had interacted with RSS content in personalized start pages such as My Yahoo! but did not realize they were using RSS.

The average RSS user subscribed to 6.6 feeds and spent an average of 4.1 hours per week reading those feeds. Only 7% of RSS-aware users mentioned instant updating as a benefit of the medium, suggesting the sources play a larger role than their timeliness. This finding allows aggregators to worry less about ping and poll frequency and more about sourcing their list of available feeds.

Only 17% of survey respondents had ever seen a white-on-orange XML button and only 4% had ever clicked the button. Users who clicked the button either copied and pasted the URL into a newsreader, clicked on another button on the feed view to add the feed to a newsreader, or left the site. 50% of RSS-aware respondents choose feeds from the list available in their aggregator.

Reflections

The results of the survey could be interpreted as user demand for My Yahoo!, small numbers of feeds, a browsable feed directory, and the spread of “Add to My Yahoo!” chiclets on every blog. Browsers such as Firefox 1.5 and Internet Explorer 7 are just now beginning to integrate web feeds as a rich software experience and should change usage of online services such as My Yahoo!. The survey shows there is a large available market for stand-alone aggregators to offer features and feed browsing capabilities beyond what exists in one browser window inside a personal start page such as My Yahoo!.

The biggest surprise to me was the value of the browsable feed listing built into each tool. Blog authors should be aware of their placement within such listings and perhaps consider a paid listing for increased subscriptions.
