Recently in Google Category

News, announcements, and analysis about Google search, ads, and apps.

  1. Apr14

    Google search referer changes

    Google will roll out a change to its search results pages later this week designed to better capture outbound clicks. Google search result pages will link to a gateway URL before delivering the visitor to his final destination. These gateway URLs will replace search result URLs exposed via the Referer HTTP header. Google announced the new gateway page on its Google Analytics blog, giving webmasters a few days to prepare for the change.

    What is changing?

    The Referer path for Google search results will change from /search to /url. It is still not clear which URL parameters from the search page will be passed through the gateway. The search term, q, is still preserved inside the sample URL provided by the Google Analytics blog.

    Before
    http://www.google[.sld].tld/search
    After
    http://www.google[.sld].tld/url

    Scripts, plugins, and helpers relying on a fixed Referer path for content highlighting or targeting will need to adjust their code as Google's change spreads throughout its data centers worldwide.
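
    A script that keys on the Referer path could accept both forms during the rollout. The following is a minimal sketch, assuming a runtime with the standard URL class (a modern browser or Node.js). The name parseGoogleReferer, the host pattern, and the return shape are illustrative assumptions, and q is treated as optional because it is not yet confirmed which parameters survive the gateway.

```javascript
// Sketch only: the helper name, host pattern, and return shape are assumptions.
// Matches www.google.com as well as country domains such as www.google.co.uk.
const GOOGLE_HOST = /^www\.google(\.[a-z]{2,3}){1,2}$/;

function parseGoogleReferer(referer) {
  let url;
  try {
    url = new URL(referer);
  } catch (err) {
    return null; // empty or malformed Referer header
  }
  if (!GOOGLE_HOST.test(url.hostname)) return null;
  // Accept the old results path and the new gateway path.
  if (url.pathname !== '/search' && url.pathname !== '/url') return null;
  // q may be missing on the gateway URL, so callers must handle null.
  return { path: url.pathname, query: url.searchParams.get('q') };
}
```

    Calling parseGoogleReferer('http://www.google.co.uk/url?q=chrome') returns an object with path '/url' and query 'chrome', while a referer from any other host returns null.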

    Why the change?

    Google is likely making this change to better track search actions and shield URL parameters from sites downstream. Gateway URLs dependably capture click data and reformat the information passed along to external sites.

    Search engines evaluate customer satisfaction based partly on outbound click behavior. Searchers who consistently click on the third search result may be sending Google a signal about that content's authority for a search term and therefore influencing the ranking algorithms. Traditionally such an action would be measured with a JavaScript onclick event added to the link to pass a signal back to the search engine's servers before taking the searcher to his destination. JavaScript tracking does not work on all clients, including clients accessing search results with JavaScript turned off (e.g. through Google's APIs or a feature phone).

    The search result page includes detailed information needed by Google to deliver the best possible result. A search might include a location from a GPS sensor, social context drawn from a group or custom search engine parameter, or other sources of questionable exposure. Google will only expose a few relevant parameters in URLs included in a web browser's Referer headers.
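
    The gateway idea can be sketched as a function that copies only an allowlist of parameters from the results page into the outbound link. This is a hypothetical illustration, not Google's actual format: the allowlist, the url parameter name, and the function name buildGatewayUrl are all assumptions.

```javascript
// Hypothetical sketch: the parameter names and the allowlist are assumptions,
// not Google's actual gateway format.
const PASS_THROUGH = ['q'];

function buildGatewayUrl(resultsPageUrl, destination) {
  const page = new URL(resultsPageUrl);
  const gateway = new URL('/url', page.origin);
  // Copy only allowlisted parameters; location or social context stays behind.
  for (const name of PASS_THROUGH) {
    const value = page.searchParams.get(name);
    if (value !== null) gateway.searchParams.set(name, value);
  }
  // The destination rides along so the gateway can redirect after logging.
  gateway.searchParams.set('url', destination);
  return gateway.toString();
}
```

    Because the browser sends the gateway URL as the Referer, a downstream site only ever sees the parameters that made it through the allowlist.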

    Summary

    The way your website interprets traffic from one of its top providers will change later this week. You will need to adjust scripts and check for updates to analytics software where appropriate. If you notice a huge drop in measured search referrals from Google, don't panic. Just make sure you are measuring the correct actions.

  2. Sep03

    The story behind Google Chrome

    Ben Goodger and Google Chrome

    Google released its second web browser yesterday afternoon, adding headroom for web applications stretching the limits of what is possible within a web browser. The Google Chrome team assembled domain experts in various fields over the past six years, both through direct hires and acquisitions, to create a new browser and its critical components from scratch. Gmail and Google Maps pushed the Web to its limits, taking advantage of browser technologies invented in Redmond but left dormant for far too long. Contributing to Firefox's core, writing browser extensions, and championing HTML could only take the $150 billion company so far: they needed to own the full browser to push their Web efforts forward at full speed.

    1. Growing Frustrations
    2. Acquisition Boost
    3. A New Browser from Scratch
    4. Rev your JavaScript Engines
    5. Meet the Team
    6. Summary

    Growing Frustrations

    Brian Rakowski joined Google in July 2002 as the company's first associate product manager. His first assignment? Launch Gmail with features and responsiveness to rival desktop mail clients. Gmail tapped into relatively dormant browser features such as XMLHttpRequest, sockets, prefetch, and more to create a web application stretching the limits of what was possible inside web browsers of 2004. Today's Gmail continues to run into a browser's limits, setting minimum requirements of Internet Explorer 7+ and Firefox 2+. Google web apps teams such as Maps and Mail continually bump their heads against the latest capabilities of web browsers and in some cases invent their own runtimes.

    Ian Hickson first learned the inner workings of web browsers while an intern at Netscape. After working on Opera for a few years and creating tests for Firefox, Ian joined Google to continue his work on new browser features. HTML5 and browser compliance "acid" tests are significant attempts by Ian and others to redefine Web browsers through specs, tests, and implementations, but until now Google could only offer development help and browser extensions such as Gears to accelerate browser capabilities.

    Google extended what it could not immediately add to the browser core. Gears for new application functionality on multiple browsers. Browser Sync to synchronize browser settings and data across multiple computers. Safe Browsing to create more web trust. Teams from each of these extensions are now working on Google Chrome.

    Acquisition Boost

    Google released its first official Web browser on August 18, 2008 with the beta release of the Android mobile operating system. Google acquired Android in August 2005 to establish a foothold in the fastest growing computer (and Web) market: mobile handsets. Android highlights Google's web properties through its WebKit-based browser and dependent applications. Google acquired Ottawa-based Reqwireless and its mobile web browser in the summer of 2005 to team up with the Android team on its web interface. Web views are an integral part of Android and Google Chrome shares much of Android's code, including its graphics engine.

    Google Chrome and Android both take advantage of the Skia vector graphics library, developed by a small North Carolina company that Google acquired in 2005. The Skia team formerly worked on the graphics engine of Openwave's popular mobile browser. Google Chrome includes Skia graphics engine ports for Windows, Mac, and Linux.

    Google acquired application security company GreenBorder in May 2007. GreenBorder technology automatically sandboxes web code and network traffic by creating a bridge between applications. The GreenBorder technology isolated Internet Explorer or Firefox instances into a "sandbox" inside virtual machine instances. These sandboxes form the code isolation layers of Google Chrome, protecting other tabs and the parent operating system from the code executing on each web page.

    A New Browser from Scratch

    Ben Goodger, Google Chrome's tech lead, is best known for assembling the Firefox web browser out of Mozilla's SeaMonkey application suite. Manticore, Camino, and later Firefox were all attempts in 2001 to rethink the Web browser for the modern age. Browsing took center stage away from a communications suite, user interfaces reimagined for Web efficiency, and (some) legacy cruft tossed to the side. Google hired Ben in 2005 to strengthen its own browser contributions and eventually fully rearchitect a web browser for the modern Web.

    Google hired top Firefox developers in 2005 and 2006 such as Darin Fisher, Pam Greene, and Brian Ryner. In Spring 2006 the team began work on a new browser prototype built on top of WebKit designed for broadband-connected, always-on, web applications such as Gmail or Google Maps. Could the browser experts give web apps some breathing room?

    Modern computers feature multi-core multi-gigahertz CPUs, gigabytes of memory, megabits of bandwidth, and bulky hard drives. Our web browsers should separate browser tabs into their own processes, multi-thread all communications with the operating system, boost cache sizes, and not be afraid to command more bandwidth when available. Internet Explorer 8, Firefox 3.1, and Apple Safari are taking fresh approaches to web browsers for modern machines but Google Chrome has the advantage of a fresh start to achieve some features not currently possible in other browser architectures.

    Features such as tab isolation and task monitoring are difficult to add inside an existing browser architecture of shared run-times and window models (as John Resig mentioned). Internet Explorer 8's Loosely Coupled IE partially abstracts browser tab instances and the industry is generally headed in this direction.

    Web application-specific resource monitoring should motivate more websites to reduce their browser bloat now that the worst offenders can be identified. Individual users can also compare web application resource usage directly with their desktop counterparts.

    Rev your JavaScript Engines

    Lars Bak and his team in Århus, Denmark have spent many years writing virtual machines: the run-times that translate programming code into machine code. Lars wrote Sun's Java VM, HotSpot, and later slimmed down the VM for J2ME (CLDC HI project Monty). A few years ago Lars and his team in Denmark began work on a new interpreted JavaScript engine optimized for x86 and ARM architectures.

    The V8 engine is specifically tuned for recursive JavaScript tasks, optimizing commonly used components of your application. V8 is multi-threaded, opening up new parallel processing on multiple computing cores. V8 guesses how you might use your JavaScript code, and backtracks over any faulty assumptions. It's just one of the new engines we'll see inside our web browsers by the end of 2008.
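
    The guess-and-backtrack idea can be illustrated in plain JavaScript with a guarded fast path that falls back to a generic path when its assumption fails. This is only an analogy under stated assumptions, not a description of V8's internals; makeSpeculativeAdder and its counters are hypothetical names.

```javascript
// Illustration only: this mimics the guess-and-fall-back idea in plain
// JavaScript; it is not how V8 is implemented internally.
function makeSpeculativeAdder() {
  const stats = { fastHits: 0, fallbacks: 0 };

  function genericAdd(a, b) {
    // Slow path: handles every type according to the language's + rules.
    return a + b;
  }

  function add(a, b) {
    // Guard: check the assumption that both operands are numbers.
    if (typeof a === 'number' && typeof b === 'number') {
      stats.fastHits += 1;
      return a + b; // the case an engine could specialize for
    }
    // Assumption failed: back out to the generic path.
    stats.fallbacks += 1;
    return genericAdd(a, b);
  }

  return { add, stats };
}
```

    The counters make the trade-off visible: code that keeps its types stable stays on the fast path, while code that mixes types pays for the fallback.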

    Google Chrome could have used the same JavaScript interpreter as its WebKit rendering engine (JavaScriptCore, SquirrelFish) but the team had an opportunity, and the funding, to rewrite an interpreter from scratch for desktop and mobile runtimes.

    The V8 engine enables new feature sets for Google's web applications such as Gmail and Google Maps. Web application developers avoid adding features that visibly slow down browsers or cause processing pauses in your application experience. New speed in new areas adds functionality to existing apps. Google programmers should create more efficient code, tested against multiple interpreters, and optimized for modern computers as a result of V8. Even if Google Chrome gains no significant browser market share I still expect it will be the best single-site browser for Google web applications.

    Google Chrome adds additional JavaScript functionality through Gears. Gears is bundled with every Chrome install, adding new features to the web browser faster than previous plugins. The Gears libraries include support for new local cache structures, local databases, location data, background tasks, and file handling. Chrome boosts the available Gears footprint for web developers, including Google's own apps such as Google Reader and Google Docs (and my blog). The current Gears code included in Chrome replicates V8 and sqlite code already present in the browser, a bolt-on that will hopefully be integrated in the near future.

    Chrome, V8, and Gears will be a new testing ground for Google's HTML5 efforts, winning a new seat at the table as an implementor with upstream standards groups such as W3C.

    Meet the Team

    Google Chrome team leads

    I am tracking at least 20 people involved in the Google Chrome project across Google. I'm sure Chromium commit logs will reveal even more (update: more complete list here), but below is a quick summary of Chrome staff.

    Brian Rakowski, Lead Product Manager
    Brian was Google's first associate product manager in 2002, assigned to Gmail. He later worked on the Google Browser Sync Firefox plugin.
    Ben Goodger, Software Engineer
    Ben is the former Firefox 1.0 project lead. He also authored the Firefox extensions system. He joined Google in 2005.
    Mike Pinkerton, Technical Lead
    Mike is one of the Google team members responsible for bringing Chrome to the Mac. Mike worked at Netscape and later on the Gecko-powered AOL client before co-founding the Camino project. Mike joined Google in September 2005 and continues to lead Camino development.
    Darin Fisher, Software Engineer
    Darin was a frequent contributor to the Firefox codebase. He specialized in network libraries, cookies and permissions, and the Netscape Portable Runtime. Darin joined Google in 2005.
    Lars Bak, Software Engineer, V8
    Lars was the core developer on Java HotSpot VM and Monty VM in J2ME for Sun. He co-founded object-oriented VM companies for embedded devices before joining Google. Lars worked on V8 from a farm in Århus, Denmark before moving the team to university offices.
    Kasper Lund, Software Engineer, V8
    Kasper shares a long history with Lars Bak working on virtual machines.
    Brian Ryner, Software Engineer
    Brian is a former contributor to Firefox, where he added mousewheel support, tweaked the Gecko rendering engine core, and worked on password management and Linux installers.
    Pam Greene, Software Engineer
    Pam is a long time Firefox contributor. She added OpenSearch to the browser and contributed to full-text search in Places/AwesomeBar.
    Ian Fette, Product Manager
    Ian is a former Firefox contributor who worked on anti-phishing, anti-malware, spelling correction, and the Safe Browsing API.
    Arnaud Weber, Software Engineer
    Arnaud was a Director of Research and Development at Netscape and Borland before joining Google to work on a "secret project" in September 2006.
    Brett Wilson, Software Engineer
    Brett formerly worked on the Google Toolbar. He contributed to Firefox history and bookmarks functionality.
    Mike Belshe, Software Engineer
    Mike helped write an Outlook add-on called Chrome for Lookout Software before the company was acquired by Microsoft. Mike also formerly worked at Netscape and Good Technology.
    Huan Ren, Software Engineer
    Huan works on network flow control, negotiating browser interactions with network resources. Huan formerly worked at Microsoft.
    Erik Kay, Software Engineer
    Erik formerly worked on the AvantGo browser and on Qurb anti-spam software for Outlook and Outlook Express.
    Glen Murphy, Software Engineer
    Glen specializes in user interface design. He previously worked on Firefox extensions, Google Browser Sync, and Google Blog Search.
    Evan Martin, Software Engineer
    Evan writes automated testing tools for Chrome and the Web.
    John Abd-El-Malek, Software Engineer
    John is part of the Windows specialist team at Google bringing Google Desktop, Google Talk, and Breakpad onto Windows XP and Windows Vista.
    Amanda Walker, Software Engineer
    Amanda is one of the people responsible for Chrome's upcoming Mac version.
    Mark Mentovai, Software Engineer
    Mark was heavily involved in moving Firefox for Mac to its current Intel-based architecture. He has worked on the Breakpad project and many levels of Chrome's code.
    Carlos Pizano, Software Engineer
    Carlos formerly worked on GreenBorder and continues to work on Chrome sandboxing.
    Mark Larson, Program Manager
    Mark is also formerly of GreenBorder and its sandboxing specialties.
    Aaron Boodman, Software Engineer, Gears
    Aaron improves user experience with JavaScript. He's best known for his work on Gmail, Greasemonkey, and Gears.

    Summary

    Google's business depends on the speed and availability of Web access to search, advertising, and applications. Chrome is Google's second attempt to better control the front door to its content with full applications optimized for its heavy apps. Google Chrome builds on top of the work of Android by adding individual applications to already popular operating systems. Google has flirted with the idea of its own web browser for many years but has only recently released working implementations of its own full browser applications.

    Android, Chrome, and Gears will continue to grow in unison and extend individual pieces into established operating systems. Google is building a new suite of application abstraction layers that should have strong leverage across Windows, Mac, and Linux to directly control the company's destiny on these platforms.

    It's an exciting time for new browser technologies as Internet Explorer, Firefox, and WebKit each compete over standards implementations and performance. Officially adding Google Chrome to the browser space strengthens Google's position in shaping the future web and delivers strong single-site browsing experiences for its core web applications.

  3. Jul23

    Writing Flash for search engines

    On June 30 Google and Adobe announced a new indexer optimized for Flash (SWF) files discovered by Google's web crawlers. The new partnership takes advantage of a server-side Flash player optimized for a search engine indexing environment and unidirectional text (e.g. no Hebrew or Arabic). Search engines previously discovered the location of a SWF file on the Web and perhaps indexed its metadata but did not take a deep look inside its binary content. Last month's announcement was a big change for both Adobe and major search engines as it is now possible to run a very GUI-based Flash file at the command line and interpret both its text content and interaction opportunities. In this post I will walk through what we currently know about the search engine Flash runtime and how it affects search engine optimization in Flash.

    Build for a blind, deaf user

    Search engine indexers are blind and deaf. They open a file, examine its contents, and try to deduce meaning through your page structure and its content. A web page designed for screen readers will also expose more content to search engines not evaluating your page's full render state of content, layout, and interactions.

    Search engines utilizing Flash player indexing are still restricted to this screen reader approach. Accessible Flash applications complete with names, labels, reading order and XMP should continue to be more search engine friendly than other SWF files on the Web. Google's tips for creating accessible, crawlable sites still apply, but in a new Flash context.

    The server-side Flash Player

    If we want to understand how search engines such as Google might interpret Flash content we'll first need to take a look at the Flash Player itself. Adobe provides few details in its official SWF searchability FAQ but we can infer a few implementation details. How would you rewrite Flash Player for server-side indexing of SWF content?

    The search engine Flash Player is likely a scaled-down, secure version optimized for machine readers. Strip out video, audio, fonts, and file system access. The server-side Flash player should open a binary SWF file, pull out the functionality it understands, and create a data tree of all possible actions. These features are actually quite similar to a screen reader interface, but Adobe is instead targeting a Linux-based headless runtime. I believe the guts of the Flash Player for servers are built using the same accessibility abstraction layer Adobe currently uses for Windows, which could extend to platform-level bindings on Mac and Linux desktops.

    The Adobe Flash Player creates a list of objects on the screen at each render and records this list into an accessible data tree (according to a 2005 white paper by Bob Regan). This data tree is updated with each change in the application state, allowing any application listening in to update an object model of clickable buttons, labels, and links.

    Adobe interfaces with OS-level accessibility frameworks on Windows currently and could extend this model to every major desktop platform. The Windows version of Flash Player binds to Microsoft Active Accessibility. Mac versions of Flash could bind to Universal Access. On GNOME the player could bind to the Assistive Technology Service Provider Interface (at-spi). A server-side version of Flash likely builds upon this same abstracted accessibility object model, passing screen objects to the search engine indexer for further interpretation or interaction.
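
    If an indexer does receive such an accessibility tree, extracting its text could look like the walk below. This is a sketch under assumptions: the node shape ({ name, role, order, children }) and the function name collectIndexableText are invented for illustration and do not reflect Adobe's actual object model.

```javascript
// Hypothetical node shape: { name, role, order, children }. These field
// names are assumptions for illustration, not Adobe's object model.
function readingOrder(node) {
  // Nodes without an explicit order sort after those that have one.
  return typeof node.order === 'number' ? node.order : Number.MAX_SAFE_INTEGER;
}

function collectIndexableText(node, out = []) {
  // Record any node that exposes an accessible name.
  if (node.name) out.push({ role: node.role || 'text', name: node.name });
  // Visit children depth-first in their declared reading order.
  const children = (node.children || [])
    .slice()
    .sort((a, b) => readingOrder(a) - readingOrder(b));
  for (const child of children) collectIndexableText(child, out);
  return out;
}
```

    Objects without a name never reach the output, which is one way to picture why unlabeled buttons and graphics would contribute nothing to the index.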

    Windows Live Search was noticeably missing from the server-side Flash player announcement for search engines. It's possible Adobe has developed a server-side Flash Player for Linux that is not yet compatible with the Windows Server environment of Microsoft's Windows Live Search.

    Accessing deep content

    Googlebot can fill out forms, click buttons, and navigate deep within your site. Clickable Flash objects will likely behave the same way, exposing new content paths for Googlebot within your larger SWF. Flash websites can help ensure deep indexing of SWF content by adding individual SWF fragments to their sitemap. Reading order will likely play a role in selecting important content on your page, and I expect Googlebot may follow the first item in your reading order sooner than the last.

    Googlebot still throws out references to an anchor name fragment in the URL (e.g. #section=menu) and this announcement does not change the general behavior of Google's URL storage and analysis.
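
    The effect of discarding fragments can be sketched with the standard URL class: two deep links into the same SWF collapse into a single key. The function name crawlKey and the example.com addresses are assumptions for illustration, and this says nothing about how Google actually stores URLs.

```javascript
// Sketch: collapse URLs that differ only by fragment into one crawl key.
function crawlKey(rawUrl) {
  const url = new URL(rawUrl);
  url.hash = ''; // drops "#section=menu" and similar fragments
  return url.toString();
}
```

    Both http://example.com/site.swf#section=menu and http://example.com/site.swf#section=contact map to http://example.com/site.swf, so state that lives only in the fragment is invisible as a separate page.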

    Do Flash versions matter?

    Emperor Tamarin monkey

    The official announcement from Google and Adobe makes it seem like all Flash is now universally indexed regardless of your Flash version but I think that's bogus. If a search engine wanted to index JavaScript they might run Rhino on the server and interpret results. If you wanted to build an advanced interpreter of Flash content you might use Tamarin or its derivatives, an AVM2 (Flash 9+) virtual machine. I believe AVM2-compatible SWF files will enjoy better search exposure than binaries built for the older AVM. I can't prove it; just a hunch.

    Dynamic object insertion

    Googlebot will detect common JavaScript libraries such as SWFObject used to dynamically insert Flash content at page load. Publishers can back up the dynamic insertion JavaScript with a noscript element just in case Google doesn't discover your dynamic insertion. Sticking with standard dynamic insertion libraries will help ensure your content is discovered through expected behaviors.

    Summary

    The new search version of Flash Player opens the binary SWF format to interpretation by text-focused search engines. Flash developers can take additional steps to package SWF content for accessibility and search discoverability: developing for modern virtual machines, adding accessibility hooks, and wrapping your SWF in XMP.

  4. Apr10

    Google App Engine for developers

    Google App Engine

    On Monday Google launched Google App Engine, a hosted dynamic runtime environment for Python web applications inside Google's geo-distributed architecture. Google App Engine is the latest in a series of Google-hosted application environments and the first publicly available dynamic runtime and storage environment based on large-scale proprietary computing systems.

    Google App Engine lets any Python developer execute CGI-driven Web applications, store their results, and serve static content from a fault-tolerant geo-distributed computing grid built exclusively for modern Web applications. I met with the App Engine team leads on Monday morning for an in-depth overview of the product, its features, and its limitations. Google has been working on Google App Engine since at least March 2006 and has only just begun revealing some of its features. In this post I will summarize Google App Engine from a developer's point of view, outline its major features, and examine pitfalls for developers and startups interested in deploying web applications on Google's servers.

    What is Google App Engine?

    Google App Engine is a proprietary virtualized computing suite covering the major common components of a modern web application: dynamic runtime, persistent storage, static file serving, user management, external web requests, e-mail communication, service monitoring, and log analysis. The Google App Engine product offers a single hosted production web server stack hosted on Google's custom-designed computers and datacenters distributed around the world.

    Google App Engine is a managed hosting environment with a tightly managed stack running in a machine-independent environment. It simplifies the deployment and management of your web application software stack while constraining you to a specific stack. When I start a new web development project today I have to first set up a tiered system to effectively handle site growth:

    1. Purchase dedicated servers or virtualized slices. Estimate necessary CPU, memory, disk space, etc. at each tier.
    2. Configure a web server for dynamic content. Install Python and its eggs, Apache HTTPd, and extra modules such as mod_wsgi. Configure and tweak each. Open appropriate ports. Listen.
    3. Set up a MySQL database server and choose the appropriate storage engine. Configure MySQL, add users, add permissions. Tweak and optimize.
    4. Add an in-memory caching layer for frequently accessed dynamic content.
    5. Monitor your uptime and resource utilization with Ganglia and/or other tools on each machine.
    6. Serve static files such as JavaScript, CSS, and images from a specialized serving environment such as Amazon's Simple Storage Service.
    7. Turn your static server into an origin server for a CDN with points of presence close to your website's users.
    8. Connect each piece of the stack, keep its software updated to avoid security vulnerabilities, and hopefully respond to all website requests in less than a second.
    9. Dedicate work hours and expertise to all the above. Hire outside assistance if needed.
    10. Don't go broke trying.

    Your tiers will expand as your new web application gains popularity. Your single-server tiers become load-balanced services, message bus broadcasts and listeners, and distributed cache arrays at scale. You'll probably spend time rearchitecting your application at each stage of growth to account for these new resource demands, if you can afford the time, expertise, and effort.

    Google App Engine is a new and interesting solution for Python developers interested in adding features, not servers. Google spends hundreds of millions of dollars developing its custom infrastructure with 12-volt power supplies tapped into a hydro-electric dam next door and fat fiber pipes owned by local governments carrying requests and responses to their proper home. Google's physical infrastructure is a vast array of highly optimized web machines, and we'll now be able to see how such infrastructure performs across more generic applications on App Engine.

    Freemium hosting model

    Google App Engine is a "freemium" business model offering basic features for free with paid upsells available for application developers exceeding approximately 5 million pageviews a month. This resource quota approximately matches the Google Analytics 5 million pageview limit. Google Analytics customers may currently exceed this limit if they maintain an active AdWords account with a daily advertising budget of $1 or more. The Google App Engine team plans to introduce pricing and service level agreements for additional resources, priced in a pay-as-you-go marginal resource structure, once the product leaves its limited 10,000-person preview period later this year.

    Quota Type               Limit / day
    HTTP requests            650,000
    Bandwidth In             9.77 GB
    Bandwidth Out            9.77 GB
    CPU megacycles           200 million
    E-mails                  2,000
    Datastore calls          2.5 million
    External URL requests    160,000

    Google publishes these quotas and provides administrative monitoring tools. The quotas are just a guideline as Google may cut off access to your application if you receive a traffic spike of an unspecified duration. The Google App Engine quota page specifies:

    If your application sustains very heavy traffic for too long, it is possible to see quota denials even though your 24-hour limit has not yet been reached.
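
    To see how the daily limits relate to the 5 million pageview figure, the arithmetic can be sketched as below. The object and function names are illustrative assumptions, only a subset of the quota table is included, and the per-pageview budget assumes a 30-day month with evenly spread traffic, which the spike cutoffs described above make optimistic.

```javascript
// A subset of the daily limits from the quota table above.
const DAILY_QUOTA = {
  httpRequests: 650000,
  cpuMegacycles: 200e6,
  emails: 2000,
  datastoreCalls: 2.5e6,
  urlFetches: 160000,
};

// Returns the fraction of each daily quota consumed so far.
function quotaUsage(usage) {
  const result = {};
  for (const [key, limit] of Object.entries(DAILY_QUOTA)) {
    result[key] = (usage[key] || 0) / limit;
  }
  return result;
}

// Rough budget: HTTP requests available per pageview, assuming traffic is
// spread evenly over the month (an optimistic assumption).
function requestsPerPageview(monthlyPageviews, days = 30) {
  return DAILY_QUOTA.httpRequests / (monthlyPageviews / days);
}
```

    At 5 million pageviews a month the budget works out to roughly 3.9 requests per pageview, so a page that pulls several dynamic resources from the application would exhaust the request quota well before reaching that traffic level.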

    Google App Engine over quota

    Google App Engine already failed the TechCrunch effect, and it appears the platform is currently unable to handle referral traffic loads from a popular blog or news site typically associated with a product launch. The traffic spike cutoffs make me think twice about hosting anything of value on App Engine.

    The team

    The Google team behind App Engine has a long history in developer services. Team members include some of the top Python experts in the world, financial transaction specialists, and developer tool builders.

    • Python creator Guido van Rossum wrote the App Engine SDK and ported the Python runtime and Django framework for the new environment. Google App Engine is Guido's first full-time project at Google after his Noogler project Mondrian.
    • Technical lead Kevin Gibbs previously worked on the SashXB Linux development toolset and multiple RPC projects at IBM before he created Google Suggest in 2004.
    • Developer Ryan Barrett wrote the BigTable datastore implementation and related APIs. Previously Ryan was tech lead on Moneta, Google's transaction processing platform and customer data store.
    • Product lead Paul McDonald has worked on Google Checkout, AdWords, and a Web-based IDE named Mashup Editor (all strong candidates for App Engine inclusion).
    • Product manager Peter Koomen has previously authored papers on natural language search and semantic analysis.

    The list above is just a sampling of the full team behind App Engine.

    Feature limitations

    Google App Engine is not without its faults. Applications cannot currently expand beyond the quota's ceiling. It's still unclear how an application will dynamically scale on App Engine once it leaves the farm leagues, and at what cost.

    A few major issues include:

    1. Static files are limited to 1 MB. App Engine does not support partial content requests (Accept-Ranges).
    2. Cron jobs and other long-life processes are not permitted.
    3. Applications are not uniquely identifiable by IP address, leading to a lack of identification for external communications. Applications may suffer from bad neighbor penalties from API providers upset at another app on the service.
    4. No SSL support. No IP address complicates signing, but port 443 is open for requests. You can rely on Google services (and branding) for trusted login and possibly future payments.
    5. No image processing. Python Imaging Library relies on C, and is therefore not a possible App Engine module.
    6. Google user accounts. Site visitors are very aware of your choice in web hosts each time they attempt to log on to your application. I feel like this flow makes your application seem less professional, but may be a reasonable trade-off. Google will store your user data and potentially mine its data for better ad targeting.

    Summary

    Overall I am quite impressed with Google App Engine and its potential to remove operations management and systems administration from my task list. I am not confident in Google App Engine as a hosting solution for any real business while the host is in preview stage but those concerns may be alleviated once the product is ready for real customers and real service-level agreements.

    Python developers have just been granted a few superpowers for future projects. As an existing Python and Django developer I know how difficult it can be to find a managed hosting provider with modern Python support. Many hosts are years behind, running Python 2.3. I am excited App Engine already features the programming tools I use every day, with a few modifications for their proprietary systems. App Engine should introduce more developers to Python and the Django framework and hopefully cause other web hosts to provide better Python support as well.

  5. Jan17

    Upgrade your Google Analytics tracker

    Google released a new version of its Google Analytics tracking code in December after a two-month limited beta. The new Google Analytics tracker is a complete rewrite of JavaScript inherited from the Urchin acquisition in 2005 and the first time the two products have been officially decoupled. The existing version of Google Analytics tracker, urchin.js, has been deprecated but should continue to function until the end of 2008. Google will only roll out new features on the new ga.js tracker. If you currently track website statistics using Google Analytics you should upgrade your templates to take advantage of the new libraries.

    What changed?

    The new Google Analytics tracker supports proper JavaScript namespacing and more intuitive configuration methods (e.g. _setDomainName instead of _udn). My tests show about a 100 ms faster execution even with a 24% increase (1514 bytes) in file size (ga.js is also minified).

    The new tracking code makes advanced features a lot more accessible. You can now track a page on multiple Google Analytics accounts, which should help user-generated content sites integrate their authors' Google Analytics IDs alongside the company's own tracking account. The new event tracker lets you group a set of related on-page actions such as clicking a drop-down menu or typing a search query (very useful for widgets). Ecommerce tracking is now a lot more readable. You can read about all the tracker changes in the Google Analytics migration guide PDF.

    Implementation

    Switching your site tracker is pretty simple. Trackers are now created as objects and configured before the page is tracked.

    <script type="text/javascript" src="http://www.google-analytics.com/ga.js"></script>
    <script type="text/javascript">
    var pageTracker=_gat._getTracker('UA-XXXXXX-X');
    pageTracker._initData();
    pageTracker._trackPageview();
    </script>
    

    That's it. You are now running the new Google Analytics tracker. You'll need to swap in your Analytics account and profile IDs, which should be pretty easy to spot in your existing code.
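
    Tracking the same page on more than one account, as mentioned earlier, could look something like the sketch below. The trackOnAccounts helper and both account IDs are placeholders of my own, not part of ga.js, and the guard around _gat is only there so the snippet does nothing outside a page that has loaded the tracker.

```javascript
// Hypothetical helper: create one ga.js tracker per account ID and
// record the same pageview on each. Returns the trackers it created.
function trackOnAccounts(gat, accountIds) {
  var trackers = [];
  for (var i = 0; i < accountIds.length; i++) {
    var tracker = gat._getTracker(accountIds[i]);
    tracker._initData();
    tracker._trackPageview();
    trackers.push(tracker);
  }
  return trackers;
}

// In a page that has loaded ga.js, _gat is defined globally.
// 'UA-XXXXXX-X' and 'UA-YYYYYY-Y' are placeholder IDs.
if (typeof _gat !== 'undefined') {
  trackOnAccounts(_gat, ['UA-XXXXXX-X', 'UA-YYYYYY-Y']);
}
```

    A user-generated content site might pass its own company ID first and the page author's ID second.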

    Summary

    Google Analytics tracking code is completely rewritten for faster on-page behavior that plays well with others. The old tracker is deprecated and should only continue to function until the end of 2008, and new features are only available to users running the new code. Existing Google Analytics users should swap out their tracking code to take full advantage of this free stats tool.

  6. Jan08

    Google processes over 20 petabytes of data per day

    Google currently processes over 20 petabytes of data per day through an average of 100,000 MapReduce jobs spread across its massive computing clusters. The average MapReduce job ran across approximately 400 machines in September 2007, crunching approximately 11,000 machine years in a single month. These are just some of the facts about the search giant's computational processing infrastructure revealed in an ACM paper by Google Fellows Jeffrey Dean and Sanjay Ghemawat.

    Twenty petabytes (20,000 terabytes) per day is a tremendous amount of data processing and a key contributor to Google's continued market dominance. Competing search storage and processing systems at Microsoft (Dryad) and Yahoo! (Hadoop) are still playing catch-up to Google's suite of GFS, MapReduce, and BigTable.

    MapReduce statistics for different months
                                    Aug. 2004  Mar. 2006  Sep. 2007
    Number of jobs (1000s)               29        171      2,217
    Avg. completion time (secs)         634        874        395
    Machine years used                  217      2,002     11,081
    map input data (TB)               3,288     52,254    403,152
    map output data (TB)                758      6,743     34,774
    reduce output data (TB)             193      2,970     14,018
    Avg. machines per job               157        268        394
    Unique implementations
      map                               395      1,958      4,083
      reduce                            269      1,208      2,418

    Google processes its data on a standard machine cluster node consisting of two 2 GHz Intel Xeon processors with Hyper-Threading enabled, 4 GB of memory, two 160 GB IDE hard drives, and a gigabit Ethernet link. This type of machine costs approximately $2,400 through providers such as Penguin Computing or Dell, or approximately $900 a month through a managed hosting provider such as Verio (for startup comparisons).

    The average MapReduce job runs across a $1 million hardware cluster, not including bandwidth fees, datacenter costs, or staffing.
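
    The $1 million figure is simple multiplication. Here is a quick check using the September 2007 average of 394 machines per job and the roughly $2,400 purchase price per machine (the variable names are mine):

```javascript
// Figures taken from the statistics table and hardware estimate above.
var avgMachinesPerJob = 394;   // Sep. 2007 average machines per job
var costPerMachineUsd = 2400;  // approximate purchase price per node

// Hardware cost of the machines touched by an average job.
var clusterCostUsd = avgMachinesPerJob * costPerMachineUsd;

console.log(clusterCostUsd); // 945600, or roughly $1 million
```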

    Summary

    The January 2008 MapReduce paper provides new insights into the hardware and software Google uses to process over 20 petabytes of data per day. Google converted its search indexing systems to MapReduce in 2003, and currently processes over 20 terabytes of raw web data. It's fascinating large-scale processing data that makes your head spin and makes you appreciate the years of distributed computing fine-tuning applied to today's large problems.

  7. Oct06

    Google releases stand-alone desktop widget engine

    iGoogle Desktop widgets

    Google Desktop widgets can now be embedded in your iGoogle personal start page. This new feature brings OS-level information such as CPU utilization, currently playing tracks in iTunes, or a battery indicator into a Web interface. Google Desktop 5.5 is now available in a widget-only version for Windows 2000, XP, and Vista to bridge the desktop and Web worlds.

    The decoupling of Google Desktop Sidebar puts its desktop widget platform in direct competition with Windows Vista Sidebar and Konfabulator. Google can use its widget platform as a beachhead onto the desktop and later encourage its users to enable more Google Desktop features such as search and personalization.

    (Disclosure: Google is a sponsor of my upcoming widget conference, Widget Summit.)

  8. Sep18

    Google introduces Gadget Ads

    Google officially launched Google Gadgets as an ad unit tonight after about three months of pilot testing. Google's AdWords platform now supports Google Gadget content in addition to existing text, image, and video offerings. The gadget ads feature an entirely new widget analytics platform for tracking gadget success and interaction, an open caching proxy hosted by Google's geo-distributed servers, and the introduction of YouTube as a video hosting and transcoding platform free from any Google branding. I previously covered Google's upcoming advertising widgets in early May.

    Advertisers can create Google Gadget content in any size supported by AdWords images. In the example shown above Intel combined a Flash game with tabs displaying images and text with more information on the Intel Centrino Duo mobile processor. Each gadget interaction is recorded according to a set list of actions such as mouse over, tab views, entering a ZIP code, subscribing to a web feed, or initiating audio or visual playback. External links such as a visit to an external website pass through Google trackers for CPC billing.

    Mixed Media

    Starbucks meeting planner Google Gadget

    Gadget ads provide new mixed media interactions across Google's AdSense network. A Starbucks ad unit could display a web feed of the latest 5 tracks playing in its stores, query the local weather and suggest either an iced or hot drink, display local stores on a Google Map, and help you browse seasonal offerings from within a single ad unit. Google serves all of the content via proxy, and the rich media load never touches Starbucks' servers.

    Gadget ads also integrate with DoubleClick's DART for tracking as part of a larger portfolio. Google is currently limiting the number of publishers with access to widget advertising due to its more technical nature but existing Flash advertisers may already have the option exposed in AdWords.

    YouTube ad hosting

    Google is promoting YouTube as a video hosting and transcoding destination for advertisers. The Google Gadget Ads tutorial page includes detailed instructions for separating hosted Flash video content from an advertiser's video playback tools. This tutorial is the first time I have seen Google promote the use of YouTube in an ad without Google branding.

    Open caching proxy

    Google will cache almost any content passed to its gadget caching proxy including images, CSS, and JavaScript.

    http://gmodules.com/ig/proxy?url= + your URL

    Google delivers any file on your behalf from its thousands of servers distributed around the globe. It's like your own free CDN for your websites, although primarily designed for gadget content. I can cache my site's CSS through Google for example.
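
    Building a proxied address is just string concatenation. The sketch below assumes the target URL should be percent-encoded before it is appended, which is my assumption and not something the gadget ads documentation spelled out for me. The proxiedUrl helper and the example stylesheet address are hypothetical.

```javascript
// Proxy prefix shown above; everything after "url=" is the resource to cache.
var GADGET_PROXY = 'http://gmodules.com/ig/proxy?url=';

// Hypothetical helper: returns the proxy address for a resource.
// Assumption: the target is percent-encoded so its own query string
// does not get mixed up with the proxy's parameters.
function proxiedUrl(resourceUrl) {
  return GADGET_PROXY + encodeURIComponent(resourceUrl);
}

console.log(proxiedUrl('http://www.example.com/css/site.css'));
// http://gmodules.com/ig/proxy?url=http%3A%2F%2Fwww.example.com%2Fcss%2Fsite.css
```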

    Summary

    Online advertising is big business and the primary monetization engine of new web startups. Google's expansion of its dominant AdWords product into the widget space should extend the demand for quality gadget developers and designers, and bring even more attention to the space. Each advertisement is also listed in the Google Gadgets branded content directory, which may cause some product fans to integrate branded interactions for free on their blogs or personal homepage.

    Google is currently promoting gadget developers and companies experienced in Google Gadget development and design. It seems like a really good way to get exposure and potential contracts from big clients such as Honda or Coca-Cola. Designers and widget programmers may want to go get listed and take advantage of some new revenue opportunities.

    It's always exciting to see new advertising options emerge that may have richer interaction experiences and therefore drive a higher CPM. I added AdSense to my blog entry pages a few months ago hoping I might catch a new gadget ad in action -- it's so far not creating much revenue -- and I now expect even more regular Google Gadget content matched with my pages.

    Widget advertising is one of the emerging widget topics we will cover at this year's Widget Summit event October 15-16 in San Francisco.

  9. Jul17

    Scaling Google Gadget content

    Google spends hundreds of millions of dollars building efficient web serving infrastructure spread throughout the globe. Widget developers can take advantage of this optimized infrastructure to serve widgets quickly to one or a million users. In my last post I discussed best practices for scaling web widget content based on existing browser and Internet technologies. In this post I'll outline different ways you can take advantage of Google's architecture to serve your widget content to the world. I'll cover various depths of integration and their branding and licensing issues.

    1. Google widget hosting
    2. Caching remote content
    3. Summary

    Hosting your widget content on Google

    Google not only serves widget content in its personalized homepage and desktop products, it's also a web host.

    Google Gadgets Editor

    The Google Gadgets Editor helps developers build and test new gadgets inside their browser window. Once you are happy with the test results you can save and publish your gadget manifest file directly onto the "google.com" domain.

    The Gadgets Editor only saves XML files. Developers with external resources such as images, Flash, or movies, will need to serve those files using a different hosting option.

    Google Page Creator

    Anyone with a Google account can create new web pages and upload files to their Google Page Creator account. Standard Google accounts can choose their own subdomain at "googlepages.com" and Google Apps customers can add pages and files under their own custom domain.

    Add a new site subdomain such as "widgets.example.com" or sign up for an entirely new domain such as "examplewidgets.com" on the Google Apps website. All of your new content uploaded to Google Pages will be served from Google's infrastructure under the name you choose.

    Google Code Project Hosting

    Google hosts projects licensed under an Apache, Artistic, BSD, GPL, LGPL, MIT, or Mozilla open-source license. Google Code Project Hosting manages project collaborations on wikis, bug lists, and version control systems to help your users report issues with their widget or contribute new patches.

    If you're willing to open up your widget licensing and receive built-in community features Google Code Project Hosting may be the right widget hosting platform for you.

    Caching remote content

    Google caches your widget manifest file for quick response times and can cache even more files with the proper JavaScript calls. The core Google Gadgets library can cache your data sources, images, and any other form of remote content you choose.

    Remote data

    External data requests have an optional parameter, refreshInterval, that can override Google's default caching behaviors for all supported widget update data formats. The value is expressed in seconds. You might lower your refresh interval to 15 to pull updates from a frequently changing data set, or extend it to 3600 (one hour) or longer for content that only updates about once per week.
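
    As a rough sketch, here is how I would pass a custom refresh interval to the gadget fetch call. I am assuming the legacy _IG_FetchContent(url, callback, options) form and an interval given in seconds; the fetchWithRefresh wrapper and the example URL are hypothetical.

```javascript
// Hypothetical wrapper: fetch remote content with a custom cache lifetime.
// Assumption: refreshInterval is expressed in seconds.
function fetchWithRefresh(fetchContent, url, seconds, callback) {
  fetchContent(url, callback, { refreshInterval: seconds });
}

// _IG_FetchContent only exists inside a Google Gadget container,
// so this call is skipped everywhere else.
if (typeof _IG_FetchContent === 'function') {
  // Frequently changing data: ask for a fresh copy every 15 seconds.
  fetchWithRefresh(_IG_FetchContent, 'http://www.example.com/scores.txt', 15,
    function (text) { /* update the gadget with the new text */ });
}
```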

    Remote images

    Google Gadgets has special handlers for caching and retrieving image files. The _IG_GetImage method constructs an img element for any URL for quick insertion into any DOM. The _IG_GetImageUrl replaces your typical src attribute with the appropriate location of a cached image on Google's servers.

    Cache anything

    Google will cache any URL specified in your code using the _IG_GetCachedUrl method. If you use external CSS, JavaScript, or Flash files you should pass each URL through the caching method to save your server some strain.
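
    One way to apply this across a gadget is to run every external asset URL through the caching method before you use it. In the sketch below, cacheAll and the example asset URLs are hypothetical, and the code falls back to the original URLs when it runs outside a gadget container where _IG_GetCachedUrl is not defined.

```javascript
// Hypothetical helper: rewrite a list of asset URLs through a caching
// function, keeping the original URL when no caching function is given.
function cacheAll(urls, getCachedUrl) {
  var out = [];
  for (var i = 0; i < urls.length; i++) {
    out.push(getCachedUrl ? getCachedUrl(urls[i]) : urls[i]);
  }
  return out;
}

var assets = [
  'http://www.example.com/widget.css',
  'http://www.example.com/widget.js'
];

// _IG_GetCachedUrl is only defined inside a Google Gadget container.
var cacheFn = (typeof _IG_GetCachedUrl === 'function') ? _IG_GetCachedUrl : null;
var cachedAssets = cacheAll(assets, cacheFn);
```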

    Track statistics

    Cached content does not hit your server log files and therefore will not reflect the true use of your widget. Modern analytics tools have adjusted to the new forms of pageviews presented by heavy JavaScript and Flash utilization, and your own measurement software should be able to plug in to this new reporting style.

    Google Analytics customers can track widget pageviews using their existing software and accounts. You can create a new Analytics profile for each new widget or integrate reporting with your existing website; it's up to you.

    You need to require the Google Analytics library in your gadget manifest to load the appropriate libraries. You can then log new page requests using the _IG_Analytics method, a passthrough for the urchinTracker method found in the traditional Google Analytics and Urchin code. You can assign unique page names to each gadget action to track popular activities and use cases.
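
    A minimal sketch of per-action tracking follows. I am assuming _IG_Analytics takes an account ID followed by a page path, and the /widgets/ path scheme, the helper names, and the placeholder account ID are my own inventions. The gadget manifest still needs to require the analytics library for the call to exist.

```javascript
// Hypothetical helper: build a virtual page path for a gadget action,
// for example "/widgets/weather/refresh".
function actionPath(gadgetName, action) {
  return '/widgets/' + gadgetName + '/' + action;
}

// Hypothetical wrapper: log a gadget action as a pageview when running
// inside a gadget container; outside one it only returns the path.
function trackAction(accountId, gadgetName, action) {
  var path = actionPath(gadgetName, action);
  if (typeof _IG_Analytics === 'function') {
    _IG_Analytics(accountId, path);
  }
  return path;
}

// 'UA-XXXXXX-X' is a placeholder account ID.
trackAction('UA-XXXXXX-X', 'weather', 'refresh');
```

    Giving each action its own path lets the popular activities show up as separate pages in your content reports.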

    Summary

    Popular widget content has the potential to melt your servers but the right code and planning can offload a lot of that burden onto Google's server farms. Your content will appear on each widget user's screen faster and with higher availability than you might be able to offer on your own servers. External free hosting and caching also helps you experiment without the fear you might crash your existing boxes.

  10. Jun27

    Google offers seed funding for widget startups

    Google is directly investing in small companies to expand the popularity of its iGoogle product. Google Gadget Ventures grants popular Google gadgets $5,000 for further development. Popular widget businesses are eligible for $100,000 in seed capital with Google taking an equity stake in each company. Google expects to invest $700,000 or more in third-party widget development over the next year.

    About 9% of Google's gadget directory would be eligible for the grant consideration based on the Google Gadget pageview analysis I last conducted in April.

    How it works

    A gadget must have at least 250,000 gadget views per week to qualify for free Google money. If you develop a Sudoku puzzle from Ireland, Super Monkey Poop Fight, or a collection of daily comics you might be eligible for Google grant money. A selection committee within Google reviews one-page summary e-mails for eligibility and grants money around the world (presumably to anywhere they already have financial means to do so, similar to the Google Summer of Code restrictions).

    Successful grant recipients located in the United States are eligible for a $100,000 seed investment for further gadget development. Google invests an initial seed amount and provides incremental slices (tranches) of the $100,000 total investment as additional usage targets are met.

    Why it's a big deal

    Google now has a corporate investment arm focused on third-party developers. Many of the popular widgets on iGoogle are developed by a single person in their spare time who might achieve surprising success. Success on a platform such as Google Gadgets or Facebook might not have immediate financial impact, but developers now have an additional source of income to build a small business around their JavaScript and Flash skills applied to a small web widget. A developer receiving funding might be able to hire a designer, buy better server hardware, or finance more hours developing fun tools for the Google Gadget platforms.

    If surprise success stories such as Desktop Tower Defense are any indication, viral gadget content could receive a new business viability onramp with the supply of these new funds. Google Gadget Ventures could also boost Google's general developer program and encourage the use of additional Google developer products and services within the gadget window. The introductory Google Gadget Ventures blog post on the Google Code blog already hints at some of the other Google development tools available to gadget developers such as GData or Google Gears.

    In theory a popular web feed could receive a $5,000 grant from Google to develop a Google Gadget version of their news or information. A feed such as Engadget (4,386,688 weekly gadget views) or Daily Kos (226,863 gadget views) might receive $5,000 to create a nice-looking gadget with advanced functionality. Google just provided a widget budget to companies and services who might be on the fence about widget development.

Niall Kennedy is a web technologist in San Francisco, California in the United States. I am very interested in the world of...
