Recently in Python Category

Python programming language including tools, frameworks, and success stories.

  1. Feb13

    Google App Engine 1.1.9 boosts capacity and compatibility

    Google released version 1.1.9 of its hosted App Engine platform earlier this week with big boosts in capacity and compatibility. The new release supports standard HTTP libraries and larger files, triples the response deadline, and removes limitations on CPU-intensive processes.

    Standard HTTP libraries

    App Engine now supports Python's standard web requesters urllib, urllib2, and httplib. Programmers on App Engine were previously required to use Google's proprietary urlfetch API, which still provides the best integration for the Google request gateway. Support for standard Python request libraries means better compatibility with open source libraries developers would like to include in their web application.

    30 seconds or die

    App Engine scripts now have up to 30 seconds to respond to any incoming request, a big change from the previous 10 second limit. I often request data from the web, process and interpret results, write to the datastore, and then issue a response to the consuming agent. The extra headroom opens up new possibilities for data processing, especially in a programming environment without queues or background tasks.

    10 MB response sizes

    App Engine processes can now send and receive files up to 10 MB in size, a 10x boost from the previous limit of 1 MB. Need to share a podcast, PDF, or large image? It's now possible. Developers can deploy up to 1000 static files of up to 10 MB each, creating up to 10 GB of geo-distributed static file storage per App Engine instance.
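    A pre-deploy check against these limits might look like the following minimal sketch. The function name check_static_files and the "static" directory are hypothetical, and the byte value assumes 1 MB means 1024 x 1024 bytes, which the limits quoted above do not spell out.

```python
import os

# Limits quoted in this post. The byte value assumes 1 MB = 1024 * 1024 bytes.
MAX_FILE_BYTES = 10 * 1024 * 1024
MAX_STATIC_FILES = 1000


def check_static_files(root, max_bytes=MAX_FILE_BYTES, max_files=MAX_STATIC_FILES):
    """Walk root and return (file_count, too_many, oversized).

    too_many is True when file_count exceeds max_files.
    oversized is a sorted list of (path, size_in_bytes) for files over max_bytes.
    """
    file_count = 0
    oversized = []
    for directory, _subdirs, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(directory, filename)
            size = os.path.getsize(path)
            file_count += 1
            if size > max_bytes:
                oversized.append((path, size))
    return file_count, file_count > max_files, sorted(oversized)


# Example use before a deploy (the "static" directory name is hypothetical):
#   count, too_many, oversized = check_static_files("static")
#   for path, size in oversized:
#       print("too large: %s (%d bytes)" % (path, size))
```

    Both limits are keyword arguments so the same check still works if the quotas change again.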

    No more high CPU limitations

    App Engine scripts are no longer limited to 2 CPU-intensive requests per minute. Scripts are still limited to 30 active dynamic connections at any given moment (under the free plan). Processing power is made available on-demand for up to the full 30 seconds of your process.

    Datastore now supports IN operator

    Google App Engine's datastore now supports the IN operator for Megastore queries. You can now query on a list of values instead of chaining data requests through a for loop, a big efficiency change for me.

    Summary

    Google App Engine raised its capacity this week and opened up new possibilities for developers creating web applications on its service. The platform is still in free "preview release" without the ability to purchase additional processing headroom but the per-process restrictions have loosened up a lot.

  2. Jul08

    Google App Engine optimizations

    I have developed a few web applications powered by Google App Engine since its launch in May. It has been a fairly easy transition from my traditional programming in Python and Django backed by MySQL to the distributed App Engine environment, Bigtable, and the limitations of each. I have learned a few App Engine best practices over the past month, mostly through trial and error, and would like to share them. In this post I will share data optimization tips for Google's hosted Bigtable instance, show how to reduce the errors and resource usage of your application, and add a few steps to your deployment checklist.

    Key-based lookups

    I program Django applications referenced by a set of short unique object labels named slugs. A slug column is uniquely queried across a model and easily indexed for fast scans. In the Bigtable world of Google App Engine slugs are optimally stored as a model's key name. Key names are limited to 500 bytes and must be unique across your defined entity. This unique key lookup directly copies the entity into memory without needing to scan an entire distributed hashtable.

    Entity key names provide very fast lookups for developers who like to plan ahead. You cannot alter the key name once it's set, and it cannot start with a number or underscores. If you can accept these limitations within your code, you'll experience even snappier reads from your data store.
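    The naming rules above can be checked before an entity is ever written. This is a minimal sketch built only from the limits described in this post. The helpers is_valid_key_name and key_name_for_slug and the "s:" prefix are hypothetical, not part of the App Engine SDK, and measuring the 500-byte limit in UTF-8 is an assumption.

```python
MAX_KEY_NAME_BYTES = 500  # limit quoted in this post


def is_valid_key_name(name):
    """Check a candidate key name against the rules described in this post."""
    if not name:
        return False
    # Assumption: the byte limit is measured on the UTF-8 encoded name.
    if len(name.encode("utf-8")) > MAX_KEY_NAME_BYTES:
        return False
    # Key names cannot start with a number or an underscore.
    if name[0].isdigit() or name[0] == "_":
        return False
    return True


def key_name_for_slug(slug, prefix="s:"):
    """Return a key name for slug, adding prefix when the slug alone is rejected."""
    if not slug:
        raise ValueError("slug must not be empty")
    candidate = slug if is_valid_key_name(slug) else prefix + slug
    if not is_valid_key_name(candidate):
        raise ValueError("slug cannot be used as a key name: %r" % slug)
    return candidate
```

    Because a key name cannot be changed once set, the prefix has to be chosen once and applied the same way on every write and every lookup.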

    Reduce indexed columns

    It's tempting to choose a Datastore property by its input helper or based on names similar to a SQL equivalent. So what's the difference between a short String and Text? An index.

    According to Guido, a 300-byte string stored as Text is the same size as String but without an index. If you never query or sort on a short string, store it as Text and skip the index.

    Define a favicon

    App Engine developers should define favicon.ico, robots.txt, and other frequently requested file paths. Google App Engine logs frequent errors inside your administrative console if it has to hunt for your icon with every browser request.

    Define the location of your static favicon file directly from app.yaml for fast response times:

    - url: /favicon.ico
      static_files: static/favicon.ico
      upload: static/favicon.ico
    

    You should follow a similar pattern for robots.txt and optionally the verification files from Google Webmaster Tools, Yahoo! Site Explorer, and Windows Live Search.

    Define default 404 and 500 response templates

    Your site is not perfect. Visitors will inevitably request pages that do not exist or generate an internal server error. Your site should define default templates for 404 and 500 status codes or risk displaying whatever is sitting on Google's NetScaler.

    Google App Engine default 500 page

    The screenshot above shows an error page of an App Engine application without a defined 500 handler. A link on the page suggests a visit to Google's support website where your visitors will find no support options of interest.

    Django developers should define 404.html and 500.html in your app's templates directory. Django will load and render each file for the default page_not_found and server_error views respectively.

    Deploy and request

    Developers should prime Google's distributed server networks by issuing requests for key URLs a few minutes after deploy. These automated requests trigger your memcache storage and distribute your app instance across Google's distributed servers. The first request requires more CPU cycles and memory than subsequent requests as Google tries to prioritize active application instances and their versions. You can speed things up by always issuing one or more requests after a successful deploy.

    This process is not unlike flushing and re-populating CDN PoPs with new content from your origin server or propagating dynamic handlers across your front-end cluster. It's best to kick off the process early and have the latest version of your content waiting for new visitors on subsequent requests.
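    The priming step can be scripted and run at the end of a deploy. The sketch below is one possible approach, not an official tool: warm_up and fetch_status are hypothetical names, and the application URL and paths in the usage comment are placeholders.

```python
import time
import urllib.request


def fetch_status(url, timeout=30):
    """Request url and return (HTTP status code, elapsed seconds)."""
    started = time.time()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        status = response.getcode()
    return status, time.time() - started


def warm_up(base_url, paths, fetch=fetch_status):
    """Request each path once and return (url, status, elapsed, error) tuples.

    A failed request is recorded with status None and the error text,
    so one bad URL does not stop the rest of the run.
    """
    results = []
    for path in paths:
        url = base_url.rstrip("/") + path
        try:
            status, elapsed = fetch(url)
        except OSError as error:  # URLError, HTTPError, and timeouts are OSError subclasses
            results.append((url, None, None, str(error)))
            continue
        results.append((url, status, elapsed, None))
    return results


# Example use after a successful deploy (URL and paths are placeholders):
#   for row in warm_up("http://your-app-id.appspot.com", ["/", "/feed/"]):
#       print(row)
```

    The fetch argument exists so the loop can be exercised without network access. Comparing the elapsed time of a first and a second run gives a rough view of the first-request cost described above.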

    Summary

    Google App Engine simplifies the scaling process but is not a magic cloud that will erase all latency and resource usage issues in your app. App Engine requires new approaches to data storage, data latency, and resource requirements in a metered and opaque environment. Hopefully my trials and experience will speed up your App Engine web apps as you create new services in the cloud.

  3. Apr10

    Google App Engine for developers

    On Monday Google launched Google App Engine, a hosted dynamic runtime environment for Python web applications inside Google's geo-distributed architecture. Google App Engine is the latest in a series of Google-hosted application environments and the first publicly available dynamic runtime and storage environment based on large-scale proprietary computing systems.

    Google App Engine lets any Python developer execute CGI-driven Web applications, store their results, and serve static content from a fault-tolerant geo-distributed computing grid built exclusively for modern Web applications. I met with the App Engine team leads on Monday morning for an in-depth overview of the product, its features, and its limitations. Google has been working on Google App Engine since at least March 2006 and has only just begun revealing some of its features. In this post I will summarize Google App Engine from a developer's point of view, outline its major features, and examine pitfalls for developers and startups interested in deploying web applications on Google's servers.

    What is Google App Engine?

    Google App Engine is a proprietary virtualized computing suite covering the major common components of a modern web application: dynamic runtime, persistent storage, static file serving, user management, external web requests, e-mail communication, service monitoring, and log analysis. The Google App Engine product offers a single production web server stack hosted on Google's custom-designed computers and datacenters distributed around the world.

    Google App Engine is a managed hosting environment with a tightly managed stack running in a machine-independent environment. It simplifies the deployment and management of your web application software stack while constraining you to a specific stack. When I start a new web development project today I first have to set up a tiered system to effectively handle site growth:

    1. Purchase dedicated servers or virtualized slices. Estimate necessary CPU, memory, disk space, etc. at each tier.
    2. Configure a web server for dynamic content. Install Python and its eggs, Apache HTTPd and extra modules such as mod_wsgi. Configure and tweak each. Open appropriate ports. Listen.
    3. Set up a MySQL database server and choose the appropriate storage engine. Configure MySQL, add users, add permissions. Tweak and optimize.
    4. Add an in-memory caching layer for frequently accessed dynamic content.
    5. Monitor your uptime and resource utilization with Ganglia and/or other tools on each machine.
    6. Serve static files such as JavaScript, CSS, and images from a specialized serving environment such as Amazon's Simple Storage Service.
    7. Turn your static server into an origin server for a CDN with points of presence close to your website's users.
    8. Connect each piece of the stack, keep its software updated to avoid security vulnerabilities, and hopefully respond to all website requests in less than a second.
    9. Dedicate work hours and expertise to all the above. Hire outside assistance if needed.
    10. Don't go broke trying.

    Your tiers will expand as your new web application gains popularity. Your single-server tiers become load-balanced services, message bus broadcasts and listeners, and distributed cache arrays at scale. You'll probably spend time rearchitecting your application at each stage of growth to accommodate these new resource demands, if you can afford the time, expertise, and effort.

    Google App Engine is a new and interesting solution for Python developers interested in adding features, not servers. Google spends hundreds of millions of dollars developing its custom infrastructure with 12-volt power supplies tapped into a hydro-electric dam next door and fat fiber pipes owned by local governments carrying requests and responses to their proper home. Google's physical infrastructure is a vast array of highly optimized web machines, and we'll now be able to see how such infrastructure performs across more generic applications on App Engine.

    Freemium hosting model

    Google App Engine follows a "freemium" business model, offering basic features for free with paid upsells available for application developers exceeding approximately 5 million pageviews a month. This resource quota approximately matches the Google Analytics 5 million pageview limit. Google Analytics customers may currently exceed this limit if they maintain an active AdWords account with a daily advertising budget of $1 or more. The Google App Engine team plans to introduce pricing and service level agreements for additional resources, priced in a pay-as-you-go marginal resource structure, once the product leaves its limited 10,000-person preview period later this year.

    Quota Type               Limit / day
    HTTP requests            650,000
    Bandwidth In             9.77 GB
    Bandwidth Out            9.77 GB
    CPU megacycles           200 million
    E-mails                  2,000
    Datastore calls          2.5 million
    External URL requests    160,000

    Google publishes these quotas and provides administrative monitoring tools. The quotas are just a guideline as Google may cut off access to your application if you receive a traffic spike of an unspecified duration. The Google App Engine quota page specifies:

    If your application sustains very heavy traffic for too long, it is possible to see quota denials even though your 24-hour limit has not yet been reached.
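    For a rough sense of scale, the daily limits can be turned into average rates. This is a back-of-the-envelope sketch that assumes traffic is spread evenly across the day, which real traffic is not. The helper names are hypothetical, and the numbers come from the quota table above.

```python
# Daily limits copied from the quota table in this post.
DAILY_QUOTAS = {
    "HTTP requests": 650000,
    "CPU megacycles": 200000000,
    "E-mails": 2000,
    "Datastore calls": 2500000,
    "External URL requests": 160000,
}
SECONDS_PER_DAY = 24 * 60 * 60


def average_per_second(daily_limit):
    """Average sustained rate that would exactly use up a daily limit."""
    return daily_limit / SECONDS_PER_DAY


def budget_per_request(quotas):
    """Average amount of each other resource available per HTTP request."""
    requests = quotas["HTTP requests"]
    return {
        name: limit / requests
        for name, limit in quotas.items()
        if name != "HTTP requests"
    }


print("%.2f requests/second on average" % average_per_second(DAILY_QUOTAS["HTTP requests"]))
for name, amount in sorted(budget_per_request(DAILY_QUOTAS).items()):
    print("%s per request: %.2f" % (name, amount))
```

    Under these assumptions the request quota works out to roughly 7.5 requests per second around the clock, with about 308 CPU megacycles and fewer than 4 datastore calls per request. A launch-day spike can exceed that average rate many times over.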

    Google App Engine has already failed the TechCrunch effect, and it appears the platform is currently unable to handle the referral traffic from a popular blog or news site typically associated with a product launch. The traffic spike cutoffs make me think twice about hosting anything of value on App Engine.

    The team

    The Google team behind App Engine has a long history in developer services. Team members include some of the top Python experts in the world, financial transaction specialists, and developer tool builders.

    • Python creator Guido van Rossum wrote the App Engine SDK and ported the Python runtime and Django framework for the new environment. Google App Engine is Guido's first full-time project at Google after his Noogler project Mondrian.
    • Technical lead Kevin Gibbs previously worked on the SashXB Linux development toolset and multiple RPC projects at IBM before he created Google Suggest in 2004.
    • Developer Ryan Barrett wrote the Bigtable datastore implementation and related APIs. Previously Ryan was tech lead on Moneta, Google's transaction processing platform and customer data store.
    • Product lead Paul McDonald has worked on Google Checkout, AdWords, and a Web-based IDE named Mashup Editor (all strong candidates for App Engine inclusion).
    • Product manager Peter Koomen has previously authored papers on natural language search and semantic analysis.

    The list above is just a sampling of the full team behind App Engine.

    Feature limitations

    Google App Engine is not without its faults. Applications cannot currently expand beyond the quota's ceiling. It's still unclear how an application will dynamically scale on App Engine once it leaves the farm leagues, and at what cost.

    A few major issues include:

    1. Static files are limited to 1 MB. App Engine does not support partial content requests (Accept-Ranges).
    2. Cron jobs and other long-life processes are not permitted.
    3. Applications are not uniquely identifiable by IP address, leading to a lack of identification for external communications. Applications may suffer from bad neighbor penalties from API providers upset at another app on the service.
    4. No SSL support. No IP address complicates signing, but port 443 is open for requests. You can rely on Google services (and branding) for trusted login and possibly future payments.
    5. No image processing. Python Imaging Library relies on C, and is therefore not a possible App Engine module.
    6. Google user accounts. Site visitors are very aware of your choice in web hosts each time they attempt to log on to your application. I feel like this flow makes your application seem less professional, but it may be a reasonable trade-off. Google will store your user data and potentially mine it for better ad targeting.

    Summary

    Overall I am quite impressed with Google App Engine and its potential to remove operations management and systems administration from my task list. I am not confident in Google App Engine as a hosting solution for any real business while the host is in preview stage but those concerns may be alleviated once the product is ready for real customers and real service-level agreements.

    Python developers have just been granted a few superpowers for future projects. As an existing Python and Django developer I know how difficult it can be to find a managed hosting provider with modern Python support. Many hosts are years behind, running Python 2.3. I am excited App Engine already features the programming tools I use every day, with a few modifications for their proprietary systems. App Engine should introduce more developers to Python and the Django framework and hopefully cause other web hosts to provide better Python support as well.

Niall Kennedy is a web technologist in San Francisco, California in the United States. I am very interested in the world of...
