pycon-2012-talk




Sources for my talk at PyCon UK 2012

On GitHub: 1stvamp/pycon-2012-talk

The Asynchronous Plumber

Wes Mason

@1stvamp

www.serverdensity.com

Who?

  • @1stvamp (Twitter/GitHub)
  • Product Engineer at ServerDensity
  • I like Python and evented code

The web is our API

  • Everything does HTTP
  • REST is great - let's do more of this
  • Almost everything can do HTTP asynchronously

Our Internal Services

  • RESTful
  • Stateless
  • Mostly thin layers
  • Common documented API
  • Callable from frontend

Brave New (Async) World

  • Long polling, WebSockets and Server-Sent Events
  • UIs are responsive or GTFO
  • Libraries like socket.io make this pretty easy for devs

Come Back To Me

  • Jobs running in parallel using task systems
  • What we want: async responses to the browser
  • What happens: Jobs complete and...die. *sniff*

Come Back To Me (cont.)

  • HTTP or WebSocket request kicks off a task
  • Task runs asynchronously to requests
  • Task completes and returns data...
  • ...how?

Segue to Celery

  • Great: Takes care of distributing tasks to clusters
  • Not so great: can't do multiple pubsub

Publish and subscribe

  • Need results back from Celery by account ID
  • Pubsub is ideal for this
  • Each account has multiple users
  • What we want: one task per account, same result to each user
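A rough sketch of the fan-out we're after, using a toy in-process pubsub. Everything here (AccountPubSub, the account ID, the payload) is made up for illustration; a real setup would sit on a broker rather than a dict.

```python
from collections import defaultdict


class AccountPubSub:
    """Toy in-process pubsub keyed by account ID (hypothetical, illustration only)."""

    def __init__(self):
        # account_id -> list of subscriber callbacks (one per connected user)
        self._subscribers = defaultdict(list)

    def subscribe(self, account_id, callback):
        self._subscribers[account_id].append(callback)

    def publish(self, account_id, result):
        # One task result, delivered to every user subscribed to the account
        for callback in self._subscribers[account_id]:
            callback(result)
        return len(self._subscribers[account_id])


bus = AccountPubSub()
alice_inbox, bob_inbox = [], []

# Two users on the same account subscribe separately
bus.subscribe("acct-1", alice_inbox.append)
bus.subscribe("acct-1", bob_inbox.append)

# The task runs once; both users get the same result
delivered = bus.publish("acct-1", {"cpu": 0.42})
```

One publish, two deliveries: that's the "one task per account, same result to each user" shape.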

Many to many with pubsub

  • Redis lets you do this (but we don't know Redis)
  • In theory AMQP lets you do M2M pubsub (and Celery does AMQP by default)

...well, kind of

>>> result = task.delay(foo)
>>> result
<AsyncResult: some-unique-id>

>>> result.get()  # blocks until the task finishes
...

Cue the haxx

  • Threads (urgh)
  • Polling MongoDB (we do know Mongo)
  • Using MongoDB as Celery backend and keeping results (hello race conditions!)

You Had Me Hooked

  • The stack's just web APIs... why not just use webhooks to return data?
  • They're async, just like handling our frontend requests
  • They can be tested the same way as our other APIs

The Webhook Registry

  • Register/unregister callbacks by a given key (e.g. an account ID)
  • Web handler accepts data payloads via HTTP POST
  • Webhook URI persisted somewhere (e.g. MongoDB)
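The registry idea boils down to something like this sketch. It is not the skyhooks API: WebhookRegistry and its method names are hypothetical, and a plain dict stands in for whatever persistence you use (e.g. MongoDB).

```python
class WebhookRegistry:
    """In-memory stand-in for a persisted webhook registry (hypothetical)."""

    def __init__(self):
        # key (e.g. an account ID) -> set of webhook URLs
        self._hooks = {}

    def register(self, key, url):
        self._hooks.setdefault(key, set()).add(url)

    def unregister(self, key, url):
        urls = self._hooks.get(key, set())
        urls.discard(url)
        if not urls:
            # Last URL gone: drop the key entirely
            self._hooks.pop(key, None)

    def urls_for(self, key):
        return sorted(self._hooks.get(key, ()))


registry = WebhookRegistry()
# Two server instances listening for the same account (placeholder URLs)
registry.register("acct-1", "http://app1.example.com/hooks/acct-1")
registry.register("acct-1", "http://app2.example.com/hooks/acct-1")
before = registry.urls_for("acct-1")

registry.unregister("acct-1", "http://app1.example.com/hooks/acct-1")
after = registry.urls_for("acct-1")
```

A set per key means registering the same URI twice is harmless, which helps when instances restart and re-register.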

Skyhooks to reach the Tornado

  • Open source library with a registry for Tornado - soon Gevent and Twisted.
  • Registers callbacks and handles multiple instance URIs
  • Available now: github.com/serverdensity/skyhooks

Meanwhile, back on the [server] farm

  • Tasks running somewhere, somehow
  • Data returning from one or more tasks
  • One or more INSERT_SERVER_HERE (e.g. Tornado) instances at one or more URIs awaiting data

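import pymongo
import requests

# Assumed setup: local MongoDB; database and collection names are placeholders
webhook_collection = pymongo.MongoClient().example_db.webhooks

def notify(account_id, data):
    # POST the task's result to every webhook registered for this account
    if not data:
        return
    for hook in webhook_collection.find({"account_id": account_id}):
        requests.post(hook["url"], data=data)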
  • N.B. skyhooks does this for you
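For the other end of the wire, here is a minimal sketch of an instance awaiting data, using only the standard library instead of Tornado so it runs anywhere. HookHandler, post_json, the /hooks/acct-1 path and the payload are all assumptions for illustration, not what skyhooks does internally.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # payloads delivered to this instance


class HookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON body the task POSTed at us
        length = int(self.headers.get("Content-Length", 0))
        received.append(json.loads(self.rfile.read(length)))
        self.send_response(204)  # accepted, nothing to send back
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the example quiet


def post_json(url, payload):
    """What a finished task would do: POST its result to a webhook."""
    body = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    # Ignore any proxy settings; we're only talking to localhost
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request) as response:
        return response.status


server = HTTPServer(("127.0.0.1", 0), HookHandler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

hook_url = "http://127.0.0.1:%d/hooks/acct-1" % server.server_port
status = post_json(hook_url, {"account_id": "acct-1", "cpu": 0.42})

server.shutdown()
server.server_close()
```

In the real stack the handler would look up the callback registered for the account and push the payload on to the waiting browser connections.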

Is this the only way?

  • ZeroMQ?
  • Crossroads.io has bi-di-pubsub
  • MongoDB tailable cursors
  • Do you need it?

sys.exit(0)