Showing posts with label atompub. Show all posts

Monday, September 17, 2007

GData AtomPub Podcast

I just listened to The world of Google data APIs. It is 42 minutes long. I took notes.

Here is the agenda:

  • What "Google data API" actually means (the parts and pieces)
  • What Atom, Atom Publishing Protocol, and other tech behind GData are all about
  • What GData adds to the mix on top of Atom and APP
  • How Atom compares to RSS
  • What are ETags? And how can they help me?
  • Why REST, the style, was chosen for these APIs
  • Where REST makes sense, and where it doesn't. Resource driven vs. RPC.
  • What the first GData APIs were
  • How syncing data with Google Calendar became the killer app
  • How do developers actually use the APIs? What do they need to learn? What tools does Google give them?
  • Can you write APIs that implement the same GData APIs?

Notes: Atom is an IETF standard in case you didn't know. RSS isn't.

Google has been working with Atom for over 2 years.

AtomPub provides a basic REST API - Google thought this was a great starting point. Atom leaves querying as an exercise for the implementer. Google uses URL parameters to do this, so you can have output sorted, etc. Atom doesn't say anything about this.
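To make the "query via URLs" idea concrete, here is a minimal sketch of building a query URL for a feed. The feed URL and the `build_query_url` helper are hypothetical, and the parameter names (`q`, `orderby`, `max-results`) follow GData conventions as I remember them, so check the docs for the specific service before relying on them.

```python
from urllib.parse import urlencode

def build_query_url(feed_url, **params):
    """Append query parameters to a feed URL, skipping unset ones."""
    # Hyphenated names are not valid Python keywords, so callers pass
    # max_results and we convert underscores to hyphens here.
    query = {k.replace("_", "-"): v for k, v in params.items() if v is not None}
    return feed_url + "?" + urlencode(query) if query else feed_url

url = build_query_url(
    "https://example.com/feeds/notes",  # hypothetical feed URL
    q="atompub",            # full-text query (assumed parameter name)
    orderby="updated",      # sort order (assumed parameter name)
    max_results=10,         # page size (assumed parameter name)
)
print(url)
```

The point is just that the query lives entirely in the URL, so the result is still an ordinary feed you can GET, cache, and bookmark.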

ETags were unfortunately not implemented. They chose a version number in the URL so that multiple people can write to the same entry. Plan to implement ETags in the future.
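A small sketch of how a version number in the edit URL can guard concurrent writes. The URL layout, the `EntryStore` class, and the choice of a 409 status for a stale write are my assumptions for illustration, not a description of Google's actual implementation.

```python
class EntryStore:
    """In-memory store where each edit URL embeds the entry's version."""

    def __init__(self):
        self._entries = {}  # entry_id -> (version, content)

    def create(self, entry_id, content):
        self._entries[entry_id] = (1, content)
        return self.edit_url(entry_id)

    def edit_url(self, entry_id):
        version, _ = self._entries[entry_id]
        return f"/feeds/demo/{entry_id}/{version}"  # hypothetical URL layout

    def put(self, edit_url, content):
        # The last two path segments are the entry id and the version
        # the writer last saw.
        *_, entry_id, version = edit_url.split("/")
        current, _ = self._entries[entry_id]
        if int(version) != current:
            # Stale writer: reject and hand back the current edit URL.
            return 409, self.edit_url(entry_id)
        self._entries[entry_id] = (current + 1, content)
        return 200, self.edit_url(entry_id)

store = EntryStore()
url = store.create("e1", "draft")
alice = store.put(url, "alice's edit")  # first writer wins
bob = store.put(url, "bob's edit")      # same (now stale) URL is rejected
print(alice, bob)
```

The second writer has to re-fetch the entry and retry against the new URL, which is the same job an ETag with a conditional PUT would do.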

Are ETags magic? No, a lot simpler than they sound. A little string that tells you the version of the entry. Just a great way of making caching work & making the web more efficient.
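To show how little is going on, here is a sketch of the server side of a conditional GET. `If-None-Match` and the 304 status are standard HTTP; deriving the ETag from a hash of the body and the function names are just my choices for the example.

```python
import hashlib

def make_etag(body: bytes) -> str:
    """Derive a short, quoted version string from the entry's bytes."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def conditional_get(body: bytes, if_none_match=None):
    """Return (status, headers, payload) the way a server might for GET."""
    etag = make_etag(body)
    if if_none_match == etag:
        # Client already has this version: no body is sent.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body

entry = b"<entry><title>Lunch</title></entry>"
status, headers, payload = conditional_get(entry)

# Second request sends the ETag back in If-None-Match.
status2, _, payload2 = conditional_get(entry, if_none_match=headers["ETag"])

# Once the entry changes, the old ETag no longer matches.
changed = b"<entry><title>Dinner</title></entry>"
status3, _, _ = conditional_get(changed, if_none_match=headers["ETag"])
print(status, status2, status3)
```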

Chose Atom because of momentum. And sold on REST. Not SOAP because the web is based on REST. Much easier for devs to learn REST than SOAP. SOAP requires tooling. REST is simply manipulating tables of entries.

What are disadvantages of REST? AtomPub is just an implementation of the REST style. Difficult to map certain types of operations to REST - translation API example: send text & send back a list of options & then feedback how to improve the translations. Document centric request / response that was never saved on the server. Just a straight RPC call. REST is about manipulating resources.

How about transactions? Each request is essentially transactional. REST not concerned with multi-resource transactions.

How do devs use? Can use curl if they want to - some do create apps with shell scripts and curl. Have a Java, .NET, PHP, Python, Objective C API. Contributed APIs: Lisp (Patrick!), Ruby, Flash in development. It's just XML over HTTP - it's NOT THAT HARD!!

Authentication - "Client Login" is pretty straightforward auth. URL for uname/passwd where you get a token back that is your identity. Authentication for Web Apps - "Auth Sub" used for "on behalf of" type stuff. You can grant access to other web sites. You control who has access.
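Here is a sketch of the Client Login exchange without actually touching the network. The form field names (`Email`, `Passwd`, `service`, `source`), the `key=value` response lines, and the `GoogleLogin auth=` header are how I remember the protocol, so treat them as assumptions. The credentials and the sample response are made up.

```python
from urllib.parse import urlencode

def build_login_body(email, password, service, source):
    """Form-encode the fields POSTed to the login URL."""
    return urlencode({
        "Email": email,
        "Passwd": password,
        "service": service,  # short service name, e.g. "cl" for Calendar
        "source": source,    # free-form string identifying your app
    })

def parse_login_response(text):
    """The response body is plain key=value lines; turn it into a dict."""
    return dict(line.split("=", 1) for line in text.splitlines() if "=" in line)

def auth_header(fields):
    """The Auth token goes into the Authorization header on later requests."""
    return {"Authorization": "GoogleLogin auth=" + fields["Auth"]}

body = build_login_body("user@example.com", "secret", "cl", "example-demo-1")
sample_response = "SID=aaa\nLSID=bbb\nAuth=ccc\n"  # made-up token values
fields = parse_login_response(sample_response)
header = auth_header(fields)
print(body)
print(header)
```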

Can I use Google Auth? They are meant to be open standards. Google legal hasn't signed off yet, but licensing will be worked out soon.

What is this "Kinds" business? Entries get passed around a lot - kinds concept is Atom categories to tag each entry with a "kind". Just gives more semantic information to computers (clients).
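A sketch of what a client does with a kind: look for the Atom category whose scheme marks it as a kind and read its term. The scheme and term URIs below are the GData ones as I remember them, and the entry itself is made up.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
KIND_SCHEME = "http://schemas.google.com/g/2005#kind"  # assumed scheme URI

ENTRY_XML = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Team lunch</title>
  <category scheme="http://schemas.google.com/g/2005#kind"
            term="http://schemas.google.com/g/2005#event"/>
  <category term="food"/>
</entry>"""

def entry_kind(entry):
    """Return the term of the kind category, or None if the entry has none."""
    for cat in entry.findall(ATOM + "category"):
        if cat.get("scheme") == KIND_SCHEME:
            return cat.get("term")
    return None

kind = entry_kind(ET.fromstring(ENTRY_XML))
print(kind)
```

Ordinary categories (like "food" above) are ignored, so a kind is just a tag with a well-known scheme that tells the client how to interpret the rest of the entry.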

AtomPub Google Interop Event? 12 devs from different orgs came. Great success. Google's basic AtomPub worked fine. Google custom auth scheme was more problematic. After that everything worked smoothly. A lot of the impls built around AtomPub introspection document - Google doesn't use them much. Will in the future.

Any GData / AtomPub tips? Google has a particular approach to designing APIs. AtomPub good at certain APIs. Things that map well - RPC not one of them. On occasion Atom Entry is just a pointer to the real data (e.g., photos). Prefer to put data in entry as much as possible. There is a lot of art/style/parsimony to it. You can achieve a lot with 1 feed with a lot of query params. Have to review APIs & ensure that they have good clean concepts: this feed clearly includes these types of elements.

WADL? Haven't found the need for it.

Atom can be tough to get at first, but once you do it is amazingly simple & then applicable to many things. Very good programmers who don't understand the AtomPub jargon/language can still get it. Concepts are very simple. Feeds, entries, links to other entries. Very simple mental model once you get it. All the APIs work the same way. Very powerful.

Google working with the IETF on improving Atom/AtomPub. Introduced batch model that increases efficiency a lot. Auth. Teams in Google are very ambitious - hope to make publicly available as drafts that can become standards.

Some talk of JAY-SAHN (JSON).

Wednesday, September 05, 2007

Tuesday, August 28, 2007

Atom & Pub/Sub Revisited

Tim Bray picked up on my Pub/Sub vs. Atom & AtomPub? post.

During the last year, I have not been doing that much messaging, but the previous 7 years I did a substantial amount of it - both point-2-point and lots of pub/sub. I absolutely don't think that Atom / AtomPub by itself will replace messaging. I don't know enough about XMPP to know if it will replace traditional messaging some day. I am a big fan of AMQP, but don't know if it is viable yet or will be viable soon. Tim is dead on that PSB is an impressive corpus of work.

My point was only that Atom & AtomPub appear to be a very good format / protocol to do business events with a polling model. Having done a lot of pub/sub, I have done a lot of clustering. The buffer that Tim describes is not so simple when you get lots of clients and lots of consumers. That is where flow control starts to appear (if you really want to guarantee message delivery) and where the dark corners of failover and clustering rear their heads. It is nothing against messaging per se - just how it is. My point really was that even "guaranteed" messaging is often not truly guaranteed. And that for the right usage scenario the poll model is simpler and more appropriate.

Furthermore - push models can be appealing to developers because they feel cool (they have felt very cool to me) - even simple. My point is just that this typically isn't really the case.

It really just comes down to your requirements. For example, message rate, how quickly your event sinks need to process, etc. For where I am thinking right now, high level business events don't typically need to be delivered more than every 10 minutes - perhaps every minute. I'd much rather deploy some simple HA web infrastructure & Atom/AtomPub than AMQP, TIBCO, SonicMQ, ActiveMQ, etc. to meet those requirements.

In my experience with pub/sub style integration architectures when you are pushing information amongst major domains (e.g., party/customer information), there is always the nagging question of whether you missed an event - or how to deal with drift. This is because even though it is supposedly guaranteed messaging, there is always a risk. I've seen people try to capture events and persist them so that they can be replayed in the event of this type of a problem. Or more commonly, check the master system every month to ensure that the systems are synchronized. An Atom feed seems like a good way to avoid this failure altogether - in that rather than pushing a copy of the event to each interested sink, the sink just reads the feed. Sure the client has to be a little bit more intelligent, but it just seems more deterministic to me. But I have more to learn.
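To show what "a little bit more intelligent" might mean, here is a sketch of an event sink that reads a feed and works out which entries it has not processed yet. The feed content, the `make_feed` helper, and the idea of tracking seen entry ids in a set are my assumptions for the example. A real sink would fetch the feed over HTTP and persist what it has seen.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

def make_feed(entries):
    """Build a tiny (incomplete) Atom feed from (id, title, updated) tuples."""
    items = "".join(
        f"<entry><id>{i}</id><title>{t}</title><updated>{u}</updated></entry>"
        for i, t, u in entries
    )
    return f'<feed xmlns="http://www.w3.org/2005/Atom">{items}</feed>'

def new_entries(feed_xml, seen_ids):
    """Return (id, title) for entries not yet seen, oldest first."""
    fresh = []
    for entry in ET.fromstring(feed_xml).findall(ATOM + "entry"):
        entry_id = entry.findtext(ATOM + "id")
        if entry_id not in seen_ids:
            fresh.append((entry.findtext(ATOM + "updated"), entry_id,
                          entry.findtext(ATOM + "title")))
    # Timestamps in the same UTC "Z" format sort correctly as plain strings.
    fresh.sort()
    return [(i, t) for _, i, t in fresh]

seen = set()
e1 = ("urn:event:1", "Customer added", "2007-08-28T10:00:00Z")
e2 = ("urn:event:2", "Customer updated", "2007-08-28T10:05:00Z")
e3 = ("urn:event:3", "Customer deleted", "2007-08-28T10:10:00Z")

first = new_entries(make_feed([e2, e1]), seen)       # first poll
seen.update(i for i, _ in first)
second = new_entries(make_feed([e3, e2, e1]), seen)  # later poll
seen.update(i for i, _ in second)
again = new_entries(make_feed([e3, e2, e1]), seen)   # re-reading is harmless
print(first, second, again)
```

If a poll fails, the sink just polls again. Nothing is lost as long as the entry is still in the feed, which is the deterministic property I am after.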

Monday, August 27, 2007

Atom Service Directory

I re-read some of the Atom specifications on a plane today.

I haven't been as excited about a data format since creating a mostly proprietary, but somewhat based on ACORD one at a Canadian insurance company in 1999-2000.

I still have a lot to learn about Atom, but it seems to address all of the pain points I have seen in integration data formats so I'm optimistic.

I obviously can't be the first one to think of it, but wouldn't Atom / AtomPub make a nice simple Service Directory (think UDDI, but with legs)? Just a feed with entries for each service instance? Categories and other metadata? You could dynamically register services via AtomPub - perhaps services would have to re-register every so often. Is there something like this already?
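A rough sketch of the idea, just to see how small it could be: an in-memory directory where services register with a time-to-live and the live ones are rendered as a feed. Everything here (the `ServiceDirectory` class, the TTL, the service URLs) is hypothetical, and the output is not a complete valid Atom feed, since it leaves out required elements such as the feed id, updated, and author.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

class ServiceDirectory:
    def __init__(self, ttl_seconds=300):
        self.ttl = ttl_seconds
        self._services = {}  # url -> (name, category, registered_at)

    def register(self, name, url, category, now):
        """Register or re-register a service instance at time `now` (seconds)."""
        self._services[url] = (name, category, now)

    def live(self, now):
        """Services whose registration has not expired yet."""
        return {u: v for u, v in self._services.items() if now - v[2] <= self.ttl}

    def to_feed(self, now):
        """Render the live services as one feed, one entry per instance."""
        ET.register_namespace("", ATOM_NS)
        feed = ET.Element(f"{{{ATOM_NS}}}feed")
        ET.SubElement(feed, f"{{{ATOM_NS}}}title").text = "Service directory"
        for url, (name, category, _) in sorted(self.live(now).items()):
            entry = ET.SubElement(feed, f"{{{ATOM_NS}}}entry")
            ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = name
            ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = url
            ET.SubElement(entry, f"{{{ATOM_NS}}}link", href=url)
            ET.SubElement(entry, f"{{{ATOM_NS}}}category", term=category)
        return ET.tostring(feed, encoding="unicode")

directory = ServiceDirectory(ttl_seconds=60)
directory.register("billing", "https://billing.example/api", "finance", now=0)
directory.register("search", "https://search.example/api", "core", now=50)
feed_xml = directory.to_feed(now=100)  # billing has expired by now
print(feed_xml)
```

With AtomPub in front of it, `register` would be a POST of an entry to the collection, and the re-registration requirement is what keeps dead instances from lingering in the feed.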

I also looked back at some of the blogosphere spaz-outs regarding Atom yesterday. I have more to learn, but perhaps it is best that Atom gets a head of steam before embrace and extend sets in anyway.

Here's to hoping that Atom stays simple - seems to be a beautiful format.

Sunday, August 19, 2007

Pub/Sub vs. Atom & AtomPub?

I came across an interesting article by Sean McGrath entitled I'll push and you pull. The mashup approach to application integration.

I posted about Push vs. Pull in the spring. I'm a sucker for Lean thinking so the title of this article got my attention.

Sean says:

Now here is the kicker. The web - with all its concomitant bits'n'bobs from XML to RSS/Atom to AJAX - is an extremely good platform for pull-centric design. On the Web, if you try to pull some piece of information and something goes wrong, well you just pull again and again until you get it or give up. Nothing fancy. Just brutish repetition. Something machines are extremely good at. If you want to look at information from yesterday, you just go to the URL that contains yesterday's information. Nothing fancy. Just a simple naming convention that includes dates in URLs.

I have done a fair amount with pub/sub over the years (complete with clustering, failover strategies, etc.). I like it, but it is not without its challenges. If only there was real guaranteed delivery etc. If only there wasn't "flow control". If only failover always worked. If only messages didn't get trapped, etc. All I'm saying is having implemented several large implementations of pub/sub (EDA whatever you like), I know that it isn't easy - that nothing is really "guaranteed".

Clearly Atom can't replace ALL pub/sub use cases, but for every day integration architecture where you want business events / EDA why can't we use Atom feeds? In an extreme case, you might have an event sink requesting the feed every 10 seconds - in most cases every 10 minutes would likely be fine?

Who is doing this today? Any lessons from the trenches?

Update I meant to read the article earlier linked off of this post by James Snell. Just did. Dang. It is a sample of using Apache Abdera.

Update 20-AUG Bill de hÓra has a pretty $$ response - thanks Bill.

Update 21-AUG Dan Creswell has some thoughtful comments. When I blog I often do the x vs. y thing - but clearly the truth is in the middle. I used to be a push bigot and have just learned the hard way how difficult it is to achieve. Clearly, not everything can be pull - of course it is a mix. In case anyone is curious, I'm thinking of inter-domain integration (e.g., getting customer additions/updates/deletions to many interested systems) with this line of thinking rather than some sort of intra-domain integration (e.g., trading system where there is massive high performance pub/sub). You have to choose the right tool for the job. More and more for me, simplicity is winning out. This is just an evolution of my thinking - a year ago I was still a MOM bigot.