Saturday, January 06, 2007

Simplicity & Integrating with Less Code

I have been on a distributed caching & JavaSpaces kick lately for the reasons listed here.

I was talking to my co-worker Erik yesterday and we concurred that one of our biggest goals is to come up with an architecture that is as simple as possible and requires the least amount of code.

Coming from my experience with SOA of various flavors and EDA, I have been on many projects that wrote way too much code. The more code, the more defects. This is what is so appealing to me about JavaSpaces and potentially distributed caching (e.g., memcached, Tangosol Coherence), or some mix of both. In the core of an application it is possible to use these layers as your service orchestration tier as well as your transient data store. This is very appealing, as you skip a significant amount of OR (object-relational) and OX (object-XML) mapping, which saves heaps of time and defects.

The JavaSpaces API itself is extremely seductive. Lifted from here:

  • write: Places one copy of an entry into a space. If called multiple times with the same entry, then multiple copies of the entry are written into the space.
  • read: Takes an entry that is used as a template and returns a copy of an object in the space that matches the template. If no matching objects are in the space, then read may wait a user-specified amount of time until a matching entry arrives in the space.
  • take: Works like read, except that the matching entry is removed from the space and returned as the result of the take.
  • notify: Takes a template and an object and asks the space to notify the object whenever entries that match the template are added to the space. This notification mechanism, which is built on Jini's distributed event model, is useful when you want to interact with a space using a reactive style of programming.
  • snapshot: Provides a method of minimizing the serialization that occurs whenever entries or templates are used; you can use snapshot to optimize certain entry usage patterns in your applications. We will cover this method in detail in a later article.
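To make the write/read/take semantics above concrete, here is a minimal in-memory sketch. It is not the real net.jini.space.JavaSpace interface: ToySpace and Task are hypothetical names made up for illustration, and the sketch leaves out the transactions, leases, timeouts, and blocking that the real API involves. What it does try to show is the associative lookup, where a null field in a template acts as a wildcard.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical entry type. Public object-typed fields mirror the
// JavaSpaces entry convention, where a null template field is a wildcard.
class Task {
    public String kind;
    public Integer id;

    Task(String kind, Integer id) {
        this.kind = kind;
        this.id = id;
    }
}

// Toy stand-in for a space (an assumption for illustration only).
class ToySpace {
    private final List<Task> entries = new ArrayList<>();

    // write: stores a copy, so writing the same entry twice yields two entries.
    synchronized void write(Task t) {
        entries.add(new Task(t.kind, t.id));
    }

    // A null template field matches anything; a non-null field must be equal.
    private static boolean matches(Task tmpl, Task e) {
        return (tmpl.kind == null || tmpl.kind.equals(e.kind))
            && (tmpl.id == null || tmpl.id.equals(e.id));
    }

    // read: returns a copy of a matching entry and leaves the original in place.
    // This sketch returns null immediately instead of waiting for a match.
    synchronized Task read(Task tmpl) {
        for (Task e : entries) {
            if (matches(tmpl, e)) {
                return new Task(e.kind, e.id);
            }
        }
        return null;
    }

    // take: like read, but the matching entry is removed from the space.
    synchronized Task take(Task tmpl) {
        for (int i = 0; i < entries.size(); i++) {
            if (matches(tmpl, entries.get(i))) {
                return entries.remove(i);
            }
        }
        return null;
    }

    synchronized int size() {
        return entries.size();
    }
}
```

With this toy, space.read(new Task("resize", null)) finds any "resize" task regardless of id, while space.take(new Task(null, 1)) removes whichever entry has id 1 regardless of kind.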

With the addition of JavaSpace05, there are also collection-based (i.e., bulk) read, take, and write operations.

The integration patterns possible with this API are very powerful: Master/Worker, distributed and highly scalable data structures, etc.
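As a rough illustration of Master/Worker over a space, here is a sketch in plain Java with threads standing in for distributed workers. BlockingSpace, Job, and MasterWorker are hypothetical names, not part of Jini, and the "poison pill" shutdown is just one convenient choice for this toy. The master writes task entries, workers take them and write back result entries, and the master takes the results.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical entry: either a unit of work or its result, told apart by kind.
class Job {
    public String kind;    // "task" or "result"
    public Integer value;  // null on a "task" means "worker, please stop"

    Job(String kind, Integer value) {
        this.kind = kind;
        this.value = value;
    }
}

// Toy blocking space (an assumption for illustration, not the JavaSpace API).
class BlockingSpace {
    private final List<Job> entries = new ArrayList<>();

    synchronized void write(Job j) {
        entries.add(j);
        notifyAll(); // wake any thread blocked in take
    }

    // Blocks until an entry of the given kind exists, then removes and returns it.
    synchronized Job take(String kind) throws InterruptedException {
        while (true) {
            for (int i = 0; i < entries.size(); i++) {
                if (entries.get(i).kind.equals(kind)) {
                    return entries.remove(i);
                }
            }
            wait();
        }
    }
}

class MasterWorker {
    // The master squares the numbers 1..tasks via the workers and sums the results.
    static int sumOfSquares(int tasks, int workers) {
        BlockingSpace space = new BlockingSpace();
        List<Thread> pool = new ArrayList<>();
        try {
            for (int w = 0; w < workers; w++) {
                Thread t = new Thread(() -> {
                    try {
                        while (true) {
                            Job job = space.take("task");
                            if (job.value == null) {
                                return; // poison pill: this worker is done
                            }
                            space.write(new Job("result", job.value * job.value));
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
                t.start();
                pool.add(t);
            }
            for (int i = 1; i <= tasks; i++) {
                space.write(new Job("task", i));
            }
            int sum = 0;
            for (int i = 0; i < tasks; i++) {
                sum += space.take("result").value;
            }
            for (int w = 0; w < workers; w++) {
                space.write(new Job("task", null)); // one pill per worker
            }
            for (Thread t : pool) {
                t.join();
            }
            return sum;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

The appeal is that the master and the workers never reference each other. They only agree on the shape of the entries, so adding workers means starting more takers against the same space.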

It appears that with certain distributed caching vendors you can achieve some semblance of these patterns, although I am not convinced of that yet. And the associative nature of JavaSpaces (i.e., you find things in a space using templates that contain the values you are looking for) is just amazingly seductive.

Love it or hate it, JavaSpaces comes with Jini. To a newbie like myself, Jini takes some getting used to. The whole mobile code bit tends to make people who have lived through J2EE classloader hell nervous. But there are those who say it works fine in Jini. It is a different paradigm and this makes people very nervous.

Maybe what is needed is a mix of J2EE (specifically servlet, JMS, JMX MBeans), distributed caching, and JavaSpaces. Maybe you can use JavaSpaces for both caching and service orchestration. Maybe you should use all of Jini. Maybe there is something else.

From previous lessons learned around persistence in EDA, I am 100% convinced that we need some sort of flexible transient data store (no work in progress database - I beg you!). I see that today as being some form of a persistent distributed cache that requires very little, if any, mapping code. I also have seen the power of asynchronous integration and SEDA and am not about to give up on it. JavaSpaces and Master/Worker appear to give me that. What other ways are there? Maybe things that I would have considered anathema a year ago, like sending objects through JMS for the eventing layer, aren't really that awful? I do know that I do not want XML at the core of a brand new system again: way too many defects with mapping, etc. Too slow, too lame. Sure, at the periphery to integrate with services that require it, but never again in the core of the system.

Lots of thinking results in lots of blather . . .

Bottom line, I want this to be as simple as it can be and I want to write as little code as possible. Oh and it should scale like the dickens, be simple to specialize, maintain, etc.


Jack said...

I am 100% convinced that we need some sort of flexible transient data store (no work in progress database - I beg you!).

Hi Mike, I agree with you. I call it the Global Dataspace and in the current marketspace I think an ESB can fulfil this role very well.

See also my blog on this subject.


fuzzy said...

Hi Jack,

Thanks for the comment.

I think we are talking about different things (depending on the details).

My point from previous pain endured is to avoid XML shredding as much as possible. And in the core of an application (very different than calling a service), avoid OX (Object to XML) mapping as well. In the core of an application I still want to embrace services that do one thing well. I just don't want XML anywhere near it.

I think you need to be very deliberate about persistence with SOA, EDA, SBA, etc.

A distributed cache (e.g., Tangosol) or perhaps a JavaSpace is perfect for this.

The great thing about these types of technologies is that your transient state is represented as objects and not mapped to a relational form. This is very flexible. You get performance benefits and, more importantly (IMHO), heaps fewer defects with this because you aren't mapping in and out.

Obviously when you need long lived storage, you stick it in a database or a mainframe or whatever.

Keeping transient state out of any long lived storage system in something like a distributed cache or a JavaSpace enables this.

Sarge said...

One of the things I like about writing Cocoa apps is the lack of code.