Panic From Fuzzy: EDA Lessons Learned

Monday, April 24, 2006

EDA Lessons Learned - Persistence

Our EDA is based on event stream processing.

There are *lots and lots* of events. It blows your mind to listen to all of the events at once (just subscribe to # in SonicMQ). It is like watching the Matrix. Most of the events aren't terribly meaningful to a typical subscriber. Many of the events are only used in the source system in "event workflows".

EDA is highly distributed. Our system has many different processes and machines publishing and subscribing to events. Persistence is very easy to get wrong in an environment like this. The first lesson is to put enough data in each event so that subscribers of the event do not need to gather more data from the source. Also, in terms of inserts and updates, you have to be careful of table / row locks etc. If you are not careful, you'll have N processes with N threads on N machines trying to hit the same data source.
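To make the first lesson concrete, here is a minimal Java sketch of a "fat" event. It is not our actual code: `OrderShippedEvent`, `ShippingNotifier`, and every field name are hypothetical, chosen only to show a subscriber doing its job from the event alone, without calling back to the publishing system.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical event: every name and field here is illustrative only.
final class OrderShippedEvent {
    final String orderId;
    final String customerEmail;
    final String carrier;
    final String trackingNumber;
    final List<String> skus;

    OrderShippedEvent(String orderId, String customerEmail, String carrier,
                      String trackingNumber, List<String> skus) {
        this.orderId = orderId;
        this.customerEmail = customerEmail;
        this.carrier = carrier;
        this.trackingNumber = trackingNumber;
        // Defensive copy so the event cannot change after it is created.
        this.skus = Collections.unmodifiableList(new ArrayList<>(skus));
    }
}

// Hypothetical subscriber: it reads only the event, never the order database.
final class ShippingNotifier {
    static String buildEmail(OrderShippedEvent e) {
        return "To: " + e.customerEmail + "\n"
             + "Order " + e.orderId + " shipped via " + e.carrier
             + " (tracking " + e.trackingNumber + "), items: "
             + String.join(", ", e.skus);
    }
}
```

The thin alternative, an event holding only `orderId`, would force every subscriber to query the source for the email address, carrier, and items, which is exactly the N-processes-on-one-data-source problem described above.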

For us, a lot of these "ordinary" events (the ones that are not notable to downstream services) are used in event workflows. State is important - you have to put it somewhere. The key is to think very carefully about where you put it and not screw it up.

Here is a list of some general thoughts on the subject:

As mentioned above, put the appropriate amount of data in each event so that subscribers do not need to gather more data from the sending system
If you are using persistent messaging & durable subscriptions as your EDA backbone, trust it! You do not need to persist state in each service that handles an event. If you do, you are going to regret it
XML shredding is expensive and results in a high defect rate (on the way in and out)
Separate transient state (i.e., part of an event workflow), terminal state (i.e., a completed event workflow), and reporting databases (ODS, data warehouse, OLAP)
Beware of O/R tools:
  They work OK in experienced hands, but caching is difficult when multiple services in different JVMs each have a separate cache, and mistakes can be hard to fix
  Use proper granularity
  Emitted SQL is not always performant
  They often end up being more complicated than JDBC

Consider an XML database, a high-performance reliable file system, or a caching solution for transient data
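
One way to picture the split between transient and terminal state is the rough Java sketch below. It is an illustration under assumptions, not our implementation: `WorkflowStateStore` and its methods are made-up names, and the two in-memory maps merely stand in for a real transient store (cache, file system, XML database) and a real terminal store (for example, a relational table).

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: in-memory maps stand in for the real stores.
final class WorkflowStateStore {
    // Transient store: state of event workflows that are still in flight.
    private final Map<String, String> inFlight = new ConcurrentHashMap<>();
    // Terminal store: final state of completed event workflows.
    private final Map<String, String> completed = new ConcurrentHashMap<>();

    // Record an intermediate step; only the transient store is touched.
    void advance(String workflowId, String state) {
        inFlight.put(workflowId, state);
    }

    // Finish a workflow: write the final state once, then drop the transient copy.
    void complete(String workflowId, String finalState) {
        completed.put(workflowId, finalState);
        inFlight.remove(workflowId);
    }

    Optional<String> inFlightState(String workflowId) {
        return Optional.ofNullable(inFlight.get(workflowId));
    }

    Optional<String> completedState(String workflowId) {
        return Optional.ofNullable(completed.get(workflowId));
    }

    int inFlightCount() {
        return inFlight.size();
    }
}
```

The point of the shape is that the many intermediate updates never touch the terminal store, which sees one write per completed workflow. Reporting databases would be fed from the terminal store separately and are left out of the sketch.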

3 comments:

Anonymous said...: Hi Mike,

My name is John Wright and I find your blog very helpful and informative. We have similar backgrounds. I recently started working for CapeClear Software and am interested in your EDA experience but have found all of your entries to be very thought provoking.

Cheers; Tue Apr 25, 04:46:00 PM PDT
Anonymous said...: I've had very similar experiences to what you describe.

Peace.; Tue Apr 25, 07:01:00 PM PDT
fuzzy said...: Hey thanks John - glad to hear that it is useful.; Tue Apr 25, 08:31:00 PM PDT
