LinkLog: Streaming Data, Distributed Execution Engines

It is rare to come across four references to similar technologies within three days. That is what happened to me a couple of days ago.

There was a reference to Hadoop on Twitter. I had almost forgotten about Hadoop, the open source equivalent of Map/Reduce.
I was watching a rather unusual Google Tech Talk the other day. It was unusual, because a person from Microsoft Research was talking about Dryad, their distributed execution engine, at Google.
One of the participants asked whether the speaker could compare Dryad to IBM’s Stream Processing Core. So I had to look it up.
Following a few links from the IBM article, I found SPADE, a declarative language for handling streaming data. I have always been fascinated by domain-specific languages that solve specialized problems, especially with data. You learn a lot just by understanding the high-level concepts.

So here they are. A set of related technologies with some overlap.

Roger Rea says:

March 24, 2009 at 4:10 am

Hi, Dorai……you might be interested to learn that a second version of the SPADE specification has been published at: http://domino.watson.ibm.com/library/cyberdig.nsf/papers/DC60E0487859F75A85257577005678A9

Roger Rea
IBM InfoSphere Streams Product Manager
1. dorai says:
  
  March 24, 2009 at 4:14 am
  
  Roger,
  Thanks. Will take a look.
Yuvi says:

April 6, 2009 at 7:07 am

Isn’t Hadoop an open source implementation of Map/Reduce? I mean, Map/Reduce is an ancient LISP concept which Google implemented on a massively parallel scale, and which was then implemented independently as Apache Hadoop. Right?

Just some nitpicking:D
1. dorai says:
  
  April 6, 2009 at 7:46 am
  
  That is correct. Several companies contributed to the Hadoop project (with a lot of contributions from Yahoo). Just a couple of days ago, Amazon announced a cloud service for Hadoop.
  
  It may be an ancient concept, but building it and making it scalable is the real challenge.
