[colug-432] Hadoop interest ?

Scott McCarty scott.mccarty at gmail.com
Wed Dec 15 10:38:16 EST 2010

I feel exactly the same. I "get it," but we (my company) are having trouble
finding a use for it without completely redesigning the way we build and run
things. The "cloud" tools we have been investigating are "libcloud"
(Python/Java) and Overmind.

Libcloud wraps about 20 providers (Amazon, Rackspace, Linode), and Overmind
controls all of this through a web interface. The problem is that you have to
start thinking ephemerally, with load balancers and a way to get all of your
data back (e.g. a Cassandra cluster or MySQL replication). We are a hosting
provider, so it is hard to figure out how to deploy in this manner unless you
are Amazon/Google.

My two cents
Scott M

On Wed, Dec 15, 2010 at 10:20 AM, Angelo McComis <angelo at mccomis.com> wrote:

> Tom / all:
> I was speaking to a friend of mine who works at Google, and he was
> telling me how wonderfully awesome the Map Reduce / Hadoop stuff is. His
> example was that a computational job that would not complete in his
> lifetime on a single server can be distributed out to multiple nodes and
> crunched and completed in minutes or hours, depending on how much capacity
> he has to throw at the work.
> To consider the direction of the IT industry as a whole, this is certainly
> an interesting discussion to have -
> - Companies are trying to do more cloud-like things, and a Hadoop elastic
> cloud makes a lot of sense there, but getting that much data from onsite to
> the cloud is a challenge. If the data set is that big, would you not
> spend more $ on bandwidth moving it to and from the cloud than the GDP of
> some smaller countries?
> - Doing an internal Hadoop architecture - certainly the way to go, but what
> is the value of redesigning your data and processes to take advantage of
> Hadoop when the investment has already been made in the larger, vertically
> scaling hardware?
> - Doing an internal Hadoop architecture that's based on an internal elastic
> cloud (e.g. use capacity when needed, give it back when finished) makes
> sense, but it becomes problematic once you count the investment of taking
> existing data and processes and converting them to the style needed to
> distribute the rows out to Hadoop.
> I guess in short, I get it, but I don't see where it makes sense yet,
> unless you are Google, Amazon, or one of the other "Top 10" biggies out
> there.
> Maybe this is where more public forums and discussion come into play.
> Interested in others' comments on what cloud is, and where it makes sense.
> Angelo
> _______________________________________________
> colug-432 mailing list
> colug-432 at colug.net
> http://lists.colug.net/mailman/listinfo/colug-432