[colug-432] Hadoop follow-up questions

Tom Hanlon tom at functionalmedia.com
Fri Mar 25 19:20:13 EDT 2011


On Mar 24, 2011, at 9:58 AM, Scott Merrill wrote:

> On Thu, Mar 24, 2011 at 9:40 AM, Scott Merrill <skippy at skippy.net> wrote:
>> I have a couple of questions this morning. Anyone should feel free to
>> answer, not just Tom, if you have any insight.
> 
> One more question: since HDFS redundantly stores data blocks in
> triplicate, does it make sense to still use traditional backup methods
> on data stored in HDFS? If one puts data into HDFS, can one reasonably
> rely on the built-in fault-tolerance of the triplicate copies of that
> data, or should one still be putting data to tape?

Your data, your call. 

It is still prone to human error, although there is a trash feature that allows deleted files to sit in the trash for a day. 
Does replicated data in one location need a backup? 
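
For what it's worth, here's a rough sketch of how that trash behavior works (the paths, username, and retention interval below are just illustrative; trash is only active when fs.trash.interval is set in core-site.xml):

```shell
# Enable trash in core-site.xml -- the interval is in minutes (1440 = one day):
#   <property>
#     <name>fs.trash.interval</name>
#     <value>1440</value>
#   </property>

# With trash enabled, a plain delete moves the file into the user's .Trash:
hadoop fs -rm /data/logs/2011-03-24.log
# Recover it by moving it back out (the .Trash path here is hypothetical):
hadoop fs -mv /user/tom/.Trash/Current/data/logs/2011-03-24.log /data/logs/
# A delete with -skipTrash bypasses the trash entirely -- no recovery:
hadoop fs -rm -skipTrash /data/logs/2011-03-24.log
```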

I would think of it like RAID. Is RAID a backup? Not really. Does it protect against a single disk failure? Yes, and HDFS saves you from a single hardware failure as well. 
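
To make the RAID analogy concrete, replication is just a setting you can inspect and change (a sketch; the path and factor are only examples):

```shell
# dfs.replication in hdfs-site.xml sets the default replication factor (3 out of the box).
# You can also change it per path after the fact; -w waits for replication to complete:
hadoop fs -setrep -w 3 /data/important
# A listing shows each file's current replication factor in the second column:
hadoop fs -ls /data/important
```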

But it does not protect against human error, a localized disaster, or a flaming rack. So if you consider your tape backups to be a disaster recovery solution, I think you would need a similar disaster recovery plan for your cluster.
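
One common way to get that kind of DR coverage is to periodically copy the data to a second cluster with DistCp (a sketch; the NameNode hostnames here are made up):

```shell
# Bulk-copy a directory tree from the primary cluster to a standby cluster.
# -update only copies files that are missing or changed on the target,
# so repeated runs behave like an incremental sync.
hadoop distcp -update hdfs://nn-primary:8020/data hdfs://nn-standby:8020/data
```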

--
Tom 





Tom Hanlon
tom at functionalmedia.com
Cloudera Certified Hadoop Developer
Certified MySQL DBA



