Hadoop 101 by Chris Wensel

What conversation about cloud computing is complete without a mention of big data, distributed processing, and distributed databases?  There is a recent trend away from relying exclusively on the traditional relational database for everything.  Newer technologies like BigTable and Hadoop provide an alternative mechanism for storing and processing large data sets that don’t necessarily have extensive relationships that need modeling.  Because they spread storage and processing across many machines, these technologies scale much further than a single relational database.

In fact, they help in two ways: first, by allowing an application to process more data through horizontal scalability (aka ‘elasticity’), and second, by reducing load on the primary relational database, which lets you go longer before ‘sharding’.

Chris Wensel is the man when it comes to understanding Hadoop and he recently gave a couple of talks introducing Hadoop.  Here is one of them:

  • tomgullo
    Great slides. I agree, Map-Group-Reduce is more descriptive.
  • This is a pretty good intro to the Hadoop world! I like how you mention avoiding 'sharding' and the pains of RDBMSs.

    I think you might like an article I wrote on the pitfalls of sharding and RDBMSs on my blog -- http://www.roadtofailure.com . Let me know what you think :)