Thursday 26 June 2014

Top 10 most popular false beliefs about Hadoop


Hadoop and Big Data are practically synonymous these days. There is plenty of information about both, but as the Big Data hype machine gears up, there is a lot of confusion about where Hadoop actually fits into the overall Big Data landscape. Let’s have a look at some of the most popular false beliefs about Hadoop.

False belief #1: Hadoop is a database

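Hadoop is often talked about as if it were a database, but it isn't. At its core, Hadoop is a distributed file system (HDFS) paired with a batch-processing framework (MapReduce), and it doesn’t provide database features such as query optimization, indexing, and random access to data. However, Hadoop can be used as the foundation for a database system; HBase, for example, is built on top of HDFS.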
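To make the difference concrete, here is a minimal sketch in plain Python (not Hadoop code). The records, the `scan_lookup` function, and the `build_index` function are all hypothetical and only illustrate the idea: without an index, finding one record means reading through the data, while a database-style index lets you jump straight to it.

```python
# Hypothetical records standing in for lines of a file in a distributed file system.
records = [
    "1001,alice,london",
    "1002,bob,paris",
    "1003,carol,berlin",
]


def scan_lookup(lines, user_id):
    """File-style access: read line by line until the key is found."""
    for line in lines:
        key, name, city = line.split(",")
        if key == user_id:
            return name, city
    return None


def build_index(lines):
    """Database-style access: build an index once, then look keys up directly."""
    index = {}
    for line in lines:
        key, name, city = line.split(",")
        index[key] = (name, city)
    return index


index = build_index(records)

print(scan_lookup(records, "1003"))  # walks the records until the key matches
print(index.get("1003"))             # direct lookup through the index
```

Both calls return the same record. The difference is the cost: the scan grows with the size of the data, which is why a plain file system on its own is a poor fit for random lookups.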

False belief #2: Hadoop is a complete, single product

It's not. This is the biggest myth of all! Hadoop consists of multiple open source products such as HDFS (Hadoop Distributed File System), MapReduce, Pig, Hive, HBase, Ambari, Mahout, Flume, and HCatalog. Basically, Hadoop is an ecosystem: a family of open source products and technologies overseen by the Apache Software Foundation (ASF).

Wednesday 25 June 2014

13 V's in Big Data


  • Volume
  • Velocity
  • Variety
  • Veracity
  • Value
  • Visualization
  • Volatility
  • Variability
  • Viability
  • Venue
  • Vocabulary
  • Vagueness
  • Validity