Tuesday, 19 May 2015

Apache Kafka installation with Zookeeper

For Video Reference 



Step 1: Download ZooKeeper, Kafka, and JDK 1.7



Step 2: Untar ZooKeeper
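The archive name depends on the version you downloaded; assuming ZooKeeper 3.4.6 (the version used in the Storm post below):

tar -zxvf zookeeper-3.4.6.tar.gz
cd zookeeper-3.4.6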

Step 3: Start ZooKeeper (then run jps to check the daemon)
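From the ZooKeeper directory:

bin/zkServer.sh start
jps

jps should list QuorumPeerMain for the ZooKeeper daemon.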

Step 4: Download and install Kafka
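For example (the Kafka version and Scala build here are placeholders; pick the release that matches your setup):

wget https://archive.apache.org/dist/kafka/0.8.2.1/kafka_2.10-0.8.2.1.tgz
tar -zxvf kafka_2.10-0.8.2.1.tgz
cd kafka_2.10-0.8.2.1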


Step 5: Start Kafka
bin/kafka-server-start.sh config/server.properties

You should see startup log output in the terminal; the final lines indicate that the Kafka server has started.


Step 6: Now run jps to check that both the ZooKeeper and Kafka daemons are up and running.
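The jps output looks roughly like this (process IDs will differ):

2345 QuorumPeerMain
3456 Kafka
4567 Jps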



Step 7: Create a new topic "demo" and list all topics.
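A minimal sketch using the Kafka 0.8-era CLI (adjust the host and port if your ZooKeeper runs elsewhere):

bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic demo
bin/kafka-topics.sh --list --zookeeper localhost:2181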


Step 8: Start the console producer to produce (send) some messages
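For example (localhost:9092 is the default broker address; type messages and press Enter to send):

bin/kafka-console-producer.sh --broker-list localhost:9092 --topic demo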


Step 9: Start the console consumer to consume (receive) the messages
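For example (--from-beginning replays the messages already in the topic):

bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic demo --from-beginning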






Thursday, 9 April 2015

Apache Storm single node installation

Video Reference 


Step 1:  Download Zookeeper

wget http://www.eu.apache.org/dist/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz

Step 2:
 tar -zxvf zookeeper-3.4.6.tar.gz
 cd zookeeper-3.4.6
 cd conf

 Step 3:
 cp zoo_sample.cfg zoo.cfg
 vi zoo.cfg
 tickTime=2000
 dataDir=/home/hadoop/zookeeper
 clientPort=2181


Step 4: Download Storm

wget http://apache.mesi.com.ar/storm/apache-storm-0.9.3/apache-storm-0.9.3.tar.gz

Step 5:
tar -zxvf apache-storm-0.9.3.tar.gz
cd apache-storm-0.9.3
cd conf

Step 6:
vi storm.yaml

storm.yaml is whitespace-sensitive, so keep the top-level keys flush left:

storm.zookeeper.servers:
    - "localhost"
storm.zookeeper.port: 2181
nimbus.host: "localhost"


Step 7: Start all the services (ZooKeeper + Storm)

ZooKeeper

bin/zkServer.sh start
jps

jps should list QuorumPeerMain for the ZooKeeper daemon.


Storm (run each daemon in its own terminal, or send each to the background)
 bin/storm nimbus


 bin/storm supervisor

 bin/storm ui

Run jps to verify all the services:
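Once everything is running, the output looks roughly like this (process IDs will differ; the Storm UI daemon shows up as core):

2345 QuorumPeerMain
3456 nimbus
4567 supervisor
5678 core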



Step 8: Check the Storm UI

localhost:8080


Additional native dependencies (optional, but needed for advanced use):

wget http://download.zeromq.org/zeromq-2.1.7.tar.gz

tar -xzf zeromq-2.1.7.tar.gz

cd zeromq-2.1.7

./configure

make

sudo make install

Download and installation commands for JZMQ:

Obtain JZMQ using

git clone https://github.com/nathanmarz/jzmq.git

cd jzmq

sudo apt-get install autoconf
sudo apt-get install automake
sudo apt-get install libtool


./autogen.sh

./configure

make

sudo make install






Wednesday, 18 March 2015

Apache Spark and Hadoop Integration with example

Step 1: Install Hadoop on your machine (1.x or 2.x). You also need to set the Java and Scala paths in .bashrc (for setting the paths, refer to the Spark installation post).

Step 2: Check that all Hadoop daemons are up and running.
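On a Hadoop 1.x single-node setup, jps output looks roughly like this (on 2.x you would see ResourceManager and NodeManager instead of JobTracker and TaskTracker; process IDs will differ):

2345 NameNode
3456 DataNode
4567 SecondaryNameNode
5678 JobTracker
6789 TaskTracker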



Step 3: Write some data into HDFS (here the file name in HDFS is word).
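A minimal sketch, assuming a local file word.txt and the HDFS destination /word:

bin/hadoop fs -put word.txt /word
bin/hadoop fs -cat /word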

Step 4: Download Apache Spark built for Hadoop 1.x or 2.x, matching the Hadoop version you installed in Step 1.



Step 5: Untar the Spark-for-Hadoop tarball.
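The archive name depends on the build you downloaded; for example, for a Spark 1.x build against Hadoop 2.x:

tar -zxvf spark-1.2.1-bin-hadoop2.4.tgz
cd spark-1.2.1-bin-hadoop2.4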

Step 6: Start the Spark shell.
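From the Spark directory:

bin/spark-shell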



Step 7: Once the Spark shell has started, run the word count, as sketched below.
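A minimal word-count sketch; the HDFS URI assumes a NameNode on localhost:9000 and the input file /word from Step 3, so adjust both to your setup:

scala>val file = sc.textFile("hdfs://localhost:9000/word")
scala>val counts = file.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
scala>counts.collect()
scala>counts.saveAsTextFile("hdfs://localhost:9000/output")

counts.collect() prints the result in the terminal (Step 8), and saveAsTextFile writes it back to HDFS, which you can browse from the NameNode UI (Step 9).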



Step 8: See the output in the terminal.



Step 9: Check the NameNode UI (localhost:50070).







Step 10: Check the Spark UI (localhost:4040) to monitor the job.


Tuesday, 10 March 2015

Apache Spark word count in Scala and Python

Word count in Scala

bin/spark-shell

scala>val textFile = sc.textFile("/home/username/word.txt")

scala>val counts = textFile.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)

scala>counts.collect()

Word count in Python

bin/pyspark

>>>text_file = sc.textFile("/home/username/word.txt")

>>>counts = text_file.flatMap(lambda line: line.split(" ")).map(lambda word: (word, 1)).reduceByKey(lambda a, b: a + b)

>>>counts.collect()

Input file word.txt:

I love bigdata
I like bigdata

The job can also be monitored in the Spark web UI (localhost:4040).


Friday, 6 March 2015

Apache Spark videos

Apache Spark Quick Introduction, Lesson 1



Apache Spark word count in Scala and Python



Apache Spark Installation

Step 1: Download Spark

Step 2: Download Scala

Step 3: Download Java

NOTE: Install git from the terminal: sudo apt-get install git

Step 4: Untar Spark, Scala, and the JDK
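For example (the archive names depend on the versions you downloaded):

tar -zxvf spark-1.2.1.tgz
tar -zxvf scala-2.10.4.tgz
tar -zxvf jdk-7u75-linux-x64.tar.gz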




Step 5: Set the environment paths in .bashrc
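A sketch of the .bashrc entries; the install locations here are assumptions, so point them at wherever you untarred everything in Step 4:

export JAVA_HOME=/home/hadoop/jdk1.7.0_75
export SCALA_HOME=/home/hadoop/scala-2.10.4
export PATH=$PATH:$JAVA_HOME/bin:$SCALA_HOME/bin

Then reload the file with source ~/.bashrc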


Step 6: Start building Spark using sbt
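From the Spark source directory, the sbt build of that era was run as:

sbt/sbt assembly

The first build downloads all dependencies and can take a while.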




Step 7: Start the Spark shell
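From the Spark directory:

bin/spark-shell

You should end up at a scala> prompt.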



For a video reference: Apache Spark installation



Thursday, 13 November 2014

Top 30 Hive Interview Questions

1. What is the Hive shell?

The shell is the primary way we interact with Hive, by issuing HiveQL commands. In other words, the shell is a prompt where you enter HiveQL commands to interact with Hive.

2. How do we enter the Hive shell from a normal terminal?

Just enter the hive command: bin/hive

3. How can we check whether the Hive shell is working?

After entering the Hive shell, run a HiveQL command such as 'show databases;'
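A short session sketch (the default database ships with Hive):

$ bin/hive
hive> show databases;
OK
default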

4. Is it necessary to add a semicolon (;) at the end of HiveQL commands?

Yes, we have to add a semicolon (;) at the end of every HiveQL command.