dltkr77/Homework

Creating the Project Directory

Open a command window, create a directory as shown below, and move into it. Initialize with vagrant init, then add a box using the vagrant box add command.

cd \
md project
cd project
vagrant init
vagrant box add ubuntu/trusty64
vagrant box list

========== Logs ==========
C:\Project>vagrant box list
ubuntu/trusty64 (virtualbox, 14.04)

Configuring the Vagrantfile

์œ„์—์„œ ๋งŒ๋“ค์–ด์ง„ Vagrantfile์„ ์—ด์–ด, ์•„๋ž˜์™€ ๊ฐ™์ด ํŽธ์ง‘ํ•ฉ๋‹ˆ๋‹ค.

# -*- mode: ruby -*-
# vi: set ft=ruby :

Vagrant.configure(2) do |config|
  # Master node
  config.vm.define "master" do |master|
    master.vm.provider "virtualbox" do |v|
      v.name = "master"
      v.memory = 4096
      v.cpus = 1
    end
    master.vm.box = "ubuntu/trusty64"
    master.vm.hostname = "master"
    master.vm.network "private_network", ip: "192.168.200.2"
    master.vm.provision "file", source: "./ssh_setting.sh",
      destination: "/home/vagrant/ssh_setting.sh"
    master.vm.provision "shell", path: "./setup.sh"
  end

  # Slave1 node
  config.vm.define "slave1" do |slave1|
    slave1.vm.provider "virtualbox" do |v|
      v.name = "slave1"
      v.memory = 2048
      v.cpus = 1
    end
    slave1.vm.box = "ubuntu/trusty64"
    slave1.vm.hostname = "slave1"
    slave1.vm.network "private_network", ip: "192.168.200.10"
    slave1.vm.provision "file", source: "./ssh_setting.sh",
      destination: "/home/vagrant/ssh_setting.sh"
    slave1.vm.provision "shell", path: "./setup.sh"
  end

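  # Slave2 node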
  config.vm.define "slave2" do |slave2|
    slave2.vm.provider "virtualbox" do |v|
      v.name = "slave2"
      v.memory = 2048
      v.cpus = 1
    end
    slave2.vm.box = "ubuntu/trusty64"
    slave2.vm.hostname = "slave2"
    slave2.vm.network "private_network", ip: "192.168.200.11"
    slave2.vm.provision "file", source: "./ssh_setting.sh",
      destination: "/home/vagrant/ssh_setting.sh"
    slave2.vm.provision "shell", path: "./setup.sh"
  end
end
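
Before booting, vagrant status is a quick way to confirm that the Vagrantfile parses and that all three machines are defined:

vagrant status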

Writing the Shell Script

Write a shell script, as shown below, to set up the environment that will be used in the VMs. Name the file 'setup.sh' and place it in the project directory.

#!/bin/bash

# Variables
tools=/home/hadoop/tools
JH=/home/hadoop/tools/jdk
HH=/home/hadoop/tools/hadoop

# Install jdk
apt-get install -y openjdk-7-jre-headless
apt-get install -y openjdk-7-jdk

# Install expect
apt-get install -y expect

# Install git
apt-get install -y git

# Add group and user
addgroup hadoop
useradd -g hadoop -d /home/hadoop/ -s /bin/bash -m hadoop
echo -e "hadoop\nhadoop" | (passwd hadoop)

# Make directory for hdfs
host=`hostname`
if [ $host == "master" ]; then
	mkdir -p /home/hadoop/hdfs/name
else
	mkdir -p /home/hadoop/hdfs/data
fi

# Modify ssh_setting.sh(encoding problem)
sed -i 's/\r//' /home/vagrant/ssh_setting.sh
cp /home/vagrant/ssh_setting.sh /home/hadoop/

# Download hadoop
mkdir $tools
cd $tools
wget http://ftp.daum.net/apache//hadoop/common/hadoop-1.2.1/hadoop-1.2.1.tar.gz
tar xvf hadoop-1.2.1.tar.gz
ln -s $tools/hadoop-1.2.1 $tools/hadoop
ln -s /usr/lib/jvm/java-1.7.0-openjdk-amd64 $tools/jdk

# Download Maven
cd $tools
wget http://mirror.apache-kr.org/maven/maven-3/3.2.5/binaries/apache-maven-3.2.5-bin.tar.gz
tar xvf apache-maven-3.2.5-bin.tar.gz
ln -s $tools/apache-maven-3.2.5 $tools/maven

#== Hadoop Setting ==#
# hadoop-env.sh
echo "export JAVA_HOME=/home/hadoop/tools/jdk" >> $HH/conf/hadoop-env.sh
echo "export HADOOP_HOME_WARN_SUPRESS=\"TRUE\"" >> $HH/conf/hadoop-env.sh
echo "export HADOOP_OPTS=-server" >> $HH/conf/hadoop-env.sh

# core-site.xml
cat > $HH/conf/core-site.xml << 'EOF'
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>
EOF

# hdfs-site.xml
cat > $HH/conf/hdfs-site.xml << 'EOF'
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/hadoop/hdfs/name</value>
  </property>

  <property>
    <name>dfs.data.dir</name>
    <value>/home/hadoop/hdfs/data</value>
  </property>

  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
EOF

# mapred-site.xml
cat > $HH/conf/mapred-site.xml << 'EOF'
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>master:9001</value>
  </property>
</configuration>
EOF

# masters, slaves
echo "master" > $HH/conf/masters
echo "slave1" > $HH/conf/slaves
echo "slave2" >> $HH/conf/slaves
#====#

# Environment Setting
chown -R hadoop:hadoop /home/hadoop
chmod 755 -R /home/hadoop
echo "" >> ~hadoop/.bashrc
echo "export JAVA_HOME=$JH" >> ~hadoop/.bashrc
echo "export M2_HOME=$tools/maven" >> ~hadoop/.bashrc
echo "export PATH=\$PATH:\$JAVA_HOME/bin:$HH/bin" >> ~hadoop/.bashrc
echo "export PATH=\$PATH:\$M2_HOME/bin" >> ~hadoop/.bashrc

# /etc/hosts Setting
echo "fe00::0 ip6-localnet" > /etc/hosts
echo "ff00::0 ip6-mcastprefix" >> /etc/hosts
echo "ff02::1 ip6-allnodes" >> /etc/hosts
echo "ff02::2 ip6-allrouters" >> /etc/hosts
echo "ff02::3 ip6-allhosts" >> /etc/hosts
echo "192.168.200.2 master" >> /etc/hosts
echo "192.168.200.10 slave1" >> /etc/hosts
echo "192.168.200.11 slave2" >> /etc/hosts

Writing the Shell Script (2)

Write a script for distributing the SSH public key later, and place it in the project directory. Name it 'ssh_setting.sh'.

#!/bin/bash

# SSH's public key sharing
expect << EOF
    spawn ssh-keygen -t rsa
    expect "Enter file in which to save the key (/home/hadoop//.ssh/id_rsa):"
        send "\n"
    expect "Enter passphrase (empty for no passphrase):"
        send "\n"
    expect "Enter same passphrase again:"
        send "\n"
    expect eof
EOF

cat ~/.ssh/id_rsa.pub | ssh hadoop@master "cat > ~/.ssh/authorized_keys"
cat ~/.ssh/id_rsa.pub | ssh hadoop@slave1 "mkdir ~/.ssh; cat > ~/.ssh/authorized_keys"
cat ~/.ssh/id_rsa.pub | ssh hadoop@slave2 "mkdir ~/.ssh; cat > ~/.ssh/authorized_keys"
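
If key-based login is still refused later, overly permissive permissions are a common culprit, since sshd silently ignores an authorized_keys file it does not trust. A possible fix, run as the hadoop user on each VM (not part of the original script):

chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys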

vagrant up!

The VMs used by this Vagrantfile are named 'master', 'slave1', and 'slave2'. If VMs with those names already exist, you will need to change v.name in the Vagrantfile.
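
To see which VM names are already registered with VirtualBox, you can list them first (assuming VBoxManage is on the PATH):

VBoxManage list vms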

cd \project
vagrant up

========== Logs ==========
C:\Project>vagrant up
Bringing machine 'master' up with 'virtualbox' provider...
Bringing machine 'slave1' up with 'virtualbox' provider...
Bringing machine 'slave2' up with 'virtualbox' provider...
==> master: Importing base box 'ubuntu/trusty64'...
( omitted ) // Installing the JDK and other packages takes a while.
==> slave2: hadoop-1.2.1/src/test/system/java/org/apache/hadoop/mapred/TestTaskTrackerInfoSuccessfulFailedJobs.java
==> slave2: hadoop-1.2.1/src/test/system/java/org/apache/hadoop/mapred/TestTaskTrackerInfoTTProcess.java
==> slave2: hadoop-1.2.1/src/test/system/java/org/apache/hadoop/mapreduce/test/system/FinishTaskControlAction.java
==> slave2: hadoop-1.2.1/src/test/system/java/org/apache/hadoop/mapreduce/test/system/JTClient.java
==> slave2: hadoop-1.2.1/src/test/system/java/org/apac

Let's take note of the environment before moving on!

The VMs' SSH ports and account settings are as follows.

========== VM : SSH Port ==========
master : 2222 Port
slave1 : 2200 Port
slave2 : 2201 Port
※ Caution: be sure to check whether these ports are already in use!
Ports are assigned in the order master -> slave1 -> slave2, taken from the sequence below:
2222 -> 2200 -> 2201 -> 2202 -> 2203 -> 2204...
If a port in the sequence is already taken, the next number is used instead.
(That is, if port 2222 is in use, master's port becomes 2200.)
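
On Windows, one way to check whether these ports are already taken (using only built-in tools):

netstat -ano | findstr "2222 2200 2201"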

========== ID : Password ==========
root : vagrant
vagrant : vagrant
hadoop : hadoop
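
For example, to reach master from the host as the hadoop account using the table above (vagrant ssh master would log you in as vagrant instead):

ssh -p 2222 hadoop@127.0.0.1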

Sharing the SSH public key (on master only)

First, connect to the master VM, referring to the port table above (hadoop/hadoop). Then run the shell script that shares the public key; for each VM, answer 'yes' and then enter the password 'hadoop'.

cd ~/
./ssh_setting.sh

========== logs ==========
hadoop@master:/home/hadoop$ ./ssh_setting.sh
spawn ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop//.ssh/id_rsa):
Created directory '/home/hadoop//.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop//.ssh/id_rsa.
Your public key has been saved in /home/hadoop//.ssh/id_rsa.pub.
The key fingerprint is:
d8:dc:17:08:58:e3:9e:29:2c:58:ed:62:b4:94:81:7b hadoop@master
The key's randomart image is:
+--[ RSA 2048]----+
|   ..  o+        |
|  .  +.. o .     |
|   .= . . . .    |
|  .=E+ = +   .   |
|  ..= = S . .    |
|   . o .   .     |
|                 |
|                 |
|                 |
+-----------------+
The authenticity of host 'slave1 (192.168.200.10)' can't be established.
ECDSA key fingerprint is 92:9b:5b:12:56:98:84:00:28:4f:04:13:55:1a:62:63.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave1,192.168.200.10' (ECDSA) to the list of known hosts.
hadoop@slave1's password:
The authenticity of host 'slave2 (192.168.200.11)' can't be established.
ECDSA key fingerprint is c0:28:b5:f8:c5:40:3e:b1:8d:67:94:43:b5:0a:6c:75.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave2,192.168.200.11' (ECDSA) to the list of known hosts.
hadoop@slave2's password:

Verifying passwordless SSH access

Confirm that you can connect without a password as shown below, then leave each session with exit.

ssh hadoop@master
exit
ssh hadoop@slave1
exit
ssh hadoop@slave2
exit

========== logs ==========
hadoop@master:/home/hadoop$ ssh hadoop@slave1
Welcome to Ubuntu 14.04.1 LTS (GNU/Linux 3.13.0-44-generic x86_64)

 * Documentation:  https://help.ubuntu.com/

  System information as of Sat Jan 24 05:28:35 UTC 2015
( ์ƒ๋žต )

Formatting the Hadoop NameNode

cd ~/tools/hadoop/bin
./hadoop namenode -format

At the confirmation prompt shown in the logs below, be sure to enter an uppercase Y. (A lowercase y is not accepted.)

========== logs ==========
hadoop@master:/home/hadoop/tools/hadoop/bin$ ./hadoop namenode -format
Warning: $HADOOP_HOME is deprecated.

15/01/24 06:01:04 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = master/192.168.200.2
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.2.1
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
STARTUP_MSG:   java = 1.7.0_65
************************************************************/
Re-format filesystem in /home/hadoop/hdfs/name ? (Y or N) Y
15/01/24 06:01:06 INFO util.GSet: Computing capacity for map BlocksMap
15/01/24 06:01:06 INFO util.GSet: VM type       = 64-bit
15/01/24 06:01:06 INFO util.GSet: 2.0% max memory = 1013645312
15/01/24 06:01:06 INFO util.GSet: capacity      = 2^21 = 2097152 entries
15/01/24 06:01:06 INFO util.GSet: recommended=2097152, actual=2097152
15/01/24 06:01:06 INFO namenode.FSNamesystem: fsOwner=hadoop
15/01/24 06:01:06 INFO namenode.FSNamesystem: supergroup=supergroup
15/01/24 06:01:06 INFO namenode.FSNamesystem: isPermissionEnabled=true
15/01/24 06:01:06 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
15/01/24 06:01:06 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
15/01/24 06:01:06 INFO namenode.FSEditLog: dfs.namenode.edits.toleration.length = 0
15/01/24 06:01:06 INFO namenode.NameNode: Caching file names occuring more than 10 times
15/01/24 06:01:06 INFO common.Storage: Image file /home/hadoop/hdfs/name/current/fsimage of size 112 bytes saved in 0 seconds.
15/01/24 06:01:07 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/home/hadoop/hdfs/name/current/edits
15/01/24 06:01:07 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/home/hadoop/hdfs/name/current/edits
15/01/24 06:01:07 INFO common.Storage: Storage directory /home/hadoop/hdfs/name has been successfully formatted.
15/01/24 06:01:07 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/192.168.200.2
************************************************************/

Starting and Verifying Hadoop

./start-all.sh
jps

Run the jps command on each slave as well.
If the output shows

master : Jps, JobTracker, NameNode, SecondaryNameNode
slave : Jps, TaskTracker, DataNode

then everything is up and running normally.

========== logs ==========
hadoop@master:/home/hadoop/tools/hadoop/bin$ ./start-all.sh
Warning: $HADOOP_HOME is deprecated.

starting namenode, logging to /home/hadoop/tools/hadoop/logs/hadoop-hadoop-namenode-master.out
( omitted )
slave2:
slave2: starting tasktracker, logging to /home/hadoop/tools/hadoop/logs/hadoop-hadoop-tasktracker-slave2.out
slave1: Warning: $HADOOP_HOME is deprecated.
slave1:
slave1: starting tasktracker, logging to /home/hadoop/tools/hadoop/logs/hadoop-hadoop-tasktracker-slave1.out

hadoop@master:/home/hadoop/tools/hadoop/bin$ jps
13921 Jps
13847 JobTracker
13567 NameNode
13775 SecondaryNameNode

hadoop@slave1:~$ jps
13339 Jps
13241 TaskTracker
13104 DataNode
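
Besides jps, the Hadoop 1.x web UIs give a quick sanity check; from the host they should be reachable over the private network (assuming the IPs from the Vagrantfile):

NameNode : http://192.168.200.2:50070
JobTracker : http://192.168.200.2:50030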

Wordcount Test (using the examples jar)

To verify that Hadoop is working properly, let's run a Wordcount job.

cd ~/tools/hadoop/bin/
./hadoop dfs -mkdir input
./hadoop dfs -ls
./hadoop dfs -put ../LICENSE.txt input
./hadoop jar ../hadoop-examples-1.2.1.jar wordcount input output
./hadoop dfs -ls output
./hadoop dfs -cat output/part-r-00000

========== logs ==========
hadoop@master:/home/hadoop/tools/hadoop/bin$ ./hadoop dfs -mkdir input
hadoop@master:/home/hadoop/tools/hadoop/bin$ ./hadoop dfs -ls

Found 1 items
drwxr-xr-x   - hadoop supergroup          0 2015-01-24 06:10 /user/hadoop/input
hadoop@master:/home/hadoop/tools/hadoop/bin$ ./hadoop dfs -put ../LICENSE.txt input

hadoop@master:/home/hadoop/tools/hadoop/bin$ ./hadoop jar ../hadoop-examples-1.2.1.jar wordcount input output

15/01/24 06:10:55 INFO input.FileInputFormat: Total input paths to process : 1
15/01/24 06:10:55 INFO util.NativeCodeLoader: Loaded the native-hadoop library
15/01/24 06:10:55 WARN snappy.LoadSnappy: Snappy native library not loaded
15/01/24 06:10:55 INFO mapred.JobClient: Running job: job_201501240603_0001
15/01/24 06:10:56 INFO mapred.JobClient:  map 0% reduce 0%
15/01/24 06:11:04 INFO mapred.JobClient:  map 100% reduce 0%
15/01/24 06:11:12 INFO mapred.JobClient:  map 100% reduce 33%
15/01/24 06:11:13 INFO mapred.JobClient:  map 100% reduce 100%
15/01/24 06:11:15 INFO mapred.JobClient: Job complete: job_201501240603_0001
15/01/24 06:11:15 INFO mapred.JobClient: Counters: 29
( omitted )
15/01/24 06:11:15 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=1500114944
15/01/24 06:11:15 INFO mapred.JobClient:     Map output records=1887

hadoop@master:/home/hadoop/tools/hadoop/bin$ ./hadoop dfs -ls output

Found 3 items
-rw-r--r--   3 hadoop supergroup          0 2015-01-24 06:11 /user/hadoop/output/_SUCCESS
drwxr-xr-x   - hadoop supergroup          0 2015-01-24 06:10 /user/hadoop/output/_logs
-rw-r--r--   3 hadoop supergroup       7376 2015-01-24 06:11 /user/hadoop/output/part-r-00000

hadoop@master:/home/hadoop/tools/hadoop/bin$ ./hadoop dfs -cat output/part-r-00000

"AS     3
"Contribution"  1
"Contributor"   1
( ์ƒ๋žต )

Cloning from the Remote Repository

Create a directory for this assignment and fetch the materials with git.

cd
mkdir homework
cd homework
git clone https://github.com/dltkr77/Homework ./

========== logs ==========
hadoop@master:/home/hadoop/homework$ git clone https://github.com/dltkr77/Homework ./
Cloning into '.'...
remote: Counting objects: 59, done.
remote: Compressing objects: 100% (38/38), done.
remote: Total 59 (delta 12), reused 33 (delta 4)
Unpacking objects: 100% (59/59), done.
Checking connectivity... done.

Packaging MyTFIDF

cd ~/homework/MyTFIDF/
mvn package

========== logs ==========
hadoop@master:/home/hadoop/homework$ cd ~/homework/MyTFIDF/
hadoop@master:/home/hadoop/homework/MyTFIDF$ mvn package
[INFO] Scanning for projects...
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building MyTFIDF 1.0-SNAPSHOT
[INFO] ------------------------------------------------------------------------
( omitted )
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 15.704 s
[INFO] Finished at: 2015-01-27T01:15:01+00:00
[INFO] Final Memory: 17M/50M
[INFO] ------------------------------------------------------------------------

Running MyFreq

Let's run the MyFreq program that was built with Maven above.

cd ~/homework/files/
tar xvf shakespeare.tar.gz
hadoop dfs -put ~/homework/files/shakespeare shakespeare
cd ~/homework/MyTFIDF/target/
hadoop jar ./MyTFIDF-1.0-SNAPSHOT-jar-with-dependencies.jar MyTFIDF.MyFreq shakespeare myfreq_output
hadoop dfs -ls myfreq_output
hadoop dfs -cat myfreq_output/part-r-00000

========== logs ==========
hadoop@master:/home/hadoop/MyFreq$ cd ~/homework/files/
hadoop@master:/home/hadoop/homework/files$ tar xvf shakespeare.tar.gz
shakespeare/
shakespeare/comedies
shakespeare/glossary
shakespeare/histories
shakespeare/poems
shakespeare/tragedies
hadoop@master:/home/hadoop/files$ hadoop dfs -put ~/homework/files/shakespeare shakespeare
hadoop@master:/home/hadoop/homework/files$ cd ~/homework/MyTFIDF/target/
hadoop@master:/home/hadoop/homework/MyTFIDF/target$ hadoop jar ./MyTFIDF-1.0-SNAPSHOT-jar-with-dependencies.jar MyTFIDF.MyFreq shakespeare myfreq_output
15/01/25 04:26:10 INFO util.NativeCodeLoader: Loaded the native-hadoop library
15/01/25 04:26:10 WARN snappy.LoadSnappy: Snappy native library not loaded
15/01/25 04:26:10 INFO mapred.JobClient: Running job: job_201501240857_0019
15/01/25 04:26:11 INFO mapred.JobClient:  map 0% reduce 0%
15/01/25 04:26:28 INFO mapred.JobClient:  map 13% reduce 0%
15/01/25 04:26:29 INFO mapred.JobClient:  map 45% reduce 0%
15/01/25 04:26:31 INFO mapred.JobClient:  map 72% reduce 0%
15/01/25 04:26:32 INFO mapred.JobClient:  map 80% reduce 0%
15/01/25 04:26:37 INFO mapred.JobClient:  map 100% reduce 0%
15/01/25 04:26:42 INFO mapred.JobClient:  map 100% reduce 33%
15/01/25 04:26:46 INFO mapred.JobClient:  map 100% reduce 100%
( omitted )
15/01/25 04:26:47 INFO mapred.JobClient:     Physical memory (bytes) snapshot=1025097728
15/01/25 04:26:47 INFO mapred.JobClient:     Reduce output records=107523
15/01/25 04:26:47 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=4488859648
15/01/25 04:26:47 INFO mapred.JobClient:     Map output records=948560
hadoop@master:/home/hadoop/homework/MyTFIDF/target$ hadoop dfs -ls myfreq_output
Found 3 items
-rw-r--r--   3 hadoop supergroup          0 2015-01-25 04:26 /user/hadoop/myfreq_output/_SUCCESS
drwxr-xr-x   - hadoop supergroup          0 2015-01-25 04:26 /user/hadoop/myfreq_output/_logs
-rw-r--r--   3 hadoop supergroup    2109371 2015-01-25 04:26 /user/hadoop/myfreq_output/part-r-00000
hadoop@master:/home/hadoop/homework/MyTFIDF/target$ hadoop dfs -cat myfreq_output/part-r-00000
( omitted )
zeal tragedies  5
zealous comedies        2
zealous histories       3
zealous poems   1
zeals tragedies 1
zed tragedies   1
zenelophon comedies     1
zenith comedies 1
zephyrs tragedies       1
zir tragedies   2
zo tragedies    1
zodiac tragedies        1
zodiacs comedies        1
zone tragedies  1
zounds histories        15
zounds tragedies        6
zwaggered tragedies     1

Running MyCounts

hadoop jar MyTFIDF-1.0-SNAPSHOT-jar-with-dependencies.jar MyTFIDF.MyCounts myfreq_output mycounts_output
hadoop dfs -ls mycounts_output
hadoop dfs -cat mycounts_output/part-r-00000

========== logs ==========
( jar execution omitted )
hadoop@master:/home/hadoop/MyCounts$ hadoop dfs -ls mycounts_output
Found 3 items
-rw-r--r--   3 hadoop supergroup          0 2015-01-25 04:48 /user/hadoop/mycounts_output/_SUCCESS
drwxr-xr-x   - hadoop supergroup          0 2015-01-25 04:48 /user/hadoop/mycounts_output/_logs
-rw-r--r--   3 hadoop supergroup    2324417 2015-01-25 04:48 /user/hadoop/mycounts_output/part-r-00000
hadoop@master:/home/hadoop/MyCounts$ hadoop dfs -cat mycounts_output/part-r-00000
( omitted )
zany comedies   1 2
zany glossary   1 2
zeal tragedies  5 3
zeal comedies   8 3
zeal histories  21 3
zealous poems   1 3
zealous histories       3 3
zealous comedies        2 3
zeals tragedies 1 1
zed tragedies   1 1
zenelophon comedies     1 1
zenith comedies 1 1
zephyrs tragedies       1 1
zir tragedies   2 1
zo tragedies    1 1
zodiac tragedies        1 1
zodiacs comedies        1 1
zone tragedies  1 1
zounds tragedies        6 2
zounds histories        15 2
zwaggered tragedies     1 1
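
Each output line here is the word, the document class, the word's frequency in that class, and the number of classes the word appears in; e.g. 'zeal' occurs 5 times in tragedies and shows up in 3 classes overall.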

Running MyTFIDF

hadoop jar MyTFIDF-1.0-SNAPSHOT-jar-with-dependencies.jar MyTFIDF.MyTFIDF shakespeare mycounts_output mytfidf_output
hadoop dfs -ls mytfidf_output
hadoop dfs -cat mytfidf_output/part-r-00000

========== logs ==========
hadoop@master:/home/hadoop/homework/MyTFIDF/target$ hadoop dfs -ls mytfidf_output
Found 3 items
-rw-r--r--   3 hadoop supergroup          0 2015-01-25 04:58 /user/hadoop/mytfidf_output/_SUCCESS
drwxr-xr-x   - hadoop supergroup          0 2015-01-25 04:58 /user/hadoop/mytfidf_output/_logs
-rw-r--r--   3 hadoop supergroup    2478759 2015-01-25 04:58 /user/hadoop/mytfidf_output/part-r-00000
hadoop@master:/home/hadoop/homework/MyTFIDF/target$ hadoop dfs -cat mytfidf_output/part-r-00000
( omitted )
zany/comedies   0.6931471805599453
zany/glossary   0.6931471805599453
zeal/comedies   0.0
zeal/histories  0.0
zeal/tragedies  0.0
zealous/comedies        0.0
zealous/histories       0.0
zealous/poems   0.0
zeals/tragedies 1.6094379124341003
zed/tragedies   1.6094379124341003
zenelophon/comedies     1.6094379124341003
zenith/comedies 1.6094379124341003
zephyrs/tragedies       1.6094379124341003
zir/tragedies   3.2188758248682006
zo/tragedies    1.6094379124341003
zodiac/tragedies        1.6094379124341003
zodiacs/comedies        1.6094379124341003
zone/tragedies  1.6094379124341003
zounds/histories        10.39720770839918
zounds/tragedies        4.1588830833596715
zwaggered/tragedies     1.6094379124341003
