  - Data Analysis
{{project:hadoop-data-analysis-arch.png}}

{{project:kafka-spark.jpg}}
==== HDFS ====
{{project:hadoop-hdfs-arch.png?550}}
cat /opt/hadoop-2.7.4/etc/hadoop/slaves</code>
  - **Configuration of ''/opt/hadoop-2.7.4/etc/hadoop/mapred-site.xml'':**<code>
cat <<EOT>/opt/hadoop-2.7.4/etc/hadoop/mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.job.tracker</name>
        <value>hadoop-rpi1.labka.cz:5431</value>
    </property>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <!-- container sizes and JVM heaps are kept small to fit the limited RAM of the Raspberry Pi nodes -->
    <property>
        <name>mapreduce.map.memory.mb</name>
        <value>256</value>
    </property>
    <property>
        <name>mapreduce.map.java.opts</name>
        <value>-Xmx204m</value>
    </property>
    <property>
        <name>mapreduce.reduce.memory.mb</name>
        <value>102</value>
    </property>
    <property>
        <name>mapreduce.reduce.java.opts</name>
        <value>-Xmx102m</value>
    </property>
    <property>
        <name>yarn.app.mapreduce.am.resource.mb</name>
        <value>128</value>
    </property>
    <property>
        <name>yarn.app.mapreduce.am.command-opts</name>
        <value>-Xmx102m</value>
    </property>
</configuration>
EOT</code>
  - **Configuration of ''/opt/hadoop-2.7.4/etc/hadoop/hdfs-site.xml'':**<code>
cat <<EOT>/opt/hadoop-2.7.4/etc/hadoop/hdfs-site.xml
<configuration>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/opt/hadoop_tmp/hdfs/datanode</value>
        <final>true</final>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/opt/hadoop_tmp/hdfs/namenode</value>
        <final>true</final>
    </property>
    <property>
        <name>dfs.namenode.http-address</name>
        <value>master:50070</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
EOT</code>
  - **Configuration of ''/opt/hadoop-2.7.4/etc/hadoop/core-site.xml'':**<code>
cat <<EOT>/opt/hadoop-2.7.4/etc/hadoop/core-site.xml
<configuration>
    <!-- fs.default.name is the deprecated alias of fs.defaultFS; both point at the same NameNode URI -->
    <property>
        <name>fs.default.name</name>
        <value>hdfs://hadoop-rpi1.labka.cz:9000/</value>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop-rpi1.labka.cz:9000/</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/hadoop_tmp/hdfs/tmp</value>
    </property>
</configuration>
EOT</code>
  - **Configuration of ''/opt/hadoop-2.7.4/etc/hadoop/yarn-site.xml'':**<code>
cat <<EOF>/opt/hadoop-2.7.4/etc/hadoop/yarn-site.xml
<configuration>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8025</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8035</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8050</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <!-- each NodeManager advertises 4 vcores and 1024 MB; containers get 1-4 vcores and 128-1024 MB -->
    <property>
        <name>yarn.nodemanager.resource.cpu-vcores</name>
        <value>4</value>
    </property>
    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>1024</value>
    </property>
    <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>128</value>
    </property>
    <property>
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>1024</value>
    </property>
    <property>
        <name>yarn.scheduler.minimum-allocation-vcores</name>
        <value>1</value>
    </property>
    <property>
        <name>yarn.scheduler.maximum-allocation-vcores</name>
        <value>4</value>
    </property>
    <!-- virtual-memory checking is off; physical-memory checking stays on with a 4:1 vmem/pmem ratio -->
    <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
    </property>
    <property>
        <name>yarn.nodemanager.pmem-check-enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.nodemanager.vmem-pmem-ratio</name>
        <value>4</value>
    </property>
    <property>
        <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
        <value>98.5</value>
    </property>
</configuration>
EOF</code>
  - **Configuration of the ''/opt/hadoop_tmp'' storage:**<code>
# Repeat on all nodes
mkdir -p /opt/hadoop_tmp/hdfs/tmp
mkdir -p /opt/hadoop_tmp/hdfs/namenode
mkdir -p /opt/hadoop_tmp/hdfs/datanode
chown -R hduser:hadoop /opt/hadoop_tmp
chmod -R 750 /opt/hadoop_tmp</code>
  - **Start ''hdfs'' from the master node** (sanity checks for the configuration follow this list):<code>/opt/hadoop-2.7.4/bin/hdfs namenode -format
/opt/hadoop-2.7.4/sbin/start-dfs.sh
curl http://hadoop-rpi1.labka.cz:50070/
</code>
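With the daemons up, a few read-back checks confirm that the configuration above is the one actually in effect. A minimal sketch, assuming ''start-dfs.sh'' (and, for the YARN checks, ''start-yarn.sh'') has already been run as ''hduser'' on the master node:
<code>
# Hedged sanity checks - paths assume the Hadoop 2.7.4 install from this page
/opt/hadoop-2.7.4/bin/hdfs getconf -confKey fs.defaultFS   # expect hdfs://hadoop-rpi1.labka.cz:9000/
/opt/hadoop-2.7.4/bin/hdfs dfsadmin -report                # every DataNode should be listed as live
/opt/hadoop-2.7.4/bin/yarn node -list                      # every NodeManager should be RUNNING
jps                                                        # NameNode on the master, DataNode on the slaves
</code>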
  
==== Step 6: Flume ====
== Prerequisites: ==
  * **JDK 1.6 or a later Java version** installed on the machine.
  * **Memory** – sufficient memory for the configurations used by sources, channels and sinks.
  * **Disk space** – sufficient disk space for the configurations used by channels and sinks.
  * **Directory permissions** – read/write permissions for the directories used by the agent.
=== Flume Installation ===
  - **Download** the latest stable release of the Apache Flume binary distribution from the Apache download mirrors at [[http://flume.apache.org/download.html|Flume Download]]. At the time of writing, apache-flume-1.5.0 was the latest version, and ''apache-flume-1.5.0.1-bin.tar.gz'' is used for the installation below.
  - **Copy** ''apache-flume-1.5.0.1-bin.tar.gz'' from the downloads folder to the preferred Flume installation directory, usually ''/usr/lib/flume'', and **unpack** the tarball:<code>$ sudo mkdir /usr/lib/flume
$ sudo chmod -R 777 /usr/lib/flume
$ cp apache-flume-1.5.0.1-bin.tar.gz /usr/lib/flume/
$ cd /usr/lib/flume
$ tar -xzf apache-flume-1.5.0.1-bin.tar.gz</code>
  - **Set the ''FLUME_HOME'' and ''FLUME_CONF_DIR''** environment variables in the ''.bashrc'' file and add the Flume bin directory to the ''PATH'' environment variable (see the export sketch after this list):<code>$ vi ~/.bashrc</code>
  - **Edit:** in the ''FLUME_CONF_DIR'' directory, rename ''flume-env.sh.template'' to ''flume-env.sh'' and point the ''JAVA_HOME'' environment variable at the Java installation directory.
  - If **memory channels** are going to be used by the Flume agents, it is preferable to raise the memory limits in the ''JAVA_OPTS'' variable. By default the minimum and maximum heap sizes are 100 MB and 200 MB (''-Xms100m -Xmx200m''); better to raise these limits to **500 MB** and **1000 MB**:<code>JAVA_HOME="/path/to/java"
JAVA_OPTS="-Xms500m -Xmx1000m -Dcom.sun.management.jmxremote"</code>
  - **Done:** with these settings, the Flume installation can be considered complete.
  - **Verification:** verify the installation by running<code>$ flume-ng --help</code>on the terminal. If the output looks like the usage summary, the installation succeeded; a minimal agent example follows this list.
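For step 3, a minimal ''.bashrc'' sketch; the exact ''FLUME_HOME'' path is an assumption and must match wherever the tarball was actually unpacked:
<code>
# Hedged sketch for ~/.bashrc - adjust FLUME_HOME to the real unpack location
export FLUME_HOME=/usr/lib/flume/apache-flume-1.5.0.1-bin
export FLUME_CONF_DIR=$FLUME_HOME/conf
export PATH=$PATH:$FLUME_HOME/bin
</code>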
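To go one step beyond ''flume-ng --help'', the stock netcat-to-logger example from the Flume user guide makes a quick end-to-end smoke test; the file path and the agent name ''a1'' are arbitrary choices, not part of this page's setup:
<code>
# /tmp/netcat-example.conf - single agent: netcat source -> memory channel -> logger sink
a1.sources = r1
a1.channels = c1
a1.sinks = k1
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
a1.channels.c1.type = memory
a1.sinks.k1.type = logger
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
</code>
Run it with<code>$ flume-ng agent --conf $FLUME_CONF_DIR --conf-file /tmp/netcat-example.conf --name a1 -Dflume.root.logger=INFO,console</code>and send a line with ''nc localhost 44444''; it should show up in the logger output.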
==== Step 7: Oozie ====
== Prerequisites: ==
  * **Hadoop 2** is installed on the machine.
=== Oozie Installation ===
My Hadoop location: /opt/hadoop-2.7.4

  - From your home directory execute the following commands (my home directory is /home/hduser):<code>$ pwd
/home/hduser</code>
  - **Download Oozie:**<code>$ wget http://supergsego.com/apache/oozie/3.3.2/oozie-3.3.2.tar.gz</code>
  - **Untar:**<code>$ tar xvzf oozie-3.3.2.tar.gz</code>
  - **Build Oozie:**<code>$ cd oozie-3.3.2/bin
$ ./mkdistro.sh -DskipTests</code>
=== Oozie Server Setup ===
  - Copy the built binaries to the home directory as 'oozie':<code>$ cd ../../
$ cp -R oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/ oozie</code>
  - Create the required libext directory:<code>$ cd oozie
$ mkdir libext</code>
  - Copy all the required jars from hadooplibs to the libext directory:<code>$ cp ../oozie-3.3.2/hadooplibs/target/oozie-3.3.2-hadooplibs.tar.gz .
$ tar xzvf oozie-3.3.2-hadooplibs.tar.gz
$ cp oozie-3.3.2/hadooplibs/hadooplib-1.1.1.oozie-3.3.2/* libext/</code>
  - Get ExtJS – this library is not bundled with Oozie and needs to be downloaded separately; it is used for the Oozie Web Console:<code>$ cd libext
$ wget http://extjs.com/deploy/ext-2.2.zip
$ cd ..</code>
  - Update ''/opt/hadoop-2.7.4/etc/hadoop/core-site.xml'' as follows; here 'hduser' is the username and it belongs to the 'hadoop' group:<code><property>
    <name>hadoop.proxyuser.hduser.hosts</name>
    <value>localhost</value>
</property>
<property>
    <name>hadoop.proxyuser.hduser.groups</name>
    <value>hadoop</value>
</property></code>
  - Prepare the WAR file:<code>$ ./bin/oozie-setup.sh prepare-war

setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"

INFO: Adding extension: /home/hduser/oozie/libext/commons-beanutils-1.7.0.jar
INFO: Adding extension: /home/hduser/oozie/libext/commons-beanutils-core-1.8.0.jar
INFO: Adding extension: /home/hduser/oozie/libext/commons-codec-1.4.jar
INFO: Adding extension: /home/hduser/oozie/libext/commons-collections-3.2.1.jar
INFO: Adding extension: /home/hduser/oozie/libext/commons-configuration-1.6.jar
INFO: Adding extension: /home/hduser/oozie/libext/commons-digester-1.8.jar
INFO: Adding extension: /home/hduser/oozie/libext/commons-el-1.0.jar
INFO: Adding extension: /home/hduser/oozie/libext/commons-io-2.1.jar
INFO: Adding extension: /home/hduser/oozie/libext/commons-lang-2.4.jar
INFO: Adding extension: /home/hduser/oozie/libext/commons-logging-1.1.jar
INFO: Adding extension: /home/hduser/oozie/libext/commons-math-2.1.jar
INFO: Adding extension: /home/hduser/oozie/libext/commons-net-1.4.1.jar
INFO: Adding extension: /home/hduser/oozie/libext/hadoop-client-1.1.1.jar
INFO: Adding extension: /home/hduser/oozie/libext/hadoop-core-1.1.1.jar
INFO: Adding extension: /home/hduser/oozie/libext/hsqldb-1.8.0.7.jar
INFO: Adding extension: /home/hduser/oozie/libext/jackson-core-asl-1.8.8.jar
INFO: Adding extension: /home/hduser/oozie/libext/jackson-mapper-asl-1.8.8.jar
INFO: Adding extension: /home/hduser/oozie/libext/log4j-1.2.16.jar
INFO: Adding extension: /home/hduser/oozie/libext/oro-2.0.8.jar
INFO: Adding extension: /home/hduser/oozie/libext/xmlenc-0.52.jar

New Oozie WAR file with added 'ExtJS library, JARs' at /home/hduser/oozie/oozie-server/webapps/oozie.war

INFO: Oozie is ready to be started</code>
  - Create the sharelib on HDFS:<code>$ ./bin/oozie-setup.sh sharelib create -fs hdfs://localhost:54310
setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"
the destination path for sharelib is: /user/hduser/share/lib</code>
  - Create the Oozie DB:<code>$ ./bin/ooziedb.sh create -sqlfile oozie.sql -run
setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"

Validate DB Connection
DONE
Check DB schema does not exist
DONE
Check OOZIE_SYS table does not exist
DONE
Create SQL schema
DONE
Create OOZIE_SYS table
DONE

Oozie DB has been created for Oozie version '3.3.2'

The SQL commands have been written to: oozie.sql</code>
  - To start Oozie as a daemon use the following command:<code>$ ./bin/oozied.sh start

Setting OOZIE_HOME: /home/hduser/oozie
Setting OOZIE_CONFIG: /home/hduser/oozie/conf
Sourcing: /home/hduser/oozie/conf/oozie-env.sh
setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"
Setting OOZIE_CONFIG_FILE: oozie-site.xml
Setting OOZIE_DATA: /home/hduser/oozie/data
Setting OOZIE_LOG: /home/hduser/oozie/logs
Setting OOZIE_LOG4J_FILE: oozie-log4j.properties
Setting OOZIE_LOG4J_RELOAD: 10
Setting OOZIE_HTTP_HOSTNAME: rohit-VirtualBox
Setting OOZIE_HTTP_PORT: 11000
Setting OOZIE_ADMIN_PORT: 11001
Setting OOZIE_HTTPS_PORT: 11443
Setting OOZIE_BASE_URL: http://rohit-VirtualBox:11000/oozie
Setting CATALINA_BASE: /home/hduser/oozie/oozie-server
Setting OOZIE_HTTPS_KEYSTORE_FILE: /home/hduser/.keystore
Setting OOZIE_HTTPS_KEYSTORE_PASS: password
Setting CATALINA_OUT: /home/hduser/oozie/logs/catalina.out
Setting CATALINA_PID: /home/hduser/oozie/oozie-server/temp/oozie.pid

Using CATALINA_OPTS: -Xmx1024m -Dderby.stream.error.file=/home/hduser/oozie/logs/derby.log
Adding to CATALINA_OPTS: -Doozie.home.dir=/home/hduser/oozie -Doozie.config.dir=/home/hduser/oozie/conf -Doozie.log.dir=/home/hduser/oozie/logs -Doozie.data.dir=/home/hduser/oozie/data -Doozie.config.file=oozie-site.xml -Doozie.log4j.file=oozie-log4j.properties -Doozie.log4j.reload=10 -Doozie.http.hostname=rohit-VirtualBox -Doozie.admin.port=11001 -Doozie.http.port=11000 -Doozie.https.port=11443 -Doozie.base.url=http://rohit-VirtualBox:11000/oozie -Doozie.https.keystore.file=/home/hduser/.keystore -Doozie.https.keystore.pass=password -Djava.library.path=

Using CATALINA_BASE: /home/hduser/oozie/oozie-server
Using CATALINA_HOME: /home/hduser/oozie/oozie-server
Using CATALINA_TMPDIR: /home/hduser/oozie/oozie-server/temp
Using JRE_HOME: /usr/lib/jvm/java-6-oracle
Using CLASSPATH: /home/hduser/oozie/oozie-server/bin/bootstrap.jar
Using CATALINA_PID: /home/hduser/oozie/oozie-server/temp/oozie.pid</code>
  - To start Oozie as a foreground process use the following command:<code>$ ./bin/oozied.sh run</code> Check the Oozie log file logs/oozie.log to ensure Oozie started properly.
  - Use the following command to check the status of Oozie from the command line:<code>$ ./bin/oozie admin -oozie http://localhost:11000/oozie -status
System mode: NORMAL</code>
  - The URL for the Oozie Web Console is [[http://localhost:11000/oozie|Oozie Web Console]]. {{http://www.rohitmenon.com/wp-content/uploads/2013/12/OozieWebConsole.png|Oozie Web Console}}
=== Oozie Client Setup ===
  - **Installation:**<code>$ cd ..
$ cp oozie/oozie-client-3.3.2.tar.gz .
$ tar xvzf oozie-client-3.3.2.tar.gz
$ mv oozie-client-3.3.2 oozie-client
$ cd bin</code>
  - Add **/home/hduser/oozie-client/bin** to the ''PATH'' in .bashrc and restart your terminal.
  - Your Oozie server and client setup on a single-node cluster is now ready; a sketch of submitting a first workflow follows below.
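As a quick end-to-end check of the client against the server, one of the example workflows shipped with Oozie can be submitted. This is only a hedged sketch: the ''job.properties'' values below are hypothetical and must match the actual ''fs.defaultFS'' from core-site.xml and the HDFS path the examples were copied to.
<code>
# job.properties - hypothetical values, adjust to your cluster
nameNode=hdfs://localhost:54310
jobTracker=localhost:8021
queueName=default
oozie.wf.application.path=${nameNode}/user/hduser/examples/apps/map-reduce
</code>
Submit the job and query its state (the job id is printed by ''-run''):
<code>
$ oozie job -oozie http://localhost:11000/oozie -config job.properties -run
$ oozie job -oozie http://localhost:11000/oozie -info <job-id>
</code>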
  
==== Step 8: Zookeeper ====
  - https://www.xianic.net/post/installing-maven-on-the-raspberry-pi/
  - https://hadoop.apache.org/docs/r2.7.4/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html
  - https://developer.ibm.com/recipes/tutorials/building-a-hadoop-cluster-with-raspberry-pi/
  - http://hadooptutorial.info/apache-flume-installation/
  