  - Data Analysis
{{project:hadoop-data-analysis-arch.png}}

{{project:kafka-spark.jpg}}
==== HDFS ====
{{project:hadoop-hdfs-arch.png?550}}
mkdir -p /opt/hadoop_tmp/hdfs/datanode
chown -R hduser:hadoop /opt/hadoop_tmp
chmod -R 750 /opt/hadoop_tmp</code>
  - **Start ''hdfs'' from the master node:**<code>/opt/hadoop-2.7.4/bin/hdfs namenode -format
/opt/hadoop-2.7.4/sbin/start-dfs.sh
curl http://hadoop-rpi1.labka.cz:50070/
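# Optional check (our addition, not part of the original guide): print only the
# HTTP status code instead of the full page; 200 means the NameNode web UI is up:
#   curl -s -o /dev/null -w "%{http_code}\n" http://hadoop-rpi1.labka.cz:50070/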
==== Step 6: Flume ====
== Prerequisites ==
  * **JDK 1.6 or a later version of Java** installed on our machine.
  * **Memory** – sufficient memory for the configurations used by sources, channels and sinks.
  * **Disk space** – sufficient disk space for the configurations used by channels and sinks.
  * **Directory permissions** – read/write permissions for the directories used by the agent.
=== Flume Installation ===
  - **Download** the latest stable release of the Apache Flume binary distribution from the Apache download mirrors at [[http://flume.apache.org/download.html|Flume Download]]. At the time of writing, apache-flume-1.5.0 is the latest version, and ''apache-flume-1.5.0.1-bin.tar.gz'' is used for the installation in this post.
  - **Copy** ''apache-flume-1.5.0.1-bin.tar.gz'' from the downloads folder to the preferred Flume installation directory, usually ''/usr/lib/flume'', and **unpack** the tarball. Below are the commands to perform these steps:<code>
$ sudo mkdir /usr/lib/flume
$ sudo chmod -R 777 /usr/lib/flume
  - **Edit:** In the ''FLUME_CONF_DIR'' directory, rename the flume-env.sh.template file to ''flume-env.sh'' and set the ''JAVA_HOME'' environment variable to the Java installation directory.
  - If we are going to use **memory channels** when setting up Flume agents, it is preferable to increase the memory limits in the ''JAVA_OPTS'' variable. By default the minimum and maximum heap sizes are 100 MB and 200 MB respectively (-Xms100m -Xmx200m); it is better to raise these limits to **500 MB** and **1000 MB** respectively:<code>JAVA_HOME="/path/to/java"
JAVA_OPTS="-Xms500m -Xmx1000m -Dcom.sun.management.jmxremote"</code>
  - **Work done:** With these settings, the Flume installation can be considered complete.
  - **Verification:** We can verify the installation by running<code>$ flume-ng --help</code> on the terminal. If we get output similar to the usage listing below, the installation is successful.
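With the installation verified, a first agent can be defined. The sketch below follows the standard netcat-to-logger quickstart from the Flume user guide; the agent name ''a1'' and the path ''/tmp/example-flume.conf'' are illustrative assumptions of ours, not something this guide prescribes.
<code bash>
# Write a minimal agent definition: netcat source -> memory channel -> logger sink.
# Agent name "a1" and the file path are illustrative assumptions.
cat > /tmp/example-flume.conf <<'EOF'
a1.sources = r1
a1.channels = c1
a1.sinks = k1

a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

a1.sinks.k1.type = logger

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
EOF

# Sanity check: show the source type we just configured.
grep 'r1.type' /tmp/example-flume.conf
</code>
To actually start the agent (requires the installation above): ''flume-ng agent --conf /usr/lib/flume/conf --conf-file /tmp/example-flume.conf --name a1 -Dflume.root.logger=INFO,console''.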
  
==== Step 7: Oozie ====
== Prerequisites ==
  * **Hadoop 2** is installed on our machine.
=== Oozie Installation ===
My Hadoop location: /opt/hadoop-2.7.4
  
  - **Build Oozie:**<code>$ cd oozie-3.3.2/bin
$ ./mkdistro.sh -DskipTests</code>
=== Oozie Server Setup ===
  - Copy the built binaries to the home directory as ''oozie'':<code>$ cd ../../
$ cp -R oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/ oozie</code>
  - Use the following command to check the status of Oozie from the command line:<code>$ ./bin/oozie admin -oozie http://localhost:11000/oozie -status
System mode: NORMAL</code>
  - The URL for the Oozie Web Console is [[http://localhost:11000/oozie|Oozie Web Console]]{{http://www.rohitmenon.com/wp-content/uploads/2013/12/OozieWebConsole.png|Oozie Web Console}}
=== Oozie Client Setup ===
  - **Installation:**<code>$ cd ..
$ cp oozie/oozie-client-3.3.2.tar.gz .
$ mv oozie-client-3.3.2 oozie-client
$ cd bin</code>
  - Add **/home/hduser/oozie-client/bin** to ''PATH'' in .bashrc and restart your terminal.
  - Your Oozie server and client setup on a single-node cluster is now ready. In the next post, we will configure and schedule some Oozie workflows.
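For scripting, the status check above can be wrapped so that any mode other than NORMAL fails loudly. This is our own sketch: it assumes only that ''oozie admin -status'' prints a ''System mode: ...'' line as shown above, and the ''oozie_mode'' helper name is hypothetical.
<code bash>
# Extract the mode from a "System mode: NORMAL" line (format as shown above).
oozie_mode() {
  printf '%s\n' "$1" | sed -n 's/^System mode:[[:space:]]*//p'
}

# On a live cluster you would capture real output, e.g.:
#   status=$(./bin/oozie admin -oozie http://localhost:11000/oozie -status)
# Here we use the sample line from the status step above:
status="System mode: NORMAL"

mode=$(oozie_mode "$status")
echo "$mode"
[ "$mode" = "NORMAL" ] || { echo "Oozie is not healthy (mode: $mode)" >&2; exit 1; }
</code>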
  
  • project/rpihadoop.txt
  • Last modified: 2017/12/21 08:58
  • by licho