Differences

This shows you the differences between two versions of the page.

project:rpihadoop [2017/10/31 11:12]
licho
project:rpihadoop [2017/12/21 08:58] (current)
licho [Data Processing]
Line 11: Line 11:
   - Data Analysis   - Data Analysis
 {{project:​hadoop-data-analysis-arch.png}} {{project:​hadoop-data-analysis-arch.png}}
 +
 +{{project:​kafka-spark.jpg?​650}}
 ==== HDFS ==== ==== HDFS ====
 {{project:​hadoop-hdfs-arch.png?​550}} {{project:​hadoop-hdfs-arch.png?​550}}
Line 398: Line 400:
 mkdir -p /​opt/​hadoop_tmp/​hdfs/​datanode mkdir -p /​opt/​hadoop_tmp/​hdfs/​datanode
 chown -R hduser:​hadoop /​opt/​hadoop_tmp chown -R hduser:​hadoop /​opt/​hadoop_tmp
-chmod -R 750 /opt/hadoop_tmp +chmod -R 750 /opt/hadoop_tmp</code>
-/opt/hadoop-2.7.4/bin/hdfs namenode -format</code> +  - **Starting ''hdfs'' from the master node:**<code>/opt/hadoop-2.7.4/bin/hdfs namenode -format
-  - **Starting ''hdfs'':**<code>+
 /​opt/​hadoop-2.7.4/​sbin/​start-dfs.sh /​opt/​hadoop-2.7.4/​sbin/​start-dfs.sh
 curl  http://​hadoop-rpi1.labka.cz:​50070/​ curl  http://​hadoop-rpi1.labka.cz:​50070/​
Line 680: Line 681:
 ==== Step 6: Flume ==== ==== Step 6: Flume ====
 == Prerequisite:​ == == Prerequisite:​ ==
-  * **JDK 1.6 or later versions of Java** installed on our Ubuntu ​machine. ​+  * **JDK 1.6 or later versions of Java** installed on our machine. ​
   * **Memory** – Sufficient memory for configurations used by sources, channels or sinks. ​   * **Memory** – Sufficient memory for configurations used by sources, channels or sinks. ​
   * **Disk Space** – Sufficient disk space for configurations used by channels or sinks. ​   * **Disk Space** – Sufficient disk space for configurations used by channels or sinks. ​
Line 693: Line 694:
   - **Edit:** In ''​FLUME_CONF_DIR''​ directory, rename flume-env.sh.template file to ''​flume-env.sh''​ and provide value for ''​JAVA_HOME''​ environment variable with Java installation directory. ​   - **Edit:** In ''​FLUME_CONF_DIR''​ directory, rename flume-env.sh.template file to ''​flume-env.sh''​ and provide value for ''​JAVA_HOME''​ environment variable with Java installation directory. ​
   - If we are going to use **memory channels** when setting up Flume agents, it is preferable to increase the memory limits in the ''JAVA_OPTS'' variable. By default, the minimum and maximum memory values are 100 MB and 200 MB respectively (-Xms100m -Xmx200m). It is better to increase these limits to **500 MB** and **1000 MB** respectively. Shell: <code>JAVA_HOME="path"   - If we are going to use **memory channels** when setting up Flume agents, it is preferable to increase the memory limits in the ''JAVA_OPTS'' variable. By default, the minimum and maximum memory values are 100 MB and 200 MB respectively (-Xms100m -Xmx200m). It is better to increase these limits to **500 MB** and **1000 MB** respectively. Shell: <code>JAVA_HOME="path"
-JAVA_OPTS="-Xms200m -Xmx800m -Dcom.sun.management.jmxremote"</code> +JAVA_OPTS="-Xms500m -Xmx1000m -Dcom.sun.management.jmxremote"</code>
   - **Work done:** With these settings, the Flume installation can be considered complete.   - **Work done:** With these settings, the Flume installation can be considered complete.
   - **Verification:** We can verify the Flume installation by running <code>$ flume-ng --help</code> in a terminal. If the command prints its usage text, the Flume installation is successful.   - **Verification:** We can verify the Flume installation by running <code>$ flume-ng --help</code> in a terminal. If the command prints its usage text, the Flume installation is successful.
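The Flume environment step (renaming the template and setting ''JAVA_HOME'' and ''JAVA_OPTS'') could be scripted roughly as follows. This is only a sketch: the scratch directory used when ''FLUME_CONF_DIR'' is unset and the fallback JDK path are assumptions made for illustration, not values taken from this page.

```shell
# Assumption: FLUME_CONF_DIR points at the Flume conf directory.
# When it is unset, a scratch directory is used so the sketch stays harmless.
FLUME_CONF_DIR="${FLUME_CONF_DIR:-$(mktemp -d)}"

# Assumption: hypothetical fallback JDK location; replace with the real one.
JAVA_HOME_VALUE="${JAVA_HOME:-/usr/lib/jvm/default-java}"

# Start from the shipped template when present, otherwise from an empty file.
if [ -f "$FLUME_CONF_DIR/flume-env.sh.template" ]; then
    cp "$FLUME_CONF_DIR/flume-env.sh.template" "$FLUME_CONF_DIR/flume-env.sh"
else
    : > "$FLUME_CONF_DIR/flume-env.sh"
fi

# Append the two settings: the JDK path and the raised heap limits
# (500 MB initial, 1000 MB maximum).
cat >> "$FLUME_CONF_DIR/flume-env.sh" <<EOF
export JAVA_HOME="$JAVA_HOME_VALUE"
export JAVA_OPTS="-Xms500m -Xmx1000m -Dcom.sun.management.jmxremote"
EOF

# Show the resulting settings.
grep '^export JAVA_' "$FLUME_CONF_DIR/flume-env.sh"
```

The sketch copies the template instead of renaming it, so the original file remains available for reference; renaming works just as well.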
Line 709: Line 710:
   - **Build Oozie** <​code>​$ cd oozie-3.3.2/​bin   - **Build Oozie** <​code>​$ cd oozie-3.3.2/​bin
 $ ./​mkdistro.sh -DskipTests</​code>​ $ ./​mkdistro.sh -DskipTests</​code>​
-== Oozie Server Setup ==+=== Oozie Server Setup ===
   - Copy the built binaries to the home directory as ‘oozie’<​code>​$ cd ../../   - Copy the built binaries to the home directory as ‘oozie’<​code>​$ cd ../../
 $ cp -R oozie-3.3.2/​distro/​target/​oozie-3.3.2-distro/​oozie-3.3.2/​ oozie</​code>​ $ cp -R oozie-3.3.2/​distro/​target/​oozie-3.3.2-distro/​oozie-3.3.2/​ oozie</​code>​
Line 812: Line 813:
   - Use the following command to check the status of Oozie from command line:<​code>​$ ./bin/oozie admin -oozie http://​localhost:​11000/​oozie -status   - Use the following command to check the status of Oozie from command line:<​code>​$ ./bin/oozie admin -oozie http://​localhost:​11000/​oozie -status
 System mode: NORMAL</​code>​ System mode: NORMAL</​code>​
-  - URL for the Oozie Web Console is [[http://​localhost:​11000/​oozie|http://​localhost:​11000/​Oozie Web Console]] +  - URL for the Oozie Web Console is [[http://​localhost:​11000/​oozie|Oozie Web Console]]{{http://​www.rohitmenon.com/​wp-content/​uploads/​2013/​12/​OozieWebConsole.png|Oozie Web Console}} 
- +=== Oozie Client Setup ===
-{{http://​www.rohitmenon.com/​wp-content/​uploads/​2013/​12/​OozieWebConsole.png|Oozie Web Console}} +
-== Oozie Client Setup ==+
   - **Installation:**<code>$ cd ..   - **Installation:**<code>$ cd ..
 $ cp oozie/​oozie-client-3.3.2.tar.gz . $ cp oozie/​oozie-client-3.3.2.tar.gz .
Line 821: Line 820:
 $ mv oozie-client-3.3.2 oozie-client $ mv oozie-client-3.3.2 oozie-client
 $ cd bin</​code>​ $ cd bin</​code>​
-  - Add the **/​home/​hduser/​oozie-client/​bin** to **PATH**in .bashrc and restart your terminal.+  - Add the **/​home/​hduser/​oozie-client/​bin** to ''​PATH'' ​in .bashrc and restart your terminal.
   - Your Oozie Server and Client setup on a single node cluster is now ready. In the next post, we will configure and schedule some Oozie workflows.   - Your Oozie Server and Client setup on a single node cluster is now ready. In the next post, we will configure and schedule some Oozie workflows.
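The ''PATH'' step of the Oozie client setup can be made repeatable with a small helper that appends the entry only once. A minimal sketch, assuming a POSIX shell: ''add_path_entry'' and ''RC_FILE'' are hypothetical names introduced here, and the demonstration writes to a scratch file instead of the real ''.bashrc''.

```shell
# add_path_entry DIR RCFILE
# Hypothetical helper (not part of Oozie): appends an export line for DIR
# to RCFILE unless the exact same line is already present.
add_path_entry() {
    dir="$1"
    rcfile="$2"
    line="export PATH=\"\$PATH:$dir\""
    # -F fixed string, -x whole-line match, -q quiet
    grep -Fqx "$line" "$rcfile" 2>/dev/null || printf '%s\n' "$line" >> "$rcfile"
}

# Demonstration against a scratch file; set RC_FILE="$HOME/.bashrc" for real use.
RC_FILE="${RC_FILE:-$(mktemp)}"
add_path_entry /home/hduser/oozie-client/bin "$RC_FILE"
add_path_entry /home/hduser/oozie-client/bin "$RC_FILE"   # second call changes nothing
cat "$RC_FILE"
```

Because the helper checks for the exact line first, running the setup twice does not grow ''PATH'' with duplicate entries.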
  
  • project/rpihadoop.1509444772.txt.gz
  • Last modified: 2017/10/31 11:12
  • by licho