Differences
This shows you the differences between two versions of the page.
project:rpihadoop [2017/10/31 11:02] licho [Krok 7: Oozie] |
project:rpihadoop [2017/12/21 08:58] licho [Zpracovani Dat] |
||
---|---|---|---|
Line 11: | Line 11: | ||
- Data Analysis | - Data Analysis | ||
{{project:hadoop-data-analysis-arch.png}} | {{project:hadoop-data-analysis-arch.png}} | ||
+ | |||
+ | {{project:kafka-spark.jpg?650}} | ||
==== HDFS ==== | ==== HDFS ==== | ||
{{project:hadoop-hdfs-arch.png?550}} | {{project:hadoop-hdfs-arch.png?550}} | ||
Line 398: | Line 400: | ||
mkdir -p /opt/hadoop_tmp/hdfs/datanode | mkdir -p /opt/hadoop_tmp/hdfs/datanode | ||
chown -R hduser:hadoop /opt/hadoop_tmp | chown -R hduser:hadoop /opt/hadoop_tmp | ||
- | chmod -R 750 /opt/hadoop_tmp | + | chmod -R 750 /opt/hadoop_tmp</code> |
- | /opt/hadoop-2.7.4/bin/hdfs namenode -format</code> | + | - **Starting ''hdfs'' from the master node:**<code>/opt/hadoop-2.7.4/bin/hdfs namenode -format |
- | - **Starting ''hdfs'':**<code> | + |
/opt/hadoop-2.7.4/sbin/start-dfs.sh | /opt/hadoop-2.7.4/sbin/start-dfs.sh | ||
curl http://hadoop-rpi1.labka.cz:50070/ | curl http://hadoop-rpi1.labka.cz:50070/ | ||
Line 680: | Line 681: | ||
==== Step 6: Flume ===== | ==== Step 6: Flume ===== | ||
== Prerequisite: == | == Prerequisite: == | ||
- | * **JDK 1.6 or later versions of Java** installed on our Ubuntu machine. | + | * **JDK 1.6 or later versions of Java** installed on our machine. |
* **Memory** – Sufficient memory for configurations used by sources, channels or sinks. | * **Memory** – Sufficient memory for configurations used by sources, channels or sinks. | ||
* **Disk Space** – Sufficient disk space for configurations used by channels or sinks. | * **Disk Space** – Sufficient disk space for configurations used by channels or sinks. | ||
* **Directory Permissions** – Read/Write permissions for directories used by agent. | * **Directory Permissions** – Read/Write permissions for directories used by agent. | ||
- | == Apache Flume Installation == | + | === Flume Installation === |
- **Download** the latest stable release of the Apache Flume binary distribution from the Apache download mirrors at [[http://flume.apache.org/download.html|Flume Download]]. At the time of writing, apache-flume-1.5.0 is the latest version, and ''apache-flume-1.5.0.1-bin.tar.gz'' is used for the installation in this post. | - **Download** the latest stable release of the Apache Flume binary distribution from the Apache download mirrors at [[http://flume.apache.org/download.html|Flume Download]]. At the time of writing, apache-flume-1.5.0 is the latest version, and ''apache-flume-1.5.0.1-bin.tar.gz'' is used for the installation in this post. | ||
- **Copy** ''apache-flume-1.5.0.1-bin.tar.gz'' from the downloads folder to the preferred Flume installation directory, usually ''/usr/lib/flume'', and **unpack** the tarball. The following commands perform these steps: <code> $ sudo mkdir /usr/lib/flume $ sudo chmod -R 777 /usr/lib/flume | - **Copy** ''apache-flume-1.5.0.1-bin.tar.gz'' from the downloads folder to the preferred Flume installation directory, usually ''/usr/lib/flume'', and **unpack** the tarball. The following commands perform these steps: <code> $ sudo mkdir /usr/lib/flume $ sudo chmod -R 777 /usr/lib/flume | ||
Line 694: | Line 694: | ||
- **Edit:** In ''FLUME_CONF_DIR'' directory, rename flume-env.sh.template file to ''flume-env.sh'' and provide value for ''JAVA_HOME'' environment variable with Java installation directory. | - **Edit:** In ''FLUME_CONF_DIR'' directory, rename flume-env.sh.template file to ''flume-env.sh'' and provide value for ''JAVA_HOME'' environment variable with Java installation directory. | ||
- If we are going to use **memory channels** when setting up Flume agents, it is preferable to increase the memory limits in the ''JAVA_OPTS'' variable. By default, the minimum and maximum memory values are 100 MB and 200 MB respectively (-Xms100m -Xmx200m). It is better to raise these limits to **500 MB** and **1000 MB** respectively. Shell: <code>JAVA_HOME="path" | - If we are going to use **memory channels** when setting up Flume agents, it is preferable to increase the memory limits in the ''JAVA_OPTS'' variable. By default, the minimum and maximum memory values are 100 MB and 200 MB respectively (-Xms100m -Xmx200m). It is better to raise these limits to **500 MB** and **1000 MB** respectively. Shell: <code>JAVA_HOME="path" | ||
- | JAVA_OPTS="-Xms200m -Xmx800m -Dcom.sun.management.jmxremote"</code> | + | JAVA_OPTS="-Xms500m -Xmx1000m -Dcom.sun.management.jmxremote"</code> |
- **Work done:** With these settings, we can consider flume installation as completed. | - **Work done:** With these settings, we can consider flume installation as completed. | ||
- **Verification:** We can verify the flume installation with <code>$ flume-ng --help</code> command on terminal. If we get output similar to below then flume installation is successful. | - **Verification:** We can verify the flume installation with <code>$ flume-ng --help</code> command on terminal. If we get output similar to below then flume installation is successful. | ||
- | |||
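The ''flume-env.sh'' edits described in the Flume steps can be sketched as a small shell function. This is a minimal sketch, not the page author's script: the function name ''write_flume_env'' is hypothetical, it copies the template instead of renaming it so the original stays available, and the directory and JDK paths passed to it are assumptions to replace with your own.

```shell
#!/bin/sh
# Sketch only: write_flume_env is a hypothetical helper, not part of Flume.
# Usage: write_flume_env CONF_DIR JAVA_HOME_DIR
write_flume_env() {
    conf_dir=$1
    java_home=$2
    env_file="$conf_dir/flume-env.sh"

    # Create flume-env.sh from the shipped template if it does not exist yet.
    # (Copying instead of renaming is a cautious variant of the step above.)
    if [ ! -f "$env_file" ]; then
        cp "$conf_dir/flume-env.sh.template" "$env_file"
    fi

    # Append JAVA_HOME and the raised memory limits (500 MB min, 1000 MB max).
    {
        printf 'export JAVA_HOME="%s"\n' "$java_home"
        printf 'export JAVA_OPTS="-Xms500m -Xmx1000m -Dcom.sun.management.jmxremote"\n'
    } >> "$env_file"
}

# Example call (both paths are assumptions, adjust to your installation):
# write_flume_env /usr/lib/flume/apache-flume-1.5.0.1-bin/conf /usr/lib/jvm/default-java
```

Running the function twice appends the two lines twice; the later assignment wins when the file is sourced, but it is cleaner to call it once per installation.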
==== Step 7: Oozie ===== | ==== Step 7: Oozie ===== | ||
== Prerequisite: == | == Prerequisite: == | ||
* **Hadoop 2** is installed on our machine. | * **Hadoop 2** is installed on our machine. | ||
- | == Installation: == | + | === Oozie Installation === |
My Hadoop Location : /opt/hadoop-2.7.4 | My Hadoop Location : /opt/hadoop-2.7.4 | ||
Line 711: | Line 710: | ||
- **Build Oozie** <code>$ cd oozie-3.3.2/bin | - **Build Oozie** <code>$ cd oozie-3.3.2/bin | ||
$ ./mkdistro.sh -DskipTests</code> | $ ./mkdistro.sh -DskipTests</code> | ||
- | + | === Oozie Server Setup === | |
- | == Oozie Server Setup == | + | - Copy the built binaries to the home directory as ‘oozie’<code>$ cd ../../ |
- | + | ||
- | - Copy the built binaries to the home directory as ‘oozie’ | + | |
- | + | ||
- | <code>$ cd ../../ | + | |
$ cp -R oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/ oozie</code> | $ cp -R oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/ oozie</code> | ||
- | |||
- Create the required libext directory<code>$ cd oozie | - Create the required libext directory<code>$ cd oozie | ||
$ mkdir libext</code> | $ mkdir libext</code> | ||
Line 819: | Line 813: | ||
- Use the following command to check the status of Oozie from command line:<code>$ ./bin/oozie admin -oozie http://localhost:11000/oozie -status | - Use the following command to check the status of Oozie from command line:<code>$ ./bin/oozie admin -oozie http://localhost:11000/oozie -status | ||
System mode: NORMAL</code> | System mode: NORMAL</code> | ||
- | - URL for the Oozie Web Console is [[http://localhost:11000/oozie|http://localhost:11000/Oozie Web Console]] | + | - URL for the Oozie Web Console is [[http://localhost:11000/oozie|Oozie Web Console]]{{http://www.rohitmenon.com/wp-content/uploads/2013/12/OozieWebConsole.png|Oozie Web Console}} |
- | + | === Oozie Client Setup === | |
- | {{http://www.rohitmenon.com/wp-content/uploads/2013/12/OozieWebConsole.png|Oozie Web Console}} | + | |
- | == Oozie Client Setup == | + | |
- **Installation:** <code>$ cd .. | - **Installation:** <code>$ cd .. | ||
$ cp oozie/oozie-client-3.3.2.tar.gz . | $ cp oozie/oozie-client-3.3.2.tar.gz . | ||
Line 828: | Line 820: | ||
$ mv oozie-client-3.3.2 oozie-client | $ mv oozie-client-3.3.2 oozie-client | ||
$ cd bin</code> | $ cd bin</code> | ||
- | - Add the **/home/hduser/oozie-client/bin** to **PATH**in .bashrc and restart your terminal. | + | - Add the **/home/hduser/oozie-client/bin** to ''PATH'' in .bashrc and restart your terminal. |
- Your Oozie Server and Client setup on a single node cluster is now ready. In the next post, we will configure and schedule some Oozie workflows. | - Your Oozie Server and Client setup on a single node cluster is now ready. In the next post, we will configure and schedule some Oozie workflows. | ||
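The ''PATH'' step above can be done by hand in an editor; as an alternative, here is a minimal shell sketch that appends the export line only if it is not already present. The function name ''add_path_once'' is hypothetical, and the exact form of the export line is an assumption about how you want ''.bashrc'' to read.

```shell
#!/bin/sh
# Sketch only: add_path_once is a hypothetical helper.
# Usage: add_path_once RC_FILE DIRECTORY
add_path_once() {
    rc_file=$1
    dir=$2
    # The literal line to place in the rc file; $PATH stays unexpanded here.
    line="export PATH=\"\$PATH:$dir\""

    # -F fixed string, -x whole-line match, -q quiet.
    # Append only when the exact line is missing (or the file does not exist).
    if ! grep -Fxq "$line" "$rc_file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$rc_file"
    fi
}

# Example call, using the client directory from the step above:
# add_path_once "$HOME/.bashrc" /home/hduser/oozie-client/bin
```

After restarting the terminal, the status check shown earlier (''oozie admin -oozie http://localhost:11000/oozie -status'') should be callable without the ''./bin/'' prefix.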