Differences
This shows you the differences between two versions of the page.
project:rpihadoop [2017/10/30 16:48] licho [Trocha teorie] |
project:rpihadoop [2017/12/21 08:58] (current) licho [Zpracovani Dat] |
||
---|---|---|---|
Line 11: | Line 11: | ||
- Data Analysis | - Data Analysis | ||
{{project:hadoop-data-analysis-arch.png}} | {{project:hadoop-data-analysis-arch.png}} | ||
+ | |||
+ | {{project:kafka-spark.jpg?650}} | ||
==== HDFS ==== | ==== HDFS ==== | ||
{{project:hadoop-hdfs-arch.png?550}} | {{project:hadoop-hdfs-arch.png?550}} | ||
Line 114: | Line 116: | ||
rm -f /etc/sudoers.d/010_pi-nopasswd | rm -f /etc/sudoers.d/010_pi-nopasswd | ||
rm -rf /home/pi</code> | rm -rf /home/pi</code> | ||
- | - **Update ntpd:**<code> | + | - **ntpd configuration:**<code> |
- | apt-get update | + | cat <<EOF> /etc/ntp.conf |
- | apt-get upgrade | + | driftfile /var/lib/ntp/ntp.drift |
+ | statsdir /var/log/ntpstats/ | ||
+ | statistics loopstats peerstats clockstats | ||
+ | filegen loopstats file loopstats type day enable | ||
+ | filegen peerstats file peerstats type day enable | ||
+ | filegen clockstats file clockstats type day enable | ||
+ | server ntp.nic.cz iburst prefer | ||
+ | server tik.cesnet.cz iburst | ||
+ | server tak.cesnet.cz iburst | ||
+ | pool 0.debian.pool.ntp.org iburst | ||
+ | pool 1.debian.pool.ntp.org iburst | ||
+ | pool 2.debian.pool.ntp.org iburst | ||
+ | pool 3.debian.pool.ntp.org iburst | ||
+ | restrict -4 default kod notrap nomodify nopeer noquery limited | ||
+ | restrict -6 default kod notrap nomodify nopeer noquery limited | ||
+ | restrict 127.0.0.1 | ||
+ | restrict ::1 | ||
+ | restrict source notrap nomodify noquery | ||
+ | EOF | ||
</code> | </code> | ||
- **Verify the configuration:**<code> | - **Verify the configuration:**<code> | ||
Line 145: | Line 165: | ||
cat /opt/hadoop-2.7.4/etc/hadoop/slaves</code> | cat /opt/hadoop-2.7.4/etc/hadoop/slaves</code> | ||
- **Configure ''/opt/hadoop-2.7.4/etc/hadoop/mapred-site.xml'':**<code> | - **Configure ''/opt/hadoop-2.7.4/etc/hadoop/mapred-site.xml'':**<code> | ||
- | cat <<EOF>/opt/hadoop-2.7.4/etc/hadoop/mapred-site.xml | + | cat <<EOT>/opt/hadoop-2.7.4/etc/hadoop/mapred-site.xml |
<configuration> | <configuration> | ||
- | <property> | + | <property> |
- | <name>mapreduce.framework.name</name> | + | <name>mapreduce.job.tracker</name> |
- | <value>yarn</value> | + | <value>hadoop-rpi1.labka.cz:5431</value> |
- | </property> | + | </property> |
- | <property> | + | <property> |
- | <name>mapreduce.map.memory.mb</name> | + | <name>mapreduce.framework.name</name> |
- | <value>256</value> | + | <value>yarn</value> |
- | </property> | + | </property> |
- | <property> | + | <property> |
- | <name>mapreduce.map.java.opts</name> | + | <name>mapreduce.map.memory.mb</name> |
- | <value>-Xmx204m</value> | + | <value>256</value> |
- | </property> | + | </property> |
- | <property> | + | <property> |
- | <name>mapreduce.reduce.memory.mb</name> | + | <name>mapreduce.map.java.opts</name> |
- | <value>102</value> | + | <value>-Xmx204m</value> |
- | </property> | + | </property> |
- | <property> | + | <property> |
- | <name>mapreduce.reduce.java.opts</name> | + | <name>mapreduce.reduce.memory.mb</name> |
- | <value>-Xmx102m</value> | + | <value>102</value> |
- | </property> | + | </property> |
- | <property> | + | <property> |
- | <name>yarn.app.mapreduce.am.resource.mb</name> | + | <name>mapreduce.reduce.java.opts</name> |
- | <value>128</value> | + | <value>-Xmx102m</value> |
- | </property> | + | </property> |
- | <property> | + | <property> |
- | <name>yarn.app.mapreduce.am.command-opts</name> | + | <name>yarn.app.mapreduce.am.resource.mb</name> |
- | <value>-Xmx102m</value> | + | <value>128</value> |
- | </property> | + | </property> |
+ | <property> | ||
+ | <name>yarn.app.mapreduce.am.command-opts</name> | ||
+ | <value>-Xmx102m</value> | ||
+ | </property> | ||
</configuration> | </configuration> | ||
- | EOF</code> | + | EOT</code> |
- **Configure ''/opt/hadoop-2.7.4/etc/hadoop/hdfs-site.xml'':**<code> | - **Configure ''/opt/hadoop-2.7.4/etc/hadoop/hdfs-site.xml'':**<code> | ||
- | cat <<EOF>/opt/hadoop-2.7.4/etc/hadoop/hdfs-site.xml | + | cat <<EOT>/opt/hadoop-2.7.4/etc/hadoop/hdfs-site.xml |
<configuration> | <configuration> | ||
- | <property> | + | <property> |
- | <name>dfs.replication</name> | + | <name>dfs.datanode.data.dir</name> |
- | <value>1</value> | + | <value>/opt/hadoop_tmp/hdfs/datanode</value> |
- | </property> | + | <final>true</final> |
- | <property> | + | </property> |
- | <name>dfs.name.dir</name> | + | <property> |
- | <value>file:///hdfs/namenode</value> | + | <name>dfs.namenode.name.dir</name> |
- | </property> | + | <value>/opt/hadoop_tmp/hdfs/namenode</value> |
- | <property> | + | <final>true</final> |
- | <name>dfs.data.dir</name> | + | </property> |
- | <value>file:///hdfs/datanode</value> | + | <property> |
- | </property> | + | <name>dfs.namenode.http-address</name> |
+ | <value>master:50070</value> | ||
+ | </property> | ||
+ | <property> | ||
+ | <name>dfs.replication</name> | ||
+ | <value>11</value> | ||
+ | </property> | ||
</configuration> | </configuration> | ||
- | EOF</code> | + | EOT</code> |
- **Configure ''/opt/hadoop-2.7.4/etc/hadoop/core-site.xml'':**<code> | - **Configure ''/opt/hadoop-2.7.4/etc/hadoop/core-site.xml'':**<code> | ||
- | cat <<EOF>/opt/hadoop-2.7.4/etc/hadoop/core-site.xml | + | cat <<EOT>/opt/hadoop-2.7.4/etc/hadoop/core-site.xml |
<configuration> | <configuration> | ||
- | <property> | + | <property> |
- | <name>fs.defaultFS</name> | + | <name>fs.default.name</name> |
- | <value>hdfs://hadoop-rpi1.labka.cz:9000</value> | + | <value>hdfs://hadoop-rpi1.labka.cz:9000/</value> |
- | </property> | + | </property> |
- | <property> | + | <property> |
- | <name>hadoop.tmp.dir</name> | + | <name>fs.defaultFS</name> |
- | <value>/hdfs/tmp</value> | + | <value>hdfs://hadoop-rpi1.labka.cz:9000/</value> |
- | </property> | + | </property> |
+ | <property> | ||
+ | <name>hadoop.tmp.dir</name> | ||
+ | <value>/opt/hadoop_tmp/hdfs/tmp</value> | ||
+ | </property> | ||
</configuration> | </configuration> | ||
- | EOF</code> | + | EOT</code> |
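The heap sizes in ''mapred-site.xml'' mostly follow one pattern: ''-Xmx204m'' is 80% of the 256 MB map container, and ''-Xmx102m'' is 80% of the 128 MB application master container. The reduce settings break the pattern (a 102 MB container with a 102 MB heap), which leaves no headroom for non-heap JVM memory. The sketch below reproduces the arithmetic. It assumes the 80% ratio is intended, which the page does not state, and ''heap_for'' is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: suggest a JVM heap size (MB) for a YARN container (MB).
# Assumption: heap = 80% of the container, inferred from the map and AM values.
heap_for() {
    echo $(( $1 * 80 / 100 ))
}

map_heap=$(heap_for 256)   # container size from mapreduce.map.memory.mb
am_heap=$(heap_for 128)    # container size from yarn.app.mapreduce.am.resource.mb

echo "map: -Xmx${map_heap}m"
echo "am:  -Xmx${am_heap}m"
```

If the same ratio were applied to the reduce side, a 128 MB container would pair with the existing ''-Xmx102m''; whether that matches the author's intent is not confirmed by the page.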
Line 210: | Line 244: | ||
cat <<EOF>/opt/hadoop-2.7.4/etc/hadoop/yarn-site.xml | cat <<EOF>/opt/hadoop-2.7.4/etc/hadoop/yarn-site.xml | ||
<configuration> | <configuration> | ||
- | <property> | + | <property> |
- | <name>yarn.resourcemanager.hostname</name> | + | <name>yarn.resourcemanager.resource-tracker.address</name> |
- | <value>hadoop-rpi1.labka.cz</value> | + | <value>master:8025</value> |
- | </property> | + | </property> |
- | <property> | + | <property> |
- | <name>yarn.resourcemanager.address</name> | + | <name>yarn.resourcemanager.scheduler.address</name> |
- | <value>hadoop-rpi1.labka.cz:8050</value> | + | <value>master:8035</value> |
- | </property> | + | </property> |
- | <property> | + | <property> |
- | <name>yarn.resourcemanager.scheduler.address</name> | + | <name>yarn.resourcemanager.address</name> |
- | <value>hadoop-rpi1.labka.cz:8030</value> | + | <value>master:8050</value> |
- | </property> | + | </property> |
- | <property> | + | <property> |
- | <name>yarn.resourcemanager.resource-tracker.address</name> | + | <name>yarn.nodemanager.aux-services</name> |
- | <value>hadoop-rpi1.labka.cz:8031</value> | + | <value>mapreduce_shuffle</value> |
- | </property> | + | </property> |
- | <property> | + | <property> |
- | <name>yarn.resourcemanager.webapp.address</name> | + | <name>yarn.nodemanager.resource.cpu-vcores</name> |
- | <value>hadoop-rpi1.labka.cz:8088</value> | + | <value>4</value> |
- | </property> | + | </property> |
- | <name>yarn.resourcemanager.admin.address</name> | + | <property> |
- | <value>hadoop-rpi1.labka.cz:8033</value> | + | <name>yarn.nodemanager.resource.memory-mb</name> |
- | </property> | + | <value>1024</value> |
- | <name>yarn.nodemanager.hostname</name> | + | </property> |
- | <value>hadoop-rpi1.labka.cz</value> | + | <property> |
- | </property> | + | <name>yarn.scheduler.minimum-allocation-mb</name> |
- | </property> | + | <value>128</value> |
- | <name>yarn.nodemanager.address</name> | + | </property> |
- | <value>hadoop-rpi1.labka.cz:8060</value> | + | <property> |
- | </property> | + | <name>yarn.scheduler.maximum-allocation-mb</name> |
- | </property> | + | <value>1024</value> |
- | <name>yarn.nodemanager.localizer.address</name> | + | </property> |
- | <value>hadoop-rpi1.labka.cz:8040</value> | + | <property> |
- | </property> | + | <name>yarn.scheduler.minimum-allocation-vcores</name> |
- | <property> | + | <value>1</value> |
- | <name>yarn.nodemanager.aux-services</name> | + | </property> |
- | <value>mapreduce_shuffle</value> | + | <property> |
- | </property> | + | <name>yarn.scheduler.maximum-allocation-vcores</name> |
- | <property> | + | <value>4</value> |
- | <name>yarn.nodemanager.resource.cpu-vcores</name> | + | </property> |
- | <value>4</value> | + | <property> |
- | </property> | + | <name>yarn.nodemanager.vmem-check-enabled</name> |
- | <property> | + | <value>false</value> |
- | <name>yarn.nodemanager.resource.memory-mb</name> | + | </property> |
- | <value>1024</value> | + | <property> |
- | </property> | + | <name>yarn.nodemanager.pmem-check-enabled</name> |
- | <property> | + | <value>true</value> |
- | <name>yarn.scheduler.minimum-allocation-mb</name> | + | </property> |
- | <value>128</value> | + | <property> |
- | </property> | + | <name>yarn.nodemanager.vmem-pmem-ratio</name> |
- | <property> | + | <value>4</value> |
- | <name>yarn.scheduler.maximum-allocation-mb</name> | + | </property> |
- | <value>1024</value> | + | <property> |
- | </property> | + | <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name> |
- | <property> | + | <value>98.5</value> |
- | <name>yarn.scheduler.minimum-allocation-vcores</name> | + | </property> |
- | <value>1</value> | + | |
- | </property> | + | |
- | <property> | + | |
- | <name>yarn.scheduler.maximum-allocation-vcores</name> | + | |
- | <value>4</value> | + | |
- | </property> | + | |
- | <property> | + | |
- | <name>yarn.nodemanager.vmem-check-enabled</name> | + | |
- | <value>false</value> | + | |
- | </property> | + | |
- | <property> | + | |
- | <name>yarn.nodemanager.pmem-check-enabled</name> | + | |
- | <value>true</value> | + | |
- | </property> | + | |
- | <property> | + | |
- | <name>yarn.nodemanager.vmem-pmem-ratio</name> | + | |
- | <value>4</value> | + | |
- | </property> | + | |
- | <property> | + | |
- | <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name> | + | |
- | <value>98.5</value> | + | |
- | </property> | + | |
</configuration> | </configuration> | ||
EOF</code> | EOF</code> | ||
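The memory settings in ''yarn-site.xml'' bound how many containers fit on one node. The sketch below works through the numbers. It assumes the scheduler rounds each request up to a multiple of ''yarn.scheduler.minimum-allocation-mb'' (the usual behaviour of the Capacity Scheduler, but worth verifying on the cluster); ''round_up'' is a hypothetical helper name.

```shell
#!/bin/sh
# Values taken from yarn-site.xml on this page.
NODE_MB=1024   # yarn.nodemanager.resource.memory-mb
MIN_MB=128     # yarn.scheduler.minimum-allocation-mb

# Hypothetical helper: round a request up to the next multiple of a step.
round_up() {
    echo $(( ($1 + $2 - 1) / $2 * $2 ))
}

map_mb=$(round_up 256 "$MIN_MB")      # mapreduce.map.memory.mb
reduce_mb=$(round_up 102 "$MIN_MB")   # mapreduce.reduce.memory.mb

echo "map container:    ${map_mb} MB, $(( NODE_MB / map_mb )) per node"
echo "reduce container: ${reduce_mb} MB, $(( NODE_MB / reduce_mb )) per node"
```

Under that assumption the 102 MB reduce request occupies a full 128 MB slot, so requesting 128 MB directly costs nothing extra.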
Line 384: | Line 396: | ||
- **Configure the ''/hdfs'' storage:**<code> | - **Configure the ''/hdfs'' storage:**<code> | ||
# Repeat on all nodes | # Repeat on all nodes | ||
- | mkdir -p /hdfs/tmp | + | mkdir -p /opt/hadoop_tmp/hdfs/tmp |
- | mkdir -p /hdfs/namenode | + | mkdir -p /opt/hadoop_tmp/hdfs/namenode |
- | mkdir -p /hdfs/datanode | + | mkdir -p /opt/hadoop_tmp/hdfs/datanode |
- | chown -R hduser:hadoop /hdfs/ | + | chown -R hduser:hadoop /opt/hadoop_tmp |
- | chmod -R 750 /hdfs/ | + | chmod -R 750 /opt/hadoop_tmp</code> |
- | /opt/hadoop-2.7.4/bin/hdfs namenode -format</code> | + | - **Start ''hdfs'' from the master node:**<code>/opt/hadoop-2.7.4/bin/hdfs namenode -format |
- | - **Start ''hdfs'':**<code> | + | |
/opt/hadoop-2.7.4/sbin/start-dfs.sh | /opt/hadoop-2.7.4/sbin/start-dfs.sh | ||
curl http://hadoop-rpi1.labka.cz:50070/ | curl http://hadoop-rpi1.labka.cz:50070/ | ||
Line 669: | Line 680: | ||
==== Step 6: Flume ====
- | http://hadooptutorial.info/apache-flume-installation/ | + | == Prerequisite: == |
- | | + | * **JDK 1.6 or later versions of Java** installed on our machine. |
+ | * **Memory** – Sufficient memory for configurations used by sources, channels or sinks. | ||
+ | * **Disk Space** – Sufficient disk space for configurations used by channels or sinks. | ||
+ | * **Directory Permissions** – Read/Write permissions for directories used by agent. | ||
+ | === Flume Installation === | ||
+ | - **Download** the latest stable Apache Flume binary distribution from the Apache download mirrors at [[http://flume.apache.org/download.html|Flume Download]]. At the time of writing, apache-flume-1.5.0 is the latest version, and ''apache-flume-1.5.0.1-bin.tar.gz'' is the archive used in this installation. | ||
+ | - **Copy** ''apache-flume-1.5.0.1-bin.tar.gz'' from the downloads folder to the preferred Flume installation directory, usually ''/usr/lib/flume'', and **unpack** the tarball:<code>$ sudo mkdir /usr/lib/flume | ||
+ | $ sudo chmod -R 777 /usr/lib/flume | ||
+ | $ cp apache-flume-1.5.0.1-bin.tar.gz /usr/lib/flume/ | ||
+ | $ cd /usr/lib/flume | ||
+ | $ tar -xzf apache-flume-1.5.0.1-bin.tar.gz</code> | ||
+ | - **Set the ''FLUME_HOME'' and ''FLUME_CONF_DIR''** environment variables in ''.bashrc'' and add the Flume ''bin'' directory to the ''PATH'' environment variable:<code>$ vi ~/.bashrc</code> | ||
+ | - **Edit:** In the ''FLUME_CONF_DIR'' directory, rename ''flume-env.sh.template'' to ''flume-env.sh'' and set ''JAVA_HOME'' to the Java installation directory. | ||
+ | - If the Flume agents will use **memory channels**, raise the memory limits in the ''JAVA_OPTS'' variable. The default minimum and maximum are 100 MB and 200 MB (''-Xms100m -Xmx200m''); **500 MB** and **1000 MB** are a better fit:<code>JAVA_HOME="path" | ||
+ | JAVA_OPTS="-Xms500m -Xmx1000m -Dcom.sun.management.jmxremote"</code> | ||
+ | - **Work done:** With these settings, the Flume installation is complete. | ||
+ | - **Verification:** Run <code>$ flume-ng --help</code> in a terminal. If it prints the usage text, the Flume installation succeeded. | ||
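The environment variables from the installation steps could look like the lines below when added to ''~/.bashrc''. This is a minimal sketch: the ''apache-flume-1.5.0.1-bin'' directory name is an assumption about what the tarball unpacks to under ''/usr/lib/flume'', so adjust it to the actual path.

```shell
# Assumed install location: tarball unpacked under /usr/lib/flume.
export FLUME_HOME=/usr/lib/flume/apache-flume-1.5.0.1-bin
# Flume reads flume-env.sh and agent configuration from this directory.
export FLUME_CONF_DIR="$FLUME_HOME/conf"
# Make the flume-ng launcher available on the command line.
export PATH="$PATH:$FLUME_HOME/bin"
```

After saving the file, ''source ~/.bashrc'' (or a new terminal) makes the variables visible to the shell.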
==== Step 7: Oozie ====
- | http://www.rohitmenon.com/index.php/apache-oozie-installation/ | + | == Prerequisite: == |
+ | * **Hadoop 2** is installed on our machine. | ||
+ | === Oozie Installation === | ||
+ | My Hadoop Location : /opt/hadoop-2.7.4 | ||
+ | |||
+ | - From your home directory execute the following commands (my home directory is /home/hduser):<code>$ pwd | ||
+ | /home/hduser</code> | ||
+ | - **Download Oozie: **<code>$ wget http://supergsego.com/apache/oozie/3.3.2/oozie-3.3.2.tar.gz</code> | ||
+ | - **Untar: **<code>$ tar xvzf oozie-3.3.2.tar.gz</code> | ||
+ | - **Build Oozie** <code>$ cd oozie-3.3.2/bin | ||
+ | $ ./mkdistro.sh -DskipTests</code> | ||
+ | === Oozie Server Setup === | ||
+ | - Copy the built binaries to the home directory as ‘oozie’<code>$ cd ../../ | ||
+ | $ cp -R oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/ oozie</code> | ||
+ | - Create the required libext directory<code>$ cd oozie | ||
+ | $ mkdir libext</code> | ||
+ | - Copy all the required jars from hadooplibs to the libext directory using the following command:<code>$ cp ../oozie-3.3.2/hadooplibs/target/oozie-3.3.2-hadooplibs.tar.gz . | ||
+ | $ tar xzvf oozie-3.3.2-hadooplibs.tar.gz | ||
+ | $ cp oozie-3.3.2/hadooplibs/hadooplib-1.1.1.oozie-3.3.2/* libext/</code> | ||
+ | - Get ExtJS 2.2. This library is not bundled with Oozie and must be downloaded separately; the Oozie Web Console depends on it:<code>$ cd libext | ||
+ | $ wget http://extjs.com/deploy/ext-2.2.zip | ||
+ | $ cd ..</code> | ||
+ | - Update **../hadoop/conf/core-site.xml** as follows:<code><property> | ||
+ | <name>hadoop.proxyuser.hduser.hosts</name> | ||
+ | <value>localhost</value> | ||
+ | </property> | ||
+ | <property> | ||
+ | <name>hadoop.proxyuser.hduser.groups</name> | ||
+ | <value>hadoop</value> | ||
+ | </property></code> | ||
+ | - Here, ‘hduser’ is the username and it belongs to ‘hadoop’ group. | ||
+ | - Prepare the WAR file<code>$ ./bin/oozie-setup.sh prepare-war | ||
+ | |||
+ | setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m" | ||
+ | |||
+ | INFO: Adding extension: /home/hduser/oozie/libext/commons-beanutils-1.7.0.jar | ||
+ | INFO: Adding extension: /home/hduser/oozie/libext/commons-beanutils-core-1.8.0.jar | ||
+ | INFO: Adding extension: /home/hduser/oozie/libext/commons-codec-1.4.jar | ||
+ | INFO: Adding extension: /home/hduser/oozie/libext/commons-collections-3.2.1.jar | ||
+ | INFO: Adding extension: /home/hduser/oozie/libext/commons-configuration-1.6.jar | ||
+ | INFO: Adding extension: /home/hduser/oozie/libext/commons-digester-1.8.jar | ||
+ | INFO: Adding extension: /home/hduser/oozie/libext/commons-el-1.0.jar | ||
+ | INFO: Adding extension: /home/hduser/oozie/libext/commons-io-2.1.jar | ||
+ | INFO: Adding extension: /home/hduser/oozie/libext/commons-lang-2.4.jar | ||
+ | INFO: Adding extension: /home/hduser/oozie/libext/commons-logging-1.1.jar | ||
+ | INFO: Adding extension: /home/hduser/oozie/libext/commons-math-2.1.jar | ||
+ | INFO: Adding extension: /home/hduser/oozie/libext/commons-net-1.4.1.jar | ||
+ | INFO: Adding extension: /home/hduser/oozie/libext/hadoop-client-1.1.1.jar | ||
+ | INFO: Adding extension: /home/hduser/oozie/libext/hadoop-core-1.1.1.jar | ||
+ | INFO: Adding extension: /home/hduser/oozie/libext/hsqldb-1.8.0.7.jar | ||
+ | INFO: Adding extension: /home/hduser/oozie/libext/jackson-core-asl-1.8.8.jar | ||
+ | INFO: Adding extension: /home/hduser/oozie/libext/jackson-mapper-asl-1.8.8.jar | ||
+ | INFO: Adding extension: /home/hduser/oozie/libext/log4j-1.2.16.jar | ||
+ | INFO: Adding extension: /home/hduser/oozie/libext/oro-2.0.8.jar | ||
+ | INFO: Adding extension: /home/hduser/oozie/libext/xmlenc-0.52.jar | ||
+ | |||
+ | New Oozie WAR file with added 'ExtJS library, JARs' at /home/hduser/oozie/oozie-server/webapps/oozie.war | ||
+ | |||
+ | INFO: Oozie is ready to be started</code> | ||
+ | - Create sharelib on HDFS<code>$ ./bin/oozie-setup.sh sharelib create -fs hdfs://localhost:54310 | ||
+ | setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m" | ||
+ | the destination path for sharelib is: /user/hduser/share/lib</code> | ||
+ | - Create the Oozie DB<code>$ ./bin/ooziedb.sh create -sqlfile oozie.sql -run | ||
+ | setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m" | ||
+ | |||
+ | Validate DB Connection | ||
+ | DONE | ||
+ | Check DB schema does not exist | ||
+ | DONE | ||
+ | Check OOZIE_SYS table does not exist | ||
+ | DONE | ||
+ | Create SQL schema | ||
+ | DONE | ||
+ | Create OOZIE_SYS table | ||
+ | DONE | ||
+ | |||
+ | Oozie DB has been created for Oozie version '3.3.2' | ||
+ | |||
+ | The SQL commands have been written to: oozie.sql</code> | ||
+ | - To start Oozie as a daemon use the following command:<code>$ ./bin/oozied.sh start | ||
+ | |||
+ | Setting OOZIE_HOME: /home/hduser/oozie | ||
+ | Setting OOZIE_CONFIG: /home/hduser/oozie/conf | ||
+ | Sourcing: /home/hduser/oozie/conf/oozie-env.sh | ||
+ | setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m" | ||
+ | Setting OOZIE_CONFIG_FILE: oozie-site.xml | ||
+ | Setting OOZIE_DATA: /home/hduser/oozie/data | ||
+ | Setting OOZIE_LOG: /home/hduser/oozie/logs | ||
+ | Setting OOZIE_LOG4J_FILE: oozie-log4j.properties | ||
+ | Setting OOZIE_LOG4J_RELOAD: 10 | ||
+ | Setting OOZIE_HTTP_HOSTNAME: rohit-VirtualBox | ||
+ | Setting OOZIE_HTTP_PORT: 11000 | ||
+ | Setting OOZIE_ADMIN_PORT: 11001 | ||
+ | Setting OOZIE_HTTPS_PORT: 11443 | ||
+ | Setting OOZIE_BASE_URL: http://rohit-VirtualBox:11000/oozie | ||
+ | Setting CATALINA_BASE: /home/hduser/oozie/oozie-server | ||
+ | Setting OOZIE_HTTPS_KEYSTORE_FILE: /home/hduser/.keystore | ||
+ | Setting OOZIE_HTTPS_KEYSTORE_PASS: password | ||
+ | Setting CATALINA_OUT: /home/hduser/oozie/logs/catalina.out | ||
+ | Setting CATALINA_PID: /home/hduser/oozie/oozie-server/temp/oozie.pid | ||
+ | |||
+ | Using CATALINA_OPTS: -Xmx1024m -Dderby.stream.error.file=/home/hduser/oozie/logs/derby.log | ||
+ | Adding to CATALINA_OPTS: -Doozie.home.dir=/home/hduser/oozie -Doozie.config.dir=/home/hduser/oozie/conf -Doozie.log.dir=/home/hduser/oozie/logs -Doozie.data.dir=/home/hduser/oozie/data -Doozie.config.file=oozie-site.xml -Doozie.log4j.file=oozie-log4j.properties -Doozie.log4j.reload=10 -Doozie.http.hostname=rohit-VirtualBox -Doozie.admin.port=11001 -Doozie.http.port=11000 -Doozie.https.port=11443 -Doozie.base.url=http://rohit-VirtualBox:11000/oozie -Doozie.https.keystore.file=/home/hduser/.keystore -Doozie.https.keystore.pass=password -Djava.library.path= | ||
+ | |||
+ | Using CATALINA_BASE: /home/hduser/oozie/oozie-server | ||
+ | Using CATALINA_HOME: /home/hduser/oozie/oozie-server | ||
+ | Using CATALINA_TMPDIR: /home/hduser/oozie/oozie-server/temp | ||
+ | Using JRE_HOME: /usr/lib/jvm/java-6-oracle | ||
+ | Using CLASSPATH: /home/hduser/oozie/oozie-server/bin/bootstrap.jar | ||
+ | Using CATALINA_PID: /home/hduser/oozie/oozie-server/temp/oozie.pid</code> | ||
+ | |||
+ | - To start Oozie as a foreground process use the following command:<code>$ ./bin/oozied.sh run</code> Check the Oozie log file logs/oozie.log to ensure Oozie started properly. | ||
+ | - Use the following command to check the status of Oozie from command line:<code>$ ./bin/oozie admin -oozie http://localhost:11000/oozie -status | ||
+ | System mode: NORMAL</code> | ||
+ | - URL for the Oozie Web Console is [[http://localhost:11000/oozie|Oozie Web Console]]{{http://www.rohitmenon.com/wp-content/uploads/2013/12/OozieWebConsole.png|Oozie Web Console}} | ||
+ | === Oozie Client Setup === | ||
+ | - **Installation:**<code>$ cd .. | ||
+ | $ cp oozie/oozie-client-3.3.2.tar.gz . | ||
+ | $ tar xvzf oozie-client-3.3.2.tar.gz | ||
+ | $ mv oozie-client-3.3.2 oozie-client | ||
+ | $ cd oozie-client/bin</code> | ||
+ | - Add the **/home/hduser/oozie-client/bin** to ''PATH'' in .bashrc and restart your terminal. | ||
+ | - The Oozie server and client setup on a single-node cluster is now ready. | ||
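One way to add the client directory to ''PATH'' in ''.bashrc'' without duplicating the entry each time the file is sourced is sketched below. ''path_append'' is a hypothetical helper name, not part of Oozie; the directory is the one named in the client setup step.

```shell
# Hypothetical helper: append a directory to PATH only if it is not there yet,
# so sourcing .bashrc repeatedly does not keep growing PATH.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present, nothing to do
        *) PATH="$PATH:$1" ;;       # not present, append it
    esac
}

path_append /home/hduser/oozie-client/bin
export PATH
```

With this in place, the ''oozie'' client command resolves from any directory once a new terminal is opened.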
==== Step 8: Zookeeper ====
Line 872: | Line 1021: | ||
- https://www.slideshare.net/EdurekaIN/hadoop-20-architecture-hdfs-federation-namenode-high-availability | - https://www.slideshare.net/EdurekaIN/hadoop-20-architecture-hdfs-federation-namenode-high-availability | ||
- https://www.xianic.net/post/installing-maven-on-the-raspberry-pi/ | - https://www.xianic.net/post/installing-maven-on-the-raspberry-pi/ | ||
+ | - https://hadoop.apache.org/docs/r2.7.4/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html | ||
+ | - https://developer.ibm.com/recipes/tutorials/building-a-hadoop-cluster-with-raspberry-pi/ | ||
+ | - http://hadooptutorial.info/apache-flume-installation/ | ||