How to run Flume's HelloWorld example
Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data.
For more information, see the Flume documentation on extending Flume via sink/source/decorator plugins (extending_via_sink_source_decorator_plugins).
Flume's HelloWorld example demonstrates an experimental plug-in mechanism that lets you add custom sources, sinks, and decorators to the Flume system.
1. The source code of the HelloWorld example should be under /usr/lib/flume/plugins/ (i.e. /usr/lib/flume/plugins/helloworld).
If you don't have it, download it from https://github.com/cloudera/flume/tree/master/plugins/
2. Change the build.xml as follows:
<?xml version="1.0"?>
<project name="flume-helloworld" default="jar">
  <property name="javac.debug" value="on"/>
  <property name="flume.base" value="/usr/lib/flume"/>

  <path id="classpath">
    <!-- in case we are running in dev env -->
    <pathelement location="${flume.base}/build/classes"/>
    <fileset dir="${flume.base}/lib">
      <include name="**/google-collect*.jar" />
      <include name="**/guava*.jar" />
      <include name="**/log4j-*.jar" />
      <include name="**/slf4j-*.jar" />
    </fileset>
    <!-- in case we are running in release env -->
    <fileset dir="${flume.base}/lib">
      <include name="flume-*.jar" />
    </fileset>
    <pathelement location="${flume.base}/lib/"/>
  </path>

  <target name="jar">
    <mkdir dir="build"/>
    <mkdir dir="build/classes"/>
    <javac srcdir="./src/java" destdir="build/classes" debug="${javac.debug}">
      <classpath refid="classpath"/>
    </javac>
    <jar jarfile="helloworld_plugin.jar" basedir="build/classes"/>
  </target>

  <target name="clean">
    <echo message="Cleaning generated files and stuff"/>
    <delete dir="build" />
    <delete file="helloworld_plugin.jar" />
  </target>
</project>
3. Run ant from the helloworld directory (see How to install Ant)
4. Run sudo su --shell=/bin/bash -l flume
5. Edit /etc/flume/conf.empty/flume-site.xml and set the property flume.plugin.classes to be helloworld.HelloWorldSink,helloworld.HelloWorldSource,helloworld.HelloWorldDecorator
6. Run sudo cp /usr/lib/flume/bin/flume-env.sh.template /usr/lib/flume/bin/flume-env.sh
Edit /usr/lib/flume/bin/flume-env.sh and add the following line: export FLUME_CLASSPATH=/usr/lib/flume/plugins/helloworld/helloworld_plugin.jar
To apply the setting to the current session as well, run export FLUME_CLASSPATH=/usr/lib/flume/plugins/helloworld/helloworld_plugin.jar. Then copy helloworld_plugin.jar into /usr/lib/flume/lib/
7. Copy the HelloWorld jar file to the Flume master machine and put it in the same location as on the node (i.e. /usr/lib/flume/lib/)
8. Repeat steps 5 and 6 on the master
9. Start the Flume node and the Flume master
10. Configure the node on the master: using the master's web interface, load the helloworld source/sink into your node_name node:
node_name: helloWorldSource() | helloWorldSink();
11. Go to your node_name machine and verify that an output file helloworld.txt has been created in its current working directory. Every 3 seconds a new “hello world!” line is written to the file.
[user@node-name ~]$ tail -f /usr/lib/flume/plugins/helloworld/helloworld.txt
Hello World!
Hello World!
Hello World!
Notes:
The FLUME_CLASSPATH value set by the export command is session specific and is lost when the session ends.
For Flume to find the helloworld plugin jar reliably, I recommend one of the following:
- Move or copy the jar into Flume's lib directory (i.e. /usr/lib/flume/lib/)
- Insert all your Flume specific export lines into /usr/lib/flume/bin/flume-env.sh
- Edit the /etc/profile file to include the line export FLUME_CLASSPATH=/usr/lib/flume/plugins/helloworld/helloworld_plugin.jar, and run source /etc/profile to reload the file.
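Whichever option you choose, it can help to confirm that the plugin classes are really loadable from the classpath you ended up with. The following is a minimal sketch of such a check. PluginClassCheck is a hypothetical helper written for this note, not part of Flume; it only uses the standard Java class loader and assumes you pass it the class names configured in flume.plugin.classes.

```java
// Hypothetical diagnostic helper (not part of Flume): reports whether the
// given class names can be loaded from the classpath the JVM was started with.
class PluginClassCheck {

    // Returns true if the class can be loaded; static initializers are not run.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, PluginClassCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            // The class itself is not on the classpath.
            return false;
        } catch (LinkageError e) {
            // The class was found, but something it depends on
            // (for example a Flume core class) is missing.
            return false;
        }
    }

    public static void main(String[] args) {
        // Example arguments: helloworld.HelloWorldSink helloworld.HelloWorldSource
        for (String name : args) {
            System.out.println(name + ": " + (isLoadable(name) ? "found" : "NOT found"));
        }
    }
}
```

After compiling it, you could run it with the same classpath Flume will see, for example with -cp set to the plugin jar plus the jars in /usr/lib/flume/lib, and pass helloworld.HelloWorldSink as the argument. The exact classpath is an assumption based on the paths used in this post.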
Like all collector sinks, the HelloWorldSink is occasionally closed and then reopened.
For helloworld.txt to preserve its content across these reopenings, you need to change the HelloWorldSink.java code to open the file in append mode.
Moreover, I recommend writing the file directly to the /tmp directory.
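To illustrate the append-mode change, here is a minimal sketch of the file handling only. AppendingHelloWriter is a hypothetical stand-in for the relevant part of HelloWorldSink.java; it does not use the Flume API, and the method names open, append and close are assumptions chosen to mirror a sink's lifecycle. The point is the second argument to FileWriter: true means existing content is kept when the file is reopened.

```java
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

// Hypothetical stand-in for the file handling inside HelloWorldSink.java.
// It only demonstrates opening the output file in append mode.
class AppendingHelloWriter {
    private final String path;
    private PrintWriter out;

    AppendingHelloWriter(String path) {
        this.path = path;
    }

    // Called every time the sink is opened or reopened.
    void open() throws IOException {
        // 'true' selects append mode, so earlier lines survive a reopen.
        out = new PrintWriter(new FileWriter(path, true));
    }

    // Writes one line and flushes so that tail -f shows it immediately.
    void append(String line) {
        out.println(line);
        out.flush();
    }

    void close() {
        if (out != null) {
            out.close();
            out = null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Default location follows the /tmp recommendation above.
        String path = args.length > 0 ? args[0] : "/tmp/helloworld.txt";
        AppendingHelloWriter writer = new AppendingHelloWriter(path);
        // Simulate two open/close cycles; both lines remain in the file.
        for (int cycle = 0; cycle < 2; cycle++) {
            writer.open();
            writer.append("Hello World!");
            writer.close();
        }
    }
}
```

With new FileWriter(path) instead, each reopen would truncate the file and only the lines written since the last open would remain.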