Cloudera Enterprise 5.16.x

Setting Up Apache Flume Using the Command Line

Apache Flume is a distributed, reliable, and available system for efficiently collecting, aggregating, and moving large amounts of log data from many different sources to a centralized datastore.

    Note: To install Flume using Cloudera Manager, see Configuring Apache Flume.

    Installing the Flume RPM or Debian Packages

    Flume is distributed as three RPM or Debian packages:

    • flume-ng — Everything you need to run Flume
    • flume-ng-agent — Handles starting and stopping the Flume agent as a service
    • flume-ng-doc — Flume documentation
    All Flume installations require the common code provided by flume-ng.
      Note: Install Cloudera Repository
    Before using the instructions on this page to install or upgrade:
    • Install the Cloudera yum, zypper/YaST or apt repository.
    • Install or upgrade CDH 5 and make sure it is functioning correctly.
    For instructions, see Installing and Deploying Unmanaged CDH Using the Command Line and Upgrading Unmanaged CDH Using the Command Line.

    To install Flume on Ubuntu and other Debian systems:

    $ sudo apt-get install flume-ng

    To install Flume on RHEL-compatible systems:

    $ sudo yum install flume-ng

    To install Flume on SLES systems:

    $ sudo zypper install flume-ng

    To have Flume start automatically at system start-up, also install the Flume agent package, flume-ng-agent.

    To install the Flume agent so Flume starts automatically on Ubuntu and other Debian systems:

    $ sudo apt-get install flume-ng-agent

    To install the Flume agent so Flume starts automatically on RHEL-compatible systems:

    $ sudo yum install flume-ng-agent

    To install the Flume agent so Flume starts automatically on SLES systems:

    $ sudo zypper install flume-ng-agent

    To install the documentation on Ubuntu and other Debian systems:

    $ sudo apt-get install flume-ng-doc

    To install the documentation on RHEL-compatible systems:

    $ sudo yum install flume-ng-doc

    To install the documentation on SLES systems:

    $ sudo zypper install flume-ng-doc
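
The per-platform commands above differ only in the package manager. As a rough sketch, a small helper can pick the matching command on the current host. The function name flume_install_cmd is hypothetical and not part of any Flume package. It only prints the command, so nothing is installed until you run the printed line yourself.

```shell
#!/bin/sh
# Hypothetical helper: print the install command for whichever supported
# package manager is on PATH. The command is printed, not executed.
flume_install_cmd() {
    pkg="${1:-flume-ng}"   # flume-ng, flume-ng-agent, or flume-ng-doc
    if command -v apt-get >/dev/null 2>&1; then
        echo "sudo apt-get install $pkg"     # Ubuntu and other Debian systems
    elif command -v yum >/dev/null 2>&1; then
        echo "sudo yum install $pkg"         # RHEL-compatible systems
    elif command -v zypper >/dev/null 2>&1; then
        echo "sudo zypper install $pkg"      # SLES systems
    else
        echo "no supported package manager found" >&2
        return 1
    fi
}

# Show the commands for the core package and the agent package.
flume_install_cmd flume-ng || true
flume_install_cmd flume-ng-agent || true
```

Printing rather than executing keeps the sketch safe to try on any host, including one where the Cloudera repository is not yet configured.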

    Verifying the Flume Installation

    At this point, you should have everything necessary to run Flume, and the flume-ng command should be in your $PATH. You can test this by running:

    $ flume-ng help

    You should see something similar to this:

    Usage: /usr/bin/flume-ng <command> [options]...
    
    commands:
      help                  display this help text
      agent                 run a Flume agent
      avro-client           run an avro Flume client
      version               show Flume version info
    
    global options:
      --conf,-c <conf>      use configs in <conf> directory
      --classpath,-C <cp>   append to the classpath
      --dryrun,-d           do not actually start Flume, just print the command
      --Dproperty=value     sets a JDK system property value
    
    agent options:
      --conf-file,-f <file> specify a config file (required)
      --name,-n <name>      the name of this agent (required)
      --help,-h             display help text
    
    avro-client options:
      --rpcProps,-P <file>  RPC client properties file with server connection params
      --host,-H <host>      hostname to which events will be sent (required)
      --port,-p <port>      port of the avro source (required)
      --dirname <dir>       directory to stream to avro source
      --filename,-F <file>  text file to stream to avro source [default: std input]
      --headerFile,-R <file> headerFile containing headers as key/value pairs on each new line
      --help,-h             display help text
    
      Either --rpcProps or both --host and --port must be specified.
    
    Note that if <conf> directory is specified, then it is always included first
    in the classpath.
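
The check above can be wrapped in a short script that also exercises the version command and the --dryrun option listed in the help text. This is a minimal sketch, not an official procedure: the function name verify_flume is hypothetical, and the configuration directory /etc/flume-ng/conf, the file name flume.conf, and the agent name agent are assumptions that you should adjust for your installation.

```shell
#!/bin/sh
# Assumed defaults; override them in the environment if yours differ.
CONF_DIR="${CONF_DIR:-/etc/flume-ng/conf}"
CONF_FILE="${CONF_FILE:-$CONF_DIR/flume.conf}"
AGENT_NAME="${AGENT_NAME:-agent}"

# Hypothetical helper: confirm flume-ng is on PATH, print its version,
# then ask it to print (not run) the agent command via --dryrun.
verify_flume() {
    if ! command -v flume-ng >/dev/null 2>&1; then
        echo "flume-ng not found in PATH" >&2
        return 1
    fi
    flume-ng version
    flume-ng agent --conf "$CONF_DIR" --conf-file "$CONF_FILE" \
        --name "$AGENT_NAME" --dryrun
}

verify_flume || echo "Install the flume-ng package, then retry."
```

Because of --dryrun, the second call is expected only to print the command that would start the agent, which makes it a low-risk way to check the options before a real run.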

Page generated October 24, 2018.