org.apache.flume.source
Class ExecSource

java.lang.Object
  extended by org.apache.flume.source.AbstractSource
      extended by org.apache.flume.source.ExecSource
All Implemented Interfaces:
Configurable, EventDrivenSource, LifecycleAware, NamedComponent, Source

public class ExecSource
extends AbstractSource
implements EventDrivenSource, Configurable

A Source implementation that executes a Unix process and turns each line of text into an event.

This source runs a given Unix command on startup and expects that process to continuously produce data on standard output (stderr is ignored by default). Unless configured to restart, if the process exits for any reason, the source also exits and produces no further data. Commands such as cat [named pipe] or tail -F [file] therefore produce the desired results, whereas date probably will not: the former two commands produce streams of data, whereas the latter produces a single event and exits.
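
The line-per-event behaviour described above can be sketched with the JDK alone. This is only an illustration and is not Flume's ExecSource code: the class LineEventSketch and its methods toEvents and run are hypothetical names, an "event" is modelled as a plain byte array, and ProcessBuilder.Redirect.DISCARD assumes Java 9 or later.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration only; this is not Flume's ExecSource implementation.
class LineEventSketch {

    // Turns each line of text from the reader into one "event" (here, a byte array).
    // The method returns only when the stream ends, i.e. when the process has exited.
    static List<byte[]> toEvents(BufferedReader reader) {
        return reader.lines()
                .map(line -> line.getBytes(StandardCharsets.UTF_8))
                .collect(Collectors.toList());
    }

    // Runs a command, discards its stderr, and collects stdout line by line.
    static List<byte[]> run(String... command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectError(ProcessBuilder.Redirect.DISCARD) // stderr ignored (Java 9+)
                .start();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            List<byte[]> events = toEvents(reader);
            process.waitFor();
            return events;
        }
    }

    public static void main(String[] args) throws Exception {
        // "date" writes a single line and exits, so the stream ends after one event.
        List<byte[]> events = run("date");
        System.out.println(events.size() + " event(s) before the process exited");
    }
}
```

With a command such as tail -F [file], toEvents would not return while the process keeps running, which is why a real source has to hand events on as they arrive rather than collect them into a list.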

The ExecSource is meant for situations where one must integrate with existing systems without modifying code. It is a compatibility gateway built to allow simple, stop-gap integration and doesn't necessarily offer all of the benefits or guarantees of native integration with Flume. If one has the option of using the AvroSource, for instance, that would be greatly preferred to this source, as it (and similarly implemented sources) can maintain the transactional guarantees that exec cannot.

Why doesn't ExecSource offer transactional guarantees?

The problem with ExecSource and other asynchronous sources is that the source cannot guarantee that the client learns of a failure to put an event into the Channel. For example, one of the most commonly requested features is the tail -F [file]-like use case, where an application writes to a log file on disk and Flume tails the file, sending each line as an event. While this is possible, there's an obvious problem: what happens if the channel fills up and Flume can't send an event? Flume has no way of indicating to the application writing the log file that it needs to retain the log or that the event hasn't been sent. If this doesn't make sense, you need only know this: your application can never guarantee data has been received when using a unidirectional asynchronous interface such as ExecSource! As an extension of this warning, and to be completely clear, there is absolutely zero guarantee of event delivery when using this source. You have been warned.
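
The full-channel problem can be modelled in a few lines. In this sketch a bounded java.util.concurrent queue stands in for a Flume Channel; the class OneWayLossSketch and the method forward are hypothetical names, and nothing here uses Flume's API. Whether a real deployment drops, blocks, or retries depends on the configuration, so treat the drop as one possible outcome.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical model; the queue stands in for a Flume Channel and is not Flume's API.
class OneWayLossSketch {

    // Offers each line to the bounded channel and returns how many did not fit.
    static int forward(String[] lines, BlockingQueue<String> channel) {
        int dropped = 0;
        for (String line : lines) {
            // offer() returns false when the queue is full. The application that wrote
            // the line has already moved on; there is no path back to tell it to
            // retain or resend the data.
            if (!channel.offer(line)) {
                dropped++;
            }
        }
        return dropped;
    }

    public static void main(String[] args) {
        BlockingQueue<String> channel = new ArrayBlockingQueue<>(2); // room for 2 events
        String[] logLines = {"l1", "l2", "l3", "l4", "l5"};
        int dropped = forward(logLines, channel);
        // prints "delivered=2 dropped=3"
        System.out.println("delivered=" + channel.size() + " dropped=" + dropped);
    }
}
```

The count of dropped lines is visible only inside forward. The writer of the log lines never sees it, which is the sense in which the interface is unidirectional.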

Configuration options

Parameter        Description                                                         Unit / Type  Default
command          The command to execute                                              String       none (required)
restart          Whether to restart the command when it exits                        Boolean      false
restartThrottle  How long in milliseconds to wait before restarting the command      Long         10000
logStderr        Whether to log or discard the standard error stream of the command  Boolean      false
batchSize        The number of events to commit to channel at a time                 Integer      20
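
To make the defaults concrete, here is a minimal sketch that reads the five options from a java.util.Properties object and falls back to the documented defaults. The class ExecSettingsSketch is hypothetical and is not how Flume passes configuration (the real entry point is configure(Context)); the bare key names and the example log path are assumptions made for illustration.

```java
import java.util.Properties;

// Hypothetical holder for the documented options; not part of Flume's API.
class ExecSettingsSketch {
    final String command;
    final boolean restart;
    final long restartThrottle; // milliseconds
    final boolean logStderr;
    final int batchSize;

    ExecSettingsSketch(Properties props) {
        command = props.getProperty("command");
        if (command == null || command.isEmpty()) {
            // "command" has no default, so it must be supplied.
            throw new IllegalArgumentException("command is required");
        }
        // The second argument of getProperty is the documented default.
        restart = Boolean.parseBoolean(props.getProperty("restart", "false"));
        restartThrottle = Long.parseLong(props.getProperty("restartThrottle", "10000"));
        logStderr = Boolean.parseBoolean(props.getProperty("logStderr", "false"));
        batchSize = Integer.parseInt(props.getProperty("batchSize", "20"));
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("command", "tail -F /var/log/app.log"); // example path, an assumption
        props.setProperty("restart", "true");
        ExecSettingsSketch s = new ExecSettingsSketch(props);
        System.out.println(s.command + " restart=" + s.restart
                + " restartThrottle=" + s.restartThrottle + " batchSize=" + s.batchSize);
    }
}
```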

Metrics

TODO


Constructor Summary
ExecSource()
           
 
Method Summary
 void configure(Context context)
           Request the implementing class to (re)configure itself.
 void start()
           Starts a service or component.
 void stop()
           Stops a service or component.
 
Methods inherited from class org.apache.flume.source.AbstractSource
getChannelProcessor, getLifecycleState, getName, setChannelProcessor, setName, toString
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface org.apache.flume.Source
getChannelProcessor, setChannelProcessor
 
Methods inherited from interface org.apache.flume.lifecycle.LifecycleAware
getLifecycleState
 
Methods inherited from interface org.apache.flume.NamedComponent
getName, setName
 

Constructor Detail

ExecSource

public ExecSource()
Method Detail

start

public void start()
Description copied from interface: LifecycleAware

Starts a service or component.

Implementations should determine the result of any start logic and affect the return value of LifecycleAware.getLifecycleState() accordingly.

Specified by:
start in interface LifecycleAware
Overrides:
start in class AbstractSource

stop

public void stop()
Description copied from interface: LifecycleAware

Stops a service or component.

Implementations should determine the result of any stop logic and affect the return value of LifecycleAware.getLifecycleState() accordingly.

Specified by:
stop in interface LifecycleAware
Overrides:
stop in class AbstractSource
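
The contract described for start() and stop(), where the outcome of the lifecycle logic is reflected in the reported state, might look roughly like the following. The class LifecycleSketch and its nested State enum are hypothetical stand-ins written for this example; they do not implement or reproduce Flume's LifecycleAware interface.

```java
// Hypothetical stand-ins; Flume's own lifecycle types are not used here.
class LifecycleSketch {
    enum State { IDLE, START, STOP, ERROR }

    private volatile State state = State.IDLE;
    private final Runnable startLogic;

    LifecycleSketch(Runnable startLogic) {
        this.startLogic = startLogic;
    }

    void start() {
        try {
            startLogic.run();
            state = State.START; // start logic succeeded
        } catch (RuntimeException e) {
            state = State.ERROR; // start logic failed; report it through the state
        }
    }

    void stop() {
        state = State.STOP;
    }

    State getLifecycleState() {
        return state;
    }

    public static void main(String[] args) {
        LifecycleSketch ok = new LifecycleSketch(() -> { });
        ok.start();
        System.out.println(ok.getLifecycleState()); // prints "START"

        LifecycleSketch broken = new LifecycleSketch(() -> {
            throw new IllegalStateException("command could not be launched");
        });
        broken.start();
        System.out.println(broken.getLifecycleState()); // prints "ERROR"
    }
}
```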

configure

public void configure(Context context)
Description copied from interface: Configurable

Request the implementing class to (re)configure itself.

When configuration parameters are changed, they must be reflected by the component as soon as possible.

There are no thread safety guarantees on when configure might be called.

Specified by:
configure in interface Configurable


Copyright © 2009-2012 Apache Software Foundation. All Rights Reserved.