public class ExecSource extends AbstractSource implements EventDrivenSource, Configurable
A Source implementation that executes a Unix process and turns each line of text into an event.
This source runs a given Unix command on startup and expects that process to continuously produce data on standard output (standard error is ignored by default). Unless configured to restart, if the process exits for any reason, the source also exits and produces no further data. Consequently, commands such as cat [named pipe] or tail -F [file] produce the desired results, whereas date probably will not: the former two commands produce streams of data, whereas the latter produces a single event and exits.
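The "one line of output becomes one event" behavior can be sketched with the standard library alone. This is a minimal illustration, not the ExecSource implementation: the class name LineEventPump, the use of /bin/sh -c to launch the command, and the Consumer standing in for event delivery are all assumptions made for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

// Hypothetical stand-in for illustration; not the ExecSource implementation.
class LineEventPump {

    // Reads text line by line and hands each line to the handler as one "event".
    // Returns the number of lines seen before the stream ended.
    static int pump(Reader in, Consumer<String> handler) {
        int count = 0;
        try {
            BufferedReader reader = new BufferedReader(in);
            String line;
            while ((line = reader.readLine()) != null) {
                handler.accept(line);
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    // Starts a command through the shell (an assumption of this sketch) and
    // pumps its standard output. Standard error is discarded here.
    static int run(String command, Consumer<String> handler)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder("/bin/sh", "-c", command);
        builder.redirectError(ProcessBuilder.Redirect.DISCARD);
        Process process = builder.start();
        try (Reader out = new InputStreamReader(
                process.getInputStream(), StandardCharsets.UTF_8)) {
            // Returns only when the process closes its output, i.e. when it exits.
            return pump(out, handler);
        } finally {
            process.waitFor();
        }
    }
}
```

With a command like date, run returns after a single line because the process exits; with a streaming command such as tail -F [file], it keeps delivering lines until the process is stopped.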
The ExecSource is meant for situations where one must integrate with existing systems without modifying code. It is a compatibility gateway built to allow simple, stop-gap integration and does not necessarily offer all of the benefits or guarantees of native integration with Flume. If using the AvroSource is an option, for instance, it is greatly preferred to this source because it (and similarly implemented sources) can maintain the transactional guarantees that exec cannot.
Why doesn't ExecSource offer transactional guarantees?
The problem with ExecSource and other asynchronous sources is that the source cannot guarantee that the client learns of a failure to put an event into the Channel. For example, one of the most commonly requested features is the tail -F [file]-like use case, where an application writes to a log file on disk and Flume tails the file, sending each line as an event. While this is possible, there is an obvious problem: what happens if the channel fills up and Flume can't send an event? Flume has no way of telling the application writing the log file that it needs to retain the log or that the event has not been sent. If this doesn't make sense, you need only know this: your application can never guarantee that data has been received when using a unidirectional asynchronous interface such as ExecSource! As an extension of this warning, and to be completely clear, there is absolutely zero guarantee of event delivery when using this source. You have been warned.
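The full-channel problem can be modeled in a few lines. In this sketch a bounded queue stands in for the Channel and an array of strings stands in for lines already written to the log file; the class and method names are hypothetical and no Flume API is used.

```java
import java.util.concurrent.BlockingQueue;

// Hypothetical model of a one-way pipeline; names are illustrative, not Flume APIs.
class OneWayPipelineDemo {

    // Offers each line to a bounded queue standing in for the channel.
    // Returns how many lines were rejected because the queue was full.
    static int forward(String[] lines, BlockingQueue<String> channel) {
        int dropped = 0;
        for (String line : lines) {
            // offer() returns false immediately when the queue has no free slot.
            if (!channel.offer(line)) {
                // The application that wrote the line is a separate process with
                // no return path: nothing here can ask it to retry or retain data.
                dropped++;
            }
        }
        return dropped;
    }
}
```

The count of rejected lines is visible only inside the forwarding side. The writer of the log file never sees it, which is the gap a bidirectional, acknowledged protocol closes.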
Configuration options
Parameter | Description | Unit / Type | Default |
---|---|---|---|
command | The command to execute | String | none (required) |
restart | Whether to restart the command when it exits | Boolean | false |
restartThrottle | How long in milliseconds to wait before restarting the command | Long | 10000 |
logStderr | Whether to log or discard the standard error stream of the command | Boolean | false |
batchSize | The number of events to commit to the channel at a time | Integer | 20 |
batchTimeout | How long in milliseconds to wait, if the batch size has not been reached, before data is pushed downstream | Long | 3000 |
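The interaction between batchSize and batchTimeout can be sketched as a buffer that flushes when either limit is hit. This is an illustration under assumptions, not the source's actual code: EventBatcher, its tick method, and the explicitly passed clock values are hypothetical and exist only to make the rule easy to follow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper illustrating size-or-timeout batching; not Flume code.
class EventBatcher {
    private final int batchSize;
    private final long batchTimeoutMillis;
    private final Consumer<List<String>> sink;
    private final List<String> buffer = new ArrayList<>();
    private long lastFlushMillis;

    EventBatcher(int batchSize, long batchTimeoutMillis, long nowMillis,
                 Consumer<List<String>> sink) {
        this.batchSize = batchSize;
        this.batchTimeoutMillis = batchTimeoutMillis;
        this.lastFlushMillis = nowMillis;
        this.sink = sink;
    }

    // Adds one event and flushes if the batch is full or the timeout has elapsed.
    void add(String event, long nowMillis) {
        buffer.add(event);
        if (buffer.size() >= batchSize || timedOut(nowMillis)) {
            flush(nowMillis);
        }
    }

    // Meant to be called periodically so a partial batch is not held forever.
    void tick(long nowMillis) {
        if (!buffer.isEmpty() && timedOut(nowMillis)) {
            flush(nowMillis);
        }
    }

    private boolean timedOut(long nowMillis) {
        return nowMillis - lastFlushMillis >= batchTimeoutMillis;
    }

    // Hands a copy of the buffered events to the sink and resets the timer.
    private void flush(long nowMillis) {
        sink.accept(new ArrayList<>(buffer));
        buffer.clear();
        lastFlushMillis = nowMillis;
    }
}
```

With the defaults from the table, a busy command would be committed 20 events at a time, while a quiet one would still have its pending events pushed roughly every 3000 milliseconds.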
Metrics
TODO
Constructor and Description |
---|
ExecSource() |
Modifier and Type | Method and Description |
---|---|
void | configure(Context context): Request the implementing class to (re)configure itself. |
void | start(): Starts a service or component. |
void | stop(): Stops a service or component. |
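Methods inherited from class AbstractSource: getChannelProcessor, getLifecycleState, getName, setChannelProcessor, setName, toString
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface Source: getChannelProcessor, setChannelProcessor
Methods inherited from interface LifecycleAware: getLifecycleState
Methods inherited from interface NamedComponent: getName, setName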
public void start()
Description copied from interface: LifecycleAware
Starts a service or component.
Implementations should determine the result of any start logic and affect the return value of LifecycleAware.getLifecycleState() accordingly.
Specified by: start in interface LifecycleAware
Overrides: start in class AbstractSource
public void stop()
Description copied from interface: LifecycleAware
Stops a service or component.
Implementations should determine the result of any stop logic and affect the return value of LifecycleAware.getLifecycleState() accordingly.
Specified by: stop in interface LifecycleAware
Overrides: stop in class AbstractSource
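The contract that start and stop should be reflected in the reported lifecycle state can be sketched as follows. The enum and class here are local stand-ins so the example compiles on its own; they are assumptions for illustration and are not Flume's LifecycleAware or its state type.

```java
// Local stand-ins for illustration; Flume's real lifecycle types are not used.
enum SketchState { IDLE, START, STOP, ERROR }

class SketchComponent {
    private volatile SketchState state = SketchState.IDLE;
    private final Runnable startLogic;

    SketchComponent(Runnable startLogic) {
        this.startLogic = startLogic;
    }

    // Runs the start logic and records whether it succeeded.
    void start() {
        try {
            startLogic.run();
            state = SketchState.START;
        } catch (RuntimeException e) {
            // A failed start is surfaced through the state, not hidden.
            state = SketchState.ERROR;
        }
    }

    // Records that the component has been stopped.
    void stop() {
        state = SketchState.STOP;
    }

    // Callers poll this to learn the outcome of start() and stop().
    SketchState getLifecycleState() {
        return state;
    }
}
```

The state field is volatile on the assumption that the thread calling start or stop may differ from the thread that later reads the state.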
public void configure(Context context)
Description copied from interface: Configurable
Request the implementing class to (re)configure itself.
When configuration parameters are changed, the component must reflect them as soon as possible.
There are no thread safety guarantees on when configure might be called.
Specified by: configure in interface Configurable
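Because configure may be called at any time and from any thread, one defensive pattern is to parse the parameters into an immutable snapshot and swap it in atomically. The sketch below assumes a plain Map in place of Flume's Context; ExecSettingsHolder and Settings are hypothetical names, and the defaults are the ones listed in the configuration table.

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch; a plain Map replaces Flume's Context class here.
class ExecSettingsHolder {

    // Immutable snapshot of the options from the configuration table.
    static final class Settings {
        final String command;
        final boolean restart;
        final long restartThrottle;
        final boolean logStderr;
        final int batchSize;
        final long batchTimeout;

        Settings(Map<String, String> p) {
            command = p.get("command");
            if (command == null || command.isEmpty()) {
                throw new IllegalArgumentException("command is required");
            }
            restart = Boolean.parseBoolean(p.getOrDefault("restart", "false"));
            restartThrottle = Long.parseLong(p.getOrDefault("restartThrottle", "10000"));
            logStderr = Boolean.parseBoolean(p.getOrDefault("logStderr", "false"));
            batchSize = Integer.parseInt(p.getOrDefault("batchSize", "20"));
            batchTimeout = Long.parseLong(p.getOrDefault("batchTimeout", "3000"));
        }
    }

    private final AtomicReference<Settings> current = new AtomicReference<>();

    // Parses first, then publishes: readers see the old or the new snapshot,
    // never a mix, and a rejected configuration leaves the old one in place.
    void configure(Map<String, String> properties) {
        current.set(new Settings(properties));
    }

    Settings settings() {
        return current.get();
    }
}
```

A call such as configure(Map.of("command", "tail -F /tmp/app.log")) (the path is a made-up example) would yield a snapshot with every other option at its default.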
Copyright © 2009-2016 Apache Software Foundation. All Rights Reserved.