org.apache.flume.sink
Class AbstractRpcSink

java.lang.Object
  extended by org.apache.flume.sink.AbstractSink
      extended by org.apache.flume.sink.AbstractRpcSink
All Implemented Interfaces:
Configurable, LifecycleAware, NamedComponent, Sink
Direct Known Subclasses:
AvroSink, ThriftSink

public abstract class AbstractRpcSink
extends AbstractSink
implements Configurable

This sink provides the basic RPC functionality for Flume and forms one half of Flume's tiered collection support. It is configured with the parameters listed under Configuration options. Events sent to this sink are transported over the network to the configured hostname / port pair using the RPC implementation encapsulated in RpcClient. The destination is an instance of Flume's AvroSource or ThriftSource (depending on which implementation of this class is used), which allows Flume agents to forward to other Flume agents, forming a tiered collection infrastructure. Nothing prevents this sink from being used to speak to other custom-built infrastructure that implements the same RPC protocol.

Events are taken from the configured Channel in batches of the configured batch-size. The batch size has no theoretical limit, although all events in the batch must fit in memory. Generally, larger batches are far more efficient, but introduce a slight delay (measured in milliseconds) in delivery. Batch underruns (i.e. batches smaller than the configured batch size) are possible. This is a compromise made to maintain low latency of event delivery: if the channel returns a null event, meaning it is empty, the batch is sent immediately, regardless of size. Batch underruns are tracked in the metrics. Empty batches do not incur an RPC roundtrip.
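
The batching rules described above can be illustrated with a minimal sketch. It is not the sink's actual code: a plain java.util.Queue stands in for the Channel, strings stand in for events, and the names BatchUnderrunSketch, takeBatch and wouldSend are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

public class BatchUnderrunSketch {

    /**
     * Takes up to batchSize events from the queue. Queue.poll() returning
     * null stands in for a channel reporting that it is empty.
     */
    static List<String> takeBatch(Queue<String> channel, int batchSize) {
        List<String> batch = new ArrayList<>();
        for (int i = 0; i < batchSize; i++) {
            String event = channel.poll();
            if (event == null) {
                break; // channel is empty: stop filling and use what we have
            }
            batch.add(event);
        }
        return batch;
    }

    /** Returns true if an RPC would be issued for this batch. */
    static boolean wouldSend(List<String> batch) {
        return !batch.isEmpty(); // empty batches skip the round trip
    }

    public static void main(String[] args) {
        Queue<String> channel = new ArrayDeque<>(Arrays.asList("e1", "e2", "e3"));
        List<String> first = takeBatch(channel, 2);  // full batch
        List<String> second = takeBatch(channel, 2); // underrun: one event
        List<String> third = takeBatch(channel, 2);  // empty: nothing to send
        System.out.println(first + " send=" + wouldSend(first));
        System.out.println(second + " send=" + wouldSend(second));
        System.out.println(third + " send=" + wouldSend(third));
    }
}
```

With three queued events and a batch size of 2, the second call is an underrun and the third produces an empty batch that would not be sent.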

Configuration options

Parameter | Description | Unit (data type) | Default
hostname | The hostname to which events should be sent. | Hostname or IP (String) | none (required)
port | The port on hostname to which events should be sent. | TCP port (int) | none (required)
batch-size | The maximum number of events to send per RPC. | events (int) | 100
connect-timeout | Maximum time to wait for the first Avro handshake and RPC request. | milliseconds (long) | 20000
request-timeout | Maximum time to wait for RPC requests after the first. | milliseconds (long) | 20000
compression-type | Compression type. The default is "none"; the only compression type available is "deflate". | compression type | none
compression-level | Used only when compression-type is "deflate". A value from 0 to 9: 0 means no compression and 1-9 enable compression, with higher values compressing better. | compression level | 6
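
As a rough illustration of how the parameters and defaults in the table combine, the sketch below resolves them from a plain map. It does not use Flume's Context class; the class RpcSinkSettingsSketch, its fields, and the host and port values are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

public class RpcSinkSettingsSketch {
    final String hostname;
    final int port;
    final int batchSize;
    final long connectTimeoutMs;
    final long requestTimeoutMs;
    final String compressionType;
    final int compressionLevel;

    /** Resolves each parameter, applying the documented default when absent. */
    RpcSinkSettingsSketch(Map<String, String> params) {
        hostname = required(params, "hostname");
        port = Integer.parseInt(required(params, "port"));
        batchSize = Integer.parseInt(params.getOrDefault("batch-size", "100"));
        connectTimeoutMs = Long.parseLong(params.getOrDefault("connect-timeout", "20000"));
        requestTimeoutMs = Long.parseLong(params.getOrDefault("request-timeout", "20000"));
        compressionType = params.getOrDefault("compression-type", "none");
        compressionLevel = Integer.parseInt(params.getOrDefault("compression-level", "6"));
    }

    /** hostname and port have no default, so a missing value is an error. */
    private static String required(Map<String, String> params, String key) {
        String value = params.get(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException("Missing required parameter: " + key);
        }
        return value.trim();
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("hostname", "collector.example.com"); // hypothetical host
        params.put("port", "4545");                      // hypothetical port
        params.put("batch-size", "500");
        RpcSinkSettingsSketch s = new RpcSinkSettingsSketch(params);
        System.out.println(s.hostname + ":" + s.port
            + " batch=" + s.batchSize
            + " connectTimeoutMs=" + s.connectTimeoutMs
            + " compression=" + s.compressionType);
    }
}
```

Only hostname and port must be supplied; every other value falls back to the default listed in the table.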

Metrics

TODO

Implementation Notes: Any implementation of this class must override the initializeRpcClient(Properties) method. This method will be called whenever this sink needs to create a new connection to the source.


Nested Class Summary
 
Nested classes/interfaces inherited from interface org.apache.flume.Sink
Sink.Status
 
Constructor Summary
AbstractRpcSink()
           
 
Method Summary
 void configure(Context context)
           Request the implementing class to (re)configure itself.
protected abstract  RpcClient initializeRpcClient(Properties props)
          Returns a new RpcClient instance configured using the given Properties object.
 Sink.Status process()
          Requests the sink to attempt to consume data from the attached channel.
 void start()
          The start() of RpcSink is more of an optimization that allows a connection to be created before the process() loop is started.
 void stop()
           Stops a service or component.
 String toString()
           
 
Methods inherited from class org.apache.flume.sink.AbstractSink
getChannel, getLifecycleState, getName, setChannel, setName
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

AbstractRpcSink

public AbstractRpcSink()
Method Detail

configure

public void configure(Context context)
Description copied from interface: Configurable

Request the implementing class to (re)configure itself.

When configuration parameters are changed, they must be reflected by the component as soon as possible.

There are no thread safety guarantees on when configure might be called.

Specified by:
configure in interface Configurable

initializeRpcClient

protected abstract RpcClient initializeRpcClient(Properties props)
Returns a new RpcClient instance configured using the given Properties object. This method is called whenever a new connection needs to be created to the next hop.

Parameters:
props - the Properties used to configure the new RpcClient
Returns:
a new RpcClient instance
start

public void start()
The start() of RpcSink is more of an optimization that allows a connection to be created before the process() loop is started. If start() fails, the process() loop will itself attempt to reconnect as necessary. This is the expected behavior, since the downstream source may become unavailable in the middle of the process loop, in which case the sink has to retry the connection.

Specified by:
start in interface LifecycleAware
Overrides:
start in class AbstractSink
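
The relationship between start(), process() and the client factory method can be sketched as follows. This is a simplified model under stated assumptions, not the class's source: Client, SinkModel, FlakySink and initializeClient are hypothetical stand-ins for RpcClient, this class, a subclass and initializeRpcClient(Properties), and failures are modelled with unchecked exceptions.

```java
import java.util.Properties;

public class LazyConnectSketch {

    /** Stand-in for an RPC client; this is not Flume's RpcClient. */
    interface Client {
        void send(String event);
    }

    /** Stand-in for the sink; models only the connect-on-demand idea. */
    abstract static class SinkModel {
        private final Properties props = new Properties();
        private Client client; // null while disconnected

        /** Subclasses supply the concrete client. */
        protected abstract Client initializeClient(Properties props);

        /** Eager connect. A failure here is tolerated. */
        void start() {
            try {
                client = initializeClient(props);
            } catch (RuntimeException e) {
                client = null; // process() will retry later
            }
        }

        /** Connects if needed, then sends. Drops the client on failure. */
        void process(String event) {
            if (client == null) {
                client = initializeClient(props);
            }
            try {
                client.send(event);
            } catch (RuntimeException e) {
                client = null; // force a fresh connection on the next call
                throw e;
            }
        }

        boolean isConnected() {
            return client != null;
        }
    }

    /** Hypothetical subclass whose first connection attempt fails. */
    static class FlakySink extends SinkModel {
        int attempts = 0;
        final StringBuilder sent = new StringBuilder();

        @Override
        protected Client initializeClient(Properties props) {
            attempts++;
            if (attempts == 1) {
                throw new IllegalStateException("downstream not up yet");
            }
            return event -> sent.append(event);
        }
    }

    public static void main(String[] args) {
        FlakySink sink = new FlakySink();
        sink.start();       // first attempt fails, the sink stays usable
        System.out.println("connected after start: " + sink.isConnected());
        sink.process("e1"); // connects on demand, then sends
        System.out.println("connected after process: " + sink.isConnected());
        System.out.println("connection attempts: " + sink.attempts);
    }
}
```

In this model a failed start() costs nothing but the early connection: the first process() call simply invokes the factory method again.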

stop

public void stop()
Description copied from interface: LifecycleAware

Stops a service or component.

Implementations should determine the result of any stop logic and update the return value of LifecycleAware.getLifecycleState() accordingly.

Specified by:
stop in interface LifecycleAware
Overrides:
stop in class AbstractSink

toString

public String toString()
Overrides:
toString in class AbstractSink

process

public Sink.Status process()
                    throws EventDeliveryException
Description copied from interface: Sink

Requests the sink to attempt to consume data from the attached channel.

Note: This method should be consuming from the channel within the bounds of a Transaction. On successful delivery, the transaction should be committed, and on failure it should be rolled back.

Specified by:
process in interface Sink
Returns:
READY if 1 or more Events were successfully delivered, BACKOFF if no data could be retrieved from the channel feeding this sink
Throws:
EventDeliveryException - In case of any kind of failure to deliver data to the next hop destination.
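
The commit, rollback and return-value contract described for process() can be modelled in a few lines. The sketch below is an assumption-laden illustration rather than Flume code: a Deque stands in for the Channel and its Transaction, Destination is a hypothetical delivery target, Status is a local enum mirroring the READY and BACKOFF values, and an unchecked IllegalStateException stands in for EventDeliveryException.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

public class TransactionalProcessSketch {

    /** Local stand-in for the READY and BACKOFF status values. */
    enum Status { READY, BACKOFF }

    /** Hypothetical delivery target; throws to signal a failed delivery. */
    interface Destination {
        void deliver(List<String> batch);
    }

    /**
     * Takes up to batchSize events and delivers them. On success the events
     * stay removed (commit). On failure they are put back at the head of the
     * queue in their original order (rollback) and an exception is thrown.
     */
    static Status process(Deque<String> channel, Destination dest, int batchSize) {
        List<String> taken = new ArrayList<>();
        try {
            String event;
            while (taken.size() < batchSize && (event = channel.pollFirst()) != null) {
                taken.add(event);
            }
            if (taken.isEmpty()) {
                return Status.BACKOFF; // nothing to deliver
            }
            dest.deliver(taken);
            return Status.READY; // commit
        } catch (RuntimeException e) {
            for (int i = taken.size() - 1; i >= 0; i--) {
                channel.addFirst(taken.get(i)); // rollback
            }
            throw new IllegalStateException("Failed to deliver batch", e);
        }
    }

    public static void main(String[] args) {
        Deque<String> channel = new ArrayDeque<>(Arrays.asList("e1", "e2", "e3"));
        List<String> received = new ArrayList<>();
        Status status = process(channel, batch -> received.addAll(batch), 2);
        System.out.println(status + " received=" + received + " remaining=" + channel);
    }
}
```

Restoring the events on failure is what lets a later process() call retry the same batch, which is the point of consuming within a transaction.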


Copyright © 2009-2013 Apache Software Foundation. All Rights Reserved.