org.apache.flume.sink
Class AvroSink

java.lang.Object
  extended by org.apache.flume.sink.AbstractSink
      extended by org.apache.flume.sink.AbstractRpcSink
          extended by org.apache.flume.sink.AvroSink
All Implemented Interfaces:
Configurable, LifecycleAware, NamedComponent, Sink

public class AvroSink
extends AbstractRpcSink

A Sink implementation that can send events to an RPC server (such as Flume's AvroSource).

This sink forms one half of Flume's tiered collection support. Events sent to this sink are transported over the network to the configured hostname/port pair using the RPC implementation encapsulated in RpcClient. The destination is typically an instance of Flume's AvroSource, which allows Flume agents to forward events to other Flume agents, forming a tiered collection infrastructure. The sink can also speak to custom-built infrastructure that implements the same RPC protocol.

Events are taken from the configured Channel in batches of up to the configured batch-size. The batch size has no theoretical upper limit, although all events in a batch must fit in memory. Generally, larger batches are far more efficient, but they introduce a slight delay (measured in milliseconds) in delivery. Batch underruns (i.e. batches smaller than the configured batch size) are possible; this is a compromise made to keep event delivery latency low. If the channel returns a null event, meaning it is empty, the batch is sent immediately, regardless of size. Batch underruns are tracked in the metrics. Empty batches do not incur an RPC roundtrip.
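
The batching rules described above can be illustrated with a minimal, self-contained sketch. It does not use Flume's classes: a java.util.Queue stands in for the Channel, and the class name BatchSketch, its methods, and its two counters are hypothetical names chosen for illustration. Whether an empty batch also counts as an underrun is not stated here, so the sketch assumes it does not.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Illustrative only: a Queue stands in for the Channel, and the counters
// stand in for the sink's metrics. None of these names come from Flume.
class BatchSketch {
    int rpcCalls = 0;   // number of simulated RPC round trips
    int underruns = 0;  // non-empty batches smaller than batchSize (assumption)

    // Takes up to batchSize events. A null from poll() plays the role of the
    // "null event" that signals an empty channel and ends the batch early.
    List<String> takeBatch(Queue<String> channel, int batchSize) {
        List<String> batch = new ArrayList<>();
        while (batch.size() < batchSize) {
            String event = channel.poll();
            if (event == null) {
                break; // channel is empty: send what we have
            }
            batch.add(event);
        }
        return batch;
    }

    // Returns the number of events "sent". Empty batches skip the RPC entirely.
    int process(Queue<String> channel, int batchSize) {
        List<String> batch = takeBatch(channel, batchSize);
        if (batch.isEmpty()) {
            return 0; // no RPC round trip for an empty batch
        }
        if (batch.size() < batchSize) {
            underruns++;
        }
        rpcCalls++; // a real sink would hand the batch to its RpcClient here
        return batch.size();
    }

    public static void main(String[] args) {
        BatchSketch sink = new BatchSketch();
        Queue<String> channel = new ArrayDeque<>();
        for (int i = 0; i < 5; i++) {
            channel.add("event-" + i);
        }
        System.out.println(sink.process(channel, 3)); // full batch
        System.out.println(sink.process(channel, 3)); // underrun
        System.out.println(sink.process(channel, 3)); // empty, no RPC
        System.out.println("rpcCalls=" + sink.rpcCalls + " underruns=" + sink.underruns);
    }
}
```

With five queued events and a batch size of 3, the three calls cover the full-batch, underrun, and empty cases in turn, and only the first two trigger a simulated RPC.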

Configuration options

Parameter        Description                                                         Unit (data type)         Default
hostname         The hostname to which events should be sent.                        Hostname or IP (String)  none (required)
port             The port on hostname to which events should be sent.                TCP port (int)           none (required)
batch-size       The maximum number of events to send per RPC.                       events (int)             100
connect-timeout  Maximum time to wait for the first Avro handshake and RPC request.  milliseconds (long)      20000
request-timeout  Maximum time to wait for RPC requests after the first.              milliseconds (long)      20000
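
As a rough illustration of how the options above combine, the following sketch reads them from a java.util.Properties object, applies the documented defaults, and rejects a configuration that lacks a required parameter. It is not how AvroSink itself is configured: the class AvroSinkConfigSketch and its helpers are hypothetical, and the host name and port in the example are placeholders.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Illustrative only: mirrors the parameter table, not Flume's own configure() logic.
class AvroSinkConfigSketch {
    final String hostname;
    final int port;
    final int batchSize;
    final long connectTimeout; // milliseconds
    final long requestTimeout; // milliseconds

    AvroSinkConfigSketch(Properties p) {
        hostname = required(p, "hostname");
        port = Integer.parseInt(required(p, "port"));
        // Optional parameters fall back to the defaults listed in the table.
        batchSize = Integer.parseInt(optional(p, "batch-size", "100"));
        connectTimeout = Long.parseLong(optional(p, "connect-timeout", "20000"));
        requestTimeout = Long.parseLong(optional(p, "request-timeout", "20000"));
    }

    private static String required(Properties p, String key) {
        String value = p.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException("Missing required parameter: " + key);
        }
        return value.trim();
    }

    private static String optional(Properties p, String key, String defaultValue) {
        return p.getProperty(key, defaultValue).trim();
    }

    // Parses "key = value" lines in java.util.Properties syntax.
    static AvroSinkConfigSketch parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new AvroSinkConfigSketch(p);
    }

    public static void main(String[] args) {
        // Placeholder values; only batch-size overrides a default here.
        AvroSinkConfigSketch c = parse(
            "hostname = collector.example.org\n"
            + "port = 4545\n"
            + "batch-size = 500\n");
        System.out.println(c.hostname + ":" + c.port
            + " batch-size=" + c.batchSize
            + " connect-timeout=" + c.connectTimeout
            + " request-timeout=" + c.requestTimeout);
    }
}
```

Treating hostname and port as hard failures while defaulting the remaining three keeps the sketch in line with the "none (required)" entries in the table.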

Metrics

TODO


Nested Class Summary
 
Nested classes/interfaces inherited from interface org.apache.flume.Sink
Sink.Status
 
Constructor Summary
AvroSink()
           
 
Method Summary
protected  RpcClient initializeRpcClient(Properties props)
          Returns a new RpcClient instance configured using the given Properties object.
 
Methods inherited from class org.apache.flume.sink.AbstractRpcSink
configure, process, start, stop, toString
 
Methods inherited from class org.apache.flume.sink.AbstractSink
getChannel, getLifecycleState, getName, setChannel, setName
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

AvroSink

public AvroSink()
Method Detail

initializeRpcClient

protected RpcClient initializeRpcClient(Properties props)
Description copied from class: AbstractRpcSink
Returns a new RpcClient instance configured using the given Properties object. This method is called whenever a new connection needs to be created to the next hop.

Specified by:
initializeRpcClient in class AbstractRpcSink
Returns:
a new RpcClient instance configured using the given Properties object

Copyright © 2009-2013 Apache Software Foundation. All Rights Reserved.