org.apache.flume.sink
Class AvroSink
java.lang.Object
org.apache.flume.sink.AbstractSink
org.apache.flume.sink.AbstractRpcSink
org.apache.flume.sink.AvroSink
- All Implemented Interfaces:
- Configurable, LifecycleAware, NamedComponent, Sink
public class AvroSink
- extends AbstractRpcSink
A Sink implementation that can send events to an RPC server (such as
Flume's AvroSource).
This sink forms one half of Flume's tiered collection support. Events sent to
this sink are transported over the network to the hostname / port pair using
the RPC implementation encapsulated in RpcClient.
The destination is an instance of Flume's AvroSource, which allows Flume
agents to forward to other Flume agents, forming a tiered collection
infrastructure. Nothing prevents this sink from being used to speak to other
custom-built infrastructure that implements the same RPC protocol.
Events are taken from the configured Channel in batches of the configured
batch-size. The batch size has no theoretical limit, although all events in
a batch must fit in memory. Generally, larger batches are far more efficient
but introduce a slight delay (measured in milliseconds) in delivery. Batch
underruns (i.e. batches smaller than the configured batch size) are possible;
this is a compromise made to keep the latency of event delivery low. If the
channel returns a null event, meaning it is empty, the batch is sent
immediately, regardless of size. Batch underruns are tracked in the metrics.
Empty batches do not incur an RPC roundtrip.
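The batching rules above can be sketched in plain Java. This is an illustration only, not the sink's actual code: a Deque of strings stands in for the Channel, and the class and method names (BatchDrainSketch, takeBatch, shouldSend) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Illustrative stand-in: a Deque<String> plays the role of a Flume Channel.
class BatchDrainSketch {

    // Takes up to batchSize events. Stops early when the channel is empty,
    // which poll() signals by returning null (a batch underrun).
    static List<String> takeBatch(Deque<String> channel, int batchSize) {
        List<String> batch = new ArrayList<>();
        for (int i = 0; i < batchSize; i++) {
            String event = channel.poll();
            if (event == null) {
                break; // channel is empty: send what we have, regardless of size
            }
            batch.add(event);
        }
        return batch;
    }

    // An empty batch should not trigger an RPC roundtrip.
    static boolean shouldSend(List<String> batch) {
        return !batch.isEmpty();
    }
}
```

With three queued events and a batch size of two, the first call returns a full batch, the second returns an underrun of one event, and the third returns an empty batch for which no RPC would be made.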
Configuration options
Parameter | Description | Unit (data type) | Default |
hostname | The hostname to which events should be sent. | Hostname or IP (String) | none (required) |
port | The port on hostname to which events should be sent. | TCP port (int) | none (required) |
batch-size | The maximum number of events to send per RPC. | events (int) | 100 |
connect-timeout | Maximum time to wait for the first Avro handshake and RPC request. | milliseconds (long) | 20000 |
request-timeout | Maximum time to wait for RPC requests after the first. | milliseconds (long) | 20000 |
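As a rough sketch of how these options might appear in an agent's properties file, the snippet below embeds a configuration fragment and parses it with java.util.Properties. The agent name (agent1), sink name (avroSink), hostname, and port are made up for illustration, as are the class and helper names; the numeric values simply repeat the defaults from the table. Source and channel definitions, which a real agent also needs, are omitted.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Illustrative only: agent1, avroSink, the hostname and the port are hypothetical.
class AvroSinkConfigSketch {

    static final String SINK_PREFIX = "agent1.sinks.avroSink.";

    // A properties-file fragment using the option names from the table above.
    static final String AGENT_CONFIG =
          "agent1.sinks.avroSink.type = avro\n"
        + "agent1.sinks.avroSink.hostname = collector.example.com\n"
        + "agent1.sinks.avroSink.port = 4545\n"
        + "agent1.sinks.avroSink.batch-size = 100\n"
        + "agent1.sinks.avroSink.connect-timeout = 20000\n"
        + "agent1.sinks.avroSink.request-timeout = 20000\n";

    // Parses the fragment the same way any Java properties file is parsed.
    static Properties load() {
        Properties props = new Properties();
        try {
            props.load(new StringReader(AGENT_CONFIG));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Looks up a single sink option by its short name, e.g. "batch-size".
    static String sinkOption(Properties props, String name) {
        return props.getProperty(SINK_PREFIX + name);
    }
}
```

Only hostname and port have no default, so a minimal fragment could drop the last three lines.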
Metrics
TODO
Nested classes/interfaces inherited from interface org.apache.flume.Sink |
Sink.Status |
AvroSink
public AvroSink()
initializeRpcClient
protected RpcClient initializeRpcClient(Properties props)
- Description copied from class: AbstractRpcSink
- Returns a new RpcClient instance configured using the given
Properties object. This method is called whenever a new
connection needs to be created to the next hop.
- Specified by: initializeRpcClient in class AbstractRpcSink
- Returns:
Copyright © 2009-2013 Apache Software Foundation. All Rights Reserved.