java.lang.Object
  org.apache.flume.sink.AbstractSink
      org.apache.flume.sink.hbase.HBaseSink
public class HBaseSink
A simple sink that reads events from a channel and writes them to HBase. The HBase configuration is picked up from the first hbase-site.xml found on the classpath. The sink reads events from the channel in batches and writes them to HBase together, which minimizes the number of flushes on the HBase tables. The sink requires the following mandatory parameters:
table: The name of the HBase table to write to.
columnFamily: The HBase column family to write to.
This sink will commit each transaction if the table's write buffer size is reached or if the number of events in the current transaction reaches the batch size, whichever comes first.
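The commit rule above can be modelled in a few lines of plain Java. This is an illustrative sketch only, not the sink's code: the class `CommitRuleSketch`, the method `eventsUntilCommit`, and the idea of tracking the buffer as a sum of event sizes in bytes are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of the commit rule; not part of Flume or HBase. */
public class CommitRuleSketch {

    /**
     * Returns how many events are taken before a commit is triggered:
     * either the batch size is reached or the accumulated payload
     * reaches the write buffer size, whichever comes first.
     */
    static int eventsUntilCommit(List<Integer> eventSizesInBytes,
                                 int batchSize,
                                 long writeBufferBytes) {
        long buffered = 0;
        int taken = 0;
        for (int size : eventSizesInBytes) {
            buffered += size;
            taken++;
            if (taken >= batchSize || buffered >= writeBufferBytes) {
                break; // commit point reached
            }
        }
        return taken;
    }

    public static void main(String[] args) {
        List<Integer> sizes = Arrays.asList(400, 400, 400, 400);
        // The batch size is the first limit to be reached.
        System.out.println(eventsUntilCommit(sizes, 2, 10_000));
        // The write buffer limit is reached first (400 + 400 + 400 >= 1000).
        System.out.println(eventsUntilCommit(sizes, 100, 1_000));
    }
}
```

With the sample sizes, the first call stops after 2 events and the second after 3.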
Optional parameters are:
serializer: A class implementing HbaseEventSerializer. An instance of this class will be used to write out events to HBase.
serializer.*: Passed to the serializer's configure() method as a Context object.
batchSize: The maximum number of events the sink will commit per transaction. The default batch size is 100 events.
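The parameter rules listed above (two mandatory names, a default batch size of 100, and a `serializer.` prefix for serializer settings) can be sketched as a small validator. The class `HBaseSinkParamsSketch`, the sample values `events` and `cf`, and the key `serializer.suffix` are hypothetical; the sketch also assumes the serializer sees its keys with the prefix stripped, which the text above does not state.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper mirroring the documented parameter rules; not Flume code. */
public class HBaseSinkParamsSketch {
    // Default stated in the class description.
    static final int DEFAULT_BATCH_SIZE = 100;
    static final String SERIALIZER_PREFIX = "serializer.";

    final String table;
    final String columnFamily;
    final int batchSize;

    HBaseSinkParamsSketch(Map<String, String> params) {
        table = require(params, "table");
        columnFamily = require(params, "columnFamily");
        String raw = params.get("batchSize");
        batchSize = (raw == null) ? DEFAULT_BATCH_SIZE : Integer.parseInt(raw.trim());
    }

    /** Rejects a missing or blank mandatory parameter. */
    private static String require(Map<String, String> params, String key) {
        String value = params.get(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException("Missing mandatory parameter: " + key);
        }
        return value.trim();
    }

    /** Collects the "serializer.*" entries (assumption: prefix is stripped). */
    static Map<String, String> serializerParams(Map<String, String> params) {
        Map<String, String> result = new HashMap<>();
        for (Map.Entry<String, String> entry : params.entrySet()) {
            if (entry.getKey().startsWith(SERIALIZER_PREFIX)) {
                result.put(entry.getKey().substring(SERIALIZER_PREFIX.length()),
                           entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("table", "events");            // hypothetical table name
        params.put("columnFamily", "cf");         // hypothetical column family
        params.put("serializer.suffix", "ts");    // hypothetical serializer setting
        HBaseSinkParamsSketch parsed = new HBaseSinkParamsSketch(params);
        System.out.println(parsed.table + " " + parsed.columnFamily + " " + parsed.batchSize);
        System.out.println(serializerParams(params));
    }
}
```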
Note: Although this sink flushes all events in a transaction to HBase in one shot, HBase does not guarantee atomic commits across multiple rows. If HBase fails after writing only a subset of the events in a batch to disk, the Flume transaction is rolled back and Flume writes all the events in the transaction again, which causes duplicates. The serializer is expected to handle such duplicates. HBase also does not support batch increments, so if the serializer returns multiple increments, an HBase failure will cause them to be rewritten when HBase comes back up.
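The effect of such a replay can be illustrated with an in-memory stand-in for a table. The sketch below is not HBase client code. `ReplaySketch`, `FakeTable`, and the choice of deriving the row key from the event body are assumptions, and a deterministic row key is only one possible way for a serializer to make replayed writes harmless.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of replayed writes; not Flume or HBase code. */
public class ReplaySketch {

    /** In-memory stand-in for a table with put and increment semantics. */
    static final class FakeTable {
        final Map<String, String> cells = new HashMap<>();
        final Map<String, Long> counters = new HashMap<>();

        void put(String rowKey, String value) {
            cells.put(rowKey, value); // overwriting the same key is harmless
        }

        void increment(String rowKey, long amount) {
            counters.merge(rowKey, amount, Long::sum); // every call adds again
        }
    }

    /** Writes each event under a key derived from the event and bumps a counter. */
    static void writeBatch(FakeTable table, List<String> events) {
        for (String event : events) {
            table.put("row-" + event, event);
            table.increment("total", 1);
        }
    }

    public static void main(String[] args) {
        FakeTable table = new FakeTable();
        List<String> batch = Arrays.asList("e1", "e2");
        writeBatch(table, batch); // first attempt
        writeBatch(table, batch); // replay after a rolled-back transaction
        System.out.println("rows=" + table.cells.size());
        System.out.println("total=" + table.counters.get("total"));
    }
}
```

After the replay the table still holds two rows, but the counter reads 4 instead of 2, which is the double counting the note warns about for increments.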
Nested Class Summary |
---|
Nested classes/interfaces inherited from interface org.apache.flume.Sink |
---|
Sink.Status |
Constructor Summary |
---|
HBaseSink() |
HBaseSink(org.apache.hadoop.conf.Configuration conf) |
Method Summary | |
---|---|
void | configure(Context context): Request the implementing class to (re)configure itself. |
org.apache.hadoop.conf.Configuration | getConfig() |
Sink.Status | process(): Requests the sink to attempt to consume data from the attached channel. |
void | start(): Starts a service or component. |
void | stop(): Stops a service or component. |
Methods inherited from class org.apache.flume.sink.AbstractSink |
---|
getChannel, getLifecycleState, getName, setChannel, setName, toString |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Constructor Detail |
---|
public HBaseSink()
public HBaseSink(org.apache.hadoop.conf.Configuration conf)
Method Detail |
---|
public void start()
Description copied from interface: LifecycleAware
Starts a service or component.
Implementations should determine the result of any start logic and set the return value of LifecycleAware.getLifecycleState() accordingly.
Specified by: start in interface LifecycleAware
Overrides: start in class AbstractSink
public void stop()
Description copied from interface: LifecycleAware
Stops a service or component.
Implementations should determine the result of any stop logic and set the return value of LifecycleAware.getLifecycleState() accordingly.
Specified by: stop in interface LifecycleAware
Overrides: stop in class AbstractSink
public void configure(Context context)
Description copied from interface: Configurable
Request the implementing class to (re)configure itself.
When configuration parameters change, the component must reflect them as soon as possible.
There are no thread-safety guarantees about when configure might be called.
Specified by: configure in interface Configurable
public org.apache.hadoop.conf.Configuration getConfig()
public Sink.Status process() throws EventDeliveryException
Description copied from interface: Sink
Requests the sink to attempt to consume data from the attached channel.
Note: This method should consume from the channel within the bounds of a Transaction. On successful delivery, the transaction should be committed; on failure, it should be rolled back.
Specified by: process in interface Sink
Throws: EventDeliveryException - In case of any kind of failure to deliver data to the next hop destination.
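The commit-on-success, rollback-on-failure contract described for process() can be sketched with JDK-only stand-ins. Nothing here is the Flume API: `FakeChannel`, `Writer`, and `DeliveryFailure` are hypothetical types, the unchecked exception is used only to keep the example short, and returning BACKOFF for an empty channel is an assumption.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch of a transactional process() loop; not Flume code. */
public class ProcessSketch {
    enum Status { READY, BACKOFF }

    /** Stand-in for the delivery exception named in the method signature. */
    static final class DeliveryFailure extends RuntimeException {
        DeliveryFailure(Throwable cause) { super(cause); }
    }

    /** In-memory channel that supports one open transaction at a time. */
    static final class FakeChannel {
        final Deque<String> queue = new ArrayDeque<>();
        private final List<String> taken = new ArrayList<>();

        String take() {
            String event = queue.poll();
            if (event != null) taken.add(event);
            return event;
        }

        void commit() { taken.clear(); }

        void rollback() {
            // Put taken events back at the head, preserving their order.
            for (int i = taken.size() - 1; i >= 0; i--) queue.addFirst(taken.get(i));
            taken.clear();
        }
    }

    /** Stand-in for the write to the destination store. */
    interface Writer { void write(List<String> batch) throws Exception; }

    static Status process(FakeChannel channel, Writer writer, int batchSize) {
        List<String> batch = new ArrayList<>();
        try {
            for (int i = 0; i < batchSize; i++) {
                String event = channel.take();
                if (event == null) break; // channel is empty
                batch.add(event);
            }
            if (!batch.isEmpty()) writer.write(batch);
            channel.commit(); // successful delivery
            return batch.isEmpty() ? Status.BACKOFF : Status.READY;
        } catch (Exception e) {
            channel.rollback(); // failed delivery: events stay in the channel
            throw new DeliveryFailure(e);
        }
    }

    public static void main(String[] args) {
        FakeChannel channel = new FakeChannel();
        channel.queue.add("a");
        channel.queue.add("b");
        channel.queue.add("c");
        List<String> written = new ArrayList<>();
        System.out.println(process(channel, written::addAll, 2) + " " + written);
    }
}
```

Because a rollback returns every taken event to the channel, a later call delivers the whole batch again, which is the source of the duplicates mentioned in the class description.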