-
Notifications
You must be signed in to change notification settings - Fork 16
/
README
57 lines (46 loc) · 2.06 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
HBaseHUT:
---------
http://github.com/sematext/HBaseHUT
November 2010
Released under Apache License 2.0.
Author:
-------
Alex Baranau
Description:
------------
HBaseHUT stands for High Updates Throughput for HBase. It was inspired by
discussions on HBase mailing lists around the problem with having Get&Put
for each update operation, which affects write throughput dramatically.
Another force behind behind the approach used in HBaseHUT was recent
activity on Coprocessors development. Although the current implementation
doesn't use CPs, HBaseHUT is designed with CPs in mind because CPs add more
flexibility when it comes to alternative MapReduce data processing
approaches that will be used by HBaseHUT in the future.
The idea behind HBaseHUT is:
* Don't do updates of existing data on each Put (and hence don't perform
Get operation for each Put operation). All Puts are plain Puts with the
relevant pure-insert write performance.
* Defer processing updates to scheduled job (not necessarily a MapReduce job)
or perform updates on as-needed basis.
* Serve updated data in "online" manner: user always gets updated record
immediately after new data was Put, i.e., user "sees" updates immediately
after he writes data.
For more information please refer to the github project wiki:
https://github.com/sematext/HBaseHUT/wiki.
For a clear introductory post with a good HBaseHUT use-case read:
http://blog.sematext.com/2010/12/16/deferring-processing-updates-to-increase-hbase-write-performance/
Mailing List:
-------------
To participate more in the discussion, join the group at
https://groups.google.com/group/hbasehut/
Build Notes:
------------
Current pom configured to build against HBase 0.89.20100924.
HBase jars are provided with sources, as only HBase trunk sources
were mavenized and available in public repos.
Tests take some time to execute (up to several minutes), to skip
their execution use -Dmaven.skip.tests=true.
HBase Version Compatibility:
----------------------------
Compatible with HBase 0.20.5 - 0.89.XXX.
May be compatible with previous: tests needed (TODO)