Added splitfile support to the Create table command #1
ravipesala wants to merge 1 commit into Huawei-Spark:master from
This method is only used in createSplitKeys, right? If so, I don't think we need to create it in HBaseKVHelper.
Just a minor comment. @yzhou2001, can you take a look at this?
Basically my questions are:
Thanks.
@yzhou2001
2. I think the split file should be very small and able to fit into memory, so maybe we can pass the splits with the command itself, just like:
3. Yes, we need to test it.
PS: Also a question here: do you think it is necessary to control the number of reducers for bulk load of a non-split table now?
For table creation, I think the focal point is how much we should build on top of the semantics of "creation of an RDB table on a nonexistent HBase table". The problem is that the more functionality we build into these semantics, the more difficult it becomes to reconcile them with a possibly existing HBase table. In summary, this is good to have, but it has to be designed carefully so that the semantics are clear. On the reducers: yes, a configurable number of reducers would be great. But, again, there is some complexity to it, mainly because we would probably need a "splitter" class like the one in HBase. It's feasible but probably not a priority as of now. Right now, we're anxious to get the basic functionality working and obtain some favorable performance data, in order to put some weight behind the push for our technology. All the value-adding features and optimizations can be put off until a future release.
Users can specify the table's splits with the following command, for example:
CREATE TABLE testrav4(bytecol BYTE, shortcol SHORT, intcol INTEGER, longcol LONG, floatcol FLOAT, PRIMARY KEY(intcol,shortcol)) MAPPED BY (testhbaseravi4, COLS=[bytecol=cf1.hbytecol, longcol=cf2.hlongcol, floatcol=cf2.hfloatcol]) SPLITSFILE = 'D:/1.txt'
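For readers unfamiliar with split files, here is a minimal sketch of how such a file might be turned into split keys. It is not the code in this PR. It assumes a hypothetical format of one split key per line, treats each line as a UTF-8 string, skips blank lines, drops duplicates, and sorts the keys by unsigned byte order (the order HBase uses for row keys). The class and method names (SplitFileSketch, parseSplitKeys, readSplitFile) are made up for illustration, and the actual file format and key encoding used by createSplitKeys are not shown in this thread.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper, not part of the PR: parses a split file into sorted split keys.
public class SplitFileSketch {

    // Lexicographic comparison on unsigned bytes, matching HBase row-key ordering.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // Assumed format: one split key per line; blank lines ignored; duplicates dropped.
    static List<byte[]> parseSplitKeys(List<String> lines) {
        Comparator<byte[]> order = SplitFileSketch::compareUnsigned;
        TreeSet<byte[]> keys = new TreeSet<>(order);
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            keys.add(trimmed.getBytes(StandardCharsets.UTF_8));
        }
        return new ArrayList<>(keys);
    }

    // Reads the whole file into memory, which is reasonable only for small split files.
    static List<byte[]> readSplitFile(String path) throws IOException {
        return parseSplitKeys(Files.readAllLines(Paths.get(path), StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // With a path argument, read that file; otherwise use a small in-memory demo.
        List<byte[]> keys = args.length > 0
                ? readSplitFile(args[0])
                : parseSplitKeys(Arrays.asList("m", "", "d", "m", "t"));
        for (byte[] key : keys) {
            System.out.println(new String(key, StandardCharsets.UTF_8));
        }
    }
}
```

Run without arguments, the demo prints d, m and t on separate lines. A real implementation would presumably encode each key according to the primary key column types (here intcol and shortcol) rather than as plain strings.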