"Version" decides how many versions of data can be showed for each *column* for each *column family* for each *row*.
For example (HBase 0.94.8):
Create a table with VERSIONS => 5, then update/insert the same column of the same row 6 times.

create 't1', {NAME => 'f1', VERSIONS => 5}
put 't1','row1','f1:col1','1'
put 't1','row1','f1:col1','2'
put 't1','row1','f1:col1','3'
put 't1','row1','f1:col1','4'
put 't1','row1','f1:col1','5'
put 't1','row1','f1:col1','6'
1. A raw scan shows only the latest 5 versions of data, newest first, even when 6 versions are requested.

hbase(main):025:0> scan 't1', {RAW => true, VERSIONS => 6}
ROW      COLUMN+CELL
 row1    column=f1:col1, timestamp=1400264962097, value=6
 row1    column=f1:col1, timestamp=1400264933707, value=5
 row1    column=f1:col1, timestamp=1400264928122, value=4
 row1    column=f1:col1, timestamp=1400264924764, value=3
 row1    column=f1:col1, timestamp=1400264596173, value=2
1 row(s) in 0.0280 seconds

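The retention rule seen in this scan can be sketched as a small toy model. This is only an illustration, not how HBase is implemented: the `ToyTable` class and its methods are hypothetical names, the timestamps are simple counters, and the model prunes old versions eagerly at write time, which is an assumption made for brevity.

```python
from collections import defaultdict


class ToyTable:
    """Toy model of VERSIONS retention (hypothetical, not HBase code)."""

    def __init__(self, max_versions):
        self.max_versions = max_versions
        # (row, column) -> list of (timestamp, value), newest first
        self.cells = defaultdict(list)

    def put(self, row, column, value, timestamp):
        versions = self.cells[(row, column)]
        versions.append((timestamp, value))
        versions.sort(key=lambda tv: tv[0], reverse=True)
        # Keep only the newest max_versions entries for this row/column pair.
        del versions[self.max_versions:]

    def scan(self, versions=1):
        """Return (row, column, timestamp, value) tuples, newest first per cell."""
        out = []
        for (row, column) in sorted(self.cells):
            for ts, value in self.cells[(row, column)][:versions]:
                out.append((row, column, ts, value))
        return out


table = ToyTable(max_versions=5)
# Six puts to the same row and column, with increasing toy timestamps 1..6.
for ts, value in enumerate(["1", "2", "3", "4", "5", "6"], start=1):
    table.put("row1", "f1:col1", value, ts)

# Asking for 6 versions still yields only the 5 newest values.
print([v for (_, _, _, v) in table.scan(versions=6)])
```

The oldest value ('1') is gone even though 6 versions were requested, which mirrors the scan output above.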
2. The 5-version limit applies to each column of each row separately.
Next, update/insert the same column in another row (row2) 7 times.

put 't1','row2','f1:col1','2_1'
put 't1','row2','f1:col1','2_2'
put 't1','row2','f1:col1','2_3'
put 't1','row2','f1:col1','2_4'
put 't1','row2','f1:col1','2_5'
put 't1','row2','f1:col1','2_6'
put 't1','row2','f1:col1','2_7'

hbase(main):034:0> scan 't1', {RAW => true, VERSIONS => 10}
ROW      COLUMN+CELL
 row1    column=f1:col1, timestamp=1400264962097, value=6
 row1    column=f1:col1, timestamp=1400264933707, value=5
 row1    column=f1:col1, timestamp=1400264928122, value=4
 row1    column=f1:col1, timestamp=1400264924764, value=3
 row1    column=f1:col1, timestamp=1400264596173, value=2
 row2    column=f1:col1, timestamp=1400265195640, value=2_7
 row2    column=f1:col1, timestamp=1400265194944, value=2_6
 row2    column=f1:col1, timestamp=1400265194927, value=2_5
 row2    column=f1:col1, timestamp=1400265194908, value=2_4
 row2    column=f1:col1, timestamp=1400265194883, value=2_3
2 row(s) in 0.0360 seconds

Delete row2 for column f1:col1.

hbase(main):036:0> delete 't1','row2','f1:col1'
0 row(s) in 0.0120 seconds

3. A deleted column is shown in a raw scan as a "type=DeleteColumn" marker, listed ahead of the versions it hides.

hbase(main):037:0> scan 't1', {RAW => true, VERSIONS => 10}
ROW      COLUMN+CELL
 row1    column=f1:col1, timestamp=1400264962097, value=6
 row1    column=f1:col1, timestamp=1400264933707, value=5
 row1    column=f1:col1, timestamp=1400264928122, value=4
 row1    column=f1:col1, timestamp=1400264924764, value=3
 row1    column=f1:col1, timestamp=1400264596173, value=2
 row2    column=f1:col1, timestamp=1400265585864, type=DeleteColumn
 row2    column=f1:col1, timestamp=1400265195640, value=2_7
 row2    column=f1:col1, timestamp=1400265194944, value=2_6
 row2    column=f1:col1, timestamp=1400265194927, value=2_5
 row2    column=f1:col1, timestamp=1400265194908, value=2_4
 row2    column=f1:col1, timestamp=1400265194883, value=2_3
2 row(s) in 0.0210 seconds

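The masking effect of the marker can be sketched as below. This is a simplified model rather than HBase's own logic: the function names `visible_versions` and `raw_view` are hypothetical, and the only rule modeled is that a DeleteColumn marker hides every version of that column whose timestamp is at or before the marker's timestamp. The timestamps and values are copied from the row2 output above.

```python
def visible_versions(cells, markers):
    """Versions a normal (non-raw) read would still see for one column.

    cells:   list of (timestamp, value) pairs for one column
    markers: list of DeleteColumn marker timestamps for the same column
    """
    newest_first = sorted(cells, key=lambda tv: tv[0], reverse=True)
    cutoff = max(markers, default=None)
    if cutoff is None:
        return newest_first
    # The marker hides every version at or before its own timestamp.
    return [(ts, v) for ts, v in newest_first if ts > cutoff]


def raw_view(cells, markers):
    """Raw view: markers and masked versions are all listed, newest first."""
    entries = [(ts, "value=" + v) for ts, v in cells]
    entries += [(ts, "type=DeleteColumn") for ts in markers]
    return sorted(entries, key=lambda e: e[0], reverse=True)


row2_cells = [
    (1400265195640, "2_7"),
    (1400265194944, "2_6"),
    (1400265194927, "2_5"),
    (1400265194908, "2_4"),
    (1400265194883, "2_3"),
]
marker_ts = 1400265585864

print(visible_versions(row2_cells, [marker_ts]))  # nothing is visible
for entry in raw_view(row2_cells, [marker_ts]):   # marker first, then the values
    print(entry)
```

Because the marker carries the newest timestamp, it appears first in the raw view for that column, while a normal read of the column returns nothing.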
4. The delete marker for a whole column family always comes first within its row.
Per scanning in HBase, "because family delete marker affects potentially many columns in this row, so in order to allow scanners to scan forward-only, the family delete markers need to be seen by a scanner first." The following session illustrates this.

hbase(main):009:0> scan 't1'
ROW      COLUMN+CELL
 row1    column=f1:col1, timestamp=1400264962097, value=6
 row1    column=f1:col2, timestamp=1400267214363, value=col2_7
1 row(s) in 0.0150 seconds

hbase(main):010:0> scan 't1', {RAW => true, VERSIONS => 6}
ROW      COLUMN+CELL
 row1    column=f1:col1, timestamp=1400264962097, value=6
 row1    column=f1:col1, timestamp=1400264933707, value=5
 row1    column=f1:col1, timestamp=1400264928122, value=4
 row1    column=f1:col1, timestamp=1400264924764, value=3
 row1    column=f1:col1, timestamp=1400264596173, value=2
 row1    column=f1:col2, timestamp=1400267214363, value=col2_7
 row1    column=f1:col2, timestamp=1400267213932, value=col2_6
 row1    column=f1:col2, timestamp=1400267213914, value=col2_5
 row1    column=f1:col2, timestamp=1400267213889, value=col2_4
 row1    column=f1:col2, timestamp=1400267213862, value=col2_3
 row2    column=f1:col1, timestamp=1400265585864, type=DeleteColumn
2 row(s) in 0.0490 seconds

hbase(main):011:0> deleteall 't1','row1'
0 row(s) in 0.0400 seconds

hbase(main):015:0> scan 't1', {RAW => true, VERSIONS => 6}
ROW      COLUMN+CELL
 row1    column=f1:, timestamp=1400274062009, type=DeleteFamily
 row1    column=f1:col1, timestamp=1400264962097, value=6
 row1    column=f1:col1, timestamp=1400264933707, value=5
 row1    column=f1:col1, timestamp=1400264928122, value=4
 row1    column=f1:col1, timestamp=1400264924764, value=3
 row1    column=f1:col1, timestamp=1400264596173, value=2
 row1    column=f1:col2, timestamp=1400267214363, value=col2_7
 row1    column=f1:col2, timestamp=1400267213932, value=col2_6
 row1    column=f1:col2, timestamp=1400267213914, value=col2_5
 row1    column=f1:col2, timestamp=1400267213889, value=col2_4
 row1    column=f1:col2, timestamp=1400267213862, value=col2_3
 row2    column=f1:col1, timestamp=1400265585864, type=DeleteColumn
2 row(s) in 0.0390 seconds

hbase(main):016:0> scan 't1'
ROW      COLUMN+CELL
0 row(s) in 0.0130 seconds

hbase(main):017:0> put 't1','row1','f1:col1','supernewrow'
0 row(s) in 0.0220 seconds

hbase(main):018:0> scan 't1'
ROW      COLUMN+CELL
 row1    column=f1:col1, timestamp=1400274112052, value=supernewrow
1 row(s) in 0.0140 seconds

hbase(main):019:0> scan 't1', {RAW => true, VERSIONS => 6}
ROW      COLUMN+CELL
 row1    column=f1:, timestamp=1400274062009, type=DeleteFamily
 row1    column=f1:col1, timestamp=1400274112052, value=supernewrow
 row1    column=f1:col1, timestamp=1400264962097, value=6
 row1    column=f1:col1, timestamp=1400264933707, value=5
 row1    column=f1:col1, timestamp=1400264928122, value=4
 row1    column=f1:col1, timestamp=1400264924764, value=3
 row1    column=f1:col2, timestamp=1400267214363, value=col2_7
 row1    column=f1:col2, timestamp=1400267213932, value=col2_6
 row1    column=f1:col2, timestamp=1400267213914, value=col2_5
 row1    column=f1:col2, timestamp=1400267213889, value=col2_4
 row1    column=f1:col2, timestamp=1400267213862, value=col2_3
 row2    column=f1:col1, timestamp=1400265585864, type=DeleteColumn
2 row(s) in 0.0260 seconds
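One way to picture why the DeleteFamily marker leads its row, and why only 'supernewrow' survives a normal scan, is the simplified ordering model below. It assumes cells sort by row, family, qualifier, and then newest timestamp first, and that the family marker sorts first because it carries an empty qualifier (it is displayed as "column=f1:" in the output above). The real comparator has more rules than this; `sort_key` and the tuple layout are hypothetical.

```python
def sort_key(cell):
    """Simplified cell ordering: row, family, qualifier, newest timestamp first."""
    row, family, qualifier, timestamp, _ = cell
    return (row, family, qualifier, -timestamp)


# A subset of the cells from the last raw scan, deliberately out of order.
cells = [
    ("row1", "f1", "col2", 1400267214363, "value=col2_7"),
    ("row1", "f1", "col1", 1400274112052, "value=supernewrow"),
    ("row1", "f1", "", 1400274062009, "type=DeleteFamily"),
    ("row1", "f1", "col1", 1400264962097, "value=6"),
    ("row2", "f1", "col1", 1400265585864, "type=DeleteColumn"),
]

ordered = sorted(cells, key=sort_key)

# Newest DeleteFamily timestamp per (row, family).
family_cutoff = {}
for row, family, _, ts, kind in cells:
    if kind == "type=DeleteFamily":
        key = (row, family)
        family_cutoff[key] = max(ts, family_cutoff.get(key, ts))

# A value stays visible only if it is newer than the family marker of its row.
visible = [
    c for c in ordered
    if c[4].startswith("value=") and c[3] > family_cutoff.get((c[0], c[1]), -1)
]

for cell in ordered:
    print(cell)
print([c[4] for c in visible])
```

In this model the empty qualifier places the family marker ahead of col1 and col2, so a forward-only scanner meets it before any cell it masks, and the put issued after `deleteall` remains visible because its timestamp is newer than the marker's.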