Tagged: HBase

  • Wang 23:33 on 2021-02-05 Permalink | Reply
    Tags: HBase

    In this light, here is a comparison of… 

    In this light, here is a comparison of open-source NoSQL databases Cassandra, MongoDB, CouchDB, Redis, Riak, RethinkDB, Couchbase (ex-Membase), Hypertable, ElasticSearch, Accumulo, VoltDB, Kyoto Tycoon, Scalaris, OrientDB, Aerospike, Neo4j and HBase:

    https://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis/

     
  • Wang 20:56 on 2019-11-11 Permalink | Reply
    Tags: HBase

    Use Apache Ranger to protect your Hadoop ecosystem 

    Apache Ranger

    Apache Ranger™ is a framework to enable, monitor and manage comprehensive data security across the Hadoop platform.

    The vision with Ranger is to provide comprehensive security across the Apache Hadoop ecosystem. With the advent of Apache YARN, the Hadoop platform can now support a true data lake architecture. Enterprises can potentially run multiple workloads in a multi-tenant environment. Data security within Hadoop needs to evolve to support multiple use cases for data access, while also providing a framework for central administration of security policies and monitoring of user access.
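
    For a flavor of what central policy administration looks like, policies can also be created through the Ranger Admin REST API. A minimal sketch, assuming a Ranger Admin at ranger-host:6080 with an HBase service registered as hbasedev (the host, service, and user names are placeholders):

    # Grant user 'wang' read-only access to HBase tables matching gbif.*
    curl -u admin:admin -X POST -H 'Content-Type: application/json' \
      http://ranger-host:6080/service/public/v2/api/policy \
      -d '{
        "service": "hbasedev",
        "name": "gbif-read-only",
        "resources": {
          "table":         {"values": ["gbif.*"]},
          "column-family": {"values": ["*"]},
          "column":        {"values": ["*"]}
        },
        "policyItems": [{
          "users": ["wang"],
          "accesses": [{"type": "read", "isAllowed": true}]
        }]
      }'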

     
  • Wang 21:33 on 2018-03-11 Permalink | Reply
    Tags: , , HBase, , ,   

    [Sqoop1] Exchange data between MySQL and HDFS/Hive/HBase 

    Install Sqoop 1 on macOS:

    brew install sqoop
    

    # If your Hadoop/Hive/HBase environment variables are set elsewhere, uncomment the corresponding entries in conf/sqoop-env.sh
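
    For reference, a minimal sketch of conf/sqoop-env.sh under a Homebrew layout (the paths are assumptions; point them at your actual installations):

    # conf/sqoop-env.sh -- example Homebrew paths; adjust to your environment
    export HADOOP_COMMON_HOME=/usr/local/opt/hadoop/libexec
    export HADOOP_MAPRED_HOME=/usr/local/opt/hadoop/libexec
    export HIVE_HOME=/usr/local/opt/hive/libexec
    export HBASE_HOME=/usr/local/opt/hbase/libexec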

    1.MySQL -> HDFS

    1.1.import table

    sqoop import --connect jdbc:mysql://localhost/test --direct --username root -P --table t1 --warehouse-dir /mysql/test --fields-terminated-by ','
    

    1.2. import all tables in the database

    sqoop import-all-tables --connect jdbc:mysql://localhost/test --direct --username root -P --warehouse-dir /mysql/test --fields-terminated-by ','
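
    To verify either import, list the target directory; with --warehouse-dir, each table is written to a subdirectory named after it:

    hdfs dfs -ls /mysql/test/t1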
    

    2.MySQL -> Hive

    2.1. import the table definition only

    sqoop create-hive-table --connect jdbc:mysql://localhost/test --table t1 --username root -P --hive-database test
    

    2.2.import table

    sqoop import --connect jdbc:mysql://localhost/test --username root -P --table t1 --hive-import --hive-database test --hive-table t1 --fields-terminated-by ','
    

    2.3. import all tables in the database

    sqoop import-all-tables --connect jdbc:mysql://localhost/test --username root -P --hive-import --hive-database test --fields-terminated-by ','
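
    To spot-check the Hive side after any of these imports, a quick count (assumes the hive CLI is on your PATH):

    hive -e 'SELECT COUNT(*) FROM test.t1'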
    

    3.MySQL -> HBase

    3.1. plain import for reference (no HBase options; files land under your HDFS home directory)

    sqoop import --connect jdbc:mysql://localhost/test --username root -P --table t1
    

    3.2. import a table (the target table must already exist in HBase; see the shell sketch after the command)

    sqoop import --connect jdbc:mysql://localhost/test --username root -P --table t1 --hbase-bulkload --hbase-table test.t1 --column-family basic --fields-terminated-by ','
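
    A minimal HBase shell sketch for pre-creating the target table (the names mirror --hbase-table and --column-family above):

    create 'test.t1', 'basic'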
    

    3.3. import a table without pre-creating it in HBase (--hbase-create-table), but pay attention to HBase/Sqoop version compatibility

    sqoop import --connect jdbc:mysql://localhost/test --username root -P --table t1 --hbase-bulkload --hbase-create-table --hbase-table test.t1 --column-family basic --fields-terminated-by ','
    

    4.HDFS/Hive/HBase -> MySQL

    sqoop export --connect jdbc:mysql://localhost/test --username root -P --table t1 --export-dir /user/hive/warehouse/test.db/t1 --fields-terminated-by ','
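
    To confirm the export landed, a quick count on the MySQL side (prompts for the same root password):

    mysql -u root -p -e 'SELECT COUNT(*) FROM test.t1'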
    
     
  • Wang 22:21 on 2018-03-09 Permalink | Reply
    Tags: , HBase   

    [HBase] No columns to insert 

    When loading data from HDFS into HBase, I got this error:

    Caused by: java.lang.IllegalArgumentException: No columns to insert
        at org.apache.hadoop.hbase.client.HTable.validatePut(HTable.java:1505)
        at org.apache.hadoop.hbase.client.BufferedMutatorImpl.validatePut(BufferedMutatorImpl.java:147)
        at org.apache.hadoop.hbase.client.BufferedMutatorImpl.doMutate(BufferedMutatorImpl.java:134)
        at org.apache.hadoop.hbase.client.BufferedMutatorImpl.mutate(BufferedMutatorImpl.java:98)
        at org.apache.hadoop.hbase.client.HTable.put(HTable.java:1028)
        at org.apache.hadoop.hive.hbase.HiveHBaseTableOutputFormat$MyRecordWriter.write(HiveHBaseTableOutputFormat.java:146)
        at org.apache.hadoop.hive.hbase.HiveHBaseTableOutputFormat$MyRecordWriter.write(HiveHBaseTableOutputFormat.java:117)
        at org.apache.hadoop.hive.ql.io.HivePassThroughRecordWriter.write(HivePassThroughRecordWriter.java:40)
        at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:762)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
        at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897)
        at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130)
        at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:148)
        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:547)
        ... 9 more
    

    Reading the documentation, I learned that HBase does not store null values; when every mapped column in a row is null, the resulting Put contains no cells, which triggers the error above. I checked the HDFS files, and some fields did indeed contain nulls.

    So I filled in the null values and reloaded the data into HBase, and the error did not occur again.
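
    If the data flows through a Hive query (as the stack trace above suggests), one way to avoid editing files by hand is to substitute defaults for nulls at load time. A minimal sketch with hypothetical table and column names:

    -- 'hbase_mapped_table' and 'source_table' are hypothetical names.
    -- COALESCE ensures every row carries at least one non-null cell.
    INSERT OVERWRITE TABLE hbase_mapped_table
    SELECT id,
           COALESCE(col1, ''),
           COALESCE(col2, '')
    FROM source_table;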

     

  • Wang 21:34 on 2018-02-27 Permalink | Reply
    Tags: ETL, HBase, NoSQL

    Import data from Hive to HBase 

    Recently I needed to restore data from Hive to HBase. I found there is no direct way to do this with tools like Sqoop, so I converted it myself.

    1.create the HBase namespace and a table containing one column family named basic

    create_namespace 'gbif'
    create 'gbif.gbif_0004998', 'basic'
    

    2.create an intermediate Hive table that follows the source Hive table's structure and maps onto the HBase table; note that hbase.columns.mapping must list entries in the same order as the Hive columns, with :key mapped to the row key

    CREATE EXTERNAL TABLE intermediate.hbase_gbif_0004998 (gbifid string, datasetkey string, occurrenceid string, kingdom string, phylum string, class string, orders string, family string, genus string, species string, infraspecificepithet string, taxonrank string, scientificname string, countrycode string, locality string, publishingorgkey string, decimallatitude string, decimallongitude string, coordinateuncertaintyinmeters string, coordinateprecision string, elevation string, elevationaccuracy string, depth string, depthaccuracy string, eventdate string, day string, month string, year string, taxonkey string, specieskey string, basisofrecord string, institutioncode string, collectioncode string, catalognumber string, recordnumber string, identifiedby string, license string, rightsholder string, recordedby string, typestatus string, establishmentmeans string, lastinterpreted string, mediatype string, issue string)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,basic:datasetkey,basic:occurrenceid,basic:kingdom,basic:phylum,basic:class,basic:orders,basic:family,basic:genus,basic:species,basic:infraspecificepithet,basic:taxonrank,basic:scientificname,basic:countrycode,basic:locality,basic:publishingorgkey,basic:decimallatitude,basic:decimallongitude,basic:coordinateuncertaintyinmeters,basic:coordinateprecision,basic:elevation,basic:elevationaccuracy,basic:depth,basic:depthaccuracy,basic:eventdate,basic:day,basic:month,basic:year,basic:taxonkey,basic:specieskey,basic:basisofrecord,basic:institutioncode,basic:collectioncode,basic:catalognumber,basic:recordnumber,basic:identifiedby,basic:license,basic:rightsholder,basic:recordedby,basic:typestatus,basic:establishmentmeans,basic:lastinterpreted,basic:mediatype,basic:issue") 
    TBLPROPERTIES("hbase.table.name" = "gbif.gbif_0004998");
    

    3.insert data into intermediate hive table

    insert overwrite table intermediate.hbase_gbif_0004998 select * from gbif.gbif_0004998;
    

    4.get intermediate hive table’s hdfs path

    desc formatted intermediate.hbase_gbif_0004998;
    

    5.(commented out) bulk-load into HBase from HDFS; with the HBaseStorageHandler mapping in step 2, the insert in step 3 already writes rows directly into HBase, so this step was apparently not needed:

    # hbase --config config_dir org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles \
    #   hdfs://localhost:9000/user/hive/warehouse/intermediate.db/hbase_gbif_0004998 \
    #   gbif.gbif_0004998
    

    6.check hbase’s data

    count 'gbif.gbif_0004998'
    
    ...
    Current count: 326000, row: 986217061
    Current count: 327000, row: 991771339
    327316 row(s) in 13.6890 seconds

    => 327316
    

    7.get data from hbase table

    hbase(main):008:0> get 'gbif.gbif_0004998', '1019778874'
    COLUMN CELL 
    basic:basisofrecord timestamp=1519452831179, value=LIVING_SPECIMEN 
    basic:catalognumber timestamp=1519452831179, value=A0011 
    basic:class timestamp=1519452831179, value=Liliopsida 
    basic:collectioncode timestamp=1519452831179, value=Ar\xC3\xA1ceas 
    basic:coordinateprecision timestamp=1519452831179, value= 
    basic:coordinateuncertaintyinmeters timestamp=1519452831179, value= 
    basic:countrycode timestamp=1519452831179, value=CO 
    basic:datasetkey timestamp=1519452831179, value=fd5ae2bb-6ee6-4e5c-8428-6284fa385f9a 
    basic:day timestamp=1519452831179, value=23 
    basic:decimallatitude timestamp=1519452831179, value= 
    basic:decimallongitude timestamp=1519452831179, value= 
    basic:depth timestamp=1519452831179, value= 
    basic:depthaccuracy timestamp=1519452831179, value= 
    basic:elevation timestamp=1519452831179, value= 
    basic:elevationaccuracy timestamp=1519452831179, value= 
    basic:establishmentmeans timestamp=1519452831179, value= 
    basic:eventdate timestamp=1519452831179, value=2007-08-23T02:00Z 
    basic:family timestamp=1519452831179, value=Araceae 
    basic:genus timestamp=1519452831179, value=Anthurium 
    basic:identifiedby timestamp=1519452831179, value= 
    basic:infraspecificepithet timestamp=1519452831179, value= 
    basic:institutioncode timestamp=1519452831179, value=Corporaci\xC3\xB3n San Jorge 
    basic:issue timestamp=1519452831179, value= 
    basic:kingdom timestamp=1519452831179, value=Plantae 
    basic:lastinterpreted timestamp=1519452831179, value=2018-02-03T23:09Z 
    basic:license timestamp=1519452831179, value=CC0_1_0 
    basic:locality timestamp=1519452831179, value= 
    basic:mediatype timestamp=1519452831179, value= 
    basic:month timestamp=1519452831179, value=8 
    basic:occurrenceid timestamp=1519452831179, value=JBSJ:Araceas:A0011 
    basic:orders timestamp=1519452831179, value=Alismatales 
    basic:phylum timestamp=1519452831179, value=Tracheophyta 
    basic:publishingorgkey timestamp=1519452831179, value=1904954c-81e7-4254-9778-ae3deed93de6 
    basic:recordedby timestamp=1519452831179, value=Oyuela G. 
    basic:recordnumber timestamp=1519452831179, value= 
    basic:rightsholder timestamp=1519452831179, value=Corporaci\xC3\xB3n San Jorge 
    basic:scientificname timestamp=1519452831179, value=Anthurium cabrerense Engl. 
    basic:species timestamp=1519452831179, value=Anthurium cabrerense 
    basic:specieskey timestamp=1519452831179, value=2872557 
    basic:taxonkey timestamp=1519452831179, value=2872557 
    basic:taxonrank timestamp=1519452831179, value=SPECIES 
    basic:typestatus timestamp=1519452831179, value= 
    basic:year timestamp=1519452831179, value=2007
    
     