Apache Pig - Create Hive timestamp from Pig


How can I create a timestamp field in Pig, as a string, that Hive accepts as a timestamp?

I have formatted a string in Pig to match the timestamp format in Hive, but after loading it is null instead of showing the date.

2014-04-10 09:45:56 is how the format looks in Pig, and it matches the Hive timestamp format, yet it cannot be loaded. (It loads only if I load it as a string field.)

Any ideas why?

Quick update: no HCatalog is available.

The problem occurs when the timestamp field contains null values: the fields become null when the timestamp data type is used. When every row has a timestamp value in the above format, it works fine. So the real question is how null values can be handled.

I suspect you have written your data to HDFS using PigStorage and want to load it into a Hive table. The problem is that a missing tuple field, written by Pig as null, is treated by Hive 0.11 as null. So far so good. But all subsequent fields are then treated as null as well, although they can have different values. Hive 0.12 doesn't have this issue.

Depending on the SerDe type, Hive can interpret different strings as null. In the case of LazySimpleSerDe it is \N.

You have 2 options:

  • set the table's null format property to the empty string that is produced by Pig
  • or store \N in Pig for null fields

E.g.:

Given the following data in Pig 0.11:

a = load 'data' as (txt:chararray, ts:chararray);
dump a;
(a,2014-04-10 09:45:56)
(b,2014-04-11 10:45:56)
(,)
(e,2014-04-12 11:45:56)

Option 1:

store a into '/user/data';

Hive 0.11:

create external table test (txt string, tms timestamp)
row format delimited fields terminated by '\t'
location '/user/data';

alter table test set serdeproperties ('serialization.null.format' = '');

Option 2:

...
b = foreach a generate txt, (ts is null ? '\\N' : ts);
store b into '/user/data';

Then create the table in Hive without setting the SerDe property.
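
For option 2, the Hive side could look roughly like the sketch below. It assumes the same table name, columns, and location as in option 1, and it relies on \N being the default null marker for LazySimpleSerDe. The final select is only one possible way to check that the row Pig wrote with a null timestamp comes back as NULL.

```sql
-- Sketch only: reuses the table name and location from option 1 as an assumption.
create external table test (txt string, tms timestamp)
row format delimited fields terminated by '\t'
location '/user/data';

-- No serialization.null.format override is set here: the \N written by Pig
-- for null timestamps is already LazySimpleSerDe's default null marker.
select txt, tms from test where tms is null;
```

Note that only ts is replaced in the Pig script, so a null txt is still written as an empty string and is read by Hive as an empty string rather than NULL.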
