How to set alignment in pandas in Python with non-ANSI characters


When I read the data (deaths in the m370 air crash) in R, the format is fine.

    > read.csv("g:\\test.ansi",sep=",")
               乘客姓名 性别   出生日期
    1      huangtianhui   男 1948/05/28
    2             姜翠云   女 1952/03/27
    3             李红晶   女 1994/12/09
    4          luiching   女 1969/08/02
    5             宋飞飞   男 1982/03/01
    6             唐旭东   男 1983/08/03
    7        yangjiabao   女 1988/08/25

When I read the data in Python, how can I right-align the records?

    >>> import pandas as pd
    >>> pd.read_csv("g:\\test.ansi",sep=",")
              乘客姓名         性别 出生日期
    0    huangtianhui  男  1948/05/28
    1             姜翠云  女  1952/03/27
    2             李红晶  女  1994/12/09
    3        luiching  女  1969/08/02
    4             宋飞飞  男  1982/03/01
    5             唐旭东  男  1983/08/03
    6      yangjiabao  女  1988/08/25
    7        买买提江·阿布拉  男  1979/07/10

The data is here: http://pan.baidu.com/s/1sjhaul3

The reason is that when dealing with Chinese characters (each of which takes the space of 2 ANSI characters), pandas still pads with the same amount of white space as it would for ANSI characters. This means the number of white spaces is only half of what is needed for a df containing Chinese characters. What makes the situation worse is that pandas ignores the fact that Chinese characters take twice the space:

    print(pd.read_csv("test.ansi", sep=",", encoding='gb18030').loc[10:12])
    10  边亮京  男  1987/06/06
    11  边茂勤  女  1947/07/19
    12   曹蕊  女  1982/02/19

Notice how the last line is missing 1 leading white space compared to the preceding lines.
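To make the width mismatch concrete, here is a minimal sketch that compares the number of characters in a name with the number of terminal columns it is likely to occupy. The helper name `display_width` is made up for this illustration and is not part of pandas; it relies only on the standard library's `unicodedata.east_asian_width`, and it assumes a terminal that draws wide (W) and fullwidth (F) characters in two columns.

```python
import unicodedata


def display_width(text):
    """Estimate terminal columns: wide (W) and fullwidth (F) characters count as 2."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


# Names taken from the data above; len() counts characters, not columns.
for name in ["边亮京", "曹蕊", "luiching"]:
    print(name, len(name), display_width(name))
```

Padding that is computed from `len()` therefore comes up one column short for every Chinese character in the cell.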

Ultimately, under the hood it comes down to the __unicode__ method of the DataFrame class, which allocates spaces according to the _repr_fit_horizontal_ method. I am not sure what the best solution may be. Using 2 spaces instead of 1 anywhere a Chinese character is encountered? That is not a good idea in the case of mixing rows with and without Chinese characters, such as in this DataFrame.
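One possible workaround for mixed rows, sketched here outside of pandas, is to pad each cell by its display width rather than by its character count. The helpers `display_width` and `rjust_display` are hypothetical names introduced for this example, and the approach assumes a monospaced terminal where wide characters occupy exactly two columns.

```python
import unicodedata


def display_width(text):
    """Estimate terminal columns: wide (W) and fullwidth (F) characters count as 2."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


def rjust_display(text, width):
    """Right-justify text to `width` terminal columns instead of `width` characters."""
    return " " * max(width - display_width(text), 0) + text


# Mixed rows, with and without Chinese characters (names from the data above).
names = ["huangtianhui", "姜翠云", "曹蕊"]
col_width = max(display_width(n) for n in names)
for n in names:
    print(rjust_display(n, col_width))
```

Newer pandas releases also document a `display.unicode.east_asian_width` option, set with `pd.set_option`, which is meant to handle this inside the DataFrame repr; whether it is available depends on the installed version.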

Maybe it is worthwhile to report this as a bug.

But if you use IPython notebook, you are less affected by this issue, since DataFrames are nicely displayed as HTML.

