Python - How to associate tag values with tag labels using ElementTree in a Pythonic way
I have XML files that I am trying to process.
Here is a sample derived from one of the files:
    fileasstring = """<?xml version="1.0" encoding="utf-8"?>
    <eventdocument>
      <schemaversion>x2</schemaversion>
      <eventtable>
        <eventtransaction>
          <eventtitle>
            <value>some event</value>
          </eventtitle>
          <eventdate>
            <value>2003-12-31</value>
          </eventdate>
          <eventcoding>
            <eventtype>47</eventtype>
            <eventcode>a</eventcode>
            <footnoteid id="f1"/>
            <footnoteid id="f2"/>
          </eventcoding>
          <eventcycled>
            <value></value>
          </eventcycled>
          <eventamounts>
            <eventvoltage>
              <value>40000</value>
            </eventvoltage>
          </eventamounts>
        </eventtransaction>
      </eventtable>
    </eventdocument>"""
Note that there can be many eventtable elements in each document, and events can have more details than the ones I have isolated here.
My goal is to create a dictionary in the following form:
    {'eventtitle': 'some event', 'eventdate': '2003-12-31', 'eventtype': '47',
     'eventcode': 'a', 'eventcoding_ftnt_1': 'f1', 'eventcoding_ftnt_2': 'f2',
     'eventcycled': '', 'eventvoltage': '40000'}
I am reading these in from files, but assuming I already have a string, the code to get the text of the elements directly below the eventtransaction element, where the text sits inside a value tag, is as follows:
    # xml.etree.cElementTree was removed in Python 3.9; ElementTree
    # picks up the C accelerator automatically when it is available.
    import xml.etree.ElementTree as et

    myxml = et.fromstring(fileasstring)
    eventtransactions = [e for e in myxml.iter() if e.tag == 'eventtransaction']
    testtransaction = eventtransactions[0]

    my_dict = {}
    for child_of in testtransaction:
        # Keep only children whose sole child element is <value>.
        grand_children_tags = [e.tag for e in child_of]
        if grand_children_tags == ['value']:
            my_dict[child_of.tag] = [e.text for e in child_of][0]

    >>> my_dict
    {'eventtitle': 'some event', 'eventcycled': None, 'eventdate': '2003-12-31'}
This seems wrong because I am not taking advantage of the XML structure and am using brute force instead, but I have not been able to find an example.
Is there a clearer and more Pythonic way to create the output I am looking for?
Use XPath to pull out the elements you're interested in.
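As a small illustration of the path syntax involved, the sketch below runs the `.//*[value]` expression against a made-up miniature document (the tag names `root`, `title`, `group`, `size` and `other` are invented for this example and are not from the question's data). It relies only on the limited XPath subset that ElementTree documents, where `[tag]` matches elements that have a direct child with that name.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature document, not the asker's real data.
doc = ET.fromstring(
    "<root>"
    "<title><value>hello</value></title>"
    "<group><size><value>3</value></size></group>"
    "<other>no value child</other>"
    "</root>"
)

# './/*[value]' selects every descendant that has a direct <value> child,
# regardless of how deeply it is nested.
with_value = doc.findall(".//*[value]")
tags = [el.tag for el in with_value]

# findtext() returns the text of the first matching child in one call.
texts = {el.tag: el.findtext("value") for el in with_value}

print(tags)
print(texts)
```

Note that `group` is not selected, because its `value` element is a grandchild rather than a direct child.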
The following code creates a list of lists of dicts (i.e. tables/transactions/info):
    import xml.etree.ElementTree as et

    tables = []
    myxml = et.fromstring(fileasstring)

    for table in myxml.findall('./eventtable'):
        transactions = []
        tables.append(transactions)
        for transaction in table.findall('./eventtransaction'):
            info = {}
            # Any descendant of this transaction that wraps a <value> child.
            # Searching from the transaction (not the table) keeps values
            # from other transactions out of this dict.
            for element in transaction.findall('.//*[value]'):
                info[element.tag] = element.find('./value').text or ''
            coding = transaction.find('./eventcoding')
            if coding is not None:
                for tag in 'eventtype', 'eventcode':
                    element = coding.find('./%s' % tag)
                    if element is not None:
                        info[tag] = element.text or ''
                # Footnote keys are numbered from 0 here.
                for index, element in enumerate(coding.findall('./footnoteid')):
                    info['eventcoding_ftnt_%d' % index] = element.get('id', '')
            if info:
                transactions.append(info)
Output:

    [[{'eventcode': 'a',
       'eventcoding_ftnt_0': 'f1',
       'eventcoding_ftnt_1': 'f2',
       'eventcycled': '',
       'eventdate': '2003-12-31',
       'eventtitle': 'some event',
       'eventtype': '47',
       'eventvoltage': '40000'}]]
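The output above numbers the footnote keys from 0, while the question asks for `eventcoding_ftnt_1` and `eventcoding_ftnt_2`. One possible variation, sketched here under some assumptions, wraps the same lookups in two helper functions and passes `start=1` to `enumerate`. The names `transaction_to_dict` and `document_to_dicts` are hypothetical, the sample is a trimmed copy of the question's document, and the result is flattened to a single list of dicts rather than one list per table.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's sample, kept short for illustration.
SAMPLE = """<eventdocument>
  <eventtable>
    <eventtransaction>
      <eventtitle><value>some event</value></eventtitle>
      <eventcoding>
        <eventtype>47</eventtype>
        <eventcode>a</eventcode>
        <footnoteid id="f1"/>
        <footnoteid id="f2"/>
      </eventcoding>
      <eventcycled><value></value></eventcycled>
    </eventtransaction>
  </eventtable>
</eventdocument>"""


def transaction_to_dict(transaction):
    """Flatten one <eventtransaction> element into a dict (hypothetical helper)."""
    info = {}
    # Elements that wrap a direct <value> child, at any depth.
    for element in transaction.findall(".//*[value]"):
        info[element.tag] = element.findtext("value") or ""
    # Plain-text children of <eventcoding>; skipped when absent.
    for tag in ("eventtype", "eventcode"):
        text = transaction.findtext("./eventcoding/" + tag)
        if text is not None:
            info[tag] = text
    # start=1 produces the 1-based keys requested in the question.
    footnotes = transaction.findall("./eventcoding/footnoteid")
    for index, note in enumerate(footnotes, start=1):
        info["eventcoding_ftnt_%d" % index] = note.get("id", "")
    return info


def document_to_dicts(xml_text):
    """Return one dict per <eventtransaction> found anywhere in the document."""
    root = ET.fromstring(xml_text)
    return [transaction_to_dict(t) for t in root.iter("eventtransaction")]


result = document_to_dicts(SAMPLE)
print(result)
```

If the per-table grouping matters, the list comprehension in `document_to_dicts` could instead loop over `root.findall('./eventtable')` and build one inner list per table, as the answer's code does.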