python - How to associate values of tags with the label of the tag using ElementTree in a Pythonic way


I have XML files that I am trying to process.

Here is a sample derived from one of the files:

fileasstring = """<?xml version="1.0" encoding="utf-8"?>
<eventdocument>
    <schemaversion>x2</schemaversion>
    <eventtable>
        <eventtransaction>
            <eventtitle>
                <value>some event</value>
            </eventtitle>
            <eventdate>
                <value>2003-12-31</value>
            </eventdate>
            <eventcoding>
                <eventtype>47</eventtype>
                <eventcode>a</eventcode>
                <footnoteid id="f1"/>
                <footnoteid id="f2"/>
            </eventcoding>
            <eventcycled>
                <value></value>
            </eventcycled>
            <eventamounts>
                <eventvoltage>
                    <value>40000</value>
                </eventvoltage>
            </eventamounts>
        </eventtransaction>
    </eventtable>
</eventdocument>"""

Note that there can be many eventtables in each document, and events can have more details than the ones I have isolated here.

My goal is to create a dictionary of the following form:

{'eventtitle': 'some event', 'eventdate': '2003-12-31', 'eventtype': '47',
 'eventcode': 'a', 'eventcoding_ftnt_1': 'f1', 'eventcoding_ftnt_2': 'f2',
 'eventcycled': , 'eventvoltage': '40000'}

I am reading these in from files, but assuming I already have the contents as a string, here is my code for the elements directly below the eventtransaction element whose text sits inside a value tag:

import xml.etree.ElementTree as et  # cElementTree was removed in Python 3.9

myxml = et.fromstring(fileasstring)
eventtransactions = [e for e in myxml.iter() if e.tag == 'eventtransaction']
testtransaction = eventtransactions[0]

my_dict = {}
for child_of in testtransaction:
    # Keep only children whose sole sub-element is <value>
    grand_children_tags = [e.tag for e in child_of]
    if grand_children_tags == ['value']:
        my_dict[child_of.tag] = [e.text for e in child_of][0]

>>> my_dict
{'eventtitle': 'some event', 'eventcycled': None, 'eventdate': '2003-12-31'}

This seems wrong because I am not taking advantage of the XML structure and am using brute force instead, but I have not been able to find an example.

Is there a clearer, more Pythonic way to create the output I am looking for?

Use XPath to pull out the elements you're interested in.

The following code creates a list of lists of dicts (i.e. tables/transactions/info):
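To see what such an expression selects, here is a minimal sketch using the XPath subset that ElementTree supports. The trimmed-down `sample` string is a hypothetical stand-in for the question's XML, not the real file.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down sample in the same shape as the question's XML.
sample = """<eventtransaction>
    <eventtitle><value>some event</value></eventtitle>
    <eventcoding><eventtype>47</eventtype></eventcoding>
    <eventamounts>
        <eventvoltage><value>40000</value></eventvoltage>
    </eventamounts>
</eventtransaction>"""

root = ET.fromstring(sample)

# './/*[value]' selects every descendant that has a direct <value> child,
# however deeply it is nested.
wrappers = root.findall('.//*[value]')
tags = [element.tag for element in wrappers]
texts = [element.findtext('value') for element in wrappers]

print(tags)   # ['eventtitle', 'eventvoltage']
print(texts)  # ['some event', '40000']
```

Note that eventcoding is skipped because none of its children is a value element, which is why it needs separate handling.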

import xml.etree.ElementTree as et

tables = []
myxml = et.fromstring(fileasstring)
for table in myxml.findall('./eventtable'):
    transactions = []
    tables.append(transactions)
    for transaction in table.findall('./eventtransaction'):
        info = {}
        # Any descendant with a direct <value> child; search from the
        # transaction (not the table) so transactions do not share entries
        for element in transaction.findall('.//*[value]'):
            info[element.tag] = element.find('./value').text or ''
        # eventcoding holds plain child elements and footnote references
        coding = transaction.find('./eventcoding')
        if coding is not None:
            for tag in 'eventtype', 'eventcode':
                element = coding.find('./%s' % tag)
                if element is not None:
                    info[tag] = element.text or ''
            for index, element in enumerate(coding.findall('./footnoteid')):
                info['eventcoding_ftnt_%d' % index] = element.get('id', '')
        if info:
            transactions.append(info)

Output:

[[{'eventcode': 'a',
   'eventcoding_ftnt_0': 'f1',
   'eventcoding_ftnt_1': 'f2',
   'eventcycled': '',
   'eventdate': '2003-12-31',
   'eventtitle': 'some event',
   'eventtype': '47',
   'eventvoltage': '40000'}]]
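The question asked for footnote keys numbered from 1 (eventcoding_ftnt_1, eventcoding_ftnt_2), while the output above counts from 0. One possible adjustment is sketched below, assuming a hypothetical helper named `transaction_to_dict` and a shortened, made-up `sample` document; it is an illustration of the same approach, not part of the original answer.

```python
import xml.etree.ElementTree as ET


def transaction_to_dict(transaction):
    """Flatten one <eventtransaction> element into a dict (hypothetical helper)."""
    info = {}
    # Elements that wrap their text in a <value> child, at any depth.
    for element in transaction.findall('.//*[value]'):
        info[element.tag] = element.findtext('value') or ''
    coding = transaction.find('eventcoding')
    if coding is not None:
        for tag in ('eventtype', 'eventcode'):
            element = coding.find(tag)
            if element is not None:
                info[tag] = element.text or ''
        # start=1 yields eventcoding_ftnt_1, eventcoding_ftnt_2, ...
        for index, note in enumerate(coding.findall('footnoteid'), start=1):
            info['eventcoding_ftnt_%d' % index] = note.get('id', '')
    return info


# Shortened, made-up document in the same shape as the question's XML.
sample = """<eventdocument>
    <eventtable>
        <eventtransaction>
            <eventtitle><value>some event</value></eventtitle>
            <eventcoding>
                <eventtype>47</eventtype>
                <eventcode>a</eventcode>
                <footnoteid id="f1"/>
                <footnoteid id="f2"/>
            </eventcoding>
            <eventcycled><value></value></eventcycled>
        </eventtransaction>
    </eventtable>
</eventdocument>"""

root = ET.fromstring(sample)
rows = [transaction_to_dict(t) for t in root.iter('eventtransaction')]
print(rows)
```

This version returns one flat list of dicts rather than a list per table; grouping by eventtable as in the answer only needs an outer loop over `root.findall('eventtable')`.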
