regex - How to extract a number more gracefully in Python using XPath and a regular expression


I have a small HTML snippet from which I want to extract a number, the grade. I am using Python with Scrapy and re.

My code works, but it is far from nice.

Here is the HTML snippet; I want the 2.

<div id="left">
  <div class="0"><b>certificate:</b></div>
  <div class="1">
    <div></div>
    <div>
      <a class="link" href="new.html">maths</a>&nbsp;(first)&nbsp;&nbsp;&nbsp;grade 2<br>
    </div>
  </div>
  <div class="2"></div>
</div>

And here is how I have solved it so far:

note = sel.xpath('//*[@id="left"]/div[2]/div[2]/text()[2]').extract()
print note
# output: [u'\xa0(first)\xa0\xa0\xa0grade 2']
note_string = ''.join(note)
note_only = re.search(r'\d+', note_string).group()
# result: 2

It does not seem like good practice to join a list into a string just to extract such a tiny piece of information.

How can I do this better?

You can use the following XPath expression to get the 2 directly:

substring-after(//*[@id="left"]/div[2]/div[2]/text()[2], "grade ")
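If the string handling stays in Python instead, the following minimal sketch shows two ways to avoid joining the list: a regular expression anchored on the word "grade", and a small helper that mimics what XPath's substring-after() does. The helper names grade_from_text and substring_after are hypothetical, and the input list is simply the value printed in the question rather than a live Scrapy response.

```python
import re

# Value copied from the question: what .extract() returned for text()[2].
extracted = ['\xa0(first)\xa0\xa0\xa0grade 2']


def grade_from_text(text):
    """Return the number following 'grade' as an int, or None if absent."""
    match = re.search(r'grade\s+(\d+)', text)
    return int(match.group(1)) if match else None


def substring_after(text, marker):
    """Mimic XPath substring-after(): '' when the marker is not found."""
    return text.partition(marker)[2]


# Take the first extracted string directly instead of joining the list.
first = extracted[0] if extracted else ''

print(grade_from_text(first))
print(substring_after(first, 'grade '))
```

Anchoring the pattern on "grade" rather than searching for any digits keeps the match from picking up an unrelated number elsewhere in the text, and returning None makes a missing grade explicit instead of raising an AttributeError on .group().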
