screen scraping - XPath contains() Function Not Working for Multiple <div> Tags with Same Name
I am trying to scrape the following section (only an excerpt) of XML code. The second form-item is the one I'm trying to scrape:
<div class="form-item">
    <a href="http://www.avaopera.org" target="_blank" rel="" class="">http://www.avaopera.org</a>
</div>
<div class="form-item">
    <script type="text/javascript"> document.write('*[block of text]*') </script>
    <a href="mailto:ademarco@avaopera.org">ademarco@avaopera.org</a>
</div>
I used the following XPath query with the contains() function because there are multiple form-item tags: //div[@class='form-item' and contains(.,'@')]/a/text()
This query does not work. I tried removing /a/text(), but that displays the text within <script>, not the text of the <a> tag.
What am I doing wrong?
You're targeting the text within the <div> instead of the text within the <a>, if I understand your goal correctly.
Try using //div[@class='form-item' and contains(a/text(),'@')]/a/text() instead, which searches the child <a> element within the <div>, not the parent.
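To illustrate the difference between testing the <div> and testing its child <a>, here is a minimal Python sketch using only the standard library. It rests on a few assumptions: the excerpt is wrapped in a hypothetical <root> element so it parses as one document, and the helper names are made up for illustration. ElementTree supports only a subset of XPath and has no contains(), so the '@' check is done in Python here. With a full XPath 1.0 engine, the expression above could be used directly.

```python
import xml.etree.ElementTree as ET

# The excerpt from the question, wrapped in a hypothetical <root> element
# (an assumption) so that it parses as a single well-formed document.
SNIPPET = """
<root>
  <div class="form-item">
    <a href="http://www.avaopera.org" target="_blank" rel="" class="">http://www.avaopera.org</a>
  </div>
  <div class="form-item">
    <script type="text/javascript"> document.write('*[block of text]*') </script>
    <a href="mailto:ademarco@avaopera.org">ademarco@avaopera.org</a>
  </div>
</root>
"""


def div_string_value(div):
    """Join all descendant text of a node, which is roughly what '.' means
    inside contains(., '@'): the <script> text is included too."""
    return "".join(div.itertext())


def email_link_texts(xml_text):
    """Return the text of each <a> directly inside a form-item <div>
    whose own text contains '@' (the check is on the child, not the parent)."""
    root = ET.fromstring(xml_text)
    links = root.iterfind(".//div[@class='form-item']/a")
    return [a.text for a in links if a.text and "@" in a.text]


if __name__ == "__main__":
    root = ET.fromstring(SNIPPET)
    # Show what each <div> looks like when treated as one string.
    for div in root.iterfind(".//div[@class='form-item']"):
        print(repr(div_string_value(div)))
    # Show the link texts selected by checking the child <a> itself.
    print(email_link_texts(SNIPPET))
```

Checking the <a> text directly keeps the <script> content out of the match, which is the point of moving the test from . to a/text().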