yelp - File Operation in Python


What I am trying to do:

I am trying to use 'open' in Python in the script I am trying to execute. I want to give a "restaurant name" as input and have the reviews saved to a file (reviews.txt).

Script: (in short, the script goes to a page and scrapes the reviews)

from bs4 import BeautifulSoup
from urllib import urlopen

queries = 0
while queries < 201:
    stringq = str(queries)
    page = urlopen('http://www.yelp.com/biz/madison-square-park-new-york?start=' + stringq)

    soup = BeautifulSoup(page)
    reviews = soup.findAll('p', attrs={'itemprop':'description'})
    authors = soup.findAll('span', attrs={'itemprop':'author'})

    flag = True
    indexof = 1
    for review in reviews:
        dirtyentry = str(review)
        while dirtyentry.index('<') != -1:
            indexof = dirtyentry.index('<')
            endof = dirtyentry.index('>')
            if flag:
                dirtyentry = dirtyentry[endof+1:]
                flag = False
            else:
                if(endof+1 == len(dirtyentry)):
                    cleanentry = dirtyentry[0:indexof]
                    break
                else:
                    dirtyentry = dirtyentry[0:indexof]+dirtyentry[endof+1:]
        f=open("reviews.txt", "a")
        f.write(cleanentry)
        f.write("\n")
        f.close

    queries = queries + 40

Problem: the script uses append mode 'a', and according to the documentation, 'w' is the write mode that overwrites. When I change it to 'w', nothing happens.

f=open("reviews.txt", "w") #does not work! 

Actual question: Edit: let me clear up the confusion.

I want one reviews.txt file with all the reviews. Every time I run the script, I want it to overwrite the existing reviews.txt with the new reviews according to my input.

Thank you,

If I understand the behavior you want, this should be the right code:

with open("reviews.txt", "w") as f:
    for review in reviews:
        dirtyentry = str(review)
        while dirtyentry.index('<') != -1:
            indexof = dirtyentry.index('<')
            endof = dirtyentry.index('>')
            if flag:
                dirtyentry = dirtyentry[endof+1:]
                flag = False
            else:
                if(endof+1 == len(dirtyentry)):
                    cleanentry = dirtyentry[0:indexof]
                    break
                else:
                    dirtyentry = dirtyentry[0:indexof]+dirtyentry[endof+1:]
        f.write(cleanentry)
        f.write("\n")

This will open the file for writing once and write all the entries to it. Otherwise, if the open call is nested in the for loop, the file is opened again for each review and overwritten by the next review.
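
To make the difference concrete, here is a minimal sketch that writes the same entries both ways. The sample reviews and the file names are made up for illustration, and a temporary directory stands in for the script's working directory.

```python
import os
import tempfile

# Hypothetical sample data standing in for the scraped reviews.
reviews = ["Great park.", "Nice fountain.", "Too crowded."]

workdir = tempfile.mkdtemp()
inside_path = os.path.join(workdir, "opened_inside_loop.txt")
outside_path = os.path.join(workdir, "opened_once.txt")

# Opening with "w" inside the loop truncates the file on every iteration,
# so only the last review survives.
for review in reviews:
    with open(inside_path, "w") as f:
        f.write(review + "\n")

# Opening with "w" once, before the loop, keeps every review from this run
# and still discards whatever an earlier run left in the file.
with open(outside_path, "w") as f:
    for review in reviews:
        f.write(review + "\n")

with open(inside_path) as f:
    inside_lines = f.read().splitlines()
with open(outside_path) as f:
    outside_lines = f.read().splitlines()

print(inside_lines)   # ['Too crowded.']
print(outside_lines)  # ['Great park.', 'Nice fountain.', 'Too crowded.']
```

The same reasoning applies to any enclosing loop: if the script fetches several pages, the file has to be opened before the outermost loop, or each page will overwrite the previous one.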

The with statement ensures that when the program leaves the block, the file is closed. It also makes the code easier to read.
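
As a small sketch of that guarantee, the snippet below checks the file object's closed attribute in three cases. It also shows that writing f.close without parentheses only refers to the method and does not call it. The path and the variable names are illustrative only.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "reviews.txt")

# Manual handling: f.close without parentheses does not close the file.
f = open(path, "w")
f.write("first\n")
f.close                      # only references the method, nothing happens
still_open = not f.closed    # the file is still open here
f.close()                    # this call actually closes it

# With a with block, the file is closed as soon as the block is left.
with open(path, "w") as g:
    g.write("second\n")
closed_after_with = g.closed

# The file is closed even if an exception interrupts the block.
try:
    with open(path, "w") as h:
        h.write("third\n")
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass
closed_after_error = h.closed

print(still_open, closed_after_with, closed_after_error)
```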


I'd also suggest that you avoid using brackets in the if statement, so instead of

if(endof+1 == len(dirtyentry)): 

it's better to use just

if endof + 1 == len(dirtyentry): 
