Using foreach in R doubles memory usage


I'm using R 2.15 on Ubuntu.

I am applying a function that assigns keywords to streaming text data from a popular social networking site. My aim is to make the process more efficient by splitting the data into 2 parts and then applying the function to each part:

    textd <- data.frame(text = c("they", "dont", "think", "it", "be", "like is", "but do"),
                        keywordid = 0)

    # split the rows into two groups, one per core
    textd <- split(textd, seq(nrow(textd)) %/% 2 %% 2 == 0)
    keywords <- data.frame(kwds = c("be", "do", "is"), keywordid = 1:3)

    library(doParallel)
    registerDoParallel(2)
    library(foreach)

    textd <- foreach(j = 1:2) %dopar% {
      t <- textd[[j]]
      for (i in keywords$kwds) {    # loop over keywords to assign keyword IDs
        tmp <- grepl(i, t$text, ignore.case = TRUE)
        cond <- tmp & t$keywordid == 0
        if (length(t$keywordid[cond]) > 0) {
          t$keywordid[cond] <- keywords$keywordid[keywords$kwds == i]

          # if the keyword field is already populated...
          cond2 <- tmp & t$keywordid != 0
          extra <- t[cond2, ]
          if (length(extra$keywordid) > 0) {
            extra$keywordid <- keywords$keywordid[keywords$kwds == i]
            t <- rbind(t, extra)
          }
        }
      }
      t
    }

    library(data.table)
    textd <- as.data.frame(data.table::rbindlist(textd))

The problem is that doing it this way makes both cores use the same amount of RAM, meaning the second core doubles the amount of RAM used, so it runs out quickly. What am I doing wrong? How can the RAM be split between the cores? Thanks for looking.

Try splitting the data within the loop, like this:

    # assumes textd is the original, unsplit data frame and that
    # keywords is defined as in the question
    library(doParallel)
    library(itertools)
    registerDoParallel(2)

    textd <- foreach(t = isplitRows(textd, chunks = 2), .combine = rbind) %dopar% {
      for (i in keywords$kwds) {    # loop over keywords to assign keyword IDs
        tmp <- grepl(i, t$text, ignore.case = TRUE)
        cond <- tmp & t$keywordid == 0
        if (length(t$keywordid[cond]) > 0) {
          t$keywordid[cond] <- keywords$keywordid[keywords$kwds == i]

          # if the keyword field is already populated...
          cond2 <- tmp & t$keywordid != 0
          extra <- t[cond2, ]
          if (length(extra$keywordid) > 0) {
            extra$keywordid <- keywords$keywordid[keywords$kwds == i]
            t <- rbind(t, extra)
          }
        }
      }
      return(t)
    }
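To see what the iterator hands to each task, here is a small sketch that steps through `isplitRows` by hand, without any parallel backend. It assumes the itertools and iterators packages are installed; the names `chunk_iter`, `first` and `second` are only for illustration. The point is that each task is given one block of rows, whereas a loop body that indexes `textd[[j]]` refers to the whole list, which is probably why every worker ends up holding all of the data.

```r
library(itertools)   # provides isplitRows()
library(iterators)   # provides nextElem()

# same toy data as in the question, kept as one unsplit data frame
textd <- data.frame(text = c("they", "dont", "think", "it", "be", "like is", "but do"),
                    keywordid = 0)

# iterator that yields the data frame in 2 blocks of rows
chunk_iter <- isplitRows(textd, chunks = 2)

first  <- nextElem(chunk_iter)   # rows that the first task would receive
second <- nextElem(chunk_iter)   # rows that the second task would receive

# each chunk keeps all columns but only part of the rows
print(nrow(first))
print(nrow(second))
```

With real data you would pick `chunks` to match the number of registered workers, so that each worker processes roughly `nrow(textd) / chunks` rows.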
