Partial animal string matching in R
I have a data frame:
d <- data.frame(name = c("brown cat", "blue cat", "big lion", "tall tiger", "black panther", "short cat", "red bird", "short bird stuffed", "big eagle", "bad sparrow", "dog fish", "head dog", "brown yorkie", "lab short bulldog"), label = 1:14)
I'd like to search the name column. If the words "cat", "lion", "tiger", or "panther" appear, I want to assign the character string feline to the corresponding row of a new column, species.
If the words "bird", "eagle", or "sparrow" appear, I want to assign the character string avian to the corresponding row of the new column species.
If the words "dog", "yorkie", or "bulldog" appear, I want to assign the character string canine to the corresponding row of the new column species.
Ideally, I'd store these in a list or something similar that I can keep at the beginning of the script, because as new variants of each species show up in the name column, it would be nice to have easy access to update what qualifies as feline, avian, and canine.
This question is almost answered here (How to create new column in dataframe based on partial string matching other column in R), but that answer doesn't address the multiple-name twist present in this problem.
There may be a more elegant solution than this, but you could use grep with | to specify alternative matches.
d[grep("cat|lion|tiger|panther", d$name), "species"] <- "feline"
d[grep("bird|eagle|sparrow", d$name), "species"] <- "avian"
d[grep("dog|yorkie", d$name), "species"] <- "canine"
I've assumed you meant "avian", and left out "bulldog" since it already contains "dog".
You might want to add ignore.case = TRUE to grep.
output:
#                 name label species
#1           brown cat     1  feline
#2            blue cat     2  feline
#3            big lion     3  feline
#4          tall tiger     4  feline
#5       black panther     5  feline
#6           short cat     6  feline
#7            red bird     7   avian
#8  short bird stuffed     8   avian
#9           big eagle     9   avian
#10        bad sparrow    10   avian
#11           dog fish    11  canine
#12           head dog    12  canine
#13       brown yorkie    13  canine
#14  lab short bulldog    14  canine
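To keep the keywords at the top of the script, as the question asks, one option is a named list whose entries are collapsed into the same kind of | patterns. This is only a minimal sketch and not part of the answer above: the name species_keywords is made up for illustration, and it uses grepl with ignore.case = TRUE rather than grep.

```r
# Keyword lists kept at the top of the script; edit these as new variants appear.
# `species_keywords` is a hypothetical name chosen for this sketch.
species_keywords <- list(
  feline = c("cat", "lion", "tiger", "panther"),
  avian  = c("bird", "eagle", "sparrow"),
  canine = c("dog", "yorkie", "bulldog")
)

d <- data.frame(
  name = c("brown cat", "blue cat", "big lion", "tall tiger",
           "black panther", "short cat", "red bird", "short bird stuffed",
           "big eagle", "bad sparrow", "dog fish", "head dog",
           "brown yorkie", "lab short bulldog"),
  label = 1:14
)

# Start with NA so rows that match no keyword are easy to spot.
d$species <- NA_character_

for (sp in names(species_keywords)) {
  # Join the keywords into one alternation, e.g. "cat|lion|tiger|panther".
  pattern <- paste(species_keywords[[sp]], collapse = "|")
  # grepl returns a logical vector, one element per row of d.
  d$species[grepl(pattern, d$name, ignore.case = TRUE)] <- sp
}
```

Two things to keep in mind with this approach: if a name matches keywords from more than one species, the species listed later in the list wins, and the match is by substring, so "cat" would also match a name such as "caterpillar".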