r - How to number/label data-table by group-number from group_by?
I have a tbl_df where I want to group_by(u, v) for each distinct integer combination observed with (u, v).
Edit: this was resolved by the addition of group_indices() in dplyr 0.4.0.
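For anyone landing here later, a minimal sketch of what that resolution looks like. It assumes a recent dplyr (1.0.0 or later), where cur_group_id() is also available inside mutate(); the data frame is just the ten (u, v) rows from the question typed in by hand, and the names idx and labelled are only for illustration.

```r
library(dplyr)

# The ten (u, v) rows shown in the question, typed in by hand
df <- tibble(
  u = c(2, 1, 1, 2, 1, 3, 1, 1, 3, 3),
  v = c(3, 3, 2, 3, 2, 3, 3, 2, 1, 4)
)

# group_indices() on a grouped table returns one group id per row
idx <- df %>% group_by(u, v) %>% group_indices()

# Assumption: dplyr >= 1.0.0, where cur_group_id() works inside mutate()
labelled <- df %>%
  group_by(u, v) %>%
  mutate(label = cur_group_id()) %>%
  ungroup()
```

Both give the same ids. The numbering follows the sorted group keys rather than order of first appearance, which is fine here since the labels only need to be distinct.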
a) I want to assign each distinct group an arbitrary distinct number label = 1, 2, 3, ... e.g. the combination (u,v)==(2,3) could get label 1, (1,3) could get 2, and so on. How can I do this with one mutate(), without a three-step summarize-and-self-join?
dplyr has a neat function n(), but that gives the number of elements within its group, not the overall number of the group. In data.table this would simply be called .GRP.
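For comparison, this is roughly how the data.table version looks. It is only a sketch: the table dt is the question's ten rows typed in by hand, and the column name label is chosen to match the question.

```r
library(data.table)

# The ten (u, v) rows shown in the question, typed in by hand
dt <- data.table(
  u = c(2, 1, 1, 2, 1, 3, 1, 1, 3, 3),
  v = c(3, 3, 2, 3, 2, 3, 3, 2, 1, 4)
)

# .GRP is the group counter; with by= the groups are numbered
# in order of first appearance, so (2,3) gets 1 and (1,3) gets 2
dt[, label := .GRP, by = .(u, v)]
```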
b) I want to assign a string/character label ('a', 'b', ...). But numbering the groups by integers is good enough, because I can then use integer_to_label(i) below. Unless there's a clever way to merge these two? But don't sweat this part.
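One possible way to merge the two steps, sketched under the assumption that dplyr 1.0.0 or later is installed (cur_group_id() is not in 0.4.0). The data is the question's ten rows typed in by hand, and lettered is just an illustrative name.

```r
library(dplyr)

# Helper from the question: map integer 1..26 to a character label
integer_to_label <- function(i) { substr("abcdefghijklmnopqrstuvwxyz", i, i) }

# The ten (u, v) rows shown in the question, typed in by hand
df <- tibble(
  u = c(2, 1, 1, 2, 1, 3, 1, 1, 3, 3),
  v = c(3, 3, 2, 3, 2, 3, 3, 2, 1, 4)
)

# Assumption: dplyr >= 1.0.0. The group id is a single integer per group,
# so the helper is called once per group and the result is recycled.
lettered <- df %>%
  group_by(u, v) %>%
  mutate(label = integer_to_label(cur_group_id())) %>%
  ungroup()
```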
    library(dplyr)
    set.seed(1234)

    # Helper fn for mapping integer 1..26 to character label
    integer_to_label <- function(i) { substr("abcdefghijklmnopqrstuvwxyz", i, i) }

    df <- tbl_df(data.frame(u = sample.int(3, 10, replace = TRUE),
                            v = sample.int(4, 10, replace = TRUE)))

    # Want to label/number each distinct group of unique (u,v) combinations
    df %>% group_by(u, v) %>% mutate(label = n())
    # WRONG: n() is the number of elements within the group, not the overall number of the group

        u v
    1   2 3
    2   1 3
    3   1 2
    4   2 3
    5   1 2
    6   3 3
    7   1 3
    8   1 2
    9   3 1
    10  3 4

Kludge 1: could do df %>% group_by(u,v) %>% summarize(label = n()), then self-join.
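To make the kludge concrete, here is a rough sketch of the three steps. It assumes dplyr 1.0.0 or later (for the .groups argument); the intermediate names group_sizes, group_labels and joined are made up for illustration, and the data is the ten rows above typed in by hand.

```r
library(dplyr)

# The ten (u, v) rows shown in the question, typed in by hand
df <- tibble(
  u = c(2, 1, 1, 2, 1, 3, 1, 1, 3, 3),
  v = c(3, 3, 2, 3, 2, 3, 3, 2, 1, 4)
)

# Step 1: collapse to one row per distinct (u, v) combination
group_sizes <- df %>%
  group_by(u, v) %>%
  summarize(size = n(), .groups = "drop")

# Step 2: number the summary rows; this is the group label
group_labels <- group_sizes %>%
  mutate(label = row_number()) %>%
  select(u, v, label)

# Step 3: join the labels back onto the original rows
joined <- df %>% left_join(group_labels, by = c("u", "v"))
```

It works, but it is exactly the summarize-and-self-join detour the question is trying to avoid.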
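Updated answer

    # Closure that keeps a counter and returns the next value on each call
    get_group_number = function(){
        i = 0
        function(){
            i <<- i + 1
            i
        }
    }
    group_number = get_group_number()
    # group_number() is evaluated once per group, so each group gets its own number
    df %>% group_by(u,v) %>% mutate(label = group_number())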
You can also consider the following unreadable version:

    group_number = (function(){i = 0; function() i <<- i+1 })()
    df %>% group_by(u,v) %>% mutate(label = group_number())
Using the iterators package:

    library(iterators)
    counter = icount()
    df %>% group_by(u,v) %>% mutate(label = nextElem(counter))