dplyr case_when across groups


4galaxy7:

I have a data frame df:

df = data.frame(
    group = c(rep("A", 3), rep("B", 3)), 
    vt = c("SO:0001574", "SO:0001619", "SO:0001619", "SO:0001619", "SO:0001619", "SO:0001821")
    )

and two vectors:

tier_1 = c("SO:0001574", "SO:0001575")
tier_2 = c("SO:0001821", "SO:0001822")

I want to produce an output:

  group         vt     ct
1     A SO:0001574 tier_1
2     A SO:0001619 tier_1
3     A SO:0001619 tier_1
4     B SO:0001619 tier_2
5     B SO:0001619 tier_2
6     B SO:0001821 tier_2

That is, I want to generate a third column, ct, populated according to whether any vt value in a group appears in tier_1 or tier_2, so that all rows within a given group get that tier.

I tried:

df %>%
    dplyr::group_by(group) %>% 
    dplyr::mutate(tier = dplyr::case_when(
        vt %in% tier_1 ~ "tier_1",
        vt %in% tier_2 ~ "tier_2"))

But this only populates the row that matches, not all rows in the group:

# A tibble: 6 x 4
# Groups:   group [2]
  group vt         ct     tier  
  <chr> <chr>      <chr>  <chr> 
1 A     SO:0001574 tier_1 tier_1
2 A     SO:0001619 tier_1 NA    
3 A     SO:0001619 tier_1 NA    
4 B     SO:0001619 tier_2 NA    
5 B     SO:0001619 tier_2 NA    
6 B     SO:0001821 tier_2 tier_2

Ronak Shah:

Wrap the conditions in any() to get one logical value per group:

library(dplyr)

df %>%
 group_by(group) %>% 
 mutate(tier = case_when(
                any(vt %in% tier_1) ~ "tier_1",
                any(vt %in% tier_2) ~ "tier_2"))

#  group vt         tier  
#  <chr> <chr>      <chr> 
#1 A     SO:0001574 tier_1
#2 A     SO:0001619 tier_1
#3 A     SO:0001619 tier_1
#4 B     SO:0001619 tier_2
#5 B     SO:0001619 tier_2
#6 B     SO:0001821 tier_2
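
To see why this works, here is a self-contained sketch using the data from the question. It assumes dplyr is installed, and the name `result` is only for illustration. Inside a grouped mutate(), vt %in% tier_1 gives one TRUE/FALSE per row, whereas any() collapses that to a single value for the whole group, and mutate() then recycles the single case_when() result to every row of the group.

```r
library(dplyr)

# Data and lookup vectors copied from the question
df <- data.frame(
  group = c(rep("A", 3), rep("B", 3)),
  vt = c("SO:0001574", "SO:0001619", "SO:0001619",
         "SO:0001619", "SO:0001619", "SO:0001821")
)
tier_1 <- c("SO:0001574", "SO:0001575")
tier_2 <- c("SO:0001821", "SO:0001822")

# `result` is a hypothetical name used only for this illustration
result <- df %>%
  group_by(group) %>%
  mutate(tier = case_when(
    any(vt %in% tier_1) ~ "tier_1",  # TRUE if any row in the group matches tier_1
    any(vt %in% tier_2) ~ "tier_2"   # checked only if the first condition is FALSE
  )) %>%
  ungroup()                          # drop grouping so later steps are not per-group

result$tier
```

Because case_when() uses the first condition that is TRUE, a group containing terms from both vectors would be labelled tier_1, and a group matching neither would get NA.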
