Section 6 Replace Values with NAs or a Specific Value

6.1 Introduction

  • We will explore some cases in which we can quickly change some values to NAs.

  • Some useful commands (from the library naniar) include:

    • replace_with_na(): Replaces specific value(s) in specific columns with NAs.
    • replace_with_na_all(): Replaces values that meet a condition with NAs in all columns.
    • replace_with_na_at(): Replaces values that meet a condition with NAs in a chosen set of columns.
    • replace_with_na_if(): Replaces values that meet a condition with NAs in the columns that satisfy a predicate (for example is.numeric or is.character).

suppressPackageStartupMessages(library(tidyverse))
library(naniar)

Let’s first create an example data set.

The data_example tibble created here mimics what we get when a numeric column in a CSV file contains a non-numeric entry. For example, column y holds numbers, but because it also contains NR (not reported), R reads the whole column as character.

data_example <- tribble(
  ~name,   ~x,   ~y,   ~z,    ~t,
  "Mr.A",  "a",  "2",  "3.6", "na",
  "Mr.B",  "b",  "1",  ".",   "2.1",
  "Ms. C", "NR", "NR", "10",  "3.4",
  "Ms. D", "NR", "NR", "NR",  "1"
)
data_example
## # A tibble: 4 × 5
##   name  x     y     z     t    
##   <chr> <chr> <chr> <chr> <chr>
## 1 Mr.A  a     2     3.6   na   
## 2 Mr.B  b     1     .     2.1  
## 3 Ms. C NR    NR    10    3.4  
## 4 Ms. D NR    NR    NR    1
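
To make the import problem concrete, here is a minimal sketch using base R's read.csv() on a small CSV string. The CSV text and the object names csv_text, raw, and clean are made up for illustration. The sketch also shows that declaring the code as missing at import time (the na.strings argument) is one way to avoid the problem when the code means "missing" in every column.

```r
# Hypothetical CSV content, mirroring column y of data_example
csv_text <- "name,y\nMr.A,2\nMr.B,1\nMs. C,NR"

# Default import: the single "NR" entry makes the whole column character
raw <- read.csv(text = csv_text, stringsAsFactors = FALSE)
class(raw$y)

# Declaring "NR" as a missing-value code lets y be parsed as a number
clean <- read.csv(text = csv_text, na.strings = "NR", stringsAsFactors = FALSE)
class(clean$y)
is.na(clean$y)
```

When the same code must be treated differently from column to column, the replacement has to happen after import, which is what the naniar functions in this section are for.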

6.2 replace_with_na(): Replaces specific value(s) in specific columns with NAs.

replace_with_na() takes a named list that maps each column to the values that should become NA in that column, so each column can have its own set of values to replace.

This is useful when a value should become NA in one column but not in another. For example, we want to convert NR to NA in columns x and y but not in column z, where only “.” should become NA and NR should stay as it is.

data_example %>% 
  naniar::replace_with_na(replace = list(x = c("NR"), 
                                 y = c("NR"), 
                                 z = c(".")))
## # A tibble: 4 × 5
##   name  x     y     z     t    
##   <chr> <chr> <chr> <chr> <chr>
## 1 Mr.A  a     2     3.6   na   
## 2 Mr.B  b     1     <NA>  2.1  
## 3 Ms. C <NA>  <NA>  10    3.4  
## 4 Ms. D <NA>  <NA>  NR    1
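
Once the non-numeric codes are gone, a column is still stored as character and usually needs an explicit conversion. The sketch below illustrates that follow-up step on a small made-up tibble (df and df_num are hypothetical names, not part of the example above).

```r
suppressPackageStartupMessages(library(tidyverse))
library(naniar)

# Hypothetical two-column example in the spirit of data_example
df <- tibble(x = c("a", "NR"), y = c("2", "NR"))

df_num <- df %>%
  replace_with_na(replace = list(x = "NR", y = "NR")) %>%
  # y now holds only digit strings and NA, so the conversion is clean
  mutate(y = as.numeric(y))

df_num
```

Converting before the replacement would also yield NA for "NR", but as.numeric() would emit a coercion warning, and any unexpected string would be turned into NA without being noticed.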

6.3 replace_with_na_all(): Replaces some values with NAs in all columns.

Here, we replace every “NR” with NA, whichever column it appears in.

data_example %>% replace_with_na_all(., condition = ~.x == "NR")
## # A tibble: 4 × 5
##   name  x     y     z     t    
##   <chr> <chr> <chr> <chr> <chr>
## 1 Mr.A  a     2     3.6   na   
## 2 Mr.B  b     1     .     2.1  
## 3 Ms. C <NA>  <NA>  10    3.4  
## 4 Ms. D <NA>  <NA>  <NA>  1

To replace several values at once, we can use %in%, as below:

data_example %>% replace_with_na_all(., condition = ~.x %in% c("NR", "."))
## # A tibble: 4 × 5
##   name  x     y     z     t    
##   <chr> <chr> <chr> <chr> <chr>
## 1 Mr.A  a     2     3.6   na   
## 2 Mr.B  b     1     <NA>  2.1  
## 3 Ms. C <NA>  <NA>  10    3.4  
## 4 Ms. D <NA>  <NA>  <NA>  1
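
When only one exact value has to be replaced, a similar result can be obtained without naniar. The sketch below swaps in dplyr's na_if() combined with across(); this is an alternative technique, not a naniar function, and the tibble df is made up for illustration.

```r
suppressPackageStartupMessages(library(tidyverse))

# Hypothetical data with "NR" as the only missing-value code
df <- tibble(x = c("a", "NR"), z = c("3.6", "NR"))

# na_if() turns one specific value into NA; across() applies it to every column
df_na <- df %>%
  mutate(across(everything(), ~ na_if(.x, "NR")))

df_na
```

na_if() compares against a single value, so for a set of values such as c("NR", ".") the %in% condition shown above remains the more direct choice.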

6.4 replace_with_na_at(): Replaces some chosen values in a chosen set of columns with NAs.

Replace “NR” and “.” at columns x and y with NAs:

data_example %>% replace_with_na_at(.vars = c("x", "y"),
                                    condition = ~.x %in% c("NR", "."))
## # A tibble: 4 × 5
##   name  x     y     z     t    
##   <chr> <chr> <chr> <chr> <chr>
## 1 Mr.A  a     2     3.6   na   
## 2 Mr.B  b     1     .     2.1  
## 3 Ms. C <NA>  <NA>  10    3.4  
## 4 Ms. D <NA>  <NA>  NR    1

6.5 replace_with_na_if(): Replaces some values with NAs in the columns that satisfy a predicate (is.numeric, is.character).

Replace “NR” and “.” with NAs in every character column:

data_example %>% replace_with_na_if(.predicate = is.character,
                                    condition = ~.x %in% c("NR", "."))
## # A tibble: 4 × 5
##   name  x     y     z     t    
##   <chr> <chr> <chr> <chr> <chr>
## 1 Mr.A  a     2     3.6   na   
## 2 Mr.B  b     1     <NA>  2.1  
## 3 Ms. C <NA>  <NA>  10    3.4  
## 4 Ms. D <NA>  <NA>  <NA>  1
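
Every column of data_example is character, so the result above is the same as with replace_with_na_all(). The predicate matters when the column types differ. The following sketch assumes a made-up tibble in which -99 is a missing-value code for numeric columns only; the names df, df_na, id, score, and age are hypothetical.

```r
suppressPackageStartupMessages(library(tidyverse))
library(naniar)

# Hypothetical data: -99 marks a missing value, but only in numeric columns
df <- tibble(id    = c("a", "-99"),
             score = c(10, -99),
             age   = c(-99, 35))

# Only columns for which is.numeric() is TRUE are checked against the condition
df_na <- df %>%
  replace_with_na_if(.predicate = is.numeric, condition = ~ .x == -99)

df_na
```

The character column id is skipped by the predicate, so its "-99" string is left as it is.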

6.6 A helpful observation using mutate_at and replace

Besides replacing values with NAs, we can also replace specific values with another value in selected columns. For example:

data_example %>% mutate_at(c("name"),
                           ~replace(., . %in% c("Mr.A", "Mr.B"), "The AB"))
## # A tibble: 4 × 5
##   name   x     y     z     t    
##   <chr>  <chr> <chr> <chr> <chr>
## 1 The AB a     2     3.6   na   
## 2 The AB b     1     .     2.1  
## 3 Ms. C  NR    NR    10    3.4  
## 4 Ms. D  NR    NR    NR    1
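
Since dplyr 1.0.0, mutate_at() has been superseded by mutate() combined with across(). A minimal sketch of the same replacement in that style, on a small made-up tibble (df and df_new are hypothetical names):

```r
suppressPackageStartupMessages(library(tidyverse))

# Hypothetical subset of data_example
df <- tibble(name = c("Mr.A", "Mr.B", "Ms. C"), x = c("a", "b", "NR"))

# across() selects the columns; replace() swaps the matching entries
df_new <- df %>%
  mutate(across(name, ~ replace(.x, .x %in% c("Mr.A", "Mr.B"), "The AB")))

df_new
```

Passing NA instead of "The AB" as the replacement value turns the matching entries into missing values, which gives a dplyr-only counterpart to replace_with_na_at().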

6.7 References

https://www.rdocumentation.org/packages/naniar/versions/0.6.1/topics/replace_with_na