session-stringr.qmd

---
title: "Introduction to R and Rstudio"
subtitle: "Session  - {stringr}"
execute:
  eval: true
---

## Wildcards searches using {stringr}

```{r}
#| echo: false
#| label: "libs"
library(tidyverse)
library(stringr)
```

```{r}
#| echo: false
#| label: "load-data"
beds_data <- read_csv(url("https://raw.githubusercontent.com/nhs-r-community/intro_r_data/main/beds_data.csv"),
  col_types = cols(date = col_date(format = "%d/%m/%Y")),
  skip = 3
)
```

A natural step to searching for long strings is to consider searching by key words

```{r}
#| eval: false
library(dplyr)
library(stringr)
beds_data |>
  filter(str_detect(
    string = org_name,
    pattern = "Bradford",
    negate = FALSE
  ))
```

See what happens when negate is changed to `TRUE`

## Adding trailing spaces

Quite often data has trailing spaces but using {readr}, interestingly, corrects this!

This data set has had trailing white space added to the beginning of the name and afterwards:

```{r}
by_ethnicity <- tibble::tribble(
  ~Ethnicity, ~`%`, ~Headcount, ~`%.working.age.population.(2011)`,
     "  Asian  ",  10,     118396,                                7.2,
     "  Black ",  6.1,      72321,                                3.4,
   "    Chinese    ",  0.6,       6536,                                0.9,
     " Mixed  ",  1.7,      20607,                                1.8,
     " White ", 79.2,     934544,                               85.6,
     "Other",  2.3,      27169,                                1.1
  )
```

## Removing trailing spaces

```{r}
by_ethnicity |>
  mutate(trimmed_name = str_trim(Ethnicity, "both"))
```

## End session