## Built-in Character Functions in R

In earlier tutorials we saw how to create a character vector in R, that is, a vector whose elements are character strings. R has many built-in functions for manipulating strings and character vectors.

Function | Description |
---|---|
`nchar()` | Get the length of a string |
`toupper(x)` | Convert to upper case |
`tolower(x)` | Convert to lower case |
`casefold()` | Case folding |
`chartr()` | Character translation in a character vector |
`substr(x,start=n1,stop=n2)` | Extract or replace substrings in a character vector |
`strsplit()` | Split the elements of a character vector |
`strrep()` | Repeat the character string |
`paste()` | Concatenate vectors after converting to character |

## Examples of Character Functions

Let us discuss how to use all the above built-in character functions in R with the help of examples.

### String Length in R

The number of characters (including spaces) in a string, or in each element of a character vector, can be counted using the `nchar()` function.

```
# create a string
x <- "R Programming"
# count the number of characters
nchar(x)
```

`[1] 13`

```
# count the number of elements
length(x)
```

`[1] 1`

```
# create a character vector
y <-c("One","Two","Three","Four","Five")
# count no. of letters in each element of y
nchar(y)
```

`[1] 3 3 5 4 4`

```
# count the number of elements in y
length(y)
```

`[1] 5`

Note that the `length()` function gives the number of elements in a vector, whereas the `nchar()` function gives the number of characters in each element of a vector.
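Because `nchar()` returns one count per element, its result can be used to pick out elements by length. The following is a minimal sketch; the variable names `words`, `lens`, `longest` and `four_letter` are only illustrative.

```r
# character vector from the example above
words <- c("One", "Two", "Three", "Four", "Five")

# one count per element
lens <- nchar(words)

# element with the most characters
longest <- words[which.max(lens)]

# all elements with exactly four characters
four_letter <- words[lens == 4]

print(longest)      # "Three"
print(four_letter)  # "Four" "Five"
```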

### toupper() function in R

While programming, we often need to change the case of a string. The `toupper()` function converts the letters in a given string to uppercase.

```
x <- "r programming"
toupper(x)
```

`[1] "R PROGRAMMING"`

### tolower() function in R

The `tolower()` function converts the letters in a given string to lowercase.

```
y <- 'R Programming'
tolower(y)
```

`[1] "r programming"`
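A common use of case conversion is comparing strings without regard to case. The sketch below assumes two hypothetical strings `a` and `b` that differ only in case and converts both to lowercase before comparing them.

```r
a <- "R Programming"
b <- "r PROGRAMMING"

# exact comparison is case sensitive
exact_match <- a == b

# convert both sides to lowercase before comparing
ci_match <- tolower(a) == tolower(b)

print(exact_match)  # FALSE
print(ci_match)     # TRUE
```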

### casefold() function in R

By default, the `casefold()` function converts all the characters to lower case. With the argument `upper=TRUE` it converts all the characters to upper case instead.

`casefold("R is The bEst ProGramminG LanGuagE")`

`[1] "r is the best programming language"`

`casefold("R is The bEst ProGramminG LanGuagE", upper=TRUE)`

`[1] "R IS THE BEST PROGRAMMING LANGUAGE"`

### chartr() function in R

The `chartr(old,new,x)` function translates each of the `old` characters to the corresponding `new` character in the character vector `x`.

Suppose we need to replace the letter "r" with "R" in the sentence "r Language".

```
x <- "r Language"
chartr("r","R",x)
```

`[1] "R Language"`

The `chartr()` function can also be used for multiple replacements.

Suppose `x` is `R Programming Language` and we need to translate all the characters in the range `m` to `p` (i.e., `m, n, o, p`) to `M` to `P` (i.e., `M, N, O, P`), and also `g` to `G`.

```
x <- "R Programming Language"
chartr("m-pg","M-PG", x)
```

`[1] "R PrOGraMMiNG LaNGuaGe"`
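Each character in `old` is mapped to the character at the same position in `new`, and all the mappings are applied in a single pass, so characters can even be swapped. The sketch below illustrates this with made-up strings; `swapped` and `shifted` are illustrative names.

```r
# swap "a" and "b": every a becomes b and every b becomes a
swapped <- chartr("ab", "ba", "abba")

# a -> x, b -> y, c -> z, applied to each element of a vector
shifted <- chartr("abc", "xyz", c("aabbcc", "cab"))

print(swapped)  # "baab"
print(shifted)  # "xxyyzz" "zxy"
```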

### substr() function in R

The `substr()` function is used to extract or replace substrings in a character vector.

```
x <- "R Programming"
substr(x,3,9) # extract
```

`[1] "Program"`

In the above example, R extracts the substring from the $3^{rd}$ to the $9^{th}$ character of `x`.

```
# Replace 3rd to 5th character by abc
substr(x,3,5)<-"abc"
x
```

`[1] "R abcgramming"`

In the above example, R replaces the $3^{rd}$ to $5^{th}$ characters of `x` with the string `abc`.
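Two details of `substr()` are worth a short sketch: it works element by element on a vector, and the replacement form never changes the length of the string. The variable names below (`words`, `s1`, `s2`) are illustrative only.

```r
# capitalise the first letter of every element
words <- c("apple", "banana")
substr(words, 1, 1) <- toupper(substr(words, 1, 1))
print(words)  # "Apple" "Banana"

# replacement shorter than the range: only two characters are replaced
s1 <- "R Programming"
substr(s1, 3, 5) <- "ab"
print(s1)  # "R abogramming"

# replacement longer than the range: only its first three characters are used
s2 <- "R Programming"
substr(s2, 3, 5) <- "abcdef"
print(s2)  # "R abcgramming"
```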

### strsplit() function in R

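The `strsplit(x, " ")` function splits each element of the character vector `x` at every space and returns a list with one character vector of pieces per element of `x`.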

```
x <- c("R Programming","Python Programming")
strsplit(x," ")
```

```
[[1]]
[1] "R" "Programming"
[[2]]
[1] "Python" "Programming"
```
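Since `strsplit()` returns a list, the pieces are usually pulled out with `[[ ]]` or an apply-style function. The sketch below also shows splitting into single characters with an empty separator, and `fixed = TRUE`, which treats the separator as a literal string rather than a regular expression. The variable names are illustrative.

```r
x <- c("R Programming", "Python Programming")
parts <- strsplit(x, " ")

# second word of the first element
second_of_first <- parts[[1]][2]

# first word of every element
first_words <- vapply(parts, `[`, character(1), 1)

# an empty separator splits a string into single characters
chars <- strsplit("abc", "")[[1]]

# "." is a regular-expression metacharacter, so match it literally
dotted <- strsplit("a.b.c", ".", fixed = TRUE)[[1]]

print(second_of_first)  # "Programming"
print(first_words)      # "R" "Python"
print(chars)            # "a" "b" "c"
print(dotted)           # "a" "b" "c"
```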

### strrep() function in R

The `strrep(x,times)` function repeats each character string in a character vector a given number of times.

`strrep("ABC",4)`

`[1] "ABCABCABCABC"`

The above command creates a character string in which the string "ABC" is repeated four times.

`strrep(c("X","Y","Z"),1:4)`

`[1] "X" "YY" "ZZZ" "XXXX"`

The above command creates a vector containing the elements "X", "YY", "ZZZ" and "XXXX". Since the character vector has fewer elements than `times`, R applies the recycling rule and reuses "X" for the fourth element.

`strrep(" ",1:3)`

`[1] " "   "  "  "   "`

The above command creates a vector whose elements consist only of spaces: the first element contains one space, the second two spaces and the third three spaces.
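`strrep()` combines naturally with `nchar()`, for example to draw a rule exactly as wide as a heading. This is a minimal sketch; `title` and `underline` are illustrative names.

```r
title <- "R Programming"

# one dash per character of the title
underline <- strrep("-", nchar(title))

# print the title with the rule on the next line
cat(title, underline, sep = "\n")
```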

### paste() function in R

One of the most important functions for creating and building strings is the `paste()` function. Its basic syntax is

`paste(..., sep=" ",collapse=NULL)`

The `paste()` function takes one or more R objects and converts them to character vectors. It then concatenates them element by element, separated by `sep`, to create one or more character strings. If `collapse` is not `NULL`, the resulting strings are further joined into a single string, with `collapse` between them.

```
D <- c("R", "Python")
paste("Best programming language for data science is",D)
```

```
[1] "Best programming language for data science is R"
[2] "Best programming language for data science is Python"
```

`paste("Treatment",1:3,sep="-")`

`[1] "Treatment-1" "Treatment-2" "Treatment-3"`

Note that if the objects passed to `paste()` are of different lengths, R applies the recycling rule: the shorter objects are repeated to match the longest one.

`paste("Block",1:4,sep=" ")`

`[1] "Block 1" "Block 2" "Block 3" "Block 4"`
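The `collapse` argument from the syntax above joins the resulting strings into a single string. The sketch below shows it alone and together with `sep`, and also uses `paste0()`, the base R shorthand for `paste(..., sep="")`. The vector `fruits` and the other variable names are made up for illustration.

```r
fruits <- c("apple", "banana", "cherry")

# collapse joins the elements of the result into one string
one_string <- paste(fruits, collapse = ", ")

# sep joins the arguments element by element, collapse then joins the results
labels <- paste("Treatment", 1:3, sep = "-", collapse = " | ")

# paste0() concatenates without any separator
blocks <- paste0("Block", 1:3)

print(one_string)  # "apple, banana, cherry"
print(labels)      # "Treatment-1 | Treatment-2 | Treatment-3"
print(blocks)      # "Block1" "Block2" "Block3"
```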

## Endnote

In this tutorial you learned about some important built-in character functions available in R, with examples.

To learn more about other built-in functions and user-defined functions in R, please refer to the following tutorials:

- Mathematical functions in R
- Special Mathematical functions in R
- Trigonometric functions in R
- Statistical functions in R
- User-defined functions in R Part I
- User-defined functions in R Part II
- Functions in R

We hope you enjoyed this tutorial on built-in character functions in R.