Built-in Character Functions in R
In earlier tutorials we saw how to create character vectors in R, which contain character strings. To manipulate strings or character vectors, R provides many built-in character functions.
Function | Description |
---|---|
nchar() | Get the length of a string |
toupper(x) | Convert to upper case |
tolower(x) | Convert to lower case |
casefold() | Case folding |
chartr() | Character translation in a character vector |
substr(x,start=n1,stop=n2) | Extract or replace substrings in a character vector |
strsplit() | Split the elements of a character vector |
strrep() | Repeat the character string |
paste() | Concatenate vectors after converting to character |
Examples of Character Functions
Let us discuss how to use all the above built-in character functions in R with the help of examples.
String Length in R
The number of characters (including spaces) in a string, or in each element of a character vector, can be counted using the nchar() function.
# create a string
x <- "R Programming"
# count the number of characters
nchar(x)
[1] 13
# count the number of elements
length(x)
[1] 1
# create a character vector
y <-c("One","Two","Three","Four","Five")
# count no. of letters in each element of y
nchar(y)
[1] 3 3 5 4 4
# count the number of elements in y
length(y)
[1] 5
Note that the length() function gives the number of elements in a vector, whereas the nchar() function gives the number of characters in each element of a vector.
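As a small additional illustration (the vector z below is a made-up example, not part of the tutorial's data), nchar() returns 0 for an empty string, and it can be combined with sum() to get the total number of characters in a vector:

```r
# hypothetical example vector
z <- c("R", "", "data science")

# characters in each element; the space in "data science" is counted
n_each <- nchar(z)
n_each            # 1 0 12

# total number of characters across all elements
n_total <- sum(n_each)
n_total           # 13
```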
toupper() function in R
We often need to change the case of a string while programming. The toupper() function converts the letters in a given string to uppercase.
x <- "r programming"
toupper(x)
[1] "R PROGRAMMING"
tolower() function in R
The tolower() function converts the letters in a given string to lowercase.
y <- 'R Programming'
tolower(y)
[1] "r programming"
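One common use of case conversion is comparing strings without regard to case. The sketch below uses made-up values (a, b and answers are hypothetical) to show the idea:

```r
# hypothetical strings that differ only in case
a <- "Programming"
b <- "PROGRAMMING"

same_raw  <- a == b                      # FALSE: comparison is case sensitive
same_fold <- tolower(a) == tolower(b)    # TRUE: both sides converted first

# normalising labels before looking for distinct values
answers  <- c("Yes", "yes", "YES", "No")
distinct <- unique(tolower(answers))     # "yes" "no"
```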
casefold() function in R
By default, the casefold() function converts all the characters to lower case. We can use the argument upper=TRUE to convert all the characters to upper case instead.
casefold("R is The bEst ProGramminG LanGuagE")
[1] "r is the best programming language"
casefold("R is The bEst ProGramminG LanGuagE", upper=TRUE)
[1] "R IS THE BEST PROGRAMMING LANGUAGE"
chartr() function in R
The chartr(old,new,x) function translates each character in old to the corresponding character in new within the character vector x.
Suppose we need to replace the letter "r" with "R" in the string "r Language".
x <- "r Language"
chartr("r","R",x)
[1] "R Language"
The chartr() function can also be used for multiple replacements.
Suppose x is "R Programming Language" and we need to translate every character in the range m to p (i.e., m, n, o, p) in x to the range M to P (i.e., M, N, O, P), and g to G.
x <- "R Programming Language"
chartr("m-pg","M-PG", x)
[1] "R PrOGraMMiNG LaNGuaGe"
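To make the character-by-character nature of the translation explicit, here is a minimal sketch with made-up inputs. Each character of old is mapped to the character at the same position in new, and all mappings are applied in one pass, so two characters can even be swapped:

```r
# position-wise mapping: a -> x, b -> y, c -> z
t1 <- chartr("abc", "xyz", "aabbcc")
t1    # "xxyyzz"

# the mappings are applied simultaneously, so a and b are swapped
t2 <- chartr("ab", "ba", "abba")
t2    # "baab"
```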
substr() function in R
The substr()
function is used to extract or replace substrings in a character vector.
x <- "R Programming"
substr(x,3,9) # extract
[1] "Program"
In the above example, R extracts the substring from the $3^{rd}$ to the $9^{th}$ character of x.
# Replace 3rd to 5th character by abc
substr(x,3,5)<-"abc"
x
[1] "R abcgramming"
In the above example, R replaces the $3^{rd}$ to $5^{th}$ characters of x with the string abc.
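A point worth illustrating is that replacement with substr() never changes the length of the string: a replacement that is too long is truncated, and one that is too short only overwrites as many characters as it has. The following sketch uses hypothetical variable names (s1, s2, mon) to show this, along with the fact that substr() works on every element of a vector:

```r
# replacement longer than the target span: only "abc" is used
s1 <- "R Programming"
substr(s1, 3, 5) <- "abcdef"
s1    # "R abcgramming"

# replacement shorter than the target span: only two characters change
s2 <- "R Programming"
substr(s2, 3, 9) <- "ab"
s2    # "R abogramming"

# substr() is vectorised over the elements of a character vector
mon <- substr(c("January", "February"), 1, 3)
mon   # "Jan" "Feb"
```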
strsplit() function in R
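The strsplit(x, " ") function splits each element of the character vector x at the spaces and returns a list with one component for each element of x.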
x <- c("R Programming","Python Programming")
strsplit(x," ")
[[1]]
[1] "R" "Programming"
[[2]]
[1] "Python" "Programming"
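Because strsplit() returns a list, the pieces usually have to be pulled out afterwards. The sketch below (variable names such as parts and first_words are illustrative only) shows a few common ways to do that, and also uses fixed=TRUE, which makes strsplit() treat the split string literally rather than as a regular expression:

```r
x <- c("R Programming", "Python Programming")
parts <- strsplit(x, " ")

# pieces of the first element only
first_elem <- parts[[1]]                 # "R" "Programming"

# first word of every element
first_words <- sapply(parts, `[`, 1)     # "R" "Python"

# flatten all pieces into a single character vector
all_words <- unlist(parts)

# split on a literal dot; without fixed = TRUE the "." is a regex wildcard
dot_parts <- strsplit("a.b.c", ".", fixed = TRUE)[[1]]   # "a" "b" "c"
```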
strrep() function in R
The strrep(x,times) function repeats each character string in a character vector a given number of times.
strrep("ABC",4)
[1] "ABCABCABCABC"
The above command creates a character string in which the string "ABC" is repeated four times.
strrep(c("X","Y","Z"),1:4)
[1] "X" "YY" "ZZZ" "XXXX"
The above command creates a vector containing the elements "X", "YY", "ZZZ" and "XXXX". Since x has fewer elements than times, R applies the recycling rule and reuses "X" for the fourth element.
strrep("  ",1:3)
[1] "  "     "    "   "      "
The above command creates a vector of strings made up of spaces. Because the input string holds two spaces, the first element contains two spaces, the second element contains four spaces and the third element contains six spaces.
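As one possible practical use, strrep() can be combined with nchar() to build a separator line that matches the width of a heading. This is only a sketch; the names title and underline are made up for the example:

```r
# hypothetical heading text
title <- "Results"

# one dash per character of the heading
underline <- strrep("-", nchar(title))

# print the heading with the line beneath it
cat(title, underline, sep = "\n")
```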
paste() function in R
The basic syntax of the paste() function is
paste(..., sep=" ",collapse=NULL)
The paste() function is one of the most important functions for creating and building strings.
It takes one or more R objects and converts them to character vectors. It then concatenates them element by element, separated by sep, to create one or more character strings.
D <- c("R", "Python")
paste("Best programming language for data science is",D)
[1] "Best programming language for data science is R"
[2] "Best programming language for data science is Python"
paste("Treatment",1:3,sep="-")
[1] "Treatment-1" "Treatment-2" "Treatment-3"
Note that if the objects passed to the paste() function are of different lengths, R applies the recycling rule.
paste("Block",1:4,sep=" ")
[1] "Block 1" "Block 2" "Block 3" "Block 4"
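The collapse argument from the syntax above is not used in the previous examples, so here is a brief sketch of what it does: it joins the resulting elements into a single string. The vector langs is a made-up example, and paste0() is the base R shorthand for paste() with sep="".

```r
# hypothetical example vector
langs <- c("R", "Python", "Julia")

# collapse joins all elements into one string
one_string <- paste(langs, collapse = ", ")
one_string    # "R, Python, Julia"

# paste0() concatenates without any separator
blocks <- paste0("Block", 1:3)
blocks        # "Block1" "Block2" "Block3"

# sep is applied within each element, collapse between elements
combined <- paste("Block", 1:3, sep = "-", collapse = " | ")
combined      # "Block-1 | Block-2 | Block-3"
```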
Endnote
In this tutorial you learned about some important built-in character functions available in R, with illustrations.
To learn more about other built-in functions and user-defined functions in R, please refer to the following tutorials:
- Mathematical functions in R
- Special Mathematical functions in R
- Trigonometric functions in R
- Statistical functions in R
- User-defined functions in R Part I
- User-defined functions in R Part II
- Functions in R
Hope you enjoyed learning this tutorial on built-in character functions in R.