
Character Functions in R

  • Lalit Salunkhe
  • Aug 10, 2020

Strings, or character data, are among the most important data types in R, alongside numerics. It helps to know the commonly used built-in functions that are designed to work with character values. In this article, we discuss some of the built-in functions that can be used to manipulate strings in R.

 

Functions used to inspect and manipulate character data are known as character functions in R. We will discuss the following character functions in this article.

  • is.character() function 

  • as.character() function

  • substr() function

  • grep() function

  • sub() function

  • strsplit() function

  • paste() function

  • toupper() function

  • tolower() function

Let us discuss these functions one by one in detail.


 

The is.character() Function in R

 

When you want to check whether an object is of character (string) type, you can use the is.character() function in R. This function takes an object, such as a single value or a vector, and checks its type. It returns a single TRUE if the object is a character vector and FALSE otherwise. It does not test each element separately.

See an example below for the is.character() function.



Example code with output for is.character() function in R


Since the first two examples hold strings, the function returns TRUE. The third example holds numeric values, so the function returns FALSE.
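The behaviour described above can be sketched in a few lines. This is a minimal illustration, not the original example from the image; the variable names and values (x1, x2, x3) are assumptions chosen for demonstration.

```r
# Hypothetical inputs, chosen only for illustration
x1 <- "Hello"            # a single string
x2 <- c("a", "b", "c")   # a vector of strings
x3 <- c(1, 2, 3)         # a numeric vector

r1 <- is.character(x1)   # TRUE
r2 <- is.character(x2)   # TRUE (one result for the whole vector)
r3 <- is.character(x3)   # FALSE

print(c(r1, r2, r3))
```

Note that is.character() returns one logical value per object, regardless of how many elements the vector holds.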

 

The as.character() Function in R

 

If you want to convert an input value into a character, use the as.character() function. It takes an object or a vector of another data type (for example logical, numeric, or complex) and converts it into a character vector.

 

Let’s see an example for the as.character() function given below:



Example code for as.character() function in R


As you can see, we feed this function objects of different data types (logical, numeric, complex), and it converts them into character every time. We can always use the class() function to check which class the result belongs to.
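A short sketch of such conversions follows. The values and variable names (lgl, num, cpx) are illustrative assumptions, not the ones from the original image.

```r
# Hypothetical values of three different types
lgl <- TRUE
num <- 42.5
cpx <- 3+2i

lgl_chr <- as.character(lgl)   # "TRUE"
num_chr <- as.character(num)   # "42.5"
cpx_chr <- as.character(cpx)   # "3+2i"

# class() confirms that every result is a character vector
print(class(lgl_chr))
print(class(num_chr))
print(class(cpx_chr))
```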

 

The substr() Function in R

 

The substr() function in R allows us to extract a substring from a given string object (or a vector containing string objects) based on the starting and the ending point. 

The syntax for the substr() function is as shown below:

substr(x, start, stop)

Where,

x - is a character vector from which the substrings are extracted.

start - is the position of the first character to extract.

stop - is the position of the last character to extract.

 

Interestingly, this function also allows you to replace the substrings based on the starting and ending position.

Let’s see an example for substr() function below:



Example code with output for the substr() function
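As a minimal sketch of both uses of substr(), extraction and replacement, consider the following. The strings and variable names (s, v) are assumptions made for illustration. Positions are counted from 1, and both start and stop are inclusive.

```r
# Hypothetical inputs for illustration
s <- "Programming"
v <- c("apple", "banana")

part  <- substr(s, 1, 7)   # "Program"
parts <- substr(v, 2, 4)   # "ppl" "ana" (applied to each element)

# Replacement form: overwrite the characters between start and stop
substr(s, 1, 1) <- "p"

print(part)
print(parts)
print(s)                   # "programming"
```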


The grep() Function in R

 

Whenever we need to search for a specific pattern in R, we use the grep() function. This function searches each element of a character vector for a pattern (a regular expression by default). By default, the output is the indices of the elements that contain a match, not the position of the match within each string.

 

Let us see an example for the grep() function in R:



Example code with output for the grep() function


In the example code above, we search for the character “a” in the vector named “vect”. Since “a” is present in the first four elements of the vector, the output is 1, 2, 3, and 4. These are the indices of the elements that contain the character “a”.
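The following sketch reproduces that kind of result. The vector name vect comes from the text, but its contents here are an assumption, chosen so that only the first four elements contain “a”.

```r
# Contents of vect are hypothetical; only the name comes from the article
vect <- c("apple", "banana", "grape", "orange", "kiwi")

idx     <- grep("a", vect)                 # 1 2 3 4
matches <- grep("a", vect, value = TRUE)   # the matching elements themselves
flags   <- grepl("a", vect)                # one TRUE/FALSE per element

print(idx)
print(matches)
print(flags)
```

Setting value = TRUE returns the matching strings instead of their indices, and grepl() returns a logical vector that is convenient for filtering.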


 

The sub() Function in R

 

The sub() function in R is a replacement function. It searches each element of a character vector for a pattern and replaces the first match in that element with the replacement provided.

 

Let us see an example for the sub() function through the image below:



Example code with output for the sub() function


Here, the first “world” is replaced with “universe”, but the second occurrence of the word is left unchanged. This is how the sub() function works: it replaces only the first match. To replace every match, use the gsub() function instead.
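A minimal sketch of this behaviour is shown below. The sentence stored in txt is an assumption made for illustration; only the words “world” and “universe” come from the text above.

```r
# Hypothetical input containing "world" twice
txt <- "Hello world, welcome to the world of R"

first_only <- sub("world", "universe", txt)    # replaces the first match only
all_of_them <- gsub("world", "universe", txt)  # replaces every match

print(first_only)   # "Hello universe, welcome to the world of R"
print(all_of_them)  # "Hello universe, welcome to the universe of R"
```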


The strsplit() Function in R

 

While working on text mining projects, we may need to find the most frequent words in a given string. To do that, we first need to split the string into multiple substrings, and the strsplit() function helps us achieve this. The function splits the given string at every occurrence of a specified delimiter, for example at every space. It returns a list with one character vector per input string.

 

See an example below for a better understanding of the strsplit() function:



Example code with output for the strsplit() function


In this code, you can see that the split happens at every space in the vector str_1.
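Splitting on spaces can be sketched as follows. The name str_1 comes from the text, but the sentence assigned to it here is an assumption.

```r
# The content of str_1 is hypothetical; only the name comes from the article
str_1 <- "R makes text processing simple"

# strsplit() returns a list, with one character vector per input string
pieces <- strsplit(str_1, " ")
words  <- pieces[[1]]

print(words)   # "R" "makes" "text" "processing" "simple"
```

Because the result is a list, the words of the first (and here only) input string are accessed with [[1]].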

 

The paste() Function in R

 

The paste() function converts its arguments, such as vectors, into character strings and concatenates them element by element, placing a separator between the pieces.

 

An example of the paste() function is given below:



Example code with output for the paste() function


In the first example, we have created a sequence of numbers from 1 to 5 and concatenated them with the character “y” with separator “_”.

In the second example, we have used the “collapse =” argument, which joins all the resulting elements into a single string with a delimiter (we used a comma, but you can use any delimiter of your choice).
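Both cases can be sketched as below. The exact argument order in the original image is not recoverable, so placing “y” before the numbers is an assumption, as are the variable names.

```r
# Element-wise concatenation with "_" as the separator
joined <- paste("y", 1:5, sep = "_")
print(joined)      # "y_1" "y_2" "y_3" "y_4" "y_5"

# collapse joins the resulting elements into one single string
collapsed <- paste("y", 1:5, sep = "_", collapse = ",")
print(collapsed)   # "y_1,y_2,y_3,y_4,y_5"
```

The sep argument controls what goes between the pieces of each element, while collapse controls what goes between the elements when they are merged into one string.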


 

The toupper() Function in R

 

When you need to convert an entire string to upper case, you can use the toupper() function.

 

See an example below for the toupper() function.



Example code with output for the toupper() function
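A minimal sketch of toupper() follows. The input strings are assumptions chosen for illustration.

```r
# Hypothetical inputs for illustration
single <- toupper("character functions in r")
print(single)   # "CHARACTER FUNCTIONS IN R"

# Works element by element on vectors; digits and spaces are unchanged
several <- toupper(c("abc", "Mixed Case 123"))
print(several)  # "ABC" "MIXED CASE 123"
```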


The tolower() Function in R

 

In contrast to the toupper() function, the tolower() function converts an entire string to lower case. It takes the string provided as input and returns the lower-case version as output.

 

An example of the tolower() function is shown below:



Example code with output for the tolower() function


In the above example, the input vector holds a string entirely in upper case, and the tolower() function converts it to lower case and returns it as output.
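This can be sketched as follows. The upper-case input string is an assumption, since the original one appeared only in the image.

```r
# Hypothetical upper-case input
shout <- "CHARACTER FUNCTIONS IN R"

quiet <- tolower(shout)
print(quiet)   # "character functions in r"

# The original object is not modified; tolower() returns a new value
print(shout)
```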

 

Conclusion

 

  • Character functions are designed specially to deal with the character strings or vectors with strings.

  • Character functions like is.character() and as.character() can be used to check whether the given input is a string or not and to convert the input values into the string respectively.

  • The grep(), sub(), and strsplit() functions are among the most important and widely used functions in text analytics and text mining.

  • Character functions such as toupper() and tolower() help us convert the entire input string into upper case or lower case respectively.

  • These are not the only character functions under R programming. There are many more (and you can define some of your own as well). However, we tried to explain some of the most widely used and common character functions in this article.

 

That is it for this article. Stay tuned for the series of R programming articles, where we move from the basics to an advanced level. We will come up with a new and interesting article from the world of R programming. Until then, stay home! Stay safe!
