dplyr Package in R Programming

Last Updated : 24 Jun, 2025

The dplyr package for R offers efficient data manipulation functions. It makes data transformation and summarization simple with concise, readable syntax.

Key Features of dplyr

Here are some important features of dplyr that make working with data easier.

1. Data Frame

A data frame is a table in which each column stores one type of information, such as names, ages, or scores. Data frames are the main input and output of dplyr functions. Creating a data frame involves specifying the column names and their respective values.

R
df <- data.frame(Name = c("vipul", "jayesh", "anurag"),
                 Age = c(25, 23, 22), Score = c(95, 89, 78))
df                 

Output:

A data frame with three rows and the columns Name, Age and Score.

Note: Tibbles, introduced through the tibble package, share similar functionality with data frames but offer more user-friendly features. The syntax for creating a tibble is similar to that of a data frame.
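To make the note concrete, here is a minimal sketch of creating a tibble with the same columns as the data frame above. It assumes dplyr is installed and relies on the tibble() and as_tibble() functions that dplyr re-exports from the tibble package. The names tb and tb_from_df are only illustrative.

```r
library(dplyr)  # re-exports tibble() and as_tibble() from the tibble package

# Create a tibble directly, with the same syntax as data.frame()
tb <- tibble(Name = c("vipul", "jayesh", "anurag"),
             Age = c(25, 23, 22),
             Score = c(95, 89, 78))
print(tb)

# Convert an existing data frame to a tibble
df <- data.frame(Name = c("vipul", "jayesh", "anurag"),
                 Age = c(25, 23, 22), Score = c(95, 89, 78))
tb_from_df <- as_tibble(df)
class(tb_from_df)
```

A tibble is still a data frame, so it can be passed to the same dplyr functions.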

2. Pipes (%>%)

The pipe operator (%>%), which dplyr re-exports from the magrittr package, lets us chain multiple operations together. Each step passes its result as the first argument of the next function, which improves code readability.

R
library(dplyr)

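result <- mtcars %>%
  filter(mpg > 20) %>%           # keep cars with mpg above 20
  select(mpg, cyl, hp) %>%       # keep only the columns of interest
  group_by(cyl) %>%              # group by number of cylinders
  summarise(mean_hp = mean(hp))  # mean horsepower per group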

print(result)

Output:

A tibble with one row per cyl group among the cars with mpg above 20, showing mean_hp for each group.
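To illustrate the readability point, the sketch below writes the same computation twice: once as nested function calls and once with the pipe. The variable names nested and piped are illustrative; both versions should produce the same result.

```r
library(dplyr)

# Without the pipe: calls are nested and must be read inside-out
nested <- summarise(
  group_by(
    select(filter(mtcars, mpg > 20), mpg, cyl, hp),
    cyl
  ),
  mean_hp = mean(hp)
)

# With the pipe: steps read top to bottom in the order they run
piped <- mtcars %>%
  filter(mpg > 20) %>%
  select(mpg, cyl, hp) %>%
  group_by(cyl) %>%
  summarise(mean_hp = mean(hp))

# Compare the two results
all.equal(nested, piped)
```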

Important dplyr Functions

dplyr provides several important functions for data manipulation. These are:

1. filter()

filter() selects rows (cases) based on conditions on their values.

R
d <- data.frame(name = c("Abhi", "Bhavesh", "Chaman", "Dimri"),
                age = c(7, 5, 9, 16),
                ht = c(46, NA, NA, 69),
                school = c("yes", "yes", "no", "no"))

print(d)

# Rows where ht is missing
r_w_na <- d %>% filter(is.na(ht))
print(r_w_na)

# Rows where ht is not missing
r_wo_na <- d %>% filter(!is.na(ht))
print(r_wo_na)

Output: 

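The first print shows all four rows of d. The second shows the rows where ht is NA (Bhavesh and Chaman), and the third shows the rows where ht is present (Abhi and Dimri).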
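filter() also accepts several conditions at once. The following sketch, using the same data and illustrative variable names (both, either, tall), shows how comma-separated conditions are combined with AND, how | expresses OR, and how rows whose condition evaluates to NA are dropped.

```r
library(dplyr)

d <- data.frame(name = c("Abhi", "Bhavesh", "Chaman", "Dimri"),
                age = c(7, 5, 9, 16),
                ht = c(46, NA, NA, 69),
                school = c("yes", "yes", "no", "no"))

# Comma-separated conditions must all hold (logical AND)
both <- filter(d, age > 6, school == "yes")

# Use | when either condition may hold (logical OR)
either <- filter(d, age > 10 | school == "yes")

# Rows where the condition is NA (here, missing ht) are dropped
tall <- filter(d, ht > 50)

print(both)
print(either)
print(tall)
```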

2. arrange()

arrange() reorders rows by the values of one or more columns, in ascending order by default.

R
d <- data.frame( name = c("Abhi", "Bhavesh", "Chaman", "Dimri"), 
                 age = c(7, 5, 9, 16), 
                 ht = c(46, NA, NA, 69),
                 school = c("yes", "yes", "no", "no") )
print(d)

# Sort rows by age in ascending order
d_sorted <- arrange(d, age)
print(d_sorted)

Output: 

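The first print shows d in its original order. The second shows the rows sorted by age: Bhavesh (5), Abhi (7), Chaman (9), Dimri (16).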
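As a further sketch on the same data, arrange() can sort in descending order with desc() and can take several sort keys. Missing values are placed at the end. The variable names below are illustrative.

```r
library(dplyr)

d <- data.frame(name = c("Abhi", "Bhavesh", "Chaman", "Dimri"),
                age = c(7, 5, 9, 16),
                ht = c(46, NA, NA, 69),
                school = c("yes", "yes", "no", "no"))

# Descending order of age
oldest_first <- arrange(d, desc(age))

# Sort by school first, then by age within each school value
by_school_age <- arrange(d, school, age)

# Rows with missing ht are placed last
by_ht <- arrange(d, ht)

print(oldest_first)
print(by_school_age)
print(by_ht)
```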

3. select() and rename()

select() picks columns (variables) by name, and rename() changes column names while keeping all columns.

R
library(dplyr)

d <- data.frame(name=c("Abhi", "Bhavesh", "Chaman", "Dimri"),
                age=c(7, 5, 9, 16),
                ht=c(46, NA, NA, 69),
                school=c("yes", "yes", "no", "no"))

d_new <- d %>%
  select(name, age, contains("h")) %>%
  rename(height = ht)

print(d_new)

Output: 

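A data frame with the columns name, age, height and school. contains("h") matches both ht and school, so all four columns are kept, and ht is renamed to height.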
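A few other common ways of selecting columns are sketched below on the same data: dropping a column with a minus sign, renaming inside select(), and matching by prefix with starts_with(). The result names are illustrative.

```r
library(dplyr)

d <- data.frame(name = c("Abhi", "Bhavesh", "Chaman", "Dimri"),
                age = c(7, 5, 9, 16),
                ht = c(46, NA, NA, 69),
                school = c("yes", "yes", "no", "no"))

# Drop one column and keep the rest
without_school <- select(d, -school)

# Rename while selecting; unlike rename(), unlisted columns are dropped
name_height <- select(d, name, height = ht)

# Keep columns whose names start with "s"
s_cols <- select(d, starts_with("s"))

names(without_school)
names(name_height)
names(s_cols)
```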

4. mutate() and transmute()

mutate() adds new variables that are functions of existing variables and keeps the existing ones. transmute() computes the same new variables but keeps only those.

R
d <- data.frame(name = c("Abhi", "Bhavesh", "Chaman", "Dimri"), 
                age = c(7, 5, 9, 16), 
                ht = c(46, NA, NA, 69),
                school = c("yes", "yes", "no", "no"))

mutate(d, x3 = ht + age)

transmute(d, x3 = ht + age)

Output: 

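mutate() returns the four original columns plus x3, with the values 53, NA, NA and 85. transmute() returns only the x3 column. x3 is NA wherever ht is missing.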
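As a small sketch of building on this, a single mutate() call can create several variables, and a later variable can refer to one created earlier in the same call. The column names age_months and group, and the threshold of 100 months, are made up for illustration.

```r
library(dplyr)

d <- data.frame(name = c("Abhi", "Bhavesh", "Chaman", "Dimri"),
                age = c(7, 5, 9, 16),
                ht = c(46, NA, NA, 69),
                school = c("yes", "yes", "no", "no"))

d2 <- mutate(d,
             age_months = age * 12,   # new column from an existing one
             # uses age_months, created on the line above
             group = if_else(age_months >= 100, "older", "younger"))

print(d2)
```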

5. summarise()

summarise() condenses many values into a single summary value, such as a mean, minimum, or maximum.

R
d <- data.frame(
  name = c("Abhi", "Bhavesh", "Chaman", "Dimri"),
  age = c(7, 5, 9, 16),
  ht = c(46, NA, NA, 69),
  school = c("yes", "yes", "no", "no")
)

summarise(d, 
          mean_age = mean(age, na.rm = TRUE), 
          min_age = min(age, na.rm = TRUE), 
          max_age = max(age, na.rm = TRUE), 
          median_age = median(age, na.rm = TRUE))

Output: 

A one-row data frame with mean_age = 9.25, min_age = 5, max_age = 16 and median_age = 8.
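summarise() is often combined with group_by() to get one summary row per group. The sketch below groups the same data by school; the result name by_school and the summary column names are illustrative.

```r
library(dplyr)

d <- data.frame(name = c("Abhi", "Bhavesh", "Chaman", "Dimri"),
                age = c(7, 5, 9, 16),
                ht = c(46, NA, NA, 69),
                school = c("yes", "yes", "no", "no"))

by_school <- d %>%
  group_by(school) %>%
  summarise(n = n(),                           # rows per group
            mean_age = mean(age),              # mean age per group
            mean_ht = mean(ht, na.rm = TRUE))  # ignore missing heights

print(by_school)
```

Here na.rm = TRUE matters for ht, because each group contains one missing height and the mean would otherwise be NA.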

6. sample_n() and sample_frac()

sample_n() draws a fixed number of random rows, and sample_frac() draws a fixed fraction of the rows.

R
d <- data.frame( name = c("Abhi", "Bhavesh",
                          "Chaman", "Dimri"), 
                 age = c(7, 5, 9, 16), 
                 ht = c(46, NA, NA, 69),
                 school = c("yes", "yes", "no", "no") )

sample_n(d, 3)

sample_frac(d, 0.50)

Output: 

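sample_n(d, 3) returns three randomly chosen rows, and sample_frac(d, 0.50) returns two (half of the four rows). The selected rows differ from run to run.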
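In dplyr 1.0.0 and later, sample_n() and sample_frac() are superseded by slice_sample(), which covers both cases through its n and prop arguments. The sketch below assumes such a version is installed and calls set.seed() so that the draw is repeatable; the seed value and result names are arbitrary.

```r
library(dplyr)

d <- data.frame(name = c("Abhi", "Bhavesh", "Chaman", "Dimri"),
                age = c(7, 5, 9, 16),
                ht = c(46, NA, NA, 69),
                school = c("yes", "yes", "no", "no"))

set.seed(42)  # arbitrary seed so the same rows are drawn on each run

# Equivalent of sample_n(d, 3)
three_rows <- slice_sample(d, n = 3)

# Equivalent of sample_frac(d, 0.50)
half_rows <- slice_sample(d, prop = 0.5)

print(three_rows)
print(half_rows)
```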
