How to add a column based on other columns in an R data frame?
Last Updated :
30 Apr, 2021
A data frame can be accessed and modified to insert or delete data. It can grow to hold more data, and the values in its rows and columns can be modified based on the values of other cells.
In this article, we will see how to add a column to a data frame based on other columns in the R programming language. There are several ways to do this. Let's discuss them in detail.
Method 1 : Using transform() function
The transform() function in R is used to modify a data frame. Its first argument is the data frame to transform. New columns are supplied as further arguments of the form name = expression, where the expression can refer to the existing columns by name. We specify the name of the new column on the left side of the = and the ifelse() expression that computes its values on the right. The ifelse() expression consists of three parts:
- The condition to test each value against
- The value returned when the condition is satisfied
- The value returned when it isn't
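The three parts of ifelse() can be seen in isolation on a plain vector. This is a minimal sketch; the vector x and the labels "big" and "small" are made up for illustration and are not part of the article's example.

```r
# Illustrative values only (hypothetical, not from the article's data frame)
x <- c(1, 2, 3, -4)

# ifelse(test, yes, no) picks, element by element,
# "yes" where the test is TRUE and "no" where it is FALSE
result <- ifelse(x > 1, "big", "small")
print(result)
# [1] "small" "big"   "big"   "small"
```

Because ifelse() works on whole vectors, the same call can be used directly on data frame columns.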
The data frame returned by transform() has to be explicitly assigned back to the original variable in order to retain the new column.
Syntax:
transform(dataframe,x=c(..))
where x is the newly added column.
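As a minimal sketch of why the assignment matters, the snippet below uses a small hypothetical data frame (df, with columns a and b, which are not part of the article's example) to show that transform() returns a modified copy and leaves its input untouched until the result is assigned back.

```r
# Hypothetical data frame used only for this illustration
df <- data.frame(a = c(1, 2), b = c(10, 20))

# transform() returns a new data frame; df itself is not changed here
with_total <- transform(df, total = a + b)
print(names(df))         # still only "a" and "b"
print(with_total$total)  # 11 22

# Assigning the result back makes the new column part of df
df <- transform(df, total = a + b)
print(names(df))         # "a" "b" "total"
```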
Example:
R
# creating a data frame
data_frame <- data.frame(col1=c(1,2,3,-4),
                         col2=c(8,9,5,10),
                         col3=c(0,2,3,5))
# printing original data frame
print("Original Data Frame")
print(data_frame)
# transforming data frame
# declare col4: where col1 is equal to col3,
# use the col1+col2 value,
# otherwise use the col1+col3 value
data_frame <- transform(
  data_frame, col4 = ifelse(col1==col3, col1+col2, col1+col3))
print("Modified Data Frame")
print(data_frame)
Output
[1] "Original Data Frame"
  col1 col2 col3
1    1    8    0
2    2    9    2
3    3    5    3
4   -4   10    5
[1] "Modified Data Frame"
  col1 col2 col3 col4
1    1    8    0    1
2    2    9    2   11
3    3    5    3    8
4   -4   10    5    1
Method 2 : Using with() method
The with() function in R evaluates an expression using the columns of a data frame, and the result can then be stored as a new column. with() is a generic function that evaluates the expression given as its second argument in a local environment constructed from the data given as its first argument. Any expression, including an ifelse() call, can be provided as the second argument, and each value in the new column is then chosen according to whether the condition holds for that row. with() itself does not modify the data frame, so its result has to be assigned to the new column explicitly.
Syntax:
with(data, expr, …)
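A minimal sketch of what with() does on its own, assuming a hypothetical data frame scores with columns math and art (names invented for illustration): the expression is evaluated with the column names in scope, and the data frame itself stays unchanged.

```r
# Hypothetical data frame used only for this illustration
scores <- data.frame(math = c(40, 80), art = c(90, 60))

# Inside with(), columns can be referenced by name without scores$
avg <- with(scores, (math + art) / 2)
print(avg)            # 65 70

# with() only returns a value; scores still has two columns
print(names(scores))  # "math" "art"
```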
Example:
R
# creating a data frame
data_frame <- data.frame(col1=c(1,2,3,-4),
                         col2=c(8,9,5,10),
                         col3=c(0,2,3,5))
# printing original data frame
print("Original Data Frame")
print(data_frame)
# transforming data frame
# declare col4: where col1+col3 is greater
# than 5, use the col1+col3 value,
# otherwise use the col1+col2 value
data_frame$col4 <- with(
  data_frame, ifelse(col1+col3>5, col1+col3, col1+col2))
print("Modified Data Frame")
print(data_frame)
Output
[1] "Original Data Frame"
  col1 col2 col3
1    1    8    0
2    2    9    2
3    3    5    3
4   -4   10    5
[1] "Modified Data Frame"
  col1 col2 col3 col4
1    1    8    0    9
2    2    9    2   11
3    3    5    3    6
4   -4   10    5    6
Method 3 : Using apply() method
The apply() function in R takes a data frame or matrix as input and returns a vector, list, or array. It is primarily used to avoid explicit loop constructs. Any function can be passed to apply(). The result has to be explicitly assigned back to the original data frame in order to retain it.
Syntax: apply(X, MARGIN, FUN)
Parameters:
- X: a data frame or a matrix
- MARGIN: 1, 2, or c(1,2), defining where the function is applied:
- MARGIN=1 : the function is applied to each row
- MARGIN=2 : the function is applied to each column
- MARGIN=c(1,2) : the function is applied to each individual element
- FUN: the function to apply; built-in functions like mean, median, sum, min and max, as well as user-defined functions, can be used
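The effect of the MARGIN argument can be sketched on a small matrix. The matrix m and the helper names below are hypothetical and chosen only for illustration.

```r
# Hypothetical 2 x 3 matrix, filled column by column:
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
m <- matrix(1:6, nrow = 2)

row_sums <- apply(m, 1, sum)   # one result per row
col_sums <- apply(m, 2, sum)   # one result per column
print(row_sums)                # 9 12
print(col_sums)                # 3 7 11

# MARGIN = c(1, 2) calls the function once per element,
# so the result keeps the shape of the input
scaled <- apply(m, c(1, 2), function(v) v * 10)
print(dim(scaled))             # 2 3
```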
Example:
R
# creating a data frame
data_frame <- data.frame(col1=c(1,2,3,-4),
                         col2=c(8,9,5,10),
                         col3=c(0,2,3,5))
# printing original data frame
print("Original Data Frame")
print(data_frame)
# transforming data frame
# declare col4: where col1 is greater
# than 1, use the col1+col2 value,
# otherwise use the col3-col2 value
# (x holds one row: x[1] is col1, x[2] is col2, x[3] is col3)
data_frame$col4 <- apply(
  data_frame, 1, FUN = function(x) if(x[1]>1) x[2]+x[1] else x[3]-x[2])
print("Modified Data Frame")
print(data_frame)
Output
[1] "Original Data Frame"
  col1 col2 col3
1    1    8    0
2    2    9    2
3    3    5    3
4   -4   10    5
[1] "Modified Data Frame"
  col1 col2 col3 col4
1    1    8    0   -8
2    2    9    2   11
3    3    5    3    8
4   -4   10    5   -5
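For a row-wise rule this simple, the same column can probably also be computed without apply(). The sketch below is not part of the original example; it rebuilds the same data frame and compares the row-by-row result with a vectorised ifelse() inside with(). The variable names by_row and vectorised are invented for illustration.

```r
# Same data as in the example above
data_frame <- data.frame(col1 = c(1, 2, 3, -4),
                         col2 = c(8, 9, 5, 10),
                         col3 = c(0, 2, 3, 5))

# Row-by-row version: x is one row, indexed by position
by_row <- apply(data_frame, 1,
                function(x) if (x[1] > 1) x[2] + x[1] else x[3] - x[2])

# Vectorised version: columns are referenced by name
vectorised <- with(data_frame, ifelse(col1 > 1, col1 + col2, col3 - col2))

print(unname(by_row))
print(vectorised)
# Both print: -8 11  8 -5
```

The vectorised form refers to columns by name rather than by position, which may be easier to read when the data frame has many columns.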