
Data Structure
Networking
RDBMS
Operating System
Java
MS Excel
iOS
HTML
CSS
Android
Python
C Programming
C++
C#
MongoDB
MySQL
Javascript
PHP
- Selected Reading
- UPSC IAS Exams Notes
- Developer's Best Practices
- Questions and Answers
- Effective Resume Writing
- HR Interview Questions
- Computer Glossary
- Who is Who
Create Binary Variable Column Based on Condition in R Data Frame
Sometimes we need to create extra variable to add more information about the present data because it adds value. This is especially used while we do feature engineering. If we come to know about something that may affect our response then we prefer to use it as a variable in our data, hence we make up that with the data we have. For example, creating another variable applying conditions on other variable such as creating a binary variable for goodness if the frequency matches a certain criterion.
Example
Consider the below data frame −
set.seed(100) Group<-rep(c("A","B","C","D","E"),times=4) Frequency<-sample(20:30,20,replace=TRUE) df1<-data.frame(Group,Frequency) df1
Output
Group Frequency 1 A 29 2 B 26 3 C 25 4 D 22 5 E 28 6 A 29 7 B 26 8 C 25 9 D 25 10 E 23 11 A 26 12 B 25 13 C 21 14 D 26 15 E 26 16 A 26 17 B 30 18 C 27 19 D 21 20 E 22
Creating a column category having two levels as Good and Bad, where Good is for those that have Frequency greater than 25−
Example
df1$Category<-ifelse(df1$Frequency>25,"Good","Bad") df1
Output
Group Frequency Category 1 A 29 Good 2 B 26 Good 3 C 25 Bad 4 D 22 Bad 5 E 28 Good 6 A 29 Good 7 B 26 Good 8 C 25 Bad 9 D 25 Bad 10 E 23 Bad 11 A 26 Good 12 B 25 Bad 13 C 21 Bad 14 D 26 Good 15 E 26 Good 16 A 26 Good 17 B 30 Good 18 C 27 Good 19 D 21 Bad 20 E 22 Bad
Let’s have a look at another example −
Example
Class<-rep(c("Lower","Middle","Upper Middle","Higher"),times=5) Ratings<-sample(1:10,20,replace=TRUE) df2<-data.frame(Class,Ratings) df2
Output
Class Ratings 1 Lower 3 2 Middle 8 3 Upper Middle 2 4 Higher 9 5 Lower 2 6 Middle 3 7 Upper Middle 4 8 Higher 4 9 Lower 4 10 Middle 5 11 Upper Middle 7 12 Higher 9 13 Lower 4 14 Middle 2 15 Upper Middle 6 16 Higher 7 17 Lower 1 18 Middle 6 19 Upper Middle 9 20 Higher 9
Example
df2$Group<-ifelse(df2$Ratings>5,"Royal","Standard") df2
Output
Class Ratings Group 1 Lower 3 Standard 2 Middle 8 Royal 3 Upper Middle 2 Standard 4 Higher 9 Royal 5 Lower 2 Standard 6 Middle 3 Standard 7 Upper Middle 4 Standard 8 Higher 4 Standard 9 Lower 4 Standard 10 Middle 5 Standard 11 Upper Middle 7 Royal 12 Higher 9 Royal 13 Lower 4 Standard 14 Middle 2 Standard 15 Upper Middle 6 Royal 16 Higher 7 Royal 17 Lower 1 Standard 18 Middle 6 Royal 19 Upper Middle 9 Royal 20 Higher 9 Royal
Advertisements