The 'join' command in UNIX is a powerful command-line utility that allows you to merge lines from two files based on a common field, effectively combining related data into a more meaningful format. For instance, if you have one file with names and another with IDs, the join command can combine these files so that each name and corresponding ID appear on the same line.
Here, we will cover everything you need to know about the 'join' command, including its syntax and examples, to help you use it effectively in your workflows.
What is the 'join' Command?
The 'join' command is used to merge lines from two files based on a key field that is present in both files. This key can be any column of data, separated by whitespace or other delimiters. The default behavior of 'join' is to use the first field from each file as the key for joining.
Syntax:
$join [OPTION]... FILE1 FILE2
Join Command Example in Linux
Let us assume there are two files 'file1.txt' and 'file2.txt' and we want to combine the contents of these two files.
Displaying the contents of first file:
$cat file1.txt
1 AAYUSH
2 APAAR
3 HEMANT
4 KARTIK
Displaying contents of second file:
$cat file2.txt
1 101
2 102
3 103
4 104
To combine two files, they must share a common field. In this case, the numbering 1, 2, ... in the first column is the common field in both files.
Note: Both input files must be sorted on the key field used for the join. Otherwise join may miss matching lines or report that the input is not sorted.
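The sorting requirement can be met by running sort on each input first. The following is a minimal sketch, assuming two hypothetical unsorted files (names.txt and ids.txt, created here only for illustration) and a fixed C locale so that sort and join agree on the ordering:

```shell
# Hypothetical unsorted inputs, created in a temporary directory.
dir=$(mktemp -d)
printf '3 HEMANT\n1 AAYUSH\n4 KARTIK\n2 APAAR\n' > "$dir/names.txt"
printf '2 102\n4 104\n1 101\n3 103\n' > "$dir/ids.txt"

# Sort both files on the first field, using the same locale for sort and join.
LC_ALL=C sort -k1,1 "$dir/names.txt" > "$dir/names.sorted"
LC_ALL=C sort -k1,1 "$dir/ids.txt" > "$dir/ids.sorted"

# Join the sorted copies on the default (first) field.
joined=$(LC_ALL=C join "$dir/names.sorted" "$dir/ids.sorted")
printf '%s\n' "$joined"
```

Because join compares keys as text, the files should be sorted lexically (plain sort) rather than numerically (sort -n), even when the keys look like numbers.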
Using join command:
$join file1.txt file2.txt
1 AAYUSH 101
2 APAAR 102
3 HEMANT 103
4 KARTIK 104
By default, join uses the first column of each file as the key, as in the example above.
The output therefore contains the key, followed by the remaining columns of the first file 'file1.txt', followed by the remaining columns of the second file 'file2.txt'.
Now, if we wanted to create a new file with the joined contents, we could use the following command:
$join file1.txt file2.txt > newjoinfile.txt
This redirects the joined output into a new file 'newjoinfile.txt', which contains the same lines as the example above.
Options for 'join' command
- -a FILENUM: Also print unpairable lines from file FILENUM, where FILENUM is 1 or 2, corresponding to FILE1 or FILE2.
- -e EMPTY: Replace missing input fields with EMPTY.
- -i, --ignore-case: Ignore differences in case when comparing fields.
- -j FIELD: Equivalent to "-1 FIELD -2 FIELD".
- -o FORMAT: Obey FORMAT while constructing output line.
- -t CHAR: Use CHAR as input and output field separator.
- -v FILENUM: Like -a FILENUM, but suppress joined output lines.
- -1 FIELD: Join on this FIELD of file 1.
- -2 FIELD: Join on this FIELD of file 2.
- --check-order: Check that the input is correctly sorted, even if all input lines are pairable.
- --nocheck-order: Do not check that the input is correctly sorted.
- --help: Display a help message and exit.
- --version: Display version information and exit.
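The -o and -e options from the list above are often used together with -a to produce a fixed set of columns. The following is a small sketch under stated assumptions: the sample files are hypothetical, and NA is just an illustrative placeholder string.

```shell
# Hypothetical sample data: file2.txt has no entry for key 2.
dir=$(mktemp -d)
printf '1 AAYUSH\n2 APAAR\n3 HEMANT\n' > "$dir/file1.txt"
printf '1 101\n3 103\n' > "$dir/file2.txt"

# -a 1         : also print lines of file 1 that have no match
# -o 0,1.2,2.2 : print the join field, then field 2 of file 1, then field 2 of file 2
# -e NA        : print NA for any requested field that is missing
report=$(LC_ALL=C join -a 1 -e NA -o 0,1.2,2.2 "$dir/file1.txt" "$dir/file2.txt")
printf '%s\n' "$report"
```

In the -o format, 0 stands for the join field and M.N stands for field N of file M, so every output line has the same three columns whether or not a match was found.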
Using join with options
1. Using -a FILENUM option:
Sometimes one of the files contains extra lines that have no match in the other file. By default, join prints only pairable lines. For example, if file1.txt contains an extra line while the contents of file2.txt are unchanged, the output of the join command is the same as before:
//displaying the contents of file1.txt//
$cat file1.txt
1 AAYUSH
2 APAAR
3 HEMANT
4 KARTIK
5 DEEPAK
//displaying contents of file2.txt//
$cat file2.txt
1 101
2 102
3 103
4 104
//using join command//
$join file1.txt file2.txt
1 AAYUSH 101
2 APAAR 102
3 HEMANT 103
4 KARTIK 104
// although file1.txt has an extra line, the
output is not affected because line 5 of
file1.txt could not be paired with any line in file2.txt//
What if such unpairable lines are important and must appear in the output? In that case, use the '-a' option with the join command, which also prints unpairable lines. This option takes a file number (1 or 2) that tells join which file's unpairable lines to print.
//using join with -a option//
//1 is used with -a to also print the unpairable
lines of the first file passed//
$join -a 1 file1.txt file2.txt
1 AAYUSH 101
2 APAAR 102
3 HEMANT 103
4 KARTIK 104
5 DEEPAK
//line 5 of the first file is
also displayed with the help of the -a option
although it is unpairable//
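The -a option may be given once for each file, which keeps the unpairable lines from both sides. A minimal sketch, assuming hypothetical sample files in which each file has one line the other lacks:

```shell
# Hypothetical sample data: key 5 exists only in file1.txt, key 6 only in file2.txt.
dir=$(mktemp -d)
printf '1 AAYUSH\n2 APAAR\n5 DEEPAK\n' > "$dir/file1.txt"
printf '1 101\n2 102\n6 106\n' > "$dir/file2.txt"

# -a 1 -a 2: print joined lines plus the unpairable lines of both files.
both=$(LC_ALL=C join -a 1 -a 2 "$dir/file1.txt" "$dir/file2.txt")
printf '%s\n' "$both"
```

Unpairable lines are printed as they are, so they have fewer columns than the joined lines.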
2. Using -v option:
If you want to print only the unpairable lines, i.e. suppress the paired lines in the output, use the '-v' option with the join command. It takes a file number in the same way as '-a' does (the 1 passed to -v in the example below refers to the first file).
//using -v option with join//
$join -v 1 file1.txt file2.txt
5 DEEPAK
//the output only prints unpairable lines found
in first file passed//
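The same option can target the second file, or both files at once. A short sketch, assuming hypothetical sample files where each file has one unmatched line:

```shell
# Hypothetical sample data: key 5 exists only in file1.txt, key 6 only in file2.txt.
dir=$(mktemp -d)
printf '1 AAYUSH\n2 APAAR\n5 DEEPAK\n' > "$dir/file1.txt"
printf '1 101\n2 102\n6 106\n' > "$dir/file2.txt"

# -v 2: print only the lines of the second file that have no match.
only_second=$(LC_ALL=C join -v 2 "$dir/file1.txt" "$dir/file2.txt")
printf '%s\n' "$only_second"

# -v 1 -v 2: print the unmatched lines of both files and no joined lines.
unmatched=$(LC_ALL=C join -v 1 -v 2 "$dir/file1.txt" "$dir/file2.txt")
printf '%s\n' "$unmatched"
```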
3. Using -1, -2 and -j option:
As we already know, join combines lines of files on a common field, which is the first field by default. However, the common key need not be the first column in both files. The join command provides options for the case where the key is in another column.
If you want the second field of either file, or of both files, to be the common field for join, use the -1 and -2 command-line options. Here -1 and -2 refer to the first and second file, and each option requires a numeric argument giving the join field for the corresponding file. The example below makes this clear:
//displaying contents of first file//
$cat file1.txt
AAYUSH 1
APAAR 2
HEMANT 3
KARTIK 4
//displaying contents of second file//
$cat file2.txt
101 1
102 2
103 3
104 4
//now using join command //
$join -1 2 -2 2 file1.txt file2.txt
1 AAYUSH 101
2 APAAR 102
3 HEMANT 103
4 KARTIK 104
//here -1 2 selects the 2nd column of the
first file as the common field and -2 2
selects the 2nd column of the second
file as the common field for joining//
So, this is how columns other than the first can be used as the common field for joining. If the common field is in the same position in both files (other than the first), the '-1[field] -2[field]' part of the command can simply be replaced with '-j[field]'. So, in the above case the command could be:
//using -j option with join//
$join -j2 file1.txt file2.txt
1 AAYUSH 101
2 APAAR 102
3 HEMANT 103
4 KARTIK 104
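The -1 and -2 options can also name different columns when the key sits in a different position in each file. A minimal sketch, assuming hypothetical sample files where the key is the second column of the first file and the first column of the second file:

```shell
# Hypothetical sample data: key is column 2 in file1.txt and column 1 in file2.txt.
dir=$(mktemp -d)
printf 'AAYUSH 1\nAPAAR 2\nHEMANT 3\n' > "$dir/file1.txt"
printf '1 101\n2 102\n3 103\n' > "$dir/file2.txt"

# -1 2: join on field 2 of the first file; -2 1: join on field 1 of the second file.
mixed=$(LC_ALL=C join -1 2 -2 1 "$dir/file1.txt" "$dir/file2.txt")
printf '%s\n' "$mixed"
```

Each file has to be sorted on its own join field, and the join field is printed first in every output line.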
4. Using -i option:
Another thing to note about the join command is that it is case sensitive by default. For example, consider the following files:
//displaying contents of file1.txt//
$cat file1.txt
A AAYUSH
B APAAR
C HEMANT
D KARTIK
//displaying contents of file2.txt//
$cat file2.txt
a 101
b 102
c 103
d 104
Now, if you try joining these two files on the default (first) field, join produces no output, because the keys differ in case between the two files. To make join ignore case when comparing keys, use the -i command-line option.
//using -i option with join//
$join -i file1.txt file2.txt
A AAYUSH 101
B APAAR 102
C HEMANT 103
D KARTIK 104
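The difference between the default comparison and -i can be checked side by side. This is a sketch under stated assumptions: the sample files are hypothetical and the C locale is forced so the comparison is byte-based.

```shell
# Hypothetical sample data: the keys differ only in case.
dir=$(mktemp -d)
printf 'A AAYUSH\nB APAAR\n' > "$dir/file1.txt"
printf 'a 101\nb 102\n' > "$dir/file2.txt"

# Default comparison is case sensitive, so no keys match.
default_out=$(LC_ALL=C join "$dir/file1.txt" "$dir/file2.txt")
printf 'default: [%s]\n' "$default_out"

# -i ignores case when comparing the join fields.
ci_out=$(LC_ALL=C join -i "$dir/file1.txt" "$dir/file2.txt")
printf '%s\n' "$ci_out"
```

When -i is used on larger files, the inputs should be sorted ignoring case as well, for example with sort -f.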
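5. Using --nocheck-order option:
By default, the join command checks whether the supplied input is sorted and reports unsorted input when it also finds unpairable lines. To suppress this error/warning, use the '--nocheck-order' option. Note that this only silences the diagnostic. The result of joining unsorted input may still be incomplete.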
//syntax of join with --nocheck-order option//
$join --nocheck-order file1 file2
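The opposite option, --check-order, can be used to see how strict checking behaves. The sketch below assumes GNU join (both long options are GNU extensions) and uses hypothetical sample files, one of which is deliberately unsorted:

```shell
# Hypothetical sample data: unsorted.txt is deliberately out of order.
dir=$(mktemp -d)
printf '2 APAAR\n1 AAYUSH\n3 HEMANT\n' > "$dir/unsorted.txt"
printf '1 101\n2 102\n3 103\n' > "$dir/ids.txt"

# --check-order (GNU join) treats unsorted input as a fatal error.
if LC_ALL=C join --check-order "$dir/unsorted.txt" "$dir/ids.txt" > /dev/null 2>&1; then
    status=0
else
    status=$?
fi
echo "exit status with --check-order: $status"

# --nocheck-order skips the check entirely; lines may be missing from the result.
loose=$(LC_ALL=C join --nocheck-order "$dir/unsorted.txt" "$dir/ids.txt")
printf '%s\n' "$loose"
```

In most cases sorting the inputs is the safer fix, with --nocheck-order reserved for input that is known to be ordered in a way join cannot verify.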
6. Using -t option:
Often, files use a delimiter to separate the columns. Let us update the files to use a comma as the delimiter.
//displaying contents of file1.txt//
$cat file1.txt
1, AAYUSH
2, APAAR
3, HEMANT
4, KARTIK
5, DEEPAK
//displaying contents of file2.txt//
$cat file2.txt
1, 101
2, 102
3, 103
4, 104
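Now, the '-t' option is the one we use to specify the delimiter in such cases. Since the comma is the delimiter, we specify it along with '-t'. Note that the space after each comma is treated as part of the following field, which is why it also appears in the output.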
//using join with -t option//
$join -t, file1.txt file2.txt
1, AAYUSH, 101
2, APAAR, 102
3, HEMANT, 103
4, KARTIK, 104
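For typical CSV data without a space after the comma, the same option yields compact output. A minimal sketch, assuming hypothetical files names.csv and ids.csv created only for illustration:

```shell
# Hypothetical CSV inputs with no space after the comma.
dir=$(mktemp -d)
printf '1,AAYUSH\n2,APAAR\n3,HEMANT\n' > "$dir/names.csv"
printf '1,101\n2,102\n3,103\n' > "$dir/ids.csv"

# -t ',' uses the comma as both the input and the output field separator.
csv=$(LC_ALL=C join -t ',' "$dir/names.csv" "$dir/ids.csv")
printf '%s\n' "$csv"
```

Quoting the delimiter is not required for a comma, but it is a good habit for characters the shell treats specially, such as ';' or '|'.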
Conclusion
The 'join' command is a versatile tool for merging data files based on common keys, providing options to handle different delimiters, case sensitivity, and unmatched lines. Mastering the 'join' command can simplify your data processing tasks in UNIX, whether you are working with simple text files or complex datasets. With the right options and properly sorted input, you can effectively use 'join' to combine files in a way that makes your data more meaningful and useful.