How to convert a tab-separated file into a DataFrame using Python
Last Updated: 11 Jul, 2024
In this article, we will learn how to convert a TSV file into a data frame using Python and the Pandas library.
A TSV (Tab-Separated Values) file is a plain text file where data is organized in rows and columns, with each column separated by a tab character.
- It is a type of delimiter-separated file, similar to CSV (Comma-Separated Values).
- Tab-separated files are commonly used in data manipulation and analysis, and being able to convert them into a data frame can greatly enhance our ability to work with structured data efficiently.
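To try the methods in this article, a small sample file helps. The sketch below writes six rows, copied from the outputs shown in this article, into a file named 'file.tsv'. The exact contents of the original file are not given, so treat these rows as an assumption.

```python
# Rows copied from the outputs shown in this article; the real file may contain more.
rows = [
    (0, 50, 5, 881250949),
    (0, 172, 5, 881250949),
    (0, 133, 1, 881250949),
    (196, 242, 3, 881250949),
    (186, 302, 3, 891717742),
    (22, 377, 1, 878887116),
]

# Join the fields of each row with a tab, and the rows with a newline.
tsv_text = "\n".join("\t".join(str(value) for value in row) for row in rows) + "\n"

# Write the text to disk so the reading examples have a file to open.
with open("file.tsv", "w") as f:
    f.write(tsv_text)
```

Each line of the resulting file holds one record, and a single tab character separates neighbouring fields.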
Methods to Convert Tab-Separated Files into a Data Frame
Method 1: Using pandas 'read_csv()' with 'sep' parameter
In this method, we will use the Pandas library to read a tab-separated file into a data frame.
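Look at the following code snippet.
- We import the pandas library and define the path of the tab-separated file.
- We then call 'pd.read_csv()' to read the contents of the file into a DataFrame, passing "sep='\t'" to specify that the fields are separated by tabs.
- The 'read_csv()' function does not detect the delimiter here. Its default separator is a comma, so the 'sep' argument is what makes it parse the file as tab-separated.
- By default, 'read_csv()' treats the first line of the file as the header, which is why the first row appears as the column labels in the output.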
Python
import pandas as pd
# Path to the tab-separated file
file_path = "file.tsv"
# sep='\t' tells pandas that the columns are separated by tabs
df = pd.read_csv(file_path, sep='\t')
df.head()
Output:
0 50 5 881250949
0 0 172 5 881250949
1 0 133 1 881250949
2 196 242 3 881250949
3 186 302 3 891717742
4 22 377 1 878887116
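In the output above, the first data row has been used as the column header. If the file has no header line, passing 'header=None' keeps every line as data. The following is a minimal sketch that reads from an in-memory string so it is self-contained. The sample rows come from the output above, and the column names are hypothetical placeholders.

```python
import io
import pandas as pd

# In-memory stand-in for file.tsv (assumed to have no header line).
tsv_text = (
    "0\t50\t5\t881250949\n"
    "0\t172\t5\t881250949\n"
    "0\t133\t1\t881250949\n"
)

# header=None keeps the first line as data; names= supplies the column labels.
# These column names are hypothetical and used only for illustration.
df = pd.read_csv(
    io.StringIO(tsv_text),
    sep="\t",
    header=None,
    names=["col_a", "col_b", "col_c", "col_d"],
)
print(df.head())
```

With a real file, the same keyword arguments can be passed alongside the file path instead of the 'io.StringIO' object.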
Method 2: Using pandas 'read_table()' function
In the following code snippet, we again use the pandas library to read the contents of a tab-separated file named 'file.tsv' into a DataFrame named 'df'. This time we call pd.read_table(), which uses a tab as its default separator, so no 'sep' argument is needed.
Python
import pandas as pd
df = pd.read_table('file.tsv')
df.head()
Output:
0 50 5 881250949
0 0 172 5 881250949
1 0 133 1 881250949
2 196 242 3 881250949
3 186 302 3 891717742
4 22 377 1 878887116
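Since 'read_table()' only differs from 'read_csv()' in its default separator, the two calls should give the same result on tab-separated input. The sketch below checks this on a small in-memory sample; the rows are taken from the output above and 'header=None' is added on the assumption that the data has no header line.

```python
import io
import pandas as pd

# Small in-memory sample standing in for file.tsv.
tsv_text = (
    "0\t50\t5\t881250949\n"
    "0\t172\t5\t881250949\n"
    "196\t242\t3\t881250949\n"
)

# read_table defaults to sep='\t'; read_csv needs it spelled out.
df_table = pd.read_table(io.StringIO(tsv_text), header=None)
df_csv = pd.read_csv(io.StringIO(tsv_text), sep="\t", header=None)

# DataFrame.equals compares shape, values, and dtypes.
same = df_table.equals(df_csv)
print(same)
```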
Method 3: Using csv module
In this code example, we begin by importing the csv module, which provides functionality for reading and writing CSV files.
- We use the open() function to open the file specified by file_path in read-only mode ('r'), inside a with statement to ensure the file is closed properly after reading.
- We create a CSV reader object using csv.reader(file, delimiter='\t'), specifying that the values in the file are tab-separated.
- The reader is then passed to pd.DataFrame(), which turns each row of the file into a row of the DataFrame.
Python
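import csv
import pandas as pd
file_path = "file.tsv"
with open(file_path, 'r') as file:
    # The reader yields each line as a list of strings, split on tabs
    reader = csv.reader(file, delimiter='\t')
    # Build the DataFrame while the file is still open, because the reader is lazy
    df = pd.DataFrame(reader)
df.head()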
Output:
0 1 2 3
0 0 50 5 881250949
1 0 172 5 881250949
2 0 133 1 881250949
3 196 242 3 881250949
4 186 302 3 891717742
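One detail the printed output hides is that the csv module returns every field as a string, so the resulting DataFrame holds text rather than numbers. A minimal sketch of converting the columns follows, assuming every field is an integer as in the sample rows; the in-memory string is a stand-in for the file.

```python
import csv
import io
import pandas as pd

# Small in-memory sample standing in for file.tsv.
tsv_text = "0\t50\t5\t881250949\n196\t242\t3\t881250949\n"

reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
df = pd.DataFrame(list(reader))

# At this point every cell is text, e.g. "50" rather than 50.
# astype(int) converts all columns; it assumes every field is a valid integer.
df_numeric = df.astype(int)
print(df_numeric.dtypes)
```

For columns that mix text and numbers, converting column by column is the safer choice, since 'astype(int)' raises an error on a value that is not an integer.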
Method 4: Use 'numpy' to load the data and then convert to a DataFrame
This code uses NumPy's 'genfromtxt()' function to load the tab-separated data from 'file.tsv' into a NumPy array. The 'delimiter' argument sets the tab separator, and 'dtype=None' lets NumPy infer the data type of each column. The array is then converted into a pandas DataFrame for further analysis and manipulation.
Python
import numpy as np
import pandas as pd
data = np.genfromtxt('file.tsv', delimiter='\t', dtype=None, encoding=None)
df = pd.DataFrame(data)
df.head()
Output:
0 1 2 3
0 0 50 5 881250949
1 0 172 5 881250949
2 0 133 1 881250949
3 196 242 3 881250949
4 186 302 3 891717742
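The DataFrame above has integer column labels because a NumPy array carries no column names. As a sketch, labels can be supplied when the DataFrame is built. The example assumes every field is an integer and uses hypothetical column names; the in-memory string is a stand-in for the file.

```python
import io
import numpy as np
import pandas as pd

# Small in-memory sample standing in for file.tsv.
tsv_text = "0\t50\t5\t881250949\n196\t242\t3\t881250949\n"

# dtype=int assumes every field is an integer, which gives a plain 2-D array.
data = np.genfromtxt(io.StringIO(tsv_text), delimiter="\t", dtype=int)

# The column names are hypothetical and used only for illustration.
df = pd.DataFrame(data, columns=["col_a", "col_b", "col_c", "col_d"])
print(df.head())
```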