Python | Categorizing input Data in Lists
Last Updated :
10 Mar, 2022
Lists in Python are linear containers that can store data of various data types. This ability to hold heterogeneous data is what makes lists such a versatile and important data structure in Python.
Once created, lists can be modified further depending on one's needs. Hence, they are 'mutable.'
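As a minimal sketch of what mutability means in practice, the snippet below changes a list in place after it has been created. The name `items` and the replacement value are illustrative only.

```python
# A list can be changed in place after creation.
items = ['GeeksForGeeks', 'VenD', 5]

items.append(9.2)      # add an element at the end
items[1] = 'Python'    # overwrite an existing element
items.remove(5)        # delete the first occurrence of a value

print(items)
```

The same list object is modified by each statement; no new list is created.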
When a list is defined with literal values in the code, printing it produces output similar to this:
Code :
Python3
List =['GeeksForGeeks', 'VenD', 5, 9.2]
print('\n List: ', List)
Output:
List: ['GeeksForGeeks', 'VenD', 5, 9.2]
In the above example, the defined list is a combination of string, integer, and float values. The interpreter implicitly interprets 'GeeksForGeeks' and 'VenD' as strings, whereas 5 and 9.2 are interpreted as an integer and a float respectively. We can perform the usual arithmetic operations on the integer and float values as follows.
Code :
Python3
# Usual Arithmetic Operations on 5 and 9.2:
List =['GeeksForGeeks', 'VenD', 5, 9.2]
print('\n List[2]+2, Answer: ', end ='')
print(List[List.index(5)]+2)
print('\n\n List[3]+8.2, Answer: ', end ='')
print(List[List.index(9.2)]+8.2)
Output:
List[2]+2, Answer: 7
List[3]+8.2, Answer: 17.4
Also, the string-specific operations like string concatenation can be performed on the respective strings:
Code :
Python3
# String Concatenation Operation
# List: ['GeeksForGeeks', 'VenD', 5, 9.2]
# Concatenating List[0] and List[1]
List = ['GeeksForGeeks', 'VenD', 5, 9.2]
print(List[0]+' '+List[1])
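The types that the interpreter assigned to the elements in the examples so far can be confirmed with the built-in type() function. The following is a small sketch using the same values; the names `sample` and `type_names` are illustrative.

```python
sample = ['GeeksForGeeks', 'VenD', 5, 9.2]

# Collect the name of each element's type
type_names = [type(element).__name__ for element in sample]
print(type_names)  # ['str', 'str', 'int', 'float']
```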
However, although lists can contain items of various data types (strings, integers, floats, tuples, dictionaries, or even other lists), this does not hold when the list is generated from user input. For instance, consider the example below:
Code :
Python3
# All Resultant Elements of List2 will be of string type
list2 =[] # This is the list which will contain elements as an input from the user
element_count = int(input('\n Enter Number of Elements you wish to enter: '))
for i in range(element_count):
    element = input(f'\n Enter Element {i + 1} : ')
    list2.append(element)
print("\n List 2 : ", list2)
Output:
Enter Number of Elements you wish to enter: 4
Enter Element 1 : GeeksForGeeks
Enter Element 2 : VenD
Enter Element 3 : 5
Enter Element 4 : 9.2
List 2 : ['GeeksForGeeks', 'VenD', '5', '9.2']
You may notice that list2, generated from user input, now contains values of the string data type only, because input() always returns a string. The numerical elements can therefore no longer take part in arithmetic operations. This behaviour contradicts the versatile behaviour of lists.
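The effect can be reproduced without interactive input. The sketch below hard-codes the list that the input loop would have produced (an assumption made so the snippet runs on its own) and shows what happens when arithmetic is attempted on its elements.

```python
# Stand-in for the list built from input(): every element is a str.
list2 = ['GeeksForGeeks', 'VenD', '5', '9.2']

try:
    result = list2[2] + 2      # str + int is not defined
except TypeError as error:
    result = None
    print('TypeError:', error)

# An explicit conversion restores arithmetic
converted_sum = int(list2[2]) + 2
print(converted_sum)  # 7

# '+' between two strings concatenates instead of adding
joined = list2[2] + list2[3]
print(joined)  # 59.2
```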
It becomes necessary for us as programmers to process the user data and store it in the appropriate format so that operations and manipulations on the target data set become efficient.
In this approach, we shall sort the data obtained from the user into three categories, namely integer, string, and float. For this, we use a small piece of code to carry out the respective typecasting operations.
A technique to Overcome the proposed Limitation:
Code :
Python3
import re

def checkInt(string):
    # One or more digits and nothing else
    string_to_integer = re.compile(r'\d+')
    # fullmatch succeeds only if the whole string fits the pattern
    if string_to_integer.fullmatch(string):
        return 1
    return 0

def checkFloat(string):
    # The dot is escaped so it matches a literal '.', not any character
    string_to_float = re.compile(r'\d*\.\d*')
    # A lone '.' fits the pattern but float('.') raises ValueError, so reject it
    if string_to_float.fullmatch(string) and string != '.':
        return 1
    return 0

List2 = []
element_count = int(input('\n Enter number of elements : '))
for i in range(element_count):
    input_element = input(f'\n Enter Element {i + 1}: ')
    if checkInt(input_element):
        input_element = int(input_element)
    elif checkFloat(input_element):
        input_element = float(input_element)
    # Anything that is neither an integer nor a float stays a string
    List2.append(input_element)
print(List2)
Output:
Enter number of elements : 4
Enter Element 1: GeeksForGeeks
Enter Element 2: VenD
Enter Element 3: 5
Enter Element 4: 9.2
['GeeksForGeeks', 'VenD', 5, 9.2]
The above technique uses the regular expression library (re) to analyze the pattern of each element being inserted and infer its data type. Once the pattern has been identified, the element is converted to the matching type before it is appended to the list.
For example, consider the following cases:
- Case1: All characters in the string are numeric. If the user input is 7834, the regular expression identifies that every character is a digit from 0 to 9, so the string '7834' is typecast to the equivalent integer value and appended to the list as an integer.
Expression used for integer identification: r'\d+'

- Case2: The string represents a floating point number. A floating point value is identified as a pattern of digits preceding or succeeding a full stop ('.'). Ex: 567., .056, 6.7, etc.
Expression used for float value identification: r'\d*\.\d*' (the full stop is escaped so that it matches a literal '.' rather than any character)

- Case3: The string contains letters, special characters and possibly numerical values as well. In this case, the data element is kept as a string value. No special regular expression is needed, since such input fails both the integer check and the float check.
Ex: '155B, Baker Street ! ', '12B72C_?', 'I am Agent 007', '_GeeksForGeeks_', etc.
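The three cases can be exercised without interactive input. The following is a minimal sketch, not the article's exact code: the helper name `categorize` and the sample values are hypothetical, the float pattern escapes the full stop, and a lone '.' is rejected explicitly because float('.') would fail.

```python
import re

INT_PATTERN = re.compile(r'\d+')          # Case1: digits only
FLOAT_PATTERN = re.compile(r'\d*\.\d*')   # Case2: digits around a literal '.'

def categorize(text):
    """Return text as int or float when it matches a pattern, else unchanged."""
    if INT_PATTERN.fullmatch(text):
        return int(text)
    if FLOAT_PATTERN.fullmatch(text) and text != '.':
        return float(text)
    return text                            # Case3: keep as a string

samples = ['7834', '567.', '.056', '6.7', '12B72C_?', 'I am Agent 007']
results = [categorize(sample) for sample in samples]
print(results)
# [7834, 567.0, 0.056, 6.7, '12B72C_?', 'I am Agent 007']
```

Using fullmatch means the whole string must fit the pattern, so mixed input such as '12B72C_?' falls through to the string case.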
Conclusion: This method is just a small prototype of typecasting values and processing raw data before storing it in a list. It nevertheless overcomes the limitation described above for lists built from user input. Further, with more advanced regular expressions and some improvements to the algorithm, more forms of data such as tuples, dictionaries, etc. can be analysed and stored accordingly.
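One possible way to extend the idea to tuples, dictionaries and nested lists is sketched below. Note that it swaps the regular-expression approach for the standard library function ast.literal_eval, which safely parses Python literals; the helper name `parse_literal` and the sample inputs are hypothetical.

```python
import ast

def parse_literal(text):
    """Parse text as a Python literal; fall back to the original string."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError, TypeError):
        # Not a valid literal (e.g. a bare word), so keep it as a string
        return text

samples = ['5', '9.2', '(1, 2)', "{'a': 1}", '[1, [2, 3]]', 'GeeksForGeeks']
parsed = [parse_literal(sample) for sample in samples]
print(parsed)
# [5, 9.2, (1, 2), {'a': 1}, [1, [2, 3]], 'GeeksForGeeks']
```

This behaves differently from the regex version in some respects: for example, the inputs 'True' and 'None' become the corresponding Python objects rather than strings, which may or may not be desirable.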
Advantages of the Dynamic TypeCasting:
- Since data is stored in the appropriate format, various 'type-specific' operations can be performed on it. Ex: Concatenation in case of strings, Addition, Subtraction, Multiplication in case of numerical values and various other operations on the respective data types.
- The typecasting phase takes place while storing the data. Hence, the programmer is less likely to run into type errors while performing operations on the data set later.
Limitations of Dynamic TypeCasting:
- Some data elements which do not necessarily require typecasting undergo the process, resulting in unnecessary computing.
- Multiple condition checks and function calls add processing overhead each time a new element is inserted.
- The flexibility of processing numerous types of data might require several new additions to the existing code as per the developer's needs.
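The first limitation can also affect correctness, not only speed. As a small illustration with hypothetical sample values, identifiers that merely look numeric lose their original textual form once they are converted:

```python
# Values such as codes or IDs where the exact text matters
identifiers = ['007', '00420', '1.50']

# Convert the way a digit-based check would: digits -> int, otherwise float
converted = [int(s) if s.isdigit() else float(s) for s in identifiers]
print(converted)   # [7, 420, 1.5]

# Converting back does not recover the original strings
round_trip = [str(value) for value in converted]
print(round_trip)  # ['7', '420', '1.5']
```

When the leading zeros or trailing digits carry meaning, it may be better to leave such fields as strings.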