Using Generators for substantial memory savings in Python
Last Updated: 10 May, 2020
Managing memory and maintaining state between generated values can be difficult for programmers. Python offers a friendly solution called Generators.
Generators
With generators, functions can access and compute data in pieces. A function can return a result to its caller on request while maintaining its own state.
A generator maintains the function's state by pausing execution after producing a value for the caller; on the next request, it continues from where it left off.
Since a generator accesses and computes values on demand, a large chunk of data does not need to be held in memory all at once, which results in substantial memory savings.
Generator Syntax
yield statement
A function is a generator function when it contains a yield statement. Like a return statement, the yield statement sends a value to the caller, but it does not end the function's execution.
Instead, it pauses execution until the next request is received. Upon request, the generator continues executing from where it left off.
Python3
def primeFunction():
    num = 1
    while True:
        num = num + 1
        for i in range(2, num):
            if num % i == 0:
                # num has a divisor, so it is not prime
                break
        else:
            # the loop found no divisor, so num is prime
            # (this also covers num == 2, where the loop body never runs)
            # yields the value to the caller
            # and halts the execution
            yield num

def main():
    # returns the generator object.
    prime = primeFunction()
    # generator executes upon request
    for i in prime:
        print(i)
        if i > 50:
            break

if __name__ == "__main__":
    main()
Output
2
3
5
7
11
13
17
19
23
29
31
37
41
43
47
53
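To make the pause-and-resume behaviour described above visible, here is a minimal sketch. The function `traced_generator` and its `log` list are hypothetical names introduced only for this illustration; they record which parts of the generator body have run at each point.

```python
def traced_generator(log):
    # `log` is a hypothetical list used only to record when each line runs
    log.append("started")
    yield 1
    log.append("resumed after first yield")
    yield 2
    log.append("resumed after second yield")


log = []
gen = traced_generator(log)

# Calling the generator function runs none of its body yet.
log_after_call = list(log)

first = next(gen)   # runs the body up to the first yield
second = next(gen)  # resumes after the first yield, runs up to the second

print(log_after_call)
print(first, second)
print(log)
```

The last log entry is never recorded here, because the generator is still paused at the second yield when the script ends.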
Communication With Generator
next, StopIteration and send
How do the caller and the generator communicate with each other? Here we discuss three mechanisms in Python: the built-in next function, the StopIteration exception and the generator's send method.
next
The next function requests the next value from a generator. Upon request, the generator code executes and the yield statement provides the value to the caller. At this point, the generator pauses execution and waits for the next request. Let's dig deeper by considering a Fibonacci function.
Python3
def fibonacci():
    values = []
    while True:
        if len(values) < 2:
            values.append(1)
        else:
            # sum up the values and
            # append the result
            values.append(sum(values))
            # pop the first value in
            # the list
            values.pop(0)
        # yield the latest value to
        # the caller
        yield values[-1]

def main():
    fib = fibonacci()
    print(next(fib))  # 1
    print(next(fib))  # 1
    print(next(fib))  # 2
    print(next(fib))  # 3
    print(next(fib))  # 5

if __name__ == "__main__":
    main()
Output
1
1
2
3
5
StopIteration
StopIteration is a built-in exception that signals the end of a generator. When the generator's iteration is complete (its function body finishes), it signals the caller by raising the StopIteration exception and exits.
The code below demonstrates this scenario.
Python3
def stopIteration():
    num = 5
    for i in range(1, num):
        yield i

def main():
    f = stopIteration()
    # 1 is generated
    print(next(f))
    # 2 is generated
    print(next(f))
    # 3 is generated
    print(next(f))
    # 4 is generated
    print(next(f))
    # 5th element - raises
    # StopIteration Exception
    next(f)

if __name__ == "__main__":
    main()
Output
1
2
3
4
Traceback (most recent call last):
  File "C:\Users\Sonu George\Documents\GeeksforGeeks\Python Pro\Generators\stopIteration.py", line 19, in <module>
    main()
  File "C:\Users\Sonu George\Documents\GeeksforGeeks\Python Pro\Generators\stopIteration.py", line 15, in main
    next(f) # 5th element - raises StopIteration Exception
StopIteration
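In practice the caller rarely lets this exception propagate. The sketch below shows three common ways to deal with an exhausted generator: catching StopIteration, passing a default value to next, and using a for loop, which handles the exception automatically. The helper `count_up_to` is a hypothetical generator written only for this example.

```python
def count_up_to(limit):
    # hypothetical helper: yields 1, 2, ..., limit
    for i in range(1, limit + 1):
        yield i


gen = count_up_to(2)
collected = []
try:
    while True:
        collected.append(next(gen))
except StopIteration:
    # the generator is exhausted; there is nothing more to read
    pass

# next() accepts a default that is returned instead of raising
exhausted_default = next(gen, None)

# a for loop catches StopIteration internally and simply ends
looped = []
for value in count_up_to(2):
    looped.append(value)

print(collected, exhausted_default, looped)
```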
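The code below shows another scenario, where the programmer raises StopIteration inside the generator in an attempt to exit early. Since Python 3.7 (PEP 479), a StopIteration raised inside a generator body is converted into a RuntimeError, as the output shows; a plain return statement is the supported way to end a generator early.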
raise StopIteration
Python3
def stopIteration():
    num = 5
    for i in range(1, num):
        if i == 3:
            raise StopIteration
        yield i

def main():
    f = stopIteration()
    # 1 is generated
    print(next(f))
    # 2 is generated
    print(next(f))
    # StopIteration raises and
    # code exits
    print(next(f))
    print(next(f))

if __name__ == "__main__":
    main()
Output
1
2
Traceback (most recent call last):
  File "C:\Users\Sonu George\Documents\GeeksforGeeks\Python Pro\Generators\stopIteration.py", line 5, in stopIteration
    raise StopIteration
StopIteration
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "C:\Users\Sonu George\Documents\GeeksforGeeks\Python Pro\Generators\stopIteration.py", line 19, in <module>
    main()
  File "C:\Users\Sonu George\Documents\GeeksforGeeks\Python Pro\Generators\stopIteration.py", line 13, in main
    print(next(f)) # StopIteration raises and code exits
RuntimeError: generator raised StopIteration
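The RuntimeError above appears because Python 3.7 and later convert a StopIteration raised inside a generator body into a RuntimeError (PEP 479). A minimal sketch of the usual alternative is to end the generator with a plain return; the function name `stop_early` is chosen only for this illustration.

```python
def stop_early():
    num = 5
    for i in range(1, num):
        if i == 3:
            # `return` ends the generator cleanly; the caller
            # then sees an ordinary StopIteration
            return
        yield i


values = list(stop_early())

gen = stop_early()
next(gen)  # 1
next(gen)  # 2
stopped = False
try:
    next(gen)
except StopIteration:
    stopped = True

print(values, stopped)
```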
send
So far, we have seen how a generator yields values to the invoking code, where the communication is unidirectional: the generator has not yet received any data from the caller.
In this section, we discuss the `send` method, which allows the caller to pass data to the generator.
Python3
def factorial():
    num = 1
    while True:
        factorial = 1
        for i in range(1, num + 1):
            # determines the factorial
            factorial = factorial * i
        # produce the factorial to the caller
        response = yield factorial
        # if the response has value
        if response:
            # assigns the response to
            # num variable
            num = int(response)
        else:
            # num variable is incremented
            # by 1
            num = num + 1

def main():
    fact = factorial()
    print(next(fact))
    print(next(fact))
    print(next(fact))
    print(fact.send(5))  # send
    print(next(fact))

if __name__ == "__main__":
    main()
Output
1
2
6
120
720
The generator yields the first three values (1, 2 and 6) in response to the caller's requests through the next function. The fourth value (120) is produced from the data (5) that the caller provides through the send method.
Consider the third value (6) yielded by the generator. The factorial of 3 is 3*2*1 = 6, computed by the statement factorial = factorial * i; the generator yields it and execution pauses.
At this point, the caller uses the `send` method and provides the value `5`. The generator resumes from where it left off, that is, it stores the value sent by the caller in the `response` variable (response = yield factorial).
Since `response` contains a value, the code enters the `if` branch and assigns the response to the `num` variable.
Python3
if response:
    num = int(response)
Control then returns to the top of the `while` loop, where the factorial is computed and yielded to the caller. Again, the generator pauses execution until the next request.
If we look at the output, we can see that the sequence was interrupted after the caller used the `send` method. More precisely, the first 3 requests produce the following output:
Factorial of 1 = 1
Factorial of 2 = 2
Factorial of 3 = 6
But when the caller sends the value 5, the output becomes 120 and `num` keeps the value 5. On the next request (using `next`), we might expect `num` to be incremented from the last `next` request (i.e. 3+1 = 4) rather than from the value passed to `send`. In this case, however, `num` is incremented to 6 (based on the last value received through `send`), and the output is 720.
The code below shows a different approach to handling values sent by the caller: the sent value is used for a single computation and does not disturb the running counter.
Python3
def factorial():
    num = 0
    value = None
    response = None
    while True:
        factorial = 1
        if response:
            value = int(response)
        else:
            num = num + 1
            value = num
        for i in range(1, value + 1):
            factorial = factorial * i
        response = yield factorial

def main():
    fact = factorial()
    print(next(fact))
    print(next(fact))
    print(next(fact))
    print(fact.send(5))  # send
    print(next(fact))

if __name__ == "__main__":
    main()
Output
1
2
6
120
24
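Both factorial examples call next before send. That order matters: a generator has to be advanced to its first yield before it can receive a value. The sketch below illustrates this with a hypothetical `echo` generator that simply yields back whatever it was last sent.

```python
def echo():
    # hypothetical generator: yields back the last value it received
    received = None
    while True:
        received = yield received


gen = echo()

primed_error = None
try:
    # the generator has not reached a yield yet, so there is
    # nowhere for the value to go; Python raises TypeError
    gen.send("hello")
except TypeError as exc:
    primed_error = exc

first = next(gen)          # advance to the first yield (yields None)
reply = gen.send("hello")  # now the value is delivered and echoed back

print(type(primed_error).__name__, first, reply)
```

Calling `gen.send(None)` is equivalent to `next(gen)` and can also be used for this first step.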
Standard Library - Generators
Standard Library
- range
- dict.items
- zip
- map
- File Objects
range
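The range function returns an iterable range object whose iterator produces values lazily, like a generator (strictly speaking, it is an iterator rather than a generator object). It returns sequential values starting from the lower limit and stopping just before the upper limit, which is excluded.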
Python3
def range_func():
    r = range(0, 4)
    return r

def main():
    r = range_func()
    iterator = iter(r)
    print(next(iterator))
    print(next(iterator))

if __name__ == "__main__":
    main()
Output
0
1
dict.items
The dictionary class in Python provides three methods for iterating over a dictionary: keys, values and items. Each returns a view object whose iterator produces entries on demand, like a generator.
Python3
def dict_func():
    dictionary = {'UserName': 'abc', 'Password': 'a@123'}
    return dictionary

def main():
    d = dict_func()
    iterator = iter(d.items())
    print(next(iterator))
    print(next(iterator))

if __name__ == "__main__":
    main()
Output
('UserName', 'abc')
('Password', 'a@123')
zip
zip is a built-in Python function that takes multiple iterable objects and iterates over all of them in parallel. It yields a tuple of the first elements from each iterable, then the second elements, and so on.
Python3
def zip_func():
    z = zip(['a', 'b', 'c', 'd'], [1, 2, 3, 4])
    return z

def main():
    z = zip_func()
    print(next(z))
    print(next(z))
    print(next(z))

if __name__ == "__main__":
    main()
Output
('a', 1)
('b', 2)
('c', 3)
map
The map function takes a function and one or more iterables as parameters and lazily applies the function to the corresponding items of the iterables.
Python3
def map_func():
    m = map(lambda x, y: max([x, y]), [8, 2, 9], [5, 3, 7])
    return m

def main():
    m = map_func()
    print(next(m))  # 8 (maximum value among 8 and 5)
    print(next(m))  # 3 (maximum value among 2 and 3)
    print(next(m))  # 9 (maximum value among 9 and 7)

if __name__ == "__main__":
    main()
Output
8
3
9
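Strictly speaking, the objects shown so far in this section are lazy iterators that follow the same on-demand pattern as generators, rather than generator objects created by a function containing yield. The sketch below checks this with `inspect.isgenerator` and `collections.abc.Iterator`; the `squares` function is a hypothetical generator included only for comparison.

```python
import inspect
from collections.abc import Iterator


def squares():
    # hypothetical generator function used for comparison
    for n in range(3):
        yield n * n


candidates = {
    "range iterator": iter(range(3)),
    "dict items iterator": iter({"a": 1}.items()),
    "zip": zip("ab", [1, 2]),
    "map": map(str, [1, 2]),
    "generator function": squares(),
}

is_iterator = {}
is_generator = {}
for name, obj in candidates.items():
    is_iterator[name] = isinstance(obj, Iterator)
    is_generator[name] = inspect.isgenerator(obj)
    print(name, is_iterator[name], is_generator[name])
```

All five objects support next and produce values on demand; only the last one is reported as a generator.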
File Object
Even though the file object has a readline method to read the file line by line, it also supports the iterator protocol, so it can be consumed like a generator. One difference is the behaviour at the end of the file: readline returns an empty string, whereas next raises the StopIteration exception.
When using the next function, the file object yields the entire line, including the newline (\n) character.
Python3
def file_func():
    f = open('sample.txt')
    return f

def main():
    # the with block closes the file once we are done reading
    with file_func() as f:
        print(next(f))
        print(next(f))

if __name__ == "__main__":
    main()
Input: sample.txt
Rule 1
Rule 2
Rule 3
Rule 4
Output
Rule 1
Rule 2
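The end-of-file difference between readline and next can be sketched without a file on disk. The example below uses `io.StringIO` as a stand-in for a real file object (an assumption made only to keep the example self-contained); it supports both readline and next in the same way as a text file.

```python
import io

# in-memory stand-in for a small text file
stream = io.StringIO("Rule 1\nRule 2\n")

first = next(stream)         # 'Rule 1\n', newline included
second = stream.readline()   # 'Rule 2\n'
at_eof = stream.readline()   # '' signals end of file for readline

raised = False
try:
    next(stream)             # next signals end of file by raising
except StopIteration:
    raised = True

print(repr(first), repr(second), repr(at_eof), raised)
```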
Generator Use Cases
The fundamental concept of a generator is determining values on demand. Below we discuss two use cases that derive from this concept.
- Accessing Data in Pieces
- Computing Data in Pieces
Accessing Data in Pieces
Why do we need to access data in pieces? The question arises when the programmer has to deal with a large amount of data, for example when reading a large file. In this case, making a full copy of the data in memory and then processing it is not a feasible solution.
By using generators, programmers can access the data one item at a time. For file operations, the user can access the data line by line, and for a dictionary, one two-tuple at a time.
Hence, a generator is an essential tool for dealing with large chunks of data: it avoids unnecessary storage and results in substantial memory savings.
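As a rough illustration of the memory difference, the sketch below compares a fully built list with an equivalent generator expression using `sys.getsizeof`. Note that `sys.getsizeof` reports only the size of the container object itself, not of the integers it refers to, so the real gap is larger than the numbers printed; the element count of 100,000 is an arbitrary choice for this example.

```python
import sys

COUNT = 100_000  # arbitrary size chosen for the illustration

# the list holds references to all 100,000 results at once
numbers_list = [n * n for n in range(COUNT)]

# the generator expression holds only its current state
numbers_gen = (n * n for n in range(COUNT))

list_size = sys.getsizeof(numbers_list)
gen_size = sys.getsizeof(numbers_gen)
print(list_size, gen_size)

# both produce the same values when consumed
total_from_gen = sum(numbers_gen)
total_from_list = sum(numbers_list)
print(total_from_gen == total_from_list)
```

The exact byte counts depend on the Python version and platform, but the generator's size stays constant no matter how large `COUNT` is.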
Computing Data in Pieces
Another reason to write a generator is its ability to compute data on request. The Fibonacci function above shows that the generator produces each value on demand. This avoids computing and storing values that are never needed, which can improve performance and also results in substantial memory savings.
Another point to note is the generator's ability to represent an infinite sequence of values.
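An infinite generator is only useful if the caller takes a finite slice of it. One way to do that, sketched below, is `itertools.islice`, which stops requesting values after a given count. The `fibonacci` function here is a compact variant written for this example, not the list-based version shown earlier.

```python
from itertools import islice


def fibonacci():
    # compact infinite Fibonacci generator (variant for this example)
    a, b = 1, 1
    while True:
        yield a
        a, b = b, a + b


# take only the first ten values; the rest are never computed
first_ten = list(islice(fibonacci(), 10))
print(first_ten)
```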
Generator Delegation
yield from
A generator can invoke another generator, just as a function can call another function. It does this with the 'yield from' statement, and the process is called Generator Delegation.
Since the generator delegates to another generator, values sent to the wrapping generator are passed through to the current delegate generator.
Python3
def gensub1():
    yield 'A'
    yield 'B'

def gensub2():
    yield '100'
    yield '200'

def main_gen():
    yield from gensub1()
    yield from gensub2()

def main():
    delg = main_gen()
    print(next(delg))
    print(next(delg))
    print(next(delg))
    print(next(delg))

if __name__ == "__main__":
    main()
Output
A
B
100
200
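The example above only pulls values out of the delegates. To illustrate the other direction, where values sent to the wrapping generator reach the delegate, here is a minimal sketch. The names `averager` and `wrapper` are hypothetical and exist only for this illustration.

```python
def averager():
    # hypothetical delegate: keeps a running average of sent values
    total = 0
    count = 0
    average = None
    while True:
        value = yield average
        total += value
        count += 1
        average = total / count


def wrapper():
    # yield from forwards both next() and send() to averager()
    yield from averager()


w = wrapper()
next(w)                # advance the delegate to its first yield
first = w.send(10)     # 10.0
second = w.send(20)    # 15.0
print(first, second)
```

The wrapper contains no code for receiving values; `yield from` passes each sent value straight through to the delegate.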
Summary
A generator is an essential tool for programmers who deal with large amounts of data. Its ability to compute and access data on demand can improve performance and save memory. Also consider using generators when there is a need to represent an infinite sequence.