Streamlining Testing in a Large
Python Codebase
Jimmy Lai, Staff Software Engineer, Zip
July 12, 2024
Outline
01 Python Testing: pytest, coverage, and continuous integration
02 The Slow Test Challenges
03 Optimization Strategies
04 Results
05 Recap
Zip is the world’s leading
Intake & Procurement
Orchestration Platform
450+ global
customers
$4.4 billion
total customer savings
Top talent from
tech disruptors
$181 million
raised at $1.5 billion valuation
A Large Python Codebase
1. 100 developers. We’re hiring fast.
2. 2.5 million lines of Python code. Doubling every year.
Scaling Challenges
1. 100 developers. We’re hiring.
2. 2.5 million lines of Python code. Doubling every year.
3. The number of tests and the tech debt increase fast.
Why Tests?
1. Quality Assurance
2. Refactoring Confidence
3. Documentation
Useful Test Metrics
01 Test Execution Time
02 Test Reliability
03 Test Coverage
Simple Testing using pytest
https://pypi.org/project/pytest/

# in helper.py
def is_even(number: int) -> bool:
    if number % 2 == 0:
        return True
    else:
        return False

# in test_helper.py
from helper import is_even

def test_is_even_with_even_number():
    assert is_even(4) == True

def test_is_even_with_zero():
    assert is_even(0) == True

> pytest . -vv
======= test session starts =======
collected 2 items
test_helper.py::test_is_even_with_even_number PASSED
test_helper.py::test_is_even_with_zero PASSED
======= 2 passed in 0.03s =======

Metrics shown: Test Execution Time, Test Reliability
Measure Test Coverage
https://pypi.org/project/pytest-cov/

> pytest --cov . -vv
======= test session starts =======
collected 2 items
test_helper.py::test_is_even_with_even_number PASSED
test_helper.py::test_is_even_with_zero PASSED
------------- coverage -------------
Name              Stmts   Miss  Cover
-------------------------------------
helper.py             5      1    80%
test_helper.py        6      0   100%
-------------------------------------
TOTAL                11      1    91%
======= 2 passed in 0.03s =======

Metric shown: Test Coverage
To increase the test coverage, add a new test case for odd numbers.
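The two existing tests only pass even inputs, so the branch of is_even that returns False is never executed. A minimal sketch of the missing test case follows; the test name test_is_even_with_odd_number is a hypothetical choice, and is_even is repeated from the slide only so the snippet runs on its own.

```python
# helper.py (repeated from the slide so the example is self-contained)
def is_even(number: int) -> bool:
    if number % 2 == 0:
        return True
    else:
        return False


# test_helper.py: hypothetical extra test that exercises the odd-number branch
def test_is_even_with_odd_number():
    assert is_even(3) is False
```

With this test in place, every branch of is_even is executed at least once by the test suite.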
Continuous Integration
Practice: continuously merge changes into the shared codebase while ensuring quality
● Developers submit a pull request (PR) for code review
● Run tests to verify the code changes
● Merge the PR after all tests pass and the PR is approved
Ensure that test reliability and test coverage meet the required thresholds
Continuous Integration using GitHub Workflows
https://docs.github.com/en/actions/using-workflows

# File: .github/workflows/ci.yml
name: CI
on:
  pull_request: # on updating a pull request
    branches:
      - main
  push: # on merging to the main branch
    branches:
      - main
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.13'
      - run: pip install -r requirements.txt
      - run: pytest
Challenge: Test Execution Time Increases Over Time
1. Number of tests increases. Pain Point: Long Test Execution Time
2. Codebase size increases. Pain Point: Test Coverage Overhead
3. Number of dependencies increases (requirements.txt). Pain Point: Slow Test Startup
🎯Strategy #1: Parallel Execution
Run Tests in Parallel on Multiple CPUs
https://pypi.org/project/pytest-xdist/

pytest -n 8 # use 8 worker processes
pytest -n auto # use all available CPU cores

N: number of CPUs (e.g. 8 cores)
Test Execution Time ÷ N
10,000 tests ÷ N is still slow
Run Tests in Parallel on Multiple Runners
https://pypi.org/project/pytest-split/

# Split tests into 10 parts and run the 1st part
pytest --splits 10 --group 1

# Assumption: all tests have the same execution time.
# Unbalanced test execution times can lead to
# unbalanced runner durations.

# To collect test execution times
pytest --store-durations
# To use the collected times
pytest --splits 10 --group 1 --durations-path .test_durations

N: number of CPUs
Test Execution Time ÷ N
M: number of runners
10,000 tests ÷ N ÷ M
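To see why stored durations matter, here is a minimal sketch of duration-aware splitting. It is a generic greedy "assign to the least-loaded group" heuristic written for illustration, not pytest-split's actual implementation; the function name split_by_duration and the sample durations are hypothetical.

```python
import heapq


def split_by_duration(durations: dict[str, float], splits: int) -> list[list[str]]:
    """Greedily assign each test to the currently least-loaded group."""
    # Heap of (total_seconds, group_index); the lightest group is popped first.
    heap = [(0.0, index) for index in range(splits)]
    heapq.heapify(heap)
    groups: list[list[str]] = [[] for _ in range(splits)]
    # Placing the slowest tests first keeps the final group totals closer together.
    ordered = sorted(durations.items(), key=lambda item: item[1], reverse=True)
    for test, seconds in ordered:
        total, index = heapq.heappop(heap)
        groups[index].append(test)
        heapq.heappush(heap, (total + seconds, index))
    return groups


# Hypothetical durations in seconds, for illustration only.
durations = {"test_a": 8.0, "test_b": 1.0, "test_c": 1.0, "test_d": 2.0, "test_e": 4.0}
groups = split_by_duration(durations, splits=2)
totals = [sum(durations[name] for name in group) for group in groups]
print(groups, totals)
```

Splitting these five tests by count alone could put the 8-second test together with other tests on one runner while the other runner finishes early; the duration-aware split gives both groups 8 seconds of work.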
Use Multi-Runners and Multi-CPUs in a GitHub Workflow
https://docs.github.com/en/actions/using-workflows

python-test-matrix:
  runs-on: ubuntu-latest-8-cores # needs larger runner configuration
  strategy:
    fail-fast: false # to collect all failed tests
    matrix:
      group: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
  steps:
    - run: pytest -n auto --splits 10 --group ${{ matrix.group }} ...

10 runners x 8 CPUs = 80 concurrent test worker processes
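The arithmetic behind N CPUs and M runners can be sketched as a best-case estimate. The one-second average per test is an assumed figure for illustration, and ideal_wall_time is a hypothetical helper; real runs are slower than this bound because of startup overhead and uneven test durations.

```python
def ideal_wall_time(total_test_seconds: float, cpus_per_runner: int, runners: int) -> float:
    """Best-case wall-clock time when tests are spread perfectly over every worker."""
    workers = cpus_per_runner * runners
    return total_test_seconds / workers


# Hypothetical suite: 10,000 tests at an assumed average of 1 second each.
total = 10_000 * 1.0
serial = ideal_wall_time(total, cpus_per_runner=1, runners=1)
one_runner = ideal_wall_time(total, cpus_per_runner=8, runners=1)
ten_runners = ideal_wall_time(total, cpus_per_runner=8, runners=10)
print(serial, one_runner, ten_runners)  # 10000.0 1250.0 125.0
```

Under these assumptions, going from one 8-core runner to ten of them reduces the ideal time from about 21 minutes to about 2 minutes.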
🎯Strategy #2: Cache
Cache Python Dependency Installation
pip install -r requirements.txt
# resolve dependency versions
# download and install dependencies

# In GitHub Workflow
steps:
  - uses: actions/cache@v3
    id: dependency-cache
    with:
      path: … # required by actions/cache; the cached directory is not shown on the slide
      key: ${{ hashFiles('requirements.txt') }}
  - if: steps.dependency-cache.outputs.cache-hit != 'true'
    run: uv pip install -r requirements.txt --system

Save 5-10 minutes on each CI run in a large codebase
Use uv to install faster
https://pypi.org/project/uv/
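The cache key above is derived from the content of requirements.txt, so the cache is reused until a dependency changes. The sketch below illustrates that idea in plain Python; it is an analogy rather than GitHub's hashFiles implementation, and the helper name dependency_cache_key, the "deps" prefix, and the pinned versions are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def dependency_cache_key(requirements: Path, prefix: str = "deps") -> str:
    """Derive a cache key that changes whenever the requirements file changes."""
    digest = hashlib.sha256(requirements.read_bytes()).hexdigest()
    return f"{prefix}-{digest}"


with tempfile.TemporaryDirectory() as tmp:
    req = Path(tmp) / "requirements.txt"
    # Illustrative pins only; any content works.
    req.write_text("pytest==8.2.0\n")
    key_before = dependency_cache_key(req)
    key_same = dependency_cache_key(req)
    # Adding a dependency changes the file content and therefore the key.
    req.write_text("pytest==8.2.0\npytest-cov==5.0.0\n")
    key_after = dependency_cache_key(req)

print(key_before == key_same, key_before != key_after)
```

An unchanged file yields the same key (a cache hit, so installation is skipped), while any edit yields a new key (a cache miss, so dependencies are installed again).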
Cache Non-Python Dependency Installation
https://docs.github.com/en/actions/using-jobs/running-jobs-in-a-container
Common non-Python dependencies:
● Python and Node interpreters
● Database: Postgres
● System packages: protobuf-compiler, graphviz, etc.
● Browsers for end-to-end tests: Playwright

# Dockerfile
FROM … # a base image
RUN sudo apt-get install -y postgresql-16 protobuf-compiler

# After publishing the image to a registry
# GitHub Workflow
jobs:
  run-in-container:
    runs-on: ubuntu-latest
    container:
      image: …

Save 10 minutes or more on each CI run in a large codebase
🎯Strategy #3: Skip Unnecessary Computing
Skip Unnecessary Tests and Linters
https://github.com/marketplace/actions/changed-files
Run specific tests only when the relevant code has changed

# GitHub workflow
jobs:
  changed-files:
    outputs:
      has-py-changes: ${{ steps.find-py-changes.outputs.any_changed }}
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: tj-actions/changed-files@v44
        id: find-py-changes
        with:
          files: '**/*.py'
  run-pytest:
    needs: changed-files
    if: needs.changed-files.outputs.has-py-changes == 'true'
    runs-on: ubuntu-latest
    steps:
      - run: pytest

💡 Linters can likewise run only on the updated files
✨ Modularize code and use build systems to run even fewer tests
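The decision made by the workflow above can be sketched in a few lines of Python. This is an illustration of the logic only, not how the changed-files action is implemented; the helper any_changed and the sample file lists are hypothetical, and in practice the list might come from a command such as git diff --name-only.

```python
from fnmatch import fnmatchcase


def any_changed(changed_files: list[str], pattern: str = "*.py") -> bool:
    """Return True when at least one changed path matches the pattern."""
    # fnmatchcase's "*" also matches "/" so "*.py" covers files in any directory.
    return any(fnmatchcase(path, pattern) for path in changed_files)


# Hypothetical pull requests, for illustration only.
docs_only_pr = ["README.md", "docs/index.md", "package.json"]
backend_pr = ["docs/index.md", "src/app/models.py"]

print(any_changed(docs_only_pr))  # False: pytest can be skipped
print(any_changed(backend_pr))    # True: pytest should run
```

The same check with a different pattern (for example "*.ts") could gate a frontend linter or test job.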
Skip Coverage Analysis for Unchanged Files
# pytest --cov by default measures coverage for all files,
# which is slow in a large codebase
# Add --cov=UPDATED_PATH1 --cov=UPDATED_PATH2 … to measure
# only the updated files
Save 1 minute or more on each CI run in a large codebase
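One way to build those --cov options is to map each changed Python file to its directory. The sketch below assumes that directory-level granularity is acceptable; the helper coverage_args and the file names are hypothetical, and the talk does not say how the updated paths are computed.

```python
from pathlib import PurePosixPath


def coverage_args(changed_files: list[str]) -> list[str]:
    """Build one --cov=<directory> option per directory containing a changed .py file."""
    directories = {
        str(PurePosixPath(path).parent)
        for path in changed_files
        if path.endswith(".py")
    }
    return [f"--cov={directory}" for directory in sorted(directories)]


# Hypothetical changed files, e.g. taken from `git diff --name-only`.
changed = ["billing/invoice.py", "billing/tax.py", "users/models.py", "README.md"]
command = ["pytest", *coverage_args(changed)]
print(command)  # ['pytest', '--cov=billing', '--cov=users']
```

Note that a changed file at the repository root maps to --cov=. and therefore measures everything again, so a real pipeline may want to handle that case separately.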
🎯Strategy #4: Modernize Runners
Use Faster and Cheaper Runners
Use newer-generation CPUs and memory to run faster and cheaper
Third-party hosted runner providers:
● Namespace
● BuildJet
● Actuated
● …
Use Self-Hosted Runners with Auto-Scaling
https://github.com/actions/actions-runner-controller/
Use Actions Runner Controller to deploy auto-scaling runners on Kubernetes with custom hardware specifications (e.g. AWS EC2)
5X+ cost saving and 2X+ faster test speed compared to GitHub runners
Results
Continuously optimizing CI test execution time to improve developer experiences
Increasing test coverage with better quality assurance
Recap: 🎯 Strategies for Scaling Slow Tests in a Large Codebase
01 Parallel Execution
02 Cache
03 Skip Unnecessary Computing
04 Modernize Runners
Rujul Zaparde
Co-Founder and CEO
Lu Cheng
Co-Founder and CTO
Engineering Blog
https://engineering.ziphq.com
Job Opportunities
https://ziphq.com/careers
Thank You!
EuroPython 2024 - Streamlining Testing in a Large Python Codebase

  • 1. Streamlining Testing in a Large Python Codebase Jimmy Lai, Staff Software Engineer, Zip July 12, 2024
  • 2. Outline: 01 Python Testing: pytest, coverage, and continuous integration; 02 The Slow Test Challenges; 03 Optimization Strategies; 04 Results; 05 Recap
  • 3. Zip is the world’s leading Intake & Procurement Orchestration Platform: 450+ global customers; $4.4 billion total customer savings; top talent from tech disruptors; $181 million raised at a $1.5 billion valuation
  • 5. A Large Python Codebase: (1) 100 developers, and we’re hiring fast; (2) 2.5 million lines of Python code, doubling every year
  • 6. Scaling Challenges: (1) 100 developers, and we’re hiring; (2) 2.5 million lines of Python code, doubling every year; (3) the number of tests and the tech debt increase fast
  • 9. Why Tests? (1) Quality Assurance; (2) Refactoring Confidence; (3) Documentation
  • 10. Useful Test Metrics: 01 Test Execution Time; 02 Test Reliability; 03 Test Coverage
  • 14. Simple Testing using pytest
    https://pypi.org/project/pytest/
    # in helper.py
    def is_even(number: int) -> bool:
        if number % 2 == 0:
            return True
        else:
            return False
    # in test_helper.py
    from helper import is_even
    def test_is_even_with_even_number():
        assert is_even(4) == True
    def test_is_even_with_zero():
        assert is_even(0) == True
    > pytest . -vv
    ======= test session starts =======
    collected 2 items
    test_helper.py::test_is_even_with_even_number PASSED
    test_helper.py::test_is_even_with_zero PASSED
    ======= 2 passed in 0.03s =======
    Metrics covered: Test Execution Time, Test Reliability
  • 16. Measure Test Coverage
    https://pypi.org/project/pytest-cov/
    > pytest --cov . -vv
    ======= test session starts =======
    collected 2 items
    test_helper.py::test_is_even_with_even_number PASSED
    test_helper.py::test_is_even_with_zero PASSED
    ------------- coverage -------------
    Name             Stmts   Miss  Cover
    ------------------------------------
    helper.py            5      1    80%
    test_helper.py       6      0   100%
    ------------------------------------
    TOTAL               11      1    91%
    ======= 2 passed in 0.03s =======
    To increase the test coverage: add a new test case for odd numbers
    Metric covered: Test Coverage
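
The suggestion on the coverage slide can be sketched as follows. The one missed statement in helper.py is the `return False` branch, which only runs for odd input. This is a minimal sketch that reuses `is_even` from the slides; the test name `test_is_even_with_odd_number` is an assumption and does not appear in the talk.

```python
# helper.py, as shown on the slides
def is_even(number: int) -> bool:
    if number % 2 == 0:
        return True
    else:
        return False


# New test for test_helper.py (hypothetical name, not from the talk).
# It exercises the `return False` branch that the two existing tests never reach.
def test_is_even_with_odd_number():
    assert is_even(3) is False
```

With this test added, every statement in helper.py should be executed at least once, so the Miss count for that file would be expected to drop to 0.
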
  • 21. Continuous Integration
    Practice: continuously merge changes into the shared codebase while ensuring quality
    ● Developers submit a pull request (PR) for code review
    ● Run tests to verify the code changes
    ● Merge a PR after all tests have passed and the PR has been approved
    Ensure that test reliability and test coverage meet the required thresholds
  • 23. Continuous Integration using GitHub Workflows
    https://docs.github.com/en/actions/using-workflows
    # File: .github/workflows/ci.yml
    name: CI
    on:
      pull_request:  # on updating a pull request
        branches:
          - main
      push:  # on merging to the main branch
        branches:
          - main
    jobs:
      build:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - uses: actions/setup-python@v5
            with:
              python-version: '3.13'
          - run: pip install -r requirements.txt
          - run: pytest
  • 26. Challenge: Test Execution Time Increases Over Time
    1. Number of tests increases. Pain Point: Long Test Execution Time
    2. Codebase size increases. Pain Point: Test Coverage Overhead
    3. Number of dependencies (requirements.txt) increases. Pain Point: Slow Test Startup
  • 29. Run Tests in Parallel on multiple CPUs
    https://pypi.org/project/pytest-xdist/
    pytest -n 8     # use 8 worker processes
    pytest -n auto  # use all available CPU cores
    N: number of CPUs (e.g. 8 cores)
    Test Execution Time ÷ N
    10,000 tests ÷ N is still slow
  • 33. Run Tests in Parallel on multiple Runners
    https://pypi.org/project/pytest-split/
    # Split tests into 10 parts and run the 1st part
    pytest --splits 10 --group 1
    # Assumption: all tests have the same test execution time.
    # Unbalanced test execution times can lead to unbalanced runner durations.
    # To collect test execution time
    pytest --store-durations
    # To use the collected time
    pytest --splits 10 --group 1 --durations-path .test_durations
    N: number of CPUs; M: number of runners
    Test Execution Time ÷ N
    10,000 tests ÷ N ÷ M
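
Why stored durations help can be illustrated with a small sketch. The function below is a simplified greedy balancer written for illustration, not pytest-split's actual implementation; `split_by_duration` and the sample timings are hypothetical. It assigns each test, slowest first, to the group with the smallest running total.

```python
import heapq


def split_by_duration(durations: dict[str, float], splits: int) -> list[list[str]]:
    """Greedily assign tests to `splits` groups so total durations stay balanced."""
    # Heap entries are (total seconds so far, group index); the lightest group pops first.
    heap = [(0.0, index) for index in range(splits)]
    heapq.heapify(heap)
    groups: list[list[str]] = [[] for _ in range(splits)]
    # Placing the slowest tests first keeps one long test from unbalancing a group late.
    for name, seconds in sorted(durations.items(), key=lambda item: item[1], reverse=True):
        total, index = heapq.heappop(heap)
        groups[index].append(name)
        heapq.heappush(heap, (total + seconds, index))
    return groups


# Hypothetical timings in seconds: one slow test and four fast ones.
sample = {"test_slow": 8.0, "test_a": 2.0, "test_b": 2.0, "test_c": 2.0, "test_d": 2.0}
groups = split_by_duration(sample, splits=2)
totals = [sum(sample[name] for name in group) for group in groups]
print(groups, totals)
```

In this example both groups end up with 8 seconds of work. Splitting the same five tests by count alone (three in one group, two in the other) could leave one runner with 12 seconds and the other with 4.
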
  • 35. Use Multi-Runners and Multi-CPUs in a GitHub Workflow
    https://docs.github.com/en/actions/using-workflows
    python-test-matrix:
      runs-on: ubuntu-latest-8-cores  # needs larger runner configuration
      strategy:
        fail-fast: false  # to collect all failed tests
        matrix:
          group: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
      steps:
        - run: pytest -n auto --splits 10 --group ${{ matrix.group }}
        ...
    10 runners x 8 cores = 80 concurrent test worker processes
  • 40. Cache Python Dependency Installation
    pip install -r requirements.txt
    # resolve dependency versions
    # download and install dependencies
    # In GitHub Workflow
    steps:
      - uses: actions/cache@v3
        id: dependency-cache
        with:
          key: ${{ hashFiles('requirements.txt') }}
      - if: steps.dependency-cache.outputs.cache-hit != 'true'
        run: uv pip install -r requirements.txt --system
    Save 5-10 minutes on each CI run in a large codebase
    Use uv to install faster: https://pypi.org/project/uv/
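
The idea behind keying the cache on `hashFiles('requirements.txt')` can be sketched in plain Python: derive the key from the file contents, so the cache is invalidated only when the pinned dependencies change. This is an illustration and not GitHub's implementation; `cache_key` and the `deps-` prefix are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def cache_key(requirements: Path, prefix: str = "deps-") -> str:
    """Return a key that changes only when the requirements file's bytes change."""
    digest = hashlib.sha256(requirements.read_bytes()).hexdigest()
    return prefix + digest


# Demonstration on a temporary file standing in for requirements.txt.
with tempfile.TemporaryDirectory() as tmp:
    req = Path(tmp) / "requirements.txt"
    req.write_text("pytest==8.2.0\n")
    key_before = cache_key(req)
    key_same = cache_key(req)          # unchanged file: same key, so a cache hit
    req.write_text("pytest==8.2.1\n")  # a version bump changes the key: cache miss
    key_after = cache_key(req)

print(key_before == key_same, key_before == key_after)
```

A content-derived key means unrelated commits reuse the installed dependencies, while any edit to the requirements file forces a fresh install.
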
  • 44. Cache Non-Python Dependency Installation
    https://docs.github.com/en/actions/using-jobs/running-jobs-in-a-container
    Common non-Python dependencies:
    ● Python and Node interpreters
    ● Database: Postgres
    ● System packages: protobuf-compiler, graphviz, etc.
    ● Browsers for end-to-end tests: Playwright
    # Dockerfile
    # a base image
    FROM …
    RUN sudo apt-get install -y postgresql-16 protobuf-compiler
    # After publishing the image to a registry
    # GitHub Workflow
    jobs:
      run-in-container:
        runs-on: ubuntu-latest
        container:
          image: …
    Save 10 minutes or more on each CI run in a large codebase
  • 45. 🎯Strategy #3: Skip Unnecessary Computing
  • 49. Skip Unnecessary Tests and Linters
    https://github.com/marketplace/actions/changed-files
    Run specific tests only when the relevant code has changed
    # GitHub workflow
    jobs:
      changed-files:
        outputs:
          has-py-changes: ${{ steps.find-py-changes.outputs.any_changed }}
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - uses: tj-actions/changed-files@v44
            id: find-py-changes
            with:
              files: '**/*.py'
      run-pytest:
        needs: changed-files
        if: needs.changed-files.outputs.has-py-changes == 'True'
        steps:
          - run: pytest
    💡Linters can also run only on updated files
    ✨Modularize code and use build systems to run even fewer tests
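
The gating logic in the workflow above reduces to one question: does any changed path match a pattern? A minimal Python sketch of that check follows. It is not how the changed-files action is implemented, and `any_changed` plus the sample paths are hypothetical.

```python
from fnmatch import fnmatch


def any_changed(changed_files: list[str], pattern: str = "*.py") -> bool:
    """Return True if at least one changed path matches the glob pattern."""
    # fnmatch's `*` also matches `/`, so "*.py" covers files in any directory.
    return any(fnmatch(path, pattern) for path in changed_files)


# Hypothetical pull requests: one touches Python code, the other only docs.
python_pr = ["app/models/user.py", "README.md"]
docs_pr = ["README.md", "docs/setup.md"]

run_pytest_for_python_pr = any_changed(python_pr)  # pytest job should run
run_pytest_for_docs_pr = any_changed(docs_pr)      # pytest job can be skipped
print(run_pytest_for_python_pr, run_pytest_for_docs_pr)
```

In CI the list of paths could come from `git diff --name-only` against the base branch; the same function could gate linters by passing a different pattern.
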
  • 51. Skip Coverage Analysis for Unchanged Files
    # By default, pytest --cov measures coverage for all files, which is slow in a large codebase
    # Add --cov=UPDATED_PATH1 --cov=UPDATED_PATH2 … to measure only the updated files
    Save 1 minute or more on each CI run in a large codebase
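
One way to produce those `--cov` arguments is to derive them from the changed Python files. The helper below is a sketch under that assumption; `build_cov_args` and the sample paths are hypothetical and not from the talk. It narrows coverage to the directories that contain changed files, which is coarser than per-file measurement.

```python
def build_cov_args(changed_files: list[str]) -> list[str]:
    """Turn changed Python files into --cov options, one per containing directory."""
    directories = sorted(
        {
            path.rsplit("/", 1)[0] if "/" in path else "."
            for path in changed_files
            if path.endswith(".py")
        }
    )
    return [f"--cov={directory}" for directory in directories]


# Hypothetical changed files from a pull request.
changed = ["billing/invoice.py", "billing/tax.py", "users/models.py", "README.md"]
command = ["pytest", *build_cov_args(changed)]
print(" ".join(command))
```

When no Python file has changed, the helper returns an empty list, so the resulting command runs without any coverage measurement.
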
  • 53. Use Faster and Cheaper Runners
    Use new-generation CPUs and memory to run faster and cheaper
    Third-party hosted runner providers:
    ● Namespace
    ● BuildJet
    ● Actuated
    ● …
  • 54. Use self-hosted runners with auto-scaling
    https://github.com/actions/actions-runner-controller/
    Use Actions Runner Controller to deploy auto-scaling runners using Kubernetes with custom hardware specifications (e.g. AWS EC2)
    5X+ Cost Saving and 2X+ Faster Test Speed compared to GitHub runners
  • 56. Results
    Rujul Zaparde, Co-Founder and CEO
    ● Continuously optimizing CI test execution time to improve developer experiences
    ● Increasing test coverage with better quality assurance
  • 57. Recap: 🎯Strategies for Scaling Slow Tests in a Large Codebase: 01 Parallel Execution; 02 Cache; 03 Skip Unnecessary Computing; 04 Modernize Runners
  • 58. Thank You!
    Rujul Zaparde, Co-Founder and CEO
    Lu Cheng, Co-Founder and CTO
    Engineering Blog: https://engineering.ziphq.com
    Job Opportunities: https://ziphq.com/careers