Creating Custom Tag in Python PyYAML
Last Updated :
23 Jul, 2024
YAML (a recursive acronym for "YAML Ain't Markup Language") is a human-readable data serialization format, often used alongside JSON and XML for configuration and data interchange. PyYAML is a YAML parser and emitter library for Python. Beyond the built-in types, PyYAML lets you define custom tags, so you can work with more complex data in the YAML format.
What Are Custom Tags?
In YAML, a tag indicates the data type of a node. For instance, !!str marks a string and !!int marks an integer. PyYAML also lets you define custom tags that represent more complex or domain-specific data types. This is especially useful when configuration files or data formats need more than the simple built-in types.
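As a minimal sketch of what a tag does, the snippet below loads the same kind of scalar with and without an explicit standard tag. It assumes PyYAML is installed; the variable names are illustrative only.

```python
import yaml

# An explicit tag overrides the type PyYAML would otherwise infer.
untagged = yaml.safe_load("123")           # plain scalar, resolved as an integer
as_string = yaml.safe_load("!!str 123")    # forced to a string by the !!str tag
as_integer = yaml.safe_load('!!int "42"')  # quoted text, forced to an integer

print(type(untagged).__name__, type(as_string).__name__, type(as_integer).__name__)
```

A custom tag works the same way, except that you supply the code that turns the node into a Python object.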
Why Use Custom Tags?
Custom tags are beneficial when you need to:
- Encode complex data structures.
- Represent domain-specific concepts.
- Reduce clutter in the data representation.
- Keep values consistent and valid.
Creating Custom Tag in Python PyYAML
To create and use custom tags in PyYAML, you need to follow these steps:
- Define the custom data structure.
- Create a Python class to represent the custom data structure.
- Implement the necessary logic to serialize and deserialize the custom data structure.
- Register the custom tag with PyYAML.
Step 1: Define the Custom Data Structure
Let's define a simple custom data structure for a point in a 2D space:
!point
x: 10
y: 20
Step 2: Create a Python Class
Next, create a Python class to represent this custom data structure. Save this class in a file named point.py.
Python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __repr__(self):
        return f"Point(x={self.x}, y={self.y})"
Step 3: Implement Serialization and Deserialization
Create a file named custom_tags.py to implement the logic for serialising and deserialising the custom data structure and to register the custom tag with PyYAML.
Python
import yaml
from point import Point
def point_representer(dumper, data):
    # Serialize a Point as a mapping tagged !point
    return dumper.represent_mapping('!point', {'x': data.x, 'y': data.y})
def point_constructor(loader, node):
    # Build a Point from the mapping found under a !point tag
    values = loader.construct_mapping(node)
    return Point(values['x'], values['y'])
# Register the representer and constructor with PyYAML
yaml.add_representer(Point, point_representer)
yaml.add_constructor('!point', point_constructor)
Step 4: Using the Custom Tag
Now, you can use the custom tag in your YAML files and load them with PyYAML:
File: example.yaml
!point
x: 10
y: 20
Load the YAML data in the main.py file:
Python
import yaml
from custom_tags import Point
# Load the YAML data
with open('example.yaml', 'r') as file:
    point = yaml.load(file, Loader=yaml.FullLoader)
print(point)
# Dump the Point object back to YAML
yaml_string = yaml.dump(point)
print(yaml_string)
Step 5: Run the Python Script
Navigate to the my_project directory in your terminal and run the main.py script:
cd path/to/my_project
python main.py
Output:
Point(x=10, y=20)
!point
x: 10
y: 20
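The same round trip can be sketched in a single file. This variant is an assumption-based alternative, not the article's layout: it registers the tag on yaml.SafeLoader and yaml.SafeDumper through the optional Loader and Dumper arguments, so that yaml.safe_load and yaml.safe_dump understand !point.

```python
import yaml

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f"Point(x={self.x}, y={self.y})"

def point_representer(dumper, data):
    # Emit a Point as a mapping tagged !point
    return dumper.represent_mapping('!point', {'x': data.x, 'y': data.y})

def point_constructor(loader, node):
    # Rebuild a Point from the tagged mapping
    values = loader.construct_mapping(node)
    return Point(values['x'], values['y'])

# Register only on the safe loader and dumper (this changes them process-wide)
yaml.add_constructor('!point', point_constructor, Loader=yaml.SafeLoader)
yaml.add_representer(Point, point_representer, Dumper=yaml.SafeDumper)

point = yaml.safe_load("!point\nx: 10\ny: 20\n")
dumped = yaml.safe_dump(point)
print(point)
print(dumped)
```

Restricting registration to the safe classes keeps the loader from constructing arbitrary Python objects while still accepting the one custom tag you chose to allow.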
Advanced PyYAML Custom Tags
Let's consider a more advanced example where we define a custom tag for a 3D point:
File: point3d.py
Python
class Point3D:
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z
    def __repr__(self):
        return f"Point3D(x={self.x}, y={self.y}, z={self.z})"
File: custom_tags.py
Add the following code to handle the 3D point:
Python
import yaml
from point import Point
from point3d import Point3D
# Existing Point serialization and deserialization
def point_representer(dumper, data):
    return dumper.represent_mapping('!point', {'x': data.x, 'y': data.y})
def point_constructor(loader, node):
    values = loader.construct_mapping(node)
    return Point(values['x'], values['y'])
# New Point3D serialization and deserialization
def point3d_representer(dumper, data):
    return dumper.represent_mapping('!point3d', {'x': data.x, 'y': data.y, 'z': data.z})
def point3d_constructor(loader, node):
    values = loader.construct_mapping(node)
    return Point3D(values['x'], values['y'], values['z'])
# Register the representers and constructors with PyYAML
yaml.add_representer(Point, point_representer)
yaml.add_constructor('!point', point_constructor)
yaml.add_representer(Point3D, point3d_representer)
yaml.add_constructor('!point3d', point3d_constructor)
File: example3d.yaml
!point3d
x: 10
y: 20
z: 30
File: main3d.py
Python
import yaml
from custom_tags import Point3D
# Load the YAML data
with open('example3d.yaml', 'r') as file:
    point3d = yaml.load(file, Loader=yaml.FullLoader)
print(point3d)  # Output: Point3D(x=10, y=20, z=30)
# Dump the Point3D object back to YAML
yaml_string = yaml.dump(point3d)
print(yaml_string)
Run the Python Script
Navigate to the my_project directory in your terminal and run the main3d.py script:
cd path/to/my_project
python main3d.py
Output:
Point3D(x=10, y=20, z=30)
!point3d
x: 10
y: 20
z: 30
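PyYAML also offers a class-based route through yaml.YAMLObject, which registers a constructor and representer for you when the class declares a yaml_tag. The sketch below is an alternative to the functions above, not a replacement for them. Binding the class to SafeLoader and SafeDumper is an assumption made here so that yaml.safe_load and yaml.safe_dump can be used.

```python
import yaml

class Point3D(yaml.YAMLObject):
    # Declaring yaml_tag makes PyYAML register this class automatically.
    yaml_tag = '!point3d'
    yaml_loader = yaml.SafeLoader   # so yaml.safe_load recognizes the tag
    yaml_dumper = yaml.SafeDumper   # so yaml.safe_dump can emit the class

    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

    def __repr__(self):
        return f"Point3D(x={self.x}, y={self.y}, z={self.z})"

# The default YAMLObject loading path sets attributes from the mapping
# directly; it does not call __init__.
loaded = yaml.safe_load("!point3d\nx: 10\ny: 20\nz: 30\n")
dumped = yaml.safe_dump(loaded)
print(loaded)
print(dumped)
```

This saves boilerplate for simple classes. Explicit constructor functions remain the better fit when loading needs validation or type conversion.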
Advanced Features
Here are some of the key advanced features:
1. Custom Constructors and Representers
Custom constructors and representers allow you to define how YAML nodes are converted to Python objects and vice versa. This feature is particularly useful for handling complex data structures or domain-specific objects.
Example: Custom Constructor and Representer for a Date
Python
import yaml
from datetime import datetime
class CustomDate(datetime):
    pass
def date_constructor(loader, node):
    value = loader.construct_scalar(node)
    return CustomDate.strptime(value, '%Y-%m-%d')
def date_representer(dumper, data):
    value = data.strftime('%Y-%m-%d')
    return dumper.represent_scalar('!date', value)
yaml.add_constructor('!date', date_constructor)
yaml.add_representer(CustomDate, date_representer)
Usage:
Python
# YAML data with custom date tag
yaml_data = """
!date '2024-07-10'
"""
# Load the YAML data
date_obj = yaml.load(yaml_data, Loader=yaml.FullLoader)
print(date_obj)
# Dump the date object back to YAML
yaml_string = yaml.dump(date_obj)
print(yaml_string)
Output:
2024-07-10 00:00:00
!date '2024-07-10'
2. Custom Resolver
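An implicit resolver tells PyYAML which tag to assign to an untagged plain scalar whose text matches a regular expression, so values can be written without an explicit tag. Resolvers are tried in the order they were registered, so PyYAML's built-in resolvers (for example, the one for timestamps) take precedence when they match the same text.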
Example: Custom Resolver for Dates
Python
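import re
# add_implicit_resolver expects a compiled pattern and the possible first characters.
date_pattern = re.compile(r'^\d{4}-\d{2}-\d{2}$')
yaml.add_implicit_resolver('!date', date_pattern, first=list('0123456789'))
# The built-in timestamp resolver is registered earlier and also matches YYYY-MM-DD, so it wins here.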
Usage:
Python
# YAML data with implicit date recognition
yaml_data = """
2024-07-10
"""
# Load the YAML data
date_obj = yaml.load(yaml_data, Loader=yaml.FullLoader)
print(date_obj)
Output:
2024-07-10
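PyYAML's built-in timestamp resolver already claims YYYY-MM-DD values, so an implicit resolver is easiest to observe with a format the built-ins do not match. The sketch below is a hypothetical illustration: the DateLoader subclass, the !slashdate tag, and the DD/MM/YYYY format are all assumptions introduced here, not part of the example above.

```python
import re
from datetime import date, datetime

import yaml

class DateLoader(yaml.SafeLoader):
    """Hypothetical loader subclass, so the registrations stay local to it."""

# DD/MM/YYYY is not matched by any of PyYAML's built-in resolvers
SLASH_DATE = re.compile(r'^\d{2}/\d{2}/\d{4}$')

def slash_date_constructor(loader, node):
    # Convert the matched text into a datetime.date
    value = loader.construct_scalar(node)
    return datetime.strptime(value, '%d/%m/%Y').date()

# Untagged scalars that start with a digit and match the pattern get the tag
DateLoader.add_implicit_resolver('!slashdate', SLASH_DATE, list('0123456789'))
DateLoader.add_constructor('!slashdate', slash_date_constructor)

data = yaml.load("released: 10/07/2024\n", Loader=DateLoader)
print(data)
```

Registering on a subclass rather than on the shared loaders avoids changing how other code in the same process parses YAML.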
3. Multi-Document YAML
PyYAML supports multi-document YAML, which lets you load and dump multiple documents in a single file or stream.
Example: Multi-Document YAML
Python
# Multi-document YAML data
yaml_data = """
---
name: Document 1
value: 123
---
name: Document 2
value: 456
"""
# Load multiple documents
documents = list(yaml.load_all(yaml_data, Loader=yaml.FullLoader))
print(documents)
# Dump multiple documents
yaml_string = yaml.dump_all(documents)
print(yaml_string)
Output:
[{'name': 'Document 1', 'value': 123}, {'name': 'Document 2', 'value': 456}]
name: Document 1
value: 123
---
name: Document 2
value: 456
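When writing several documents, it can help to mark the start of every document explicitly. The sketch below passes explicit_start=True to yaml.safe_dump_all and then checks that the stream loads back unchanged. The variable names are illustrative.

```python
import yaml

documents = [
    {'name': 'Document 1', 'value': 123},
    {'name': 'Document 2', 'value': 456},
]

# explicit_start=True writes a '---' marker before every document,
# including the first one.
stream_text = yaml.safe_dump_all(documents, explicit_start=True)
print(stream_text)

# Loading the stream again yields the original documents in order.
reloaded = list(yaml.safe_load_all(stream_text))
print(reloaded == documents)
```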
Conclusion
Custom tags in PyYAML let you extend YAML with your own data types. You define the types as Python classes, write a constructor and a representer for each one, and register them under a tag. This makes PyYAML a flexible option for managing and exchanging configuration data in Python-based software systems.