Type checking verifies that the data types used in a computer program are correct. It’s a critical part of software development, as it helps prevent errors and improve code quality. Data validation is the process of ensuring that data is accurate, complete, and consistent. Validating data before using it in any application is important, as invalid data can lead to errors and incorrect results.
Pydantic is a Python library that provides a powerful and intuitive way to perform type checking and data validation at runtime. It leverages Python’s type annotations to define and validate data structures, making it easy to ensure that data is consistent and correct. Pydantic can be installed with the following terminal command:
pip install pydantic
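The examples in this article call model_validate() and field_validator(), which belong to the Pydantic version 2 API. As a small sketch, assuming Pydantic is already installed, we can confirm the installed version before running them:

```python
import pydantic

# VERSION is a plain string such as "2.x.y"; the examples in this article assume major version 2.
major_version = int(pydantic.VERSION.split(".")[0])
print(f"Installed Pydantic version: {pydantic.VERSION}")

if major_version < 2:
    print("These examples expect Pydantic 2 or later; try: pip install --upgrade pydantic")
```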
Python’s type annotations are a way to hint to the type checker what type of data is expected for a particular variable or function parameter. Pydantic takes this one step further by allowing us to define custom constraints on the given data structures. For example, we can specify that a field must be a non-empty string, a positive integer, or a list of unique values.
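As a rough preview of what such constraints can look like, here is a minimal sketch. The Order model and its field names are hypothetical and exist only for illustration. It uses Field() for the non-empty string and the positive integer, and a small custom validator for the list of unique values, which is one possible approach among several:

```python
from typing import Annotated, List

from pydantic import BaseModel, Field, ValidationError, field_validator


class Order(BaseModel):  # hypothetical model, used only for illustration
    customer: Annotated[str, Field(min_length=1)]  # non-empty string
    quantity: Annotated[int, Field(gt=0)]          # positive integer
    tags: List[str]                                # uniqueness is checked below

    @field_validator("tags")
    @classmethod
    def tags_must_be_unique(cls, value: List[str]) -> List[str]:
        # Reject the list if converting it to a set drops any items.
        if len(value) != len(set(value)):
            raise ValueError("tags must not contain duplicates")
        return value


valid_order = Order(customer="Tester1", quantity=2, tags=["new", "gift"])
print(valid_order)

try:
    Order(customer="", quantity=0, tags=["new", "new"])
except ValidationError as error:
    # One entry per failing field: customer, quantity, and tags.
    failed_fields = [entry["loc"][0] for entry in error.errors()]
    print(failed_fields)
```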
To use Pydantic for type checking, we simply create a Pydantic model class and define the fields we need. The type annotations for the fields will specify the expected types. For example, the following code defines a model class for a user:
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int
    email: str
In the code above, we import BaseModel from the pydantic library. We define a class named User that inherits from BaseModel. The class contains three fields:
name: string
age: integer
email: string
If we try to create a new User instance with invalid data, Pydantic will raise a ValidationError exception. For example, the following code will raise an exception because the value given for the age field, "Eighteen", can’t be converted to an integer:
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int
    email: str

user_data = {
    "name": "Tester1",
    "age": "Eighteen",
    "email": "Tester1@example.com",
}

user = User.model_validate(user_data)

print(user.name)
print(user.age)
print(user.email)
Here’s a breakdown of the code:
Line 1: We import the BaseModel class from the pydantic library.
Lines 3–6: We define a User class that inherits from the BaseModel class. This class defines the structure of a user object, which contains name, age, and email with their respective data types.
Lines 8–12: We create a user_data dictionary containing the data for a new user.
Line 14: We validate the user_data dictionary using the User.model_validate() class method. This method returns a User object named user if the data is valid, or raises a ValidationError exception if the data is invalid.
Lines 16–18: We print the values of the user object’s name, age, and email properties.
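In practice, we usually want to catch the exception and report what went wrong rather than let the program stop. The following is a minimal sketch of that idea, using the same User model and the errors() method of ValidationError, which returns one dictionary per failed field. The printed message format is our own choice, not something Pydantic prescribes:

```python
from pydantic import BaseModel, ValidationError


class User(BaseModel):
    name: str
    age: int
    email: str


user_data = {"name": "Tester1", "age": "Eighteen", "email": "Tester1@example.com"}

try:
    user = User.model_validate(user_data)
except ValidationError as error:
    # Each entry describes one failure: "loc" is the field path, "msg" the explanation.
    for entry in error.errors():
        field_path = ".".join(str(part) for part in entry["loc"])
        print(f"{field_path}: {entry['msg']}")
    error_fields = [entry["loc"] for entry in error.errors()]
```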
Now, when we correct the input value for age and use an integer, the code should work. Here’s the updated code:
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int
    email: str

user_data = {
    "name": "Tester1",
    "age": 18,  # updated value
    "email": "Tester1@example.com",
}

user = User.model_validate(user_data)

print(user.name)
print(user.age)
print(user.email)
Compared to manual type checking with hand-written checks, a Pydantic model declares the expected types once and validates every field automatically whenever an instance is created.
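One detail worth knowing is that Pydantic validates in a lax mode by default and converts compatible values. The string "18" would therefore be accepted for age and converted to the integer 18, while "Eighteen" is rejected. The sketch below illustrates this with made-up sample data and shows the strict keyword argument of model_validate(), which turns the conversion off:

```python
from pydantic import BaseModel, ValidationError


class User(BaseModel):
    name: str
    age: int
    email: str


user_data = {"name": "Tester1", "age": "18", "email": "Tester1@example.com"}

# Default (lax) mode: the numeric string "18" is converted to the integer 18.
lax_user = User.model_validate(user_data)
print(type(lax_user.age), lax_user.age)

# Strict mode: the same input is rejected because "18" is a str, not an int.
try:
    User.model_validate(user_data, strict=True)
    strict_rejected = False
except ValidationError:
    strict_rejected = True

print(strict_rejected)
```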
Pydantic can validate data in a number of ways, including range checking, regular expression matching, uniqueness checking, and custom validation.
Pydantic range checking is a feature that allows us to validate data against a specified range of values. This can be done by using the Field() function to bound an integer value and the constr() function to bound the length of a string.
The min_length and max_length keyword arguments of constr() define the allowed range for the string length. The ge (greater than or equal to) and le (less than or equal to) keyword arguments of Field() define the inclusive bounds for the integer value. Here’s an example of using Pydantic range checking to validate the name and age fields of a class:
from pydantic import BaseModel, constr, Field

class User(BaseModel):
    name: constr(min_length=3, max_length=20)
    age: int = Field(ge=18, le=68)
    email: str

user_data = {
    "name": "Te",
    "age": 16,
    "email": "Tester1@example.com",
}

user = User.model_validate(user_data)

print(user.name)
print(user.age)
print(user.email)
Here’s an explanation:
Line 1: We import the BaseModel class and the constr and Field functions from the pydantic library.
Line 3: We define a User class that inherits from the BaseModel class.
Line 4: We define a name field for the User class. The name field is a string with a minimum length of 3 characters and a maximum length of 20 characters.
Line 5: We define an age field for the User class. The age field is an integer with a minimum value of 18 and a maximum value of 68.
Line 6: We define an email field for the User class. The email field is a string.
Lines 8–12: We create a user_data dictionary containing the data for a new user.
Line 14: We validate the user_data dictionary against the User model using the User.model_validate() class method, which returns a validated User object that we assign to user.
Lines 16–18: We print the name, age, and email attributes of the user object.
When we run the code above, we see two errors:
The name string is of length 2, but according to the constraints, the length should be between 3 and 20 characters.
The age integer has a value of 16, while the range defined for it is 18 to 68.
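To see both failures programmatically instead of reading them from a traceback, we can catch the exception and inspect it. The sketch below is one way to do that. It also writes the same constraints with Annotated and Field(), an alternative spelling to constr(). The failures dictionary is our own helper structure, not part of Pydantic:

```python
from typing import Annotated

from pydantic import BaseModel, Field, ValidationError


class User(BaseModel):
    # Annotated[...] with Field(...) expresses the same limits as constr() and Field() above.
    name: Annotated[str, Field(min_length=3, max_length=20)]
    age: Annotated[int, Field(ge=18, le=68)]
    email: str


user_data = {"name": "Te", "age": 16, "email": "Tester1@example.com"}

try:
    User.model_validate(user_data)
    failures = {}
except ValidationError as error:
    # Map each failing field name to the error type identifier reported by Pydantic.
    failures = {entry["loc"][0]: entry["type"] for entry in error.errors()}

print(failures)
```

Pydantic reports all failing fields in a single exception, so both problems show up in one pass rather than one at a time.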
When we correct the values according to the constraints, the code runs perfectly:
from pydantic import BaseModel, constr, Field

class User(BaseModel):
    name: constr(min_length=3, max_length=20)
    age: int = Field(ge=18, le=68)
    email: str

user_data = {
    "name": "Tester1",  # updated value
    "age": 20,  # updated value
    "email": "Tester1@example.com",
}

user = User.model_validate(user_data)

print(user.name)
print(user.age)
print(user.email)
These checks reject out-of-range values at the point where data enters the program, so the rest of the code can rely on clean data.
To use regular expression matching in Pydantic, we can use the constr() function. It allows specifying a regular expression pattern that the field value must match. For example, the following code shows how to use constr() and its pattern keyword argument to validate an email address:
from pydantic import BaseModel, constr

class CheckEmail(BaseModel):
    email: constr(pattern=r'[a-zA-Z0-9._]@([\w-]+\.)+[\w-]{2,4}')

user_data = {
    "email" : "Tester1@example"
}

user = CheckEmail.model_validate(user_data)

print(user.email)
Here’s an explanation of the code above:
Line 1: We import the BaseModel class and the constr function from the Pydantic library.
Lines 3–4: We use the BaseModel class to create the Pydantic model class CheckEmail, and we use constr to restrict the email field to strings that match the regular expression given in pattern.
Lines 6–8: We create a dictionary called user_data with a single key-value pair, email.
Line 10: We call the CheckEmail.model_validate() method to validate the user_data dictionary.
Line 12: We print the email attribute of the user variable.
The code above will show an error because user_data doesn’t contain a valid email address: the domain part, example, has no dot followed by an ending such as com. If we correct the email address, the code works fine:
from pydantic import BaseModel, constr

class CheckEmail(BaseModel):
    email: constr(pattern=r'[a-zA-Z0-9._]@([\w-]+\.)+[\w-]{2,4}')

user_data = {
    "email" : "Tester1@example.com"
}

user = CheckEmail.model_validate(user_data)

print(user.email)
Using regular expressions in Pydantic is a great way to ensure that the models contain valid data.
To check for uniqueness in Pydantic, we can use the field_validator() decorator. The field_validator() decorator allows us to attach custom validation logic to an individual field. Validating the model instance as a whole is the job of the separate model_validator() decorator.
Here’s a simple example of how to use the field_validator() decorator to check for uniqueness in a Pydantic model:
from pydantic import BaseModel, Field, field_validator

class User(BaseModel):
    name: str = Field(unique=True)

    __values__ = {}

    def __init__(self, **data):
        super().__init__(**data)
        self.__values__[self.name] = self

    @field_validator("name")
    def validate_unique_name(cls, value):
        if value in cls.__values__:
            raise ValueError("Duplicate names are not allowed")
        return value

def check_for_duplicates(user_data):
    duplicates = []
    for name in user_data:
        try:
            User(name=name)
        except ValueError:
            duplicates.append(name)
    return duplicates

user_data = ["Tester1", "Tester1", "Tester2", "Tester2"]

duplicates = check_for_duplicates(user_data)
if duplicates:
    print("Duplicate names:")
    for name in duplicates:
        print(f"* {name}")
else:
    print("There are no duplicate names.")
Here’s an explanation of the code above:
Line 1: We import the BaseModel class, the Field function, and the field_validator decorator from the Pydantic library.
Line 3: We define a Pydantic model called User.
Line 4: The User model has a single field name, which is defined as a str field with the unique keyword. Pydantic has no built-in unique constraint, so this keyword doesn’t enforce anything on its own; the validator on lines 12–16 performs the actual check.
Line 6: The User model also has a __values__ class attribute, which is a dictionary that stores all of the existing User instances, keyed by name.
Lines 8–10: The __init__() method of the User model adds the new User instance to the __values__ dictionary once it has passed validation.
Lines 12–16: The validate_unique_name() field validator checks if the name field value is already present in the __values__ dictionary. If it is, the field validator raises a ValueError exception.
Lines 18–25: check_for_duplicates() checks for duplicate names in a list of names. It does this by trying to create a new User instance for each name in the list. If the User() constructor raises a ValueError exception (Pydantic’s ValidationError is a subclass of ValueError), then the name is already present in the __values__ dictionary and is therefore a duplicate.
Line 27: We create a list of names called user_data.
Line 29: We call the check_for_duplicates() function to check for duplicate names in the list.
Lines 30–35: We print a list of duplicate names, if any.
When we run the above code, it displays the duplicates in the list. Now, when we remove the duplicates and run the code again, it shows that there are no duplicates in the list:
from pydantic import BaseModel, Field, field_validator

class User(BaseModel):
    name: str = Field(unique=True)

    __values__ = {}

    def __init__(self, **data):
        super().__init__(**data)
        self.__values__[self.name] = self

    @field_validator("name")
    def validate_unique_name(cls, value):
        if value in cls.__values__:
            raise ValueError("Duplicate names are not allowed")
        return value

def check_for_duplicates(user_data):
    duplicates = []
    for name in user_data:
        try:
            User(name=name)
        except ValueError:
            duplicates.append(name)
    return duplicates

user_data = ["Tester1", "Tester2"]

duplicates = check_for_duplicates(user_data)
if duplicates:
    print("Duplicate names:")
    for name in duplicates:
        print(f"* {name}")
else:
    print("There are no duplicate names.")
Using uniqueness checking in Pydantic is a great way to ensure that our data is consistent and free of duplicate entries.
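The example above keeps its registry of names on the class, so the registry lives for as long as the program runs. As an alternative sketch, not the approach used above, the set of names seen so far can be passed in through the context argument of model_validate() and read from info.context inside the validator. The seen_names key and the find_duplicates() helper are hypothetical names chosen for this illustration:

```python
from pydantic import BaseModel, ValidationError, ValidationInfo, field_validator


class User(BaseModel):
    name: str

    @field_validator("name")
    @classmethod
    def name_not_seen_before(cls, value: str, info: ValidationInfo) -> str:
        # "seen_names" is a hypothetical context key supplied by the caller.
        seen = (info.context or {}).get("seen_names")
        if seen is not None:
            if value in seen:
                raise ValueError("Duplicate names are not allowed")
            seen.add(value)
        return value


def find_duplicates(names):
    """Return the names that occur more than once, in order of detection."""
    seen = set()  # fresh registry for every call
    duplicates = []
    for name in names:
        try:
            User.model_validate({"name": name}, context={"seen_names": seen})
        except ValidationError:
            duplicates.append(name)
    return duplicates


print(find_duplicates(["Tester1", "Tester1", "Tester2", "Tester2"]))
print(find_duplicates(["Tester1", "Tester2"]))
```

Because each call to find_duplicates() starts with an empty set, repeated calls don’t affect one another, which can make this variant easier to test.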
Using Pydantic for type checking and data validation has a number of benefits, including:
Pydantic helps in writing more robust and maintainable code by ensuring that data is consistent and correct.
Pydantic can catch data validation errors early on before they cause problems in the application.
Pydantic makes it easy to define and validate data structures, which can save time and effort.
Pydantic also has some limitations:
Pydantic isn’t included in the Python standard library, so it requires a separate installation.
Pydantic can be more complex than necessary for simple programs, where plain classes or dataclasses may be enough.
Pydantic can be slower than plain classes or dataclasses because it does extra validation and processing of the data whenever an instance is created.
Pydantic is a powerful and intuitive library for type checking and data validation in Python. It’s easy to use and can provide significant benefits for our code quality, error reduction, and productivity.