Python magic: Metaclasses and Descriptors

Python gives you unusually fine-grained control over how classes and objects are built and how their attributes behave. Two tools at the center of that control are metaclasses and descriptors. Metaclasses let you customize how classes are created, and descriptors let you customize how individual attributes are read, written, and deleted.

Before we delve into the tutorial and examples, let's have a brief overview of these terms.

  • Metaclass: In Python, everything is an object, including classes. A metaclass is the class of a class: it creates class objects and controls how they are built. The default metaclass is type.
  • Descriptor: A descriptor is an object that implements at least one method of the descriptor protocol (__get__, __set__, or __delete__). Assigned as a class attribute, it lets you create managed attributes whose access runs your own code.
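
To make the two definitions concrete, here is a minimal sketch. The names Point, Dynamic, Meta, Base, Child, Area, and Rect are illustrative only and do not appear in the tutorial code later in this article.

```python
# A class is itself an object, and its type is its metaclass.
class Point:
    pass

default_metaclass = type(Point)      # the built-in `type`

# Calling the metaclass directly builds a class at runtime:
# type(name, bases, namespace)
Dynamic = type("Dynamic", (object,), {"x": 1})


class Meta(type):
    """Hypothetical metaclass that records every class it creates."""
    registry = []

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        mcls.registry.append(name)
        return cls


class Base(metaclass=Meta):
    pass


class Child(Base):       # subclasses inherit the metaclass
    pass


class Area:
    """Hypothetical descriptor implementing only __get__."""

    def __get__(self, instance, owner=None):
        if instance is None:
            return self              # accessed on the class, not an instance
        return instance.width * instance.height


class Rect:
    area = Area()                    # descriptors live on the class

    def __init__(self, width, height):
        self.width = width
        self.height = height


rect = Rect(3, 4)
rect_area = rect.area                # runs Area.__get__, giving 12
```

Note that Area implements a single protocol method, which is enough for Python to treat it as a descriptor.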

The Power of Metaclasses and Descriptors

One of the primary benefits of using metaclasses and descriptors is the potential for abstraction and encapsulation. With these tools, you can create complex behaviors and interfaces while keeping the actual usage simple and clean. They allow you to add extra behaviors to your classes and objects, such as type checking, thread-safety, and more.
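
As one possible illustration of the type-checking case, the sketch below hides validation behind ordinary attribute syntax. Typed and Account are hypothetical names chosen for this example, not part of any library.

```python
class Typed:
    """Hypothetical descriptor that enforces a type on assignment."""

    def __init__(self, expected_type):
        self.expected_type = expected_type

    def __set_name__(self, owner, name):
        # Called once when the owning class is created.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        if not isinstance(value, self.expected_type):
            raise TypeError(
                f"{self.name} must be {self.expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
        instance.__dict__[self.name] = value


class Account:
    owner = Typed(str)
    balance = Typed(int)

    def __init__(self, owner, balance):
        self.owner = owner           # both assignments go through Typed.__set__
        self.balance = balance


acct = Account("ada", 100)
acct.balance = 250                   # plain attribute syntax, validated behind the scenes

try:
    acct.balance = "a lot"
except TypeError as exc:
    error_message = str(exc)         # "balance must be int, got str"
```

The calling code never sees the validation logic, which is the encapsulation benefit described above.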

The Complexity

While metaclasses and descriptors are powerful tools, they also add a considerable layer of complexity to your code. They are considered advanced Python features and can be quite challenging to understand and debug, particularly for those new to the language. Moreover, because they affect the very creation and management of classes and attributes, improper use can lead to complex issues and hard-to-track bugs.

Drawbacks

In addition to complexity, another drawback of metaclasses and descriptors is the potential for overengineering. While these tools can be used to enforce certain behaviors or constraints, they often aren't the simplest or most Pythonic way to achieve these goals.

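For instance, Python already has built-in ways to handle many common tasks: property for managed attribute access, __init__ and __new__ for instance creation, and __init_subclass__ or class decorators for customizing subclasses. Deviating from these standard tools can make your code harder to read and maintain.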
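
A short sketch of what those simpler tools can look like, assuming the goals are attribute validation and subclass registration. Temperature, Plugin, CsvExporter, and JsonExporter are made-up names for this example. property is itself implemented as a descriptor, so this is the same mechanism with less machinery.

```python
class Temperature:
    """Validation with a plain property; no custom descriptor class."""

    def __init__(self, celsius):
        self.celsius = celsius       # goes through the setter below

    @property
    def celsius(self):
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        if not isinstance(value, (int, float)):
            raise TypeError("celsius must be a number")
        self._celsius = value


class Plugin:
    """Subclass registration with __init_subclass__; no metaclass."""
    registry = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        Plugin.registry[cls.__name__] = cls


class CsvExporter(Plugin):
    pass


class JsonExporter(Plugin):
    pass


reading = Temperature(21.5)

try:
    reading.celsius = "hot"
except TypeError:
    rejected = True                  # the setter refused the string
```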

Tutorial Example

Now let's examine a larger example that combines a metaclass and descriptors into a custom framework for data classes, that is, classes used mainly to hold values. The metaclass checks that every declared field is annotated, and the descriptors validate values and apply defaults when a field is assigned.

import json
from dataclasses import dataclass, is_dataclass
from typing import Any, Callable, Optional, get_type_hints
from datetime import datetime
from uuid import UUID

def clean_dict(d:dict)->dict: # Utility function to clean null values
    return {k: v for k, v in d.items() if v is not None}

class PythonicJSONEncoder(json.JSONEncoder): # jsonifies datetime and UUID
    def default(self, obj):
        if isinstance(obj, datetime):
            return obj.astimezone().isoformat()
        elif isinstance(obj, UUID):
            return str(obj)
        elif hasattr(obj, 'dict'):
            return obj.dict()
        return super().default(obj)


Required:Any = Ellipsis # sentinel default that marks a field as required
NoArgsCallable = Callable[[], Any] # callable that takes no arguments


class DataMetaClass(type): # Our beautiful metaclass that typechecks and applies dataclass decorator
    def __new__(cls,name, bases, attrs):
        for attr_name, attr_value in attrs.items():
            if not isinstance(attr_value, DataFieldModel):
                continue
            if attr_name not in attrs.get('__annotations__', {}):
                raise TypeError(f'Attribute "{attr_name}" must have a type annotation')
        new_class = super().__new__(cls, name, bases, attrs)
        if not is_dataclass(new_class):
            return dataclass(new_class) # type: ignore
        return new_class


class DataFieldDescriptor: # Base descriptor: stores the value in the instance __dict__ and type-checks it on assignment
    def __init__(self):
        self.name = None

    def __set_name__(self, owner, name):
        self.name = name
        self.type = get_type_hints(owner).get(name, Any)

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return instance.__dict__[self.name]
    
    def __set__(self, instance, value):
        # typing.Any accepts every value and cannot be passed to isinstance()
        if self.type is not Any and not isinstance(value, self.type):
            raise TypeError(f'{self.name} field must be of type {self.type.__name__} not {type(value).__name__}')
        instance.__dict__[self.name] = value

    def __delete__(self, instance):
        del instance.__dict__[self.name]
        
class DataFieldModel(DataFieldDescriptor): # The actual descriptor implementation
    def __init__(self, default=None,*, default_factory=None, required=None, index=None, unique=None):
        super().__init__()
        self.default = default
        self.default_factory = default_factory
        self.required = required
        self.index = index
        self.unique = unique
        
    def __set__(self, instance, value):
        # Resolve None to a default before type checking, so a missing value
        # can be filled in rather than rejected as NoneType
        if value is None:
            if self.default is Required:
                raise ValueError(f'{self.name} is required')
            if self.default is not None:
                value = self.default
            elif self.default_factory is not None:
                value = self.default_factory()
        super().__set__(instance, value)
        
    def __set_name__(self, owner, name):
        self.name = name
        self.type = owner.__annotations__.get(name, Any)
        if self.index is not None:
            indexes = getattr(owner, '__indexes__', {})
            indexes[name] = self.index
            setattr(owner, '__indexes__', indexes)
        if self.unique is not None:
            uniques = getattr(owner, '__uniques__', {})
            uniques[name] = self.unique
            setattr(owner, '__uniques__', uniques)
            
def Data(default:Any=None,*, default_factory:Optional[NoArgsCallable]=None, required:Optional[bool]=None, index:Optional[bool]=None, unique:Optional[bool]=None)->Any:
    return DataFieldModel(default=default, default_factory=default_factory, required=required, index=index, unique=unique)


class DataClass(metaclass=DataMetaClass): # Base model class, similar in spirit to pydantic's BaseModel
    
    metadata = {
        'indexes': [],
        'uniques': []
    }
    
    def __init__(self, **kwargs):
        for name, _ in self.__annotations__.items(): # pylint: disable=no-member
            value = kwargs.get(name)
            # Type validation for DataFieldModel fields happens in the descriptor's __set__, triggered by setattr below
            attr = getattr(self.__class__, name, None)
            if isinstance(attr, DataFieldModel):
                if attr.default == Required and value is None:
                    if attr.default_factory is not None:
                        value = attr.default_factory()
                        if type(value) != attr.type:
                            raise TypeError(f'{name} must be of type {attr.type.__name__} not {type(value).__name__}')
                    else:
                        raise ValueError(f'{name} is required')
                elif value is None:
                    if attr.default is not None:
                        value = attr.default
                        if type(value) != attr.type:
                            raise TypeError(f'{name} must be of type {attr.type.__name__} not {type(value).__name__}')
                    elif attr.default_factory is not None:
                        value = attr.default_factory()
                        if type(value) != attr.type:
                            raise TypeError(f'{name} must be of type {attr.type.__name__} not {type(value).__name__}')
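                # metadata is shared at class level, so avoid appending the same name on every instantiation
                if attr.index is not None and name not in self.metadata['indexes']:
                    self.metadata['indexes'].append(name)
                if attr.unique is not None and name not in self.metadata['uniques']:
                    self.metadata['uniques'].append(name)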
            setattr(self, name, value)
                    
    def __repr__(self):
        return f'<{self.__class__.__name__} {self.__dict__}>'
    
    
    def dict(self):
        return clean_dict(self.__dict__)
    
    def json(self):
        return json.dumps(self.dict(), cls=PythonicJSONEncoder, ensure_ascii=False, indent=4)

In the provided code:

  • DataMetaClass is a custom metaclass that ensures that every class attribute whose value is a DataFieldModel also has a type annotation. It also applies the dataclass decorator to any class it creates that is not already a dataclass.
  • DataFieldDescriptor and DataFieldModel implement the descriptor protocol and perform additional checks: whether the value has the correct type, and whether a required field was given None. If the value is None and a default value or a default factory is provided, that default is used to produce the value.
  • DataClass is a base class for all classes that want to use the described system. It uses DataMetaClass as its metaclass, meaning that DataMetaClass's logic will be applied to all subclasses of DataClass.
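
The interplay of the three pieces can be easier to follow in a condensed form. The sketch below is not the article's code: Field, ModelMeta, Model, and User are simplified, hypothetical stand-ins for DataFieldModel, DataMetaClass, DataClass, and a user-defined subclass, and the sketch assumes annotations are plain classes such as str or int. Unlike DataMetaClass, it checks annotations after the class object has been built.

```python
from typing import Any


class Field:
    """Hypothetical stand-in for DataFieldModel: type-checked, with a default."""

    def __init__(self, default=None):
        self.default = default

    def __set_name__(self, owner, name):
        self.name = name
        # Look up the annotation declared on the owning class.
        self.type = getattr(owner, "__annotations__", {}).get(name, Any)

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        if value is None:
            value = self.default             # fall back to the default
        if self.type is not Any and not isinstance(value, self.type):
            raise TypeError(
                f"{self.name} must be {self.type.__name__}, "
                f"got {type(value).__name__}"
            )
        instance.__dict__[self.name] = value


class ModelMeta(type):
    """Hypothetical stand-in for DataMetaClass: every Field needs an annotation."""

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        annotations = getattr(cls, "__annotations__", {})
        for attr_name, attr_value in namespace.items():
            if isinstance(attr_value, Field) and attr_name not in annotations:
                raise TypeError(f'Attribute "{attr_name}" must have a type annotation')
        return cls


class Model(metaclass=ModelMeta):
    def __init__(self, **kwargs):
        # Each setattr call is routed through Field.__set__.
        for name in type(self).__annotations__:
            setattr(self, name, kwargs.get(name))


class User(Model):
    name: str = Field()
    age: int = Field(default=0)


user = User(name="Ada")                      # age falls back to its default

try:
    User(name=42)
except TypeError as exc:
    type_error = str(exc)                    # raised by the descriptor

try:
    class Broken(Model):
        nickname = Field()                   # no annotation
except TypeError as exc:
    annotation_error = str(exc)              # raised by the metaclass
```

The two errors surface at different times: the descriptor rejects a bad value when an instance is built, while the metaclass rejects a badly declared class as soon as the class statement runs.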

This code demonstrates the power and flexibility that metaclasses and descriptors can provide. However, it also illustrates the complexity they can introduce.

Conclusion

While metaclasses and descriptors are incredibly powerful, they should be used sparingly and with caution. In many cases, simpler design patterns and features can accomplish the same goals more cleanly and understandably.

When used judiciously, however, these features can offer high levels of abstraction and customization that can truly unleash the power of your Python code.


By Oscar Martin Bahamonde Muñoz
