ðŠķ Generate from Apache Avro Schema¶
Generate Python models from JSON-encoded Apache Avro schemas (.avsc).
ð Quick Start¶
datamodel-codegen \
--input user.avsc \
--input-file-type avro \
--output-model-type pydantic_v2.BaseModel \
--output model.py
ð Example¶
user.avsc
{
"type": "record",
"name": "User",
"namespace": "example.avro",
"doc": "A user record.",
"fields": [
{
"name": "id",
"type": {
"type": "string",
"logicalType": "uuid"
}
},
{
"name": "name",
"type": "string",
"doc": "Display name"
},
{
"name": "email",
"type": ["null", "string"],
"default": null
},
{
"name": "roles",
"type": {
"type": "array",
"items": {
"type": "enum",
"name": "Role",
"symbols": ["admin", "member"]
}
},
"default": []
}
]
}
âĻ Generated model.py
from __future__ import annotations
from enum import Enum
from typing import Optional
from uuid import UUID
from pydantic import BaseModel, Field
class Role(Enum):
admin = 'admin'
member = 'member'
class User(BaseModel):
id: UUID
name: str = Field(..., description='Display name')
email: Optional[str] = None
roles: list[Role] = []
Supported Avro Schema Features¶
The Avro parser supports the schema constructs needed to generate Python models:
| Avro construct | Model generation behavior |
|---|---|
record |
Generates a Python model class with fields |
enum |
Generates an enum class |
array |
Generates a list field or root list model |
map |
Generates a dict field or root dict model |
fixed |
Generates a fixed-length binary string model |
union |
Generates union types; nullable unions such as ["null", "string"] become optional fields |
| Named types | Resolves name, namespace, fullnames, nested named types, and references |
| Field metadata | Preserves doc, default, aliases, and order where applicable |
| Logical types | Maps Avro logical types to Python-friendly formats where possible |
Supported primitive types are null, boolean, int, long, float, double, bytes, and string.
Logical Types¶
Avro logical types are mapped through the generator's normal type handling:
| Avro logical type | Typical generated Python type |
|---|---|
decimal, big-decimal |
Decimal |
uuid |
UUID |
date |
date |
time-millis, time-micros |
time |
timestamp-millis, timestamp-micros, timestamp-nanos |
datetime |
local-timestamp-millis, local-timestamp-micros, local-timestamp-nanos |
local date-time format |
duration |
timedelta |
Avro-specific metadata is preserved in generated JSON Schema extensions such as x-avro-fullname, x-avro-namespace, x-avro-aliases, and x-avro-logicalType before model generation.
Schema Version¶
Apache Avro schemas do not include an in-schema version marker equivalent to JSON Schema's $schema, OpenAPI's openapi, or XML Schema versioning attributes. For that reason, --schema-version does not select an Avro specification version. The Avro parser follows the currently implemented Apache Avro schema rules and logical types.
Limitations¶
datamodel-code-generator uses Avro schemas to generate Python model definitions. It does not implement Avro runtime validation, binary encoding, decoding, or serialization.
ð See Also¶
- ðĨïļ CLI Reference - Complete CLI options reference
- ð Supported Data Types - Data type support details
- ð Generate from JSON Schema - JSON Schema input documentation