Commit 40cec9e

feat: implement proper Schema.org inheritance and JSON-LD compatibility

- Add SchemaOrgBase class with JSON-LD fields (@id, @type, @context)
- Implement true Python class inheritance that follows Schema.org hierarchy
- Use consistent URL type from utils.py instead of repeating pattern
- Add comprehensive tests for inheritance and JSON-LD serialization
- Update documentation with examples of inheritance and JSON-LD usage
1 parent 38595f1 commit 40cec9e

14 files changed

Lines changed: 1484 additions & 1026 deletions
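
The commit message says the generated classes now follow the Schema.org hierarchy through real Python inheritance. The generator code is not part of this excerpt, so the following is only a minimal sketch of the general idea, assuming a hand-written `SUBCLASS_OF` mapping (hypothetical) in place of the `rdfs:subClassOf` data read from the vocabulary, and plain classes in place of `msgspec.Struct` definitions. The point it illustrates is ordering: a parent class must exist before any child that inherits from it.

```python
from graphlib import TopologicalSorter

# Hypothetical, hand-written excerpt of the hierarchy (child -> parents).
# A real generator would read rdfs:subClassOf from the downloaded vocabulary.
SUBCLASS_OF = {
    "Thing": [],
    "CreativeWork": ["Thing"],
    "Book": ["CreativeWork"],
    "Person": ["Thing"],
}


def build_classes(subclass_of):
    """Create one plain Python class per type, parents before children."""
    classes = {}
    # static_order() yields a node only after all of its predecessors
    # (here: its parents), so every base class already exists when needed.
    for name in TopologicalSorter(subclass_of).static_order():
        bases = tuple(classes[parent] for parent in subclass_of[name]) or (object,)
        classes[name] = type(name, bases, {})
    return classes


classes = build_classes(SUBCLASS_OF)
print([cls.__name__ for cls in classes["Book"].__mro__])
# ['Book', 'CreativeWork', 'Thing', 'object']
```

Schema.org types can have more than one parent, which this sketch does not exercise; how the actual generator handles that case is not shown in this commit excerpt.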

README.md

Lines changed: 64 additions & 11 deletions
@@ -27,17 +27,18 @@ While AI assisted in development, all code was reviewed and tested.
 ## Features
 
 * **Schema Acquisition:** Downloads the latest Schema.org vocabulary (JSON-LD).
-* **Type Mapping:** Maps Schema.org types (Text, Number, Date, URL, etc.) to Python types (`str`, `int | float`, `datetime.date`, `Annotated[str, Meta(pattern=...)]`, `bool`).
+* **Type Mapping:** Maps Schema.org types (Text, Number, Date, URL, etc.) to Python types (`str`, `int | float`, `datetime.date`, `URL`, `bool`).
 * **Code Generation:** Creates `msgspec.Struct` definitions from Schema.org types, including type hints and docstrings.
-* **Inheritance Handling:** Resolves the class hierarchy (`rdfs:subClassOf`) and includes parent properties.
+* **Proper Inheritance:** Preserves the Schema.org class hierarchy using Python inheritance (`Book` inherits from `CreativeWork`, which inherits from `Thing`).
+* **JSON-LD Compatibility:** All models support JSON-LD fields (`@id`, `@type`, `@context`) that serialize correctly.
 * **Category Organization:** Organizes generated classes into subdirectories (CreativeWork, Person, etc.).
 * **Circular Dependency Resolution:** Uses forward references (`"TypeName"`) and `TYPE_CHECKING` imports.
 * **Python Compatibility:** Handles reserved keywords.
 * **Convenient Imports:** All generated classes are importable from `msgspec_schemaorg.models`.
 * **ISO8601 Date Handling:** Utility function `parse_iso8601` for date/datetime strings.
 * **Type Specificity:** Sorts type unions to prioritize more specific types (e.g., `Integer` before `Number`).
-* **URL Validation:** Validates URL fields using `msgspec` pattern matching.
-* **Comprehensive Testing:** Includes tests for model generation, validation, and usage.
+* **URL Validation:** Validates URL fields using a centralized `URL` type with pattern validation.
+* **Comprehensive Testing:** Includes tests for model generation, validation, inheritance, and usage.
 
 ## Installation
 
@@ -70,13 +71,16 @@ address = PostalAddress(
 person = Person(
     name="Jane Doe",
     jobTitle="Software Engineer",
-    address=address
+    address=address,
+    # JSON-LD fields
+    id="https://example.com/people/jane",
+    context="https://schema.org"
 )
 
 # Encode to JSON
 json_bytes = msgspec.json.encode(person)
 print(json_bytes.decode())
-# Output: {"name":"Jane Doe","jobTitle":"Software Engineer","address":{"streetAddress":"123 Main St","addressLocality":"Anytown","postalCode":"12345","addressCountry":"US"}}
+# Output: {"name":"Jane Doe","jobTitle":"Software Engineer","address":{"streetAddress":"123 Main St","addressLocality":"Anytown","postalCode":"12345","addressCountry":"US"},"@id":"https://example.com/people/jane","@context":"https://schema.org","@type":"Person"}
 ```
 
 ## Usage
@@ -110,10 +114,56 @@ blog_post = BlogPosting(
     author=Person(name="Jane Author"),
     publisher=Organization(name="TechMedia Inc."),
     image=ImageObject(url="https://example.com/images/header.jpg"),
-    datePublished="2023-09-15" # ISO8601 date string
+    datePublished="2023-09-15", # ISO8601 date string
+    # JSON-LD fields
+    id="https://example.com/blog/schema-org-python",
+    context="https://schema.org"
 )
 ```
 
+### Inheritance Structure
+
+All Schema.org models preserve the original class hierarchy:
+
+```python
+from msgspec_schemaorg.models import Thing, CreativeWork, Book
+
+# All Schema.org types inherit ultimately from Thing
+isinstance(Book(), Thing) # True
+isinstance(Book(), CreativeWork) # True
+
+# Properties are inherited
+book = Book(name="The Great Gatsby")
+print(book.name) # Inherited from Thing
+```
+
+### JSON-LD Compatibility
+
+All models have JSON-LD fields for linked data integration:
+
+```python
+from msgspec_schemaorg.models import Product
+import msgspec
+import json
+
+# Create a product with JSON-LD fields
+product = Product(
+    name="Smartphone",
+    id="https://example.com/products/123", # Maps to @id
+    context="https://schema.org", # Maps to @context
+    type="Product" # Maps to @type (usually has default value)
+)
+
+# Encode to JSON
+json_bytes = msgspec.json.encode(product)
+data = json.loads(json_bytes)
+
+# JSON-LD fields are properly serialized with @ prefix
+print(data["@id"]) # https://example.com/products/123
+print(data["@context"]) # https://schema.org
+print(data["@type"]) # Product
+```
+
 ### Handling Dates
 
 Use the `parse_iso8601` utility for date strings:
@@ -131,7 +181,7 @@ print(post.datePublished.year) # 2023
 
 ### URL Validation
 
-URL fields are automatically validated using a regex pattern via `msgspec`.
+URL fields are automatically validated using a centralized URL type:
 
 ```python
 import msgspec
@@ -175,16 +225,19 @@ Or run specific test groups:
 python run_tests.py unittest
 python run_tests.py examples
 python run_tests.py imports
+python run_tests.py inheritance # Test the inheritance structure
 ```
 
-The tests cover model generation, imports, date parsing, URL validation, and example script execution.
+The tests cover model generation, imports, date parsing, URL validation, inheritance, and example script execution.
 
 ## Type System
 
-* **Primitives:** Schema.org types like `Text`, `Number`, `Date`, `URL` are mapped to Python types (`str`, `int | float`, `datetime.date`, `Annotated[str, Meta(pattern=...)]`).
+* **Primitives:** Schema.org types like `Text`, `Number`, `Date`, `URL` are mapped to Python types (`str`, `int | float`, `datetime.date`, `URL`, `bool`).
 * **Specificity:** Type unions are sorted (e.g., `Integer` before `Number`).
 * **Literals:** `Boolean` constants use `Literal[True]` / `Literal[False]`.
-* **URLs:** Validated using `typing.Annotated` and `msgspec.Meta(pattern=...)`.
+* **URLs:** Validated using a consistent `URL` type with pattern validation.
+* **Inheritance:** Schema.org hierarchy is preserved through Python class inheritance.
+* **JSON-LD:** All models support standard JSON-LD fields (`@id`, `@type`, `@context`).
 
 ## Limitations
 
msgspec_schemaorg/base.py

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+"""
+Base classes for Schema.org models with JSON-LD compatibility.
+"""
+from __future__ import annotations
+from typing import Any, Dict, List, Optional, Union
+import msgspec
+from msgspec import field
+
+
+class SchemaOrgBase(msgspec.Struct, frozen=True):
+    """
+    Base class for all Schema.org models with JSON-LD fields.
+
+    This class provides the standard JSON-LD fields (@id, @type, @context, etc.)
+    that are used to represent linked data. All Schema.org model classes
+    inherit from this base.
+
+    JSON-LD fields are aliased using msgspec's field renaming to ensure
+    that the serialized output uses the @ prefix.
+    """
+    id: Optional[str] = field(default=None, name="@id")
+    context: Optional[Union[str, Dict[str, Any]]] = field(default=None, name="@context")
+    # Note: type field is intentionally omitted since it will be provided by each class
+    graph: Optional[List[Dict[str, Any]]] = field(default=None, name="@graph")
+    reverse: Optional[Dict[str, Any]] = field(default=None, name="@reverse")
