feat: schema based XML #1047
- Add XML traits (XmlName, XmlFlattened, XmlAttribute, XmlNamespace) to Smithy module
- Create SmithyRestXML module with Serializer, Deserializer, Codec, HTTPClientProtocol, Plugin, BaseError
- Add SmithyRestXMLTypes.kt and RestXMLPlugin.kt for Kotlin codegen
- Update SerdeUtils to include RestXML in useSchemaBased()
- Update RestXmlCustomizations with renderClientProtocol and plugins
- Update RestXmlProtocolGenerator to remove schema-based middlewares
- Add renderClientProtocol and plugins to mock customizations
- Remove OperationInputBodyMiddleware and DeserializeMiddleware in mock
- Fixes 4 pre-existing test failures
- Add containerType: .structure to BaseError member schemas for correct memberName resolution
- Fix xmlElementName fallback to check id.member before id.name
- Remove unused XmlNameTrait import from HTTPClientProtocol
Instead of building XML via string concatenation, the Serializer now delegates to the existing SmithyXML.Writer, which uses libxml2 for correct XML generation (encoding, entities, namespaces, etc).
Three internal serializer types handle different contexts:
- Serializer: top-level entry point, creates root Writer
- MemberSerializer: writes struct members as child elements
- ValueSerializer: writes values into list/map element nodes
- Delete 18 XML serde test files in protocolspecificserde/xml/ that tested +Write.swift/+Read.swift codegen output no longer generated by schema-based codegen
- Change pagination.smithy, pagination-truncation.smithy, waiters.smithy, and waiters-none.smithy from @restXml to @restJson1 since those tests don't test protocol-specific behavior
Inline single-expression function bodies onto signature line to satisfy standard:function-signature rule.
- RestXmlCustomizations.kt:18 (renderClientProtocol)
- MockHTTPRestXMLProtocolGenerator.kt:27 (renderClientProtocol)
- SerdeUtils.kt:11 (useSchemaBased delegating overload)
Resolve SerdeUtils.kt conflict by unioning the trait-check lists: schema-based applies to Rpcv2Cbor, AwsJson1_0, AwsJson1_1, and RestXml. Keep the epic/sbs two-overload signature (settings+model / private service).
RestXML is a REST-style protocol that uses HTTP URI/header/query bindings, unlike AwsJson and RpcV2Cbor. The generated client's OperationInputUrlPathMiddleware, OperationInputHeadersMiddleware, and OperationInputQueryItemMiddleware reference Input.urlPathProvider(_:), headerProvider(_:), and queryItemProvider(_:), so those static funcs must still be emitted on Input extensions even when serde is schema-based. Extend the guard to keep emitting these providers for RestXML.
The Deserializer was using Data("<empty/>".utf8) as a substitute for
empty response bodies, then parsing it via libxml2. For tiny payloads
Swift may use inline Data storage, creating a stack-pointer lifetime
issue with xmlBufferCreateStatic that causes xmlReadMemory to fail with
'The XML could not be parsed'.
Return an empty Reader directly for empty data instead. This resolves
~16 RestXML protocol tests where responses have no body (headers-only
outputs, EmptyInputAndEmptyOutput, NoInputAndNoOutput, etc.).
Expose Reader's default init as public under the existing
SmithyReadWrite SPI so Deserializer (in SmithyRestXML) can construct
an empty Reader.
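A minimal sketch of the empty-body shortcut described above. `SketchReader` and `makeReader(from:parse:)` are hypothetical stand-ins, not the real SmithyXML.Reader API; the `parse` closure stands in for the libxml2-backed parser.

```swift
import Foundation

// Hypothetical stand-in for SmithyXML.Reader; the real type carries more state.
final class SketchReader {
    let name: String
    let children: [SketchReader]

    init(name: String = "", children: [SketchReader] = []) {
        self.name = name
        self.children = children
    }
}

// Returns an empty reader for an empty body, and only calls `parse`
// (a placeholder for the real XML parser) when there are bytes to read.
func makeReader(from data: Data, parse: (Data) throws -> SketchReader) rethrows -> SketchReader {
    guard !data.isEmpty else { return SketchReader() }
    return try parse(data)
}
```

The point of the shape is that the parser is never handed a substitute document, so the short-buffer lifetime problem cannot arise for body-less responses.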
…eserialization
Schema-based RestXML is the first schema-based protocol that needs HTTP response bindings (headers, prefix headers, response code, httpPayload). Earlier schema-based protocols (AwsJson, CBOR) are RPC-style and never need them. The deserialize path previously passed only the body bytes through the XML Reader, silently leaving header-/status-/payload-bound output members unset (or throwing when a non-XML raw payload hit the parser).
Add the HTTP binding trait types that the Swift schema needs:
- HttpHeaderTrait (string value: header name)
- HttpPrefixHeadersTrait (string value: header name prefix)
- HttpResponseCodeTrait (marker)
- HttpPayloadTrait (marker)
Register them in AllSupportedTraits so Swift SchemasCodegen emits them into generated schemas.
Extend Deserializer to optionally carry the HTTPResponse + raw body data. In readStruct, route each member through httpBindingDeserializer, which inspects the member's schema traits and, if HTTP-bound, returns a synthetic child Deserializer sourced from the header value, status code, or raw body bytes (as applicable) instead of the XML reader. Normal body-bound members keep the existing XML element lookup path untouched.
Expose Reader.init(content:) and Reader.addChild publicly under the existing SmithyReadWrite SPI so the Deserializer can synthesize a Reader wrapping a single header value or a list of split values.
HTTPClientProtocol.deserializeResponse now constructs a binding-aware Deserializer directly (passing through the HTTPResponse + bodyData) instead of going through codec.makeDeserializer, which only accepts body data. The error path is unchanged.
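The per-member routing described above might look roughly like the sketch below. All names (`SketchBinding`, `SketchResponse`, `SketchSource`, `source(for:in:)`) are hypothetical and only illustrate the decision; the real code works on schema traits and returns a child Deserializer. Prefix headers are left out here.

```swift
import Foundation

// Hypothetical summary of a member's HTTP binding traits.
enum SketchBinding {
    case header(String)   // member has an httpHeader trait with this name
    case responseCode     // member has an httpResponseCode trait
    case payload          // member has an httpPayload trait
    case body             // no HTTP binding: read from the XML body
}

// Hypothetical, simplified HTTP response.
struct SketchResponse {
    var statusCode: Int
    var headers: [String: String]
    var body: Data
}

// Where a member's value should be read from.
enum SketchSource: Equatable {
    case text(String)     // synthetic reader wrapping a single string
    case bytes(Data)      // raw payload bytes, not parsed as XML
    case xmlBody          // regular XML element lookup
}

// Picks the data source for one member; nil means the header is absent.
func source(for binding: SketchBinding, in response: SketchResponse) -> SketchSource? {
    switch binding {
    case .header(let name):
        // Header names are matched case-insensitively.
        let match = response.headers.first { $0.key.caseInsensitiveCompare(name) == .orderedSame }
        return match.map { SketchSource.text($0.value) }
    case .responseCode:
        return .text(String(response.statusCode))
    case .payload:
        return .bytes(response.body)
    case .body:
        return .xmlBody
    }
}
```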
…P headers
readList filters reader.children by XML element name, which correctly matches XML-sourced lists but returns empty for synthetic list Readers built from split header values (children have no XML element name). Add an isHeaderList flag so readList can enumerate all children directly when the list came from an HTTP header.
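To illustrate why the name filter comes back empty for header-sourced lists, here is a small sketch. `SketchNode`, `syntheticList(fromHeaderValue:)`, and `listItems(of:elementName:isHeaderList:)` are hypothetical names, and the naive comma split is an assumption that ignores quoted values and http-date timestamps.

```swift
import Foundation

// Hypothetical stand-in for a Reader node.
struct SketchNode {
    var name: String              // empty for synthetic (header-sourced) nodes
    var content: String
    var children: [SketchNode] = []
}

// Builds a synthetic list node from a comma-separated header value.
func syntheticList(fromHeaderValue value: String) -> SketchNode {
    let parts = value.split(separator: ",").map { $0.trimmingCharacters(in: .whitespaces) }
    return SketchNode(name: "", content: "", children: parts.map { SketchNode(name: "", content: $0) })
}

// Filters children by element name for XML lists, but takes every child
// when the list was synthesized from an HTTP header.
func listItems(of node: SketchNode, elementName: String, isHeaderList: Bool) -> [String] {
    let items = isHeaderList ? node.children : node.children.filter { $0.name == elementName }
    return items.map { $0.content }
}
```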
1. HTTP header timestamps default to http-date per Smithy spec (vs .dateTime for XML body). Route through an isFromHttpHeader flag on Deserializer so readTimestamp picks the right default when no @timestampFormat trait is present.
2. HttpPrefixHeadersTrait: replace TODO with real map construction. Synthesize a Reader tree of entry/key/value nodes matching what the existing readMap logic already consumes. Header names that match the prefix (case-insensitive) yield map entries with the stripped suffix as key and comma-joined values.
Adds a public Reader(nodeInfo:content:) initializer under the existing SmithyReadWrite SPI to support synthesizing named Reader nodes.
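Both changes above can be sketched in isolation. The names below (`SketchTimestampFormat`, `defaultTimestampFormat`, `prefixHeaderMap`) are hypothetical, the result is a plain dictionary rather than a synthesized Reader tree, and joining with a bare "," is an assumption about the separator.

```swift
import Foundation

// Hypothetical mirror of the timestamp formats involved.
enum SketchTimestampFormat { case dateTime, httpDate, epochSeconds }

// An explicit @timestampFormat wins; otherwise headers default to
// http-date and XML body members default to date-time.
func defaultTimestampFormat(explicit: SketchTimestampFormat?, isFromHttpHeader: Bool) -> SketchTimestampFormat {
    return explicit ?? (isFromHttpHeader ? .httpDate : .dateTime)
}

// Collects headers whose names start with `prefix` (case-insensitive).
// The key is the header name with the prefix stripped, keeping its original
// casing; multiple values are joined with commas.
// Assumes ASCII header names, so lowercasing does not change their length.
func prefixHeaderMap(headers: [(name: String, values: [String])], prefix: String) -> [String: String] {
    var result: [String: String] = [:]
    let lowerPrefix = prefix.lowercased()
    for header in headers {
        guard header.name.lowercased().hasPrefix(lowerPrefix) else { continue }
        let key = String(header.name.dropFirst(prefix.count))
        result[key] = header.values.joined(separator: ",")
    }
    return result
}
```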
…erializer/deserializer
…apes for correct XML root element
… empty list/map serialization
… HTTPClientProtocol
…d improve deserialization detection
Two fixes:
1. Serialization: The Serializer's PayloadMemberSerializer did not implement writeDataStream(), so streaming blob payloads (e.g. S3 PutObject body) were silently dropped; the request body was empty. Added writeDataStream() to capture the ByteStream, and updated serializeRequest() to use it as the request body directly instead of the serialized XML data.
2. Deserialization: The hasStreamingPayload check now also checks member.hasTrait(StreamingTrait.self) in addition to member.target?.hasTrait(StreamingTrait.self), since SchemasCodegen merges target traits into member schemas.
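A minimal sketch of the widened streaming check from fix 2, using a hypothetical `SketchSchema` class with string trait names in place of the real Schema and StreamingTrait types.

```swift
// Hypothetical stand-in for a schema node; traits are plain strings here.
final class SketchSchema {
    let traits: Set<String>
    let target: SketchSchema?
    let members: [SketchSchema]

    init(traits: Set<String> = [], target: SketchSchema? = nil, members: [SketchSchema] = []) {
        self.traits = traits
        self.target = target
        self.members = members
    }

    func hasTrait(_ name: String) -> Bool { traits.contains(name) }
}

// True when any member carries the streaming trait, either directly
// (traits merged onto the member) or via its target shape.
func hasStreamingPayload(_ schema: SketchSchema) -> Bool {
    return schema.members.contains { member in
        member.hasTrait("streaming") || member.target?.hasTrait("streaming") == true
    }
}
```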
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
// Use a deserializer for the protocol in use, by making it from the codec.
let payloadDeserializer = try codec.makeDeserializer(data: message.payload)
if let payloadMember = schema.members.first(where: { $0.hasTrait(EventPayloadTrait.self) }) {
    try T.readConsumer(payloadMember, &value, payloadDeserializer)
Per the spec for event payload, it can be a blob, string, structure, or union.
https://smithy.io/2.0/spec/streaming.html#eventpayload-trait
We should probably just refactor this to do all 4.
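One possible shape for handling all four, as a rough sketch only. `SketchShapeType`, `SketchPayloadValue`, and `eventPayloadValue(type:payload:)` are hypothetical names; the `.structured` case just marks bytes that would still be handed to the codec's deserializer.

```swift
import Foundation

// Hypothetical subset of shape types an event payload member can target.
enum SketchShapeType { case blob, string, structure, union }

// Hypothetical result: how the raw message payload should be interpreted.
enum SketchPayloadValue: Equatable {
    case blob(Data)         // raw bytes, used as-is
    case string(String)     // payload decoded as UTF-8 text
    case structured(Data)   // bytes to pass to the protocol's deserializer
}

// Chooses the interpretation from the payload member's target shape type.
// Returns nil when a string payload is not valid UTF-8.
func eventPayloadValue(type: SketchShapeType, payload: Data) -> SketchPayloadValue? {
    switch type {
    case .blob:
        return .blob(payload)
    case .string:
        return String(data: payload, encoding: .utf8).map(SketchPayloadValue.string)
    case .structure, .union:
        return .structured(payload)
    }
}
```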
@_spi(SmithyTimestamps) import enum SmithyTimestamps.TimestampFormat

func resolveTimestampFormat(_ schema: Schema) -> TimestampFormat {
    guard let traitFormat = try? schema.getTrait(TimestampFormatTrait.self)?.format else {
What does try? buy us here? It seems we should throw an error if this fails; it would be highly unusual for this to throw.
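To make the difference concrete, a small sketch with hypothetical names (`lookupFormat` stands in for the trait lookup and formats are plain strings): with try? a failed lookup is indistinguishable from a missing trait, while try surfaces it to the caller.

```swift
struct SketchTraitError: Error {}

// Hypothetical stand-in for schema.getTrait(TimestampFormatTrait.self)?.format.
// Returns nil when no trait is present and throws when the trait is malformed.
func lookupFormat(declared: String?, malformed: Bool) throws -> String? {
    if malformed { throw SketchTraitError() }
    return declared
}

// With try?, a malformed trait silently falls back to the default.
func resolveWithTryOptional(declared: String?, malformed: Bool) -> String {
    guard let format = try? lookupFormat(declared: declared, malformed: malformed) else {
        return "dateTime"
    }
    return format
}

// With try, only a missing trait falls back; a malformed one is rethrown.
func resolveWithTry(declared: String?, malformed: Bool) throws -> String {
    guard let format = try lookupFormat(declared: declared, malformed: malformed) else {
        return "dateTime"
    }
    return format
}
```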
@_spi(SmithyReadWrite) import class SmithyXML.Reader

@_spi(SchemaBasedSerde)
public struct Deserializer: ShapeDeserializer {
This type is created in the RestXML module.
We'll have AwsQuery/Ec2Query as well, which also use XML in responses. Can this be generalized for other formats?
if case .string(let prefix) = dict["prefix"] {
    self.prefix = prefix
} else {
    self.prefix = nil
}
A simpler way:
- if case .string(let prefix) = dict["prefix"] {
-     self.prefix = prefix
- } else {
-     self.prefix = nil
- }
+ self.prefix = dict["prefix"]?.string
} else {
    writer.openBlock("dependencies: [", "]") {
        dependenciesByTarget.forEach { writeTargetDependency(writer, it) }
    }
Probably okay to write the plugin to the manifest for all protocols, including not yet schema-based. Non-schema-based protocols don't get a swift-settings.json so the plugin just exits early.
  }
  val httpBindingResolver = getProtocolHttpBindingResolver(ctx, defaultContentType)
- if (!usesSchemaBased) {
+ if (!usesSchemaBased || ctx.service.hasTrait<RestXmlTrait>()) {
Is this condition needed? usesSchemaBased should be true for RestXML anyway
Issue #
Description of changes
Scope
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.