Skip to content

feat: add custom JSON serializer support to HTTPClient#1590

Open
Gardner-Programs wants to merge 2 commits into
burnash:masterfrom
Gardner-Programs:feat/custom-json-serializer
Open

feat: add custom JSON serializer support to HTTPClient#1590
Gardner-Programs wants to merge 2 commits into
burnash:masterfrom
Gardner-Programs:feat/custom-json-serializer

Conversation

@Gardner-Programs

@Gardner-Programs Gardner-Programs commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

Summary

Closes #1556.

Adds the ability to configure a custom JSON serializer for request bodies, so users can write values that the standard library json module cannot encode by default (e.g. datetime, Decimal) without pre-stringifying their data.

import json
import datetime

gc = gspread.service_account()
gc.set_serializer(lambda body: json.dumps(body, default=lambda o: o.isoformat()))

ws.update([[datetime.date(2026, 6, 19)]], "A1")  # now works

Pass None to restore the default behavior.

Design

The issue suggested adding a serializer argument at the Worksheet.update / batch_update level. I went with configuring it once on the client instead, for a few reasons:

  • It's a single choke point in HTTPClient.request() rather than threading a parameter through several method signatures, so it's smaller and less likely to break existing write paths.
  • It applies uniformly to every write (update, batch_update, etc.).
  • It mirrors the existing set_timeout pattern (setter on both HTTPClient and Client), so it fits the codebase's conventions.

If you'd prefer a per-call argument, that can be added on top of this foundation later. Happy to adjust the API in review.

Implementation

  • SerializerType alias + optional self.serializer (defaults to None).
  • HTTPClient.set_serializer() and a Client.set_serializer() forwarder.
  • In HTTPClient.request(), when a serializer is set and there is a JSON body, the body is serialized and sent as data= with a Content-Type: application/json header instead of via json=. Without a serializer, behavior is unchanged.

Backward compatibility

Fully backward compatible. serializer defaults to None, and the existing json= code path is untouched when no serializer is set.

Tests

New tests/http_client_test.py (mocked session, no cassette needed since this is request-layer logic):

  • test_default_uses_json_kwarg: no serializer, so the body still goes out via json= (proves unchanged behavior).
  • test_custom_serializer_uses_data_and_header: serializer set, so the body is serialized into data=, json is nulled, and the JSON content-type header is added.
  • test_custom_serializer_handles_non_native_types: a datetime.date (which stdlib json.dumps rejects) is written successfully.

Also manually verified end-to-end against the live Sheets API: without a serializer a date write raises TypeError, and with one the value lands correctly in the cell.

Docs

Added a short "Using a Custom JSON Serializer" section to the user guide.

Notes

  • I left off a .. versionadded:: directive since I don't know the target release. Happy to add 6.3.0 (or whatever you prefer) on request.

Allow configuring a custom JSON serializer via Client.set_serializer /
HTTPClient.set_serializer, applied to all request bodies. Useful for
encoding types the stdlib json module can't handle by default (e.g.
datetime, Decimal). Defaults to None, preserving existing behavior.

Closes burnash#1556
Comment thread gspread/http_client.py Outdated
headers: Optional[MutableMapping[str, str]] = None,
) -> Response:
if self.serializer is not None and json is not None:
data = self.serializer(dict(json))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is the dict needed here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just mirroring the existing json=dict(json) if json else None line below it. Since json is typed as Mapping[str, Any] it isn't guaranteed to be a concrete dict, so this makes sure the serializer gets one it can encode.

@tlienart

Copy link
Copy Markdown

cool, that would definitely solve the problem I had and I think your point about putting this at the higher level makes total sense 👍

@antoineeripret

Copy link
Copy Markdown
Collaborator

Hey @Gardner-Programs & @tlienart,

Can't we implement something simpler to fix the issue? I'm not a fan of ...

gc.set_serializer(lambda body: json.dumps(body, default=lambda o: o.isoformat()))

... because it's a global state.

Wouldn't it make more sense to let users serialize their data before running the update call? We'd have to implement the following:

  1. Add a serializer parameter. When provided, convert json → data before passing to requests (http_client.py // request())
  2. Accept and forward serializer (http_client.py // values_batch_update())
  3. Add the parameter and pass it through (worksheet.py // Worksheet.update() + Worksheet.batch_update())

User would be able to run

ws.update(
    [[datetime.date(2026, 6, 23)]],
    "A1",
    serializer=lambda b: json.dumps(b, default=lambda o: o.isoformat()),
)

Overall, we'd achieve the same but we'd remove layers of complexity.

What do you think?

@Gardner-Programs

Copy link
Copy Markdown
Contributor Author

Hey @antoineeripret, thanks for taking a look 🙏

I actually modeled set_serializer() on the existing set_timeout(). It's stored per-client and applied down in request(), so it follows a pattern the library already uses rather than introducing a new one. It's also a no-op until you opt in, since serializer defaults to None and request() only does anything custom when it's actually set. So nothing changes for anyone who doesn't touch it.

On why I went this way: in my own usage my serialization needs are per-spreadsheet, not per-call. A given script talks to one spreadsheet and every write needs the same handling (e.g. datetime to ISO), so I set it once at the top and I'm done. Having to re-pass serializer=... on every update/batch_update would be pretty painful from where I sit. That said, I'm aware my usage shouldn't dictate everyone's, so I do get the appeal of explicit per-call control.

The other thing that pushed me toward the setter is coverage. It hooks in at request(), which is the single point all the JSON-body calls flow through, so one setting covers update, batch_update, format, append_rows, permissions, and the rest.

That said, here's a possible compromise. What if we support both instead of picking one? Keep set_serializer() as a client-wide default, and also add an optional serializer= to the write methods that overrides it per call. Precedence would be per-call arg, then client default, then the built-in. So you'd get exactly the explicit, local API you're after:

ws.update(
    [[datetime.date(2026, 6, 23)]],
    "A1",
    serializer=lambda b: json.dumps(b, default=lambda o: o.isoformat()),
)

while folks with uniform per-spreadsheet needs can still set it once. It stays fairly small because request() already centralizes the logic, so it just takes the override and falls back to the client default.

I do want to be honest about the trade-offs though:

  • It's a bit more complexity. Two ways to do the same thing means more API surface to document, plus a precedence rule people have to understand.
  • Coverage would be partial. Realistically I'd thread serializer= through the data-writing methods (update/batch_update), so per-call control would exist for those but not for every method that sends a body. We could thread it through all of them for full parity, it's just more plumbing and more signatures to maintain. Either way the client-wide setter stays as the catch-all that covers everything.

Happy to go whichever direction you think is best, whether that's the setter as-is, the per-call param on its own, or the combined approach. Let me know what you'd prefer and I'll adjust.

@antoineeripret

Copy link
Copy Markdown
Collaborator

Hi @Gardner-Programs,

That's a very good point you raise, I actually have other scripts with other libraries where I have to do just that, and it's a pain 😢. That being said, I wouldn't go with the double-implementation because I'm unsure it's a good idea for a library that is currently short of maintainers.

Let me suggest a different approach as a compromise, w/o adding too much complexity.

  • We'd implement the per-call solution
  • To achieve a per-spreadsheet behavior, you could use functool as follows:
import json, functools
import gspread

def _iso_serializer(body):
    return json.dumps(body, default=lambda o: o.isoformat())

#we define the per_spreadsheet here 
def open_sheet(name):
    gc = gspread.oauth()
    ws = gc.open(name).sheet1
    ws.update = functools.partial(ws.update, serializer=_iso_serializer)
    ws.batch_update = functools.partial(ws.batch_update, serializer=_iso_serializer)
    return ws

ws = open_sheet("Report A")
ws.update(...)  # serializer already baked in

What do you think? Please do not hesitate to let me know if you think that your approach is better: I don't want to push mine, I just want to make sure we implement the right solution given our objectives and the library's context.

Thank you !

@Gardner-Programs

Copy link
Copy Markdown
Contributor Author

Hey @antoineeripret, the setter is what worked for my personal usage, but I agree it's not the best call for the repo. Let's go per-call and drop it.

The global default fit the majority of the cases I've personally hit, so it felt like the natural choice at the time. But I've never worked with a team that shared a single client across multiple scripts or functions, and once you do, I can see how a mutable setter would cause real problems. That's a context I just wasn't designing for. The functools.partial recipe gets me the same set-once feel on my side without any of that risk, so that works great as the alternative.

One small thing I wanted to run by you before I rework it. Instead of a full serializer that the user owns:

ws.update([[date(2026, 6, 23)]], "A1",
          serializer=lambda b: json.dumps(b, default=lambda o: o.isoformat()))

what if we expose just the default= hook from json.dumps and let gspread keep ownership of the encoding?

ws.update([[date(2026, 6, 23)]], "A1", json_default=str)  # str covers datetime, Decimal, UUID...

Same result for the datetime/Decimal case, but since the library still owns json.dumps, the user can't accidentally hand back invalid JSON or break the Content-Type. Smaller surface, fewer ways to get it wrong, which felt in line with your point about keeping things light for the maintainers.

I'm happy either way, so just let me know which you'd prefer. One thing I want to flag with the per-call approach: unlike the setter, it only applies to the methods we thread it through, so any future data-writing method would need the same param to support it. Nothing blocking, just worth deciding up front. Do you want me to thread it through update/batch_update only, or append_row(s) too, and should we treat "new write methods accept json_default" as a loose convention going forward?

@antoineeripret

antoineeripret commented Jun 25, 2026

Copy link
Copy Markdown
Collaborator

Hey @Gardner-Programs,

Great idea, really love it !

I'd maybe use a different name for the parameter though: the name makes sense if you already know the json library, less so in gspread's context. serializer or value_serializer_default could be more self-describing IMO.

Regarding your questions:

1/ Which methods to thread it through: I'd add append_row(s) since they write cell data just like update. Non-data methods don't need it though.
2/ Convention going forward:: yes, establishing it as a soft convention ("write methods accept json_default") is the right call IMO.

Thank you !

@Gardner-Programs

Copy link
Copy Markdown
Contributor Author

Perfect, let's lock it in.

On the name, I went with default_serializer. value_serializer_default was my other thought but it felt a touch long, and I wanted to steer clear of plain serializer since that's what we'd been calling the full-body version earlier in the thread, didn't want to overload it. Happy to switch to the more explicit name if you'd rather, though.

For scope I'll thread it through update, batch_update, and append_row(s), and leave the non-data methods alone. I'll keep "write methods accept default_serializer" as the soft convention going forward.

I'll get the rework up shortly. Thanks for the back and forth on this!

Replace the client-wide set_serializer() with a per-call default_serializer
argument on update, batch_update, append_row and append_rows. It mirrors the
default= hook of json.dumps, so gspread keeps ownership of encoding while users
handle types the stdlib cannot serialize on its own (e.g. datetime, Decimal).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Using a custom json serialiser in worksheet.update

3 participants