feat(flags): add a flag_fallback_cache that tracks feature flag evaluation results and uses them as fallback values whenever the /flags API isn't available#275
Conversation
There was a problem hiding this comment.
PR Summary
Added native feature flag caching mechanism to provide fallback values when the /flags API is unavailable, improving reliability and performance.
- Implemented
FlagCacheinposthog/utils.pywith versioning, TTL support, and LRU eviction strategy - Added client configuration options in
posthog/client.pyfor cache control (enable_flag_cache,flag_cache_size,flag_cache_ttl) - Added automatic cache invalidation when flag definitions change through version tracking
- Implemented stale cache access as fallback when remote evaluation fails
- Configured cache clearing on quota limit hits for efficient resource management
4 files reviewed, 1 comment
Edit PR Review Bot Settings | Greptile
/flags API isn't available/flags API isn't available
/flags API isn't available/flags API isn't available
/flags API isn't availableflag_fallback_cache that tracks feature flag evaluation results and uses them as fallback values whenever the /flags API isn't available
| With Redis fallback cache for high-scale applications: | ||
| >>> client = Client( | ||
| ... "your-api-key", | ||
| ... enable_flag_fallback_cache=True, | ||
| ... flag_fallback_cache_backend="redis", | ||
| ... flag_fallback_cache_redis_url="redis://localhost:6379/0" | ||
| ... ) |
There was a problem hiding this comment.
I'd propose a slightly different API, similar to how we do configs on posthog-js
# In-memory
client = Client("api-key", flag_fallback_cache=True)
# Redis URL
client = Client("api-key", flag_fallback_cache=FlagFallbackCache(backend="redis", url="redis://localhost:6379/0")
# Existing redis
client = Client("api-key", flag_fallback_cache=FlagFallbackCache(backend="redis", client=redis_client)This avoids us flooding the main args list and offloads all of the config-handling to a separate class. That class can also handle instantiating the cache, for example. I believe the code would look much more modular that way.
There was a problem hiding this comment.
Why put the service name in the parameters at all, it's right there in the URL schema (I would just pass fallback_cache_url='redis://...', and then list the schema's we support as we extend support to further backend systems)
There was a problem hiding this comment.
Because someone might pass a custom client we dont recognize (custom redis wrapper) and we need to know how to use that. Im in favor of specifying the service. OR we make it optional, and fail in case we cant use it (or just dont use it, log a warning)
There was a problem hiding this comment.
I think we want to specifically prevent people passing clients here, though - for reasons outlined here - letting them pass a custom client is the same as defining an interface, except now we'd need to define one for every cache service rather than an abstraction.
I also can't find places we do this custom wrapper stuff in posthog-js?
There was a problem hiding this comment.
Agree with @oliverb123 for this first implementation. The less surface area we commit to, the better. Down the road, when our caching approach is mature, we could always add an interface for custom cache providers.
| project_root=None, | ||
| privacy_mode=False, | ||
| before_send=None, | ||
| enable_flag_fallback_cache=False, |
There was a problem hiding this comment.
This param seems overkill - if the URL isn't None, it's enabled
| import json | ||
| import logging | ||
| import numbers | ||
| import pickle |
There was a problem hiding this comment.
I think we need to use json for this - keep in mind more than just python SDKs could be talking to the same cache at the same time here. Pickle has other-platform implementations, but it's python-specific enough to be very off the beaten track for lots of them
| privacy_mode=False, | ||
| before_send=None, | ||
| enable_flag_fallback_cache=False, | ||
| flag_fallback_cache_size=10000, |
There was a problem hiding this comment.
I can't tell what the unit is here, and don't think a user will be able to easily either? See note below, seems backend-specific so I'd put it in a url param.
There was a problem hiding this comment.
good callout (the unit is entries and that's clearly not obvious). Agreed about your note with URL params, makes the abstraction even cleaner. Was trying to get too cute here.
| enable_flag_fallback_cache=False, | ||
| flag_fallback_cache_size=10000, | ||
| flag_fallback_cache_ttl=300, | ||
| flag_fallback_cache_backend="memory", |
There was a problem hiding this comment.
As above - you'll need to accept URLs for this anyway, and URLs are fundamentally a serialisation format to describe an access protocol and a location. I'd put these as query parameters into the URL, where they're relevant, like:
memory://local/?ttl=300&size=1000redis://username:password@host:port/?some_param=some_value.
This has the advantage of being universal across languages, so we can expose it in a consistent manner (avoiding interface skew a bit), and letting you take vastly different arguments depending on the backend without an endlessly growing list of parameters or wrapper/enum types exposed
| """Create a new PostHog client. | ||
|
|
||
| Examples: | ||
| Basic usage: | ||
| >>> client = Client("your-api-key") | ||
|
|
||
| With memory-based feature flag fallback cache: | ||
| >>> client = Client( | ||
| ... "your-api-key", | ||
| ... flag_fallback_cache_url="memory://local/?ttl=300&size=10000" | ||
| ... ) | ||
|
|
||
| With Redis fallback cache for high-scale applications: | ||
| >>> client = Client( | ||
| ... "your-api-key", | ||
| ... flag_fallback_cache_url="redis://localhost:6379/0/?ttl=300" | ||
| ... ) | ||
|
|
||
| With Redis authentication: | ||
| >>> client = Client( | ||
| ... "your-api-key", | ||
| ... flag_fallback_cache_url="redis://username:password@localhost:6379/0/?ttl=300" | ||
| ... ) | ||
| """ |
There was a problem hiding this comment.
is this overkill to include in the docstring? makes it seem like this is the only thing worth passing to the Client, but I didn't know where else to put the docs.
There was a problem hiding this comment.
I think it's fine, for now at least. Documenting the behaviour of these args on posthog.com seems like the generally accepted approach
| except ImportError: | ||
| self.log.warning( | ||
| "[FEATURE FLAGS] Redis not available, flag caching disabled" | ||
| ) | ||
| return None | ||
| except Exception as e: | ||
| self.log.warning( | ||
| f"[FEATURE FLAGS] Redis connection failed: {e}, flag caching disabled" | ||
| ) | ||
| return None |
There was a problem hiding this comment.
originally i was falling back to "memory" if they couldn't connect to redis, but that was a bad idea because I don't want to silently create an in-memory cache if they can't initialize a redis connection. Better to just warn and move on.
There was a problem hiding this comment.
Makes sense I think - do what's asked, or not at all.
oliverb123
left a comment
There was a problem hiding this comment.
Nice, this feels like a good interface to me
| """Create a new PostHog client. | ||
|
|
||
| Examples: | ||
| Basic usage: | ||
| >>> client = Client("your-api-key") | ||
|
|
||
| With memory-based feature flag fallback cache: | ||
| >>> client = Client( | ||
| ... "your-api-key", | ||
| ... flag_fallback_cache_url="memory://local/?ttl=300&size=10000" | ||
| ... ) | ||
|
|
||
| With Redis fallback cache for high-scale applications: | ||
| >>> client = Client( | ||
| ... "your-api-key", | ||
| ... flag_fallback_cache_url="redis://localhost:6379/0/?ttl=300" | ||
| ... ) | ||
|
|
||
| With Redis authentication: | ||
| >>> client = Client( | ||
| ... "your-api-key", | ||
| ... flag_fallback_cache_url="redis://username:password@localhost:6379/0/?ttl=300" | ||
| ... ) | ||
| """ |
There was a problem hiding this comment.
I think it's fine, for now at least. Documenting the behaviour of these args on posthog.com seems like the generally accepted approach
| except ImportError: | ||
| self.log.warning( | ||
| "[FEATURE FLAGS] Redis not available, flag caching disabled" | ||
| ) | ||
| return None | ||
| except Exception as e: | ||
| self.log.warning( | ||
| f"[FEATURE FLAGS] Redis connection failed: {e}, flag caching disabled" | ||
| ) | ||
| return None |
There was a problem hiding this comment.
Makes sense I think - do what's asked, or not at all.
what it says on the tin.
Somewhere between a prototype and an actual thing that I can ship, maybe a bit closer to the latter 😉
The idea here is laid out in this thread: https://posthog.slack.com/archives/C07Q2U4BH4L/p1751050758806129; the TL;DR is that rather than encourage our customers to build around any sort of downtime of our
/flagsAPI, we should have the SDKs handle this for them. It does introduce a new change to the SDK paradigm in that the SDK will no longer be stateless w.r.t. flags – part of the initialization process will involve maintaining a cache of flag evaluations by key for a given token.