Logging Redactor is a Python library designed to redact sensitive data in logs based on regex mask_patterns or dictionary keys. It supports JSON logging formats and handles nested data at the message level, at the positional argument level and also in the extra keyword argument.
You can install Logging Redactor via pip:
pip install loggingredactor
Below is a basic example that illustrates how to redact any digits in a logger message:
import re
import logging
import loggingredactor
# Create a logger
logger = logging.getLogger()
# Add the redact filter to the logger with your custom filters
redact_mask_patterns = [re.compile(r'\d+')]
# if no `mask` is passed in, 4 asterisks will be used
logger.addFilter(loggingredactor.RedactingFilter(redact_mask_patterns, mask='xx'))
logger.warning("This is a test 123...")
# Output: This is a test xx...Python only applies the filter on that logger, so any other files using logging will not get the filter applied. To have this filter applied to all loggers do the following (note that you must first run
pip install python-json-logger
to use this example
)
import re
import logging
import loggingredactor
from pythonjsonlogger import jsonlogger
# Create a pattern to hide api key in url. This uses a _Positive Lookbehind_
redact_mask_patterns = [re.compile(r'(?<=api_key=)[\w-]+')]
# Override the logging handler that you want redacted
class RedactStreamHandler(logging.StreamHandler):
def __init__(self, *args, **kwargs):
logging.StreamHandler.__init__(self, *args, **kwargs)
self.addFilter(loggingredactor.RedactingFilter(redact_mask_patterns))
root_logger = logging.getLogger()
sys_stream = RedactStreamHandler()
# Also set the formatter to use json, this is optional and all nested keys will get redacted too
sys_stream.setFormatter(jsonlogger.JsonFormatter('%(name)s %(message)s'))
root_logger.addHandler(sys_stream)
logger = logging.getLogger(__name__)
logger.error("Request Failed", extra={'url': 'https://example.com?api_key=my-secret-key'})
# Output: {"name": "__main__", "message": "Request Failed", "url": "https://example.com?api_key=****"}You can also redact by dictionary keys, rather than by regex, in cases where certain fields should always be redacted. To achieve this, you can provide any iterable representing the keys that you would like to redact on. An example is shown below (this time with a different default mask):
import re
import logging
import loggingredactor
from pythonjsonlogger import jsonlogger
# This list now contains all the dictioanry keys that will have their values redacted in the logger object
redact_keys = ['email', 'password']
# Override the logging handler that you want redacted
class RedactStreamHandler(logging.StreamHandler):
def __init__(self, *args, **kwargs):
logging.StreamHandler.__init__(self, *args, **kwargs)
self.addFilter(loggingredactor.RedactingFilter(mask='REDACTED', mask_keys=redact_keys))
root_logger = logging.getLogger()
sys_stream = RedactStreamHandler()
# Also set the formatter to use json, this is optional and all nested keys will get redacted too
sys_stream.setFormatter(jsonlogger.JsonFormatter('%(name)s %(message)s'))
root_logger.addHandler(sys_stream)
logger = logging.getLogger(__name__)
logger.warning("User %(firstname)s with email: %(email)s and password: %(password)s bought some food!", {'firstname': 'Arman', 'email': 'arman_jasuja@yahoo.com', 'password': '1234567'})
# Output: {"name": "__main__", "message": "User Arman with email: REDACTED and password: REDACTED bought some food"}The above example also illustrates the logger redacting positional arguments provided to the message.
Logging Redactor also integrates quite well with already created logging configurations, for example, say you have your logging config set up in the following format:
import re
import logging.config
... # Other imports
LOGGING = {
... # Your other configs
'filters':{
... # Some configs
'pii': {
'()': 'loggingredactor.RedactingFilter',
'mask_keys': ('password', 'email', 'last_name', 'first_name', 'gender', 'lastname', 'firstname',),
'mask_patterns': (re.compile(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b'), ) # email regex
'mask': 'REDACTED',
},
... # Some other configs
}
'handlers': {
... # Some handlers
'stdout': {
... # Some configs
'filters': ['pii', ...],
},
... # Other handlers (add pii as a filter to all the ones where you want the appropriate information to be redacted)
}
... # Rest of your configs
}
logging.config.dictConfig(LOGGING)
... # Use your logger as normal, the redaction will now be applied.The essence boils down to adding the RedactingFilter to your logging config, and to the filters section of the associated handlers to which you want to apply the redaction.
If you don't want to assemble your own patterns and keys, use CommonPIIRedactingFilter. It comes preloaded with patterns and keys for the most common kinds of PII, and combines anything you pass in with those built-ins (so you keep the defaults and add your own):
import re
import logging
import loggingredactor
logger = logging.getLogger("demo")
logger.addFilter(loggingredactor.CommonPIIRedactingFilter(
mask="REDACTED",
# both optional, and combined with the built-ins rather than replacing them:
mask_patterns=[re.compile(r"SECRET-\w+")],
mask_keys={"order_token"},
))
logger.warning("contact %s or %s", "arman@example.com", "+1 415-555-2671")
# Output: contact REDACTED or REDACTED
logger.warning("login %(password)s", {"password": "hunter2"})
# Output: login REDACTEDOut of the box it redacts email addresses and phone numbers in the message text, and masks the value of common sensitive keys (e.g. password, token, api_key, ssn, credit_card, email, phone) in dicts and extra. Key matching is exact and case-sensitive, so add any other casings/spellings you use via mask_keys.
By default phone numbers are redacted for all built-in countries (US, GB, DE, FR, IN, AE, AU) plus any international +CC number. Pass phone_regions (ISO 3166-1 alpha-2 codes) to restrict it:
# only redact UAE and US phone numbers
loggingredactor.CommonPIIRedactingFilter(phone_regions=["AE", "US"])For accurate, validated detection — and coverage beyond the built-in countries — install the optional phonenumbers library; when present it's used automatically for the regions you list. Note that it is NOT a requirement however.
You can also grab a single country's pattern to use in a plain RedactingFilter:
from loggingredactor import RedactingFilter
from loggingredactor.common_pii_filters import PHONE
logging.getLogger().addFilter(RedactingFilter([PHONE.AE, PHONE.US]))CommonPIIRedactingFilter slots into a logging config just like RedactingFilter, so you get the standard patterns and keys without listing them yourself:
import logging.config
LOGGING = {
... # Your other configs
'filters': {
'pii': {
'()': 'loggingredactor.CommonPIIRedactingFilter',
'phone_regions': ('AE', 'US'), # optional; default is all built-in countries
'mask_keys': ('internal_id',), # optional; combined with the built-in keys
'mask_patterns': (re.compile(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b'), ) # optional, combines with existng mask_patterns
'mask': 'REDACTED',
},
},
'handlers': {
... # Some handlers
'stdout': {
... # Some configs
'filters': ['pii'],
},
},
... # Rest of your configs
}
logging.config.dictConfig(LOGGING)Or pull specific standard patterns into a plain RedactingFilter:
import loggingredactor
'pii': {
'()': 'loggingredactor.RedactingFilter',
'mask_patterns': [loggingredactor.common_pii_filters.EMAIL, loggingredactor.common_pii_filters.PHONE.AE],
'mask': 'REDACTED',
},If redaction ever raises while processing a record, loggingredactor will not crash your application: the record is still emitted, and the error is reported via logger.exception on the originating logger. If you'd rather suppress that error log entirely, pass silent_failure=True (available on both filters):
loggingredactor.RedactingFilter(mask_patterns, silent_failure=True)
loggingredactor.CommonPIIRedactingFilter(silent_failure=True)It works in a logging config the same way, as a regular filter argument:
'filters': {
'pii': {
'()': 'loggingredactor.CommonPIIRedactingFilter',
'mask': 'REDACTED',
'silent_failure': True,
},
},- Added support for Python 3.13 and 3.14. (Reported in issue #13)
- Added
CommonPIIRedactingFilter, a filter preloaded with common PII patterns (email and phone numbers) and keys (passwords, tokens, etc.). User-suppliedmask_patterns/mask_keysare combined with the built-ins rather than replacing them. - Added per-country phone number redaction: select countries with
phone_regions(ISO 3166-1 alpha-2 codes) or use a single country's pattern directly vialoggingredactor.common_pii_filters.PHONE(e.g.PHONE.AE). When the optionalphonenumberslibrary is installed it is used for accurate, validated detection of the selected regions; it is never required. - Added a
silent_failurekeyword argument (defaultFalse) to both filters to suppress the error log emitted when redacting a record fails. - Generalised container redaction beyond
list/tupleto any standard sequence or set (e.g.set,frozenset,deque,namedtuple), preserving the original type where it can be reconstructed. Only registeredSequence/Settypes recurse, so third-party iterables whose iteration is costly or side-effecting (e.g. a DjangoQuerySet, or a numpy array) are left untouched.
- Redact sensitive data exposed through an object's
str/reprwhen it is logged, even if the object's type is not one of the explicitly handled types (e.g. a custom class whose__str__returns an email). The original object is left untouched, and numeric conversions such as%d/%fcontinue to use the real value since only__str__/__repr__are redacted. (Reported in issue #12) - Support mapping types with non-standard constructors, such as Django's
QueryDict. (Reported in issue #14) - Application crashes due to errors in loggingredactor no longer occur. loggingredactor will produce a handled
logger.exceptionon the originating logger and logging continues with the record otherwise intact. (Reported in issue #14) - A field whose name is in
mask_keysis now masked wholesale regardless of its value's type. Previously a container value (e.g. a dict or list) under a masked key was recursed into rather than masked, which could leave its contents unredacted.
Logging Redactor started as a fork of logredactor. However, due to the bugs present in the original (specifically the data mutations), it was not usable in production environments where the purpose was to only redact variables in the logs, not in their usage in the code. This, along with the fact that the original package is no longer maintained lead to the creation of Logging Redactor.