Add hashed_endpoint to prevent multiple shortened URLs (with the same configuration) from pointing to the same endpoint.
Each endpoint is encrypted and decrypted individually, using a randomly generated per-URL IV (seed), so identical endpoints produce different ciphertexts. The current implementation therefore cannot check for duplicate endpoints without decrypting every row, which would be prohibitively slow.
Solution:
Add a hashed_endpoint column to the database, computed with the same hashing algorithm used in the UrlCryptography interface and a salt that is unique to the project. The column stores the endpoint in hashed form, which keeps it private while allowing fast duplicate checks (this requires an index on the column).
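A minimal sketch of how such a hash could be computed. The actual algorithm is whatever the UrlCryptography interface uses, which is not stated here. HMAC-SHA-256 is only a stand-in, and the names PROJECT_SALT and hash_endpoint are hypothetical.

```python
import hashlib
import hmac

# Hypothetical project-wide salt; a real deployment would load it from
# configuration or a secret store rather than hard-coding it.
PROJECT_SALT = b"example-project-salt"


def hash_endpoint(endpoint: str, salt: bytes = PROJECT_SALT) -> str:
    """Return a deterministic, keyed hex digest of the endpoint.

    Assumption: HMAC-SHA-256 stands in for the project's real hashing
    algorithm. The same endpoint and salt always give the same digest,
    which is what makes an indexed equality lookup possible.
    """
    return hmac.new(salt, endpoint.encode("utf-8"), hashlib.sha256).hexdigest()
```

Unlike the per-URL IV used for encryption, the salt is the same for every row, so two identical endpoints map to the same hashed_endpoint value.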
User experience:
When a user generates a shortened URL pointing to an endpoint that has already been shortened, and the existing shortened URL has the same configuration, no new shortened URL is generated. Instead, the existing one is returned to the user.
When are two shortened URLs equal to each other?
Two shortened URLs are equal if their configs (e.g. expiration date, one-time use, password) and their hashed endpoints are equal. Optionally, a unique index combining all of these columns can enforce this at the database level in case the backend services fail (though this is not necessarily recommended).
What if two URLs are equal, but both are one-time use?
If shortened URLs A and B are equal and both are one-time use, they are not treated as duplicates.
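As an illustration of the optional database-level guard, here is a sketch using SQLite from Python's standard library. The table and column names are assumptions, not the project's actual schema. Two choices are worth noting: the config columns are NOT NULL with sentinel defaults, because SQLite treats NULLs as distinct in unique indexes, and the index is partial (WHERE one_time = 0) so that it does not block the one-time-use case covered in this description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE short_urls (
        id INTEGER PRIMARY KEY,
        hashed_endpoint TEXT NOT NULL,
        expires_at TEXT NOT NULL DEFAULT '',     -- '' means "never expires"
        one_time INTEGER NOT NULL DEFAULT 0,
        password_hash TEXT NOT NULL DEFAULT ''   -- '' means "no password"
    )
""")
# Unique over endpoint + config, but only for rows that are not one-time use.
conn.execute("""
    CREATE UNIQUE INDEX uq_short_urls_equality
    ON short_urls (hashed_endpoint, expires_at, password_hash)
    WHERE one_time = 0
""")


def try_insert(hashed_endpoint, expires_at="", one_time=0, password_hash=""):
    """Insert a row; return False if the unique index rejects it."""
    try:
        conn.execute(
            "INSERT INTO short_urls "
            "(hashed_endpoint, expires_at, one_time, password_hash) "
            "VALUES (?, ?, ?, ?)",
            (hashed_endpoint, expires_at, one_time, password_hash),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this index, a second insert of the same endpoint and config is rejected, while a row with a different expiration date or a one-time-use row is accepted.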
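The equality rule and its one-time-use exception can be expressed as a small predicate. This is a sketch under assumptions: the ShortUrl fields and the name is_duplicate are hypothetical and only mirror the config options mentioned above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ShortUrl:
    # Hypothetical fields mirroring the config options named above.
    hashed_endpoint: str
    expires_at: Optional[str] = None
    one_time: bool = False
    password_hash: Optional[str] = None


def is_duplicate(a: ShortUrl, b: ShortUrl) -> bool:
    """True if b should reuse a instead of creating a new shortened URL."""
    if a.one_time and b.one_time:
        # Two one-time URLs are never duplicates, even if otherwise equal.
        return False
    # Dataclass equality compares hashed_endpoint and every config field.
    return a == b
```

One reading of the exception is that a one-time URL becomes invalid after its first use, so sharing it between two requests would leave the second user with a dead link.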
Why would we implement this?
- Save storage: Over time, this can save a lot of otherwise wasted storage. Most users will probably shorten a URL without configuring it further (i.e. without setting a password, one-time use, etc.).
- Reduce index size and improve speed: Commonly used URLs (such as https://google.com) will not produce many index entries for the same endpoint.
- Help prevent spam