-
Notifications
You must be signed in to change notification settings - Fork 23
Tbain/253 add tags count #506
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
0473c0e
2c0b959
8846805
b01964b
a23afe8
3df68ab
435808c
457313b
a14c56e
2055a07
939f18c
5762c33
79be83a
c2f79d2
c017e8a
7d42793
1e47967
a87c877
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||
|---|---|---|---|---|---|---|---|---|
|
|
@@ -3,6 +3,8 @@ | |||||||
| """ | ||||||||
| from __future__ import annotations | ||||||||
|
|
||||||||
| from collections import Counter, defaultdict | ||||||||
|
|
||||||||
| from django.core import exceptions | ||||||||
| from django.db import models | ||||||||
| from django.http import Http404, HttpResponse | ||||||||
|
|
@@ -847,17 +849,54 @@ def get_queryset(self) -> TagDataQuerySet: | |||||||
| ) | ||||||||
| if depth == 1: | ||||||||
| # We're already returning just a single level. It will be paginated normally. | ||||||||
| if include_counts: | ||||||||
| return self._add_counts(results) | ||||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Very minor nit to be clear what this returns:
Suggested change
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is a bit subjective, some might not like the suggestion. If @bradenmacdonald you like it please thumbs up this.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't have a strong opinion here but I guess I do like your suggestion a bit more, as it's more clear. |
||||||||
| return results | ||||||||
| elif full_depth_threshold and len(results) < full_depth_threshold: | ||||||||
| # We can load and display all the tags in this (sub)tree at once: | ||||||||
| self.pagination_class = DisabledTagsPagination | ||||||||
| if include_counts: | ||||||||
| return self._add_counts(results) | ||||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is a bit subjective, some might not like the suggestion. If @bradenmacdonald you like it please thumbs up this. |
||||||||
| return results | ||||||||
| else: | ||||||||
| # We had to do a deep query, but we will only return one level of results. | ||||||||
| # This is because the user did not request a deep response (via full_depth_threshold) or the result was too | ||||||||
| # large (larger than the threshold). | ||||||||
| # It will be paginated normally. | ||||||||
| return results.filter(parent_value=parent_tag_value) | ||||||||
| filtered_results = results.filter(parent_value=parent_tag_value) | ||||||||
| if include_counts: | ||||||||
| return self._add_counts(filtered_results) | ||||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is a bit subjective, some might not like the suggestion. If @bradenmacdonald you like it please thumbs up this. |
||||||||
|
|
||||||||
| return filtered_results | ||||||||
|
|
||||||||
| def _add_counts(self, tag_data: TagDataQuerySet) -> TagDataQuerySet: | ||||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nit: this could be a top-level function in Nit: Also, along the lines of @jesperhodge 's comments, it would be good to explain the de-duplication here or at lest say "refer to test case X for examples and details". (Or put clear details here and have the test cases refer back here). |
||||||||
| """ | ||||||||
| Add usage counts to a list of tag data dictionaries. For performance | ||||||||
| reasons, we call this function with the list result of the | ||||||||
| QuerySet so we can then add the counts in-memory rather than to a | ||||||||
| QuerySet which would require a very expensive annotation to join the | ||||||||
| in-memory data to the original QuerySet. | ||||||||
| """ | ||||||||
|
|
||||||||
| taxonomy = self.get_taxonomy() | ||||||||
| object_tags = taxonomy.objecttag_set.values_list("object_id", "tag__lineage") | ||||||||
| tag_counts: Counter[str] = Counter() | ||||||||
| object_tag_lineage_seen: defaultdict[str, set] = defaultdict(set) | ||||||||
|
|
||||||||
| for object_id, tag_lineage in object_tags: | ||||||||
| # split the lineages to get a dict of {tag.value: [lineages]} | ||||||||
| lineage_tags = list(tag_lineage.split('\t')) if tag_lineage else [] | ||||||||
| # de-duplicate based on if the lineage is already 'seen' per object | ||||||||
| unseen_tags = [t for t in lineage_tags if t not in object_tag_lineage_seen[object_id]] | ||||||||
|
|
||||||||
| tag_counts.update(unseen_tags) | ||||||||
| object_tag_lineage_seen[object_id].update(unseen_tags) | ||||||||
|
|
||||||||
| # In-memory 'annotation'; this is faster than using annotate() on the QuerySet. | ||||||||
| for row in tag_data: | ||||||||
| row["usage_count"] = tag_counts.get(row["value"], 0) | ||||||||
|
|
||||||||
| return tag_data | ||||||||
|
|
||||||||
| def post(self, request, *args, **kwargs): | ||||||||
| """ | ||||||||
|
|
||||||||
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we keeping these
include_countsparameters if they don't work? I think you can just remove them (from here and from the function that calls these ones).There is no code referencing
include_countsin openedx-platform as far as I can tell, so even though it's a breaking change, it's fine to include such changes in our 0.x version of openedx-core.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I wasn't sure about keeping these or not. I know it's probably a "you aren't going to need it" type thing, but for completions sake, I felt like it was worth continuing them down to this level since they are a param in the query. I'm perfectly fine removing them though, I was 50/50 either way.