homograph warning#12068
Conversation
|
Hi @Douglas-MacGregor, thanks for your interest in contributing! This project requires that pull request authors are vouched, and you are not in the list of vouched users. This PR will be closed automatically. See https://github.com/ghostty-org/ghostty/blob/main/CONTRIBUTING.md for more details. |
|
!vouch |
Triggered by [comment](#12068 (comment)) from @jcollie. Vouch: @Douglas-MacGregor Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Thank you, I hope the class learned a lot! I'll give an honest review in an attempt to be fair in how the real world would review this, but I hope the takeaway is educational rather than demoralizing in any way. The fact you got this far is in and of itself a great achievement.
The major issue is I think this PR spans too many subsystems: it attempts to do core work, libghostty work, UI work across both Linux and macOS all in a single large PR. In any context (open source or professional), it is better to break things down into smaller parts. It makes it easier to communicate with coworkers/maintainers and for us to review them.
Concretely within this:
The core (input/paste):
- I don't think we need all the span stuff. A pure boolean safe/unsafe is fine. We don't need to explain our reasoning for why it is unsafe. We don't do that today and so we don't need to add it.
- I'd like to learn more about why you chose the given ranges for confusables.
- I'd expose it in
paste.zigviaisSafe. If we want to expose reasons, I think it'd be cleaner to change that to something likevalidateSafetyand an error union with void as the result.
The rest:
- Remove all libghostty-vt stuff
- Remove all UI work and let it bubble up to the standard unsafe paste warnings we already have that make no attempt to explain why it is unsafe
I think those could be valuable but as a separate body of work.
|
Thanks for the feedback here. On behalf of the team, we really appreciate it. The main reason why we had scope creep is largely due to our requirements on our end (our PR had to be significant and span a number of files) for our course deliverable. I fully agree that these changes should be smaller in scope, and I can see a solution that lives fully within As far as your question on confusables, we chose to check directly for scripts that can look like latin glyphs in certain fonts. This includes the Cyrillic range (U+0400-U+052F), Greek range (U+0370-U+03FF), Armenian range (U+0530-058F), and the IPA extension small g (U+0261). We did this because we wanted to minimize false positives on mixed-script usage, but in retrospect it's probably simpler to just detect any mixed script at all, following the definition for single-script strings from TR39. From an educational perspective, this has been a good exercise for us to try working with larger projects, and diving into how Unicode works has been quite interesting. Thanks again for reviewing our work and giving us a chance to break into open-source projects. |
This PR addresses issue #10762 by detecting potential homograph attacks in user input
and warning the user when suspicious characters are used.
Closes #10762
Hi Maintainers at ghostty,
My team and I are students and this PR is part of the classic Computer Science project where we are trying to tack some OSP issues.
We had a lighter weight version of the feature, but it was running client side in the UI not as apart of the system which we thought maybe wasn't right, but maybe this version touches too many files....?
We hope the solution is helpful and, while we are entering exam season, one of us will try to be available to answer questions if needed. Thank you for your time and feedback!