HTML doesn’t allow nested comments. However, both Firefox and Chromium are somewhat lenient about that, which can lead to surprising results when you parse a document with Floki and serialize it again (I tried this with 0.37.0):
raw_html = """
<!doctype html>
<body>
Before the comment<br>
<!--[if mso | IE]>
<div>
<!-- this is a nested comment -->
</div>
<![endif]-->
After the comment.
</body>
"""
parsed_html = raw_html |> Floki.parse_document!() |> Floki.raw_html()
File.write!("raw-html.html", raw_html)
File.write!("parsed-html.html", parsed_html)
raw-html.html looks exactly like the original string:
<!doctype html>
<body>
Before the comment<br>
<!--[if mso | IE]>
<div>
<!-- this is a nested comment -->
</div>
<![endif]-->
After the comment.
</body>
But parsed-html.html looks like this:
<body>
Before the comment<br/><!--[if mso | IE]>
<div>
<!-- this is a nested comment -->
<![endif]--&gt;
After the comment.
</body>
Floki escapes the > of the outer comment to &gt;. And because browsers are lenient when handling nested comments, this changes the way the file is displayed.
I’m not sure if this could be considered a bug but I did find it somewhat unexpected.
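The rule behind this can be sketched without Floki. As far as I understand the HTML parsing rules, a comment that starts with `<!--` ends at the first `-->`, and a `<!` that does not start a real comment opens a "bogus comment" that runs up to the next literal `>`. The module and function names below are made up for illustration, and the sketch ignores the less common terminators such as `--!>`. It also assumes that the escaped character is the `>` closing `<![endif]-->`, which is my reading of the output above rather than something I verified in Floki's source.

```elixir
defmodule CommentSketch do
  # Hypothetical helper: text between the first `open` and the next `close`,
  # plus whatever follows. Returns :none if either marker is missing.
  defp between(html, open, close) do
    with [_before, after_open] <- String.split(html, open, parts: 2),
         [body, rest] <- String.split(after_open, close, parts: 2) do
      {body, rest}
    else
      _ -> :none
    end
  end

  # A regular comment: "<!--" up to the FIRST "-->" (so nesting cannot work).
  def comment(html), do: between(html, "<!--", "-->")

  # A bogus comment: "<!" up to the next literal ">".
  # Only meaningful when the "<!" is not followed by "--".
  def bogus_comment(html), do: between(html, "<!", ">")
end

raw_html = """
<!--[if mso | IE]>
<div>
<!-- this is a nested comment -->
</div>
<![endif]-->
After the comment.
</body>
"""

# The "outer" comment already ends where the inner one ends.
{body, rest} = CommentSketch.comment(raw_html)
IO.inspect(body, label: "comment body")
IO.inspect(rest, label: "left over")

# Unescaped, the leftover "<![endif]-->" is a short bogus comment.
{"[endif]--", _after} = CommentSketch.bogus_comment(rest)

# Assumption: the closing ">" is what gets escaped. The bogus comment then
# runs on until the ">" of "</body>" and swallows the visible text.
escaped = String.replace(rest, "<![endif]-->", "<![endif]--&gt;")
{swallowed, _after} = CommentSketch.bogus_comment(escaped)
IO.inspect(swallowed, label: "swallowed by the bogus comment")
```

If that reading is right, it would explain why the text after the comment is affected even though only a single character differs between the two files.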