Move backslash unescaping to treeprocessor #1272

waylan · 2022-07-13T19:46:14Z

By unescaping backslash escapes in a treeprocessor, the text is properly
escaped during serialization. Fixes #1131.

As it is recognized that various third-party extensions may be calling the
old class at postprocessors.UnescapePostprocessor, the old class remains
in the codebase, but has been deprecated and will be removed in a future
release. The new class treeprocessors.UnescapeTreeprocessor should be
used instead.

waylan

Below are a few comments and concerns I have about this change. Feedback is welcome.

waylan · 2022-07-13T19:50:04Z

tests/basic/backlash-escapes.html

@@ -9,7 +9,7 @@
 <p>Right bracket: ]</p>
 <p>Left paren: (</p>
 <p>Right paren: )</p>
-<p>Greater-than: ></p>
+<p>Greater-than: &gt;</p>


This is the one and only change in behavior in the existing tests. I'm okay with this, however, as technically this results in valid output. The reason for the change is that the angle bracket gets escaped during serialization. Previously, a placeholder was there during serialization, which was swapped out for the actual character later. The whole point of this change was to better ensure valid HTML output, so this is an acceptable change in behavior.

Having unescaped > in HTML was a bug, so good that you fixed it.
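The behavior change discussed above can be reproduced outside the library with a small sketch. It uses only the standard library's `xml.etree.ElementTree` rather than Python-Markdown's own serializer, and the `STX`/`ETX` delimiters and the `unescape` helper are assumptions made for illustration, not the library's actual code. The point is only the ordering: unescaping after serialization leaves a raw `>`, while unescaping in the tree lets the serializer escape it.

```python
import re
import xml.etree.ElementTree as ET

# Assumed placeholder delimiters; an escaped character is stored as
# STX + <decimal code point> + ETX until it is restored.
STX, ETX = "\u0002", "\u0003"
PLACEHOLDER_RE = re.compile(f"{STX}(\\d+){ETX}")


def unescape(text):
    """Replace each placeholder with the character it stands for."""
    return PLACEHOLDER_RE.sub(lambda m: chr(int(m.group(1))), text)


p = ET.Element("p")
p.text = f"Greater-than: {STX}{ord('>')}{ETX}"

# Old ordering: serialize first, then swap placeholders in the output string.
# The serializer never sees the '>' and so cannot escape it.
postprocessed = unescape(ET.tostring(p, encoding="unicode"))

# New ordering: restore the character in the tree, then serialize.
p.text = unescape(p.text)
treeprocessed = ET.tostring(p, encoding="unicode")

print(postprocessed)
print(treeprocessed)
```

Both results are well-formed for this input, but only the second relies on the serializer for escaping, which is what makes the approach safe for characters such as `<` and `&` as well.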

waylan · 2022-07-13T19:53:11Z

markdown/treeprocessors.py

+        """ Loop over all elements and unescape all text. """
+        for elem in root.iter():
+            # Unescape text content
+            if elem.text and not elem.tag == 'code':


I'm not sure we actually need to skip code tags. If I remove the check, the tests all pass, and the previous code had no way to distinguish between code and other content. However, there is always a possibility that code could intentionally contain what looks like a placeholder, in which case the content should not be altered. Therefore, I have left the check in.
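To make the effect of the `code` check concrete, here is a minimal standalone sketch of a tree walk in the spirit of the hunk above. It is not the merged implementation: the `STX`/`ETX` delimiters, the `unescape` and `unescape_tree` names, and the handling of `tail` text are assumptions for illustration, and it runs on a plain `xml.etree.ElementTree` tree.

```python
import re
import xml.etree.ElementTree as ET

# Assumed placeholder format: STX + <decimal code point> + ETX.
STX, ETX = "\u0002", "\u0003"
PLACEHOLDER_RE = re.compile(f"{STX}(\\d+){ETX}")


def unescape(text):
    """Replace each placeholder with the character it stands for."""
    return PLACEHOLDER_RE.sub(lambda m: chr(int(m.group(1))), text)


def unescape_tree(root):
    """Walk every element and unescape its text, skipping <code> content."""
    for elem in root.iter():
        # Text inside <code> is left alone in case it intentionally
        # contains something that looks like a placeholder.
        if elem.text and elem.tag != "code":
            elem.text = unescape(elem.text)
        # Tail text follows the element's closing tag, so it belongs to
        # the parent's content and is unescaped even after a <code>.
        if elem.tail:
            elem.tail = unescape(elem.tail)


placeholder = f"{STX}{ord('*')}{ETX}"
root = ET.Element("div")
para = ET.SubElement(root, "p")
para.text = f"star: {placeholder}"
code = ET.SubElement(para, "code")
code.text = placeholder
code.tail = f" tail: {placeholder}"

unescape_tree(root)
```

After the walk, the paragraph text and the text following the `code` element contain a literal `*`, while the `code` element still holds the untouched placeholder.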

markdown/extensions/toc.py

facelessuser · 2022-07-14T17:27:12Z

I'll try and look at this soon. I'd like to pull it and see how it impacts some of my things.

mitya57

As all tests pass and the bug is fixed, I am happy with this change.

facelessuser

I see nothing breaking on my end. Seems good.

See Python-Markdown/markdown#1272. Signed-off-by: Anders Kaseorg <[email protected]>

This replaced the deprecated `markdown.postprocessors.UnescapePostprocessor` in Python-Markdown/markdown#1272. Signed-off-by: Anders Kaseorg <[email protected]>

waylan commented Jul 13, 2022

cleanup

2e3d14d

waylan requested review from facelessuser and mitya57 July 13, 2022 20:11

waylan added the needs-review Needs to be reviewed and/or approved. label Jul 14, 2022

Remove unnecessary unescape call from toc ext.

acb0c31

mitya57 approved these changes Jul 14, 2022

facelessuser approved these changes Jul 14, 2022

waylan merged commit c0f6e5a into Python-Markdown:master Jul 15, 2022

waylan deleted the 1131 branch July 15, 2022 12:38

waylan added approved The pull request is ready to be merged. and removed needs-review Needs to be reviewed and/or approved. labels Jul 15, 2022

andersk added a commit to andersk/zulip that referenced this pull request Feb 4, 2023

markdown: Replace deprecated UnescapePostprocessor.

44a448f

See Python-Markdown/markdown#1272. Signed-off-by: Anders Kaseorg <[email protected]>

andersk mentioned this pull request Feb 4, 2023

markdown: Replace deprecated UnescapePostprocessor zulip/zulip#24289

Merged

andersk mentioned this pull request Feb 4, 2023

Add markdown.treeprocessors.UnescapeTreeprocessor python/typeshed#9671

Merged

andersk added a commit to andersk/zulip that referenced this pull request Feb 4, 2023

markdown: Replace deprecated UnescapePostprocessor.

e639399

See Python-Markdown/markdown#1272. Signed-off-by: Anders Kaseorg <[email protected]>

timabbott pushed a commit to zulip/zulip that referenced this pull request Feb 5, 2023

markdown: Replace deprecated UnescapePostprocessor.

b91788b

See Python-Markdown/markdown#1272. Signed-off-by: Anders Kaseorg <[email protected]>

UjjwalAggarwal-1 pushed a commit to UjjwalAggarwal-1/zulip that referenced this pull request Feb 20, 2023

markdown: Replace deprecated UnescapePostprocessor.

0266af4

See Python-Markdown/markdown#1272. Signed-off-by: Anders Kaseorg <[email protected]>
