Skip to content

Always wrap CodeHilite code in <code> tags #862

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

t-mart
Copy link
Contributor

@t-mart t-mart commented Sep 29, 2019

Goal

Currently, the HTML generated by CodeHilite

  • looks like <pre>stuff</pre> if you are using Pygments
  • looks like <pre><code>stuff</code></pre> tags if you are not using Pygments.

See logic here: https://github.com/Python-Markdown/markdown/blob/master/markdown/extensions/codehilite.py#L106

I argue that we should always be using the nested <code> tags in both cases. The code in this PR makes the two cases consistent.

Background

So, backing up a bit:

  • <code> indicates that the enclosed text is computer code. It can be used inline. In other words, this is literally rendered with a code tag.

  • <pre> indicates that the enclosed text's whitespace should be preserved. It creates a block.

  • Using them together?:

    The following example shows how a block of code could be marked up using the pre and code elements.
    <pre><code class="language-pascal">var i: Integer; ...

    from HTML Living Standard/HTML5

    "To represent multiple lines of code, wrap the <code> element within a <pre> element. The <code> element by itself only represents a single phrase of code or line of code."

    from MDN

Example

# This should render as just a <pre> tag.
# You get something like if you render with CodeHilite and you are using Pygments.
# This should render as a <pre> and <code> tag.
# You get something like if you render with CodeHilite and you **are not** using Pygments,
# and there is no way to opt out, but frankly, I think this is the right way to do it.

Note that there probably no visual difference in your browser, but the underlying HTML structure is different. Use your browser to inspect.

Rationale

We should always be wrapping our code blocks with code tags for a few reasons, I think:

  • Markdown specs (pick, your, poison) mandates <code> tags.
  • No matter which spec you look at, they're always referred to as code blocks. The output should match that.
  • In this very project, it's already hardcoded that the <code> tags get produced if you are not using Pygments, but otherwise using Pygments omits them. CodeHilite has been doing it this way for at least 9 years. If the community doesn't like this PR, we should at least be consistent in how we output nonetheless and figure out some other solution to give users control.
  • The HTML Living Standard, HTML5 and MDN suggest that blocks of code can be marked up with a code tag nested inside a pre tag.
    • Pygments further has been offering an opt-in keyword argument for <code> tag wrapping in its HTMLFormatter since version its 2.4 version released May 8, 2019, but we have not been utilizing it.
  • It offers designers and other consumers of HTML more structure to differentiate between code and other text that may be in the same <pre> tag.

Breakages

  • This is a backwards incompatible change for users that expect the output of CodeHilite while using Pygments to not have <code> tags. In my opinion, I think this will affect a low number of people because of the reasons listed in the rationale. Check the test string updates for how this will look.
    • For what it's worth, CodeHilite using <code> tags when no Pygments is present was introduced in a minor version, 2.1.0. I'm not saying that we should necessarily push this and break semantic versioning...again... but if there's some lower bar for this kind of change that I don't know about, then we might be ok.

Non-Breakages

Pygments since it's first major version release, 1.0 (again, current is 2.4.3), has been supporting arbitrary options being passed to its HTMLFormatter. With how this PR is coded, there should be no errors to users using the wrapcode option of at least that version, but the option will only have effect for people with Pygments >2.4.

This PR's Strategy

  • Always pass in wrapcode=True to the Pygments HTMLFormatter
  • Update tests
  • Verify tests pass locally
  • Update docs and docstrings

@waylan
Copy link
Member

waylan commented Sep 29, 2019

Cool! I hadn't noticed that Pygments added this option. This has always bugged me about Pygments' output. Glad to see it added (in other words, I didn't need all that convincing but its now documented for others so your effort is not waisted).

Pygments since it's first major version release, 1.0 (again, current is 2.4.3), has been supporting arbitrary options being passed to its HTMLFormatter. With how this PR is coded, there should be no errors to users using the wrapcode option of at least that version, but the option will only have effect for people with Pygments >2.4.

That addresses my one concern. Thank you.

For what it's worth, CodeHilite using <code> tags when no Pygments is present was introduced in a minor version, 2.1.0. I'm not saying that we should necessarily push this and break semantic versioning...again... but if there's some lower bar for this kind of change that I don't know about, then we might be ok.

As it turns out, our next planned release is 3.2, so no problem there anyway.

Update the docs, including the release notes, and I'll merge this.

@waylan waylan added confirmed Confirmed bug report or approved feature request. extension Related to one or more of the included extensions. feature Feature request. work-in-progress A partial solution. More changes will be coming. labels Sep 29, 2019
@waylan waylan added this to the Version 3.2 milestone Sep 29, 2019
@t-mart t-mart force-pushed the codehilite-always-wrap-in-code-tag branch from 947b4d5 to d629d6a Compare September 30, 2019 03:33
@t-mart
Copy link
Contributor Author

t-mart commented Sep 30, 2019

I updated the docs. Let me know if it's what you're looking for. (I also hope the force push was okay... do you prefer to squash on merge instead?)

Also, ...

As I've been working on this small section of CodeHilite, I have also noticed some gnarly other bits of it. There are options that only apply with/apply without/apply differently depending on Pygments/use_pygments, the Extension interface to it seems out of date with the underlying CodeHilite class (option set differs) and the docs in general could use a pass over.

I'd like to take some time to refactor it if you have the appetite for that.

I'd also like to make Pygments a required package for Python-Markdown to reduce complexity:

  • Aside from small things like <code> tags, there are a number of other differences between the code CodeHilite produces depending on Pygments, such as what classes get applied to what tags.
  • Instead of trying to replicate Pygments with a reduced feature set when it does not exist, I think we should be delegating to it.
  • There don't seem to be any other 3rd party packages apart from setuptools. How do you feel about requiring that?
  • This could also be an opportunity to round-out the header parsing logic since there's only one target: Pygments. This was kind of the issue with CodeHilite: Support Pygments options #822 .

Thoughts?

@waylan waylan added approved The pull request is ready to be merged. and removed work-in-progress A partial solution. More changes will be coming. labels Sep 30, 2019
@waylan waylan merged commit c6a9985 into Python-Markdown:master Sep 30, 2019
@waylan
Copy link
Member

waylan commented Sep 30, 2019

do you prefer to squash on merge instead

I squash except for those rare occasions were a PR contains a perfect commit. With your force push, your commit was good. However, I still used squash so I could add a technical note to the commit message. In other words, as long as the final product is good, I'll happy squash it. GitHub makes that easy with their web interface.

I'd like to take some time to refactor it if you have the appetite for that.

This extension is over a decade old, so yeah, it needs some updating/refactoring which I have not had time to work on. Reviewing some of the open issues/PRs (#334, #775, #652, #822 and perhaps others) will provide some insight into the planned direction. Feel free to ask questions or propose solutions in the various discussions in those issues/PRs.

Note that the extension serves two functions:

  1. Provide a mechanism for defining the language of indented code blocks. This allows the user to add their own code-highlighting (perhaps via a JavaScript lib) separately. The idea is that we replicate the output and behavior of fenced code blocks for indented code blocks.
  2. Highlight the code within all recognized code blocks, including any alternative code blocks provided by extensions. Obviously, this includes fenced code blocks, but should be able to work with code blocks produced by other third party extensions as well.

Note that the extension always performs function 1. However, function 2 can be optionally turned off by the user. Any refactor must maintain this division of functions.

I'd also like to make Pygments a required package for Python-Markdown

As long as CodeHilite is included with Python-Markdown I am not comfortable with this. We are proud of the fact that Python-Markdown has no dependencies (except Setuptools). Many users of Python-Markdown never use or need CodeHilite. Others may use CodeHilite for function 1 (as described above) but never need Pygments. Therefore, we should not require all users of Python-Markdown to have Pygments installed.

We have contemplated breaking extensions out into separate packages. If that were to happen at some point in the future, then yes, requiring Pygments as a dependency would make sense. Until then, I'd prefer to leave the requirement as-is. At this time, we are not planning on breaking out the extensions.

cuu508 added a commit to healthchecks/healthchecks that referenced this pull request Feb 18, 2020
devs-cloud added a commit to devs-cloud/django-healthcheck that referenced this pull request Feb 22, 2020
nikramakrishnan added a commit to freetype/docwriter that referenced this pull request Feb 29, 2020
See

  Python-Markdown/markdown#862

for more information.

* docs/reference/markdown/stylesheets/extra.css: Update 'pre' styles
to work for 'pre>code'.

* docwriter/tomarkdown.py: Add <code> tag to source code blocks.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved The pull request is ready to be merged. confirmed Confirmed bug report or approved feature request. extension Related to one or more of the included extensions. feature Feature request.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants