TOC:Anchor link written in Japanese does not work #1118

Closed
@juu7g

Description

I'm using the TOC extension with slugify_unicode for Japanese,
together with anchor links.
In some cases, a Japanese anchor link does not work.
This happens when the Japanese text contains dakuon (for example 'ba')
or handakuon (for example 'pa') characters.
I think this is because the characters in the generated ID
differ from the characters in the header.
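
One way to check this hypothesis is to compare the header text with the generated ID character by character. The sketch below is only an illustration: the two strings are copied from the sample in this report, and `differing_chars` is a hypothetical helper, not part of Python-Markdown.

```python
import unicodedata

header = "プログラム"        # header text from the sample Markdown
generated_id = "フロクラム"  # id attribute from the generated HTML in this report


def differing_chars(a, b):
    """Return (position, char_a, char_b) wherever a and b differ.

    zip() stops at the shorter string, so this only compares
    the positions that both strings share.
    """
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


for i, x, y in differing_chars(header, generated_id):
    # unicodedata.name() gives the official Unicode character name
    print(i, unicodedata.name(x), "->", unicodedata.name(y))
```

If the hypothesis holds, only the dakuon and handakuon characters should be listed, each paired with its unvoiced counterpart.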

Sample Markdown:

[TOC]
[anchor link to プログラム](#プログラム)  
[anchor link to ぷろぐらむ](#ぷろぐらむ)  

##プログラム

##ぷろぐらむ

Generated html:

<div class="toc">
<ul>
<li><a href="#フロクラム">プログラム</a></li>
<li><a href="#ふろくらむ">ぷろぐらむ</a></li>
</ul>
</div>
<p><a href="#プログラム">anchor link to プログラム</a><br />
<a href="#ぷろぐらむ">anchor link to ぷろぐらむ</a>  </p>
<h2 id="フロクラム">プログラム</h2>
<h2 id="ふろくらむ">ぷろぐらむ</h2>

The result I expect is:

<div class="toc">
<ul>
<li><a href="#プログラム">プログラム</a></li>
<li><a href="#ぷろぐらむ">ぷろぐらむ</a></li>
</ul>
</div>
<p><a href="#プログラム">anchor link to プログラム</a><br />
<a href="#ぷろぐらむ">anchor link to ぷろぐらむ</a>  </p>
<h2 id="プログラム">プログラム</h2>
<h2 id="ぷろぐらむ">ぷろぐらむ</h2>

As far as I can tell, this depends on the normalization form passed to
unicodedata.normalize().
In other words, I think the first argument needs to change
from "NFKD" to "NFKC".
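
The difference between the two forms can be sketched with the standard library alone. NFKD splits a dakuon or handakuon character into a base character plus a combining mark, while NFKC recomposes it. In the sketch below, `toy_slug` is a hypothetical stand-in and not Python-Markdown's actual slugify code; the assumption is that the slug step drops every character that is not a word character, whitespace, or a hyphen, which would also drop the combining marks.

```python
import re
import unicodedata


def toy_slug(text, form):
    """Hypothetical slug step (an assumption, not the library's code):
    normalize with the given form, then remove every character that is
    not a word character, whitespace, or a hyphen."""
    normalized = unicodedata.normalize(form, text)
    return re.sub(r"[^\w\s-]", "", normalized).strip().lower()


for word in ("プログラム", "ぷろぐらむ"):
    # Start from the precomposed (NFC) spelling so both forms see the same input
    word = unicodedata.normalize("NFC", word)
    for form in ("NFKD", "NFKC"):
        slug = toy_slug(word, form)
        status = "matches header" if slug == word else "differs from header"
        print(form, word, "->", slug, f"({status})")
```

Under these assumptions, the NFKD results reproduce the IDs shown in the generated HTML, and the NFKC results keep the header text unchanged. NFKC would still rewrite compatibility characters such as half-width katakana, so IDs could differ from headers in those cases too.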

Reference: Difference Between NFD, NFC, NFKD, and NFKC Explained with Python Code | by Xu LIANG | Towards Data Science

I'm not familiar with Unicode, and this is my first post.
Please investigate.

Labels: bug (Bug report.), extension (Related to one or more of the included extensions.), needs-confirmation (The alleged behavior needs to be confirmed.)