Isn't this functionality part of what toc needs and does? If so, the toc plugin's code could probably be split into a part that implements the headinganchors functionality and the TOC generation itself. That would bring more order to the code and to the set of available plugins. --Ivan Z.

Indeed it is. Except toc generates headings differently, and independently of this. Even if toc's functionality were split, you'd probably want to retain backwards compatibility there, so it's unlikely that this will happen... Also see toc-with-human-readable-anchors. --anarcat

A patch to make it more like MediaWiki:

    @@ -5,6 +5,7 @@
     use warnings;
     use strict;
     use IkiWiki 2.00;
    +use URI::Escape;
     sub import {
             hook(type => "sanitize", id => "headinganchors", call => \&headinganchors);
    @@ -14,9 +15,11 @@
             my $str = shift;
             $str =~ s/^\s+//;
             $str =~ s/\s+$//;
    -        $str = lc($str);
    -        $str =~ s/[&\?"\'\.,\(\)!]//mig;
    -        $str =~ s/[^a-z]/_/mig;
    +        $str =~ s/\s/_/g;
    +        $str =~ s/"//g;
    +        $str =~ s/^[^a-zA-Z]/z-/; # must start with an alphabetical character
    +        $str = uri_escape_utf8($str);
    +        $str =~ s/%/./g;
             return $str;


This was applied in 3.20110608 --smcv
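
To see what the substitutions in the patch above produce, they can be run as a standalone function. This is only a minimal sketch: the name `mediawiki_anchor` and the sample headings are hypothetical, and only the substitutions themselves are taken from the patch. It assumes the `URI::Escape` module (part of the URI distribution on CPAN) is installed.

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape_utf8);

# Hypothetical standalone wrapper; the body repeats the patched substitutions.
sub mediawiki_anchor {
    my $str = shift;
    $str =~ s/^\s+//;              # trim leading whitespace
    $str =~ s/\s+$//;              # trim trailing whitespace
    $str =~ s/\s/_/g;              # each remaining whitespace character becomes "_"
    $str =~ s/"//g;                # drop double quotes
    $str =~ s/^[^a-zA-Z]/z-/;      # a non-letter first character is replaced by "z-"
    $str = uri_escape_utf8($str);  # percent-encode the UTF-8 bytes of other characters
    $str =~ s/%/./g;               # MediaWiki style: "%" becomes "."
    return $str;
}

print mediawiki_anchor("Hello World"), "\n";   # Hello_World
print mediawiki_anchor("Why now?"), "\n";      # Why_now.3F
print mediawiki_anchor("2nd section"), "\n";   # z-nd_section
print mediawiki_anchor("Caf\x{e9}"), "\n";     # Caf.C3.A9
```

Note that the `z-` substitution replaces the first character rather than prefixing it, so `2nd section` loses its leading `2`.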

I think using the code below would keep the source HTML readable for the browser without changing the rendering:

    #use URI::Escape

    #$str = uri_escape_utf8($str);
    $str = Encode::decode_utf8($str);
    #$str =~ s/%/./g;

Don't you think? --mathdesc

Older HTML and URI specifications didn't allow Unicode in IDs or fragments, but HTML5 and IRIs do. See also i18nheadinganchors and its discussion page.

I think we should probably try to make these autogenerated IDs punctuation-independent by stripping most non-word characters, like Pandoc does: I would not expect changing `## Headings, maybe with punctuation` to `## Headings (maybe with punctuation)` to have any effect on the generated "slug" `headings-maybe-with-punctuation`. --smcv
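
A punctuation-independent slug along those lines might look roughly like the sketch below. It is loosely modelled on the rules Pandoc documents for its automatic identifiers, and is neither Pandoc's nor ikiwiki's actual code. The name `pandoc_like_slug` is hypothetical, and the samples are ASCII only; non-ASCII input would need more care with case folding.

```perl
use strict;
use warnings;

# Hypothetical helper, loosely following Pandoc's documented identifier rules.
sub pandoc_like_slug {
    my $str = shift;
    $str =~ s/[^\p{L}\p{N}\s_.-]//g;  # drop punctuation except "_", "." and "-"
    $str =~ s/^\s+|\s+$//g;           # trim surrounding whitespace
    $str =~ s/\s+/-/g;                # runs of whitespace become a single "-"
    $str = lc $str;                   # lowercase
    $str =~ s/^\P{L}+//;              # remove everything up to the first letter
    return length($str) ? $str : 'section';  # fallback when nothing is left
}

print pandoc_like_slug("Headings, maybe with punctuation"), "\n";   # headings-maybe-with-punctuation
print pandoc_like_slug("Headings (maybe with punctuation)"), "\n";  # headings-maybe-with-punctuation
print pandoc_like_slug("3. Applications"), "\n";                    # applications
print pandoc_like_slug("33"), "\n";                                 # section
```

Both spellings of the heading map to the same slug, so changing only the punctuation would not break existing links to the anchor.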