recently fixed bugs

From the source of usage:

<a href="">&#x6A;&#111;&#101;&#x79;&#64;i&#107;&#105;w&#105;&#107;&#x69;&#46;&#105;n&#x66;&#x6F;</a>

Text::Markdown obfuscates email addresses in the href= attribute and in the text. Apparently this can't be configured.

HTML::Scrubber doesn't set attr_encoded for its HTML::Parser, so the href= attribute is decoded. It seems attr_encoded is left unset for good reason: so that attributes can be sanitized easily, e.g. as in htmlscrubber with $safe_url_regexp. This apparently can't be configured either.

So I can't see an obvious solution to this. Perhaps improvements to Text::Markdown or HTML::Scrubber can allow a fix.

One question is: how useful is email obfuscation? Don't spammers use HTML parsers?

I now see this was noted in the formatting discussion, and won't/can't be fixed. So I guess this is done. --Gabriel

I've patched to prevent Text::Markdown from obfuscating the emails. The relevant commits are on the master branch of my "fork" of ikiwiki on GitHub:

  • 7d0970adbcf0b63e7e5532c239156f6967d10158
  • 52c241e723ced4d7c6a702dd08cda37feee75531


Thanks for coming up with a patch, but overriding Text::Markdown::_EncodeEmailAddress gets into its internals more than I'm comfortable with.

It would probably be best to add an option to Text::Markdown to let the email address munging be disabled. --Joey

Email obfuscation is very useful -- in practice, spammers apparently don't use HTML parsers, according to the only published study I have read (a 2003 study by the Center for Democracy and Technology cited by ).

Posted Mon Jul 21 23:25:17 2008

Now that the rst plugin uses Python3, the test should test docutils existence also with Python3:

--- rst.t.orig  2018-02-28 10:41:06.000000000 +0000
+++ rst.t   2018-03-03 17:17:23.862702468 +0000
@@ -3,7 +3,7 @@
 use strict;

-   if (system("python -c 'import docutils.core'") != 0) {
+   if (system("python3 -c 'import docutils.core'") != 0) {
        eval 'use Test::More skip_all => "docutils not available"';

Applied, thanks. --smcv

Posted Sat Mar 3 13:21:45 2018

For around 2 weeks, I've been getting an increasing quantity of nonspecific reports from users of login problems on ikiwiki sites. A few users are still logging in successfully, but it seems to be hitting many users; post volume has gone down more than holidays would explain.

It doesn't seem limited to any login method; email and password have both been said not to work. (OpenID too, but that could be an OpenID provider problem.)

I have not managed to reproduce the problem myself, using firefox, firefox-esr, or chromium. --Joey

Otto Kekäläinen described to me a problem where email login to post a comment seemed to work; it displayed the comment edit form; but posting the form went back to the login page. Cookie problem?

Ok, to reproduce the problem: log in using https. The email login link is an http link. The session cookie was set https-only. --Joey

The reason the edit form is able to be displayed is that emailauth sets up a session, in getsession(), and that $session is used for the remainder of that cgi call. But, a cookie for that session is not stored in the browser in this case. Ikiwiki does send a session cookie, but the browser seems to not let an existing https-only session cookie be replaced by a new session cookie that can be used with http. (If the emailed link, generated on https is opened in a different browser, this problem doesn't happen.) There may have been a browser behavior change here?

So what to do about this? Sites with the problem have redirect_to_https: 0 and the cgiurl is http not https. So when emailauth generates the url with cgiurl_abs, it's a http url, even if the user got to that point using https.

I suppose that emailauth could look at $ENV{HTTPS} same as printheader() does, to detect this case, and rewrite the cgiurl as a https url. Or, printheader() could just not set "-secure" on the cookie, but that does degrade security as MITM can then steal the cookie you're using on a https site.

Of course, the easy workaround, increasingly a good idea anyway, is to enable redirect_to_https.. --Joey

One of the users also reported a problem with password reset, and indeed, passwordauth is another caller of cgiurl_abs. (The only other caller, notifyemail, is probably fine.) The emailed password reset link also should be https if the user was using https. So, let's add a cgiurl_abs_samescheme that both can use. --Joey
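The idea behind such a helper can be sketched in Python (the name and logic here are illustrative, not ikiwiki's actual Perl implementation):

```python
import os
from urllib.parse import urlsplit, urlunsplit

def cgiurl_abs_samescheme(cgiurl, environ=None):
    # If the current request arrived over HTTPS (detected via the HTTPS
    # environment variable, much as printheader() does), upgrade a
    # configured http cgiurl to https so that emailed links use the same
    # scheme as the https-only session cookie.
    environ = os.environ if environ is None else environ
    parts = urlsplit(cgiurl)
    if environ.get("HTTPS", "off").lower() != "off" and parts.scheme == "http":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)
```

An https cgiurl is left untouched, so sites already configured for https behave as before.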

fixed.. At least I hope that was the thing actually preventing most of the people from logging in. --Joey

Posted Thu Jan 4 19:00:45 2018

I'm having a hard time figuring out how the creation time, modification time, internal ctime and mtime fields (in indexdb) play along with the meta directive.

In some articles I write, I hardcode the creation and modification times, because they are imported from, like so:

[[!meta  title="The cost of hosting in the cloud"]]
[[!meta  date="2018-02-281T12:00:00-0500"]]
[[!meta  updated="2018-03-12T14:22:45-0500"]]

But strangely, that article does not show up as created on February 28th: it shows up as "Created 6 days and 20 hours ago", i.e. March 12th (2018-03-12T18:29:12Z). That is strange, because that's the modification date (meta updated), not the creation date. Similarly, the "edited" date is 2018-03-19T14:47:45Z (40 minutes ago), which is more or less accurate: the page was modified some time ago, but shouldn't the meta tag override that? Note that the edited date matches the file's mtime in the source directory:

w-anarcat@marcos:~$ LANG=C stat source/blog/2018-03-12-cost-of-hosting.mdwn 
  File: source/blog/2018-03-12-cost-of-hosting.mdwn
  Size: 14022           Blocks: 32         IO Block: 4096   regular file
Device: fd05h/64773d    Inode: 7905532     Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1026/w-anarcat)   Gid: ( 1026/w-anarcat)
Access: 2018-03-19 11:19:21.237074935 -0400
Modify: 2018-03-19 10:47:45.000000000 -0400
Change: 2018-03-19 11:19:20.509065456 -0400
 Birth: -

This wouldn't be so much of a problem if that stuff were consistent, but it's not really: the article that is supposed to come next actually shows up before this one in the blog index, which is rather annoying. The order is also backwards in the RSS feed, which will possibly break some feed readers, which will miss the new article.

That newer article shows up as Created 12 days and 15 hours ago (2018-03-07T00:00:00Z) and also "edited 40 minutes ago" (2018-03-19T14:51:29Z). It has the following meta:

[[!meta title="Easy photo galleries with Sigal"]] [[!meta date="2018-03-07T00:00:00+0000"]] [[!meta updated="2018-03-19T10:26:12-0400"]]

So there the date meta tag works: the creation date is correct. But obviously it means that article comes before the other one, because the other one's date doesn't get loaded correctly.

By now, clever folks will have noticed the problem: it's with the first timestamp:

[[!meta  date="2018-02-281T12:00:00-0500"]]

There's an extra 1 in there! Obviously, February 281 is not a valid date... What happened is that I sometimes modify those dates by hand, and I sometimes mess them up. I actually messed up twice there; the original timestamps were:

[[!meta  date="2018-02-281T12:00:00-0500"]]
[[!meta  updated="2018-14-22T14:22:45-0500"]]

The error in the second one is that I put the time in place of the date (!). I must have been very distracted, but still, it's kind of hard to edit those timestamps correctly. I think the fundamental problem here is that ikiwiki doesn't say anything when those timestamps can't be parsed properly. It seems to me there should be an error somewhere: if not directly in the page, at least in the rendering logs.

So, long story short: shouldn't invalid dates in meta tags yield an error of some sort instead of being silently ignored? I spent half an hour figuring this one out, going as far as regenerating the whole wiki to try and see if it was a caching issue in indexdb...


-- anarcat

If you're reporting a bug, it would be helpful to lead with the actual bug and save the account of how you tried to debug it for later (or omit it). I've moved this from a forum post into a bug report.

The meta plugin uses Date::Parse::str2time from the TimeDate Perl library, so it has whatever error handling that has. However, historically any error has essentially been ignored, which I think is a bug.

[[!meta date]] and [[!meta updated]] have two purposes:

  • they create <meta name="date" content="xxx"> and <meta name="updated" content="xxx">
  • they change the ctime/mtime used by ikiwiki for sorting, etc.

I think the historical assumption was that even if the date can't be parsed for the second purpose, you still want the first purpose. However, you're right that this is really fragile, and the first purpose seems fairly niche anyway. In ikiwiki git master (to be released as 3.20180321 or later) I've made [[!meta date=...]] and [[!meta updated=...]] produce an error message if you don't have Date::Parse or if the date/time is malformed.
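The new behaviour amounts to something like the following Python sketch (the real code uses Perl's Date::Parse; parse_meta_date is a made-up name for illustration):

```python
from datetime import datetime

def parse_meta_date(value):
    # Parse an ISO 8601 timestamp as used in [[!meta date=...]], raising
    # an error instead of silently ignoring malformed input.
    try:
        return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S%z")
    except ValueError:
        raise ValueError("invalid meta date: %r" % value)
```

With this, a typo like "2018-02-281T12:00:00-0500" fails loudly at build time rather than quietly falling back to the file's mtime.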

In the unlikely event that someone really wants <meta name="date" content="xxx"> without parsing the date, they can still use [[!meta name="date" content="xxx"]].


In my defense, when I wrote this I didn't consider it a bug: I was assuming the problem I was seeing was just some dumb mistake I had made and, indeed, there was one such formatting mistake.

But yeah, I could have re-edited this whole thing to make it look better. I'm sorry, but I was at the end of an already long yak-shaving session...

I wasn't sure if raising an error was the right way to go, as this might break rendering for existing sites... But I'm glad you fixed this anyway!

Thank you for the super-fast response! :) I also tried updating the meta directive documentation so that it's a little more detailed about that stuff. I hope that's alright... -- anarcat

Posted Mon Mar 19 11:39:53 2018 Tags: done

What I did

Upgraded from 3.20180105 to 3.20180228 (from pkgsrc). No change to installed Text::Markdown::Discount (0.11nb4 from pkgsrc, using upstream's bundled Discount library).

What I expected to happen

Markdown-style links to continue being rendered as before.

What actually happened

Markdown-style links stopped being links. Instead, they render the part in square brackets as ordinary text.

Proximate cause

In f46e429, if I comment out the MKD_GITHUBTAGS if block, the problem goes away.

Further causes and possible solutions

Some guesses:

  • Sufficiently old versions of the Discount library may break when passed unrecognized flags, in which case ikiwiki might want to version-check before passing flags
  • The version of the Discount library bundled with upstream Text::Markdown::Discount may be extremely old, in which case pkgsrc might want to make it depend instead on an external Discount package

This appears to be because MKD_GITHUBTAGS and MKD_LATEX both have numeric values that were previously used for the internal flag IS_LABEL, which strips HTML (its value has changed a couple of times).

Having thought about this a bit, I realise I can probe for the values of these flags by processing HTML that should have different results with the flag set or unset, which would be safer than just blindly using them.
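Generically, that probing strategy looks something like this (a Python sketch; render stands in for a hypothetical (source, flags) -> HTML callable, and flag_is_supported is a made-up name):

```python
def flag_is_supported(render, flag, probe):
    # Render a probe fragment with and without the numeric flag; if the
    # output differs, the library interprets the flag as expected and it
    # is safe to use.  This avoids blindly trusting flag values that may
    # have meant something else (like IS_LABEL) in older Discount versions.
    return render(probe, flag) != render(probe, 0)
```

The probe fragments just need to be chosen so that each flag being tested produces a visible difference in the rendered HTML.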

Orthogonally, pkgsrc should probably use an up-to-date version of Discount, and we already know that Text::Markdown::Discount needs updating. --smcv

This should be fixed in current git. The mdwn module now detects what your version of Discount supports by trying several short HTML fragments that render differently under the different flags. --smcv

Posted Thu Mar 8 13:36:29 2018

We've recently updated ImageMagick in NixOS from version 6.9.7-6 to 6.9.8-4, and this change causes the ikiwiki test suite to fail in t/img.t, like so:

#   Failed test at t/img.t line 119.
#          got: 'no image'
#     expected: '10x10'

#   Failed test at t/img.t line 129.
#          got: 'no image'
#     expected: '12x12'

#   Failed test at t/img.t line 130.
#          got: 'no image'
#     expected: '16x2'

#   Failed test at t/img.t line 134.
#          got: 'no image'
#     expected: '8x8'

#   Failed test at t/img.t line 135.
#          got: 'no image'
#     expected: '4x4'

#   Failed test at t/img.t line 136.
#          got: 'no image'
#     expected: '6x6'

#   Failed test at t/img.t line 138.
#          got: 'no image'
#     expected: '11x11'

#   Failed test at t/img.t line 139.
#          got: 'no image'
#     expected: '12x12'

#   Failed test at t/img.t line 140.
#          got: 'no image'
#     expected: '13x13'
# Looks like you failed 9 tests of 62.
t/img.t ........................
Dubious, test returned 9 (wstat 2304, 0x900)
Failed 9/62 subtests

Is this a known problem, and is there maybe a fix for this issue?

This was not a known bug before your report. It looks as though every call to Image::Magick->Read(":foo.png") fails. That is (or was) ImageMagick's syntax for opening a file of unknown type without interpreting a prefix containing : as a special directive instead of as part of the filename.

Please try re-running the test with better diagnostics using commit 4ace7dbb7 and report what it says. --smcv

I see the same issue on Fedora, with ImageMagick 6.9.9-19:

#   Failed test at t/img.t line 119.
#          got: 'no image: Exception 435: unable to open image `:t/tmp/out/imgconversions/10x-redsquare.png': No such file or directory @ error/blob.c/OpenBlob/2701'
#     expected: '10x10'

So it seems that an empty coder prefix is no longer accepted. To me it looks like this commit changed the behaviour. Unfortunately, the commit message doesn't tell us the reasons behind it. The commit is included from version 6.9.8-3 on.

This should now be fixed in git and in the next release. The test failure does not indicate a loss of functionality, unless you are using uncommon image formats enabled with img_allowed_formats: [everything], which is a potential security vulnerability because it exposes the attack surface of all ImageMagick decoder modules. --smcv

Posted Thu Jun 22 09:55:52 2017


While working on Reproducible Builds for Tails, we noticed that the img plugin's output is not deterministic: PNG images embed timestamps.

The img-determinism branch in the Git repository has a fix for this problem + a new test (that fails without this change, and succeeds with the branch merged).


Thanks, merged --smcv

Posted Fri Sep 1 15:38:29 2017 Tags:

The new way to confirm ownership of a domain on Flattr is to add a meta tag to the page head. For example:

 <meta name="flattr:id" content="4j6y0v">

However, the meta directive doesn't allow setting of names with colons.

If I do this:

[[!meta  flattr:id="4j6y0v"]]

it gets rendered as:

<meta name="flattr:id=&quot;4j6y0v&quot;" content="" />

I tried a number of possibilities and they all failed to produce the correct output.

Directive syntax doesn't allow named arguments containing colons, so we would have to add a different syntax for weird names. However, we already have that:

[[!meta  name="flattr:id" content="4j6y0v"]]

This was missing from the documentation, but I have now added it. This feature was broken until 2015, but we now have an automated test to make sure it keeps working; the test includes a check for twitter:card which is essentially equivalent to what you're doing here. done --smcv

Posted Thu May 18 13:33:44 2017

Bug Description

If color and toc plugins are enabled and you use colored headers, those headers are never colored but sometimes are prefixed with text artifacts like "color: red".

Example: The header

# [[!color   foreground=red text="Testing"]]

would sometimes be seen in the toc as

color: redTesting

Reason for this behaviour is:

  1. the color plugin uses a special syntax to preserve the color through sanitize and that syntax has a plain text component.
  2. the toc plugin removes everything except plain text from headers in the toc
  3. if the toc plugin is executed before the color plugin in the format hook it sees the special syntax and clobbers the toc, otherwise it just removes the color markup

The bug here is that the color plugin's special syntax does not gracefully degrade to "render nothing", which I have now fixed by putting the color bits through a value attribute instead of character data. --smcv


There are a few possible solutions to this depending on how it should work:

  1. The easiest thing would be to just add a "last" parameter to the toc plugin format hook (or "first" to the color plugin). Result: No color in tocs at all
  2. Adding seven lines to would make it preserve ALL markup in headers, color as well as html markup or markdown (emphasize for example). Execution order of the plugins would not matter at all
  3. A bit more code would be necessary to just specifically preserve the color, but nothing else

I would propose implementing the second option because visual markers in headers are useful to convey additional information very fast and this information should be preserved in the toc. Example: Bug or task/project tracker with color conveying status of the bug or task.

This is really a separate feature request: copy non-<a> markup in headings into the TOC. I don't think this necessarily makes sense in general. In particular, any id attributes on child elements must not be passed through because that would make the ID non-unique. --smcv

It seems you can stuff anything into ordered lists (according to w3.org's documentation), so apart from stylistic reasons and the suboptimal display of links in headers (see below), I don't see any problems with markup in the toc.


This is the proposed patch for the second solution. Tested with the latest version; it works with all the markup and markdown I could think of. The only case not handled optimally is when the header is just a link and nothing else: then there is no text left for the local anchor link, and the toc links directly to a different page. Is that acceptable or not?

diff --git a/IkiWiki/Plugin/ b/IkiWiki/Plugin/
index ac07b9a..5c2b056 100644
--- a/IkiWiki/Plugin/
+++ b/IkiWiki/Plugin/
@@ -57,6 +57,7 @@ sub format (@) {
    my $startlevel=($params{startlevel} ? $params{startlevel} : 0);
    my $curlevel=$startlevel-1;
    my $liststarted=0;
+   my $headercollect=0;
    my $indent=sub { "\t" x $curlevel };
    $p->handler(start => sub {
        my $tagname=shift;
@@ -107,6 +108,7 @@ sub format (@) {
            $index.=&$indent."<li class=\"L$curlevel\">".
                "<a href=\"#$anchor\">";

+           $headercollect=1;
            $p->handler(text => sub {
                $page.=join("", @_);
                $index.=join("", @_);
@@ -117,12 +119,17 @@ sub format (@) {
                    $p->handler(text => undef);
                    $p->handler(end => undef);
+                   $headercollect=0;
+               }
+               else {
+                   $index.=join("",@_);
                $page.=join("", @_);
            }, "tagname, text");
        else {
+           $index.=$text if ($headercollect);
    }, "tagname, text");
    $p->handler(default => sub { $page.=join("", @_) }, "text");
Posted Fri Mar 11 11:29:55 2016 Tags:

Hi! While working on Reproducible Builds for Tails, we noticed that the pagestats plugin's output is not deterministic: pages that have the same number of hits (counts) are sorted in hash order.

The pagestats-determinism branch in the Git repository has a fix for this problem.
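The fix amounts to giving the sort a deterministic tie-break; a minimal Python sketch (rank_pages is a made-up name, not the plugin's actual code):

```python
def rank_pages(counts):
    # Sort pages by descending hit count, breaking ties alphabetically by
    # page name so the result no longer depends on hash iteration order.
    return sorted(counts, key=lambda page: (-counts[page], page))
```

With a tie-break in place, two builds of the same wiki always emit the pagestats list in the same order.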


Merged in 3.20161219 --smcv

Posted Sun Nov 20 03:00:35 2016 Tags: