On this wiki, editing templates/gitbranch.mdwn causes a really slow refresh, orders of magnitude slower than a full rebuild: a large number of pages depend on that template, or link to a page that embeds that template, and so on.

I suspect that, as with my optimization pass for album's benefit, the costly thing is evaluating lots of pagespecs. I'm profiling it to see whether there is any low-hanging fruit.

Easy to reproduce offline:

  • comment out the exclude option in docwiki.setup
  • /usr/bin/perl -Iblib/lib ikiwiki.in -setup docwiki.setup -rebuild
  • touch templates/gitbranch.mdwn
  • /usr/bin/perl -Iblib/lib ikiwiki.in -setup docwiki.setup -refresh

NYTProf says:

# spent 279s (237+41.8) within IkiWiki::bestlink which was called 13988949 times, avg 20µs/call:
# 13150827 times (222s+37.2s) by IkiWiki::PageSpec::match_link at line 2692, avg 20µs/call
# 829606 times (14.9s+4.51s) by IkiWiki::PageSpec::match_link at line 2687, avg 23µs/call
sub bestlink ($$) {

which is about 60% of the execution time (458s on my laptop, with NYTProf overhead).

Adding code to log each call to match_backlink indicates that a large part of the problem is that it evaluates the pagespec backlink(plugins/goodstuff) up to a million times, with various pages and locations.
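The kind of call logging described above could be sketched as follows. This is a toy, not the code that was actually added to ikiwiki: the match_backlink here is a hypothetical stand-in defined in package main, and the page names and workload are made up. It only shows one way to wrap a sub so that every call is counted per pagespec.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real matcher; it only mimics the call shape.
sub match_backlink {
    my ($page, $target, %params) = @_;
    return $page ne $target;
}

my %calls;    # "backlink(target)" => number of evaluations

{
    # Replace the sub with a wrapper that counts, then delegates.
    no warnings 'redefine';
    my $orig = \&match_backlink;
    *match_backlink = sub {
        my ($page, $target) = @_;
        $calls{"backlink($target)"}++;
        return $orig->(@_);
    };
}

# Simulated workload: every page is tested against the same pagespec
# from several locations (made-up page names).
for my $page (map { "page$_" } 1 .. 100) {
    for my $location (qw(plugins/trail plugins/album todo/example)) {
        match_backlink($page, 'plugins/goodstuff', location => $location);
    }
}

# Report the most frequently evaluated pagespecs first.
for my $spec (sort { $calls{$b} <=> $calls{$a} } keys %calls) {
    printf "%8d  %s\n", $calls{$spec}, $spec;
}
```

With a real wiki, a single pagespec dominating such a report is the sign that the same test is being re-evaluated for many page and location combinations.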


Available in a git repository branch.
Branch: smcv/ready/perf
Author: Simon McVittie

Previously, if a page like plugins/trail contained a conditional like

[[!if  test="backlink(plugins/goodstuff)" all=no]]

(which it gets via templates/gitbranch), then the conditional plugin would give plugins/trail a dependency on the pagespec "(backlink(plugins/goodstuff)) and plugins/trail". This dependency is useless: that pagespec can never match any page other than plugins/trail, and if plugins/trail has been modified or deleted, it is going to be re-rendered or deleted anyway, so there's no point in spending time evaluating match_backlink for it.

Conversely, the influences from the result were not taken into account, so plugins/trail did not have the { "plugins/goodstuff" => $DEPEND_LINKS } dependency that it should.

We should invert that, depending on the influences but not on the test.
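To illustrate why the inversion helps, here is a toy model of the two dependency styles. It is a sketch under stated assumptions, not ikiwiki's refresh code: the page lists, the DEPEND_LINKS constant (a stand-in for $DEPEND_LINKS), the %links_changed table and expensive_pagespec_match are all hypothetical. It counts how much work each style needs when none of the changed pages is relevant.

```perl
use strict;
use warnings;

use constant DEPEND_LINKS => 4;    # stand-in for ikiwiki's $DEPEND_LINKS

# Made-up wiki: 50 pages embed the template, 200 pages changed.
my @pages   = map { "plugins/page$_" } 1 .. 50;
my @changed = map { "todo/item$_" } 1 .. 200;

my $evaluations = 0;

# Hypothetical stand-in for a costly pagespec evaluation.
# "(backlink(...)) and OWNER" can only ever match OWNER itself.
sub expensive_pagespec_match {
    my ($candidate, $owner) = @_;
    $evaluations++;
    return $candidate eq $owner;
}

# Old style: each page depends on the test pagespec itself, so every
# changed page has to be matched against every such dependency.
my $rebuilt_old = 0;
for my $owner (@pages) {
    for my $candidate (@changed) {
        $rebuilt_old++ if expensive_pagespec_match($candidate, $owner);
    }
}

# New style: each page depends only on the influences of the test,
# recorded as a simple dependency that needs just a hash lookup.
my %depends_simple =
    map { $_ => { 'plugins/goodstuff' => DEPEND_LINKS } } @pages;
my %links_changed;    # pages whose links changed in this refresh (none here)
my ($lookups, $rebuilt_new) = (0, 0);
for my $owner (@pages) {
    for my $influence (keys %{ $depends_simple{$owner} }) {
        $lookups++;
        $rebuilt_new++
            if $links_changed{$influence}
            && ($depends_simple{$owner}{$influence} & DEPEND_LINKS);
    }
}

print "old: $evaluations pagespec evaluations, $rebuilt_old pages rebuilt\n";
print "new: $lookups hash lookups, $rebuilt_new pages rebuilt\n";
```

In this model both styles rebuild nothing, but the old one performs one pagespec evaluation per (dependent page, changed page) pair, while the new one performs one lookup per dependent page.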

This is at least an order of magnitude faster: when I edit the docwiki as described above, a refresh takes 37s with nytprof overhead, compared with 458s with nytprof overhead before this change. Without nytprof, that refresh takes 14s, which is once again faster than the 24s full rebuild. I didn't record how long the refresh took without nytprof before this change, but it was something like 200s.

bestlink is still the single most expensive function in this refresh at ~ 9.5s, with match_glob at ~ 5.2s as the runner-up. --smcv

merged --smcv