Syndication autodiscovery for comment feeds

A standard [[!inline ]] directive adds links to the autogenerated syndication feeds using link tags in the header:

<link rel="alternate" type="application/rss+xml" title="$title" href="$page.rss" />
<link rel="alternate" type="application/atom+xml" title="$title" href="$page.atom" />

These links aren't added to my pages that include comments, even though comments generate syndication feeds. How can I configure the comments plugin to add these links to the header? (These links are required for user-agent autodiscovery of syndication feeds.) --anderbubble

Moderating comments from the CLI

How do you do this, without using the UI in the Preferences?

Please put this info on the page. Many thanks --Kai Hendry

Why internal pages? (unresolved)

Comments are saved as internal pages, so they can never be edited through the CGI, only by direct committers.

So, why do it this way, instead of using regular wiki pages in a namespace, such as $page/comments/*? Then you could use lockedit to limit editing of comments in more powerful ways. --Joey

Er... I suppose so. I'd assumed that these pages ought to only exist as inlines rather than as individual pages (same reasoning as aggregated posts), though.

lockedit is actually somewhat insufficient, since check_canedit() doesn't distinguish between creation and editing; I'd have to continue to use some sort of odd hack to allow creation but not editing.

I also can't think of any circumstance where you'd want a user other than admins (~= git committers) and possibly the commenter (who we can't check for at the moment anyway, I don't think?) to be able to edit comments - I think user expectations for something that looks like ordinary blog comments are likely to include "others can't put words into my mouth".

My other objection to using a namespace is that I'm not particularly happy about plugins consuming arbitrary pieces of the wiki namespace - /discussion is bad enough already. Indeed, this very page would accidentally get matched by rules aiming to control comment-posting... :-) --smcv

Thinking about it, perhaps one way to address this would be to have the suffix (e.g. whether commenting on Sandbox creates sandbox/comment1 or sandbox/c1 or what) be configurable by the wiki admin, in the same way that recentchanges has recentchangespage => 'recentchanges'? I'd like to see fewer hard-coded page names in general, really - it seems odd to me that shortcuts and smileys hard-code the name of the page to look at. Perhaps I could add discussionpage => 'discussion' too? --smcv

(I've now implemented this in my branch. --smcv)

The best reason to keep the pages internal seems to me to be that you don't want the overhead of every comment spawning its own wiki page. --Joey

Formats (resolved)

The plugin now allows multiple comment formats while still using internal pages; each comment is saved as a page containing one [[!comment ]] directive, which has a superset of the functionality of format.

Access control (unresolved?)

By the way, I think that who can post comments should be controllable by the existing plugins opendiscussion, anonok, signinedit, and lockedit. Allowing posting comments w/o any login, while a nice capability, can lead to spam problems. So, use check_canedit as at least a first-level check? --Joey

This plugin already uses check_canedit, but that function doesn't have a concept of different actions. The hack I use is that when a user comments on, say, sandbox, I call check_canedit for the pseudo-page "sandbox[postcomment]". The special postcomment(glob) pagespec returns true if the page ends with "[postcomment]" and the part before (e.g. sandbox) matches the glob. So, you can have postcomment(blog/*) or something. (Perhaps instead of taking a glob, postcomment should take a pagespec, so you can have postcomment(link(tags/commentable))?)

This is why anonok_pagespec => 'postcomment(*)' and locked_pages => '!postcomment(*)' are necessary to allow anonymous and logged-in editing (respectively).
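To make the pseudo-page trick concrete, here is a minimal sketch in Python (not the plugin's actual Perl). The helper names are hypothetical, and Python's fnmatchcase is only a stand-in for ikiwiki's own glob matching, which may differ in details.

```python
from fnmatch import fnmatchcase

# Marker appended to the page name when checking permission to comment,
# as described above.
SUFFIX = "[postcomment]"


def match_postcomment(pseudo_page: str, glob: str) -> bool:
    """True if pseudo_page is '<page>[postcomment]' and <page> matches glob."""
    if not pseudo_page.endswith(SUFFIX):
        return False
    page = pseudo_page[: -len(SUFFIX)]
    # Assumption: fnmatchcase approximates ikiwiki's glob semantics.
    return fnmatchcase(page, glob)


def can_post_comment(page: str, allowed_glob: str) -> bool:
    """Hypothetical helper: build the pseudo-page, then test it."""
    return match_postcomment(page + SUFFIX, allowed_glob)
```

With this, can_post_comment("blog/first", "blog/*") is true, while a plain edit check for "blog/first" (no suffix) never matches postcomment(blog/*), which is what lets a pagespec allow commenting without allowing editing.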

I changed that to move the flag out of the page name, and into a variable that the match_postcomment function checks for. Other ugliness still applies. :-) --Joey

This is ugly - one alternative would be to add check_can() that takes a page and a verb (create, edit, rename, remove and maybe comment are the ones I can think of so far), use that, and port the plugins you mentioned to use that API too. This plugin could either call check_can("$page/comment1", 'create') or call check_can($page, 'comment').

One odd effect of the code structure I've used is that we check for the ability to create the page before we actually know what page name we're going to use - when posting the comment I just increment a number until I reach an unused one - so either the code needs restructuring, or the permission check for 'create' would always be for 'comment1' and never 'comment123'. --smcv

Now resolved, in fact --smcv
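For reference, the "increment a number until I reach an unused one" step can be sketched roughly like this. It is Python rather than the plugin's Perl, the function name is made up, and the set of existing pages is passed in instead of being looked up in the wiki.

```python
from typing import Iterable


def next_comment_page(page: str, existing: Iterable[str], prefix: str = "comment") -> str:
    """Return the first unused '<page>/<prefix><n>' name, counting n up from 1."""
    taken = set(existing)
    n = 1
    while f"{page}/{prefix}{n}" in taken:
        n += 1
    return f"{page}/{prefix}{n}"
```

The point of the objection above is visible here: the final name is only known after the loop, so a 'create' check made earlier can only be against a placeholder such as sandbox/comment1. The sketch also ignores two posts racing for the same number.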

Another possibility is to just check for permission to edit (e.g.) sandbox/comment1. However, this makes the "comments can only be created, not edited" feature completely reliant on the fact that internal pages can't be edited. Perhaps there should be an editable_pages pagespec, defaulting to '*'? --smcv

comments directive vs global setting (resolved?)

When comments have been enabled generally, you still need to mark which pages can have comments, by including the [[!comments ]] directive in them. By default, this directive expands to a "post a comment" link plus an [[!inline ]] with the comments. [This requirement has now been removed --smcv]

I don't like this, because it's hard to explain to someone why they have to insert this into every post to their blog. Seems that the model used for discussion pages could work -- if comments are enabled, automatically add the comment posting form and comments to the end of each page. --Joey

I don't think I'd want comments on every page (particularly, not the front page). Perhaps a pagespec in the setup file, where the default is "*"? Then control freaks like me could use "link(tags/comments)" and tag pages as allowing comments.

Yes, I think a pagespec is the way to go. --Joey

Implemented --smcv

The model used for discussion pages does require patching the existing page template, which I was trying to avoid - I'm not convinced that having every possible feature hard-coded there really scales (and obviously it's rather annoying while this plugin is on a branch). --smcv

Using the template would allow customising the html around the comments which seems like a good thing? --Joey

The [[!comments ]] directive is already template-friendly - it expands to the contents of the template comments_embed.tmpl, possibly with the result of an [[!inline ]] appended. I should change comments_embed.tmpl so it uses a template variable INLINE for the inline result rather than having the perl code concatenate it, which would allow a bit more customization (whether the "post" link was before or after the inline). Even if you want comments in page.tmpl, keeping the separate comments_embed.tmpl and having a COMMENTS variable in page.tmpl might be the way forward, since the smaller each template is, the easier it will be for users to maintain a patched set of templates. (I think so, anyway, based on what happens with dpkg prompts in Debian packages with monolithic vs split conffiles.) --smcv

I've switched my branch to use page.tmpl instead; see what you think? --smcv

Raw HTML (resolved?)

Raw HTML was not initially allowed by default (this was configurable).

I'm not sure that raw html should be a problem, as long as the htmlscrubber and htmlbalance plugins are enabled. I can see filtering out directives, as a special case. --Joey

Right, if I sanitize each post individually, with htmlscrubber and either htmltidy or htmlbalance turned on, then there should be no way the user can forge a comment; I was initially wary of allowing meta directives, but I think those are OK, as long as the comment template puts the [[!meta author]] at the end. Disallowing directives is more a way to avoid commenters causing expensive processing than anything else, at this point.

I've rebased the plugin on master, made it sanitize individual posts' content and removed the option to disallow raw HTML. Sanitizing individual posts before they've been htmlized required me to preserve whitespace in the htmlbalance plugin, so I did that. Alternatively, we could htmlize immediately and always save out raw HTML? --smcv

There might be some use cases for other directives, such as img, in comments.

I don't know if meta is "safe" (i.e., guaranteed to be inexpensive and not allow users to do annoying things) or if it will continue to be in the future. It's hard to predict, really; all that can be said with certainty is that all directives will continue to be inexpensive and safe enough that it's sensible to allow users to (ab)use them on open wikis. --Joey

I have a test ikiwiki setup somewhere to investigate adopting the comments plugin. It is set up with no auth enabled and I got hammered with a spam attack over the last weekend (predictably). What surprised me was the scale of the attack: ikiwiki eventually triggered OOM and brought the box down. When I got it back up, I checked out a copy of the underlying git repository, and it measured 280M in size after being packed. Of that, about 300K was data prior to the spam attack, so the rest was entirely spam text, compressed via git's efficient delta compression.

I had two thoughts about possible improvements to the comments plugin in the wake of this:

  • comment pagination - there is a hard-to-define upper limit on the number of comments that can be appended to a wiki page whilst the page remains legible. It would be useful if comments could be paginated into sub-pages.

  • crude flood control - aside from spam attacks (and I am aware of blogspam), people can crap flood or just aggressively flame repeatedly. An interesting prevention measure might be to not let an IP post more than 3 sequential comments to a page, or to the site, without at least one other comment being interleaved. I say 3 rather than 2 since correction follow-ups are common.

-- Jon
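The flood-control rule suggested above could look something like the following. This is only a sketch of the idea in Python, not anything the plugin implements; the function name, the list of poster IPs and the max_run parameter are all assumptions for illustration.

```python
from typing import Sequence


def may_post(previous_ips: Sequence[str], ip: str, max_run: int = 3) -> bool:
    """Decide whether ip may add a comment.

    previous_ips lists the poster IP of each existing comment, oldest first.
    The post is refused if the most recent max_run comments all came from ip,
    i.e. no other commenter has been interleaved.
    """
    run = 0
    for earlier in reversed(previous_ips):
        if earlier != ip:
            break  # someone else posted, so the run ends here
        run += 1
    return run < max_run
```

The same check works per page or per site, depending on which list of comments is passed in. It does nothing against a spammer who rotates addresses, so it would complement blogspam rather than replace it.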

Comment threads

Any thoughts about implementing some simple threading in the comments?

Or at least a reply functionality that quotes the subject/contents?

-- iustin

Disabling certain formats for comments

It seems that the comments plugin allows using all enabled formats and there is no way to disable some of them. For my blog, I want to use additional formats for writing posts but I do not want commenters to use those formats because it would be a security problem.

Any suggestions or hints how to implement this?

-- wentasah

I've implemented this. See Restrict formats allowed for comments. --wentasah

URLs in anonymous-style comments committed directly via VCS

Available in a git repository branch.
Branch: schmonz/comments-anonymous-url-vcs
Author: schmonz

I recently imported my site from Textpattern into ikiwiki (using an ikiwiki-import program that may someday make its way into ikiwiki proper). Textpattern's comments behave much like ikiwiki's anonymous comments, so piping each imported comment through ikiwiki-comment and regenerating the site with comments_allowauthor=1 preserved almost all the information.

What's missing: if a comment directive has a url param, I'd expect the rendered page to href the author's name to that URL. This works as I expect for new comments added via the CGI, but not for imported comments added via the VCS directly.

My branch has a fix that doesn't break t/comments.t, doesn't appear to break anonymous or signed-in comments via the CGI in any way I've tried, and lets me render my (incredibly valuable ;-) imported blog comments with full fidelity. OK to commit?

Ship it. --smcv

Thanks, have done. --schmonz