Zoned ikiwiki

The idea

The idea is to have a single ikiwiki behave as a dynamic private wiki in a specified area and as a more static wiki in the public zone.

Use cases

This can be more or less difficult depending on the use case.

Purely static public zone with a single, controlled-access inward zone

For this case, only a known set of people are authorized to see the inward zone or edit anything. Everybody else sees only the public zone. This use case is mostly easy to handle now, as long as access to things like the recentchanges page and repository browser is not granted for the public zone. In particular, the features that allow information exposure via edit access are not of concern in this case.

Static public zone, more than one controlled inward zone

In this case, the known, controlled set of people with special access are divided into groups with access to different (or overlapping) zones. The public still sees only a static zone.

Here, some of the harder issues, like information disclosure via edit access, do apply, but only to members of the known, controlled groups. How much of a problem that is depends on how sensitive the information is that members of one group might reveal from another group's zone. The rcs logs will show when a page has been edited to contain an inline directive or other trick to reveal information, so if it is enough to treat the trusted users' conduct as a management issue ("don't do that, please") then the risks can be acceptable in this case.

Public zone allows contribution/editing by external users

This case is the most difficult to cover at present, and probably shouldn't be attempted without solutions to most or all of the obstacles identified here.

Implementation techniques

Edit control by user and pagespec: lockedit

This works today, using the lockedit plugin. Because the user predicate can be part of a PageSpec, this is all we need to flexibly control edit access using any authentication method ikiwiki supports.
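
As a rough illustration, a setup-file fragment along these lines could restrict editing of one zone to two named users. The zone name ours/* and the user names are placeholders; locked_pages is the lockedit plugin's setting, and pages matching it can be edited only by admins.

```yaml
# ikiwiki.setup fragment (YAML format); zone and user names are hypothetical.
# Pages under ours/ are locked for everyone except alice and bob (and admins).
locked_pages: 'ours/* and !user(alice) and !user(bob)'
```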

View control in the http server

This is already more or less possible, for example with httpauth, .htaccess files and a suitable httpauth_pagespec.
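
A minimal sketch of the server side, assuming Apache with HTTP Basic authentication; the realm name and password file path are placeholders, not values from any real setup:

```apache
# .htaccess at the top of the private zone (hypothetical path and realm)
AuthType Basic
AuthName "Private zone"
AuthUserFile /path/to/htpasswd
Require valid-user
```

On the ikiwiki side, httpauth_pagespec would then name the same zone, so that ikiwiki relies on the server's authentication for those pages.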

Drawbacks: this might be fiddly to configure and require maintaining two different user/password databases (the http server's and native ikiwiki signin), or be impractical if ikiwiki is using an authentication method not natively supported in the http server (e.g., OpenID).

View control in ikiwiki CGI

By requiring access to private zones to go through an ikiwiki CGI wrapper, any ikiwiki-supported authentication method can be used, and the accessible pages can be specified using the user predicate with PageSpecs, just as with the lockedit plugin.

The signinview plugin implements this idea, using very simple configuration that is possible even in shared-hosting environments without complete access to the http server configuration, as long as .htaccess files or their equivalent can be created. The top directory of a private zone needs only a .htaccess file with Deny from All or Require all denied (or other equivalent directive for the http server in use), and a 403 error handler of {$cgiurl}?do=view.
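
For Apache 2.4, the .htaccess described above might look like the following sketch. The path /ikiwiki.cgi is only a stand-in for the wiki's actual cgiurl.

```apache
# .htaccess at the top of a private zone (Apache 2.4 syntax)
Require all denied
# Hand refused requests to the ikiwiki CGI; replace /ikiwiki.cgi with the real cgiurl
ErrorDocument 403 /ikiwiki.cgi?do=view
```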

The plugin emits response headers intended to discourage non-private caches from retaining the retrieved content. (They are already supposed to avoid caching any response to a request with an Authorization header, but this plugin can be used with any ikiwiki-supported auth method, not all of which require that header.)

A plugin like pagespec alias can be very useful for defining a group of authorized users:

us: user(alice) or user(bob) or user(clotaldo)

so that zone access can be a simple PageSpec:

us() and ours/*

Drawbacks: The private zones no longer reap all the benefits of a static wiki generator, as a (fairly heavy) ikiwiki CGI wrapper must be started for each access. (On the other hand, all it needs to do after confirming authorization is basically cat the statically-generated page with appropriate response headers, keeping the code simple and easy to audit.)

This can be adequate for a case where the static, public zone could receive a lot of traffic, with the private zone(s) accessed only by a known small group of people.

View control with a FastCGI Authorizer

A plugin implementing a FastCGI Authorizer could provide the same benefits as signinview (any ikiwiki-supported auth method, simple zone definition with PageSpecs) with less overhead per access. It would also be simpler than signinview by leaving it as the http server's responsibility to generate the proper headers and serve the content.

Caching proxies are already supposed to avoid caching any response to a request that included an Authorization header. For some ikiwiki-supported auth methods, that header might not be needed in the request, and care may be needed to configure the server to emit other necessary response headers to discourage caching of content from a private zone.
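
One possible way to have the server emit such a header, assuming Apache with mod_headers enabled. The choice of Cache-Control: private is an assumption about what would be suitable here, not something an existing plugin prescribes.

```apache
# .htaccess in a private zone; requires mod_headers
# "private" tells shared caches not to store the response
Header set Cache-Control "private"
```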

Drawbacks: Not yet implemented; someone would have to do it. It's not clear what code changes FastCGI would require in ikiwiki. An Authorizer seems like a good place to start because of its limited, simple functionality, but as it could make use of any ikiwiki-supported auth method, evaluate PageSpecs, and the like, it could still run a non-trivial amount of the code.

Obstacles

A number of ikiwiki features aren't (yet) designed with zoning in mind, and it will take some effort both to identify them all and to think through how they could be addressed. This section invites brainstorming of both kinds. This might eventually merit a separate page, Zoned ikiwiki obstacles, but I'll begin it here.

Note that not all of these issues will be problems for all zoned ikiwiki use cases.

Leakage of page existence by do=goto

An unauthorized client can use a do=goto request to find out whether a page exists (the client is forbidden to view it) or not (the client is forbidden to create it).

In signinview this is handled by hooking cgi first and checking for goto and a non-public page. If the requested page (existing or not) matches the public_pages PageSpec, it is handed off for the goto plugin to handle normally. Otherwise, the do parameter is changed to signingoto so the goto plugin's cgi hook will not handle it, and the sessioncgi hook takes care of it when the user's identity is available.

Backlinks

The problem arises when a private page links to a public page: a backlink is generated on the public page, pointing to the private page.

As noted in per page ACLs, users navigating via backlinks will then frequently hit HTTP 401 responses, which deters browsing and also produces false positives for an admin watching the logs.

One can radically disable the backlinks feature, but that loses the neat backlink navigation that is really good to have in both areas.

A less radical way to prevent just this backlink leak would be a privatebacklinks config setting, with a patch like the one below.

Comments are welcome.

mathdesc

diff --git a/IkiWiki.pm b/IkiWiki.pm
--- a/IkiWiki.pm
+++ b/IkiWiki.pm
@@ -294,6 +294,14 @@ sub getsetup () {
                safe => 1,
                rebuild => 1,
        },
+       privatebacklinks => {
+               type => "pagespec",
+               example => "",
+               description => "PageSpec controlling which backlinks are private (ie users/*)",
+               link => "ikiwiki/PageSpec",
+               safe => 1,
+               rebuild => 1,
+       },
        hardlink => {
                type => "boolean",
                default => 0,
diff --git a/IkiWiki/Render.pm b/IkiWiki/Render.pm
--- a/IkiWiki/Render.pm
+++ b/IkiWiki/Render.pm
@@ -52,7 +52,8 @@ sub backlinks ($) {
                        $p_trimmed=~s/^\Q$dir\E// &&
                        $page_trimmed=~s/^\Q$dir\E//;
                               
-               push @links, { url => $href, page => pagetitle($p_trimmed) };
+               push @links, { url => $href, page => pagetitle($p_trimmed) }
+               unless defined $config{privatebacklinks} && length $config{privatebacklinks} && pagespec_match($p, $config{privatebacklinks}) && !pagespec_match($page, $config{privatebacklinks}) ;
        }
        return @links;
 }
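
With the patch above applied, the new setting could be used roughly as follows. The zone name ours/* is a placeholder. As the patch is written, a backlink is omitted when the linking page matches the PageSpec and the page being rendered does not.

```yaml
# ikiwiki.setup fragment; only meaningful with the privatebacklinks patch applied.
# Backlinks from pages under ours/ are not shown on pages outside ours/.
privatebacklinks: 'ours/*'
```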

In use cases where the main concern about backlinks is only the bad user experience when links are shown that lead to access denial when clicked, a workable solution could be to make the backlinks div invisible in local.css.
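
A sketch of such a rule, assuming the page template in use gives the backlinks block the id backlinks. The links remain in the HTML source, so this only tidies the user experience and does not stop the leak.

```css
/* local.css: hide the backlinks block (assumes id="backlinks" in the page template) */
#backlinks {
	display: none;
}
```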

recentchanges page

An accessible recentchanges page can include links to changes to pages that should not be accessible. Even if the links cannot be followed, the existence of the pages and their edit history are leaked. If rcs integration is configured, those links on the recentchanges page can leak complete contents through the rcs browser.

It can be helpful to generate separate recentchanges pages for different zones. The recentchanges plugin already allows this--a recentchanges page can be created anywhere, just by using the recentchanges directive with the right PageSpec for the zone it should cover--except that it cannot yet be configured to generate a different recentchanges link destination into pages in different zones. So, it would be helpful if its configuration could allow multiple recentchangespage values, paired with PageSpecs for the pages on which they should be used.

rcs browser

If the repository browser is accessible, potentially all content can be exposed. Even if links to the repository browser are not generated into public wiki pages, if a user can obtain or guess the repository browser URL and construct arbitrary requests, information can be revealed.

Solutions could involve authnz features of the revision control systems themselves and their associated repository browsers; for example, svn supposedly has such features, and recent versions of viewvc supposedly honor them. But such features may not be available for every rcs, and where they are available, they'll have to be configured separately and differently from ikiwiki itself. They might not support the same auth methods (e.g. OpenID) being used by the wiki itself.

Another approach would be for ikiwiki's own rcs plugin to generate a CGI wrapper that invokes the repository browser CGI (which itself would not be made executable via http request). The historyurl and diffurl would then refer to this wrapper. (In fact, they would not have to be specified in the config file, as the plugin would know where it generated them. Instead, what would need to be specified would be the filesystem path for the rcs browser being wrapped.) The wrapper could dissect the request parameters, identify the pages being accessed, and subject them to the same accessibility tests used for the wiki. The rcs browser itself would need to be configured to use the wrapper URL in all its generated links.

This might not be very hard to do with gitweb as it is already implemented in Perl. The wrapper could probably import it and use its already-supplied routines to parse the request into the affected file names, and probably complete the whole request without a second exec. Other rcs backends might or might not be as easy.

Search

If search is enabled, private content is indexed and searchable by the public.

Information leaks allowed by edit access

Have you considered all the ways that anyone with edit access to the public wiki could expose information from the private wiki? For example, you could inline all the private pages into a public page. --Joey

Many ikiwiki features could give information exposure opportunities to someone with edit access. The list here is surely incomplete, and would take a purposeful review of the code and plugins (including third-party plugins) to complete.

Note that, with the right controls on who can edit the pages and insert the directives, the fact that a public page can inline stuff from private pages can be very useful. Public pages can be created that are populated by selected content that's maintained on the private side. The if directive can be used in the private content to control what parts can be inlined into public pages. All of this is in ikiwiki today.