We're accumulating a significant number of bugs related to cross-linking between the content and the CGI not being as relative as we would like. This is an attempt to design a solution for them all in a unified way, rather than solving one bug at the cost of exacerbating another. --smcv
Terminology
Absolute: starts with a scheme, like http://example.com/ikiwiki.cgi, https://www.example.org/
Protocol-relative: starts with //, like //example.com/ikiwiki.cgi
Host-relative: starts with /, like /ikiwiki.cgi
Relative: starts with neither / nor a scheme, like ../ikiwiki.cgi
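To make the four terms concrete, here is a minimal sketch of a classifier. It is written in Python for illustration only (ikiwiki itself is Perl), and the function name classify_uri_reference is hypothetical, not something that exists in ikiwiki.

```python
from urllib.parse import urlsplit


def classify_uri_reference(ref: str) -> str:
    """Classify a URI reference using the four terms defined above.

    Hypothetical helper for illustration; not part of ikiwiki.
    """
    if urlsplit(ref).scheme:
        return "absolute"           # has a scheme, e.g. http: or https:
    if ref.startswith("//"):
        return "protocol-relative"  # inherits only the scheme
    if ref.startswith("/"):
        return "host-relative"      # inherits scheme and host
    return "relative"               # inherits scheme, host and directory


for example in ("http://example.com/ikiwiki.cgi",
                "//example.com/ikiwiki.cgi",
                "/ikiwiki.cgi",
                "../ikiwiki.cgi"):
    print(example, "is", classify_uri_reference(example))
```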
What we need
Static content must be able to link to other static content
Static content must be able to link to the CGI
CGI-generated content must be able to link to arbitrary static content (it is sufficient for it to be able to link to the "root" of the destdir)
CGI-generated content must be able to link to the CGI
Constraints
URIs in RSS feeds must be absolute, because feed readers do not have any consistent semantics for the base of relative links
If we have a <base href> then HTML 4.01 says it must be absolute, although HTML 5 does relax this by defining semantics for a relative <base href> - it is interpreted relative to the "fallback base URL" which is the URL of the page being viewed (trouble with base in search, preview base url should be absolute)
It is currently possible for the static content and the CGI to be on different domains, e.g. www.example.com vs. cgi.example.com; this should be preserved
It is currently possible to serve static content "mostly over HTTP" (i.e. advertise a http URI to readers, and use a http URI in RSS feeds etc.) but use HTTPS for the CGI
If the static content is served over HTTPS, it must refer to other static content and the CGI via HTTPS (to avoid mixed content, which is a vulnerability); this may be either absolute, protocol-relative, host-relative or relative
If the CGI is served over HTTPS, it must refer to static content and the CGI via HTTPS; again, this may be either absolute, protocol-relative, host-relative or relative (Protocol relative urls for stylesheet linking)
Because reverse proxies and w3mmode exist, it must be possible to configure ikiwiki to not believe the HTTPS, etc., CGI variables, and force a particular scheme or host (W3MMode still uses http://localhost?, Using reverse proxy; base URL is http instead of https, Dot CGI pointing to localhost. What happened?)
For relative links in page-previews to work correctly without having to have global state or thread state through every use of htmllink etc., cgitemplate needs to make links in the page body work as if we were on the page being previewed.
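The reverse-proxy constraint could look roughly like the following. This is a Python sketch for illustration, not ikiwiki's Perl. The force_scheme and force_host parameters are hypothetical names for whatever configuration ends up being used. HTTPS, HTTP_HOST and SERVER_NAME are the usual CGI environment variables; treating any value of HTTPS other than empty or "off" as meaning HTTPS is an assumption here.

```python
def effective_origin(environ, force_scheme=None, force_host=None):
    """Return the (scheme, host) the CGI should act as if it was reached by.

    force_scheme and force_host are hypothetical settings: when given,
    they override whatever the CGI environment claims, which is what a
    site behind a reverse proxy (or using w3mmode) would need.
    """
    https = environ.get("HTTPS", "").lower()
    seen_scheme = "https" if https not in ("", "off") else "http"
    seen_host = environ.get("HTTP_HOST") or environ.get("SERVER_NAME", "localhost")
    return (force_scheme or seen_scheme, force_host or seen_host)


# Behind a proxy the CGI only sees the internal hop, so the admin forces
# the public scheme and host instead of trusting the environment.
print(effective_origin({"HTTP_HOST": "localhost:8080"},
                       force_scheme="https",
                       force_host="wiki.example.com"))
```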
"Would be nice"
In general, the more relative the better
schmonz wants to direct all CGI pageviews to https even if the visitor comes from http (but this can be done at the webserver level by making http://example.com/ikiwiki.cgi a redirect to https://example.com/ikiwiki.cgi, so is not necessarily mandatory)
smcv has some sites that have non-CA-cartel-approved certificates, with a limited number of editors who can be taught to add SSL policy exceptions and log in via https; anonymous/read-only actions like do=goto should not go via HTTPS, since random readers would get scary SSL warnings (want to avoid ikiwiki using http or https in urls to allow serving both, CGI script and HTTPS)
It would be nice if the CGI did not need to use a <base> so that we could use host-relative URI references (/sandbox/) or scheme-relative URI references (//static.example.com/sandbox/) (see trouble with base in search)
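As an illustration of "the more relative the better", the sketch below shortens an absolute target as far as it safely can for a page served at a given URL. It is Python for illustration only; most_relative is a hypothetical name, it stops at host-relative rather than computing a fully relative path, and it drops any fragment.

```python
from urllib.parse import urlsplit


def most_relative(base: str, target: str) -> str:
    """Shorten absolute `target` for use on a page served at `base`.

    Hypothetical helper: both arguments are assumed to be absolute URLs.
    """
    b, t = urlsplit(base), urlsplit(target)
    rest = t.path or "/"
    if t.query:
        rest += "?" + t.query
    if b.scheme != t.scheme:
        return target                  # schemes differ: must stay absolute
    if b.netloc != t.netloc:
        return "//" + t.netloc + rest  # same scheme only: protocol-relative
    return rest                        # same scheme and host: host-relative


print(most_relative("https://example.com/ikiwiki.cgi",
                    "https://example.com/sandbox/"))
print(most_relative("https://cgi.example.com/ikiwiki.cgi",
                    "https://static.example.com/sandbox/"))
```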
As a consequence of the "no mixed content" constraint, I think we can make some assumptions:
if the cgiurl is http but the CGI discovers at runtime that it has been reached via https, we can assume that the https equivalent, or a host- or protocol-relative URI reference to itself, would work;
if the url is http but the CGI discovers at runtime that it has been reached via https, we can assume that the https equivalent of the url would work
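These two assumptions amount to a one-way upgrade, which might be sketched like this. It is Python for illustration only; upgrade_if_reached_via_https is a hypothetical name, and how the CGI decides that it was reached via https is left to the caller.

```python
from urllib.parse import urlsplit, urlunsplit


def upgrade_if_reached_via_https(configured: str, reached_via_https: bool) -> str:
    """Return the https equivalent of an http URL when we were reached via https.

    Hypothetical helper: `configured` stands for the url or cgiurl from
    the setup file. Anything that is not plain http is returned unchanged.
    """
    parts = urlsplit(configured)
    if reached_via_https and parts.scheme == "http":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)


print(upgrade_if_reached_via_https("http://example.com/ikiwiki.cgi", True))
print(upgrade_if_reached_via_https("http://example.com/ikiwiki.cgi", False))
```

The sketch never downgrades https to http: the assumptions above only cover the case where the configured URL is http and the request arrived via https.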
In other words, best-practice would be to list your url and cgiurl in the setup file as http if you intend that they will most commonly be accessed via http (e.g. the "my cert is not CA-cartel approved" use-case), or as https if you intend to force accesses into being via https (the "my wiki is secret" use-case).
Regression test
I've added a regression test in t/relativity.t. We might want to consider dropping some of it or skipping it unless a special environment variable is set once this is all working, since it's a bit slow.
--smcv
Remaining bugs
Arguable
Configure the url and cgiurl to both be https, then access the CGI via a non-https address. The stylesheet is loaded from the http version of the static site, but maybe it should be forced to https?
Configure url = "http://static.example.com/", cgiurl = "http://cgi.example.com/ikiwiki.cgi" and access the CGI via staging.example.net. Self-referential links to the CGI point to cgi.example.com, but maybe they should point to staging.example.net?
(possibly incomplete, look for TODO and ??? in relativity.t)