SoC Proposal for Implementation of a File Upload Interface

I intend to extend Ikiwiki such that it accepts file uploads, subject to access control, and integrates said uploads with the interface. What follows is a very rough draft of my thoughts on the matter. Comments are welcomed, either on the discussion page or via e-mail (me at inelegant.org).

I suggest we adopt the Trac/Wikipedia concept of "attaching" files to a given page. In this scenario, each page for which file upload has been enabled, will sport an <input type="file"> construct along with an Attach button. Upon successfully attaching a file, its name will be appended to an "Attachments" list at the bottom of the page. The names in the list will link to the appropriate files. Architecturally, this means that after a file has been attached to a page, the page will have to be rebuilt.

Files will be uploaded in a background thread via XMLHTTPRequest. This allows us to provide visual indicators of upload status, support multiple uploads at a time, and reduces the amount of template code we must write.

After an upload has been started, another text entry field will be rendered, enabling the user to commence a new upload.

Metadata

It is necessary to associate metadata with the uploaded file. The IkiWiki index file already associates rudimentary metadata with the files it renders, but there has been interest from multiple sources in creating a general purpose metadata layer for IkiWiki which supports the association of arbitrary metadata with a file. This work is outside the scope of the file upload feature, but I will attempt a basic implementation nevertheless.

A key decision involves the storage of the metadata. IkiWiki must be as usable from the CLI as from the web, so the data being stored must be easily manipulatable using standard command line tools. It is infeasible to expect users to embed arbitrary metadata in arbitrary files, so we will use a plaintext file consisting of name-value pairs for recording metadata. Each file in the IkiWiki source directory may have its own metadata file, but they are always optional. The metadata for a file, F, will be stored in a file named F.meta. For example, the metadata for this page would be in todo/fileupload/soc-proposal.mdwn.meta.

For instance: cat "license: gpl\n" >> software.tar.gz.meta. It would be trivial to distribute a tool with IkiWiki that made this even easier, too, e.g. ikiwiki-meta license gpl software.tar.gz. An open issue is how this metadata will be added from the web interface.

For source files, this approach conflicts with the meta plugin, so there needs to be some integration between the two.

In keeping with the current architecture of IkiWiki, we can make this metadata available to plugins by using a hash keyed on the filename, e.g. $metadata{'software/software.tar.gz'}{'license'} eq 'gpl'.

In general, we will only use the .meta files to store data that cannot be automatically determined from the file itself. For uploaded files this will be probably include the uploader's IP address, for example.

Configuration

In fileupload it is specified that the upload feature must be highly configurable. Joey suggests the use of the preferences page to specify some of these options, but it is not yet clear which ones are important enough to expose in this way. All options will be configurable via the config file.

We will (or do) support configuring:

  • The allowable MIME types of uploaded files.
  • The maximum size of the uploaded file.
  • The maximum size of the upload temporary directory.
  • The maximum size of the source directory.
  • The IP addresses allowed to upload.
  • The pages which can have files attached to them.
  • The users who are allowed to upload.
  • The users who are prohibited from uploading.

etc.

Operation

  1. File upload forms will be rendered on all wiki pages which have been allowed in the global configuration file. By default, this will probably be none of them.
  2. The forms will interface with ikiwiki.cgi, passing it the filename, the file contents, and the name of the page to which it is being attached.
  3. The CGI will consult the config file and any embedded pagespecs in turn, to determine whether the access controls permit the upload. If they don't, an error message will be displayed to the user, and the process will abort.
  4. The uploaded file will be saved to a temporary upload directory.
  5. Access controls which work on the entire file will be ran. The process will abort if they fail, or if the upload appears to have been aborted. Before the process is aborted, the file will be deleted from the temp directory.
  6. The file is moved to the appropriate directory.
  7. The $file.meta file will be created and populated.
  8. The uploaded file will be committed to the RCS.
  9. .ikiwiki/index will be modified to reflect the new upload (as above).
  10. The page to which the file is attached (and any other affected pages) will be regenerated.

--Ben