IIUC, the current version of HTML::Scrubber only allows the object tag to be either enabled or disabled entirely. However, while object can be used to add code to a document (which is indeed a potential security hole), reading “Objects, Images, and Applets in HTML documents” reveals that not all objects are “dangerous”, only those having the following attributes:

classid     %URI;          #IMPLIED  -- identifies an implementation --
codebase    %URI;          #IMPLIED  -- base URI for classid, data, archive--
codetype    %ContentType;  #IMPLIED  -- content type for code --
archive     CDATA          #IMPLIED  -- space-separated list of URIs --

It seems that the following attributes are, OTOH, safe:

declare     (declare)      #IMPLIED  -- declare but don't instantiate flag --
data        %URI;          #IMPLIED  -- reference to object's data --
type        %ContentType;  #IMPLIED  -- content type for data --
standby     %Text;         #IMPLIED  -- message to show while loading --
height      %Length;       #IMPLIED  -- override height --
width       %Length;       #IMPLIED  -- override width --
usemap      %URI;          #IMPLIED  -- use client-side image map --
name        CDATA          #IMPLIED  -- submit as part of form --
tabindex    NUMBER         #IMPLIED  -- position in tabbing order --

Should the former attributes be scrubbed while the latter are left intact, the use of the object tag would seemingly become safe.
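To make the proposal concrete, here is a minimal sketch of the attribute filtering described above. It is written in Python purely for illustration and is not HTML::Scrubber's API; the names `UNSAFE_OBJECT_ATTRS`, `SAFE_OBJECT_ATTRS` and `scrub_object_attrs` are hypothetical, and the two sets simply mirror the lists quoted from the specification.

```python
# Attributes of the HTML 4 object element that reference code (to be scrubbed).
UNSAFE_OBJECT_ATTRS = {"classid", "codebase", "codetype", "archive"}

# Attributes listed above as seemingly safe (to be kept).
SAFE_OBJECT_ATTRS = {
    "declare", "data", "type", "standby", "height",
    "width", "usemap", "name", "tabindex",
}


def scrub_object_attrs(attrs):
    """Return a new dict holding only the whitelisted object attributes.

    Attribute names are compared case-insensitively and emitted in
    lower case. Anything not explicitly whitelisted is dropped, which
    covers the unsafe set and any unknown attribute as well.
    """
    kept = {}
    for name, value in attrs.items():
        key = name.lower()
        if key in SAFE_OBJECT_ATTRS:
            kept[key] = value
    return kept
```

Filtering by whitelist rather than by the unsafe list means an attribute that neither list mentions is dropped too, which seems the more cautious default.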

Note also that allowing object (either restricted in such a way or not) automatically solves the svg issue.

For Ikiwiki, it may also be nice to be able to restrict the URIs (as taken by the data and usemap attributes) to, say, relative and data: (as per RFC 2397) ones, though that requires some more consideration.
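A rough sketch of such a URI restriction follows, again in Python and only as an illustration. The function name `is_allowed_uri` is hypothetical, and the extra rejections of backslashes and control or whitespace characters are my own cautious assumptions, not something the RFCs require.

```python
from urllib.parse import urlsplit


def is_allowed_uri(uri):
    """Accept only relative references and data: URIs.

    Sketch only: a real scrubber would need more care, for example
    restricting which media types a data: URI may carry.
    """
    uri = uri.strip()
    # Assumption: reject backslashes and embedded whitespace or control
    # characters, since browsers may normalise them in surprising ways.
    if "\\" in uri or any(ch.isspace() or ord(ch) < 0x20 for ch in uri):
        return False
    parts = urlsplit(uri)
    if parts.scheme == "data":
        return True
    # A relative reference has neither a scheme nor a network location;
    # this also rejects protocol-relative URIs such as //host/path.
    return parts.scheme == "" and parts.netloc == ""
```

With this check, `images/diagram.png` and `data:image/png;base64,AAAA` would pass, while `http://example.com/x.swf`, `//example.com/x.swf` and `javascript:alert(1)` would not.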

— Ivan Shmakov, 2010-03-12Z.

wishlist

SVG can contain embedded javascript.

Indeed.

So, a more general tool (XML::Scrubber?) will be necessary to refine both XHTML and SVG.

… And to leave MathML as is (?).

— Ivan Shmakov, 2010-03-12Z.

The spec that you link to contains examples of objects that contain python scripts, Microsoft OLE objects, and Java. And then there's flash. I don't think ikiwiki can assume all the possibilities are handled securely, particularly WRT XSS attacks. --Joey

I've scanned over all the object examples in the specification and all of those that hold references to code (as opposed to data) have a distinguishing classid attribute.

While I won't assert that it's impossible to reference code with data (and, thanks to application/xhtml+xml and image/svg+xml, it is indeed possible), throwing away all of the “insecure” attributes listed above, together with limiting the possible URIs (i.e., only local and certain data: ones for data and usemap), should make object almost as harmless as, say, img.

But with local data, one could not embed youtube videos, which surely is the most obvious use case?

Allowing a “remote” object to render on one's page is a security issue by itself. Though, of course, having an explicit whitelist of URIs may make this issue more tolerable. — Ivan Shmakov, 2010-03-12Z.
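For what it's worth, an explicit whitelist could be as simple as the following Python sketch. The host set and the name `is_whitelisted` are hypothetical placeholders; nothing here is an existing ikiwiki or HTML::Scrubber option.

```python
from urllib.parse import urlsplit

# Hypothetical whitelist; a wiki admin would presumably configure this.
ALLOWED_HOSTS = {"www.youtube.com"}


def is_whitelisted(uri, allowed_hosts=ALLOWED_HOSTS):
    """Accept only http(s) URIs whose host is exactly on the whitelist."""
    parts = urlsplit(uri)
    # urlsplit lower-cases the scheme, and .hostname is lower-cased and
    # excludes any userinfo or port, so user@host tricks do not match.
    return parts.scheme in ("http", "https") and parts.hostname in allowed_hosts
```

Matching the host exactly, rather than by prefix or substring, keeps lookalikes such as `www.youtube.com.evil.example` out.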

Note that youtube embedding uses an object element with no classid. The swf file is provided via an enclosed param element. --Joey

I've just checked a random video on YouTube and I see that the .swf file is provided via an enclosed embed element. Whether to allow those or not is a different issue. — Ivan Shmakov, 2010-03-12Z.

(Though, being restricted in such a way, it certainly won't solve the SVG problem.)

Of the remaining issues, I can only think of a recursive object, i.e., one that references its container document.

— Ivan Shmakov, 2010-03-12Z.

See also