Entire novels have been written about the security considerations that apply to HTML
documents. Many are listed in this document, to which the reader is referred for more details.
Some general concerns bear mentioning here, however:
HTML is a scripted language, and has a large number of APIs (some of which are described in
this document). Script can expose the user to potential risks of information leakage, credential
leakage, cross-site scripting attacks, cross-site request forgeries, and a host of other
problems. While the designs in this specification are intended to be safe if implemented
correctly, a full implementation is a massive undertaking and, as with any software, user agents
are likely to have security bugs.
Even without scripting, there are specific features in HTML that, for historical reasons,
are required for broad compatibility with legacy content but expose the user to unfortunate
security problems. In particular, the img element can be used in conjunction with
some other features as a way to effect a port scan from the user's location on the Internet.
This can expose local network topologies that the attacker would otherwise not be able to
determine.
HTML relies on a compartmentalization scheme sometimes known as the same-origin
policy. An origin in most cases consists of all the pages served from the same
host, on the same port, using the same protocol.
It is critical, therefore, to ensure that any untrusted content that forms part of a site be
hosted on a different origin than any sensitive content on that site. Untrusted
content can easily spoof any other page on the same origin, read data from that origin, cause
scripts in that origin to execute, submit forms to and from that origin even if they are
protected from cross-site request forgery attacks by unique tokens, and make use of any
third-party resources exposed to or rights granted to that origin.
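The origin comparison described above can be illustrated with a minimal sketch. It covers only the common case named in the text (same protocol, host, and port). The function names `origin_of` and `same_origin`, the default-port table, and the example.com URLs in the usage note are assumptions made for illustration; user agents apply the full origin rules, which this sketch does not attempt to reproduce.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Default ports for the two schemes this sketch knows about (an assumption;
# other schemes are compared with whatever explicit port they carry, if any).
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url: str) -> Tuple[str, str, Optional[int]]:
    """Return a (protocol, host, port) tuple standing in for the origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Fall back to the scheme's default port when none is given explicitly.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(a: str, b: str) -> bool:
    """True when both URLs share protocol, host, and port."""
    return origin_of(a) == origin_of(b)
```

Under this sketch, `https://example.com/a` and `https://example.com:443/b` compare as the same origin, while `https://example.com/` and `https://user.example.com/` do not, which is why untrusted content is better served from a separate host.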
Interoperability considerations:
Rules for processing both conforming and non-conforming content
are defined in this specification.
Published specification:
This document is the relevant specification. Labeling a resource
with the text/html type asserts that the resource is
an HTML document using
the HTML syntax.
Applications that use this media type:
Web browsers, tools for processing web content, HTML authoring
tools, search engines, validators.
Additional information:
Magic number(s):
No sequence of bytes can uniquely identify an HTML
document. More information on detecting HTML documents is
available in MIME Sniffing. [MIMESNIFF]
File extension(s):
"html" and "htm"
are commonly, but certainly not exclusively, used as the
extension for HTML documents.
Macintosh file type code(s):
TEXT
Person & email address to contact for further information:
Subresources of a multipart/x-mixed-replace
resource can be of any type, including types with non-trivial
security implications such as text/html.
Interoperability considerations:
None.
Published specification:
This specification describes processing rules for web browsers.
Conformance requirements for generating resources with this type are the same as for multipart/mixed. [RFC2046]
Applications that use this media type:
This type is intended to be used in resources generated by web servers, for consumption by web browsers.
Labeling a resource with the application/xhtml+xml type asserts that the
resource is an XML document that likely has a document element from the HTML
namespace. Thus, the relevant specifications are XML, Namespaces in
XML, and this specification. [XML][XMLNS]
This registration is for community review and will be submitted to the IESG for review,
approval, and registration with IANA.
Type name:
text
Subtype name:
ping
Required parameters:
No parameters
Optional parameters:
charset
The charset parameter may be provided. The parameter's value must be
"utf-8". This parameter serves no purpose; it is only allowed for
compatibility with legacy servers.
Encoding considerations:
Not applicable.
Security considerations:
If used exclusively in the fashion described in the context of
hyperlink auditing, this type introduces no new
security concerns.
Interoperability considerations:
Rules applicable to this type are defined in this specification.
Published specification:
This document is the relevant specification.
Applications that use this media type:
Web browsers.
Additional information:
Magic number(s):
text/ping resources always consist of the four
bytes 0x50 0x49 0x4E 0x47 (`PING`).
File extension(s):
No specific file extension is recommended for this type.
Macintosh file type code(s):
No specific Macintosh file type codes are recommended for this type.
Person & email address to contact for further information:
Ian Hickson <ian@hixie.ch>
Intended usage:
Common
Restrictions on usage:
Only intended for use with HTTP POST requests generated as part
of a web browser's processing of the ping attribute.
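As a rough illustration of the registration above, a receiving server might check an incoming request as follows. This is a sketch only: the function name `is_valid_ping` is hypothetical, and rejecting unknown parameters and comparing the charset value case-insensitively are assumptions of this sketch, not requirements stated in the registration.

```python
# The four bytes 0x50 0x49 0x4E 0x47 given as the magic number above.
PING_BODY = b"PING"


def is_valid_ping(method: str, content_type: str, body: bytes) -> bool:
    """Check that a request looks like a text/ping POST as registered above."""
    # The type is only intended for use with HTTP POST requests.
    if method.upper() != "POST":
        return False

    media_type, _, params = content_type.partition(";")
    if media_type.strip().lower() != "text/ping":
        return False

    # The only optional parameter is charset, whose value must be "utf-8".
    # Treating any other parameter as invalid is an assumption of this sketch.
    for param in (p.strip() for p in params.split(";")):
        if not param:
            continue
        name, _, value = param.partition("=")
        if name.strip().lower() != "charset":
            return False
        if value.strip().strip('"').lower() != "utf-8":
            return False

    # text/ping resources always consist of exactly these four bytes.
    return body == PING_BODY
```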
Labeling a resource with the application/microdata+json type asserts that the
resource is a JSON text consisting of an object with a single entry called "items".
The value of that entry is an array of objects, each of which has an entry called "id" whose
value is a string, an entry called "type" whose value is another string, and an entry called
"properties" whose value is an object. Each entry of the "properties" object has as its value
an array of either objects or strings, the objects being of the same form as the objects in
the aforementioned "items" entry. Thus, the relevant specifications are
JSON and this specification. [JSON]
Applications that use this media type:
Applications that transfer data intended for use with HTML's microdata feature, especially in
the context of drag-and-drop, are the primary application class for this type.
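The shape that the application/microdata+json label asserts can be checked with a short recursive sketch. The function names `is_item` and `is_microdata_json` are hypothetical, and the sketch reads the description literally, treating "id", "type", and "properties" as all required on every item; that strictness is an assumption of this sketch.

```python
import json


def is_item(value) -> bool:
    """Check one object of the form found in the "items" array."""
    if not isinstance(value, dict):
        return False
    if not isinstance(value.get("id"), str):
        return False
    if not isinstance(value.get("type"), str):
        return False
    properties = value.get("properties")
    if not isinstance(properties, dict):
        return False
    # Each property value is an array of strings or nested items.
    for values in properties.values():
        if not isinstance(values, list):
            return False
        if not all(isinstance(v, str) or is_item(v) for v in values):
            return False
    return True


def is_microdata_json(text: str) -> bool:
    """Check that a JSON text is an object with a single "items" array entry."""
    try:
        document = json.loads(text)
    except ValueError:
        return False
    return (
        isinstance(document, dict)
        and set(document) == {"items"}
        and isinstance(document["items"], list)
        and all(is_item(item) for item in document["items"])
    )
```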
This registration is for community review and will be submitted to the IESG for review,
approval, and registration with IANA.
Type name:
text
Subtype name:
event-stream
Required parameters:
No parameters
Optional parameters:
charset
The charset parameter may be provided. The parameter's value must be
"utf-8". This parameter serves no purpose; it is only allowed for
compatibility with legacy servers.
Encoding considerations:
8bit (always UTF-8)
Security considerations:
An event stream from an origin distinct from the origin of the content consuming the event
stream can result in information leakage. To avoid this, user agents are required to apply CORS
semantics. [FETCH]
Event streams can overwhelm a user agent; a user agent is expected to apply suitable
restrictions to avoid depleting local resources because of an overabundance of information from
an event stream.
Servers can be overwhelmed if a situation develops in which the server is causing clients to
reconnect rapidly. Servers should use a 5xx status code to indicate capacity problems, as this
will prevent conforming clients from reconnecting automatically.
Interoperability considerations:
Rules for processing both conforming and non-conforming content are defined in this
specification.
Published specification:
This document is the relevant specification.
Applications that use this media type:
Web browsers and tools using web services.
Additional information:
Magic number(s):
No sequence of bytes can uniquely identify an event stream.
File extension(s):
No specific file extensions are recommended for this type.
Macintosh file type code(s):
No specific Macintosh file type codes are recommended for this type.
Person & email address to contact for further information:
Ian Hickson <ian@hixie.ch>
Intended usage:
Common
Restrictions on usage:
This format is only expected to be used by dynamic open-ended streams served using HTTP or a
similar protocol. Finite resources are not expected to be labeled with this type.
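The capacity guidance in the security considerations above can be sketched from the server's side. This is a minimal illustration, not a prescribed mechanism: the function name `stream_response_head`, the connection-count threshold, and the choice of 503 among the 5xx codes are all assumptions of this sketch.

```python
from typing import Dict, Tuple


def stream_response_head(active_streams: int, max_streams: int) -> Tuple[int, Dict[str, str]]:
    """Pick the status code and headers for a new event stream request."""
    if active_streams >= max_streams:
        # A 5xx status signals a capacity problem, so that conforming
        # clients do not reconnect automatically. 503 is this sketch's choice.
        return 503, {"Content-Type": "text/plain; charset=utf-8"}
    # Otherwise serve the open-ended stream with the registered type.
    return 200, {"Content-Type": "text/event-stream"}
```

A server built this way sheds load by refusing new streams outright, rather than accepting and then dropping connections, which would invite the rapid reconnection the text warns about.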
This section describes a convention for use with the IANA URI scheme registry. It does not
itself register a specific scheme. [RFC7595]
Scheme name:
Schemes starting with the four characters "web+" followed by one or more letters in the range
a-z.
Status:
Permanent
Scheme syntax:
Scheme-specific.
Scheme semantics:
Scheme-specific.
Encoding considerations:
All "web+" schemes should use UTF-8 encodings where relevant.
Applications/protocols that use this scheme name:
Scheme-specific.
Interoperability considerations:
The scheme is expected to be used in the context of web applications.
Security considerations:
Any web page is able to register a handler for all "web+" schemes. As
such, these schemes must not be used for features intended to be core platform features (e.g.,
HTTP). Similarly, such schemes must not store confidential information in their URLs, such as
usernames, passwords, personal information, or confidential project names.
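The scheme name rule given above (the four characters "web+" followed by one or more letters in the range a-z) can be expressed as a regular expression. The function name `is_web_plus_scheme` is hypothetical, and the sketch deliberately accepts only lowercase input, following the a-z range literally; whether to normalize case first is left to the caller.

```python
import re

# "web+" followed by one or more letters in the range a-z.
WEB_PLUS_SCHEME = re.compile(r"web\+[a-z]+")


def is_web_plus_scheme(scheme: str) -> bool:
    """True when the whole scheme name matches the "web+" convention."""
    return WEB_PLUS_SCHEME.fullmatch(scheme) is not None
```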