Some Words about the <META HTTP-EQUIV="Content-Type" ...> Tag
The <META HTTP-EQUIV=...> tag was
developed and included in the HTML Standard (at least, it is
mentioned in the HTML 3.2 Draft) in order to inform the HTTP
server about the kinds of HTTP headers it may provide. As is
said in the HTML 3.2 Draft, this is just a wish to the server,
nothing more.
The HTML 4.0 Draft is more definite in this sense: it allows the WWW
client to take the <META HTTP-EQUIV> tag into account when
determining the coding, but, if the HTTP header
"Content-Type:... charset=..."
is present, the coding must be determined according to this header.
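The precedence rule described above can be illustrated with a minimal Python sketch. The function names (`charset_from_content_type`, `effective_charset`) and the regular expressions are assumptions made for this illustration only; they do not come from any standard or browser, and the tag matching is deliberately simplistic.

```python
import re
from typing import Optional

# Finds "charset=..." inside a Content-Type value (illustrative pattern).
_CHARSET_RE = re.compile(r'charset\s*=\s*["\']?([\w.:-]+)', re.IGNORECASE)

# Finds <META HTTP-EQUIV="Content-Type" CONTENT="..."> and captures CONTENT.
# Simplified: expects HTTP-EQUIV before CONTENT and a double-quoted CONTENT.
_META_RE = re.compile(
    r'<meta\s+http-equiv\s*=\s*["\']?content-type["\']?\s+content\s*=\s*"([^"]*)"',
    re.IGNORECASE,
)


def charset_from_content_type(value: str) -> Optional[str]:
    """Return the lowercased charset parameter of a Content-Type value, if any."""
    match = _CHARSET_RE.search(value)
    return match.group(1).lower() if match else None


def effective_charset(http_content_type: Optional[str], html: str) -> Optional[str]:
    """Pick the charset: the HTTP header wins, the META tag is only a fallback."""
    if http_content_type:
        charset = charset_from_content_type(http_content_type)
        if charset:
            return charset
    match = _META_RE.search(html)
    if match:
        return charset_from_content_type(match.group(1))
    return None
```

With a header of `text/html; charset=windows-1251` and a document declaring koi8-r in its META tag, this sketch chooses windows-1251; the META value is consulted only when the header carries no charset.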
Unfortunately, the authors of WWW browsers (both Netscape
Navigator and Microsoft Internet Explorer) either do not read
the standards or interpret them in some strange way. Maybe
they even have good intentions. When such browsers encounter
a document tag like
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=koi8-r">,
they ignore the correct Content-Type HTTP
header (the one corresponding to the actual coding) and show
the document according to their own ideas about this charset.
That is, Netscape changes the font to the one installed for
koi8-r, and MS IE recodes the document from koi8 to cp1251.
Moreover, these browsers pay attention to this tag even if it
is placed between the <BODY> and </BODY> tags,
which is simply nonsense! (I just wonder what they do if a
document includes several <META HTTP-EQUIV="Content-Type" ...> tags
that are inconsistent with each other...)
Actually, it is clear why browser developers act like this.
They are guided by an understandable wish not only to follow
the standard but to go beyond it and, at the same time, to help the
unfortunate owners of servers with HTTP/0.9 (this protocol
has no Content-Type header).
This problem would have remained purely theoretical,
but, unfortunately, many popular HTML editors (above all, MS
Front Page and MS Internet Word Assistant, but many others too) like
to include this tag in documents, and convincing them to do
otherwise is a hard task. So, the problem is real, and there
are several possible approaches to its solution:
- To adjust the contents of the
<META HTTP-EQUIV="Content-Type" CONTENT="...;charset=...">
tag to the actual coding. At first sight, this solution
seems perfect, but there are some drawbacks. Two of
them are the most important:
- Such a solution requires analysis (parsing) of
all HTML files output by the server, from
beginning to end. It is not enough to parse
only the part from <HEAD> to </HEAD>, because
browsers process <META> within the
<BODY> as well. Parsing of all
HTML files is a very time-consuming procedure
both for the computer and for the programmer,
and the efficiency of the HTTP server may
notably decrease. However, this problem may be
solved, at least theoretically
(except for the case of infinite-length documents).
- HTTP allows a request from the client to the server, and the
response to it, to pass through an arbitrary
number of Proxy servers. The standard explicitly
allows the Proxy to modify the contents of the
request and response considering only the
data in the HTTP header. Therefore, the
recoding Proxy can recode the HTTP
response so that its form would be acceptable
(as the Proxy thinks) for the client, but the
actual data body is not analyzed! As a result,
we arrive at the same problem: the HTTP header
says one thing, and <META...> says
another. The Proxy acts in full conformity with
the HTTP standard, so we cannot blame the author of
the Proxy server. And it is also impossible
to control all Proxy servers that may be
encountered on the way from the client
to the server.
- Thus, we arrive at the next evident solution:
<META HTTP-EQUIV...> must simply be removed.
In this case, all popular browsers will use the
Content-Type value from the HTTP header, and all Proxy
servers that are true to this name (and to
HTTP) will also behave properly. This method is
implemented in
Russian Apache starting from ver. PL18.
- Generally speaking, there is also a third way:
as soon as we see <META HTTP-EQUIV...> in a
document, we abandon all attempts to recode this
document, make the Content-Type HTTP header consistent
with the META tag (as is suggested by the HTML
standard) and output the document "as is." Although
this method conforms to the standards, its actual use
in Russian Apache is impossible.
For example, we
should not send a document in the Windows-1251 coding
to a user working under Unix: in all probability, he or
she will not have the fonts required for reading this
document.
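The removal approach from the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it is not the code used in Russian Apache, the name `strip_meta_content_type` is hypothetical, and the pattern assumes the tag contains no ">" character inside its attribute values. It scans the whole document, since browsers honor the tag inside <BODY> as well.

```python
import re

# Matches any <META ... HTTP-EQUIV="Content-Type" ...> tag, quoted or not,
# in any letter case. Other META tags (NAME=..., HTTP-EQUIV="Refresh") are
# left alone. Assumption: no ">" appears inside the tag's attribute values.
_META_CT_RE = re.compile(
    r'<meta\b[^>]*\bhttp-equiv\s*=\s*["\']?content-type\b[^>]*>',
    re.IGNORECASE,
)


def strip_meta_content_type(html: str) -> str:
    """Remove every META HTTP-EQUIV="Content-Type" tag from the document."""
    return _META_CT_RE.sub("", html)


if __name__ == "__main__":
    doc = (
        '<HTML><HEAD>'
        '<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=koi8-r">'
        '</HEAD><BODY>text</BODY></HTML>'
    )
    print(strip_meta_content_type(doc))
```

Once the tag is gone, the charset in the HTTP Content-Type header is the only declaration left, so a recoding server or Proxy can change it without contradicting the document body.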