Brief Introduction
The Russian-language part of the Internet (at least) is confronted by the
problem of multiple encodings.
There are at least five widespread Cyrillic encodings (code tables), and users
should receive "information resources" in a form they can read,
that is, in an encoding supported by their software.
Therefore, it is highly desirable for a WWW server using the Russian language
to support several Cyrillic encodings (the problem of
the only right encoding, as well as the only right end at which an eggshell
should be broken, is the subject of religion and will not be considered here).
It is also desirable to make this support as transparent as possible for the
user and easily configurable for the webmaster.
A number of methods have already been developed to solve this problem, and
the method and software proposed below are not unique, although they have
gained some popularity.
This software product is based on the popular Apache HTTP server, extended
with the functionality required to support several Cyrillic encodings
simultaneously.
Unfortunately, this functionality cannot be provided by a fully independent
module, so some changes were made to the Apache source code.
The latest stable version of Russian Apache is
Apache 1.3.41 rus/PL30.24. Versions based on Apache 1.2.x and Apache 1.1.x are
still available, but unsupported.
Before installing the server, we strongly recommend that you study the
following sections in detail:
How It Works,
How To Configure, and
Some Recommendations
(Note: unfortunately, the English version of the documentation is not
always up to date.)
The specific features of the server are as follows:
- It ensures agreement between the client and server encodings both when
documents are being sent to the client and when the user input is being
processed (in the input, both GET and POST are supported).
- In accordance with this agreement, it provides the correct
Content-type:...;charset=... headers.
- If required, it provides the Expires: header for proxy servers.
- It provides the correct Vary: and ETag headers; as a
result, correct document caching becomes possible (if the proxy cache is
compatible with HTTP/1.1).
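The header features listed above can be illustrated with a small Python sketch. It is not the server's actual code: the function name `response_headers`, the choice of request headers listed in `Vary:`, and the way the charset is appended to the ETag are all assumptions made for illustration. The point is only that a response recoded for one charset should be distinguishable by a cache from the same document recoded for another.

```python
def response_headers(mime_type, charset, base_etag,
                     negotiated_on=("Accept-Charset", "User-Agent")):
    """Build illustrative response headers for a recoded document.

    All names here are hypothetical. `negotiated_on` lists the request
    headers that influenced charset selection (an assumed default).
    """
    return {
        # Tell the client which encoding the body is actually in.
        "Content-Type": f"{mime_type}; charset={charset}",
        # Tell HTTP/1.1 caches which request headers the response depends on.
        "Vary": ", ".join(negotiated_on),
        # Give each recoded variant its own entity tag (assumed scheme).
        "ETag": f'"{base_etag}-{charset}"',
    }


headers = response_headers("text/html", "koi8-r", "abc123")
```

With distinct entity tags per charset, an HTTP/1.1 cache can store the koi8-r and windows-1251 variants of one URL side by side instead of serving one in place of the other.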
The server simultaneously implements several methods for achieving agreement
between the client and server encodings (for details, see section
How It Works), namely:
- Using the client's headers Accept-Charset: and/or
Accept: text/x-cyrillic.... If the server knows the charset requested
by the client, these headers have the highest priority for the server,
independently of its native charset configuration.
- Searching for the name of one of the configured code pages in the server
name (like www-koi8-r.stack.serpukhov.su or
www-windows-1251.stack.serpukhov.su).
- Searching for the name of one of the configured code pages in the prefix
of the requested URI
(like http://www.stack.net/windows-1251/file.html).
- Explicitly specifying the correspondence between the port number and the
encoding.
- Using the default code page configured for a given type of client program,
in case the server can recognize the client program
(or sometimes the environment in which the client program is running).
- Determining the encoding separately for different (virtual) servers
and/or directories. That is, one may specify a separate set of
directives for each directory or virtual server
(using another hostname or port number); these directives will be
valid down from this node of the tree until they are cancelled by directives
at a more detailed level.
- If the net community adopts some unified transport encoding, it will be
necessary to retain only this encoding at the server, independently of the
physical representation of the given documents.
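Several of the methods above can be sketched in a few lines of Python. This is a simplified illustration, not the server's implementation: the `CONFIGURED` list, the default charset, and all function names are hypothetical, q-values in `Accept-Charset:` are ignored, and the order in which the hostname and the URI prefix are tried is an assumption. Only the top priority of the client's `Accept-Charset:` header is taken from the description above.

```python
# Hypothetical list of charsets configured on the server (assumption).
CONFIGURED = ("koi8-r", "windows-1251", "iso-8859-5", "ibm866")
DEFAULT_CHARSET = "koi8-r"  # assumed server default


def from_accept_charset(header):
    """Return the first configured charset listed in Accept-Charset, or None."""
    if not header:
        return None
    for item in header.split(","):
        name = item.split(";")[0].strip().lower()  # drop ";q=..." parameters
        if name in CONFIGURED:
            return name
    return None


def from_hostname(host):
    """Look for a configured charset name inside the server name."""
    host = (host or "").lower()
    for name in CONFIGURED:
        if name in host:
            return name
    return None


def from_uri_prefix(path):
    """Look for a configured charset name as the first path segment."""
    first = path.lstrip("/").split("/", 1)[0].lower()
    return first if first in CONFIGURED else None


def select_charset(accept_charset=None, host=None, path="/"):
    """Try each method in turn and fall back to the default charset."""
    return (from_accept_charset(accept_charset)
            or from_hostname(host)
            or from_uri_prefix(path)
            or DEFAULT_CHARSET)
```

For example, `select_charset(host="www-windows-1251.stack.serpukhov.su")` and `select_charset(path="/windows-1251/file.html")` both pick `windows-1251`, while a request that carries `Accept-Charset: koi8-r` gets `koi8-r` regardless of the hostname.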
Some Specific Features
- The administrator of the server is the one to determine the cases when
different code pages are to be used. The administrator may also create
recoding tables and add them to the server to reach an agreement between
the code page of the server and the required client's encoding.
The server correctly recodes the text flowing from/to the client, including
sequences like %xx%yy%zz.
- The server administrator may specify how a client program is recognized
(by its User-Agent: header) and the default charset
corresponding to that client program.
- The administrator may specify the client programs that handle the MIME
header incorrectly; for these programs, such headers will not be
provided by the server.
- The administrator may specify the priority of charset selection by the
server between the URL and User-Agent.
- The administrator may specify the reaction of the server to an
incorrectly requested charset (that is, specify if the requested document
should be sent using the default charset in this case).
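The recoding of `%xx%yy%zz` sequences mentioned in the first item of this list can be illustrated as follows. This is a minimal sketch that uses Python's built-in codecs in place of the server's own recoding tables; the function name `recode_percent_encoded` is hypothetical, and the function expects a single, already-split query or form value rather than a whole query string.

```python
from urllib.parse import quote_from_bytes, unquote_to_bytes


def recode_percent_encoded(value, client_charset, server_charset):
    """Recode one percent-encoded value from the client's charset
    to the server's charset (illustrative helper, not the server's code)."""
    raw = unquote_to_bytes(value)        # "%CF%F0" -> b"\xcf\xf0"
    text = raw.decode(client_charset)    # bytes in the client's encoding -> text
    return quote_from_bytes(text.encode(server_charset))


# "%CF%F0" is "Пр" in windows-1251; in koi8-r the same letters are "%F0%D2".
recoded = recode_percent_encoded("%CF%F0", "windows-1251", "koi8-r")
```

Decoding the escapes to bytes before recoding matters: the escapes stand for bytes in the client's encoding, so converting the surrounding text alone would leave them untouched.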