http-proxy

Kernun UTM Reference (8)

Name

http-proxy, test-http — HyperText Transfer Protocol (HTTP) proxy

Synopsis

http-proxy [-hv] [-d dbglev] -f cfgfile

test-http [-hv] [-d dbglev] -f cfgfile [-r] [-t test_expr]

Description

Program http-proxy is the proxy daemon for HyperText Transfer Protocol (RFCs 1945, 2616). It supports HTTP versions 0.9, 1.0, and 1.1 clients and HTTP 1.0 and 1.1 servers. The proxy supports secure communication via SSL/TLS protocols, see ssl(5).

Startup and Configuration

The proxy reads its configuration and starts listening on the TCP sockets (address/port pairs) specified by the listen-on configuration section, see listen-on(5). If support of transparent connections (i.e., connections made directly to an HTTP server and redirected to the proxy by IP Filter as described in transparency(7)) is requested by item transparent in section listen-on, the corresponding NAT redirections are established during proxy startup and removed upon exit.

Format of the configuration file is described in http-proxy.cfg(5). General syntax of Kernun configuration files is explained in configuration(7). Program test-http tests syntax and partially semantics of configuration; for test expression syntax, see test-expr(5).

Access Control

Http-proxy uses three-phase ACLs, see access-control(7). The first phase, session-acl, is checked once for each client connection. It permits or denies client access and sets some connection parameters. The second phase, request-acl, is checked for each HTTP request after the request headers are received from the client, but before anything is sent to the server. It decides whether the request is permitted or denied and it can also set some request parameters. Note that there can be several requests per connection if persistent connections are used. The third phase, doc-acl, is checked for each HTTP request after the response headers are received from the server, but before the response is sent to the client.

Connection Establishment

When a connection from an HTTP client arrives, the configuration is searched for a matching session-acl. If the ACL says that the connection should be denied or there is no matching ACL, the proxy does not communicate with the client and closes the connection immediately. In addition to the generic ACL conditions and actions described in access-control(7), some Http-proxy-specific conditions and parameters can be set.

If the proxy sends a response to the client, but the client is still sending some data, it may be necessary not to close the read side of the connection for some time (see RFC2616 sect. 10.4 for details). Configuration item linger-time sets the time which the proxy should wait before closing the read side of the client connection.

It is possible to set idle timeouts for request and response by item idle-timeout. If no data are received from the client or the server for more than idle-timeout seconds, the request fails.

Items client-keepalive and server-keepalive define whether persistent connections are to be used on the client and server sides of the proxy. It is possible to limit the number of requests per connection or to disable persistent connections entirely. Timeouts for closing an idle connection can be set too.

The language used by the proxy in error messages sent to the client is defined by item language.

It is possible to forward all requests to some other proxy instead of sending them directly to origin servers. The next-hop proxy is set by hand-off.

Item ssl switches on SSL/TLS on the client connection and sets various SSL/TLS parameters. If the connection from the client uses SSL/TLS, item client-cert-match defines the acceptable client certificates. If the client certificate does not pass the test, SSL/TLS connection establishment fails and the connection is closed.

When http-proxy is used as an authentication proxy for accessing protected HTTP servers, parameters of the client authentication are set by item aproxy.

Request Processing

For each request from the client, the proxy reads the request line and headers. Then it finds the appropriate request-acl. If the ACL says that the request should be denied, the user should authorize, or there is no matching ACL, the proxy sends back a reply informing the user that the request has been denied. In addition to the generic ACL conditions and actions described in access-control(7), many Http-proxy-specific conditions and parameters can be set.

Items request-uri, request-method, request-scheme, request-path, and request-version match values from the first line of the request. Note that the server hostname and port are matched by the standard ACL condition server.

In a non-transparent proxy, the server hostname and port for matching both request-uri and server are taken from the request URI received from the client. If request-acl.host-hdr-transp is set, the Host header is used instead. This allows a request URI without a server name in a non-transparent request, which can occur if a request is transparently redirected by some means other than Kernun's own transparency. In a transparent proxy, the server hostname and port are taken from the Host request header. If Host is missing in the request (possible in HTTP/1.0), the original destination IP address and port of the connection from the client are used instead. The server address and port used for matching are also the address that the proxy connects to. In particular, in a transparent proxy this may differ from the original destination address used by the client.

A combination of a server and an initial part of the path can be matched against a blacklist using item blacklist. The name of the blacklist database file is specified by blacklist-db. Utilities for working with blacklists are mkblacklist(1), printblacklist(1), and resolveblacklist(1). The set of categories assigned to the request URI can be matched by item clear-web-db.
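
The rules for choosing the server hostname and port can be summarized by the following Python sketch. It is only an illustration of the text above, not the proxy's implementation. The function and parameter names are hypothetical, and the fallback to the original destination when host-hdr-transp is set but Host is missing is an assumption.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def split_host_port(value: str, default_port: int) -> Tuple[str, int]:
    """Split 'host[:port]' (as found in a Host header) into its parts."""
    parts = urlsplit("//" + value)
    return parts.hostname or "", parts.port or default_port


def select_server(
    request_uri: str,
    host_header: Optional[str],
    orig_dst: Tuple[str, int],
    transparent: bool,
    host_hdr_transp: bool = False,
) -> Tuple[str, int]:
    """Return the (host, port) used for ACL matching and for connecting."""
    if not transparent and not host_hdr_transp:
        # Non-transparent proxy: the server comes from the request URI.
        parts = urlsplit(request_uri)
        default = DEFAULT_PORTS.get(parts.scheme, 80)
        return parts.hostname or "", parts.port or default
    if host_header:
        # Transparent proxy, or host-hdr-transp set: use the Host header.
        return split_host_port(host_header, 80)
    # Host missing (possible in HTTP/1.0): original destination of the
    # client connection. For host-hdr-transp this fallback is an assumption.
    return orig_dst
```

For example, a transparent request for /index.html carrying Host: www.tns.cz:8080 is matched against, and sent to, www.tns.cz port 8080, regardless of the address the client originally connected to.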

Sometimes different settings are needed for some types of clients due to various errors and incompatibilities in web browsers. Therefore, the value of User-Agent HTTP header may be used for ACL matching (item user-agent).

If the client connection uses SSL/TLS, the values from the client certificate are compared to item client-cert-match during ACL matching. An ACL with client-cert-match is never used for plaintext HTTP connections.

If aproxy is configured in the session-acl, it is possible to use item aproxy-user in order to match the user and the group authenticated by AProxy. If aproxy is not configured, aproxy-user none matches.

It is possible to change the whole request URI (e.g., http://www.tns.cz/index.html). The URI is matched with a regular expression. A matching URI can be rewritten to some other string, as defined by item rewrite. The request processing continues with the current ACL, even if the new URI does not satisfy the conditions of this ACL, because request-acl matching is done only once and with the original URI. It is possible to specify redirect permanent or redirect temporary in a rewrite. Then the proxy will not fetch the rewritten URI, but it will return an HTTP redirect response (status code 301 or 302) to the client.
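
The effect of a rewrite can be illustrated by a short Python sketch. Python's re module stands in for the proxy's regular expression engine here. The function name, the return convention, and the target host intranet.example are hypothetical, and the actual syntax of item rewrite is described in http-proxy.cfg(5), not here.

```python
import re
from typing import Optional, Tuple

# Status codes named in the text for "redirect permanent" / "redirect temporary".
REDIRECT_STATUS = {"permanent": 301, "temporary": 302}


def apply_rewrite(
    uri: str,
    pattern: str,
    replacement: str,
    redirect: Optional[str] = None,
) -> Tuple[Optional[int], str]:
    """Return (redirect_status, uri).

    redirect_status is None when the proxy would fetch the returned URI
    itself, or 301/302 when it would answer with a redirect instead.
    """
    if re.search(pattern, uri) is None:
        # No match: the request continues with the original URI.
        return None, uri
    new_uri = re.sub(pattern, replacement, uri, count=1)
    if redirect is None:
        return None, new_uri
    return REDIRECT_STATUS[redirect], new_uri
```

Note that ACL selection is not repeated after the rewrite, so a sketch of the whole request path would call such a function only after the request-acl has been chosen.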

Item plug-to changes the server address which the proxy connects to, but it does not change the content of Host header. It is possible to change Host header (and also the server in request URI in the case of hand-off) by http-host.

Items hand-off and language allow overriding the values from session-acl for a single request.

Items file-response and program-response generate a response locally by the proxy. Unlike the replace-response item in doc-acl described below, the proxy does not contact the origin server and generates the response immediately. See the section called “Program-Generated Responses” for details about the proxy-to-program interface.

Item select-optimization influences internal handling of network communication. The proxy repeatedly checks its client and server network connections for a possibility to read or write data. When a connection is ready, a piece of data is sent or received. It is more efficient to try several send or receive operations on the connections that have been ready recently before checking all existing connections again. The number of such retries is controlled by select-optimization. It may improve the proxy performance if set to a small positive value, for example 10.

Request and response headers may be filtered by items allow-req-hdr and allow-resp-hdr. These items define names of headers that will be passed by the proxy. All other headers will be deleted from the request or response. It is also possible to reject requests with request, status, and header lines not matching items req-line-check, req-hdr-check, status-line-check, and resp-hdr-check.
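
Header filtering by an allow list can be pictured with the following Python sketch. It assumes that the allowed names are plain header names compared case-insensitively (HTTP header names are case-insensitive). The function name is hypothetical, and the exact matching rules of allow-req-hdr and allow-resp-hdr are given in http-proxy.cfg(5).

```python
from typing import Iterable, List, Tuple

Header = Tuple[str, str]


def filter_headers(headers: Iterable[Header], allowed: Iterable[str]) -> List[Header]:
    """Keep only headers whose names are on the allow list.

    All other headers are dropped; the order of the remaining headers
    is preserved.
    """
    allowed_lower = {name.lower() for name in allowed}
    return [(name, value) for name, value in headers if name.lower() in allowed_lower]
```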

The proxy may add Via HTTP headers to requests and responses. These headers inform about proxies which the request or response passed through. It may be useful to track problems with a proxy, but sometimes the administrator wants to hide information that the proxy is present. Hence the use of Via headers may be configured by items request-via and response-via.

Item request-time places an upper limit on the request handling time. It eliminates stuck clients and servers and helps against some DoS attacks.

Item auth-req causes the proxy to respond with 407 Proxy Authentication Required and sets the authentication realm sent to the client.

Maximum amount of data transferred between the client and the server may be limited by item max-bytes separately for each direction (client to server and server to client). Data filtration can change the size of data significantly, therefore the limits are set separately for client and server connections.

In some situations, e.g., when chunked transfer encoding is in use, the proxy buffers incoming data and sends them only when the buffer is full. There are applications which require all data to be forwarded immediately. For such situations, item flush switches buffering off.

Item ssl switches on SSL/TLS on the server connection and sets various SSL/TLS parameters. If the connection to the server uses SSL/TLS, item server-cert-match defines the acceptable server certificates. If the server certificate does not pass the test, SSL/TLS connection establishment fails and the request terminates with an error.

After request-acl is processed, the request is forwarded to a server. When the server answers with a status line and response headers, the proxy finds the appropriate doc-acl. In addition to the generic ACL conditions and actions described in access-control(7), some Http-proxy-specific conditions and parameters can be set.

Items request-scheme, request-path, and blacklist have the same meaning as in request-acl. Item mime-type provides matching of response document content type. The proxy provides three methods of detecting the content type: content-type (from the Content-Type response header), extension (matching request URI suffix with information from mime-types), and magic (guessing the type from an initial part of response data, using the same algorithm as in the standard utility file). Selection and priorities of the methods are defined by http-proxy.doctype-identification, session-acl.doctype-ident-order, and request-acl.doctype-ident-order. The first successful method defines the type. If no method succeeds, the type will be represented by an empty string. Maximum size of data scanned by magic method can be changed by http-proxy.doctype-identification.magic.
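
The "first successful method wins" rule can be sketched in Python as follows. The extension table and the magic signatures are small hypothetical stand-ins for the proxy's mime-types data and the file(1) database, and all function names are illustrative only.

```python
import posixpath
from typing import Mapping, Sequence

# Hypothetical stand-ins for the mime-types table and the magic database.
EXTENSION_TYPES = {".html": "text/html", ".png": "image/png", ".gif": "image/gif"}
MAGIC_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
]


def from_content_type(headers: Mapping[str, str]) -> str:
    for name, value in headers.items():
        if name.lower() == "content-type":
            # Drop parameters such as "; charset=utf-8".
            return value.split(";", 1)[0].strip().lower()
    return ""


def from_extension(path: str) -> str:
    return EXTENSION_TYPES.get(posixpath.splitext(path)[1].lower(), "")


def from_magic(data: bytes) -> str:
    for signature, doc_type in MAGIC_SIGNATURES:
        if data.startswith(signature):
            return doc_type
    return ""


def identify(order: Sequence[str], headers: Mapping[str, str], path: str,
             body: bytes, magic_limit: int = 4096) -> str:
    """Try the methods in the configured order; the first success wins."""
    methods = {
        "content-type": lambda: from_content_type(headers),
        "extension": lambda: from_extension(path),
        "magic": lambda: from_magic(body[:magic_limit]),
    }
    for name in order:
        doc_type = methods[name]()
        if doc_type:
            return doc_type
    return ""  # no method succeeded
```

With the order magic, content-type, a response that starts with a GIF signature is classified as image/gif even if the server labels it text/plain.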

The content type, i.e., the Content-Type header sent to the client, can be forced by set-mime-type, or set to the content type discovered for mime-type matching by force-doctype-ident. Otherwise, the header is left unchanged.

It is possible to discard some responses and to replace them with a local file. Item replace-response defines this replacement.

GIF, JPEG, and PNG images may be filtered according to the image dimensions. A local image is returned to the client instead. The image substitution is defined by filter-images. This feature can be used, for example, to filter advertisement banners, because they often have known characteristic dimensions. Dimensions of GIF and PNG images are stored at a fixed offset near the beginning of the respective files, but JPEG dimensions may be far from the beginning of the file. Item jpeg-scan-sz restricts the size of the initial part of JPEG files scanned for dimensions.
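
The fixed offsets mentioned above follow from the GIF and PNG file formats: a GIF file stores width and height as little-endian 16-bit values right after the 6-byte signature, and a PNG file stores them as big-endian 32-bit values at the start of the IHDR chunk. The Python sketch below reads them. It is an illustration of the file formats only; the function names are hypothetical and it does not reproduce the proxy's code.

```python
import struct
from typing import Optional, Tuple


def gif_size(data: bytes) -> Optional[Tuple[int, int]]:
    """Return (width, height) of a GIF image, or None if not a GIF."""
    if len(data) >= 10 and data[:6] in (b"GIF87a", b"GIF89a"):
        # Logical screen width and height: little-endian, offsets 6..9.
        return struct.unpack("<HH", data[6:10])
    return None


def png_size(data: bytes) -> Optional[Tuple[int, int]]:
    """Return (width, height) of a PNG image, or None if not a PNG."""
    if (len(data) >= 24
            and data[:8] == b"\x89PNG\r\n\x1a\n"
            and data[12:16] == b"IHDR"):
        # IHDR width and height: big-endian, offsets 16..23.
        return struct.unpack(">II", data[16:24])
    return None
```

Only the first 24 bytes of a response body are needed for either format, which is why no scan limit comparable to jpeg-scan-sz is required for GIF and PNG.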

Http-proxy can filter data through an antivirus. Antivirus checking is defined by item antivirus which selects a top-level antivirus section. See antivirus(5) for details about configuration of virus checking.

Http-proxy provides HTML filtering. It is usually used to delete potentially dangerous parts of HTML data passed to client, e.g., scripts or Java applets. Features of the HTML filter are controlled by item html-filter, which selects a top-level html-filter section. See mod-html-filter(5) for details about configuration of HTML filtration.

HTTP request and response data can be processed and actions can be taken accordingly. Matching in request data is configured by item request-acl.request-body-match, response data matching is controlled by doc-acl.response-body-match. See data-matching(7) for detailed description of the data matching and processing feature.

The maximum size of the HTTP request body can be limited by setting request-acl.request-body-max-size.

If the request URI is categorized by clear-web-db, Bypass function can be enabled by clear-web-db-bypass. When accessing a matching page, the user gets an error page. By clicking a link on the page, access is enabled for a limited time to the blocked web server, or all servers belonging to the categories specified by clear-web-db-match.

Using CONNECT Method

HTTP method CONNECT is reserved for tunneling other protocols through http-proxy. It is usually utilized for SSL/TLS access to HTTPS servers. When users set their browser to use the proxy in the non-transparent mode, an HTTPS request causes a CONNECT request to be sent to the proxy. The proxy then creates a tunnel between the client and the server.
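
For orientation, the request that a browser sends to open such a tunnel has the following shape. The Python helper below only builds the request bytes as defined by the HTTP specification. The helper name and the host www.example.com are illustrative and are not part of Kernun.

```python
def build_connect_request(host: str, port: int) -> bytes:
    """Build the CONNECT request a client sends to a non-transparent proxy."""
    target = f"{host}:{port}"
    lines = [
        f"CONNECT {target} HTTP/1.1",  # request target is host:port, not a URI
        f"Host: {target}",
        "",  # empty line terminates the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")
```

After the proxy answers with a 2xx status, the client starts the SSL/TLS handshake with the server inside the tunnel.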

Note that data passed through the tunnel are encrypted and thus inaccessible to the proxy. After the tunnel is established, it is not possible to deny any HTTPS requests nor to perform data checking like HTML filtering or antivirus testing. It is therefore appropriate to limit the servers accessible via CONNECT.

In a transparent proxy configuration with HTTPS, the client does not use CONNECT, but starts SSL/TLS immediately after establishing a TCP connection. Although it is possible to utilize http-proxy in this case (see the description of session-acl.simulate-connect below), it is usually easier to use tcp-proxy(8). An exception that requires http-proxy is when some HTTPS connections should be just passed via a TCP tunnel, but others should be decrypted by the proxy.

An alternative to simple HTTPS tunneling is to use the SSL/TLS decryption/encryption functionality of the proxy. A transparent http-proxy, which does not use CONNECT, can be configured to decrypt the connection from a client (by session-acl.ssl), process the encapsulated HTTP, and encrypt the connection to a server (by request-acl.ssl). A more complicated situation arises in the non-transparent mode. As mentioned above, the browser tries to establish a TCP tunnel through the proxy using the CONNECT method. If the request-acl contains item capture-connect, the proxy captures the CONNECT request, that is, it responds to the request as if the tunnel were established, but does not open the connection to the server. Instead, it restarts the session in transparent mode. The end of the CONNECT request and session is logged, and a new session is started, which behaves as a transparent session to the server specified in the CONNECT. New session-acl and request-acl are selected that can, among other things, enable SSL/TLS decryption and encryption, in the same way as in a normal transparent http-proxy configuration. To be chosen for a new session that emerged from a captured CONNECT request, a session-acl must contain item captured-connect. The session-acl selection can be based on the ACLs used for handling the CONNECT. Those ACLs are specified by items connect-session-acl and connect-request-acl.

If some connections should be decrypted and re-encrypted, but others are to be just passed, it is possible to set simulate-connect in a session-acl matching the connections that will not be decrypted. This option behaves as if the data from the client were preceded by a CONNECT request to the destination address of the TCP connection from the client. That is, the proxy just establishes a tunnel and passes data unmodified between the client and the server. The proxy must learn the server address somehow, hence simulate-connect requires either a transparent proxy mode, or a plug-to item specifying the server address explicitly in an ACL.

The last option is to perform full inspection of HTTPS. In this case, http-proxy interrupts the initial phase of establishing the SSL connection from the client and tries to contact the server and get its certificate. If this fails, the connection to the client is reset. If the server is reached and its certificate is verified, the proxy generates a new certificate with all attributes (except some unwanted ones) copied from the original server certificate, signs it with its own certificate authority, and uses this new certificate to complete the connection to the client. If the original server certificate cannot be verified, several options are available:

error: The new certificate is signed by the proper Kernun CA certificate and, after the client connection is established, an error message is sent as a reply.
pass: The new certificate is signed by a special Kernun CA certificate that is intended not to be added to the client's trusted key ring. Thus, the user gets a warning from the browser and can decide how to continue.
fail: Connection establishment fails.
ignore: The verification failure is ignored. This option is strongly discouraged.

The new certificate is stored in the cache (a file in the /data/fake-cert directory) for later reuse. Names of correct certificates start with the letter C, followed by the certificate hash and a distinguishing number. Names of wrong certificates (used in the pass case) start with the letter F instead. See the ssl(5) manual page for further details.

Using FTP Scheme

When the client uses the proxy in the non-transparent mode and the user requests data from an FTP server by entering a URL starting with ftp:, the client sends an HTTP request with that URL to the proxy. The proxy is then expected to fetch the document from an FTP server and return it as an HTTP response to the client. Http-proxy does not communicate directly with FTP servers. Instead, it asks ftp-proxy(8) to do the work. Communication between http-proxy and ftp-proxy uses a private protocol created specifically for the Kernun firewall. Parameters needed for connecting to ftp-proxy are specified by item ftp-proxy.

When the firewall works in the transparent mode, HTTP clients talk directly to FTP servers. Appropriately configured ftp-proxy is needed in such situation.

User Authentication

User authentication on proxies works in HTTP in a similar way as authentication on origin servers. The difference is in status codes (407 instead of 401) and headers (Proxy-Authenticate and Proxy-Authorization instead of WWW-Authenticate and Authorization). When the proxy requires authentication and a request does not contain valid credentials, the proxy replies with 407 and sends an authentication method and a realm to the client in header Proxy-Authenticate. The client then obtains the user's credentials and repeats the request with them in header Proxy-Authorization. The credentials are sent automatically in all subsequent requests. Only the Basic, Kerberos (Negotiate), and NTLM authentication schemes are supported by http-proxy. The proxy can be configured for one or more of them. If several authentication schemes are enabled, a client can choose which scheme it will use. Typically, Kerberos/NTLM-capable web browsers will use Kerberos/NTLM, other browsers will use Basic.
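
For the Basic scheme, the credentials in Proxy-Authorization are simply the Base64 encoding of user:password. The following Python sketch shows both directions. The helper names are illustrative, and the sketch says nothing about how http-proxy verifies the decoded credentials.

```python
import base64
from typing import Optional, Tuple


def proxy_authorization_basic(user: str, password: str) -> str:
    """Build the header a client sends after receiving a 407 response."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Proxy-Authorization: Basic {token}"


def parse_basic_credentials(header_value: str) -> Optional[Tuple[str, str]]:
    """Decode the value of Proxy-Authorization; None if it is not Basic."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        return None
    decoded = base64.b64decode(token.strip()).decode("utf-8")
    user, sep, password = decoded.partition(":")
    return (user, password) if sep else None
```

Because Base64 is an encoding and not encryption, Basic credentials are readable by anyone who can observe the connection unless SSL/TLS is used between the client and the proxy.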

Basic Authentication

In order to enable user authentication, item auth must be present in session-acl, see auth(5) and auth(7). It defines authentication database (for example, a file or a RADIUS server) which will verify credentials from users. All of the authentication methods mentioned in the man page auth(7) are supported in Http-proxy. The item user is used to match user names in request-acl and doc-acl. A user name is matched if it is present with a valid password and is successfully verified. Otherwise, user none is matched.

A typical setting of user authentication involves at least two request ACLs. One is for permitting access to the authenticated users, the other one denies access, sends a realm, and asks for credentials. Example:

# Switch checking credentials on and choose user database.
session-acl SET-AUTH {
    auth passwd "/usr/local/kernun/etc/passwd";
}

# Permit any successfully authenticated user.
request-acl OK {
    user *;
}

# Not authenticated, ask for credentials.
request-acl ASK-AUTH {
    user none;
    auth-req "Kernun http-proxy";
}

Kerberos Authentication

Kerberos authentication is intended primarily for Active Directory environment, although it can be used with any Kerberos server. When using Kerberos authentication, the proxy obtains the user name from the Kerberos ticket received from the client, but the ticket does not contain information about group membership. The list of groups, which is usable in request-acl.user matching, can be obtained via LDAP.

Kerberos authentication is enabled by item kerberos-auth in a session-acl. It references a section kerberos-auth on the system level. The section specifies the Active Directory domain name and the domain controller address. In a generic Kerberos deployment outside an Active Directory environment, domain defines the Kerberos realm and ad-controller defines the Kerberos ticket granting server. The kerberos-auth section can reference an LDAP server by item ldap. As the Active Directory controller contains group membership data and provides LDAP access, it is typically used also as the LDAP server.

Kerberos can also be used to authenticate LDAP requests by adding kerberos instead of bindinfo into an ldap-client-auth section. Then the proxy authenticates itself (obtains a TGT) upon startup using the machine account of the Kernun system in the Active Directory. Hence the machine account must have sufficient rights to read user group information from the Active Directory database.

As in Basic authentication, at least two request ACLs are used for Kerberos authentication. One of them permits access to authenticated users, the other one denies access and requests authentication. Example:

system ... {
    # Active Directory controller used as an LDAP server
    ldap-client-auth LDAP-AD {
        server "ldap://ad.tns.cz";
        # Authenticate to LDAP using Kerberos
        kerberos;
        active-directory "tns.cz";
    }

    # Kerberos authentication by the Active Directory Controller
    kerberos-auth KERBEROS {
        domain "TNS.CZ";
        ad-controller "ad.tns.cz";
        ldap LDAP-AD;
    }

    http-proxy HTTP {
        ...
        session-acl AUTH {
            accept;
            auth none;
            kerberos-auth KERBEROS;
        }
        ...
        request-acl KERBEROS-OK {
            user *;
            accept;
        }
        request-acl KERBEROS-ASK {
            user none;
            accept;
            auth-req "Kernun http-proxy";
        }
        ...
    }
}

After applying the Kerberos authentication configuration for the first time, the Kernun system must become a member of the Active Directory domain. Its machine account is created by the shell commands

# kinit user
# msktutil -c --computer-name `hostname -s` -s HTTP/`hostname` \
--server ADC --no-pac
# chown kernun /etc/krb5.keytab

where user is a user with Domain Admins rights and ADC is the address of the Active Directory Controller. If the system is to be removed from the domain later (when Kerberos authentication is no longer required or if the system will be moved to another domain), remove file /etc/krb5.keytab and delete the machine account on the Active Directory Controller.

A proxy with Kerberos authentication enabled needs access to the Kerberos configuration files /etc/krb5.conf and /etc/krb5.keytab. Hence, the proxy cannot be run chrooted unless the chroot environment is appropriately extended.

Group membership information of users authenticated by Kerberos can be cached in order to decrease load of the LDAP server. Configuration of caching consists of adding the global section oob-auth OOB and referencing it by item http-proxy.oob-auth-srv. Cached group membership information for a user name expires after a timeout controlled by items kerberos-auth.timeout-idle (expiration after a period of inactivity) and kerberos-auth.timeout-unauth (unconditional expiration).

More details about Kerberos authentication can be found in the Kernun Handbook.

NTLM Authentication

NTLM authentication is enabled by item ntlm-auth in a session-acl. It references a section ntlm-auth on the system level. The section specifies the Active Directory domain name and the domain controller address. The proxy obtains the user name from the NTLM authentication process, but it does not get any information about group membership. The list of groups, which is usable in request-acl.user matching, can be obtained via LDAP. The ntlm-auth section can therefore reference an LDAP server by item ldap. As the Active Directory controller contains group membership data and provides LDAP access, it is typically used also as the LDAP server.

As in Basic authentication, at least two request ACLs are used for NTLM authentication. One of them permits access to authenticated users, the other one denies access and requests authentication. Example:

system ... {
    # Active Directory controller used as an LDAP server
    ldap-client-auth LDAP-AD {
        server "ldap://ad.tns.cz";
        bindinfo "cn=ADUser,dc=tns,dc=cz" "ldap-password";
        active-directory "tns.cz";
    }

    # NTLM authentication by the Active Directory Controller
    ntlm-auth NTLM {
        domain "tns.cz";
        ad-controller "ad.tns.cz";
        ldap LDAP-AD;
    }

    http-proxy HTTP {
        ...
        session-acl AUTH {
            accept;
            auth none;
            ntlm-auth NTLM;
        }
        ...
        request-acl NTLM-OK {
            user *;
            accept;
        }
        request-acl NTLM-ASK {
            user none;
            accept;
            auth-req "Kernun http-proxy";
        }
        ...
    }
}

After applying the NTLM authentication configuration for the first time, the Kernun system must become a member of the Active Directory domain. It is done by issuing the shell command

# net ads join -U user

where user is a user with Domain Admins rights, and rebooting the system. If the system is to be removed from the domain later (when NTLM authentication is no longer required or if the system will be moved to another domain), it can be done by the command

# net ads leave -U user

A proxy with NTLM authentication enabled needs access to the utility ntlm_auth(1), which in turn accesses contents of directory /var/db/samba/winbindd_privileged. Hence, the proxy cannot be run chrooted unless the chroot environment is appropriately extended.

Results of NTLM authentication can be cached by out-of-band authentication, in order to decrease load of Active Directory and LDAP servers. Each new client is authenticated by NTLM. The combination of the client IP address, the user name and the list of groups is remembered in the OOB session table. Following requests from the same IP address will be authenticated as the same user and groups, without contacting the AD controller and the LDAP server.

Configuration of NTLM caching consists of adding the global section oob-auth OOB, referencing it by item http-proxy.oob-auth-srv, and adding auth oob OOB to each session-acl that contains item ntlm-auth. Cached user and group information for a client IP address expires after a timeout controlled by items ntlm-auth.timeout-idle (expiration after a period of inactivity) and ntlm-auth.timeout-unauth (unconditional expiration).
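
The interplay of the two timeouts can be modeled by a small Python sketch. It interprets timeout-idle as the maximum time since the last request and timeout-unauth as the maximum age of the entry since authentication, following the parenthetical descriptions above. The class and method names are hypothetical and the real OOB session table may differ in details.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Entry:
    user: str
    groups: List[str]
    created: float    # time of the NTLM authentication
    last_seen: float  # time of the most recent request


class AuthCache:
    """Per-client-IP cache of authenticated user and groups (a sketch)."""

    def __init__(self, timeout_idle: float, timeout_unauth: float) -> None:
        self.timeout_idle = timeout_idle
        self.timeout_unauth = timeout_unauth
        self._entries: Dict[str, Entry] = {}

    def store(self, ip: str, user: str, groups: List[str], now: float) -> None:
        self._entries[ip] = Entry(user, list(groups), created=now, last_seen=now)

    def lookup(self, ip: str, now: float) -> Optional[Tuple[str, List[str]]]:
        entry = self._entries.get(ip)
        if entry is None:
            return None
        idle_expired = now - entry.last_seen > self.timeout_idle
        age_expired = now - entry.created > self.timeout_unauth
        if idle_expired or age_expired:
            # Expired: the next request must be authenticated by NTLM again.
            del self._entries[ip]
            return None
        entry.last_seen = now
        return entry.user, entry.groups
```

A client that keeps sending requests never hits the idle timeout, but it is still re-authenticated once the unconditional timeout elapses.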

Combined Authentication Methods

In order to support clients incapable of NTLM authentication, it is possible to enable both authentication schemes by configuring the NTLM authentication and simultaneously using item auth in session-acl with a method other than none. The above NTLM example can be modified by simply changing auth none to auth passwd "...".

Cookie Modification

The proxy can be configured to modify cookies passed between a client and a server. The value of a cookie received from a server is replaced by a new value and passed to the client. If the client sends the cookie back to the server, the proxy restores its original value before passing it to the server.

This feature reduces exploitability of stolen cookies, especially session-identification cookies in various web applications. A cookie stolen from the client is useless outside the network protected by Kernun, because its value is not that expected by the server. Even inside the protected network, a stolen cookie has only limited potential of misuse, because after the proxy sends a cookie to a client, it accepts it back only from the same client IP address.

The proxy maintains a cookie table that is used for restoring modified values of cookies passed from a client to a server. To increase security, neither the modified cookie value passed to the client, nor the related record in the cookie table suffices for restoring the original cookie value. The two pieces of information must be put together in order to reverse the cookie modification operation.

Properties of the cookie table (file name, size, expiration, and cleaning rule) are set in section cookie-table. Rules for cookie modification are defined by items request-acl.modify-cookies. It is possible to modify only some cookies, selected by name, disable checking of client IP address by flag any-client, and decide whether cookie values sent by a client to a server and not found in the cookie table should be passed unchanged (flag keep-not-found) or replaced with an empty value. A request that uses a request-acl with item delete-cookies causes deleting all cookies related to a single IP address. Either the IP address of the requesting client, or the IP address contained (in standard textual notation) in the query part of the request URI, is used, according to flag ip-from-query.
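
The property that neither the modified cookie nor the table record alone reveals the original value can be demonstrated with a one-time pad. The Python sketch below is purely an illustration of that idea, together with the client IP check and the any-client flag. The algorithm actually used by http-proxy is not documented here, and the class and method names are hypothetical.

```python
import secrets
from typing import Dict, Optional, Tuple


class CookieTable:
    """Illustrative cookie table: the original value is split in two parts."""

    def __init__(self) -> None:
        # record id -> (original value XOR pad, client IP that received it)
        self._records: Dict[str, Tuple[bytes, str]] = {}

    def modify(self, original: str, client_ip: str) -> str:
        """Return the value passed to the client instead of the original."""
        raw = original.encode("utf-8")
        pad = secrets.token_bytes(len(raw))
        masked = bytes(a ^ b for a, b in zip(raw, pad))
        record_id = secrets.token_hex(8)
        self._records[record_id] = (masked, client_ip)
        # The client holds only the record id and the random pad.
        return f"{record_id}.{pad.hex()}"

    def restore(self, modified: str, client_ip: str,
                any_client: bool = False) -> Optional[str]:
        """Return the original value, or None if it cannot be restored."""
        record_id, sep, pad_hex = modified.partition(".")
        record = self._records.get(record_id)
        if not sep or record is None:
            return None
        masked, owner_ip = record
        if not any_client and owner_ip != client_ip:
            return None  # cookie presented from a different client address
        try:
            pad = bytes.fromhex(pad_hex)
            if len(pad) != len(masked):
                return None
            return bytes(a ^ b for a, b in zip(masked, pad)).decode("utf-8")
        except ValueError:
            return None  # malformed or tampered value
```

In this model, a None result corresponds to a cookie not found in the table, which the proxy would either pass unchanged (keep-not-found) or replace with an empty value.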

Authentication Proxy (AProxy)

It is possible to configure http-proxy for providing access from an external network to some web server in the internal protected network. Common requirements in such a configuration are encryption of the communication between the client and the proxy and the use of challenge-response authentication. Module AProxy of http-proxy provides this functionality. If a user is not authenticated, the proxy returns an authentication form instead of a normal response. When the user authenticates, the response for the original request is returned and further requests are processed normally until the user logs out or a timeout expires.

AProxy mode is switched on by item aproxy in session-acl. It is advisable to turn on SSL/TLS between clients and the proxy by item ssl in session-acl. Configuration section aproxy sets various AProxy parameters. Section auth defines AProxy authentication database. Username/password authentication is supported for both passwd and radius, challenge/response authentication may be used only with radius. User and group names obtained during AProxy authentication are matched against request-acl.aproxy-user condition.

The proxy identifies sessions belonging to authenticated users by cookies. It is necessary to choose a cookie-name so that it does not collide with cookies used by the origin server. The maximum number of simultaneously active user sessions is specified by max-aproxy-sessions. If insecure-cookie is not set, the client is asked not to send the session cookie across an unencrypted connection. This prevents the cookie from being revealed when the user inadvertently enters http: instead of https: into the browser.

Out of Band Authentication Server

Http-proxy can also be used as an OOB authentication server, see auth(7). In this mode, the proxy manages the list of OOB authenticated users and provides the list to other proxies. The OOB authentication server is turned on by a section aproxy containing item oob-auth. Parameters of the OOB authentication are set by a section oob-auth referenced by http-proxy.oob-auth-srv. OOB authentication uses either the html-form method (users authenticate themselves by filling in the same form as in AProxy authentication) or the external method (the list of users is provided by an external program, e.g., ooba-samba(1), which passes it via HTTP to the authentication server).

Web Filter

The request URI can be processed by an external web filter. Interface to IBM Proventia Web Filter is implemented in the proxy. The web filter has a regularly updated database of web servers. It takes a request URI from http-proxy and assigns a set of categories to it (for example, pornography, games, lifestyle, criminal activities). Then it processes the categories together with client IP address and user name (if proxy authentication is enabled) and decides according to its ruleset whether the URI should be accepted or rejected. If the web filter accepts the URI, request processing continues in http-proxy. Otherwise, the proxy returns an error page to the client.

In the web filter configuration, ICAP Integration must be enabled (in Proxy Integration dialog of the management console). Also select User Profile Support in this dialog. In the Kernun configuration, section web-filter contains parameters of a connection to a web filter. Processing a request URI by the web filter is enabled by item request-acl.web-filter.

IBM Proventia Web Filter requires user names in the form domain\user. The http-proxy always uses the domain name kernun. Therefore, user names in the web filter configuration must be entered as kernun\user.

Program-Generated Responses

If item request-acl.program-response is set in the configuration, HTTP requests from clients are processed by an external program specified in this configuration item. A new instance of the program is started for each request. The complete HTTP request as received from the client is passed to the standard input of the program. The program must reply with a valid HTTP response written to its standard output and terminate. The proxy then sends the response back to the client. If the program does not terminate before a configured timeout expires, or the request processing is interrupted before the program terminates, the proxy sends the SIGTERM signal to the running program.

In addition to the HTTP request on the standard input, the program is also provided with a set of environment variables:

APROXY_USER: User name from the AProxy authentication
CONTENT_LENGTH: Size in bytes of the request body. The word chunked means that the request body uses the chunked transfer encoding.
CONTEXT: Context which the program is executed in. It can be program-response for a program executed via request-acl.program-response, or one of request-end-program-ACCEPTED, request-end-program-REJECTED, request-end-program-FAILED for a program executed by request-acl.request-end-program (as described in the next section).
DOC_ACL: Name of the doc-acl used for this request or the empty string if no doc-acl has been selected. Note that in the case of a program-response, no doc-acl is used.
HTML_REPLACE_HASH: If request data have been matched by a request-acl.request-body-match rule with type html-replace, this variable contains a hash value computed from the matching HTML form values. Otherwise, the variable contains the empty string. In fact, this variable can also be set by response data matching a doc-acl.response-body-match rule with type html-replace, but HTML form data are usually not sent and matched in HTTP responses.
LOG_FILE: The name of the file used by the proxy for logging, or the empty string if the proxy logs via syslog.
LOG_LEVEL: The current numeric log level of the proxy.
PATH_INFO: This is the path part of the request URI, without the query part.
PROXY_NAME: The name of the proxy as specified in the configuration.
QUERY_STRING: Contains the query part of the request URI, without the initial question mark delimiting it from the path.
REMOTE_ADDR: IP address of the client
REMOTE_HOST: Host name of the client if known, empty otherwise
REMOTE_USER: User name if the user was authenticated by the proxy.
REQUEST_ACL: Name of the request-acl used for this request.
REQUEST_HOST: The host part of the request URI.
REQUEST_METHOD: The HTTP request method as specified by the client in the request
REQUEST_URI: The complete request URI.
SESSION_ACL: Name of the session-acl used for this request.

Note that although this program interface resembles the CGI commonly used by WWW servers, it does not comply with the CGI definition in RFC 3875.
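A program of this kind might look like the following minimal sketch. It reads the request from standard input, takes a few of the environment variables listed above, and writes a complete HTTP response to standard output. The function names are hypothetical, and the sketch assumes that the proxy closes the program's standard input after passing the request, so that reading to end of file terminates.

```python
import os
import sys
from typing import Mapping


def build_response(request: bytes, env: Mapping[str, str]) -> bytes:
    """Build a small plain-text HTTP response describing the request."""
    # Variable names are those documented above; missing ones become empty.
    method = env.get("REQUEST_METHOD", "")
    uri = env.get("REQUEST_URI", "")
    client = env.get("REMOTE_ADDR", "")
    body = (
        f"method: {method}\n"
        f"uri: {uri}\n"
        f"client: {client}\n"
        f"request bytes read: {len(request)}\n"
    ).encode("utf-8")
    head = (
        "HTTP/1.0 200 OK\r\n"
        "Content-Type: text/plain; charset=utf-8\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body


def main() -> None:
    # Assumption: standard input reaches end of file after the request.
    request = sys.stdin.buffer.read()
    sys.stdout.buffer.write(build_response(request, os.environ))
    sys.stdout.buffer.flush()
```

To use the sketch as the configured program, end the script with a call to main(). The program should return promptly, because the proxy sends SIGTERM if the configured timeout expires first.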

Running a Program at the End of Request

Item request-acl.request-end-program enables running an external program at the end of each request. The proxy does not wait for termination of the program. The program gets information about the request in the same set of environment variables as a program for generating responses described in the previous section. The suffix of the CONTEXT variable value corresponds to the request processing result as reported in the REQUEST-END log message.
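A program run at the end of a request can recover the processing result from CONTEXT. The sketch below relies only on the three CONTEXT values documented in the previous section; the function and constant names are hypothetical.

```python
from typing import Optional

PREFIX = "request-end-program-"
RESULTS = ("ACCEPTED", "REJECTED", "FAILED")


def request_result(context: str) -> Optional[str]:
    """Return the result suffix of a CONTEXT value, or None otherwise."""
    if context.startswith(PREFIX):
        suffix = context[len(PREFIX):]
        if suffix in RESULTS:
            return suffix
    # Covers program-response and any value not documented above.
    return None
```

In a real program the argument would typically be os.environ.get("CONTEXT", "").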

Logging

Like all other Kernun proxies, http-proxy generates many log messages during its operation. The meaning of the messages may be found in section 6 of the manual pages. Details about Kernun logging can be found in logging(7).

The proxy logs statistical messages about each client connection and each request. When a connection arrives, SESSION-START is logged. Then an ACL message reports the session ACL selected for this connection. Each request generates REQUEST-START (when the request line and headers are received from the client), ACL (selection of a request ACL), and REQUEST-END (at the end of request processing). Finally, SESSION-END is logged when the client connection is closed. If AProxy is enabled, login and logout of each user are reported as an APROXY-AUTH message.

Common Kernun Features

Http-proxy uses common Kernun mechanisms for listening on its sockets, accepting client connections, and managing its processes. It can also run in a chrooted environment and change its user identity upon startup. See also application(5), tcpserver(5), and tcpserver(7).

The proxy uses a common Kernun mechanism for network input/output. The configuration allows specifying several parameters, such as buffer sizes and timeouts, for both client and server connections. The parameters are set in configuration sections client-conn and server-conn. See netio(7) for details.

The proxy uses the common Kernun mechanism for name resolving (see the resolving(7) manual page).

Http-proxy uses the common Kernun mechanism for runtime monitoring. For more detailed information, see monitoring(7).

Http-proxy uses the common Kernun mechanism for traffic shaping. For more detailed information, see traffic-shaping(7).

Options

-h: Display usage information and exit.
-v: Print version information and exit.
-d dbglev: Set the debugging level to a specific number. Permitted values are 3 through 9, 3 being the least and 9 the most verbose. See logging(7) for details. This setting is relevant only until configuration reading is finished.
-f cfgfile: Read configuration from cfgfile.
-r: Resolve names in configuration prior to testing.
-t test_expr: Test configuration according to given expression. Format of the test_expr is described in test-expr(5).

Document Templates

Two kinds of errors are generated by the http-proxy: hard and soft. A hard error is a state of the proxy in which the only possible reaction is to close (reset) the connection to the client immediately. A soft error means that the state of the protocol and the nature of the error allow sending an error response (an error document) describing the error to the client. The same mechanism is used for other responses generated locally by the proxy, e.g., the FTP response for the PUT method, the AProxy authentication form, and the AProxy logout page.

The content type of a response document is text/html. The response document can be in various languages (UTF-8 charset), depending on the proxy configuration and client's preferences. Templates of response documents are stored in files named document-root/class.html.language where document-root is the root directory of error documents set in the proxy configuration file, class is a class of a response, and language distinguishes documents in various languages. Possible values for language are:

EN: language English (en)
CZ: language Czech (cs)

For each pair of response class and language there is a template file in the document root directory. The template is merely an HTML document possibly containing $$, $-$, and $n$ (where n is a non-negative integer) directives. Each $$ or $0$ is substituted by a single $ character. Directives $-$, $1$, $2$, etc. are replaced by substitution strings generated by the proxy. The substitution strings contain variable parts of a response document, which are specific for each response of a given class. Some substitution strings are common to all response classes. Additional ones may be defined for a particular class. The substitution strings are:

common for all classes

$1$: the HTTP status code of the response
$2$: the reason phrase corresponding to the status code
$3$: the request URI of the request (if the URI does not contain host, it is added from the Host header and if there is no Host header and the request is transparent, the real destination address is used)
$4$: the firewall administrator address taken from configuration (item admin)
$5$: the Kernun product type (UTM / Clear Web / Firewall+)
$6$: the session id in log format (altname[pid.session])
$7$: the request start date/time (%Y/%m/%d %H:%M:%S)
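The directive expansion described above can be approximated by the following sketch. It is not the proxy's implementation: the function name is hypothetical, and two behaviors are assumptions, namely that a directive without a substitution string expands to the empty string and that a lone $ is left unchanged.

```python
import re
from typing import Mapping

# Matches $$, $-$, and $n$ where n is a run of decimal digits.
DIRECTIVE = re.compile(r"\$(\d*|-)\$")


def expand(template: str, subst: Mapping[str, str]) -> str:
    """Expand template directives using substitution strings keyed by "-", "1", "2", ..."""

    def repl(match: "re.Match[str]") -> str:
        key = match.group(1)
        if key == "-":
            return subst.get("-", "")
        if key == "" or int(key) == 0:
            return "$"  # both $$ and $0$ produce a literal dollar sign
        # Assumption: an undefined substitution string expands to nothing.
        return subst.get(str(int(key)), "")

    return DIRECTIVE.sub(repl, template)
```

For example, expanding "Error $1$ $2$" with {"1": "403", "2": "Forbidden"} gives "Error 403 Forbidden".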

acl-deny

Error response when the request is denied by request-acl or doc-acl.

$8$: name of the ACL that denied access
$9$: the message specified by item request-acl.deny-msg or doc-acl.deny-msg
$10$: the client IP address and/or domain name
$11$: the user name (if authenticated)
$12$: the AProxy user name (if AProxy authentication used)
$13$: the list of categories assigned to the request URI by the Clear Web DataBase
$14$: the Clear Web DataBase categories assigned to the request URI matched by the clear-web-db-match item, that is, the intersection of $13$ and $15$
$15$: the Clear Web DataBase categories specified in the selected request-acl by item clear-web-db-match

The same set of substitution strings is also used by the response class clear-web-db-deny and by responses defined by request-acl.deny-msg and doc-acl.deny-msg.

clear-web-db-deny

Error response used when a request is denied by the Clear Web DataBase, that is, if a request-acl contains both items clear-web-db-match and deny.

bypass

The Clear Web DataBase Bypass activation page, returned if a request-acl contains both items clear-web-db-match and clear-web-db-bypass, and bypass has not been activated yet.

$8$: the list of categories assigned to the request URI by the Clear Web DataBase
$9$: bypass duration as set by clear-web-db-bypass.duration
$10$: the Clear Web DataBase categories specified in the selected request-acl by item clear-web-db-match

generic-error

Error response used when a soft error occurs. Description of the error is substituted for $8$ .

cert-error

Error response used when a server presents an invalid certificate.

$8$: the certificate common name
$9$: the certificate issuer name
$10$: the certificate serial number

redirect

The response used when request-acl.rewrite contains a redirect. The redirection target URI is substituted for $8$ .

ftp-response-put

Response for a PUT request with ftp: scheme. Result returned by ftp-proxy is substituted for $8$ .

aproxy-password-form

AProxy form for entering user name and password.

$8$: error message generated by AProxy
$10$: AProxy cookie name
$11$: AProxy cookie value
$12$: original request method
$13$: encoded original request headers
$-$: encoded original request body

aproxy-response-form

AProxy form which displays a challenge and asks for a response.

$8$: error message generated by AProxy
$9$: AProxy authentication challenge
$10$: AProxy cookie name
$11$: AProxy cookie value
$12$: original request method
$13$: encoded original request headers
$-$: encoded original request body

aproxy-logout

A page with information that the user has been logged out by AProxy.

Files

error_documents: Directory containing templates of error responses, FTP responses, AProxy forms, and local replacement documents; its real name and location are specified by configuration item document-root.

Bugs

The Kernun http-proxy is a security proxy, not a caching proxy. If caching of HTTP responses is needed, a caching HTTP proxy server can be chained using the hand-off configuration directive or using a transparent redirection of requests.

HTTP/1.1 request pipelining is not supported. If the client sends pipelined requests, they are processed sequentially, as in the non-pipelined case.
