Name

tcpserver — TCP client connections and process management in proxies

Description

The part of Kernun Firewall called tcpserver handles the server side of proxies. It is implemented by the C function tcpserver(), contained in a library linked to proxies. After a proxy completes its initialization (command-line parsing, configuration reading, log opening), it calls tcpserver(). Among other parameters, tcpserver() receives a callback function for connection handling. The tcpserver() function waits for a connection from a client, then calls the callback and passes it the file descriptor of the accepted connection. The callback is expected to process the connection (it performs the proxy-specific work) and then return to tcpserver(). When this happens, tcpserver() waits for the next connection.

The tcpserver() function also manages multiple processes needed for parallel handling of connections. Moreover, it processes termination and log level change signals.

The management of proxy child processes is performed using pre-forked processes. This concept of process management is used, for example, by the Apache WWW server.

Most TCP process control attributes are contained in the tcpserver configuration section (see tcpserver(5) manual page); some, which are common for TCP and UDP proxies, are part of another configuration section, application (see application(5) manual page).

Signals

The TCP server handles the signals listed below. All signals except SIGUSR1 and SIGUSR2 should always be sent only to the parent process of a proxy.

SIGUSR1

Increase the log level of a child process (or of the parent process and all its children, if sent to the parent).

SIGUSR2

Decrease the log level of a child process (or of the parent process and all its children, if sent to the parent).

SIGHUP

Graceful termination; the proxy does not accept any new connection, waits until all open connections are closed, and terminates.

SIGTERM, SIGINT, SIGQUIT

Immediate termination; the proxy closes all connections and terminates immediately.

Single Process Operation

If item singleproc is present in the application configuration section, the proxy manages all connections using a single process. The algorithm is very simple:

  1. Create and bind sockets according to the configuration (see listen-on(5)).

  2. Switch credentials according to the configuration (see application(5)).

  3. Wait for a connection from a client.

  4. Call the proxy-specific connection handling function and pass it the accepted connection.

  5. After a successful return from the handling function, go to 3. If the handling function returns an error, exit TCP server.
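
Steps 3 to 5 above can be illustrated by a short accept loop. This is a sketch under assumptions: the conn_handler type and the serve_single function are hypothetical names, the real tcpserver() takes further parameters, and closing the descriptor after the callback returns is a choice made here that the text above does not specify.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical callback type: returns 0 on success, nonzero on error. */
typedef int (*conn_handler)(int conn_fd);

/* Accept connections on an already bound, listening socket and pass each
 * one to the handler. Returns -1 when accept() fails for a reason other
 * than an interrupting signal, or when the handler reports an error. */
int serve_single(int listen_fd, conn_handler handle)
{
    for (;;) {
        int conn_fd = accept(listen_fd, NULL, NULL);
        if (conn_fd < 0) {
            if (errno == EINTR)
                continue;        /* interrupted by a signal: wait again */
            return -1;
        }
        int rc = handle(conn_fd);
        close(conn_fd);          /* assumption: the loop owns the descriptor */
        if (rc != 0)
            return -1;           /* handler error: leave the server loop */
    }
}
```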

Parent/Children Operation

If item singleproc is not present in the configuration, the parent proxy process forks child processes that handle incoming connections. The parent does not accept any connection; it only monitors the status of child processes, starts new children and/or kills superfluous ones.

Parent algorithm:

  1. Create and bind sockets according to the configuration (see listen-on(5)).

  2. Switch credentials according to the configuration (see application(5)).

  3. Create init-children child processes.

  4. Count busy children (those processing a connection) and idle ones (those waiting for a connection).

  5. If there are fewer than min-idle idle children, try to fork new children to achieve min-idle. At most min-start-rate children are forked, and the total number of child processes never exceeds max-children. If there are still not enough idle child processes during the next parent cycle, 2 * min-start-rate new children will be forked. Subsequently, the number of forked children is doubled in each following parent cycle, up to the maximum of max-start-rate new children per cycle. Once min-idle is reached, the number of forks per cycle is reset to min-start-rate.

  6. If there are more than max-idle idle child processes, try to kill some idle children to achieve max-idle. At most kill-rate children are killed.

  7. If SIGHUP has been received, wait for all children to terminate and exit.

  8. If the parent cycle has been repeated info-cycle times, log a statistical message containing the number of forked and killed children.

  9. Wait for parent-cycle ms and start a new parent cycle (go to 4).
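
The start-rate rule of step 5 can be expressed as a small pure function. The sketch below is one reading of that step, not the actual implementation; struct spawn_cfg and plan_forks are hypothetical names, and the fields mirror the configuration items mentioned above.

```c
/* Hypothetical view of the relevant configuration items. */
struct spawn_cfg {
    int min_idle;
    int max_children;
    int min_start_rate;
    int max_start_rate;
};

/* Return how many children to fork in this parent cycle.
 * *rate is the current per-cycle limit (initially min_start_rate);
 * it is updated for the next cycle. */
int plan_forks(const struct spawn_cfg *cfg, int idle, int total, int *rate)
{
    if (idle >= cfg->min_idle) {
        *rate = cfg->min_start_rate;      /* min-idle reached: reset the rate */
        return 0;
    }

    int want = cfg->min_idle - idle;      /* children missing to reach min-idle */
    int n = want < *rate ? want : *rate;  /* limited by the current rate */
    if (total + n > cfg->max_children)
        n = cfg->max_children - total;    /* never exceed max-children */
    if (n < 0)
        n = 0;

    int next = *rate * 2;                 /* double the limit for the next cycle */
    *rate = next > cfg->max_start_rate ? cfg->max_start_rate : next;
    return n;
}
```

With min-idle 10, min-start-rate 1 and max-start-rate 4, consecutive cycles that remain short of idle children are allowed 1, 2, 4, 4, ... forks, and the limit drops back to 1 once ten idle children exist.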

If the creation of a new child process fails because of a lack of system resources, it is repeated up to fork-retries times. There is a pause of fork-wait ms between every two attempts. If all fork-retries are unsuccessful, no new child is started, but the proxy continues its operation (and possibly starts children later, when the system load decreases).

Additionally, the parent process manages a single child process that resolves DNS names from the configuration. This child process is not controlled by the above algorithm and is restarted as required for proper name resolution (see resolving(7)).

Child algorithm:

  1. Start listening on all server sockets, as specified by the listen-on configuration value.

  2. Wait for a connection from a client.

  3. Call the proxy-specific connection handling function and pass it the accepted connection.

  4. After a successful return from the handling function, go to 2. If the handling function returns an error, terminate the particular child process. The proxy continues running and replaces the terminated child as necessary.

Inter-Process Communication

In order to manage its child processes, the parent process must communicate with them. Two mechanisms are used for this purpose: shared memory and signals. There is a shared memory structure called the scoreboard, containing one slot for each possible child (i.e., max-children slots). Each child maintains a flag in its scoreboard slot that indicates whether the child is busy or idle. The parent reads these flags when counting its children. The parent sends signals to the children in order to kill a superfluous child, perform an immediate or graceful termination, and increase or decrease the log level. As there are not enough signal numbers available, the parent uses SIGTERM for immediate termination and SIGHUP for all other requests. The type of request is indicated by a value set by the parent in the scoreboard before sending the signal.

Accept Serialization

Calling select()/accept() from multiple processes in parallel on the same set of sockets causes a problem (see, e.g., the Apache WWW server documentation, section "General Performance hints"). If a single connection arrives, all processes are woken up from select() and call accept(). Only one accept() succeeds and returns; all the other processes block in accept(). As a result, these processes are now all waiting for a connection on a single socket, and the remaining sockets are not handled. Therefore, select() and accept() are placed in a critical section secured by a lock, which ensures that only one process sleeps in select() at a time. The lock is implemented using flock() on a file specified by a parameter of the lock item. For a large number of child processes (many hundreds or thousands), locking via flock() may behave incorrectly and block the proxy operation. Therefore, it is possible to use an alternative lock implementation selected by the alt-lock item. The following possibilities are available:

none

No locking is done. Accept is called in the non-blocking mode, in order to solve the above-mentioned problem with processes blocked in an accept() function on a single socket.

semaphore

Locking is done using a System V semaphore.

lock2

Locking uses a two-level flock() locking scheme with locking parts of a single lock file. This is an experimental variant that should not be used, because it exhibits a similar problem with many processes as the standard single flock().

multilock2

This is the recommended alternative locking mechanism. It uses a two-level flock() locking scheme with each lock on a separate file. The set of NxN processes is divided into N subsets of N processes. Members of each subset share one lock and there is a single global lock. To acquire the lock, a process must first lock the lock file belonging to its subset and then lock the global lock. This algorithm reduces the maximum number of processes waiting on a single lock file.

If both or neither of the lock and alt-lock items are specified, the standard locking is used when max-children is at most 500, and multilock2 is used when max-children is 501 or more.
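
The two-level scheme of multilock2 can be sketched as follows. All names are hypothetical, and two details are assumptions not stated above: N is taken as the smallest integer with N * N >= max-children, and a child is assigned to a subset by dividing its scoreboard slot number by N.

```c
#include <sys/file.h>

/* Assumed rule: smallest N such that N * N >= max_children. */
int lock_group_size(int max_children)
{
    int n = 1;
    while (n * n < max_children)
        n++;
    return n;
}

/* Assumed rule: index of the subset lock file for a scoreboard slot. */
int lock_group_of(int slot, int group_size)
{
    return slot / group_size;
}

/* Enter the critical section: subset lock first, then the global lock.
 * Each child must open() both files itself after fork(), because
 * descriptors inherited from the parent would share a single flock() lock.
 * Returns 0 on success, -1 on failure with no lock held. */
int accept_lock(int subset_fd, int global_fd)
{
    if (flock(subset_fd, LOCK_EX) != 0)
        return -1;
    if (flock(global_fd, LOCK_EX) != 0) {
        flock(subset_fd, LOCK_UN);
        return -1;
    }
    return 0;
}

/* Leave the critical section, releasing the locks in reverse order. */
void accept_unlock(int subset_fd, int global_fd)
{
    flock(global_fd, LOCK_UN);
    flock(subset_fd, LOCK_UN);
}
```

A child would call accept_lock(), run select() and accept(), and then call accept_unlock() before handling the connection. At most N processes wait on any subset file, and at most one process per subset (N in total) competes for the global file.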

Note

Experiments indicate that this arrangement is not strictly necessary on FreeBSD, because it seems that if there is a very short time between select() and accept(), only a single process is woken up from select() and calls accept(). However, this positive feature is dependent on timing (and thus on such unpredictable conditions as the system load). We have implemented the serialization lock in order to prevent race conditions.

Warning

Be careful when configuring lock-file names for proxies. If two different proxies happen to use the same filename, one of them gets stuck. Such a situation looks rather strange: the TCP handshake takes place, but data exchange does not. As proxy processes are unable to detect this situation, care should be taken.

Process Groups

Caution

If the nodaemon item is used without singleproc in the configuration, i.e., parent/children operation in no-daemon mode, the proxy runs in the same process group as its parent process (unless it was moved to another group before the proxy program was executed). The proxy parent process uses the kill(0, sig) system call to propagate SIGTERM and SIGHUP to its children. However, this call delivers the signal to all processes in the process group of the proxy. Thus, other processes (not belonging to the proxy) in the same group should make appropriate provisions in order not to be disturbed by these signals.

See Also

listen-on(5), application(5), tcpserver(5), resolving(7)

Authors

This man page is a part of Kernun Firewall.
Copyright © 2000–2023 Trusted Network Solutions, a. s.
All rights reserved.