tcpserver — TCP client connections and process management in proxies
The part of Kernun Firewall called tcpserver
    handles the server side of proxies. It is implemented by the C function 
    tcpserver() contained in a library linked
    to proxies. After a proxy performs its initialization (command line parsing,
    configuration reading, log opening), it calls
    tcpserver(). Among other parameters,
    tcpserver() gets a callback function for connection
    handling. The tcpserver() function waits for a connection
    from a client and then calls the callback and passes it the file
    descriptor of the accepted connection. The callback is supposed to process
    the connection (it performs the proxy-specific work) and then return to
    tcpserver(). When this happens,
    tcpserver() waits for the next connection.
    
The tcpserver() function also manages multiple
    processes needed for parallel handling of connections. Moreover, it
    processes termination and log level change signals.
    
Proxy child processes are managed using the pre-forked process model. This model of process management is also used, for example, by the Apache WWW server.
Most TCP process control attributes are contained in the
    tcpserver configuration section (see
    tcpserver(5) manual page); some, which
    are common for TCP and UDP proxies, are part of another configuration
    section, application (see
    application(5) manual page).
    
The TCP server handles the signals listed below. All signals except
        SIGUSR1 and SIGUSR2 should
        always be sent only to the parent process of a proxy.
        
SIGUSR1: Increase the log level of a child process (or of the parent process and all its children, if sent to the parent).
SIGUSR2: Decrease the log level of a child process (or of the parent process and all its children, if sent to the parent).
SIGHUP: Graceful termination; the proxy stops accepting new connections, waits until all open connections are closed, and terminates.
SIGTERM, SIGINT, SIGQUIT: Immediate termination; the proxy closes all connections and terminates at once.
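The signal semantics listed above can be illustrated with a minimal sketch. It is not Kernun's implementation: the variable names, the initial log level of 3, and the choice of a single shared handler are assumptions made for illustration only.

```c
#define _POSIX_C_SOURCE 200809L  /* must precede all includes */
#include <signal.h>
#include <string.h>

/* Hypothetical state; the names are illustrative, not Kernun's. */
static volatile sig_atomic_t log_level = 3;       /* assumed initial level */
static volatile sig_atomic_t graceful_stop = 0;   /* set by SIGHUP */
static volatile sig_atomic_t immediate_stop = 0;  /* set by SIGTERM/SIGINT/SIGQUIT */

/* The handler only updates sig_atomic_t flags; the main loop acts on them. */
static void on_signal(int sig)
{
    switch (sig) {
    case SIGUSR1: log_level = log_level + 1; break;
    case SIGUSR2: if (log_level > 0) log_level = log_level - 1; break;
    case SIGHUP:  graceful_stop = 1; break;
    default:      immediate_stop = 1; break;  /* SIGTERM, SIGINT, SIGQUIT */
    }
}

/* Install the handler for all six signals; returns 0 on success. */
static int install_signal_handlers(void)
{
    static const int sigs[] = {
        SIGUSR1, SIGUSR2, SIGHUP, SIGTERM, SIGINT, SIGQUIT
    };
    struct sigaction sa;
    size_t i;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    for (i = 0; i < sizeof sigs / sizeof sigs[0]; i++)
        if (sigaction(sigs[i], &sa, NULL) != 0)
            return -1;
    return 0;
}
```

From the shell, the same requests would be issued with kill(1), for example by sending SIGUSR1 to the parent PID to raise the log level of the whole proxy.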
If the singleproc item is present in the
        application configuration section, the proxy handles
        all connections in a single process. The algorithm is very simple:
        
1. Create and bind sockets according to the configuration (see listen-on(5)).
2. Switch credentials according to the configuration (see application(5)).
3. Wait for a connection from a client.
4. Call the proxy-specific connection handling function and pass it the accepted connection.
5. After a successful return from the handling function, go to 3. If the handling function returns an error, exit TCP server.
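The accept-and-dispatch part of the single-process algorithm (wait for a connection, call the handler, repeat) might look roughly like the sketch below. The callback type conn_handler, the function names, and the decision to close the descriptor after the handler returns are assumptions; the real tcpserver() signature is not shown in this page, and the real function serves several listening sockets rather than one.

```c
#define _POSIX_C_SOURCE 200809L  /* must precede all includes */
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical callback type: returns 0 on success, nonzero on error. */
typedef int (*conn_handler)(int conn_fd, void *arg);

/*
 * Accept connections on one listening socket and hand each to the handler.
 * Returns the number of connections handled successfully before the
 * handler reported an error, or -1 if accept() itself failed.
 */
static int serve_single_process(int listen_fd, conn_handler handle, void *arg)
{
    int handled = 0;

    for (;;) {
        int conn_fd, rc;

        conn_fd = accept(listen_fd, NULL, NULL);  /* wait for a client */
        if (conn_fd < 0) {
            if (errno == EINTR)
                continue;                         /* interrupted by a signal */
            return -1;
        }
        rc = handle(conn_fd, arg);                /* proxy-specific work */
        close(conn_fd);
        if (rc != 0)
            return handled;                       /* handler error: stop serving */
        handled++;
    }
}

/* Example handler: echo one chunk of data back to the client. */
static int echo_once(int conn_fd, void *arg)
{
    char buf[256];
    ssize_t n = read(conn_fd, buf, sizeof buf);

    (void)arg;
    if (n < 0)
        return -1;
    if (n == 0)
        return 0;                                 /* client closed immediately */
    return write(conn_fd, buf, (size_t)n) == n ? 0 : -1;
}
```

A caller would create and bind the listening socket, drop privileges, and then invoke serve_single_process(listen_fd, echo_once, NULL).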
If the singleproc item is not present in the
        configuration, the parent proxy process forks child processes
        that handle incoming connections. The parent does not accept any
        connections itself; it only monitors the status of the child processes,
        starts new children, and kills superfluous ones.
        
Parent algorithm:
1. Create and bind sockets according to the configuration (see listen-on(5)).
2. Switch credentials according to the configuration (see application(5)).
3. Create init-children child processes.
4. Count busy children (those processing a connection) and idle ones (those waiting for a connection).
5. If there are fewer than min-idle idle
            children, try to fork new children to reach
            min-idle. At most min-start-rate
            children are forked, and the total number of child processes never
            exceeds max-children. If there are still not
            enough idle child processes during the next parent cycle,
            2 * min-start-rate new children are
            forked. The number of forked children is then doubled in
            each following parent cycle, up to the maximum of
            max-start-rate new children per cycle. Once
            min-idle is reached, the number of forks per cycle is
            reset to min-start-rate.
6. If there are more than max-idle idle child
            processes, try to kill some idle children to get down to
            max-idle. At most kill-rate
            children are killed per cycle.
7. If SIGHUP has been received, wait for all
            children to terminate and exit.
8. If the parent cycle has been repeated info-cycle
            times, log a statistical message containing the number of forked and killed
            children.
9. Wait for parent-cycle ms and start a new parent
            cycle (go to 4).
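The fork and kill limits of the parent cycle can be expressed as a small pure function. This is a sketch under assumptions: the struct and function names are hypothetical, only the fields mirror the configuration items, and the encoding of the result as a signed count is a choice made here for brevity.

```c
/* Hypothetical parameter bundle named after the configuration items. */
struct spawn_cfg {
    int min_idle, max_idle, max_children;
    int min_start_rate, max_start_rate, kill_rate;
};

/*
 * Plan one parent cycle. *rate holds the current fork limit (start it at
 * min_start_rate) and is updated for the next cycle.
 * Returns > 0: number of children to fork; < 0: negated number of idle
 * children to kill; 0: nothing to do.
 */
static int plan_cycle(const struct spawn_cfg *c, int busy, int idle, int *rate)
{
    int total = busy + idle;
    int n;

    if (idle < c->min_idle) {
        n = c->min_idle - idle;
        if (n > *rate)
            n = *rate;                      /* per-cycle fork limit */
        if (n > c->max_children - total)
            n = c->max_children - total;    /* never exceed max-children */
        if (n < 0)
            n = 0;
        /* Double the limit for the next cycle, capped at max-start-rate. */
        *rate *= 2;
        if (*rate > c->max_start_rate)
            *rate = c->max_start_rate;
        return n;
    }

    *rate = c->min_start_rate;              /* min-idle reached: reset */
    if (idle > c->max_idle) {
        n = idle - c->max_idle;
        if (n > c->kill_rate)
            n = c->kill_rate;               /* per-cycle kill limit */
        return -n;
    }
    return 0;
}
```

With min-start-rate 2 and max-start-rate 8, a persistent shortage of idle children lets the limit grow 2, 4, 8, 8, ... until min-idle is reached.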
If the creation of a new child process fails because of a lack
        of system resources, it is retried up to fork-retries
        times, with a pause of fork-wait ms
        between consecutive attempts. If all fork-retries attempts are
        unsuccessful, no new child is started, but the proxy continues its
        operation (and possibly starts children later, when the system load
        decreases). 
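A retry loop of this kind could be sketched as follows. The function names are hypothetical, and two points are assumptions not confirmed by this page: that fork-retries counts the total number of attempts, and that EAGAIN and ENOMEM are the errors treated as a lack of system resources.

```c
#define _POSIX_C_SOURCE 200809L  /* must precede all includes */
#include <errno.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Sleep for the given number of milliseconds (an interrupted sleep is not resumed). */
static void sleep_ms(long ms)
{
    struct timespec ts;

    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}

/*
 * Call fork() up to `retries` times, pausing wait_ms between attempts.
 * Only resource shortages (assumed: EAGAIN, ENOMEM) are retried.
 * Returns the result of a successful fork() (0 in the child, the child's
 * PID in the parent), or -1 if no attempt succeeded.
 */
static pid_t fork_with_retries(int retries, long wait_ms)
{
    int attempt;

    for (attempt = 0; attempt < retries; attempt++) {
        pid_t pid = fork();

        if (pid >= 0)
            return pid;
        if (errno != EAGAIN && errno != ENOMEM)
            break;                      /* not a resource shortage */
        if (attempt + 1 < retries)
            sleep_ms(wait_ms);
    }
    return -1;
}
```

On a -1 result the caller simply skips starting the child in this cycle and tries again in a later parent cycle.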
Additionally, the parent process manages a single child process that resolves DNS names from the configuration. This child process is not controlled by the above algorithm and is restarted as required for proper name resolution (see resolving(7)).
Child algorithm:
1. Start listening on all server sockets, as specified by the
            listen-on configuration value.
2. Wait for a connection from a client.
3. Call the proxy-specific connection handling function and pass it the accepted connection.
4. After a successful return from the handling function, go to 2. If the handling function returns an error, terminate the particular child process. The proxy continues running and replaces the terminated child as necessary.
In order to be able to manage its child processes, the parent 
        process must communicate with them. Two mechanisms are used for this
        purpose: shared memory and signals. There is a shared memory structure
        called “scoreboard” containing one slot for each possible
        child (i.e., max-children slots). Each child
        maintains a flag in its scoreboard slot that indicates whether the
        child is busy or idle. The parent reads these flags when counting
        its children. The parent sends signals to the children in order to kill
        a superfluous child, perform an immediate or graceful termination, and
        increase or decrease the log level. As there are not enough signal
        numbers available, the parent uses SIGTERM for
        immediate termination and SIGHUP for all other
        requests. The type of request is indicated by a value set by the
        parent in the scoreboard before sending the signal.
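A scoreboard of this kind can be sketched with an anonymous shared mapping created before the children are forked. The slot layout, the enum values, and the function names below are hypothetical; the actual scoreboard format is not described in this page.

```c
#define _DEFAULT_SOURCE  /* for MAP_ANONYMOUS on glibc; must precede all includes */
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif

/* Hypothetical slot contents. */
enum slot_state { SLOT_FREE = 0, SLOT_IDLE, SLOT_BUSY };
enum slot_request { REQ_NONE = 0, REQ_KILL, REQ_GRACEFUL, REQ_LOG_UP, REQ_LOG_DOWN };

struct slot {
    volatile sig_atomic_t state;    /* written by the child */
    volatile sig_atomic_t request;  /* written by the parent before SIGHUP */
};

/* Map max_children zero-filled slots that stay shared across fork(). */
static struct slot *scoreboard_create(size_t max_children)
{
    void *p = mmap(NULL, max_children * sizeof(struct slot),
                   PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0);

    return p == MAP_FAILED ? NULL : (struct slot *)p;
}

/* Parent side: count idle and busy children. */
static void scoreboard_count(const struct slot *sb, size_t n, int *idle, int *busy)
{
    size_t i;

    *idle = 0;
    *busy = 0;
    for (i = 0; i < n; i++) {
        if (sb[i].state == SLOT_IDLE)
            (*idle)++;
        else if (sb[i].state == SLOT_BUSY)
            (*busy)++;
    }
}

/* Parent side: store the request type first, then notify the child with SIGHUP. */
static int scoreboard_request(struct slot *sb, size_t i, pid_t pid,
                              enum slot_request req)
{
    sb[i].request = req;
    return kill(pid, SIGHUP);
}
```

In this sketch a child sets its slot to SLOT_BUSY before calling the handler and back to SLOT_IDLE afterwards; its SIGHUP handler reads the request field to decide what the parent asked for.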
        
Doing
        select()/accept() by
        multiple processes in parallel on the same set of sockets causes
        a problem (see, e.g., Apache WWW server documentation, section
        "General Performance hints"). If a single connection arrives, all
        processes are woken up from select() and call
        accept(). Only one accept()
        succeeds and returns; all the other processes block in
        accept(). As a result, those processes are now waiting for
        a connection on a single socket, and the remaining sockets are not
        handled. Therefore, select() and
        accept() are placed in a critical section secured
        by a lock, which ensures that only one process sleeps
        in select() at a time. The lock is implemented
        using flock() on a file specified by a parameter
        of the lock item. For a large number of child processes
        (many hundreds or thousands), locking via flock()
        may behave incorrectly and block the proxy operation. Therefore, it is
        possible to use an alternative lock implementation selected by the
        alt-lock item. The following possibilities are available:
        
none: No locking is done. Accept is called in the
            non-blocking mode in order to solve the above-mentioned problem
            with processes blocked in accept() on
            a single socket.
semaphore: Locking is done using a System V semaphore.
lock2: Locking uses a two-level
            flock() locking scheme that locks parts of
            a single lock file. This is an experimental variant that should
            not be used, because with many processes it exhibits a problem
            similar to that of the standard single
            flock().
multilock2: This is the recommended alternative locking
            mechanism. It uses a two-level flock()
            locking scheme with each lock on a separate file. The set of
            NxN processes is divided into N subsets of N processes. Members of
            each subset share one lock, and there is a single global lock.
            To acquire the lock, a process must first lock the lock file
            belonging to its subset and then lock the global lock. This
            algorithm reduces the maximum number of processes waiting on
            a single lock file.
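The two-level scheme described for multilock2 can be sketched with two flock() calls in a fixed order. The struct, the function names, the file paths chosen by the caller, and the mapping of a child index to its subset are assumptions for illustration; the real lock file naming is not given here.

```c
#define _DEFAULT_SOURCE  /* for flock() on glibc; must precede all includes */
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical descriptor pair: one lock file per subset plus one global file. */
struct two_level_lock {
    int subset_fd;  /* shared by the N processes of one subset */
    int global_fd;  /* shared by all subsets */
};

/* Subset index of a child, assuming children are numbered 0 .. N*N-1. */
static int subset_of(int child_index, int n)
{
    return child_index / n;
}

/* Open (creating if needed) both lock files; returns 0 on success. */
static int two_level_open(struct two_level_lock *l,
                          const char *subset_path, const char *global_path)
{
    l->subset_fd = open(subset_path, O_RDWR | O_CREAT, 0600);
    if (l->subset_fd < 0)
        return -1;
    l->global_fd = open(global_path, O_RDWR | O_CREAT, 0600);
    if (l->global_fd < 0) {
        close(l->subset_fd);
        return -1;
    }
    return 0;
}

/* Acquire the subset lock first, then the global lock. */
static int two_level_acquire(const struct two_level_lock *l)
{
    if (flock(l->subset_fd, LOCK_EX) != 0)
        return -1;
    if (flock(l->global_fd, LOCK_EX) != 0) {
        flock(l->subset_fd, LOCK_UN);
        return -1;
    }
    return 0;
}

/* Release in the reverse order. */
static int two_level_release(const struct two_level_lock *l)
{
    int rc = flock(l->global_fd, LOCK_UN);

    if (flock(l->subset_fd, LOCK_UN) != 0)
        rc = -1;
    return rc;
}
```

Because only the holder of a subset lock competes for the global lock, at most N processes wait on any one file: N on each subset file and one per subset on the global file.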
        If both or neither of the lock and
        alt-lock items are specified, the standard locking
        is used if max-children is up to 500, and
        multilock2 is used for
        max-children of 501 or more.
        
Experiments indicate that this arrangement is not strictly
        necessary on FreeBSD, because it seems that if there is a very short
        time between select() and
        accept(), only a single process is woken up from
        select() and calls accept().
        However, this positive feature is dependent on timing (and thus on such
        unpredictable conditions as the system load). We have implemented the 
        serialization lock in order to prevent race conditions.
        
Be careful when configuring lock-file names for proxies. If two different proxies happen to use the same filename, one of them gets stuck. Such a situation looks rather strange: the TCP handshake takes place, but data exchange does not. As proxy processes are unable to detect this situation, care should be taken.
If the nodaemon item is used without
        singleproc in the configuration, i.e., for
        parent/children operation in no-daemon mode, the proxy runs in
        the same process group as its parent process (unless it was moved to
        another group before the proxy program was executed). The proxy parent
        process uses the kill(0, sig) system call to
        propagate SIGTERM and SIGHUP
        to its children. However, the signal is delivered to all processes in the
        process group of the proxy. Thus, other processes (not belonging to
        the proxy) in the same group should make appropriate provisions in 
        order not to be disturbed by these signals.
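One possible provision is for the launching program to move the proxy into a process group of its own before executing it, so that kill(0, sig) issued by the proxy parent reaches only the proxy's processes. The sketch below shows this with standard POSIX calls; the function name is hypothetical and the argv passed in is whatever command line the caller uses to start the proxy.

```c
#define _POSIX_C_SOURCE 200809L  /* must precede all includes */
#include <sys/types.h>
#include <unistd.h>

/*
 * Fork and execute argv[0] in a new process group.
 * Returns the child's PID in the caller, or -1 if fork() failed.
 */
static pid_t spawn_in_own_group(char *const argv[])
{
    pid_t pid = fork();

    if (pid == 0) {
        setpgid(0, 0);          /* child: become the leader of a new group */
        execvp(argv[0], argv);
        _exit(127);             /* reached only if exec failed */
    }
    if (pid > 0)
        setpgid(pid, pid);      /* parent: same change, in case it runs first;
                                   a failure after the child's exec is harmless */
    return pid;
}
```

Calling setpgid() in both the parent and the child is the usual way to make sure the group change has taken effect no matter which process is scheduled first.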