crossroads

Git mirror of https://crossroads.e-tunity.com/
git clone git://git.finwo.net/app/crossroads

crossroads.html (129102B)


      1 <a name="defs.yo"></a><html><head>
      2 <title>Crossroads 1.23</title>
      3 <link rel="stylesheet" type="text/css" href="http://www.e-tunity.com/css/yodl.css">
      5 <link rev="made" href="mailto:info@e-tunity.com">
      6 </head>
      7 <body>
      8 <hr>
      9 <h1>Crossroads 1.23</h1>
     10 <h2>Karel Kubat</h2>
     11 
     12 <h2>e-tunity</h2><h2>2005, 2006, ff.</h2>
     13 
     14 <blockquote><em>Crossroads is a load balance and fail over utility for TCP
     15          based services. It is a daemon program running in user
     16          space, and features extensive configurability, polling of
     17          back ends using 'wakeup calls', detailed status reporting,
     18          'hooks' for special actions when backend calls fail, and much
     19          more. Crossroads is service-independent: it is usable for
     20          HTTP/HTTPS, SSH, SMTP, DNS, etc. In the case of HTTP
     21          balancing, Crossroads can modify HTTP headers, e.g. to
     22          provide 'session stickiness' for
     23          back-end processes that need sessions, but aren't
     24          session-aware of other back-ends.</em></blockquote>
     25 
     26 <h1>Table of Contents</h1>
     27 <dl>
     28 <dl>
     29 <dt><h3><a href="#l1">1: Introduction</a></h3></dt>
     30 <dl>
     31 <dt><a href="#l2">1.1: Obtaining Crossroads</a></dt>
     32 <dt><a href="#l3">1.2: Copyright and Disclaimer</a></dt>
     33 <dt><a href="#l4">1.3: Terminology</a></dt>
     34 <dt><a href="#l5">1.4: Porting issues for pre-1.21 installations</a></dt>
     35 <dt><a href="#l6">1.5: Porting issues for pre-0.26 installations</a></dt>
     36 <dt><a href="#l7">1.6: Porting issues for pre-1.08 installations</a></dt>
     37 </dl>
     38 <dt><h3><a href="#l8">2: Installation for the impatient</a></h3></dt>
     39 <dt><h3><a href="#l9">3: Using Crossroads</a></h3></dt>
     40 <dl>
     41 <dt><a href="#l10">3.1: General Commandline Syntax</a></dt>
     42 <dt><a href="#l11">3.2: Logging-related options</a></dt>
     43 <dt><a href="#l12">3.3: Reloading Configurations</a></dt>
     44 </dl>
     45 <dt><h3><a href="#l13">4: The configuration</a></h3></dt>
     46 <dl>
     47 <dt><a href="#l14">4.1: General language elements</a></dt>
     48 <dl>
     49 <dt><a href="#l15">4.1.1: Empty lines and comments</a></dt>
     50 <dt><a href="#l16">4.1.2: Keywords, numbers, identifiers, generic strings</a></dt>
     51 </dl>
     52 <dt><a href="#l17">4.2: Service definitions</a></dt>
     53 <dl>
     54 <dt><a href="#l18">4.2.1: type - Defining the service type</a></dt>
     55 <dt><a href="#l19">4.2.2: port - Specifying the listen port</a></dt>
     56 <dt><a href="#l20">4.2.3: bindto - Binding to a specific IP address</a></dt>
     57 <dt><a href="#l21">4.2.4: verbosity - Controlling debug output</a></dt>
     58 <dt><a href="#l22">4.2.5: dispatchmode - How are back ends selected</a></dt>
     59 <dt><a href="#l23">4.2.6: revivinginterval - Back end wakeup calls</a></dt>
     60 <dt><a href="#l24">4.2.7: maxconnections - Limiting concurrent clients at service level</a></dt>
     61 <dt><a href="#l25">4.2.8: backlog - The TCP Back Log size</a></dt>
     62 <dt><a href="#l26">4.2.9: shmkey - Shared Memory Access</a></dt>
     63 <dt><a href="#l27">4.2.10: allow* and deny* - Allowing or denying connections</a></dt>
     64 <dt><a href="#l28">4.2.11: useraccount - Limiting the effective ID of external processes</a></dt>
     65 </dl>
     66 <dt><a href="#l29">4.3: Backend definitions</a></dt>
     67 <dl>
     68 <dt><a href="#l30">4.3.1: server - Specifying the back end address</a></dt>
     69 <dt><a href="#l31">4.3.2: verbosity - Controlling verbosity at the back end level</a></dt>
     70 <dt><a href="#l32">4.3.3: weight - When a back end is more equal than others</a></dt>
     71 <dt><a href="#l33">4.3.4: decay - Levelling out activity of a back end</a></dt>
     72 <dt><a href="#l34">4.3.5: onstart, onend, onfail - Action Hooks</a></dt>
     73 <dt><a href="#l35">4.3.6: trafficlog and throughputlog - Debugging and Performance Aids</a></dt>
     74 <dt><a href="#l36">4.3.7: stickycookie - Back end selection with an HTTP cookie</a></dt>
     75 <dt><a href="#l37">4.3.8: HTTP Header Modification Directives</a></dt>
     76 </dl>
     77 </dl>
     78 <dt><h3><a href="#l38">5: Tips, Tricks and Remarks</a></h3></dt>
     79 <dl>
     80 <dt><a href="#l39">5.1: How back ends are selected in load balancing</a></dt>
     81 <dl>
     82 <dt><a href="#l40">5.1.1: Bysize, byduration or byconnections?</a></dt>
     83 <dt><a href="#l41">5.1.2: Averaging size and duration</a></dt>
     84 <dt><a href="#l42">5.1.3: Specifying decays</a></dt>
     85 <dt><a href="#l43">5.1.4: Adjusting the weights</a></dt>
     86 <dt><a href="#l44">5.1.5: Throttling the number of concurrent connections</a></dt>
     87 </dl>
     88 <dt><a href="#l45">5.2: Using an external program to dispatch</a></dt>
     89 <dl>
     90 <dt><a href="#l46">5.2.1: Configuring the external handler</a></dt>
     91 <dt><a href="#l47">5.2.2: Writing the external handler</a></dt>
     92 <dt><a href="#l48">5.2.3: Examples of external handlers</a></dt>
     93 </dl>
     94 <dt><a href="#l49">5.3: HTTP Session Stickiness</a></dt>
     95 <dl>
     96 <dt><a href="#l50">5.3.1: Don't use stickiness!</a></dt>
     97 <dt><a href="#l51">5.3.2: But if you must..</a></dt>
     98 </dl>
     99 <dt><a href="#l52">5.4: Passing the client's IP address</a></dt>
    100 <dl>
    101 <dt><a href="#l53">5.4.1: Sample Crossroads configuration</a></dt>
    102 <dt><a href="#l54">5.4.2: Sample Apache configuration</a></dt>
    103 </dl>
    104 <dt><a href="#l55">5.5: Debugging network traffic</a></dt>
    105 <dt><a href="#l56">5.6: Limiting Access to Crossroads by Client IP Address</a></dt>
    106 <dl>
    107 <dt><a href="#l57">5.6.1: General Examples</a></dt>
    108 <dt><a href="#l58">5.6.2: Using External Files</a></dt>
    109 <dt><a href="#l59">5.6.3: Mixing Directives</a></dt>
    110 </dl>
    111 <dt><a href="#l60">5.7: Configuration examples</a></dt>
    112 <dl>
    113 <dt><a href="#l61">5.7.1: A load balancer for three webserver back ends</a></dt>
    114 <dt><a href="#l62">5.7.2: An HTTP forwarder when travelling</a></dt>
    115 <dt><a href="#l63">5.7.3: SSH login with enforced idle logout</a></dt>
    116 </dl>
    117 </dl>
    118 <dt><h3><a href="#l64">6: Benchmarking</a></h3></dt>
    119 <dl>
    120 <dt><a href="#l65">6.1: Benchmark 1: Accessing a proxy via crossroads or directly</a></dt>
    121 <dl>
    122 <dt><a href="#l66">6.1.1: Results</a></dt>
    123 <dt><a href="#l67">6.1.2: Discussion</a></dt>
    124 </dl>
    125 <dt><a href="#l68">6.2: Benchmark 2: Crossroads versus Linux Virtual Server (LVS)</a></dt>
    126 <dl>
    127 <dt><a href="#l69">6.2.1: Environment</a></dt>
    128 <dt><a href="#l70">6.2.2: Tests and results</a></dt>
    129 </dl>
    130 </dl>
    131 <dt><h3><a href="#l71">7: Compiling and Installing</a></h3></dt>
    132 <dl>
    133 <dt><a href="#l72">7.1: Prerequisites</a></dt>
    134 <dt><a href="#l73">7.2: Compiling and installing</a></dt>
    135 <dt><a href="#l74">7.3: Configuring crossroads</a></dt>
    136 <dt><a href="#l75">7.4: A boot script</a></dt>
    137 <dl>
    138 <dt><a href="#l76">7.4.1: SysV Style Startup</a></dt>
    139 <dt><a href="#l77">7.4.2: BSD Style Startup</a></dt>
    140 </dl>
    141 
    142 <p><hr><p>
    143 <p>
    144 <a name="l1"></a>
    145 <h2>1: Introduction</h2>
    146 <a name="intro"></a>Crossroads is a daemon that basically accepts TCP connections
    147 at preconfigured ports, and given a list of 'back ends'
    148 distributes each incoming connection to one of the back ends,
    149 so that a client request is
    150 served. Additionally, crossroads maintains an internal
    151 administration of the back end connectivity: if a back end isn't
    152 usable, then the client request is handled using another back
    153 end. Crossroads will then periodically check whether a previously
    154 unusable back end has come back to life. Also, crossroads can select
    155 back ends by estimating the load, so that balancing is achieved.
    156 <p>
    157 Using this approach, crossroads serves as load balancer and fail over
    158 utility. Crossroads will very likely not be as reliable as
    159 hardware based balancers, since it always will require a server to
    160 run on. This server, in turn, may become a new Single Point of
    161 Failure (SPOF).  However, in situations where cost efficiency is an issue,
    162 crossroads may be a good choice. Furthermore, crossroads can be
    163 deployed in situations where a hardware based balancer already
    164 exists and augmenting service reliability is needed. Or, crossroads may be
    165 run off a diskless system, which again improves reliability of the
    166 underlying hardware.
    167 <p>
    168 This document describes how to use crossroads, how to configure it
    169 in order to increase the reliability of your systems, and how to
    170 compile the program from its sources. This document is
    171 also available in <a href="crossroads.pdf">PDF</a> format.
    172 <p>
    173 <a name="l2"></a>
    174 <h3>1.1: Obtaining Crossroads</h3>
    175 <p>
    176 As a quick reference, here are some important URLs for Crossroads:
    177 <p>
    178 <ul>
    179         <li> <a href="http://crossroads.e-tunity.com">http://crossroads.e-tunity.com</a> is the site that serves
    180         Crossroads. You can browse this at leisure
    181         for documentation, sources, and so on.
    182 <p>
    183 <li> <a href="http://freshmeat.net/projects/crossr">http://freshmeat.net/projects/crossr</a> is the
    184         Freshmeat announcement page.
    185 <p>
    186 <li> <a href="svn://svn.e-tunity.com/crossroads">svn://svn.e-tunity.com/crossroads</a> is the SVN
    187         repository; anonymous reading (fetching) is allowed. In order
    188         to commit changes, <a href="mailto:karel@e-tunity.com">mail me</a> for
    189         credentials.</ul>
    190 <p>
    191 <a name="l3"></a>
    192 <h3>1.2: Copyright and Disclaimer</h3>
    193 <p>
    194 Crossroads is distributed as-is, without assumptions of fitness
    195 or usability. You are free to use crossroads to your
    196 liking. It's free, and as with everything that's free: there's
    197 also no warranty.
    198 <p>
    199 You are allowed to make modifications to the source code of
    200 crossroads, and you are allowed to (re)distribute crossroads, as
    201 long as you include this text, all sources, and if applicable: all
    202 your modifications, with each distribution.
    203 <p>
    204 While you are allowed to make any and all changes to the sources,
    205 I would appreciate hearing about them. If the changes concern new
    206 functionality or bugfixes, then I'll include them in a next
    207 release, stating full credits. If you want to seriously contribute (to
    208 which you are heartily encouraged), then mail me and I'll get you
    209 access to the Crossroads SVN repository, so that you can update and
    210 commit as you like.
    211 <p>
    212 <a name="l4"></a>
    213 <h3>1.3: Terminology</h3>
    214 <p>
    215 Throughout this document, the following terms are used: &nbsp;(Many
    216 more meanings of the terms will exist -- yes, I am aware of that. I'm
    217 using the terms here in a very strict sense.)
    218 <p>
    219 <dl>
    220         <p><dt><strong>A client</strong><dd> is a process that initiates a network connection
    221             to contact some service.
    222         <p><dt><strong>A service</strong><dd> or <strong>server process</strong> or <strong>listener</strong>
    223             is a central application
    224             that accepts network connections from clients and services
    225             them.
    226         <p><dt><strong>Back ends</strong><dd> are locations where crossroads looks in
    227             order to service its clients. Crossroads sits 'in between'
    228             and does its tricks. Therefore, as far as the back ends
    229             are concerned, crossroads behaves like a client. As far as
    230             the true client is concerned, crossroads behaves like the
    231             service. The communication is however transparent: neither
    232             client nor back end is aware of the middle position of
    233             crossroads.
    234         <p><dt><strong>A connection</strong><dd> is a network conversation between client and service,
    235             where data are transferred to and fro. As
    236             far as crossroads is concerned, success means that a
    237             connection can be established without errors on
    238             the network level. Crossroads isn't aware of service
    239             peculiarities. E.g., when a webserver answers <code>HTTP/1.0
    240             500 Server Error</code> then crossroads will see this as a
    241             successful connection, though the user behind a browser may
    242             think otherwise.
    243         <p><dt><strong>Back end selection algorithms</strong><dd> are methods by which
    244             crossroads determines which back end it will talk to
    245             next. Crossroads has a number of built-in algorithms,
    246             which may be configured per service.
    247         <p><dt><strong>Back end states</strong><dd> are the statuses of each back end that
    248             is known to crossroads. A back end may be available,
    249             (temporarily) unavailable or truly down. When a back end is
    250             temporarily unavailable, then crossroads will periodically
    251             check whether the back end has come to life yet (that is,
    252             if configured so).
    253         <p><dt><strong>A spike</strong><dd> is a sudden increase in activity, leading to
    254             extra load on a given service. When crossroads is in
    255             effect and when the spike occurs in one connection,
    256             then obviously the spike will also appear at one
    257             of the back ends. However, crossroads will see the spike
    258             and will make sure that a subsequent request goes to
    259             another back end. In contrast, when several connections
    260             arrive simultaneously and cause a spike, then crossroads
    261             will be able to distribute the connections over several
    262             back ends, thereby 'flattening out' the increase.
    263         <p><dt><strong>Load balancing</strong><dd> means that incoming client requests are
    264             distributed over more than just one back end (which wouldn't be the
    265             case if you weren't running crossroads). Enabling load
    266             balancing is nothing more than duplicating services over
    267             more than one back end, and having something (in this
    268             case: crossroads) distribute the requests, so that per
    269             back end the load doesn't get too high.
    270         <p><dt><strong>An HTTP session</strong><dd> is a series of separate network connections
    271             that originate from one browser. E.g., to fill the display
    272             with text and images, the browser hits a website several times.
    273             An HTTP session may even span several
    274             screens. E.g., a website registration dialog may involve 3
    275             screens that when called from the same browser, 
    276             form a logical group of some sort.
    277         <p><dt><strong>Headers</strong><dd> or <strong>header lines</strong> are specific parts of an HTTP
    278             message. Crossroads has directives to add or modify
    279             headers that are part of the request that a browser sends
    280             to the server, or those that are part of the server's response.
    281         <p><dt><strong>Session stickiness</strong><dd> means that when a browser starts an
    282             HTTP dialog, the balancer makes sure that it 'sticks' to
    283             the same back end (i.e., subsequent requests from the
    284             browser are forced to go to the same back end, instead of
    285             being balanced to other ones).
    286         <p><dt><strong>Back end usage</strong><dd> is measured by crossroads in order to be
    287             able to determine back end selection. Crossroads stores
    288             information about the number of active connections, the
    289             transferred bytes and 
    290             about the connection duration. These numbers can be used to
    291             estimate which back end is the least used -- and
    292             therefore, presumably, the best candidate for a new
    293             request.
    294         <p><dt><strong>Fail over</strong><dd> is almost always used when load balancing is in
    295             effect. The distributor of client requests (crossroads of
    296             course) can also monitor back ends, so that in case a back
    297             end is 'down', it is no longer accessed.
    298         <p><dt><strong>Service downtime</strong><dd> normally occurs when a service is
    299             switched off. Downtime is obviously avoided when fail over
    300             is in effect: a back end can be taken out of service in a
    301             controlled manner, without any client noticing it.
    302 </dl>
    303 <p>
    304 <a name="l5"></a>
    305 <h3>1.4: Porting issues for pre-1.21 installations</h3>
    306 <p>
    307 As of version 1.21, the event-hook directives <code>onsuccess</code> and
    308     <code>onfailure</code> no longer exist.
    309 <p>
    310 <ul>
    311         <li> Please replace <code>onsuccess</code> by <code>onstart</code>;
    312         <li> Please replace <code>onfailure</code> by <code>onfail</code>;
    313         <li> Note that there is a new hook <code>onend</code>.</ul>
    314 <p>
    315 The commands that are run via <code>onstart</code>, <code>onend</code> or <code>onfail</code>
    316     are subject to format expansion; e.g., <code>%1w</code> is expanded to the
    317     weight of the first back end, etc. See section <a href="crossroads.html#config">4</a> for details.
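<p>
As a rough sketch of the renaming, a pre-1.21 back end block could be
converted as shown below. The script paths are hypothetical and only
illustrate where a command would go; the fragment also assumes that the
hooks are stated inside a <code>backend</code> block.
<p>
<pre>
backend one {
    server 10.1.1.100:80;
    // Before 1.21: onsuccess /usr/local/bin/note-start.sh;
    onstart /usr/local/bin/note-start.sh;
    // Before 1.21: onfailure /usr/local/bin/note-fail.sh;
    onfail /usr/local/bin/note-fail.sh;
}
</pre>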
    318 <p>
    319 <a name="l6"></a>
    320 <h3>1.5: Porting issues for pre-0.26 installations</h3>
    321 <p>
    322 As of version 0.26 the syntax of the configuration file has
    323     changed. In particular:
    324 <p>
    325 <ul>
    326         <li> The keyword <code>maxconnections</code> is now used instead of
    327              <code>maxclients</code>;
    328         <li> The keyword <code>connectiontimeout</code> is now used instead of
    329              <code>sessiontimeout</code>.</ul>
    330 <p>
    331 Therefore when converting configuration files to the new syntax,
    332     the above keywords must be changed. (The reason for these changes
    333     is that 0.26 introduces <em>sticky HTTP sessions</em> that span
    334     multiple TCP connections, and the term
    335     <em>session</em> is used strictly in that sense -- and no longer for a
    336     TCP connection.)
    337 <p>
    338 <a name="l7"></a>
    339 <h3>1.6: Porting issues for pre-1.08 installations</h3>
    340 <p>
    341 As of version 1.08, the following directives are no longer
    342     supported:
    343 <p>
    344 <ul>
    345         <li> <code>insertstickycookie</code> was replaced by the more generic
    346         directive <code>addclientheader</code>. E.g., instead of <br>
    347             <code>insertstickycookie "XRID=100; Path=/";</code> <br>
    348         the syntax is now <br>
    349             <code>addclientheader "Set-Cookie: XRID=100; Path=/";</code>
    350 <p>
    351 <li> <code>insertrealip</code> was replaced by the more generic
    352         directive <code>setserverheader</code>. E.g., instead of <br>
    353             <code>insertrealip on;</code> <br>
    354         the syntax is now <br>
    355             <code>setserverheader "XR-Real-IP: %r";</code> <br>
    356         This incidentally also makes it possible to change the header
    357         name (here: <code>XR-Real-IP</code>).</ul>
    358 <p>
    359 <a name="l8"></a>
    360 <h2>2: Installation for the impatient</h2>
    361 <a name="impatient"></a>
    362 For the impatient, here's the very-quick-but-very-superficial recipe
    363 for getting crossroads up and running:
    364 <p>
    365 <ul>
    366 <p>
    367 <li> If you don't have SVN or don't want to use it:
    368 <p>
    369 <ul>
    370           <li> Obtain the crossroads source archive at
    371           <a href="http://crossroads.e-tunity.com">http://crossroads.e-tunity.com</a>.
    372 <p>
    373 <li> Change-dir to a 'sources' directory on your system and
    374           unpack the archive.
    375 <p>
    376 <li> Change-dir into the created directory <code>crossroads/</code>.</ul>
    377 <p>
    378 <li> If you have SVN and want to go for the newest snapshot:
    379 <p>
    380 <ul>
    381           <li> Get the latest sources and snapshots using SVN from <br>
    382  	  <code>svn://svn.e-tunity.com/crossroads</code>.
    383 <p>
    384 <li> You'll find the newest alpha version under
    385           <code>crossroads/trunk</code> and the stable versions under
    386           <code>crossroads/tags</code>,
    387           e.g. <code>crossroads/tags/release-1.00</code>.
    388 <p>
    389 <li> Choose which you want to use: the latest stable
    390           release, or the bleeding edge alpha? In the former case,
    391           change-dir to <code>crossroads/tags/release-</code><em>X.YY</em>, where
    392           <em>X.YY</em> is a release ID. In the latter case, change-dir to
    393           <code>crossroads/trunk</code>.</ul>
    394 <p>
    395 <li> Type <code>make install</code>. This installs the crossroads
    396         binary into <code>/usr/local/bin/</code>. If the compilation doesn't
    397         work on your system, check <code>etc/Makefile.def</code> for hints.
    398 <p>
    399 <li> Create a file <code>/etc/crossroads.conf</code>. In it state
    400         something like:
    401 <p>
    402 <pre>
    403 service www {
    404     port 80;
    405     revivinginterval 15;
    406     backend one {
    407         server 10.1.1.100:80;
    408     }
    409     backend two {
    410         server 10.1.1.101:80;
    411     }
    412 }
    413 </pre>
    414 
    415 <p>
    416 That's of course assuming that you want to balance HTTP on
    417         port 80 to two back ends at 10.1.1.100 and 10.1.1.101.
    418 <p>
    419 <li> Type <code>crossroads start</code>.
    420 <p>
    421 <li> Surf to the machine where crossroads is running. You will
    422         see the pages served by the back ends 10.1.1.100 or
    423         10.1.1.101.
    424 <p>
    425 <li> To monitor the status of crossroads, type <code>crossroads
    426         status</code>.
    427 </ul>
    428 <p>
    429 <a name="l9"></a>
    430 <h2>3: Using Crossroads</h2>
    431 <a name="using"></a>Crossroads is started from the command line, and depends heavily on
    432 <code>/etc/crossroads.conf</code> (the default configuration file). It 
    433 supports a number of flags (e.g., to overrule the location of the
    434 configuration file). The actual usage information is always obtained
    435 by typing <code>crossroads</code> without any arguments. Crossroads then
    436 displays the allowed arguments.
    437 <p>
    438 <a name="l10"></a>
    439 <h3>3.1: General Commandline Syntax</h3>
    440 <p>
    441 This section shows the most basic usage. As said above, start
    442 <code>crossroads</code> without arguments to view the full listing of options.
    443 <p>
    444 <ul>
    445         <li> <code>crossroads start</code> and <code>crossroads stop</code> are typical
    446              actions that are run from system startup scripts. The
    447              meaning is self-explanatory.
    448         <li> <code>crossroads restart</code> is a combination of the former
    449              two. Beware that a restart may cause discontinuity in
    450              service; it is just a shorthand for typing the 'stop' and
    451              'start' actions after one another.
    452         <li> <code>crossroads status</code> reports on each running
    453              service. Per service, the state of each back end is
    454              reported.
    455         <li> <code>crossroads tell</code> <em>service backend state</em> is a
    456              command line way of telling crossroads that a given back
    457              end, of a given service, is in a given state. Normally
    458              crossroads maintains state information itself, but by
    459              using <code>crossroads tell</code>, a back end can be e.g. taken
    460              'off line' for servicing.
    461         <li> <code>crossroads configtest</code> tells you whether the
    462              configuration is syntactically correct.
    463         <li> <code>crossroads services</code> reports on the configured 
    464              services. In contrast to <code>crossroads status</code>, this
    465              option only shows what's configured -- not what's up and
    466              running. Therefore, <code>crossroads services</code> doesn't
    467              report on back end states.
    468         <li> <code>crossroads sampleconf</code> shows a sample configuration on
    469              screen. A good way of quickly viewing the configuration
    470              file syntax, or of getting a start for your own
    471              configuration <code>/etc/crossroads.conf</code>.
    472 </ul>	     
    473 <p>
    474 <a name="l11"></a>
    475 <h3>3.2: Logging-related options</h3>
    476 <p>
    477 Two 'flags' of Crossroads are specifically logging-related. This
    478 section elaborates on these flags.
    479 <p>
    480 First, there's flag <code>-a</code>. When present, the start and end of
    481 activity is logged using statements like
    482 <p>
    483 <center><em>YYYY-MM-DD HH/MM/SS starting http from 61.45.32.189 to 10.1.1.1</em></center>
    484 <p>
    485 Similarly, there are 'ending' statements. Using this flag and 
    486 scanning your logs for these statements may be helpful in quickly
    487 determining your system load.
    488 <p>
    489 Second, there's flag <code>-l</code>. This flag selects the 'facility' of
    490 logging and defaults to <code>LOG_DAEMON</code>. You can supply a number
    491 between 0 and 7 to flag <code>-l</code> to select <code>LOG_LOCAL0</code> to
    492 <code>LOG_LOCAL7</code>. This would separate the Crossroads-related logging
    493 from other streams. Here's a very short guide; please read your Unix
    494 manpages of <code>syslogd</code> for more information.
    495 <p>
    496 <ul>
    497     <li> First edit <code>/etc/syslog.conf</code> and add a line:
    498 <p>
    499 <pre>
    500 local7.*   /var/log/crossroads.log
    501 </pre>
    502 
    503 <p>
    504 That instructs <code>syslogd</code> to send <code>LOG_LOCAL7</code> requests to the
    505     logfile <code>/var/log/crossroads.log</code>.
    506 <p>
    507 <li> Next, restart <code>syslogd</code>. On most Unices that's done by
    508     issuing <code>killall -1 syslogd</code>. (As a side-note, I tried this once
    509     on a Bull/AIX system, and the box just shut down. The <code>killall</code>
    510     command killed every process...)
    511 <p>
    512 <li> Now start <code>crossroads</code> with the flag <code>-l7</code>.
    513 <p>
    514 <li> Finally, monitor <code>/var/log/crossroads.log</code> for Crossroads'
    515     messages.</ul>
    516 <p>
    517 <a name="l12"></a>
    518 <h3>3.3: Reloading Configurations</h3>
    519 <p>
    520 Crossroads doesn't support the reloading of a configuration while
    521 running (as other programs, e.g. Apache, do). There are various
    522 technical reasons for this.
    523 <p>
    524 However, external lists of allowed or denied IP addresses can be
    525 reloaded by sending signal 1 (<code>SIGHUP</code>) to Crossroads. See
    526 section <a href="crossroads.html#servicedef">4.2</a> for the details.
    527 <p>
    528 <a name="l13"></a>
    529 <h2>4: The configuration</h2>
    530 <a name="config"></a>The configuration that crossroads uses is normally stored in the file
    531 <code>/etc/crossroads.conf</code>. This location can be overruled using the
    532 command line flag <code>-c</code>.
    533 <p>
    534 This section explains the syntax of the configuration file, and what
    535 all settings do.
    536 <p>
    537 <a name="l14"></a>
    538 <h3>4.1: General language elements</h3>
    539 <p>
    540 This section describes the general elements of the crossroads
    541 configuration language.
    542 <p>
    543 <a name="l15"></a>
    544 <strong>4.1.1: Empty lines and comments</strong>
    545 <p>
    546 Empty lines are of course allowed in the
    547 configuration. Crossroads recognizes three formats of comment:
    548 <p>
    549 <ul>
    550         <li> C-style, between <code>/*</code> and <code>*/</code>,
    551         <li> C++-style, starting with <code>//</code> and ending with the end
    552         of the text line;
    553         <li> Shell-style, starting with <code>#</code> and ending with the end
    554         of the text line.</ul>
    555 <p>
    556 Simply choose your favorite editor and use the comment that 'looks
    557 best'.&nbsp;(I favor C or C++ comments. My favorite editor <em>emacs</em>
    558 can be put in <code>cmode</code> and nicely highlight what's comment and what's
    559 not. And as a bonus it will auto-indent the configuration!)
    560 <p>
    561 <a name="l16"></a>
    562 <strong>4.1.2: Keywords, numbers, identifiers, generic strings</strong>
    563 <p>
    564 In a configuration file, statements are identified by <em>keywords</em>,
    565 such as <code>service</code>, <code>verbosity</code>. These are reserved words.
    566 <p>
    567 Many keywords require an <em>identifier</em> as the argument. E.g., a
    568 service has a unique name, which must start with a letter or
    569 underscore, followed by zero or more letters, underscores, or
    570 digits. Therefore, in the statement <code>service myservice</code>, the keyword is
    571 <code>service</code> and the identifier is <code>myservice</code>.
    572 <p>
    573 Other keywords require a numeric argument. Crossroads knows only
    574 non-negative integer numbers, as in <code>port 8000</code>. Here, <code>port</code> is
    575 the keyword and <code>8000</code> is the number.
    576 <p>
    577 Yet other keywords require 'generic strings', such as hostname
    578 specifications or system commands. Such generic strings contain any
    579 characters (including white space) up to the terminating statement
    580 character <code>;</code>. If a string must contain a semicolon, then it must
    581 be enclosed in single or double quotes:
    582 <p>
    583 <ul>
    584         <li> <code>This is a string;</code> is a string that starts at <code>T</code>
    585         and ends with <code>g</code>
    586         <li> <code>"This is a string";</code> is the same, the double quotes
    587         are not necessary
    588         <li> <code>"This is ; a string";</code> has double quotes to protect
    589         the inner ;</ul>
    590 <p>
    591 Finally, an argument can be a 'boolean' value. Crossroads knows
    592 <code>true</code>, <code>false</code>, <code>yes</code>, <code>no</code>, <code>on</code>, <code>off</code>. The keywords
    593 <code>true</code>, <code>yes</code> and <code>on</code> all mean the same and can be used
    594 interchangeably; as can the keywords <code>false</code>, <code>no</code> and <code>off</code>.
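<p>
As a rough illustration of the argument kinds described above, the
following fragment combines an identifier, a number, a boolean and a
generic string. The names <code>myservice</code> and <code>mybackend</code>
and the host name are placeholders chosen for this sketch, not defaults:
<p>
<pre>
service myservice {                  // keyword 'service', identifier 'myservice'
    port 8000;                       // non-negative integer argument
    verbosity on;                    // boolean; 'true' or 'yes' mean the same
    backend mybackend {              // identifier 'mybackend'
        server web.mydomain.org:80;  // generic string, ends at the ';'
    }
}
</pre>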
    595 <p>
    596 <a name="l17"></a>
    597 <h3>4.2: Service definitions</h3> <a name="servicedef"></a>
    598 <p>
    599 Service definitions are blocks in the configuration file that
    600 describe the setup of each service. A service definition starts with
    601 <code>service</code>, followed by a unique identifier, and by statements in
    602 <code>{</code> and <code>}</code>. For example:
    603 <p>
    604 <pre>
    605 // Definition of service 'www':
    606 service www {
    607     ...
    608     ... // statements that define the
    609     ... // service named 'www'
    610     ...
    611 }
    612 </pre>
    613 
    614 <p>
    615 The configuration file can contain many service blocks, as long as the
    616 identifying names differ. The following list shows possible
    617 statements. Each statement must end with a semicolon, except for the
    618 <code>backend</code> statement, which has its own block (more on this later).
    619 <p>
    620 <a name="conf/type"></a><a name="l18"></a>
    621 <strong>4.2.1: type - Defining the service type</strong> <a name="conftype - Defining the service type"></a>
    622     <dl>
    623         <p><dt><strong>Description:</strong><dd> The <code>type</code> statement defines how crossroads handles the stated
    624      service. There are currently two types: <code>any</code> and
    625      <code>http</code>. The type <code>any</code> means that crossroads doesn't
    626      interpret the contents of a TCP stream, but only distributes streams
    627      over back ends. The type <code>http</code> means that crossroads has to
    628      analyze what's in the messages, does magical HTTP header tricks, and
    629      so on -- all to ensure that multiple connections are treated as one
    630      session, or that the back end is notified of the client's IP
    631      address.
    632 <p>
    633 Unless you really need such special features, use the type <code>any</code> (the
    634      default), even for HTTP protocols.
    635         <p><dt><strong>Syntax:</strong><dd> <code>type</code> <em>specifier</em>, where <em>specifier</em> is <code>any</code> or
    636      <code>http</code>
    637         <p><dt><strong>Default:</strong><dd> <code>any</code>
    638     </dl>
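<p>
A minimal sketch of a service that asks for HTTP handling; the service
name and port are illustrative only. Leaving out the <code>type</code>
statement would give the default, <code>any</code>:
<p>
<pre>
service www {
    type http;    // only needed for HTTP-specific features, e.g. sticky sessions
    port 80;
    ...
}
</pre>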
    639 <p>
    640 <a name="conf/port"></a><a name="l19"></a>
    641 <strong>4.2.2: port - Specifying the listen port</strong> <a name="confport - Specifying the listen port"></a>
    642     <dl>
    643         <p><dt><strong>Description:</strong><dd> The <code>port</code> statement defines to which TCP port a service
    644      'listens'. E.g. <code>port 8000</code> says that this service will accept
    645      connections on port 8000.
    646         <p><dt><strong>Syntax:</strong><dd> <code>port</code> <em>number</em>
    647         <p><dt><strong>Default:</strong><dd> There is no default. This is a required setting.
    648     </dl>
    649 <p>
    650 <a name="conf/bindto"></a><a name="l20"></a>
    651 <strong>4.2.3: bindto - Binding to a specific IP address</strong> <a name="confbindto - Binding to a specific IP address"></a>
    652     <dl>
    653         <p><dt><strong>Description:</strong><dd> The <code>bindto</code> statement is used in situations where crossroads
    654      should only listen to the stated port at a given IP address. E.g.,
    655      <code>bindto 127.0.0.1</code> causes crossroads to 'bind' the service only to
    656      the local IP address. Network connections from other hosts won't be
    657      serviced. By default, crossroads binds a service to all presently
    658      active IP addresses at the invoking host.
    659         <p><dt><strong>Syntax:</strong><dd> <code>bindto</code> <em>address</em>, where <em>address</em> is a numeric IP
    660      address, such as 127.0.0.1, or the keyword <code>any</code>.
    661         <p><dt><strong>Default:</strong><dd> <code>any</code>
    662     </dl>
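<p>
The two statements <code>port</code> and <code>bindto</code> are
typically used together. The sketch below, with a made-up service name,
would accept connections on port 8000 of the loopback address only:
<p>
<pre>
service local {
    port 8000;           // required: the TCP port to listen on
    bindto 127.0.0.1;    // optional: restrict listening to this address
    ...
}
</pre>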
    663 <p>
    664 <a name="conf/verbose"></a><a name="l21"></a>
    665 <strong>4.2.4: verbosity - Controlling debug output</strong> <a name="confverbosity - Controlling debug output"></a>
    666     <dl>
    667         <p><dt><strong>Description:</strong><dd> Verbosity statements come in two forms: <code>verbosity on</code> or
    668      <code>verbosity off</code>. When 'on', log messages to <code>/var/log/messages</code>
    669      are generated that show what's going on.&nbsp;(Actually, the
    670      messages go to <code>syslog(3)</code>, using facility <code>LOG_DAEMON</code> and
    671      priority <code>LOG_INFO</code>. In most (Linux) cases this will mean: output to
    672      <code>/var/log/messages</code>. On Mac OSX the messages go to
    673      <code>/var/log/system.log</code>.) The keyword <code>verbose</code> is an alias for
    674      <code>verbosity</code>.
    675         <p><dt><strong>Syntax:</strong><dd> <code>verbosity</code> <em>setting</em> or <code>verbose</code> <em>setting</em>, where
    676      <em>setting</em> is <code>true</code>, <code>yes</code> or <code>on</code> to turn 
    677      verbosity on; or <code>false</code>, <code>no</code>, <code>off</code> to turn it off.
    678         <p><dt><strong>Default:</strong><dd> <code>off</code>
    679     </dl>
    680 <p>
    681 <a name="conf/dispatchmode"></a><a name="l22"></a>
    682 <strong>4.2.5: dispatchmode - How are back ends selected</strong> <a name="confdispatchmode - How are back ends selected"></a>
    683     <dl>
    684         <p><dt><strong>Description:</strong><dd> The dispatch mode controls how crossroads selects a back end from
    685      a list of active back ends. The text below shows the bare
    686      syntax. See section <a href="crossroads.html#howselected">5.1</a> for a textual explanation.
    687 <p>
    688 The settings can be:
    689 <p>
    690 <ul>
    691         <li> <code>dispatchmode roundrobin</code>: Simply the 'next in line' is
    692         chosen. E.g., when 3 back ends are active, then the usage
    693         series is 1, 2, 3, 1, 2, 3, and so on.
    694 <p>
    695 Roundrobin dispatching is the default method, when no
    696         <code>dispatchmode</code> statement occurs.
    697 <p>
    698 <li> <code>dispatchmode random</code>: Random selection. Probably only
    699         for stress testing, though when used with weights (see below)
    700         it is a good distributor of new connections too.
    701 <p>
    702 <li> <code>dispatchmode bysize [ over</code> <em>connections</em> <code>]</code>:
    703         The next back end is the one
    704         that has transferred the least number of bytes. This
    705         selection mechanism assumes that the more bytes, the heavier
    706         the load.
    707 <p>
    708 The modifier <code>over</code> <em>connections</em> is optional. (The square
    709         brackets shown above are not part of the statement but
    710         indicate optionality.)  When given,
    711         the load is computed as an average of the last stated number of
    712         connections. When this modifier is absent, then the load is
    713         computed over all connections since startup.
    714 <p>
    715 <li> <code>dispatchmode byduration [ over</code> <em>connections</em> <code>]</code>:
    716         The next back end is the one
    717         that served connections for the shortest time. This mechanism
    718         assumes that the longer the connection, the heavier the load.
    719 <p>
    720 <li> <code>dispatchmode byconnections</code>: The next back end is the one
    721         with the least active connections. This mechanism assumes that
    722         each connection to a back end represents load. It is usable
    723         for e.g. database connections.
    724 <p>
    725 <li> <code>dispatchmode byorder</code>: The first back end is selected
    726         every time, unless it's unavailable. In that case the second
    727         is taken, and so on.
    728 <p>
    729 <li> <code>dispatchmode externalhandler</code> <em>program arguments</em>:
    730         This is a special mode, where an external program is delegated
    731         the responsibility to say which back end should be used
    732         next. In this case, Crossroads will call the external program,
    733         and this will of course be slower than one of the 'built-in'
    734         dispatch modes. However, this is the ultimate escape when
    735         custom-made dispatch modes are needed.
    736 <p>
    737 The dispatch mode that uses an <code>externalhandler</code> is
    738         discussed separately in section <a href="crossroads.html#externalhandler">5.2</a>.</ul>
    739 <p>
    740 The selection algorithm is only used when clients are serviced that
    741      aren't part of a sticky HTTP session. This is the case during:
    742 <p>
    743 <ul>
    744         <li> all client requests of a service type <code>any</code>;
    745         <li> new sessions of a service type <code>http</code>.</ul>
    746 <p>
    747 When type <code>http</code> is in effect and a session is underway, then the
    748      previously used back end is always selected -- regardless of
    749      dispatching mode.
    750 <p>
    751 Your 'right' dispatch mode will depend on the type of service. Given
    752      the fact that crossroads doesn't know (and doesn't care) how to
    753      estimate load from a network traffic stream, you have to choose an
    754      appropriate dispatch mode to optimize load balancing. In most cases,
    755      <code>roundrobin</code> or <code>byconnections</code> will do the job just fine.
    756         <p><dt><strong>Syntax:</strong><dd> <code>dispatchmode</code> <em>mode</em> (see above for the modes), optionally
    757      followed by <code>over</code> <em>number</em>, or when the <em>mode</em> is
    758      <code>externalhandler</code>, followed by <em>program</em>.
    759         <p><dt><strong>Default:</strong><dd> <code>roundrobin</code>
    760     </dl>
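<p>
To show how the modes listed above look inside service blocks, here is a
small sketch with two services. The service names, ports and the number
10 are arbitrary choices for illustration:
<p>
<pre>
service www {
    port 80;
    dispatchmode byconnections;     // fewest active connections wins
    ...
}

service files {
    port 8021;
    dispatchmode bysize over 10;    // load = average bytes of the last 10 connections
    ...
}
</pre>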
    761 <p>
    762 <a name="conf/revivinginterval"></a><a name="l23"></a>
    763 <strong>4.2.6: revivinginterval - Back end wakeup calls</strong> <a name="confrevivinginterval - Back end wakeup calls"></a>
    764     <dl>
    765         <p><dt><strong>Description:</strong><dd> A reviving interval definition is needed when crossroads
    766      determines that a back end is temporarily unavailable. This will
    767      happen when: 
    768 <p>
    769 <ul>
    770         <li> The back end cannot be reached (network connection
    771              fails);
    772         <li> The network connection to the back end suddenly dies.</ul>
    773 <p>
    774 An example of the definition is <code>revivinginterval 10</code>. When this
    775      reviving interval is given, crossroads will check every 10 seconds
    776      whether unavailable back ends have woken up yet. A back end is
    777      considered awake when a network connection to that back end can
    778      successfully be established.
    779         <p><dt><strong>Syntax:</strong><dd> <code>revivinginterval</code> <em>number</em>, where the number is the interval
    780      in seconds.
    781         <p><dt><strong>Default:</strong><dd> 0 (no wakeup calls)
    782     </dl>
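<p>
In context, the statement might look as follows (service name and port
are placeholders):
<p>
<pre>
service www {
    port 80;
    revivinginterval 10;    // retry unavailable back ends every 10 seconds
    ...
}
</pre>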
    783 <p>
    784 <a name="conf/maxconnections"></a><a name="l24"></a>
    785 <strong>4.2.7: maxconnections - Limiting concurrent clients at service level</strong> <a name="confmaxconnections - Limiting concurrent clients at service level"></a>
    786     <dl>
    787         <p><dt><strong>Description:</strong><dd> The maximum number of connections is specified using
    788      <code>maxconnections</code>. There is one argument: the number of concurrent
    789      established connections that may be active within one service.
    790 <p>
    791 'Throttling' the number of connections is a way of preventing Denial of
    792      Service (DoS) attacks. Without a limit, numerous network connections
    793      may spawn so many server instances, that the service ultimately breaks
    794      down and becomes unavailable.
    795         <p><dt><strong>Syntax:</strong><dd> <code>maxconnections</code> <em>number</em>, where the number specifies the
    796      maximum of concurrent connections to the service.
    797         <p><dt><strong>Default:</strong><dd> 0, meaning that all connections will be accepted.
    798     </dl>
    799 <p>
    800 <a name="conf/backlog"></a><a name="l25"></a>
    801 <strong>4.2.8: backlog - The TCP Back Log size</strong> <a name="confbacklog - The TCP Back Log size"></a>
    802     <dl>
    803         <p><dt><strong>Description:</strong><dd> The TCP back log size is a number that controls how many
    804      'waiting' network connections may be queued, before a client simply
    805      cannot connect. The syntax is e.g. <code>backlog 5</code> to cause crossroads
    806      to have 5 waiting connections for 1 active connection.
    807      The backlog queue shouldn't be too
    808      high, or clients will experience timeouts before they can actually
    809      connect. The queue shouldn't be too small either, because clients
    810      would be simply rejected. Your mileage may vary.
    811         <p><dt><strong>Syntax:</strong><dd> <code>backlog</code> <em>number</em>
    812         <p><dt><strong>Default:</strong><dd> 0, which takes the operating system's default
    813      value for socket back log size.
    814     </dl>
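<p>
A hedged example that combines <code>maxconnections</code> and
<code>backlog</code>. The numbers are arbitrary and would need tuning for
a real installation:
<p>
<pre>
service www {
    port 80;
    maxconnections 200;    // at most 200 concurrent established connections
    backlog 5;             // queue size for waiting connections
    ...
}
</pre>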
    815 <p>
    816 <a name="conf/shmkey"></a><a name="l26"></a>
    817 <strong>4.2.9: shmkey - Shared Memory Access</strong> <a name="confshmkey - Shared Memory Access"></a>
    818     <dl>
    819         <p><dt><strong>Description:</strong><dd> Different Crossroads
    820      invocations must 'know' of each other's activity. E.g., <code>crossroads
    821      status</code> must be able to get to the actual state information of all
    822      running services. This is internally implemented through shared
    823      memory, which is reserved using a key.
    824 <p>
    825 Normally crossroads will supply a shared memory key, based on the
    826      service port and bitwise or-ed with a magic number. In situations
    827      where this conflicts with existing keys (of other programs, having
    828      their own keys), you may supply a chosen value.
    829 <p>
    830 The actual key value doesn't matter much, as long as it's unique
    831      and as long as each invocation of crossroads uses it.
    832         <p><dt><strong>Syntax:</strong><dd> <code>shmkey</code> <em>number</em>
    833         <p><dt><strong>Default:</strong><dd> 0, which means that crossroads will 'guess' its
    834      own key, based on TCP port and a magic number.
    835     </dl>
    836 <p>
    837 <a name="conf/allow"></a><a name="l27"></a>
    838 <strong>4.2.10: allow* and deny* - Allowing or denying connections</strong> <a name="confallow* and deny* - Allowing or denying connections"></a>
    839     <dl>
    840         <p><dt><strong>Description:</strong><dd> Crossroads can allow or deny
    841      connections based on the IP address of a client. There are four
    842      directives that are relevant: <code>allowfrom</code>, <code>allowfile</code>,
    843      <code>denyfrom</code> and <code>denyfile</code>. When using <code>allowfrom</code> and
    844      <code>denyfrom</code> then the IP addresses to allow or deny connections are
    845      stated in <code>/etc/crossroads.conf</code>.
    846 <p>
    847 When <code>allow*</code> directives are used, then all connections are denied
    848      unless they match the stated allowed IP's. When <code>deny*</code> directives
    849      are used, then all connections are allowed unless they match the
    850      stated disallowed IP's. When both denying and allowing are used,
    851      then Crossroads checks the deny list first.
    852 <p>
    853 The statements <code>allowfrom</code> and <code>denyfrom</code> are followed by a
    854      list of filter specifications. The statements <code>allowfile</code> and
    855      <code>denyfile</code> are followed by a filename; Crossroads will read
    856      filter specifications from those external files. In both cases,
    857      Crossroads obtains filter specifications and places them in its
    858      lists of allowed or denied IP addresses. The difference between
    859      specifying filters in <code>/etc/crossroads.conf</code> or in external
    860      files, is that Crossroads will reload the external files when it
    861      receives signal 1 (<code>SIGHUP</code>), as in <code>killall -1 crossroads</code>.
    862 <p>
    863 Each filter specification must obey the following syntax: it
    864      consists of up to 
    865      four numbers ranging from 0 to 255 and separated by
    866      dots. Optionally a slash follows, with a bitmask which is also a
    867      decimal number.
    868 <p>
    869 This is probably best explained by a few examples:
    870 <p>
    871 <ul>
    872         <li> <code>allowfrom 10/8;</code> will allow connections from
    873         <code>10.*.*.*</code> (a full Class A network). The mask <code>/8</code> means
    874         that the first 8 bits of the number (i.e., only the <code>10</code>) are
    875         significant. On the last 3 positions of the IP address, all
    876         numbers are allowed. Given this directive, client connections
    877         from e.g. 10.1.1.1 and 10.2.3.4 will be allowed.
    878 <p>
    879 <li> <code>allowfrom 10.3/16;</code> will allow all IP addresses that
    880         start with <code>10.3</code>.
    881 <p>
    882 <li> <code>allowfrom 10.3.1/16;</code> is the same as above. The third
    883         byte of the IP address is superfluous because the netmask
    884         specifies that only the first 16 bits (2 numbers) are taken
    885         into account.
    886 <p>
    887 <li> <code>allowfrom 10.3.1.15;</code> allows traffic from only the
    888         specified IP address. There is no bitmask; all four numbers
    889         are relevant.
    890 <p>
    891 <li> <code>allowfrom 10.3.1.15 10.2/16;</code> allows traffic from one
    892         IP address <code>10.3.1.15</code> or from a complete Class B network
    893         <code>10.2.*.*</code> 
    894 <p>
    895 <li> <code>allowfile /tmp/myfile.txt;</code> in combination with a file
    896         <code>/tmp/myfile.txt</code>, with the contents <code>10.3.1.15 10.2/16</code>,
    897         is the same as above.</ul>
    898         <p><dt><strong>Syntax:</strong><dd> <ul>
    899 	<li> <code>allowfrom</code> <em>filter-specification(s)</em>
    900 	<li> <code>denyfrom</code>  <em>filter-specification(s)</em>
    901 	<li> <code>allowfile</code>  <em>filename</em>
    902 	<li> <code>denyfile</code>  <em>filename</em></ul>
    903         <p><dt><strong>Default:</strong><dd> In absence of these statements, all client IP's are accepted.
    904     </dl>
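<p>
The following sketch combines a deny and an allow directive in one
service; the service name and addresses are examples only. Going by the
rules above, the deny list is checked first, so 10.3.1.15 would be
refused, other clients from 10.3.*.* and from 127.0.0.1 would be
accepted, and all remaining clients would be denied because an
<code>allow*</code> directive is present:
<p>
<pre>
service intranet {
    port 8000;
    denyfrom 10.3.1.15;             // checked first: this single host is refused
    allowfrom 10.3/16 127.0.0.1;    // all others must match one of these filters
    ...
}
</pre>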
    905 <p>
    906 <a name="conf/useraccount"></a><a name="l28"></a>
    907 <strong>4.2.11: useraccount - Limiting the effective ID of external processes</strong> <a name="confuseraccount - Limiting the effective ID of external processes"></a>
    908     <dl>
    909         <p><dt><strong>Description:</strong><dd> Using the directive <code>useraccount</code>, the effective user and group
    910      ID can be restricted. This comes into effect when Crossroads runs
    911      external commands, such as:
    912      <ul>
    913 	<li> Hooks for <code>onstart</code>, <code>onend</code> or <code>onfail</code>;
    914 	<li> External dispatchers, when <code>dispatchmode
    915 	     externalhandler</code> is in effect.</ul>
    916      Once a user name for external commands is specified, Crossroads
    917      assumes the associated user ID and group ID before running those
    918      commands.
    919         <p><dt><strong>Syntax:</strong><dd> <code>useraccount</code> <em>username</em>
    920         <p><dt><strong>Default:</strong><dd> None; when unspecified, external commands are run with the 
    921      ID that was in effect when Crossroads was started.
    922     </dl>
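<p>
A short sketch; the account name <code>nobody</code> is merely a
placeholder for an unprivileged user that exists on the system:
<p>
<pre>
service www {
    port 80;
    useraccount nobody;    // hooks and external dispatchers run as this user
    ...
}
</pre>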
    923 <p>
    924 <a name="l29"></a>
    925 <h3>4.3: Backend definitions</h3>
    926 <p>
    927 Inside the service definitions described in the previous
    928 section, <em>backend definitions</em> must also occur. Backend definitions
    929 are started by the keyword <code>backend</code>, followed by an identifier
    930 (the back end name), and statements inside <code>{</code> and <code>}</code>:
    931 <p>
    932 <pre>
    933 service myservice {
    934     ...
    935     ... // statements that define the
    936     ... // service named 'myservice'
    937     ...
    938 
    939     backend mybackend {
    940         ...
    941         ... // statements that define the
    942         ... // backend named 'mybackend'
    943         ...
    944     }
    945 }
    946 </pre>
    947 
    948 <p>
    949 Each service definition must have at least one backend
    950 definition. There may be more (and probably will, if you want
    951 balancing and fail over) as long as the backend names differ.
    952 The statements in the backend definition blocks are described in the
    953 following sections.
    954 <p>
    955 Some directives (<code>stickycookie</code> etc.) only have effect when
    956 Crossroads treats the network traffic as a stream of HTTP messages;
    957 i.e., when the service is declared with <code>type http</code>. In case of
    958 <code>type any</code>, the HTTP-specific directives have no effect.
    959 <p>
    960 <a name="conf/server.yo"></a><a name="l30"></a>
    961 <strong>4.3.1: server - Specifying the back end address</strong> <a name="confserver - Specifying the back end address"></a>
    962     <dl>
    963         <p><dt><strong>Description:</strong><dd> Each back end must be identified by the network name
    964      (server name) where it is located. For example: <code>server
    965      10.1.1.23</code>, or <code>server web.mydomain.org</code>. A TCP port specifier
    966      can follow the server name, as in <code>server web.mydomain.org:80</code>.
    967         <p><dt><strong>Syntax:</strong><dd> <ul>
    968 	<li> <code>server</code> <em>servername</em>, where <em>servername</em> is a
    969  	network name or IP address;
    970 	<li> <code>server</code> <em>servername:port</em></ul>
    971         <p><dt><strong>Default:</strong><dd> There is no default. This is a required setting.
    972     </dl>
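<p>
As an illustration, a service with two back ends could be written as
follows. The back end names are arbitrary; the addresses are the ones
used in the description above:
<p>
<pre>
service www {
    port 80;
    backend one {
        server 10.1.1.23:80;          // IP address with port
    }
    backend two {
        server web.mydomain.org:80;   // network name with port
    }
}
</pre>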
    973 <p>
    974 <a name="conf/verbose-backend.yo"></a><a name="l31"></a>
    975 <strong>4.3.2: verbosity - Controlling verbosity at the back end level</strong> <a name="confverbosity - Controlling verbosity at the back end level"></a>
    976     <dl>
    977         <p><dt><strong>Description:</strong><dd> Similar to <code>service</code> specifications, a
    978      <code>backend</code> can have its own verbosity (<code>on</code> or <code>off</code>). When
    979      <code>on</code>, traffic to and from this back end is reported.
    980         <p><dt><strong>Syntax:</strong><dd> <ul>
    981         <li> <code>verbosity</code> <em>setting</em>, or
    982 	<li> <code>verbose</code> <em>setting</em>, where <em>setting</em> is <code>true</code>,
    983 	<code>yes</code> or <code>on</code> to turn verbosity on, or <code>false</code>, <code>no</code>, <code>off</code> to turn it
    984 	off.</ul>
    985         <p><dt><strong>Default:</strong><dd> <code>off</code>
    986     </dl>
    987 <p>
    988 <a name="conf/weight"></a><a name="l32"></a>
    989 <strong>4.3.3: weight - When a back end is more equal than others</strong> <a name="confweight - When a back end is more equal than others"></a>
    990     <dl>
    991         <p><dt><strong>Description:</strong><dd> To influence how backends are selected, a backend can specify its
    992     'weight' in the process. The higher the weight, the less likely a
    993     back end will be chosen. The default is 1.
    994 <p>
    995 The weighting mechanism only applies to the dispatch modes
    996      <code>random</code>, <code>byconnections</code>, <code>bysize</code> and <code>byduration</code>. 
    997      The weight is in fact a penalty factor. E.g., if backend A has
    998      <code>weight 2</code> and backend B has <code>weight 1</code>, then backend B will
    999      be selected all the time, until its usage parameter is twice as
   1000      large as the parameter of A. Think of it as a 'sluggishness'
   1001      statement.
   1002         <p><dt><strong>Syntax:</strong><dd> <code>weight</code> <em>number</em>; the higher the number, the more 'sluggish'
   1003      a back end is
   1004         <p><dt><strong>Default:</strong><dd> 1; all back ends have equal weight.
   1005     </dl>
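<p>
A sketch of weights in use; back end names and addresses are invented
for the example. If the penalty is applied as a plain multiplication
(an assumption made here for illustration), then <code>slow</code> with
3 active connections would count as 6, while <code>fast</code> with 5
active connections would count as 5, so <code>fast</code> would still be
chosen:
<p>
<pre>
service www {
    port 80;
    dispatchmode byconnections;
    backend fast {
        server 10.1.1.1:80;
        weight 1;              // the default
    }
    backend slow {
        server 10.1.1.2:80;
        weight 2;              // penalty factor: twice as 'sluggish'
    }
}
</pre>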
   1006 <p>
   1007 <a name="conf/decay"></a><a name="l33"></a>
   1008 <strong>4.3.4: decay - Levelling out activity of a back end</strong> <a name="confdecay - Levelling out activity of a back end"></a>
   1009     <dl>
   1010         <p><dt><strong>Description:</strong><dd> To make sure that a 'spike' of activity doesn't
   1011      influence the perceived load of a back end forever, you may
   1012      specify a certain decay. E.g, the statement <code>decay 10</code> makes
   1013      sure that the load that crossroads computes for this back end (be
   1014      it in seconds or in bytes) is decreased by 10% each time that
   1015      <strong>another</strong> back end is hit. Decays are not applied to the count
   1016      of concurrent connections.
   1017 <p>
   1018 This means that when a given back end is hit, then its usage data
   1019      of the transferred bytes and the connection duration are updated
   1020      using the actual number of bytes and actual duration. However,
   1021      when a different back end is hit, then the usage data are
   1022      decreased by the specified decay. 
   1023         <p><dt><strong>Syntax:</strong><dd> <code>decay</code> <em>number</em>, where <em>number</em> is a percentage that
   1024      decreases the back end usage data when other back ends are
   1025      hit.
   1026         <p><dt><strong>Default:</strong><dd> 0, meaning that no decay is applied to usage statistics.
   1027     </dl>
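<p>
The effect can be sketched with a small configuration; names and
addresses are placeholders. Assuming that the percentage is taken from
the current value each time, a recorded usage of 1000 bytes for back end
<code>one</code> would drop to 900 after one hit on back end
<code>two</code>, and to 810 after a second one:
<p>
<pre>
service www {
    port 80;
    dispatchmode bysize;
    backend one {
        server 10.1.1.1:80;
        decay 10;          // usage shrinks by 10% when another back end is hit
    }
    backend two {
        server 10.1.1.2:80;
        decay 10;
    }
}
</pre>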
   1028 <p>
   1029 <a name="conf/onhooks"></a><a name="l34"></a>
   1030 <strong>4.3.5: onstart, onend, onfail - Action Hooks</strong> <a name="confonstart, onend, onfail - Action Hooks"></a>
   1031     <dl>
   1032         <p><dt><strong>Description:</strong><dd> The three directives <code>onstart</code>, <code>onend</code> and <code>onfail</code> can be
   1033      specified to start system commands (external programs) when a
   1034      connection to a back end starts, fails or ends:
   1035      <ul>
   1036 	<li> <code>onstart</code> commands will be run when Crossroads
   1037 	successfully connects to a back end, and starts servicing;
   1038 	<li> <code>onend</code> commands will be run when a (previously
   1039 	established) connection stops;
   1040 	<li> <code>onfail</code> commands will be run when Crossroads tries to
   1041 	contact a back end to serve a client, but the back end can't
   1042 	be reached.</ul>
   1043 <p>
   1044 The format is always <code>on</code><em>type</em> <em>command</em>. The <em>command</em>
   1045      is an external program, optionally followed by arguments. The
   1046      command is expanded according to the following table:
   1047 <p>
   1048 <ul>
   1049         <li> <code>%a</code> is the availability of the current back end, when
   1050         a current back end is established;
   1051         <li> <code>%1a</code> is the availability of the first back end (0 when
   1052         unavailable, 1 if available); <code>%2a</code> is the availability of
   1053         the second back end, and so on;
   1054         <li> <code>%b</code> is the name of the current back end, when one is
   1055         established; 
   1056         <li> <code>%1b</code> is the name of the first back end, <code>%2b</code> of the
   1057         second back end, and so on;
   1058         <li> <code>%e</code> is the count of seconds since start of epoch
   1059         (January 1st 1970 GMT);
   1060         <li> <code>%r</code> is the IP address of the client that requests a
   1061         connection and for whom the external dispatcher should compute
   1062         a back end;
   1063         <li> <code>%s</code> is the name of the current service that the client
   1064         connected to;
   1065         <li> <code>%t</code> is the current local time in ASCII format, in
   1066         <em>YYYY-MM-DD/hh:mm:ss</em>; 
   1067         <li> <code>%T</code> is the current GMT time in ASCII format;
   1068         <li> <code>%v</code> is the Crossroads version;
   1069         <li> Any other character following a <code>%</code> sign is taken
   1070         literally; e.g. <code>%z</code> is just a z.</ul>
   1071 <p>
   1072 <p><dt><strong>Syntax:</strong><dd> <ul>
   1073         <li> <code>onstart</code> <em>commandline</em>
   1074 	<li> <code>onend</code> <em>commandline</em>
   1075 	<li> <code>onfail</code> <em>commandline</em>
   1076 	<li> <code>onsuccess</code> <em>commandline</em></ul>
   1077         <p><dt><strong>Default:</strong><dd> There is no default. Normally no external programs are run upon
   1078      connection, success or failure of a back end.
   1079     </dl>
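<p>
A hedged sketch of a hook. The program
<code>/usr/local/bin/notify-admin</code> is hypothetical and only stands
for any external command. Following the expansion table above, a failed
attempt to serve client 10.2.3.4 would presumably run the command as
<code>/usr/local/bin/notify-admin www one 10.2.3.4</code>:
<p>
<pre>
service www {
    port 80;
    backend one {
        server 10.1.1.1:80;
        // %s = service name, %b = back end name, %r = client IP address
        onfail /usr/local/bin/notify-admin %s %b %r;
    }
}
</pre>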
   1080 <p>
   1081 <a name="conf/trafficlog"></a><a name="l35"></a>
   1082 <strong>4.3.6: trafficlog and throughputlog - Debugging and Performance Aids</strong> <a name="conftrafficlog and throughputlog - Debugging and Performance Aids"></a>
   1083     <dl>
   1084         <p><dt><strong>Description:</strong><dd> Two directives are available
   1085      to log network traffic to files. They are <code>trafficlog</code> and
   1086      <code>throughputlog</code>.
   1087 <p>
   1088 The <code>trafficlog</code> statement causes all traffic to be logged in
   1089      hexadecimal format. Each line is prefixed by <code>B</code> or <code>C</code>,
   1090      depending on whether the information was received from the back
   1091      end or from the client.
   1092 <p>
   1093 The <code>throughputlog</code> statement writes shorthand transmissions to
   1094      its log, accompanied by timings.
   1095         <p><dt><strong>Syntax:</strong><dd> <ul>
   1096 	<li> <code>trafficlog</code> <em>filename</em>
   1097         <li> <code>throughputlog</code> <em>filename</em></ul>
   1098         <p><dt><strong>Default:</strong><dd> none
   1099     </dl>
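<p>
Both logs can be switched on per back end, for instance as sketched
below. The file names are placeholders, and such logs are probably best
kept for debugging sessions only:
<p>
<pre>
service www {
    port 80;
    backend one {
        server 10.1.1.1:80;
        trafficlog /tmp/backend-one.traffic.log;          // hex dump, lines prefixed B or C
        throughputlog /tmp/backend-one.throughput.log;    // shorthand transmissions with timings
    }
}
</pre>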
   1100 <p>
   1101 <a name="conf/stickycookie"></a><a name="l36"></a>
   1102 <strong>4.3.7: stickycookie - Back end selection with an HTTP cookie</strong> <a name="confstickycookie - Back end selection with an HTTP cookie"></a>
   1103     <dl>
   1104         <p><dt><strong>Description:</strong><dd> The directive <code>stickycookie</code> <em>value</em>
   1105      causes Crossroads to unpack clients' requests, to check for
   1106      <em>value</em> in the cookies. When found, the message is routed to the
   1107      back end having the appropriate <code>stickycookie</code> directive.
   1108 <p>
   1109 E.g., consider the following configuration:
   1110 <p>
   1111 <pre>
   1112 service ... {
   1113     ...
   1114     backend one {
   1115         ...
   1116         stickycookie "BalancerID=first";
   1117     }
   1118     backend two {
   1119         ...
   1120         stickycookie "BalancerID=second";
   1121     }
   1122 }
   1123 </pre>
   1124 
   1125 <p>
   1126 When clients' messages contain cookies named <code>BalancerID</code> with
   1127      the value <code>first</code>, then such messages are routed to backend
   1128      <code>one</code>. When the value is <code>second</code> then they are routed to the
   1129      backend <code>two</code>.
   1130 <p>
   1131 There are basically two ways to provide such cookies to a browser. First, a
   1132      back end can insert such a cookie into the HTTP response. E.g.,
   1133      the webserver of back end <code>one</code> might insert a cookie named
   1134      <code>BalancerID</code>, having value <code>first</code>.
   1135      Second, Crossroads can insert such cookies using a carefully
   1136      crafted directive <code>addclientheader</code>.
   1137         <p><dt><strong>Syntax:</strong><dd> <code>stickycookie</code> <em>cookievalue</em>
   1138         <p><dt><strong>Default:</strong><dd> There is no default.
   1139     </dl>
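<p>
To illustrate the matching, a request such as the following would be
routed to backend <code>one</code> under the configuration shown above. The
request path and the host name are made up for this illustration; only the
<code>Cookie</code> header matters here.
<p>
<pre>
GET /index.html HTTP/1.1
Host: www.example.com
Cookie: BalancerID=first
</pre>

<p>
A request that carries no such cookie is balanced as usual.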
   1140 <p>
   1141 <a name="conf/addclientheader"></a><a name="l37"></a>
   1142 <strong>4.3.8: HTTP Header Modification Directives</strong> <a name="confHTTP Header Modification Directives"></a>
   1143     <dl>
   1144         <p><dt><strong>Description:</strong><dd> Crossroads understands the following
   1145      header modification directives: <code>addclientheader</code>,
   1146      <code>appendclientheader</code>, <code>setclientheader</code>, <code>addserverheader</code>,
   1147      <code>appendserverheader</code>, <code>setserverheader</code>.
   1148 <p>
   1149 The directive names always consist of
   1150      <em>Action</em><em>Destination</em><code>header</code>, where:
   1151 <p>
   1152 <ul>
   1153         <li> The action is <code>add</code>, <code>append</code> or <code>set</code>.
   1154 <p>
   1155 <ul>
   1156             <li> Action <code>add</code> adds a header, even when headers with
   1157             the same name already are present in an HTTP
   1158             message. Adding headers is useful for e.g. <code>Set-Cookie</code>
   1159             headers; a message may contain several of such headers.
   1160 <p>
   1161 <li> Action <code>append</code> adds a header if it isn't present
   1162             yet in an HTTP message. If such a header is already
   1163             present, then the value is appended to the pre-existing
   1164             header. This is useful for e.g. <code>Via</code> headers. Imagine
   1165             an HTTP message with a header <code>Via: someproxy</code>. Then the
   1166             directive <code>appendclientheader "Via: crossroads"</code> will
   1167             rewrite the header to <code>Via: someproxy; crossroads</code>.
   1168 <p>
   1169 <li> Action <code>set</code> overwrites headers with the same
   1170             name; or adds a new header if no pre-existing is found.
   1171             This is useful for e.g. <code>Host</code> headers.</ul>
   1172 <p>
   1173 <li> The destination is one of <code>client</code> or <code>server</code>. When
   1174         the destination is <code>server</code>, then Crossroads will apply such
   1175         directives to HTTP messages that originate from the browser
   1176         and are being forwarded to back ends. When the destination is
   1177         <code>client</code>, then Crossroads will apply such directives to
   1178         backend responses that are shuttled to the browser.</ul>
   1179 <p>
   1180 The format of the directives is e.g. <code>addclientheader
   1181      "X-Processed-By: Crossroads"</code>. The directives expect one
   1182      argument; a string, consisting of a header name, a colon, and a
   1183      header value. As usual, the directive must end with a semicolon.
   1184 <p>
   1185 The header value may contain one of the following formatting
   1186      directives:
   1187 <p>
   1188 <ul>
   1189         <li> <code>%a</code> is the availability of the current back end, when
   1190         a current back end is established;
   1191         <li> <code>%1a</code> is the availability of the first back end (0 when
   1192         unavailable, 1 if available); <code>%2a</code> is the availability of
   1193         the second back end, and so on;
   1194         <li> <code>%b</code> is the name of the current back end, when one is
   1195         established; 
   1196         <li> <code>%1b</code> is the name of the first back end, <code>%2b</code> of the
   1197         second back end, and so on;
   1198         <li> <code>%e</code> is the count of seconds since start of epoch
   1199         (January 1st 1970 GMT);
   1200         <li> <code>%r</code> is the IP address of the client that requests a
   1201         connection and for whom the external dispatcher should compute
   1202         a back end;
   1203         <li> <code>%s</code> is the name of the current service that the client
   1204         connected to;
   1205         <li> <code>%t</code> is the current local time in ASCII format, in
   1206         <em>YYYY-MM-DD/hh:mm:ss</em>;
   1207         <li> <code>%T</code> is the current GMT time in ASCII format;
   1208         <li> <code>%v</code> is the Crossroads version;
   1209         <li> Any other character following a <code>%</code> sign is taken
   1210         literally; e.g. <code>%z</code> is just a z.</ul>
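<p>
As an illustration of the formatting directives, the following fragment
tags each response with the back end that produced it. The header name
<code>X-Served-By</code> is an arbitrary choice for this sketch.
<p>
<pre>
service ... {
    ...
    backend one {
        ...
        /* %b: back end name, %s: service name, %v: Crossroads version */
        setclientheader "X-Served-By: %b (service %s, Crossroads %v)";
    }
}
</pre>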
   1211 <p>
   1212 The following examples show common uses of header modifications.
   1213 <p>
   1214 <dl>
   1215         <p><dt><strong>Enforcing session stickiness:</strong><dd> By combining
   1216         <code>stickycookie</code> and <code>addclientheader</code>, HTTP session
   1217         stickiness is enforced. Consider the following configuration:
   1218 <p>
   1219 <pre>
   1220 service ... {
   1221     ...
   1222     backend one {
   1223         ...
   1224         addclientheader "Set-Cookie: BalancerID=first; path=/";
   1225         stickycookie "BalancerID=first";
   1226     }
   1227     backend two {
   1228         ...
   1229         addclientheader "Set-Cookie: BalancerID=second; path=/";
   1230         stickycookie "BalancerID=second";
   1231     }
   1232 }
   1233 </pre>
   1234 
   1235 <p>
   1236 The first request of an HTTP session is balanced to either
   1237         backend <code>one</code> or <code>two</code>. The server response is enriched
   1238         using <code>addclientheader</code> with an appropriate cookie. A
   1239         subsequent request from the same browser now has that cookie
   1240         in place, and is therefore sent to the same back end where
   1241         its predecessors went.
   1242 <p>
   1243 <p><dt><strong>Hiding the server software version:</strong><dd> Many servers
   1244         (e.g. Apache) advertise their version, as in <code>Server: Apache
   1245         1.27</code>. This potentially provides information to attackers. The
   1246         following configuration hides such information:
   1247 <p>
   1248 <pre>
   1249 service ... {
   1250     ...
   1251     backend one {
   1252         ...
   1253         setclientheader "Server: WWW-Server";
   1254     }
   1255 }
   1256 </pre>
   1257 
   1258 <p>
   1259 <p><dt><strong>Informing the server of the client's IP address:</strong><dd> Since
   1260         Crossroads sits 'in the middle' between a client and a back
   1261         end, the back end perceives Crossroads as its client. The
   1262         following sends the true client's IP address to the server, in
   1263         a header <code>X-Real-IP</code>:
   1264 <p>
   1265 <pre>
   1266 service ... {
   1267     ...
   1268     backend one {
   1269         ...
   1270         setserverheader "X-Real-IP: %r";
   1271     }
   1272 }
   1273 </pre>
   1274 
   1275 <p>
   1276 <p><dt><strong>Keep-Alive Downgrading:</strong><dd> The directives
   1277         <code>setclientheader</code> and <code>setserverheader</code> also play a key
   1278         role in downgrading Keep-Alive connections to
   1279         'single-shot'. E.g., the following configuration makes sure
   1280         that no Keep-Alive connections occur.
   1281 <p>
   1282 <pre>
   1283 service ... {
   1284     ...
   1285     backend one {
   1286         ...
   1287         setserverheader "Connection: close";
   1288         setclientheader "Connection: close";
   1289     }
   1290 }
   1291 </pre>
   1292 </dl>
   1293         <p><dt><strong>Syntax:</strong><dd> <ul>
   1294 	<li> <code>addclientheader</code> <em>Headername: headervalue</em> to add a
   1295 	header in the traffic towards the client, even when another
   1296 	header <em>Headername</em> exists;
   1297 	<li> <code>appendclientheader</code> <em>Headername: headervalue</em> to
   1298 	append <em>headervalue</em> to an existing header <em>Headername</em>
   1299 	in the traffic towards the client,
   1300 	or to add the whole header altogether;
   1301 	<li> <code>setclientheader</code> <em>Headername: headervalue</em> to
   1302 	overwrite an existing header in the traffic towards the
   1303 	client, or to add such a header;
   1304 	<li> <code>addserverheader</code> <em>Headername: headervalue</em> to add a
   1305 	header in the traffic towards the server, even when another
   1306 	header <em>Headername</em> exists;
   1307 	<li> <code>appendserverheader</code> <em>Headername: headervalue</em> to
   1308 	append <em>headervalue</em> to an existing header <em>Headername</em>
   1309 	in the traffic towards the server,
   1310 	or to add the whole header altogether;
   1311 	<li> <code>setserverheader</code> <em>Headername: headervalue</em> to
   1312 	overwrite an existing header in the traffic towards the
   1313 	server, or to add such a header.</ul>
   1314         <p><dt><strong>Default:</strong><dd> There is no default.
   1315     </dl>
   1316 <p>
   1317 <a name="l38"></a>
   1318 <h2>5: Tips, Tricks and Remarks</h2>
   1319 <a name="tips"></a>The following sections elaborate on the directives as described in
   1320 section <a href="crossroads.html#config">4</a> to illustrate how crossroads works and to help you
   1321 achieve the "optimal" balancing configuration.
   1322 <p>
   1323 <a name="l39"></a>
   1324 <h3>5.1: How back ends are selected in load balancing</h3><a name="howselected"></a>
   1325 <p>
   1326 In order to tune your load balancing, you'll need to understand how
   1327 crossroads computes usage, how weighting works, and so on. In this
   1328 section we'll focus on the dispatching modes <code>bysize</code>, <code>byduration</code>
   1329 and <code>byconnections</code> only. The other dispatching types are
   1330 self-explanatory. 
   1331 <p>
   1332 <a name="l40"></a>
   1333 <strong>5.1.1: Bysize, byduration or byconnections?</strong>
   1334 <p>
   1335 As stated before, crossroads doesn't know 'what a service does' and
   1336 how to judge whether a given back end is very busy or not. You
   1337 must therefore give the right hints:
   1338 <p>
   1339 <ul>
   1340         <li> In general, a service which is CPU bound, will be more
   1341         busy when it takes longer to process a request. The dispatch
   1342         mode <code>byduration</code> is appropriate here.
   1343 <p>
   1344 <li> In contrast, a service which is filesystem bound, will be
   1345         more busy when more data are transferred. The dispatch mode
   1346         <code>bysize</code> is appropriate.
   1347 <p>
   1348 <li> The dispatch mode <code>byduration</code> can also be used when
   1349         network latency is an issue. E.g., if your balancer has back
   1350         ends that are geographically distributed, then <code>byduration</code>
   1351         would be a good way to select best available back ends.
   1352 <p>
   1353 <li> Furthermore it is noteworthy that <code>dispatchmode
   1354         byduration</code> is not usable for interactive processes such as
   1355         SSH logins. Idle time of a
   1356         login adds to the duration, while causing (almost) no
   1357         load. Mode <code>byduration</code> should only be used for automated
   1358         processes that don't wait for user interaction (e.g., SOAP
   1359         calls and other HTTP requests).
   1360 <p>
   1361 <li> As a last remark, the dispatching mode <code>byconnections</code> can
   1362         be used if you don't have other clues for load
   1363         estimations.
   1364 <p>
   1365 E.g., consider a database connection. What's
   1366         heavier on the back end, time-consuming connections, or connections
   1367         where loads of bytes are transferred? Well, that depends. A
   1368         tough <code>select</code> query that joins multiple tables can be very
   1369         heavy on the back end, even though the response set, and
   1370         hence the number of transferred bytes, can be quite
   1371         small. That would suggest
   1372         dispatching by duration. However, <code>byduration</code>
   1373         balancing doesn't reflect reality when interactive
   1374         connections occur, where users hold an idle TCP connection to
   1375         the database:
   1376         this consumes time, but no bytes (see the SSH login example
   1377         above). In this case, the dispatch mode <code>byconnections</code> may be
   1378         your best bet.
   1379 <p>
   1380 </ul> 
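<p>
The hints above translate into <code>dispatchmode</code> statements along
the following lines. The service names and port numbers are only
examples, and the back end definitions are left out.
<p>
<pre>
/* Automated, CPU bound requests: balance on processing time */
service soap {
    port 8080;
    dispatchmode byduration;
    ...
}

/* Bulk transfers, filesystem bound: balance on transferred bytes */
service download {
    port 8081;
    dispatchmode bysize;
    ...
}

/* Interactive logins: idle time distorts durations, so count connections */
service ssh {
    port 2222;
    dispatchmode byconnections;
    ...
}
</pre>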
   1381 <p>
   1382 <a name="l41"></a>
   1383 <strong>5.1.2: Averaging size and duration</strong>
   1384 <p>
   1385 The configuration statement <code>dispatchmode bysize</code> or <code>byduration</code>
   1386 allows an optional modifier <code>over</code> <em>number</em>, where the stated
   1387 number represents a connection count. When this modifier is present, then
   1388 crossroads will use a moving average over the last <em>n</em> connections to
   1389 compute duration and size figures.
   1390 <p>
   1391 In the real world you'll always want this modifier. E.g., consider two
   1392 back ends that are running for years now, and one of them is suddenly
   1393 overloaded and very busy (it experiences a 'spike' in activity).
   1394 When the <code>over</code> modifier is absent, then
   1395 the sudden load will hardly show up in the usage figures -- it will
   1396 flatten out due to the large usage figures already stored in the years
   1397 of service.
   1398 <p>
   1399 In contrast, when e.g. <code>over 3</code> is in effect, then a sudden load
   1400 does show up -- because it highly contributes to the average of three
   1401 connections.
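<p>
A small worked example makes the difference concrete. The numbers are
invented, and the plain arithmetic mean is used only to illustrate the
effect; it is not meant as a description of Crossroads' internal
bookkeeping. Suppose a back end has served 1000 connections of 1 second
each, and the next connection takes 10 seconds:
<p>
<pre>
without 'over':  (1000 * 1 + 10) / 1001  =  1.009 sec  (spike barely visible)
with 'over 3':   (1 + 1 + 10) / 3        =  4.000 sec  (spike clearly visible)
</pre>

<p>
The modifier is simply appended to the dispatch mode, as in
<code>dispatchmode byduration over 3;</code>.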
   1402 <p>
   1403 <a name="l42"></a>
   1404 <strong>5.1.3: Specifying decays</strong>
   1405 <p>
   1406 Decays are also only relevant when crossroads computes the 'next best
   1407 back end' by size (bytes) or duration (seconds). E.g., imagine two
   1408 back ends A and B, both averaged over say 3 connections.
   1409 <p>
   1410 Now when back end A is suddenly hit by a spike,
   1411 its average would go up accordingly. But the back end would never
   1412 again be used, unless B also received a similar spike, because A's
   1413 'usage data' over its last three connections would forever be larger than
   1414 B's data. 
   1415 <p>
   1416 For that reason, you should in real situations probably always
   1417 specify a decay, so that the backend selection algorithm recovers from
   1418 spikes. Note that the usage data of the back end where a decay is
   1419 specified, decay when <strong>other</strong> back ends are hit. The decay parameter
   1420 is like specifying how fast your body regenerates when someone else
   1421 does the work.
   1422 <p>
   1423 The below configuration illustrates this:
   1424 <p>
   1425 <pre>
   1426 /* Definition of the service */
   1427 service soap {
   1428     /* Local TCP port */
   1429     port 8080;
   1430 
   1431     /* We'll select back ends by the processing
   1432      * duration
   1433      */
   1434     dispatchmode byduration over 3;
   1435 
   1436     /* First back end: */
   1437     backend A {
   1438         /* Back end IP address and port */
   1439         server 10.1.1.1:8080;
   1440 
   1441         /* When this back end is NOT hit because
   1442          * the other one was less busy, then the
   1443          * usage parameters decay 10% per connection
   1444          */
   1445         decay 10;
   1446     }
   1447 
   1448     /* Second back end: */
   1449     backend B {
   1450         server 10.1.1.2:8080;
   1451         decay 10;
   1452     }
   1453 }
   1454 </pre>
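<p>
To get a feeling for the numbers, assume that back end A's usage figure
stands at 100 after a spike, and that each following connection goes to
back end B. If the 10% decay is applied to the remaining figure each
time (an assumption made for this illustration), then A's figure develops
as follows:
<p>
<pre>
after the spike:           100.0
1st connection to B:        90.0
2nd connection to B:        81.0
3rd connection to B:        72.9
4th connection to B:        65.6
</pre>

<p>
Back end A comes back into consideration as soon as its decayed figure
falls below B's.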
   1455 
   1456 <p>
   1457 <a name="l43"></a>
   1458 <strong>5.1.4: Adjusting the weights</strong>
   1459 <p>
   1460 The back end modifier <code>weight</code> is useful in situations where your
   1461 back ends differ with respect to performance. E.g., your back ends may
   1462 be geographically distributed, and you know that a given back end is
   1463 difficult to reach and often experiences network lag.
   1464 <p>
   1465 Or you may have
   1466 one primary back end, a system with a fast CPU and enough memory, and a
   1467 small fall-back back end, with a slow CPU and short on memory. In that
   1468 case you know in advance that the second back end should be used only
   1469 rarely. Most requests should go to the big server, up to a certain load.
   1470 <p>
   1471 In such cases you will know in advance that the best performing back ends
   1472 should be selected the most often. Here's where the <code>weight</code>
   1473 statement comes in: you can simply increase the weight of the back
   1474 ends with the least performance, so that they are selected less
   1475 frequently.
   1476 <p>
   1477 E.g., consider the following configuration:
   1478 <p>
   1479 <pre>
   1480 service soap {
   1481     port 8080;
   1482     dispatchmode byduration over 3;
   1483     backend A {
   1484         server 10.1.1.1:8080;
   1485         decay 20;
   1486     }
   1487     backend B {
   1488         server 10.1.1.2:8080;
   1489         weight 2;
   1490         decay 10;
   1491     }
   1492     backend C {
   1493         server 10.1.1.3:8080;
   1494         weight 4;
   1495         decay 5;
   1496     }
   1497 }
   1498 </pre>
   1499 
   1500 <p>
   1501 This will cause crossroads to select back ends by the processing time,
   1502 averaging over the last three connections. However, backend B will kick
   1503 in only when its usage is half of the usage of A (back end B is
   1504 probably only half as fast as A). Backend C will kick in only when its
   1505 usage is a quarter of the usage of A, which is half of the usage of B
   1506 (back end C is probably very weak, and just a fall-back system in case
   1507 both A and B crash). Note also that A's usage data decay much faster
   1508 than B's and C's: we're assuming that this big server recovers quicker
   1509 than its smaller siblings.
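<p>
One way to read this is that Crossroads compares the usage of each back
end multiplied by its weight, and picks the lowest result. Under that
reading, with a weight of 1 assumed for back end A and with invented
average durations, the selection would work out as follows:
<p>
<pre>
back end   average duration   weight   weighted usage
A          4.0 sec            1        4.0
B          1.5 sec            2        3.0   (selected: below half of A's usage)
C          1.2 sec            4        4.8   (not selected: above a quarter of A's usage)
</pre>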
   1510 <p>
   1511 <a name="l44"></a>
   1512 <strong>5.1.5: Throttling the number of concurrent connections</strong>
   1513 <p>
   1514 If you suspect that your service may occasionally receive 'spikes' of
   1515 activity&nbsp;(which you should always assume), then it might be a
   1516 good idea to protect your service by specifying a maximum number of
   1517 concurrent connections. This protection can be specified on two levels:
   1518 <p>
   1519 <dl>
   1520         <p><dt><strong>On the service level</strong><dd> a statement like <code>maxconnections
   1521             100;</code> states that the service as a whole will never
   1522             service more than 100 concurrent connections. This means that
   1523             all your back ends and the crossroads balancer itself
   1524             will be protected from being overloaded.
   1525         <p><dt><strong>On the back end level</strong><dd> a statement like <code>maxconnections 10;</code>
   1526             states that this particular back end will never have more
   1527             than 10 concurrent connections; regardless of the overall
   1528             setting on the service level. This means that this
   1529             particular back end will be protected from being
   1530             overloaded (regardless of what other back ends may
   1531             experience).</dl>
   1532 <p>
   1533 The <code>maxconnections</code> statement, combined with a back end selection
   1534 algorithm, allows very fine granularity. The <code>maxconnections</code> statement
   1535 on the back end level is like a hand brake: even when you specify a
   1536 back end algorithm that would protect a given back end from being used
   1537 too much, a situation may occur where that back end is about to be
   1538 hit. A <code>maxconnections</code> statement on the level of that back end may then
   1539 protect it.
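<p>
A sketch that combines both levels is shown below. The limits are the ones
used as examples above; the service name, back end names and addresses
are placeholders.
<p>
<pre>
service soap {
    port 8080;
    dispatchmode byduration over 3;

    /* The service as a whole: at most 100 concurrent connections */
    maxconnections 100;

    backend A {
        server 10.1.1.1:8080;
        decay 10;
    }
    backend B {
        server 10.1.1.2:8080;
        decay 10;
        /* The hand brake: never more than 10 concurrent connections here */
        maxconnections 10;
    }
}
</pre>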
   1540 <p>
   1541 <a name="l45"></a>
   1542 <h3>5.2: Using an external program to dispatch</h3>
   1543 <a name="externalhandler"></a>
   1544 <p>
   1545 As mentioned before, Crossroads supports several built-in dispatch
   1546 modes. However, you are always free to hook-in your own dispatch mode
   1547 that determines the next back end using your own specific
   1548 algorithm. This section explains how to do it.
   1549 <p>
   1550 <a name="l46"></a>
   1551 <strong>5.2.1: Configuring the external handler</strong>
   1552 <p>
   1553 First, the <code>dispatchmode</code> statement needs to inform Crossroads that
   1554 an external program will do the job. The syntax is: <code>dispatchmode
   1555 externalhandler</code> <em>program arguments</em>. The <em>program</em> must point to
   1556 an executable program that will be started by Crossroads. The
   1557 specifier <em>arguments</em> can be anything you want; those will be the
   1558 arguments passed to the program. You can however use the following special
   1559 format specifiers:
   1560 <p>
   1561 <ul>
   1562         <li> <code>%a</code> is the availability of the current back end, when
   1563         a current back end is established;
   1564         <li> <code>%1a</code> is the availability of the first back end (0 when
   1565         unavailable, 1 if available); <code>%2a</code> is the availability of
   1566         the second back end, and so on;
   1567         <li> <code>%b</code> is the name of the current back end, when one is
   1568         established; 
   1569         <li> <code>%1b</code> is the name of the first back end, <code>%2b</code> of the
   1570         second back end, and so on;
   1571         <li> <code>%e</code> is the count of seconds since start of epoch
   1572         (January 1st 1970 GMT);
   1573         <li> <code>%r</code> is the IP address of the client that requests a
   1574         connection and for whom the external dispatcher should compute
   1575         a back end;
   1576         <li> <code>%s</code> is the name of the current service that the client
   1577         connected to;
   1578         <li> <code>%t</code> is the current local time in ASCII format, in
   1579         <em>YYYY-MM-DD/hh:mm:ss</em>;
   1580         <li> <code>%T</code> is the current GMT time in ASCII format;
   1581         <li> <code>%v</code> is the Crossroads version;
   1582         <li> Any other character following a <code>%</code> sign is taken
   1583         literally; e.g. <code>%z</code> is just a z.</ul>
   1584 <p>
   1585 Note that the format specifiers such as <code>%b</code> don't make sense in the
   1586 phase in which an external handler is called, since there is no
   1587 current back end yet (the job of the handler is to supply one).
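<p>
For example, the following statement passes the client's IP address and
the service name, followed by the name and availability of two back
ends. The program path <code>/usr/local/bin/my-dispatcher</code> is
hypothetical.
<p>
<pre>
dispatchmode externalhandler
    /usr/local/bin/my-dispatcher %r %s %1b %1a %2b %2a;
</pre>

<p>
For a client at 10.0.0.5 that connects to a service <code>web</code> with
two available back ends <code>one</code> and <code>two</code>, the program
would then be started with the arguments
<code>10.0.0.5 web one 1 two 1</code>.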
   1588 <p>
   1589 <a name="l47"></a>
   1590 <strong>5.2.2: Writing the external handler</strong>
   1591 <p>
   1592 The external handler is activated using the arguments that are
   1593 specified in <code>/etc/crossroads.conf</code>. The external handler can do
   1594 whatever it wants, but ultimately, it must write a back end name on
   1595 its <em>stdout</em>. Crossroads reads this, and if the back end is
   1596 available, uses that back end for the connection.
   1597 <p>
   1598 <a name="l48"></a>
   1599 <strong>5.2.3: Examples of external handlers</strong>
   1600 <p>
   1601 This section shows some examples of Crossroads configurations
   1602 vs. external handlers. The sample handlers that are shown here, are
   1603 also included in the Crossroads distribution, under the directory
   1604 <code>etc/</code>. Also note that the examples shown here are just
   1605 quick-and-dirty Perl scripts, meant to illustrate only. Your
   1606 applications may need other external handlers, but you can use the
   1607 shown scripts as a starting point.
   1608 <p>
   1609 <p><strong>Round-robin dispatching</strong><br>
   1610 <p>
   1611 This example is trivial in the sense that round-robin dispatching is
   1612 already built into Crossroads, so
   1613 that using an external handler for this purpose only slows down
   1614 Crossroads. However, it's a good starting example.
   1615 <p>
   1616 The Crossroads configuration is shown below:
   1617 <p>
   1618 <pre>
   1619 service test {
   1620     port 8001;
   1621     verbosity on;
   1622     revivinginterval 5;
   1623     
   1624     dispatchmode externalhandler
   1625         /usr/local/src/crossroads/etc/dispatcher-roundrobin
   1626             %1b %1a %2b %2a;
   1627 
   1628     backend testone {
   1629         server localhost:3128;
   1630         verbosity on;
   1631     }
   1632     backend testtwo {
   1633         server localhost:3128;
   1634         verbosity on;
   1635     }
   1636 }
   1637 </pre>
   1638 
   1639 <p>
   1640 The relevant <code>dispatchmode</code> statement invokes the external program
   1641 <code>dispatcher-roundrobin</code> with four arguments: the name of the first
   1642 back end (<code>testone</code>), its availability (0 or 1), the name of the
   1643 second back end (<code>testtwo</code>) and its availability (0 or 1).
   1644 <p>
   1645 The external handler, which is also included in the Crossroads
   1646 distribution, is shown below. It is a Perl script.
   1647 <p>
   1648 <pre>
   1649 #!/usr/bin/perl
   1650 
   1651 use strict;
   1652 
   1653 # Example of a round-robin external dispatcher. This is totally
   1654 # superfluous, Crossroads has this on-board; if you use the external
   1655 # program for determining round-robin dispatching, then you'll only
   1656 # slow things down. This script is just meant as an example.
   1657 
   1658 # Globals / configuration
   1659 # -----------------------
   1660 my $log = '/tmp/exthandler.log';    # Debug log, set to /dev/null to suppress
   1661 my $statefile = '/tmp/rr.last';	    # Where we keep the last used
   1662 
   1663 # Logging
   1664 # -------
   1665 sub msg {
   1666     return if ($log eq '/dev/null' or $log eq '');
   1667     open (my $of, "&gt;&gt;$log") or return;
   1668     print $of (scalar(localtime()), ' ', @_);
   1669 }
   1670 
   1671 # Read the last used back end
   1672 # ---------------------------
   1673 sub readlast() {
   1674     my $ret;
   1675     
   1676     if (open (my $if, $statefile)) {
   1677         $ret = &lt;$if&gt;;
   1678         chomp ($ret);
   1679         close ($if);
   1680         msg ("Last used back end: $ret\n");
   1681         return ($ret);
   1682     }
   1683     msg ("No last-used back end (yet)\n");
   1684     return (undef);    
   1685 }
   1686 
   1687 # Write back the last used back end, reply to Crossroads and stop
   1688 # ---------------------------------------------------------------
   1689 sub reply ($) {
   1690     my $last = shift;
   1691 
   1692     if (open (my $of, "&gt;$statefile")) {
   1693         print $of ("$last\n");
   1694     }
   1695     print ("$last\n");
   1696     exit (0);
   1697 }
   1698 
   1699 # Main starts here
   1700 # ----------------
   1701 
   1702 # Collect the cmdline arguments. We expect pairs of backend-name /
   1703 # backend-availability, and we'll store only the available ones.
   1704 msg ("Dispatch request received\n");
   1705 my @backend;
   1706 for (my $i = 0; $i &lt;= $#ARGV; $i += 2) {
   1707     push (@backend,  $ARGV[$i]) if ($ARGV[$i + 1]);
   1708 }
   1709 msg ("Available back ends: @backend\n");
   1710 
   1711 # Let's see what the last one is. If none found, then we return the
   1712 # first available back end. Otherwise we need to go thru the list of
   1713 # back ends, and return the next one in line.
   1714 my $last = readlast();
   1715 if ($last eq '') {
   1716     msg ("Returning first available back end $backend[0]\n");
   1717     reply ($backend[0]);
   1718 }
   1719 
   1720 # There **was** a last back end.  Try to match it in the list,
   1721 # then return the next-in-line.
   1722 for (my $i = 0; $i &lt; $#backend; $i++) {
   1723     if ($last eq $backend[$i]) {
   1724         msg ("Returning next back end ", $backend[$i + 1], "\n");
   1725         reply ($backend[$i + 1]);
   1726     }
   1727 }
   1728 
   1729 # No luck.. run back to the first one.
   1730 msg ("Returning first back end $backend[0]\n");
   1731 reply ($backend[0]);
   1732 </pre>
   1733 
   1734 <p>
   1735 The working of the script is basically as follows:
   1736 <p>
   1737 <ul>
   1738         <li> The argument list is scanned. Back ends that are
   1739         available are collected in an array <code>@backend</code>.
   1740 <p>
   1741 <li> The script queries a state file <code>/tmp/rr.last</code>. If a
   1742         back end name occurs there, then the next back end is looked
   1743         up in <code>@backend</code> and returned to Crossroads. If the last back
   1744         end is unknown or can't be matched, then the first available back
   1745         end (first element of <code>@backend</code>) is returned to Crossroads.
   1746 <p>
   1747 <li> Informing Crossroads is done via the subroutine
   1748         <code>reply()</code>. This code writes the selected back end to file
   1749         <code>/tmp/rr.last</code> (for future usage) and prints the back end
   1750         name to <em>stdout</em>.
   1751 <p>
   1752 <li> The script logs its actions to a file
   1753         <code>/tmp/exthandler.log</code>. This log file can be inspected for
   1754         the script's actions.</ul>
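<p>
Since the handler is an ordinary program, it can also be tried by hand,
without Crossroads. The session below is a sketch that assumes that the
script is executable and is started from the directory <code>etc/</code> of
the distribution, that the state file has been removed first, and that
<code>/tmp</code> is writable:
<p>
<pre>
$ rm -f /tmp/rr.last
$ ./dispatcher-roundrobin testone 1 testtwo 1
testone
$ ./dispatcher-roundrobin testone 1 testtwo 1
testtwo
$ ./dispatcher-roundrobin testone 1 testtwo 1
testone
$ ./dispatcher-roundrobin testone 0 testtwo 1
testtwo
</pre>

<p>
In the last call back end <code>testone</code> is flagged as unavailable, so
that only <code>testtwo</code> remains as a candidate.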
   1755 <p>
   1756 <p><strong>Dispatching by the client IP address</strong><br>
   1757 <p>
   1758 The following example shows a useful real-life situation. The
   1759 situation is as follows:
   1760 <p>
   1761 <ul>
   1762         <li> Crossroads is used as a single-address point to forward
   1763         Remote Desktop requests to a farm of Windows systems, where
   1764         users can work via remote access;
   1765 <p>
   1766 <li> However, users may stop their session, and when they
   1767         re-connect, they expect to be sent to the Windows system that
   1768         they had worked on previously;
   1769 <p>
   1770 <li> Client PCs have distinct IP addresses, which
   1771         distinguish them.
   1772 <p>
   1773 <li> Of the four Windows systems, two are large servers, and two
   1774         are small ones. We'll want to assign large servers to clients
   1775         when we have a choice.</ul>
   1776 <p>
   1777 The requirements resemble session stickiness in HTTP, except that the remote
   1778 desktop protocol doesn't support stickiness. This situation is a
   1779 perfect example of how an external handler can help:
   1780 <p>
   1781 <ul>
   1782         <li> A suitable dispatch mode isn't yet available in
   1783         Crossroads, but can be easily coded in an external handler;
   1784 <p>
   1785 <li> The potential delay due to the calling of an external
   1786         handler won't even be noticed. This is a network service where
   1787         the connection time isn't critical; we'd expect only a few
   1788         (albeit lengthy) TCP connections.</ul>
   1789 <p>
   1790 The approach to the solution of this problem uses several external
   1791 program hooks:
   1792 <p>
   1793 <ul>
   1794         <li> An external dispatcher handler will be responsible for
   1795         suggesting a back end, given a client IP and given the current
   1796         timestamp. This handler will consult an internal
   1797         administration to see whether the stated IP address should
   1798         re-use a back end, or to determine which back end is free for usage.
   1799         <li> An external hook <code>onstart</code> will be responsible for
   1800         updating the internal administration; i.e., to flag a back end
   1801         as 'occupied'.
   1802         <li> The external hooks <code>onfail</code> and <code>onend</code> will be
   1803         responsible for flagging a back end as 'free' again; i.e., for
   1804         erasing any previous information that states that the back end
   1805         was occupied.</ul>
   1806 <p>
   1807 The Crossroads configuration is shown below. Only four Windows back
   1808 ends are shown. Each back end is configured on a
   1809 given IP address, port 3389, and is limited to one concurrent connection
   1810 (otherwise a new user might 'steal' a running desktop session).
   1811 <p>
   1812 <pre>
   1813 service rdp {
   1814     port 3389;
   1815     revivinginterval 5;
   1816 
   1817     /* rdp-helper dispatch IP STAMP ...  will suggest a back end to use,
   1818      * arguments are for all back ends: name, availability, weight */
   1819     dispatchmode externalhandler
   1820         /usr/local/src/crossroads/etc/rdp-helper dispatch %r %e
   1821             %1b %1a %1w
   1822             %2b %2a %2w
   1823             %3b %3a %3w
   1824             %4b %4a %4w;
   1825             
   1826     backend win1 {
   1827         server 10.1.1.1:3389;
   1828         maxconnections 1;
   1829         /* rdp-helper start IP STAMP BACKEND will log the actual start
   1830          * of a connection;
   1831          * rdp-helper end IP will log the ending of a connection */
   1832         onstart /usr/local/src/crossroads/etc/rdp-helper start %r %e %b;
   1833         onend   /usr/local/src/crossroads/etc/rdp-helper end %r;
   1834         onfail  /usr/local/src/crossroads/etc/rdp-helper end %r;
   1835     }
   1836     backend win2 {
   1837         server 10.1.1.2:3389;
   1838         maxconnections 1;
   1839         onstart /usr/local/src/crossroads/etc/rdp-helper start %r %e %b;
   1840         onend   /usr/local/src/crossroads/etc/rdp-helper end %r;
   1841         onfail  /usr/local/src/crossroads/etc/rdp-helper end %r;
   1842     }
   1843     backend win3 {
   1844         server 10.1.1.3:3389;
   1845         maxconnections 1;
   1846         weight 2;
   1847         onstart /usr/local/src/crossroads/etc/rdp-helper start %r %e %b;
   1848         onend   /usr/local/src/crossroads/etc/rdp-helper end %r;
   1849         onfail  /usr/local/src/crossroads/etc/rdp-helper end %r;
   1850     }
   1851     backend win4 {
   1852         server 10.1.1.4:3389;
   1853         maxconnections 1;
   1854         weight 3;
   1855         onstart /usr/local/src/crossroads/etc/rdp-helper start %r %e %b;
   1856         onend   /usr/local/src/crossroads/etc/rdp-helper end %r;
   1857         onfail  /usr/local/src/crossroads/etc/rdp-helper end %r;
   1858     }
   1859 }
   1860 </pre>
   1861 
   1862 <p>
   1863 Depending on the dispatcher stage, the external handler <code>rdp-helper</code>
   1864 is invoked in different ways:
   1865 <p>
   1866 <dl>
   1867         <p><dt><strong>During dispatching</strong><dd> the helper is called to suggest a back
   1868         end. The arguments are an action indicator <code>dispatch</code>, the
   1869         client's IP address, the timestamp, and four triplets that
   1870         represent back ends: per back end its name, its availability,
   1871         and its weight. The purpose of the helper is to tell
   1872         Crossroads which back end to use.
   1873 <p>
   1874 <p><dt><strong>During connection start</strong><dd> the helper will be invoked to
   1875         inform it of the start of a connection, given a client IP
   1876         address.
   1877 <p>
   1878 <p><dt><strong>When a connection terminates</strong><dd> the helper will be invoked
   1879         to inform it that the connection has ended.</dl>
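<p>
As an illustration, the three invocations might look as follows once
Crossroads has expanded the format specifiers. The client IP address,
the timestamp and the availability and weight values are made up for
this sketch; it is also an assumption that the timestamp is expressed
in seconds and that back ends without an explicit weight report a
weight of 1. The full path to <code>rdp-helper</code> is left out for brevity.
<p>
<pre>
rdp-helper dispatch 10.2.2.15 1163000000 win1 1 1 win2 0 1 win3 1 2 win4 1 3
rdp-helper start 10.2.2.15 1163000000 win1
rdp-helper end 10.2.2.15
</pre>

<p>
In this hypothetical dispatch call, <code>win2</code> is flagged as unavailable
(availability 0). If nothing is stored yet for client 10.2.2.15, the
script shown below would pick <code>win1</code>, the available back end with
the lowest weight.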
   1880 <p>
   1881 Here's the external handler as a Perl script. It uses the module
   1882 <code>GDBM_File</code> which most likely will not be part of standard Perl
   1883 distributions, but can be added using CPAN. (Alternatively, any other
   1884 database module can be used.)
   1885 <p>
   1886 <pre>
   1887 #!/usr/bin/perl
   1888 
   1889 use strict;
   1890 use GDBM_File;
   1891 
   1892 # Global variables and configuration
   1893 # ----------------------------------
   1894 my $log = '/tmp/exthandler.log';    # Debug log, set to /dev/null to suppress
   1895 my $cdb = '/tmp/client.db';         # GDBM database of clients
   1896 my %db;                             # .. and memory representation of it
   1897 my $timeout = 24*60*60;             # Timeout of a connection in secs
   1898 
   1899 # Logging
   1900 # -------
   1901 sub msg {
   1902     return if ($log eq '/dev/null' or $log eq '');
   1903     open (my $of, "&gt;&gt;$log") or return;
   1904     print $of (scalar(localtime()), ' ', @_);
   1905     close ($of);
   1906 }
   1907 
   1908 # Reply a back end to the caller and stop processing.
   1909 # ---------------------------------------------------
   1910 sub reply ($) {
   1911     my $b = shift;
   1912     msg ("Suggesting $b to Crossroads.\n");
   1913     print ("$b\n");
   1914     exit (0);
   1915 }
   1916 
   1917 # Is a value in an array
   1918 # ----------------------
   1919 sub inarray {
   1920     my $val = shift;
   1921     for my $other (@_) {
   1922         return (1) if ($other eq $val);
   1923     }
   1924     return (0);
   1925 }
   1926 
   1927 # A connection is starting
   1928 # ------------------------
   1929 sub start {
   1930     my ($ip, $stamp, $backend) = @_;
   1931     msg ("Logging START of connection for IP $ip on stamp $stamp, ",
   1932          "back end $backend\n");
   1933     $db{$ip} = "$backend:$stamp";
   1934 }
   1935 
   1936 # A connection has ended
   1937 # ----------------------
   1938 sub end {
   1939     my $ip = shift;
   1940     msg ("Logging END of connection for IP $ip\n");
   1941     delete ($db{$ip});
   1942 }
   1943 
   1944 # Request to determine a back end
   1945 # -------------------------------
   1946 sub dispatch {
   1947     my $ip = shift;
   1948     my $stamp = shift;
   1949 
   1950     msg ("Request to dispatch IP $ip on stamp $stamp\n");
   1951     
   1952     # Read the next arguments. They are triplets of
   1953     # backend-name / availability / weight. Store if the back end is
   1954     # available.
   1955     my (@backends, @weights);
   1956     for (my $i = 0; $i &lt; $#_; $i += 3) {
   1957         if ($_[$i + 1] != 0) {
   1958             push (@backends, $_[$i]);
   1959             push (@weights,  $_[$i + 2]);
   1960             msg ("Candidate back end: $_[$i] with weight ", $_[$i + 2], "\n");
   1961         }
   1962     }
   1963 
   1964     # See if this is a reconnect by a previously seen client IP. We'll
   1965     # treat this as a reconnect if the timeout wasn't yet exceeded.
   1966     if ($db{$ip} ne '') {
   1967         my ($last_backend, $last_stamp) = split (/:/, $db{$ip});
   1968         msg ("IP $ip had last connected on $last_stamp to $last_backend\n");
   1969         if ($stamp &lt; $last_stamp + $timeout) {
   1970             msg ("Timeout not yet exceeded, this may be a reconnect\n");
   1971             # We'll allow a reconnect only if the stated last_backend is
   1972             # free (sanity check).
   1973             if (inarray ($last_backend, @backends)) {
   1974                 msg ("Last back end $last_backend is available, ",
   1975                      "letting through\n");
   1976                 reply ($last_backend);
   1977             } else {
   1978                 msg ("Last used back end isn't free, suggesting a new one\n");
   1979             }
   1980         } else {
   1981             msg ("Timeout exceeded, suggesting a new back end\n");
   1982         }
   1983     } else {
   1984         msg ("No previous connection data, suggesting a new back end\n");
   1985     }
   1986 
   1987     my $bestweight = -1;
   1988     my $bestbackend;
   1989     for (my $i = 0; $i &lt;= $#weights; $i++) {
   1990         if ($bestweight == -1 or $bestweight &gt; $weights[$i]) {
   1991             $bestweight  = $weights[$i];
   1992             $bestbackend = $backends[$i];
   1993         }
   1994     }
   1995 
   1996     msg ("Best back end: $bestbackend (given weight $bestweight)\n");
   1997     reply ($bestbackend);
   1998 }
   1999 
   2000 # Main starts here
   2001 # ----------------
   2002 msg ("Start of run, attaching GDBM database '$cdb'\n");
   2003 tie (%db, 'GDBM_File', $cdb, &amp;GDBM_WRCREAT, 0600);
   2004 
   2005 # The first argument must be an action 'dispatch', 'start' or 'end'.
   2006 # Depending on the action, we do stuff.
   2007 my $action = shift (@ARGV);
   2008 if ($action eq 'dispatch') {
   2009     dispatch (@ARGV);
   2010 } elsif ($action eq 'start') {
   2011     start (@ARGV);
   2012 } elsif ($action eq 'end') {
   2013     end (@ARGV);
   2014 } else {
   2015     print STDERR ("Usage: rdp-helper {dispatch|start|end} args\n");
   2016     exit (1);
   2017 }
   2018 </pre>
   2019 
   2020 <p>
   2021 <a name="l49"></a>
   2022 <h3>5.3: HTTP Session Stickiness</h3>
   2023 <p>
   2024 This section focuses on HTTP session stickiness. This term refers to
   2025 the ability of a balancer to route a conversation between browser and
   2026 a backend farm always to the same back end. In other words: once a
   2027 back end is selected by the balancer, it will remain the back end of
   2028 choice, even for subsequent connections.
   2029 <p>
   2030 <a name="l50"></a>
   2031 <strong>5.3.1: Don't use stickiness!</strong>
   2032 <p>
   2033 The rule of thumb, as far as the balancer is concerned, is: <strong>Do not
   2034 use HTTP session stickiness unless you really have to.</strong> Enabling
   2035 session stickiness hampers failover, balancing and performance:
   2036 <p>
   2037 <ul>
   2038     <li> Failover is hampered because during the session,
   2039          the balancer has to assign new connections to the same back
   2040          end that was selected at the start of a session. If the back
   2041          end suddenly goes 'down', then the session will most likely
   2042          crash. (Actually, when a back end becomes unreachable in the
   2043          middle of a session, Crossroads will assign a new back end to
   2044          that session. This will most likely result in a malfunction
   2045          of the underlying application.)
   2046     <li> Balancing is hampered because at the start of the session,
   2047          the balancer has selected the next-best back end. But during
   2048          the session, that back end may well become overloaded. The
   2049          balancer however must continue to send the requests there.
   2050     <li> Performance is hampered because crossroads needs to 'unpack'
   2051          messages as they are passed to and fro. That's because
   2052          crossroads needs to check the HTTP headers in the messages
   2053          for persistence cookies.</ul>
   2054 <p>
   2055 There are a number of measures that you can take to avoid using session
   2056 stickiness. E.g., session data can be 'shared' between web back
   2057 ends. PHP offers functionality to store session data in a database, so
   2058 that all PHP applications have access to these data. Application
   2059 servers such as Websphere can be configured to replicate session data
   2060 between nodes.
   2061 <p>
   2062 <a name="l51"></a>
   2063 <strong>5.3.2: But if you must..</strong>
   2064 <p>
   2065 However, if you <strong>must</strong> use session stickiness, then proceed as
   2066 follows:
   2067 <p>
   2068 <ul>
   2069     <li> At the level of a <code>service</code> description, set the type to
   2070          <code>http</code>. 
   2071     <li> At the level of each back end description, configure the
   2072          <code>stickycookie</code> and <code>addclientheader</code> directives.</ul>
   2073 <p>
   2074 Once crossroads sees that, it will examine each HTTP message that it
   2075 shuttles between client and back end:
   2076 <p>
   2077 <ul>
   2078     <li> If there is no persistence cookie in the HTTP headers of a
   2079          client's request, then the message must be the first one and
   2080          a new session should be established. 
   2081          Crossroads selects an appropriate back
   2082          end, sends the message to that back end, catches the reply,
   2083          and inserts a <code>Set-Cookie</code> directive.
   2084     <li> If there is a persistence cookie in the HTTP headers of a
   2085          client's request, then the request is part of an already
   2086          established session. Crossroads analyzes the cookie and
   2087          forwards the request to the appropriate back end.</ul>
   2088 <p>
   2089 Below is a short example of a configuration.
   2090 <p>
   2091 <pre>
   2092 service www {
   2093     port 80;
   2094     type http;
   2095     revivinginterval 15;
   2096     dispatchmode byconnections;
   2097 
   2098     backend one {
   2099         server 10.1.1.100:80;
   2100         stickycookie XRID=100;
   2101         addclientheader "Set-Cookie: XRID=100; Path=/";
   2102     }
   2103 
   2104     backend two {
   2105         server 10.1.1.101:80;
   2106         stickycookie XRID=101;
   2107         addclientheader "Set-Cookie: XRID=101; Path=/";
   2108     }
   2109 }
   2110 </pre>
   2111 
   2112 <p>
   2113 Note how the cookie names and values in the directives
   2114 <code>stickycookie</code> and <code>addclientheader</code> match. That is obviously a
   2115 prerequisite for stickiness.
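<p>
For the configuration above, a sticky conversation might look roughly
as follows. This is a simplified, hypothetical sketch: the host name
and paths are made up, real messages carry more headers, and the
remarks in parentheses are annotations, not part of the traffic.
Lines sent by the client are marked <strong>C</strong>, lines sent back to the
client are marked <strong>B</strong>.
<p>
<pre>
C  GET /index.html HTTP/1.1          (first request, no cookie yet)
C  Host: www.example.com

B  HTTP/1.1 200 OK
B  Set-Cookie: XRID=100; Path=/      (added by Crossroads, back end one)

C  GET /next.html HTTP/1.1           (later request, same browser)
C  Host: www.example.com
C  Cookie: XRID=100                  (matches stickycookie of back end one)
</pre>
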
   2116 <p>
   2117 <a name="l52"></a>
   2118 <h3>5.4: Passing the client's IP address</h3>
   2119 <p>
   2120 Since Crossroads just shuttles bytes to and fro, meta-information of
   2121 network connections is lost. As far as the back ends are concerned,
   2122 their connections originate at the Crossroads junction.
   2123 For example, standard Apache access logs will show the IP address of
   2124 Crossroads. 
   2125 <p>
   2126 In order to compensate for this, Crossroads can insert a special
   2127 header in HTTP connections, to inform the back end of the original
   2128 client's IP address. In order to enable this, the Crossroads
   2129 configuration must state the following:
   2130 <p>
   2131 <ul>
   2132     <li> The service type must be <code>http</code>, and not <code>any</code>;
   2133     <li> In the back end definition, the following statement must
   2134          occur: <br>
   2135          <code>addserverheader "X-Real-IP: %r";</code> <br>
   2136          You are of course free to choose the header name;
   2137          <code>X-Real-IP</code>, as used here, is a common name for this purpose.</ul>
   2138 <p>
   2139 After this, HTTP traffic that arrives at the back ends has a new
   2140 header: <code>X-Real-IP</code>, holding the client's IP address.
   2141 <strong>Note that</strong> once the type is set to <code>http</code>, Crossroads'
   2142 performance will be hampered -- all passing messages will have to be
   2143 unpacked and analyzed.
   2144 <p>
   2145 <a name="l53"></a>
   2146 <strong>5.4.1: Sample Crossroads configuration</strong>
   2147 <p>
   2148 The sample configuration below shows two HTTP back ends that receive
   2149 the client's IP address:
   2150 <p>
   2151 <pre>
   2152 
   2153 service www {
   2154     port 80;
   2155     type http;
   2156     revivinginterval 5;
   2157     dispatchmode roundrobin;
   2158 
   2159     backend one {
   2160         server 10.1.1.100:80;
   2161         addserverheader "X-Real-IP: %r";
   2162     }
   2163 
   2164     backend two {
   2165         server 10.1.1.200:80;
   2166         addserverheader "X-Real-IP: %r";
   2167     }
   2168 }
   2169 </pre>
   2170 
   2171 <p>
   2172 <a name="l54"></a>
   2173 <strong>5.4.2: Sample Apache configuration</strong>
   2174 <p>
   2175 The method by which each back end analyzes the header <code>X-Real-IP</code>
   2176 will obviously differ per server implementation. However, a
   2177 common method with the Apache webserver is to log the client's IP
   2178 address into the access log.
   2179 <p>
   2180 Often this is accomplished using the log format <code>common</code>, defined as
   2181 follows:
   2182 <p>
   2183 <pre>
   2184 LogFormat "%h %l %u %t %D \"%r\" %&gt;s %b" common
   2185 CustomLog logs/access_log common
   2186 </pre>
   2187 
   2188 <p>
   2189 The first line defines the format <code>common</code>, with the remote host
   2190 specified by <code>%h</code>. The second line sends access information to a log
   2191 file <code>logs/access_log</code>, using the previously defined format
   2192 <code>common</code>.
   2193 <p>
   2194 Fortunately, Apache's <code>LogFormat</code> allows one to log contents of
   2195 headers. By replacing the <code>%h</code> with <code>%{X-Real-IP}i</code>, the desired
   2196 information is sent to the log. Therefore, normally you can simply
   2197 redefine the <code>common</code> format to 
   2198 <p>
   2199 <pre>
   2200 LogFormat "%{X-Real-IP}i %l %u %t %D \"%r\" %&gt;s %b" common
   2201 </pre>
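
<p>
If other parts of the Apache configuration rely on the standard
<code>common</code> format, an alternative is to leave that format alone and to
define a separate one for traffic that arrives via Crossroads. In the
sketch below, the nickname <code>viacrossroads</code> is an arbitrary choice:
<p>
<pre>
LogFormat "%{X-Real-IP}i %l %u %t %D \"%r\" %&gt;s %b" viacrossroads
CustomLog logs/access_log viacrossroads
</pre>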
   2202 
   2203 <p>
   2204 <a name="l55"></a>
   2205 <h3>5.5: Debugging network traffic</h3>
   2206 <p>
   2207 In case the traffic between
   2208     client and backend
   2209     must be debugged, the statement <code>trafficlog</code> <em>filename</em> can
   2210     be issued. This causes the traffic to be dumped in hexadecimal
   2211     format to the stated filename.
   2212 <p>
   2213 Traffic sent by the client is prefixed by a <strong>C</strong>, traffic sent by
   2214     the back end is prefixed by a <strong>B</strong>. Below is a sample traffic
   2215     dump of a browser trying to get an HTML page. The server replies
   2216     that the page was not modified.
   2217 <p>
   2218 <pre>
   2219 C 0000  47 45 54 20 68 74 74 70 3a 2f 2f 77 77 77 2e 63 GET http://www.c
   2220 C 0010  73 2e 68 65 6c 73 69 6e 6b 69 2e 66 69 2f 6c 69 s.helsinki.fi/li
   2221 C 0020  6e 75 78 2f 6c 69 6e 75 78 2d 6b 65 72 6e 65 6c nux/linux-kernel
   2222 C 0030  2f 32 30 30 31 2d 34 37 2f 30 34 31 37 2e 68 74 /2001-47/0417.ht
   2223 C 0040  6d 6c 20 48 54 54 50 2f 31 2e 31 0d 0a 43 6f 6e ml HTTP/1.1..Con
   2224 C 0050  6e 65 63 74 69 6f 6e 3a 20 63 6c 6f 73 65 0d 0a nection: close..
   2225 .
   2226 . etcetera
   2227 .
   2228 B 0000  48 54 54 50 2f 31 2e 30 20 33 30 34 20 4e 6f 74 HTTP/1.0 304 Not
   2229 B 0010  20 4d 6f 64 69 66 69 65 64 0d 0a 44 61 74 65 3a  Modified..Date:
   2230 B 0020  20 54 75 65 2c 20 31 32 20 4a 75 6c 20 32 30 30  Tue, 12 Jul 200
   2231 B 0030  35 20 30 39 3a 34 39 3a 34 37 20 47 4d 54 0d 0a 5 09:49:47 GMT..
   2232 B 0040  43 6f 6e 74 65 6e 74 2d 54 79 70 65 3a 20 74 65 Content-Type: te
   2233 B 0050  78 74 2f 68 74 6d 6c 3b 20 63 68 61 72 73 65 74 xt/html; charset
   2234 .
   2235 . etcetera
   2236 .
   2237 </pre>
   2238 
   2239 <p>
   2240 Turning on traffic dumps will <em>significantly</em>
   2241     slow down crossroads.
   2242 <p>
   2243 Besides <code>trafficlog</code>, there is also a directive
   2244     <code>throughputlog</code>. This directive also takes one argument, a
   2245     filename. The file is appended, and the following information is
   2246     logged:
   2247 <p>
   2248 <ul>
   2249         <li> The process ID of the crossroads image that serves the
   2250         TCP connection;
   2251         <li> The time of the request, in seconds and microseconds
   2252         since start of the run;
   2253         <li> A <strong>C</strong> when the request originated at the client, or
   2254         <strong>B</strong> when the request originated at the back end;
   2255         <li> The first 100 bytes of the request.</ul>
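<p>
As a sketch, both directives might be stated as follows. The file
names are arbitrary, and it is an assumption here that the directives
belong at the level of the service description; the output of
<code>crossroads sampleconf</code> is the authoritative reference.
<p>
<pre>
service www {
    port 80;
    trafficlog /tmp/traffic.log;
    throughputlog /tmp/throughput.log;
    .
    . Other service directives and back ends occur here
    .
}
</pre>
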
   2256 <p>
   2257 As an example, consider the following (the lines are shortened for
   2258     brevity and prefixed by line numbers for clarity):
   2259 <p>
   2260 <pre>
   2261 
   2262 1 0000594 0.000001 C GET http://public.e-tunity.com/index.html...
   2263 2 0000594 0.173713 B HTTP/1.0 200 OK..Date: Fri, 18 Nov 2005 0...
   2264 3 0000594 0.278125 B  width="100" bgcolor="#e0e0e0" valign="to...
   2265 4 0000595 0.000001 C GET http://public.e-tunity.com/css/style/...
   2266 5 0000594 0.944339 B /a&gt;&lt;/td&gt;..  &lt;/tr&gt;.&lt;/table&gt;.&lt;/td&gt;&lt;td class...
   2267 6 0000594 0.946356 B smallboxdownl"&gt;Download&lt;/td&gt;..  &lt;td class...
   2268 7 0000594 0.961102 B td&gt;&lt;td class="smallboxodd" valign="top"&gt;&lt;...
   2269 8 0000595 0.698215 B HTTP/1.0 304 Not Modified..Date: Fri, 18 ...
   2270 </pre>
   2271 
   2272 <p>
   2273 This tells us that:
   2274 <p>
   2275 <ul>
   2276         <li> Line 1:  PID 594 served a request that originated at
   2277              the client. The corresponding time is (almost) 0 seconds,
   2278              so this is really the start of the run.
   2279         <li> Line 2: A back end replied 0.17 seconds later, and
   2280              0.28 seconds later, it was still replying (this is the
   2281              third line, again a <strong>B</strong>-type transmission).
   2282         <li> Line 4: PID 595 served a request that originated
   2283              at the client. Again, the corresponding time is (almost)
   2284              0 seconds, since this is the first conversation part of
   2285              this connection.
   2286         <li> Lines 5 to 7: This is the continuation of line 2. Line 7
   2287              is the last line of the <strong>B</strong> series (not visible from
   2288              the example, but trust me, it is), so that we may
   2289              conclude that it took the back end 0.96 seconds to serve
   2290              the file <code>index.html</code> requested in line 1.
   2291         <li> Line 8: This is the answer to the client's request of
   2292              line 4 (you can tell by the process ID  number).
   2293              So the back end took about 0.70 seconds to confirm that
   2294              the stylesheet requested in line 4 wasn't modified.</ul>
   2295 <p>
   2296 It is also worth remembering that the start time of a <strong>C</strong>
   2297     request is the time that crossroads sees the activity. Any latency
   2298     between the true client and crossroads is obviously not
   2299     included. This is illustrated by the simple ASCII art below:
   2300 <p>
   2301 <pre>
   2302 
   2303 client ----&gt;----&gt;----&gt;---&gt;*crossroads ====&gt;====&gt;====&gt;
   2304                                                      \    
   2305                                                   back end
   2306                                                      /
   2307 client ----&lt;----&lt;----&lt;---&lt; crossroads ====&lt;====&lt;====&lt;
   2308 
   2309 </pre>
   2310 
   2311 <p>
   2312 This simple picture shows a typical HTTP request that originates
   2313     at a client, travels to crossroads, and is relayed via the back
   2314     end. The <strong>C</strong> entry in a throughput log is the time when
   2315     crossroads sees the request, indicated by an asterisk. The <strong>B</strong>
   2316     entries are the times that it takes the back end to answer,
   2317     indicated by <code>===</code> style lines. Therefore, the true roundtrip
   2318     time will be longer than the number of seconds that are logged in
   2319     the throughput log: the latency between client and crossroads
   2320     isn't included in that measurement.
   2321 <p>
   2322 Summarizing, the throughput times of a client-back end connection
   2323     can be analyzed using the directive <code>throughputlog</code>. In a
   2324     real-world analysis, you'd probably want to write up a script to
   2325     analyze the output and to compute round trip times. Such scripts
   2326     are not (yet) included in Crossroads.
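<p>
A minimal sketch of such a script, in Perl, might look as follows. It
is not part of Crossroads. It assumes that each line of the throughput
log holds the process ID, the time, the letter <strong>C</strong> or <strong>B</strong> and
the data, separated by white space (as in the sample above, but
without the line-number column), and that each process ID serves just
one request. Per process ID it prints the time between the first
<strong>C</strong> entry and the last <strong>B</strong> entry.
<p>
<pre>
#!/usr/bin/perl

use strict;
use warnings;

# Assumed line layout: PID, seconds since start, C or B, data.
my (%first_c, %last_b);
while (my $line = &lt;&gt;) {
    next unless ($line =~ /^\s*(\d+)\s+(\d+\.\d+)\s+([CB])\s/);
    my ($pid, $time, $who) = ($1, $2, $3);
    if ($who eq 'C') {
        # Remember only the first client activity of this process
        $first_c{$pid} = $time unless (exists ($first_c{$pid}));
    } else {
        # Keep overwriting: the last B entry is the one that counts
        $last_b{$pid} = $time;
    }
}

# Report per process ID, skipping connections without a reply
for my $pid (sort (keys (%first_c))) {
    next unless (exists ($last_b{$pid}));
    printf "PID %s: %.6f seconds from first C to last B\n",
           $pid, $last_b{$pid} - $first_c{$pid};
}
</pre>

<p>
Saved under a name such as <code>throughput-times</code> (a made-up name), the
script could be run as <code>perl throughput-times /tmp/throughput.log</code>.
For the sample log above it would report roughly 0.96 seconds for PID
594 and 0.70 seconds for PID 595.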
   2327 <p>
   2328 <a name="l56"></a>
   2329 <h3>5.6: Limiting Access to Crossroads by Client IP Address</h3>
   2330 <p>
   2331 <a name="l57"></a>
   2332 <strong>5.6.1: General Examples</strong>
   2333 <p>
   2334 The directives <code>allowfrom</code>, <code>denyfrom</code>, <code>allowfile</code> and
   2335 <code>denyfile</code> can be used to instruct Crossroads to specifically allow
   2336 access by using a "whitelist" of IP addresses, or to specifically deny
   2337 access by using a "blacklist". E.g., the following configuration
   2338 allows access to service <code>webproxy</code> only to <em>localhost</em>:
   2339 <p>
   2340 <pre>
   2341 service webproxy {
   2342     port 8000;    
   2343     allowfrom 127.0.0.1;
   2344     backend one {
   2345         .
   2346         . Back end definitions occur here
   2347         .
   2348     }
   2349     .
   2350     . Other back ends or other service directives
   2351     . may occur here
   2352     .
   2353 }
   2354 </pre>
   2355 
   2356 <p>
   2357 In this example there is a "whitelist" having only one entry: IP
   2358 address 127.0.0.1, or <em>localhost</em>. (Incidentally, the same behaviour
   2359 could be accomplished by stating <em>bindto 127.0.0.1</em>, in which case
   2360 Crossroads would only listen to the local network device.)
   2361 <p>
   2362 In the same vein, the directive <code>allowfrom 127.0.0.1 192.168.1/24</code>
   2363 would allow access to <em>localhost</em> and to all IP addresses that start
   2364 with 192.168.1. The specifier <code>192.168.1/24</code> states that there are
   2365 three network bytes (192, 168 and 1), and 24 bits (or 3 bytes) are
   2366 relevant; so that the fourth network byte doesn't matter.
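<p>
The matching rule can be sketched in a few lines of Perl. This is only
an illustration of the arithmetic and not the code that Crossroads
itself uses; the function names <code>ip2num</code> and <code>matches</code> are made up
for this example, and a specifier without a mask is assumed to mean
that all 32 bits are relevant.
<p>
<pre>
#!/usr/bin/perl

use strict;
use warnings;

# Convert a (possibly shortened) dotted address such as 192.168.1
# into a 32-bit number; bytes left out on the right count as 0.
sub ip2num {
    my @bytes = split (/\./, shift);
    push (@bytes, 0) while (@bytes &lt; 4);
    return ($bytes[0] &lt;&lt; 24 | $bytes[1] &lt;&lt; 16 | $bytes[2] &lt;&lt; 8 | $bytes[3]);
}

# Does client address $ip match specifier $spec (p.q.r.s or p.q.r.s/mask)?
sub matches {
    my ($ip, $spec) = @_;
    my ($net, $bits) = split (/\//, $spec);
    $bits = 32 if (! defined ($bits));
    # Build a mask with the leftmost $bits bits set
    my $mask = $bits == 0 ? 0 : (0xffffffff &lt;&lt; (32 - $bits)) &amp; 0xffffffff;
    return ((ip2num ($ip) &amp; $mask) == (ip2num ($net) &amp; $mask));
}

for my $ip ('192.168.1.77', '192.168.2.77') {
    printf "%s %s 192.168.1/24\n", $ip,
           matches ($ip, '192.168.1/24') ? "matches" : "doesn't match";
}
</pre>

<p>
With a mask of 24, only the first three bytes are compared, so that
192.168.1.77 matches the specifier and 192.168.2.77 doesn't.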
   2367 <p>
   2368 <a name="l58"></a>
   2369 <strong>5.6.2: Using External Files</strong>
   2370 <p>
   2371 The directives <code>allowfile</code> and <code>denyfile</code> allow you to specify IP
   2372 addresses in external files. The Crossroads configuration states
   2373 e.g. <code>allowfile /tmp/allow.txt</code>, and the IP addresses are then in
   2374 <code>/tmp/allow.txt</code>. The format of <code>/tmp/allow.txt</code> is as follows:
   2375 <p>
   2376 <ul>
   2377         <li> The specifications follow again <em>p.q.r.s/mask</em>, where
   2378         p, q, r and s are network bytes which can be left out on the
   2379         right hand side when the mask allows it;
   2380 <p>
   2381 <li> The specifications must be separated by white space
   2382         (spaces, tabs or newlines).</ul>
   2383 <p>
   2384 E.g., the following is a valid example of an external specification
   2385 file:
   2386 <p>
   2387 <pre>
   2388 127.0.0.1
   2389 192.168.1/24
   2390 10/8
   2391 </pre>
   2392 
   2393 <p>
   2394 When external files are in effect, then the signal <code>SIGHUP</code> (1)
   2395 causes Crossroads to reload the external file. E.g., while Crossroads
   2396 is running, you may edit <code>/tmp/allow.txt</code>, and then issue <code>killall
   2397 -1 crossroads</code>. The new contents of <code>/tmp/allow.txt</code> will be
   2398 reloaded.
   2399 <p>
   2400 <a name="l59"></a>
   2401 <strong>5.6.3: Mixing Directives</strong>
   2402 <p>
   2403 Crossroads allows you to mix all directives in one service
   2404 description. However, some mixes are less meaningful than others. It's
   2405 up to you to take this into account.
   2406 <p>
   2407 The following rules apply:
   2408 <p>
   2409 <ul>
   2410         <li> Blacklisting and whitelisting can be used together. When
   2411         combined, the blacklist will always be interpreted
   2412         first. E.g., consider the following directives:
   2413 <p>
   2414 <pre>
   2415 allowfrom 192.168.1/24
   2416 denyfrom  192.168.1.100
   2417 </pre>
   2418 
   2419 <p>
   2420 Given the fact that the deny list is checked first, client
   2421         192.168.1.100 won't be able to access Crossroads. Then the
   2422         allow list will be checked, stating that all clients whose IP
   2423         address starts with 192.168.1 may connect. The effect will be
   2424         that e.g., client 192.168.1.1 may connect, 192.168.1.2 may
   2425         connect too, 192.168.1.100 will be blocked, and 10.1.1.1 will
   2426         be blocked as well.
   2427 <p>
   2428 Now consider the following directives:
   2429 <p>
   2430 <pre>
   2431 allowfrom 192.168.1.100 127.0.0.1
   2432 denyfrom  192.168.1/24
   2433 </pre>
   2434 
   2435 <p>
   2436 This will first of all deny access to all IP addresses that
   2437         start with 192.168.1. So the rule that allows 192.168.1.100
   2438         won't ever be effective. The net result will be that access
   2439         will be granted to 127.0.0.1, and IP addresses that don't
   2440         match 192.168.1/24.
   2441 <p>
   2442 <li> Blacklisting or whitelisting can be left out.
   2443         A list is considered empty when no appropriate directives
   2444         occur in <code>/etc/crossroads.conf</code>, or when the directive
   2445         points to an empty or non-existent external file.
   2446 <p>
   2447 <li> Using both <code>*from</code> and <code>*file</code> statements of the same kind is allowed, but
   2448         doesn't make sense. E.g., the following configuration sample
   2449         is such a case:
   2450 <p>
   2451 <pre>
   2452 allowfrom 127.0.0.1 192.168.1/24
   2453 allowfile /tmp/allow.txt
   2454 </pre>
   2455 
   2456 <p>
   2457 There is a technical reason for this. Once Crossroads
   2458         processes the <code>allowfile</code> directive, then the whole
   2459         whitelist is cleared (thereby removing the entries 127.0.0.1
   2460         and 192.168.1/24), and new entries are reloaded from the
   2461         file. The net result is that the <code>allowfrom</code> specification
   2462         is overruled.
   2463 <p>
   2464 Crossroads doesn't check for such configurations, which are
   2465         syntactically correct, but make no semantic sense.</ul>
   2466 <p>
   2467 <a name="l60"></a>
   2468 <h3>5.7: Configuration examples</h3>
   2469 <p>
   2470 As a general hint, use <code>crossroads sampleconf</code> to view the most
   2471 up-to-date examples of configurations. The description below shows a
   2472 few examples too.
   2473 <p>
   2474 <a name="l61"></a>
   2475 <strong>5.7.1: A load balancer for three webserver back ends</strong>
   2476 <p>
   2477 The following configuration example binds crossroads to port 8000 of the
   2478 current server, and distributes the load over three back ends. This
   2479 configuration shows most of the possible settings.
   2480 <p>
   2481 <pre>
   2482 service www {
   2483     /* We don't need session stickiness. */
   2484     type any;
   2485     
   2486     /* Port on which we'll listen in this service: required. */
   2487     port 8000;
   2488 
   2489     /* On what IP address should this service listen? Default is 'any'.
   2490      * Alternatively you can state an explicit IP address, such as
   2491      * 127.0.0.1; that would bind the service only to 'localhost'. */
   2492     bindto any;
   2493     
   2494     /* Verbose reporting or not. Default is off. */    
   2495     verbosity on;
   2496     
   2497     /* Dispatching mode, or: How to select a back end for an incoming
   2498      * request. Possible values:
   2499      *   roundrobin: just the next back end in line
   2500      *   random: like roundrobin, but at random to make things more
   2501      *          confusing. Probably only good for testing.
   2502      *   bysize: The backend that transferred the least nr of bytes
   2503      *          is the next in line. As a modifier you can say e.g.
   2504      *          bysize over 10, meaning that the 10 last connections will
   2505      *          be used to compute the transfer size, instead of all
   2506      *          transfers.
   2507      *   byduration: The backend that was active for the shortest time
   2508      *          is the next in line. As a modifier you can say e.g.
   2509      *          byduration over 10 to compute over the last 10 connections.
   2510      *   byconnections: The back end with the fewest active connections
   2511      *          is the next in line.
   2512      *   byorder: The first available back end is always taken.
   2513      */
   2514     dispatchmode byduration over 5;
   2515 
   2516     /* Interval at which we'll check whether a temporarily unavailable
   2517      * backend has woken up.
   2518      */
   2519     revivinginterval 5;
   2520     
   2521     /* TCP backlog of connections. Default is 0 (no backlog, one
   2522      * connection may be active).
   2523      */
   2524     backlog 5;
   2525     
   2526     /* For status reporting: a shared memory key. Default is the same
   2527      * as the port number, OR-ed by a magic number.
   2528      */
   2529     shmkey 8000;
   2530 
   2531     /* This controls when crossroads should consider a connection as
   2532      * finished even when the TCP sockets weren't closed. This is to
   2533      * avoid hanging connections that don't do anything. NOTE THAT when
   2534      * crossroads cuts off a connection because the timeout was exceeded,
   2535      * this is marked as a success, not as a failure. Default is 0: no timeout.
   2536      */
   2537     connectiontimeout 300;
   2538 
   2539     /* The max number of allowed client connections. When present, connections
   2540      * won't be accepted if the max is about to be exceeded. When
   2541      * absent, all connections will be accepted, which might be misused
   2542      * for a DoS attack.
   2543      */
   2544     maxconnections 300;
   2545 
   2546     /* Now let's define a couple of back ends. Number 1: */
   2547     backend www_backend_1 {
   2548         /* The server and its port, the minimum configuration. */
   2549         server httpserver1;
   2550         port 9010;
   2551         /* The 'decay' of usage data of this back end. Only relevant
   2552          * when the whole service has 'dispatchmode bysize' or
   2553          * 'byduration'. The number is a percentage by which the usage
   2554          * parameter is decreased upon each connection of another back
   2555          * end.
   2556          */
   2557         decay 10;
   2558         
   2559         /* To see what's happening in /var/log/messages: */
   2560         verbosity on;
   2561     }
   2562 
   2563     /* The second one: */
   2564     backend www_backend_2 {
   2565         /* Server and port */
   2566         server httpserver2;
   2567         port 9011;
   2568 
   2569         /* Verbosity of reporting when this back end is active */
   2570         verbosity on;
   2571 
   2572         /* Decay */
   2573         decay 10;
   2574 
   2575         /* This back end is twice as weak as the first one */
   2576         weight 2;
   2577 
   2578         /* Event triggers for system commands upon successful activation
   2579          * and upon failure.
   2580          */
   2581         onsuccess echo 'success on backend 2' | mail root;
   2582         onfailure echo 'failure on backend 2' | mail root;
   2583     }
   2584 
   2585     /* And yet another one.. this time we will dump the traffic
   2586      * to a trace file. Furthermore we don't want more than 10 concurrent
   2587      * connections here. Note that there's also a total maxconnections for the
   2588      * whole service.
   2589      */
   2590     backend www_backend_3 {
   2591         server httpserver3;
   2593         port 9000;
   2594         verbosity on;
   2595         decay 10;
   2596         trafficlog /tmp/backend.3.log;
   2597         maxconnections 10;
   2598     }
   2599 }
   2600 </pre>
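<p>
The dispatch modes listed in the comments above can be illustrated with a small Python sketch. It is not taken from the Crossroads sources: the <code>Backend</code> class and the function names are hypothetical, and only <code>byorder</code>, <code>byconnections</code> and <code>roundrobin</code> are modelled, under the assumption that unavailable back ends are skipped.
<p>
```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Backend:
    # Hypothetical stand-in for one 'backend' block; not Crossroads' own data structure.
    name: str
    available: bool = True
    active_connections: int = 0


def pick_byorder(backends):
    """byorder: the first available back end is always taken."""
    for b in backends:
        if b.available:
            return b
    return None


def pick_byconnections(backends):
    """byconnections: the available back end with the fewest active connections."""
    candidates = [b for b in backends if b.available]
    return min(candidates, key=lambda b: b.active_connections, default=None)


def make_roundrobin(backends):
    """roundrobin: the next available back end in line, wrapping around."""
    position = count()

    def pick():
        # Try each back end at most once per call, starting where we left off.
        for _ in range(len(backends)):
            b = backends[next(position) % len(backends)]
            if b.available:
                return b
        return None

    return pick


backends = [
    Backend("www_backend_1", active_connections=3),
    Backend("www_backend_2", available=False),
    Backend("www_backend_3", active_connections=1),
]
print(pick_byorder(backends).name)
print(pick_byconnections(backends).name)
```
<p>
In this model, <code>pick_byorder</code> keeps returning the first back end for as long as it is available, so the remaining back ends act only as fallbacks.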
   2601 
   2602 <p>
   2603 <a name="l62"></a>
   2604 <strong>5.7.2: An HTTP forwarder when travelling</strong>
   2605 <p>
   2606 As another example, here's my <code>crossroads.conf</code> that I use on my
   2607 Unix laptop. The problem that I face is that I need many HTTP proxy
   2608 configurations (at home, at customers' sites and so on) but I'm too
   2609 lazy to reconfigure browsers all the time.
   2610 <p>
   2611 Here's how it used to be before crossroads:
   2612 <p>
   2613 <ul>
   2614         <li> At home, I would surf through a squid proxy on my local
   2615         machine. The browser proxy setting is then
   2616         <code>http://localhost:3128</code>.
   2617 <p>
   2618 <li> Sometimes I start up an SSH tunnel to our offices. The
   2619         tunnel has a local port 3129, and connects to a squid proxy on
   2620         our e-tunity server. Hence, the browser proxy is then
   2621         <code>http://localhost:3129</code>.
   2622 <p>
   2623 <li> At a customer's location I need the proxy
   2624         <code>http://10.120.34.113:8080</code>, because they have configured it
   2625         so.
   2626 <p>
   2627 <li> And in yet other instances, I use an HTTP diagnostic tool
   2628         <a href="http://www.xk72.com/charles">Charles</a>
   2629         that sits between browser and website and shows me
   2630         what's happening. I run Charles on my own machine and it
   2631         listens on port 8888, behaving like a proxy. The browser
   2632         configuration for the proxy is then
   2633         <code>http://localhost:8888</code>.</ul>
   2634 <p>
   2635 Here's how it works with a crossroads configuration:    
   2636 <p>
   2637 <ul>
   2638         <li> I have configured my browsers to use
   2639         <code>http://localhost:8080</code> as the proxy, in all situations.
   2640 <p>
   2641 <li> I use the following crossroads configuration, and let
   2642         crossroads figure out which proxy backend works, and which
   2643         doesn't. Note two particularities:
   2644 <p>
   2645 <ul>
   2646                 <li> The statement <code>dispatchmode byorder</code>
   2647                 makes sure that once crossroads determines which
   2648                 backend works, it will stick to it. This usage of
   2649                 crossroads doesn't need to balance over more than one
   2650                 back end.
   2651 <p>
   2652 <li> The statement <code>bindto 127.0.0.1</code> makes sure
   2653                 that requests from other interfaces than loopback
   2654                 won't get serviced.</ul>
   2655 <p>
   2656 <pre>
   2657 service HttpProxy {
   2658     port 8080;
   2659     bindto 127.0.0.1;
   2660     verbosity on;
   2661     dispatchmode byorder;
   2662     revivinginterval 15;
   2663 
   2664     backend Charles {
   2665         server localhost:8888;
   2666         verbosity on;
   2667     }
   2668 
   2669     backend CustomerProxy {
   2670         server 10.120.34.113:8080;
   2671         verbosity on;
   2672     }
   2673 
   2674     backend SshTunnel {
   2675         server localhost:3129;
   2676     }
   2677 
   2678     backend LocalSquid {
   2679         server localhost:3128;
   2680     }
   2681 }
   2682 </pre>
   2683 </ul>
   2684 <p>
   2685 As a final note, the commandline argument <code>tell</code> can be used to
   2686 influence crossroads' own mechanism for detecting back end
   2687 availability. E.g., if in the above example the back ends <code>SshTunnel</code>
   2688 and <code>LocalSquid</code> are both active, then <code>crossroads tell httpproxy
   2689 sshtunnel down</code> will 'take down' the back end <code>SshTunnel</code> -- and
   2690 will automatically cause crossroads to switch to <code>LocalSquid</code>.
   2691 <p>
   2692 <a name="l63"></a>
   2693 <strong>5.7.3: SSH login with enforced idle logout</strong>
   2694 <p>
   2695 The following example shows how crossroads 'throttles' SSH
   2696 logins. Connections are accepted on port
   2697 22 (the normal SSH port) and forwarded to the actual SSH daemon
   2698 which is running on port 2222.
   2699 <p>
   2700 Note the usage of the
   2701 <code>connectiontimeout</code> directive. This makes sure that users are logged
   2702 out after 10 minutes of inactivity. Note also the <code>maxconnections</code>
   2703 setting, which makes sure that no more than 10 concurrent logins occur.
   2704 <p>
   2705 <pre>
   2706 service Ssh {
   2707     port 22;
   2708     backlog 5;
   2709     maxconnections 10;
   2710     connectiontimeout 600;
   2711     backend TrueSshDaemon {
   2712         server localhost:2222;
   2713     }
   2714 }
   2715 </pre>
   2716 
   2717 <p>
   2718 <a name="l64"></a>
   2719 <h2>6: Benchmarking</h2>
   2720 <a name="benchmarking"></a>This section shows how crossroads affects the
   2721 transmission of HTML data when used as an intermediate 'station'
   2722 through which all data travels.
   2723 <p>
   2724 <a name="l65"></a>
   2725 <h3>6.1: Benchmark 1: Accessing a proxy via crossroads or directly</h3>
   2726 <p>
   2727 The benchmark was run on a system where the following was varied:
   2728 <p>
   2729 <ol>
   2730         <li> A website was recursively spidered through a local squid
   2731         proxy. The spidering was repeated 10 times, the total was recorded.
   2732 <p>
   2733 <li> Crossroads was placed in front of the squid proxy, and
   2734         the website was again recursively spidered. Again, the
   2735         spidering was repeated 10 times and the total was recorded.</ol>
   2736 <p>
   2737 The crossroads configuration of the second alternative is shown below:
   2738 <p>
   2739 <pre>
   2740 service HttpProxy {
   2741     port 8080;
   2742     verbosity on;
   2743     backend LocalSquid {
   2744         server 127.0.0.1;
   2745         port 3128;
   2746         verbosity on;
   2747     }
   2748 }
   2749 </pre>
   2750 
   2751 <p>
   2752 <a name="l66"></a>
   2753 <strong>6.1.1: Results</strong>
   2754 <p>
   2755 The results of this test are that crossroads causes a negligible
   2756 delay, if it is statistically relevant at all. Without crossroads, the
   2757 timing results are:
   2758 <p>
   2759 <pre>
   2760 real	0m8.146s
   2761 user	0m0.130s
   2762 sys	0m0.253s
   2763 </pre>
   2764 
   2765 <p>
   2766 When using crossroads as a middle station, the results are:
   2767 <p>
   2768 <pre>
   2769 real	0m9.481s
   2770 user	0m0.141s
   2771 sys	0m0.230s
   2772 </pre>
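<p>
To put the two <code>real</code> timings side by side, the small Python sketch below converts them to seconds and computes the difference. The helper name <code>to_seconds</code> is made up for this illustration; the two stamps are copied from the listings above. Since these are single measurements, the outcome says nothing about statistical significance.
<p>
```python
def to_seconds(stamp):
    """Convert a time(1)-style stamp such as '0m8.146s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


direct = to_seconds("0m8.146s")          # squid accessed directly
via_crossroads = to_seconds("0m9.481s")  # crossroads in front of squid

extra = via_crossroads - direct
percent = 100 * extra / direct
print(f"extra wall-clock time: {extra:.3f} s ({percent:.1f}%)")
```
<p>
For the figures above this reports a difference of 1.335 s, or about 16.4% of the direct timing.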
   2773 
   2774 <p>
   2775 <a name="l67"></a>
   2776 <strong>6.1.2: Discussion</strong>
   2777 <p>
   2778 The results shown above are quite favorable to crossroads. However,
   2779 one should know that situations will exist where crossroads leans
   2780 towards the 'worst case' scenario, causing up to 50%
   2781 delay.
   2782 <p>
   2783 E.g., imagine a test where a <code>wget</code> command retrieves an
   2784 HTML document from an Apache server on <code>localhost</code>. Now we have
   2785 (almost) no overhead due to network throttling, hostname lookups and
   2786 so on. If this test were run both with and without crossroads
   2787 in between, then theoretically, crossroads would cause a much larger
   2788 delay, because it has to read from the server, and then write the same
   2789 information to <code>wget</code>.  Each read/write occurs twice when crossroads
   2790 sits in between.
   2791 <p>
   2792 This worst case scenario will, fortunately, occur only rarely
   2793 in the real world:
   2794 <p>
   2795 <ul>
   2796         <li> Normally network issues, such as the above mentioned host
   2797         name lookups or throughput restrictions, will add
   2798         significantly to the duration of a request. The 'twice as
   2799         many' read/writes caused by crossroads are then relatively
   2800         irrelevant.
   2801 <p>
   2802 <li> Normally a significant amount of time will be spent in a
   2803         back end, due to processing (e.g., when calling a servlet on a
   2804         back end). Again, this processing time will weigh much more heavily
   2805         than the multiple read/writes.</ul>
   2806 <p>
   2807 <a name="l68"></a>
   2808 <h3>6.2: Benchmark 2: Crossroads versus Linux Virtual Server (LVS)</h3>
   2809 <p>
   2810 LVS is a kernel-based balancer that acts like a masquerading
   2811 firewall: TCP packets that arrive at the balancer are sent to one of
   2812 the configured back ends. LVS has the advantage over crossroads that
   2813 there is no stop-and-go in the transmission; in contrast, crossroads
   2814 needs to send data via an internal buffer. Crossroads has the
   2815 advantage that it offers instantaneous failover because it tries to
   2816 contact the back end upon each new TCP connection; in contrast,
   2817 LVS isn't aware of downtime of back ends (unless one implements an
   2818 external heartbeat). Also, crossroads offers more complex balancing
   2819 than LVS.
   2820 <p>
   2821 <a name="l69"></a>
   2822 <strong>6.2.1: Environment</strong>
   2823 <p>
   2824 On the balancer, LVS was run on port 80, its forwarding set up for two
   2825 equally weighted back ends, using <code>ipvsadm</code>:
   2826 <p>
   2827 <pre>
   2828 ipvsadm -a -t 192.168.1.250:http -r 10.1.1.100:http -m -w 1
   2829 ipvsadm -a -t 192.168.1.250:http -r 10.1.1.101:http -m -w 1
   2830 </pre>
   2831 
   2832 <p>
   2833 Crossroads was run on port 81. The configuration file is shown below:
   2834 <p>
   2835 <pre>
   2836 service http {
   2837     port 81;
   2838     dispatchmode roundrobin;
   2839     revivinginterval 5;
   2840     backend one {
   2841         server 10.1.1.100;
   2842         port 80;
   2843     }
   2844     backend two {
   2845         server 10.1.1.101;
   2846         port 80;
   2847     }
   2848 }
   2849 </pre>
   2850 
   2851 <p>
   2852 <a name="l70"></a>
   2853 <strong>6.2.2: Tests and results</strong>
   2854 <p>
   2855 In the first test, ports 80 and 81 on the balancer were 'bombed' with
   2856 50 concurrent clients, each requesting a small page 50 times. The
   2857 following timings were measured:
   2858 <p>
   2859 <ul>
   2860         <li> How long it takes to establish a connection;
   2861         <li> How long it takes to retrieve the page.</ul>
   2862 <p>
   2863 The results of this test were:
   2864 <p>
   2865 <ul>
   2866         <li> On average, each client took 0.12 seconds to connect
   2867         to LVS, and each page was retrieved in 0.14 seconds;
   2868         <li> On average, each client took 0.11 seconds to connect to
   2869         crossroads, and each page was retrieved in 0.13 seconds.</ul>
   2870 <p>
   2871 In this setup there seems to be no difference between the performance
   2872 of LVS and crossroads!
   2873 <p>
   2874 In a second test, the size of the retrieved page was varied from 2,000
   2875 to 2,000,000 bytes. This test was run to see whether crossroads would
   2876 show performance degradation when transferring larger amounts of data.
   2877 <p>
   2878 For each page size, 30 concurrent clients were started, that retrieved
   2879 the page 50 times. Again, the connect times and processing times were
   2880 recorded.
   2881 <p>
   2882 The results of the total time (connect time + retrieval time)
   2883 are shown in the table below:
   2884 <p>
   2885 <table>
   2886 
   2887   <tr><td colspan=3><hr></td></tr>
   2888 
   2889   
   2890 <tr>
   2891 
   2892     <td> <strong>Bytes</strong></td> <td> <strong>LVS timing (s)</strong></td> <td> <strong>Crossroads timing (s)</strong></td>
   2893  
   2894 </tr>
   2895 
   2896   
   2897 <tr>
   2898 
   2899     <td> 2000</td> 	    <td> 0.130741688</td> 	<td> 0.12739582</td>
   2900  
   2901 </tr>
   2902 
   2903   
   2904 <tr>
   2905 
   2906     <td> 20000</td>     <td> 0.490916224</td> 	<td> 0.50376901</td>
   2907  
   2908 </tr>
   2909 
   2910   
   2911 <tr>
   2912 
   2913     <td> 200000</td>    <td> 3.799440328</td> 	<td> 4.33125273</td>
   2914  
   2915 </tr>
   2916 
   2917   
   2918 <tr>
   2919 
   2920     <td> 2000000</td>   <td> 45.25090855</td> 	<td> 45.9600728</td>
   2921  
   2922 </tr>
   2923 
   2924   <tr><td colspan=3><hr></td></tr>
   2925 
   2926 </table>
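<p>
The relative difference per page size can be derived from the table with a short Python sketch. The figures are copied from the table; the function name <code>relative_difference</code> is made up for this illustration.
<p>
```python
# (page size in bytes, LVS timing, Crossroads timing), copied from the table.
rows = [
    (2000, 0.130741688, 0.12739582),
    (20000, 0.490916224, 0.50376901),
    (200000, 3.799440328, 4.33125273),
    (2000000, 45.25090855, 45.9600728),
]


def relative_difference(lvs, crossroads):
    """Crossroads timing relative to LVS, in percent (positive means slower)."""
    return 100 * (crossroads - lvs) / lvs


for size, lvs, crossroads in rows:
    print(f"{size:8d} bytes: {relative_difference(lvs, crossroads):+.1f}%")
```
<p>
For these figures the difference stays below 3% for three of the four page sizes and is largest, about 14%, at 200000 bytes.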
   2927 <p>
   2928 Again, the results show that crossroads performs just as effectively
   2929 as LVS, even with large data chunks!
   2930 <p>
   2931 <a name="l71"></a>
   2932 <h2>7: Compiling and Installing</h2>
   2933 <a name="compiling"></a><a name="l72"></a>
   2934 <h3>7.1: Prerequisites</h3>
   2935 <p>
   2936 Building crossroads requires:
   2937 <p>
   2938 <ul>
   2939         <li> Standard Unix tools, such as <code>sed</code>, <code>awk</code>, <code>Perl</code>
   2940         (5.00 or better);
   2941 <p>
   2942 <li> A POSIX-compliant C compiler;
   2943 <p>
   2944 <li> Support for SYSV IPC, networking and so on.
   2945 </ul>
   2946 <p>
   2947 Basically a Linux or Apple MacOSX box will do nicely. To compile and install
   2948 crossroads, follow these steps.
   2949 <p>
   2950 <a name="l73"></a>
   2951 <h3>7.2: Compiling and installing</h3>
   2952 <p>
   2953 <ul>
   2954         <li> Obtain the source distribution. It can be found on
   2955         <a href="http://crossroads.e-tunity.com">http://crossroads.e-tunity.com</a>. The distribution comes as an
   2956         archive <code>crossroads-</code><em>type</em><code>.tar.gz</code>, where <em>type</em> is
   2957         <code>stable</code> or <code>devel</code>.
   2958 <p>
   2959 <li> Unpack the archive in a sources directory using <code>tar
   2960         xzf crossroads-</code><em>X.YY</em><code>.tar.gz</code>. The contents spill into a
   2961         subdirectory <code>crossroads-</code><em>X.YY/</em>.
   2962 <p>
   2963 <li> Change into that directory.
   2964 <p>
   2965 <li> Next, edit <code>etc/Makefile.def</code> and verify that all
   2966         compilation settings are to your liking. The settings are
   2967         explained in the file. <strong>Note that</strong> the default distribution
   2968         of <code>Makefile.def</code> is suited for Linux or Apple MacOSX
   2969         systems. On other Unices, or on non-Unix systems, you must
   2970         particularly pay attention to <code>SET_PROC_TITLE_BY...</code>. When
   2971         in doubt, comment out all <code>SET_PROC_TITLE...</code>
   2972         settings. Crossroads will work nevertheless, but it won't show
   2973         nice titles in <code>ps</code> listings. Also there's a macro
   2974         <code>EXTRA_LIBS</code> to add linkage flags (an example for a Solaris
   2975         build is included).
   2976 <p>
   2977 <li> Now crossroads is ready for compilation. Do a <code>make
   2978         local</code> followed by <code>make install</code>. The latter step may have
   2979         to be done by the user <code>root</code> if the <code>BINDIR</code> setting of
   2980         <code>etc/Makefile.def</code> points to a root-owned directory.
   2981 <p>
   2982 <li> The documentation is not installed by this process. If you
   2983         want to install the documentation, then proceed as follows:
   2984 <p>
   2985 <ul>
   2986         	<li> Optionally, <code>cp doc/crossroads.html</code>
   2987         	<em>htmldirectory/</em>; where <em>htmldirectory</em> is the destination
   2988         	directory for your HTML manuals;
   2989 <p>
   2990 <li> Optionally, <code>cp doc/crossroads.pdf</code>
   2991         	<em>pdfdirectory/</em>; where <em>pdfdirectory</em> is the
   2992         	destination directory for your PDF manuals;
   2993 <p>
   2994 <li> Optionally, <code>cp doc/crossroads.man</code>
   2995         	<em>manualdirectory</em><code>/crossroads.1</code>, where
   2996         	<em>manualdirectory</em> is e.g. <code>/usr/man/man1</code>,
   2997         	<code>/usr/share/man/man1</code>, <code>/usr/local/man/man1</code>,
   2998         	<code>/usr/local/share/man/man1</code>. Any possibility is valid, as
   2999         	long as <em>manualdirectory</em> is one of the directories
   3000         	where manual pages are stored;
   3001 <p>
   3002 <li> If your manual page system supports compressed
   3003         	manual pages, then you can save some space with
   3004         	<code>gzip</code> <em>manualdirectory</em><code>/crossroads.1</code>.</ul>
   3005 <p>
   3006 </ul>
   3007 <p>
   3008 <a name="l74"></a>
   3009 <h3>7.3: Configuring crossroads</h3>
   3010 <p>
   3011 Now that the binary is available on your system, you need to create a
   3012 suitable <code>/etc/crossroads.conf</code>. Use this manual or the output of
   3013 <code>crossroads sampleconf</code> to get started.
   3014 <p>
   3015 Once you have the configuration ready, start crossroads with
   3016 <code>crossroads start</code>. Test the availability of your services and back
   3017 ends. Monitor how crossroads is doing with:
   3018 <p>
   3019 <ul>
   3020         <li> In one terminal, run the script:
   3021         <pre>
   3022 while [ 1 ] ; do
   3023     tput clear
   3024     crossroads status
   3025     sleep 3
   3026 done
   3027 </pre>
   3028 
   3029 <p>
   3030 <strong>Note</strong> that depending on your system you might need
   3031         <code>sleep 3s</code>, i.e., with an <code>s</code> appended.
   3032 <p>
   3033 <li> In another terminal, run:
   3034         <pre>
   3035 while [ 1 ] ; do
   3036     tput clear
   3037     ps ax | grep crossroads | grep -v grep
   3038     sleep 3		    	
   3039 done
   3040 </pre>
   3041 
   3042 <p>
   3043 <strong>Note</strong> that depending on your system you might need
   3044         <code>ps -ef</code> instead of <code>ps ax</code>.
   3045 <p>
   3046 <li> In yet another terminal, run <code>tail -f
   3047         /var/log/messages</code> (supply the appropriate system log file if
   3048         <code>/var/log/messages</code> doesn't work for you).</ul>
   3049 <p>
   3050 Now thoroughly test the availability of your back ends through
   3051 crossroads. The status display will show an updated view of which back
   3052 ends are selected and how busy they are. The process list will show
   3053 which crossroads daemons are running. Finally, the tailing of
   3054 <code>/var/log/messages</code> shows what's going on -- especially if you have
   3055 <code>verbosity on</code> statements in the configuration.
   3056 <p>
   3057 <a name="l75"></a>
   3058 <h3>7.4: A boot script</h3>
   3059 <p>
   3060 Finally, you may want to create a boot-time startup script. The exact
   3061 procedure depends on your Unix flavor.
   3062 <p>
   3063 <a name="l76"></a>
   3064 <strong>7.4.1: SysV Style Startup</strong>
   3065 <p>
   3066 On SysV style systems, there's a startup script directory
   3067 <code>/etc/init.d</code> where bootscripts for all utilities are located.
   3068 You may have the <code>chkconfig</code> utility to automate the task of
   3069 inserting scripts into the boot sequence, but
   3070 otherwise the steps will resemble the following.
   3071 <p>
   3072 <ul>
   3073         <li> Create a script <code>crossroads</code> in <code>/etc/init.d</code> similar to the
   3074         following:
   3075 <p>
   3076 <pre>
   3077 #!/bin/sh
   3078 /usr/local/bin/crossroads -v "$@"
   3079 </pre>
   3080 
   3081 <p>
   3082 The stated directory <code>/usr/local/bin</code> must correspond with
   3083         the installation path. The flag <code>-v</code> causes the startup to
   3084         be more 'verbose'. However, once daemonized, the verbosity is
   3085         controlled by the appropriate statements in the configuration.
   3086 <p>
   3087 <li> Determine your 'runlevel': usually 3 when your system is
   3088         running in text-mode only, or 5 when you are using a graphical
   3089         interface. If your runlevel is 3, then:
   3090 <p>
   3091 <pre>
   3092 root&gt; cd /etc/rc.d/rc3.d
   3093 root&gt; ln -s /etc/init.d/crossroads S99crossroads
   3094 root&gt; ln -s /etc/init.d/crossroads K99crossroads
   3095 </pre>
   3096 
   3097 <p>
   3098 This creates startup (<code>S*</code>) and stop (<code>K*</code>) links that
   3099         will be run when the system enters or leaves a given runlevel.
   3100 <p>
   3101 If your runlevel is 5, then the right <code>cd</code> command is to
   3102         <code>/etc/rc.d/rc5.d</code>. Alternatively, you can create the
   3103         symlinks in both runlevel directories.</ul>
   3104 <p>
   3105 <a name="l77"></a>
   3106 <strong>7.4.2: BSD Style Startup</strong>
   3107 <p>
   3108 On BSD style systems, daemons are booted directly from <code>/etc/rc</code> and
   3109 related scripts. In case you have a file <code>/etc/rc.local</code>, edit it,
   3110 and add the statement:
   3111 <p>
   3112 <pre>
   3113 /usr/local/bin/crossroads start
   3114 </pre>
   3115 
   3116 <p>
   3117 If your BSD system lacks <code>/etc/rc.local</code>, then you may need to start
   3118 Crossroads from <code>/etc/rc</code>. Your mileage may vary.
   3119 <p>
   3120 </body>
   3121 </html>