crossroads.html (129102B)
<a name="defs.yo"></a><html><head>
<title>Crossroads 1.23</title>
<link rel="stylesheet" type="text/css" href="http://www.e-tunity.com/css/yodl.css">
<link rev="made" href="mailto:info@e-tunity.com">
</head>
<body>
<hr>
<h1>Crossroads 1.23</h1>
<h2>Karel Kubat</h2>

<h2>e-tunity</h2><h2>2005, 2006, ff.</h2>

<blockquote><em>Crossroads is a load balancing and fail over utility for TCP
based services. It is a daemon program running in user
space, and features extensive configurability, polling of
back ends using 'wakeup calls', detailed status reporting,
'hooks' for special actions when backend calls fail, and much
more. Crossroads is service-independent: it is usable for
HTTP/HTTPS, SSH, SMTP, DNS, etc. In the case of HTTP
balancing, Crossroads can modify HTTP headers, e.g. to
provide 'session stickiness' for
back-end processes that need sessions, but aren't
session-aware of other back-ends.</em></blockquote>

<h1>Table of Contents</h1>
<dl>
<dl>
<dt><h3><a href="#l1">1: Introduction</a></h3></dt>
<dl>
<dt><a href="#l2">1.1: Obtaining Crossroads</a></dt>
<dt><a href="#l3">1.2: Copyright and Disclaimer</a></dt>
<dt><a href="#l4">1.3: Terminology</a></dt>
<dt><a href="#l5">1.4: Porting issues for pre-1.21 installations</a></dt>
<dt><a href="#l6">1.5: Porting issues for pre-0.26 installations</a></dt>
<dt><a href="#l7">1.6: Porting issues for pre-1.08 installations</a></dt>
</dl>
<dt><h3><a href="#l8">2: Installation for the impatient</a></h3></dt>
<dt><h3><a href="#l9">3: Using Crossroads</a></h3></dt>
<dl>
<dt><a href="#l10">3.1: General Commandline Syntax</a></dt>
<dt><a href="#l11">3.2: Logging-related options</a></dt>
<dt><a href="#l12">3.3: Reloading Configurations</a></dt>
</dl>
<dt><h3><a href="#l13">4: The configuration</a></h3></dt>
<dl>
<dt><a
href="#l14">4.1: General language elements</a></dt>
<dl>
<dt><a href="#l15">4.1.1: Empty lines and comments</a></dt>
<dt><a href="#l16">4.1.2: Keywords, numbers, identifiers, generic strings</a></dt>
</dl>
<dt><a href="#l17">4.2: Service definitions</a></dt>
<dl>
<dt><a href="#l18">4.2.1: type - Defining the service type</a></dt>
<dt><a href="#l19">4.2.2: port - Specifying the listen port</a></dt>
<dt><a href="#l20">4.2.3: bindto - Binding to a specific IP address</a></dt>
<dt><a href="#l21">4.2.4: verbosity - Controlling debug output</a></dt>
<dt><a href="#l22">4.2.5: dispatchmode - How are back ends selected</a></dt>
<dt><a href="#l23">4.2.6: revivinginterval - Back end wakeup calls</a></dt>
<dt><a href="#l24">4.2.7: maxconnections - Limiting concurrent clients at service level</a></dt>
<dt><a href="#l25">4.2.8: backlog - The TCP Back Log size</a></dt>
<dt><a href="#l26">4.2.9: shmkey - Shared Memory Access</a></dt>
<dt><a href="#l27">4.2.10: allow* and deny* - Allowing or denying connections</a></dt>
<dt><a href="#l28">4.2.11: useraccount - Limiting the effective ID of external processes</a></dt>
</dl>
<dt><a href="#l29">4.3: Backend definitions</a></dt>
<dl>
<dt><a href="#l30">4.3.1: server - Specifying the back end address</a></dt>
<dt><a href="#l31">4.3.2: verbosity - Controlling verbosity at the back end level</a></dt>
<dt><a href="#l32">4.3.3: weight - When a back end is more equal than others</a></dt>
<dt><a href="#l33">4.3.4: decay - Levelling out activity of a back end</a></dt>
<dt><a href="#l34">4.3.5: onstart, onend, onfail - Action Hooks</a></dt>
<dt><a href="#l35">4.3.6: trafficlog and throughputlog - Debugging and Performance Aids</a></dt>
<dt><a href="#l36">4.3.7: stickycookie - Back end selection with an HTTP cookie</a></dt>
<dt><a href="#l37">4.3.8: HTTP Header Modification Directives</a></dt>
</dl>
</dl>
<dt><h3><a href="#l38">5: Tips, Tricks and
Remarks</a></h3></dt>
<dl>
<dt><a href="#l39">5.1: How back ends are selected in load balancing</a></dt>
<dl>
<dt><a href="#l40">5.1.1: Bysize, byduration or byconnections?</a></dt>
<dt><a href="#l41">5.1.2: Averaging size and duration</a></dt>
<dt><a href="#l42">5.1.3: Specifying decays</a></dt>
<dt><a href="#l43">5.1.4: Adjusting the weights</a></dt>
<dt><a href="#l44">5.1.5: Throttling the number of concurrent connections</a></dt>
</dl>
<dt><a href="#l45">5.2: Using an external program to dispatch</a></dt>
<dl>
<dt><a href="#l46">5.2.1: Configuring the external handler</a></dt>
<dt><a href="#l47">5.2.2: Writing the external handler</a></dt>
<dt><a href="#l48">5.2.3: Examples of external handlers</a></dt>
</dl>
<dt><a href="#l49">5.3: HTTP Session Stickiness</a></dt>
<dl>
<dt><a href="#l50">5.3.1: Don't use stickiness!</a></dt>
<dt><a href="#l51">5.3.2: But if you must..</a></dt>
</dl>
<dt><a href="#l52">5.4: Passing the client's IP address</a></dt>
<dl>
<dt><a href="#l53">5.4.1: Sample Crossroads configuration</a></dt>
<dt><a href="#l54">5.4.2: Sample Apache configuration</a></dt>
</dl>
<dt><a href="#l55">5.5: Debugging network traffic</a></dt>
<dt><a href="#l56">5.6: Limiting Access to Crossroads by Client IP Address</a></dt>
<dl>
<dt><a href="#l57">5.6.1: General Examples</a></dt>
<dt><a href="#l58">5.6.2: Using External Files</a></dt>
<dt><a href="#l59">5.6.3: Mixing Directives</a></dt>
</dl>
<dt><a href="#l60">5.7: Configuration examples</a></dt>
<dl>
<dt><a href="#l61">5.7.1: A load balancer for three webserver back ends</a></dt>
<dt><a href="#l62">5.7.2: An HTTP forwarder when travelling</a></dt>
<dt><a href="#l63">5.7.3: SSH login with enforced idle logout</a></dt>
</dl>
</dl>
<dt><h3><a href="#l64">6: Benchmarking</a></h3></dt>
<dl>
<dt><a href="#l65">6.1: Benchmark 1: Accessing a proxy via crossroads or
directly</a></dt>
<dl>
<dt><a href="#l66">6.1.1: Results</a></dt>
<dt><a href="#l67">6.1.2: Discussion</a></dt>
</dl>
<dt><a href="#l68">6.2: Benchmark 2: Crossroads versus Linux Virtual Server (LVS)</a></dt>
<dl>
<dt><a href="#l69">6.2.1: Environment</a></dt>
<dt><a href="#l70">6.2.2: Tests and results</a></dt>
</dl>
</dl>
<dt><h3><a href="#l71">7: Compiling and Installing</a></h3></dt>
<dl>
<dt><a href="#l72">7.1: Prerequisites</a></dt>
<dt><a href="#l73">7.2: Compiling and installing</a></dt>
<dt><a href="#l74">7.3: Configuring crossroads</a></dt>
<dt><a href="#l75">7.4: A boot script</a></dt>
<dl>
<dt><a href="#l76">7.4.1: SysV Style Startup</a></dt>
<dt><a href="#l77">7.4.2: BSD Style Startup</a></dt>
</dl>

<p><hr><p>
<p>
<a name="l1"></a>
<h2>1: Introduction</h2>
<a name="intro"></a>Crossroads is a daemon that accepts TCP connections
at preconfigured ports and, given a list of 'back ends',
distributes each incoming connection to one of the back ends,
so that a client request is
served. Additionally, crossroads maintains an internal
administration of the back end connectivity: if a back end isn't
usable, then the client request is handled using another back
end. Crossroads will then periodically check whether a previously
unusable back end has come back to life. Also, crossroads can select
back ends by estimating the load, so that balancing is achieved.
<p>
Using this approach, crossroads serves as a load balancer and fail over
utility. Crossroads will very likely not be as reliable as
hardware based balancers, since it will always require a server to
run on. This server, in turn, may become a new Single Point of
Failure (SPOF). However, in situations where cost efficiency is an issue,
crossroads may be a good choice.
Furthermore, crossroads can be
deployed in situations where a hardware based balancer already
exists and service reliability needs to be augmented. Or, crossroads may be
run off a diskless system, which again improves reliability of the
underlying hardware.
<p>
This document describes how to use crossroads, how to configure it
in order to increase the reliability of your systems, and how to
compile the program from its sources. This document is
also available in <a href="crossroads.pdf">PDF</a> format.
<p>
<a name="l2"></a>
<h3>1.1: Obtaining Crossroads</h3>
<p>
As a quick reference, here are some important URLs for Crossroads:
<p>
<ul>
<li> <a href="http://crossroads.e-tunity.com">http://crossroads.e-tunity.com</a> is the site that serves
Crossroads. You can browse this at leisure
for documentation, sources, and so on.
<p>
<li> <a href="http://freshmeat.net/projects/crossr">http://freshmeat.net/projects/crossr</a> is the
Freshmeat announcement page.
<p>
<li> <a href="svn://svn.e-tunity.com/crossroads">svn://svn.e-tunity.com/crossroads</a> is the SVN
repository; anonymous reading (fetching) is allowed. In order
to commit changes, <a href="mailto:karel@e-tunity.com">mail me</a> for
credentials.</ul>
<p>
<a name="l3"></a>
<h3>1.2: Copyright and Disclaimer</h3>
<p>
Crossroads is distributed as-is, without assumptions of fitness
or usability. You are free to use crossroads to your
liking. It's free, and as with everything that's free: there's
also no warranty.
<p>
You are allowed to make modifications to the source code of
crossroads, and you are allowed to (re)distribute crossroads, as
long as you include this text, all sources, and if applicable: all
your modifications, with each distribution.
<p>
While you are allowed to make any and all changes to the sources,
I would appreciate hearing about them.
If the changes concern new
functionality or bugfixes, then I'll include them in a next
release, stating full credits. If you want to seriously contribute (to
which you are heartily encouraged), then mail me and I'll get you
access to the Crossroads SVN repository, so that you can update and
commit as you like.
<p>
<a name="l4"></a>
<h3>1.3: Terminology</h3>
<p>
Throughout this document, the following terms are used: (Many
more meanings of the terms will exist -- yes, I am aware of that. I'm
using the terms here in a very strict sense.)
<p>
<dl>
<p><dt><strong>A client</strong><dd> is a process that initiates a network connection
to contact some service.
<p><dt><strong>A service</strong><dd> or <strong>server process</strong> or <strong>listener</strong>
is a central application
that accepts network connections from clients and services
them.
<p><dt><strong>Back ends</strong><dd> are locations where crossroads looks in
order to service its clients. Crossroads sits 'in between'
and does its tricks. Therefore, as far as the back ends
are concerned, crossroads behaves like a client. As far as
the true client is concerned, crossroads behaves like the
service. The communication is however transparent: neither
client nor back end is aware of the middle position of
crossroads.
<p><dt><strong>A connection</strong><dd> is a network conversation between client and service,
where data are transferred to and fro. As
far as crossroads is concerned, success means that a
connection can be established without errors on
the network level. Crossroads isn't aware of service
peculiarities. E.g., when a webserver answers <code>HTTP/1.0
500 Server Error</code> then crossroads will see this as a
successful connection, though the user behind a browser may
think otherwise.
<p><dt><strong>Back end selection algorithms</strong><dd> are methods by which
crossroads determines which back end it will talk to
next. Crossroads has a number of built-in algorithms,
which may be configured per service.
<p><dt><strong>Back end states</strong><dd> are the statuses of each back end that
is known to crossroads. A back end may be available,
(temporarily) unavailable or truly down. When a back end is
temporarily unavailable, then crossroads will periodically
check whether the back end has come back to life (that is,
if configured so).
<p><dt><strong>A spike</strong><dd> is a sudden increase in activity, leading to
extra load on a given service. When crossroads is in
effect and when the spike occurs in one connection,
then obviously the spike will also appear at one
of the back ends. However, crossroads will see the spike
and will make sure that a subsequent request goes to
another back end. In contrast, when several connections
arrive simultaneously and cause a spike, then crossroads
will be able to distribute the connections over several
back ends, thereby 'flattening out' the increase.
<p><dt><strong>Load balancing</strong><dd> means that incoming client requests are
distributed over more than just one back end (which wouldn't be the
case if you weren't running crossroads). Enabling load
balancing is nothing more than duplicating services over
more than one back end, and having something (in this
case: crossroads) distribute the requests, so that per
back end the load doesn't get too high.
<p><dt><strong>An HTTP session</strong><dd> is a series of separate network connections
that originate from one browser. E.g., to fill the display
with text and images, the browser hits a website several times.
An HTTP session may even span several
screens.
E.g., a website registration dialog may involve 3
screens that, when called from the same browser,
form a logical group of some sort.
<p><dt><strong>Headers</strong><dd> or <strong>header lines</strong> are specific parts of an HTTP
message. Crossroads has directives to add or modify
headers that are part of the request that a browser sends
to the server, or those that are part of the server's response.
<p><dt><strong>Session stickiness</strong><dd> means that when a browser starts an
HTTP dialog, the balancer makes sure that it 'sticks' to
the same back end (i.e., subsequent requests from the
browser are forced to go to the same back end, instead of
being balanced to other ones).
<p><dt><strong>Back end usage</strong><dd> is measured by crossroads in order to be
able to determine back end selection. Crossroads stores
information about the number of active connections, the
transferred bytes and
about the connection duration. These numbers can be used to
estimate which back end is the least used -- and
therefore, presumably, the best candidate for a new
request.
<p><dt><strong>Fail over</strong><dd> is almost always used when load balancing is in
effect. The distributor of client requests (crossroads of
course) can also monitor back ends, so that in case a back
end is 'down', it is no longer accessed.
<p><dt><strong>Service downtime</strong><dd> normally occurs when a service is
switched off. Downtime is obviously avoided when fail over
is in effect: a back end can be taken out of service in a
controlled manner, without any client noticing it.
</dl>
<p>
<a name="l5"></a>
<h3>1.4: Porting issues for pre-1.21 installations</h3>
<p>
As of version 1.21, the event-hook directives <code>onsuccess</code> and
<code>onfailure</code> no longer exist.
<p>
<ul>
<li> Please replace <code>onsuccess</code> by <code>onstart</code>;
<li> Please replace <code>onfailure</code> by <code>onfail</code>;
<li> Note that there is a new hook <code>onend</code>.</ul>
<p>
The commands that are run via <code>onstart</code>, <code>onend</code> or <code>onfail</code>
are subject to format expansion; e.g., <code>%1w</code> is expanded to the
weight of the first back end, etc. See section <a href="crossroads.html#config">4</a> for details.
<p>
<a name="l6"></a>
<h3>1.5: Porting issues for pre-0.26 installations</h3>
<p>
As of version 0.26 the syntax of the configuration file has
changed. In particular:
<p>
<ul>
<li> The keyword <code>maxconnections</code> is now used instead of
<code>maxclients</code>;
<li> The keyword <code>connectiontimeout</code> is now used instead of
<code>sessiontimeout</code>.</ul>
<p>
Therefore when converting configuration files to the new syntax,
the above keywords must be changed. (The reason for these changes
is that 0.26 introduces <em>sticky HTTP sessions</em> that span
multiple TCP connections, and the term
<em>session</em> is used strictly in that sense -- and no longer for a
TCP connection.)
<p>
<a name="l7"></a>
<h3>1.6: Porting issues for pre-1.08 installations</h3>
<p>
As of version 1.08, the following directives are no longer
supported:
<p>
<ul>
<li> <code>insertstickycookie</code> was replaced by the more generic
directive <code>addclientheader</code>. E.g., instead of <br>
<code>insertstickycookie "XRID=100; Path=/";</code> <br>
the syntax is now <br>
<code>addclientheader "Set-Cookie: XRID=100; Path=/";</code>
<p>
<li> <code>insertrealip</code> was replaced by the more generic
directive <code>setserverheader</code>.
E.g., instead of <br>
<code>insertrealip on;</code> <br>
the syntax is now <br>
<code>setserverheader "XR-Real-IP: %r";</code> <br>
This incidentally also makes it possible to change the header
name (here: <code>XR-Real-IP</code>).</ul>
<p>
<a name="l8"></a>
<h2>2: Installation for the impatient</h2>
<a name="impatient"></a>
For the impatient, here's the very-quick-but-very-superficial recipe
for getting crossroads up and running:
<p>
<ul>
<p>
<li> If you don't have SVN or don't want to use it:
<p>
<ul>
<li> Obtain the crossroads source archive at
<a href="http://crossroads.e-tunity.com">http://crossroads.e-tunity.com</a>.
<p>
<li> Change-dir to a 'sources' directory on your system and
unpack the archive.
<p>
<li> Change-dir into the created directory <code>crossroads/</code>.</ul>
<p>
<li> If you have SVN and want to go for the newest snapshot:
<p>
<ul>
<li> Get the latest sources and snapshots using SVN from <br>
<code>svn://svn.e-tunity.com/crossroads</code>.
<p>
<li> You'll find the newest alpha version under
<code>crossroads/trunk</code> and the stable versions under
<code>crossroads/tags</code>,
e.g. <code>crossroads/tags/release-1.00</code>.
<p>
<li> Choose which you want to use: the latest stable
release, or the bleeding edge alpha? In the former case,
change-dir to <code>crossroads/tags/release-</code><em>X.YY</em>, where
<em>X.YY</em> is a release ID. In the latter case, change-dir to
<code>crossroads/trunk</code>.</ul>
<p>
<li> Type <code>make install</code>. This installs the crossroads
binary into <code>/usr/local/bin/</code>. If the compilation doesn't
work on your system, check <code>etc/Makefile.def</code> for hints.
<p>
<li> Create a file <code>/etc/crossroads.conf</code>.
In it state
something like:
<p>
<pre>
service www {
    port 80;
    revivinginterval 15;
    backend one {
        server 10.1.1.100:80;
    }
    backend two {
        server 10.1.1.101:80;
    }
}
</pre>

<p>
That's of course assuming that you want to balance HTTP on
port 80 to two back ends at 10.1.1.100 and 10.1.1.101.
<p>
<li> Type <code>crossroads start</code>.
<p>
<li> Surf to the machine where crossroads is running. You will
see the pages served by the back ends 10.1.1.100 or
10.1.1.101.
<p>
<li> To monitor the status of crossroads, type <code>crossroads status</code>.
</ul>
<p>
<a name="l9"></a>
<h2>3: Using Crossroads</h2>
<a name="using"></a>Crossroads is started from the commandline, and highly depends on
<code>/etc/crossroads.conf</code> (the default configuration file). It
supports a number of flags (e.g., to overrule the location of the
configuration file). The actual usage information is always obtained
by typing <code>crossroads</code> without any arguments. Crossroads then
displays the allowed arguments.
<p>
<a name="l10"></a>
<h3>3.1: General Commandline Syntax</h3>
<p>
This section shows the most basic usage. As said above, start
<code>crossroads</code> without arguments to view the full listing of options.
<p>
<ul>
<li> <code>crossroads start</code> and <code>crossroads stop</code> are typical
actions that are run from system startup scripts. The
meaning is self-explanatory.
<li> <code>crossroads restart</code> is a combination of the former
two. Beware that a restart may cause discontinuity in
service; it is just a shorthand for typing the 'stop' and
'start' actions after one another.
<li> <code>crossroads status</code> reports on each running
service. Per service, the state of each back end is
reported.
<li> <code>crossroads tell</code> <em>service backend state</em> is a
command line way of telling crossroads that a given back
end, of a given service, is in a given state. Normally
crossroads maintains state information itself, but by
using <code>crossroads tell</code>, a back end can be e.g. taken
'off line' for servicing.
<li> <code>crossroads configtest</code> tells you whether the
configuration is syntactically correct.
<li> <code>crossroads services</code> reports on the configured
services. In contrast to <code>crossroads status</code>, this
option only shows what's configured -- not what's up and
running. Therefore, <code>crossroads services</code> doesn't
report on back end states.
<li> <code>crossroads sampleconf</code> shows a sample configuration on
screen. A good way of quickly viewing the configuration
file syntax, or of getting a start for your own
configuration <code>/etc/crossroads.conf</code>.
</ul>
<p>
<a name="l11"></a>
<h3>3.2: Logging-related options</h3>
<p>
Two 'flags' of Crossroads are specifically logging-related. This
section elaborates on these flags.
<p>
First, there's flag <code>-a</code>. When present, the start and end of
activity is logged using statements like
<p>
<center><em>YYYY-MM-DD HH/MM/SS starting http from 61.45.32.189 to 10.1.1.1</em></center>
<p>
Similarly, there are 'ending' statements. Using this flag and
scanning your logs for these statements may be helpful in quickly
determining your system load.
<p>
Second, there's flag <code>-l</code>. This flag selects the 'facility' of
logging and defaults to <code>LOG_DAEMON</code>. You can supply a number
between 0 and 7 to flag <code>-l</code> to select <code>LOG_LOCAL0</code> to
<code>LOG_LOCAL7</code>. This would separate the Crossroads-related logging
from other streams.
Here's a very short guide; please read your Unix
manpages of <code>syslogd</code> for more information.
<p>
<ul>
<li> First edit <code>/etc/syslog.conf</code> and add a line:
<p>
<pre>
local7.*        /var/log/crossroads.log
</pre>

<p>
That instructs <code>syslogd</code> to send <code>LOG_LOCAL7</code> requests to the
logfile <code>/var/log/crossroads.log</code>.
<p>
<li> Next, restart <code>syslogd</code>. On most Unices that's done by
issuing <code>killall -1 syslogd</code>. (As a side-note, I tried this once
on a Bull/AIX system, and the box just shut down. The <code>killall</code>
command killed every process...)
<p>
<li> Now start <code>crossroads</code> with the flag <code>-l7</code>.
<p>
<li> Finally, monitor <code>/var/log/crossroads.log</code> for Crossroads'
messages.</ul>
<p>
<a name="l12"></a>
<h3>3.3: Reloading Configurations</h3>
<p>
Crossroads doesn't support the reloading of a configuration while
running (as other programs, e.g. Apache, do). There are various
technical reasons for this.
<p>
However, external lists of allowed or denied IP addresses can be
reloaded by sending a signal -1 (<code>SIGHUP</code>) to Crossroads. See
section <a href="crossroads.html#servicedef">4.2</a> for the details.
<p>
<a name="l13"></a>
<h2>4: The configuration</h2>
<a name="config"></a>The configuration that crossroads uses is normally stored in the file
<code>/etc/crossroads.conf</code>. This location can be overruled using the
command line flag <code>-c</code>.
<p>
This section explains the syntax of the configuration file, and what
all settings do.
<p>
<a name="l14"></a>
<h3>4.1: General language elements</h3>
<p>
This section describes the general elements of the crossroads
configuration language.
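<p>
As a quick orientation, here is a minimal sketch that combines the
language elements discussed in the subsections that follow: the three comment styles,
a keyword with an identifier, a numeric argument, a boolean and a
generic string. The service name <code>myservice</code>, the back end
name <code>one</code>, the port number and the address are only
placeholders borrowed from other examples in this document:
<p>
<pre>
/* C-style comment */
// C++-style comment
# Shell-style comment
service myservice {             // keyword 'service', identifier 'myservice'
    port 8000;                  // keyword with a numeric argument
    verbosity on;               // boolean: true/yes/on or false/no/off
    backend one {
        server 10.1.1.100:80;   // generic string, ends at the semicolon
    }
}
</pre>

<p>
Running <code>crossroads configtest</code> is a convenient way to check
whether a file like this is syntactically correct.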
<p>
<a name="l15"></a>
<strong>4.1.1: Empty lines and comments</strong>
<p>
Empty lines are of course allowed in the
configuration. Crossroads recognizes three formats of comment:
<p>
<ul>
<li> C-style, between <code>/*</code> and <code>*/</code>;
<li> C++-style, starting with <code>//</code> and ending with the end
of the text line;
<li> Shell-style, starting with <code>#</code> and ending with the end
of the text line.</ul>
<p>
Simply choose your favorite editor and use the comment that 'looks
best'. (I favor C or C++ comments. My favorite editor <em>emacs</em>
can be put in <code>cmode</code> and nicely highlight what's comment and what's
not. And as a bonus it will auto-indent the configuration!)
<p>
<a name="l16"></a>
<strong>4.1.2: Keywords, numbers, identifiers, generic strings</strong>
<p>
In a configuration file, statements are identified by <em>keywords</em>,
such as <code>service</code>, <code>verbosity</code>. These are reserved words.
<p>
Many keywords require an <em>identifier</em> as the argument. E.g., a
service has a unique name, which must start with a letter or
underscore, followed by zero or more letters, underscores, or
digits. Therefore, in the statement <code>service myservice</code>, the keyword is
<code>service</code> and the identifier is <code>myservice</code>.
<p>
Other keywords require a numeric argument. Crossroads knows only
non-negative integer numbers, as in <code>port 8000</code>. Here, <code>port</code> is
the keyword and <code>8000</code> is the number.
<p>
Yet other keywords require 'generic strings', such as hostname
specifications or system commands. Such generic strings contain any
characters (including white space) up to the terminating statement
character <code>;</code>.
If a string must contain a semicolon, then it must
be enclosed in single or double quotes:
<p>
<ul>
<li> <code>This is a string;</code> is a string that starts at <code>T</code>
and ends with <code>g</code>
<li> <code>"This is a string";</code> is the same, the double quotes
are not necessary
<li> <code>"This is ; a string";</code> has double quotes to protect
the inner ;</ul>
<p>
Finally, an argument can be a 'boolean' value. Crossroads knows
<code>true</code>, <code>false</code>, <code>yes</code>, <code>no</code>, <code>on</code>, <code>off</code>. The keywords
<code>true</code>, <code>yes</code> and <code>on</code> all mean the same and can be used
interchangeably; as can the keywords <code>false</code>, <code>no</code> and <code>off</code>.
<p>
<a name="l17"></a>
<h3>4.2: Service definitions</h3> <a name="servicedef"></a>
<p>
Service definitions are blocks in the configuration file that
state what is configured for each service. A service definition starts with
<code>service</code>, followed by a unique identifier, and by statements in
<code>{</code> and <code>}</code>. For example:
<p>
<pre>
// Definition of service 'www':
service www {
    ...
    ...         // statements that define the
    ...         // service named 'www'
    ...
}
</pre>

<p>
The configuration file can contain many service blocks, as long as the
identifying names differ. The following list shows possible
statements. Each statement must end with a semicolon, except for the
<code>backend</code> statement, which has its own block (more on this later).
<p>
<a name="conf/type"></a><a name="l18"></a>
<strong>4.2.1: type - Defining the service type</strong> <a name="conftype - Defining the service type"></a>
<dl>
<p><dt><strong>Description:</strong><dd> The <code>type</code> statement defines how crossroads handles the stated
service.
There are currently two types: <code>any</code> and
<code>http</code>. The type <code>any</code> means that crossroads doesn't
interpret the contents of a TCP stream, but only distributes streams
over back ends. The type <code>http</code> means that crossroads has to
analyze what's in the messages, does magical HTTP header tricks, and
so on -- all to ensure that multiple connections are treated as one
session, or that the back end is notified of the client's IP
address.
<p>
Unless you really need such special features, use the type <code>any</code> (the
default), even for HTTP protocols.
<p><dt><strong>Syntax:</strong><dd> <code>type</code> <em>specifier</em>, where <em>specifier</em> is <code>any</code> or
<code>http</code>
<p><dt><strong>Default:</strong><dd> <code>any</code>
</dl>
<p>
<a name="conf/port"></a><a name="l19"></a>
<strong>4.2.2: port - Specifying the listen port</strong> <a name="confport - Specifying the listen port"></a>
<dl>
<p><dt><strong>Description:</strong><dd> The <code>port</code> statement defines on which TCP port a service
'listens'. E.g. <code>port 8000</code> says that this service will accept
connections on port 8000.
<p><dt><strong>Syntax:</strong><dd> <code>port</code> <em>number</em>
<p><dt><strong>Default:</strong><dd> There is no default. This is a required setting.
</dl>
<p>
<a name="conf/bindto"></a><a name="l20"></a>
<strong>4.2.3: bindto - Binding to a specific IP address</strong> <a name="confbindto - Binding to a specific IP address"></a>
<dl>
<p><dt><strong>Description:</strong><dd> The <code>bindto</code> statement is used in situations where crossroads
should only listen to the stated port at a given IP address. E.g.,
<code>bindto 127.0.0.1</code> causes crossroads to 'bind' the service only to
the local IP address. Network connections from other hosts won't be
serviced.
By default, crossroads binds a service to all presently
active IP addresses at the invoking host.
<p><dt><strong>Syntax:</strong><dd> <code>bindto</code> <em>address</em>, where <em>address</em> is a numeric IP
address, such as 127.0.0.1, or the keyword <code>any</code>.
<p><dt><strong>Default:</strong><dd> <code>any</code>
</dl>
<p>
<a name="conf/verbose"></a><a name="l21"></a>
<strong>4.2.4: verbosity - Controlling debug output</strong> <a name="confverbosity - Controlling debug output"></a>
<dl>
<p><dt><strong>Description:</strong><dd> Verbosity statements come in two forms: <code>verbosity on</code> or
<code>verbosity off</code>. When 'on', log messages to <code>/var/log/messages</code>
are generated that show what's going on. (Actually, the
messages go to <code>syslog(3)</code>, using facility <code>LOG_DAEMON</code> and
priority <code>LOG_INFO</code>. In most (Linux) cases this will mean: output to
<code>/var/log/messages</code>. On Mac OSX the messages go to
<code>/var/log/system.log</code>.) The keyword <code>verbose</code> is an alias for
<code>verbosity</code>.
<p><dt><strong>Syntax:</strong><dd> <code>verbosity</code> <em>setting</em> or <code>verbose</code> <em>setting</em>, where
<em>setting</em> is <code>true</code>, <code>yes</code> or <code>on</code> to turn
verbosity on; or <code>false</code>, <code>no</code>, <code>off</code> to turn it off.
<p><dt><strong>Default:</strong><dd> <code>off</code>
</dl>
<p>
<a name="conf/dispatchmode"></a><a name="l22"></a>
<strong>4.2.5: dispatchmode - How are back ends selected</strong> <a name="confdispatchmode - How are back ends selected"></a>
<dl>
<p><dt><strong>Description:</strong><dd> The dispatch mode controls how crossroads selects a back end from
a list of active back ends. The text below shows the bare
syntax.
See section <a href="crossroads.html#howselected">5.1</a> for a textual explanation.
<p>
The settings can be:
<p>
<ul>
<li> <code>dispatchmode roundrobin</code>: Simply the 'next in line' is
chosen. E.g., when 3 back ends are active, then the usage
series is 1, 2, 3, 1, 2, 3, and so on.
<p>
Roundrobin dispatching is the default method, when no
<code>dispatchmode</code> statement occurs.
<p>
<li> <code>dispatchmode random</code>: Random selection. Probably only
for stress testing, though when used with weights (see below)
it is a good distributor of new connections too.
<p>
<li> <code>dispatchmode bysize [ over</code> <em>connections</em> <code>]</code>:
The next back end is the one
that has transferred the least number of bytes. This
selection mechanism assumes that the more bytes, the heavier
the load.
<p>
The modifier <code>over</code> <em>connections</em> is optional. (The square
brackets shown above are not part of the statement but
indicate optionality.) When given,
the load is computed as an average of the last stated number of
connections. When this modifier is absent, then the load is
computed over all connections since startup.
<p>
<li> <code>dispatchmode byduration [ over</code> <em>connections</em> <code>]</code>:
The next back end is the one
that served connections for the shortest time. This mechanism
assumes that the longer the connection, the heavier the load.
<p>
<li> <code>dispatchmode byconnections</code>: The next back end is the one
with the fewest active connections. This mechanism assumes that
each connection to a back end represents load. It is usable
for e.g. database connections.
<p>
<li> <code>dispatchmode byorder</code>: The first back end is selected
every time, unless it's unavailable. In that case the second
is taken, and so on.
<p>
<li> <code>dispatchmode externalhandler</code> <em>program arguments</em>:
This is a special mode, where an external program is delegated
the responsibility to say which back end should be used
next. In this case, Crossroads will call the external program,
and this will of course be slower than one of the 'built-in'
dispatch modes. However, this is the ultimate escape when
custom-made dispatch modes are needed.
<p>
The dispatch mode that uses an <code>externalhandler</code> is
discussed separately in section <a href="crossroads.html#externalhandler">5.2</a>.</ul>
<p>
The selection algorithm is only used when clients are serviced that
aren't part of a sticky HTTP session. This is the case during:
<p>
<ul>
<li> all client requests of a service type <code>any</code>;
<li> new sessions of a service type <code>http</code>.</ul>
<p>
When type <code>http</code> is in effect and a session is underway, then the
previously used back end is always selected -- regardless of
dispatching mode.
<p>
Your 'right' dispatch mode will depend on the type of service. Given
the fact that crossroads doesn't know (and doesn't care) how to
estimate load from a network traffic stream, you have to choose an
appropriate dispatch mode to optimize load balancing. In most cases,
<code>roundrobin</code> or <code>byconnections</code> will do the job just fine.
<p><dt><strong>Syntax:</strong><dd> <code>dispatchmode</code> <em>mode</em> (see above for the modes), optionally
followed by <code>over</code> <em>number</em>, or when the <em>mode</em> is
<code>externalhandler</code>, followed by <em>program</em>.
<p><dt><strong>Default:</strong><dd> <code>roundrobin</code>
</dl>
<p>
<a name="conf/revivinginterval"></a><a name="l23"></a>
<strong>4.2.6: revivinginterval - Back end wakeup calls</strong> <a name="confrevivinginterval - Back end wakeup calls"></a>
<dl>
<p><dt><strong>Description:</strong><dd> A reviving interval definition is needed when crossroads
determines that a back end is temporarily unavailable. This will
happen when:
<p>
<ul>
<li> The back end cannot be reached (network connection
fails);
<li> The network connection to the back end suddenly dies.</ul>
<p>
An example of the definition is <code>revivinginterval 10</code>. When this
reviving interval is given, crossroads will check every 10 seconds
whether unavailable back ends have woken up yet. A back end is
considered awake when a network connection to that back end can
successfully be established.
<p><dt><strong>Syntax:</strong><dd> <code>revivinginterval</code> <em>number</em>, where the number is the interval
in seconds.
<p><dt><strong>Default:</strong><dd> 0 (no wakeup calls)
</dl>
<p>
<a name="conf/maxconnections"></a><a name="l24"></a>
<strong>4.2.7: maxconnections - Limiting concurrent clients at service level</strong> <a name="confmaxconnections - Limiting concurrent clients at service level"></a>
<dl>
<p><dt><strong>Description:</strong><dd> The maximum number of connections is specified using
<code>maxconnections</code>. There is one argument: the number of concurrent
established connections that may be active within one service.
<p>
'Throttling' the number of connections is a way of preventing Denial of
Service (DoS) attacks. Without a limit, numerous network connections
may spawn so many server instances that the service ultimately breaks
down and becomes unavailable.
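<p>
As a minimal sketch, a service that polls unavailable back ends every
10 seconds and accepts at most 200 concurrent connections could be
declared as follows. The service name and both numbers are arbitrary
example values, not recommendations:
<p>
<pre>
service myservice {
    ...
    revivinginterval 10;
    maxconnections 200;
    ...
}
</pre>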
<p><dt><strong>Syntax:</strong><dd> <code>maxconnections</code> <em>number</em>, where the number specifies the
maximum of concurrent connections to the service.
<p><dt><strong>Default:</strong><dd> 0, meaning that all connections will be accepted.
</dl>
<p>
<a name="conf/backlog"></a><a name="l25"></a>
<strong>4.2.8: backlog - The TCP Back Log size</strong> <a name="confbacklog - The TCP Back Log size"></a>
<dl>
<p><dt><strong>Description:</strong><dd> The TCP back log size is a number that controls how many
'waiting' network connections may be queued, before a client simply
cannot connect. The syntax is e.g. <code>backlog 5</code> to cause crossroads
to have 5 waiting connections for 1 active connection.
The backlog queue shouldn't be too
large, or clients will experience timeouts before they can actually
connect. The queue shouldn't be too small either, because clients
would be simply rejected. Your mileage may vary.
<p><dt><strong>Syntax:</strong><dd> <code>backlog</code> <em>number</em>
<p><dt><strong>Default:</strong><dd> 0, which takes the operating system's default
value for socket back log size.
</dl>
<p>
<a name="conf/shmkey"></a><a name="l26"></a>
<strong>4.2.9: shmkey - Shared Memory Access</strong> <a name="confshmkey - Shared Memory Access"></a>
<dl>
<p><dt><strong>Description:</strong><dd> Different Crossroads
invocations must 'know' of each other's activity. E.g., <code>crossroads
status</code> must be able to get to the actual state information of all
running services. This is internally implemented through shared
memory, which is reserved using a key.
<p>
Normally crossroads will supply a shared memory key, based on the
service port and bitwise or-ed with a magic number. In situations
where this conflicts with existing keys (of other programs, having
their own keys), you may supply a chosen value.
<p>
The actual key value doesn't matter much, as long as it's unique
and as long as each invocation of crossroads uses it.
<p><dt><strong>Syntax:</strong><dd> <code>shmkey</code> <em>number</em>
<p><dt><strong>Default:</strong><dd> 0, which means that crossroads will 'guess' its
own key, based on TCP port and a magic number.
</dl>
<p>
<a name="conf/allow"></a><a name="l27"></a>
<strong>4.2.10: allow* and deny* - Allowing or denying connections</strong> <a name="confallow* and deny* - Allowing or denying connections"></a>
<dl>
<p><dt><strong>Description:</strong><dd> Crossroads can allow or deny
connections based on the IP address of a client. There are four
directives that are relevant: <code>allowfrom</code>, <code>allowfile</code>,
<code>denyfrom</code> and <code>denyfile</code>. When using <code>allowfrom</code> and
<code>denyfrom</code> then the IP addresses to allow or deny connections are
stated in <code>/etc/crossroads.conf</code>.
<p>
When <code>allow*</code> directives are used, then all connections are denied
unless they match the stated allowed IP addresses. When <code>deny*</code> directives
are used, then all connections are allowed unless they match the
stated disallowed IP addresses. When both denying and allowing are used,
Crossroads checks the deny list first.
<p>
The statements <code>allowfrom</code> and <code>denyfrom</code> are followed by a
list of filter specifications. The statements <code>allowfile</code> and
<code>denyfile</code> are followed by a filename; Crossroads will read
filter specifications from those external files. In both cases,
Crossroads obtains filter specifications and places them in its
lists of allowed or denied IP addresses.
The difference between
specifying filters in <code>/etc/crossroads.conf</code> or in external
files is that Crossroads will reload the external files when it
receives signal 1 (<code>SIGHUP</code>), as in <code>killall -1 crossroads</code>.
<p>
A filter specification must obey the following syntax: it
consists of up to
four numbers ranging from 0 to 255 and separated by
dots. Optionally a slash follows, with a bitmask which is also a
decimal number.
<p>
This is probably best explained by a few examples:
<p>
<ul>
<li> <code>allowfrom 10/8;</code> will allow connections from
<code>10.*.*.*</code> (a full Class A network). The mask <code>/8</code> means
that the first 8 bits of the number (i.e., only the <code>10</code>) are
significant. On the last 3 positions of the IP address, all
numbers are allowed. Given this directive, client connections
from e.g. 10.1.1.1 and 10.2.3.4 will be allowed.
<p>
<li> <code>allowfrom 10.3/16;</code> will allow all IP addresses that
start with <code>10.3</code>.
<p>
<li> <code>allowfrom 10.3.1/16;</code> is the same as above. The third
byte of the IP address is superfluous because the netmask
specifies that only the first 16 bits (2 numbers) are taken
into account.
<p>
<li> <code>allowfrom 10.3.1.15;</code> allows traffic from only the
specified IP address. There is no bitmask; all four numbers
are relevant.
<p>
<li> <code>allowfrom 10.3.1.15 10.2/16;</code> allows traffic from one
IP address <code>10.3.1.15</code> or from a complete Class B network
<code>10.2.*.*</code>.
<p>
<li> <code>allowfile /tmp/myfile.txt;</code> in combination with a file
<code>/tmp/myfile.txt</code>, with the contents <code>10.3.1.15 10.2/16</code>,
is the same as above.</ul>
<p><dt><strong>Syntax:</strong><dd> <ul>
<li> <code>allowfrom</code> <em>filter-specification(s)</em>
<li> <code>denyfrom</code> <em>filter-specification(s)</em>
<li> <code>allowfile</code> <em>filename</em>
<li> <code>denyfile</code> <em>filename</em></ul>
<p><dt><strong>Default:</strong><dd> In absence of these statements, all client IP addresses are accepted.
</dl>
<p>
<a name="conf/useraccount"></a><a name="l28"></a>
<strong>4.2.11: useraccount - Limiting the effective ID of external processes</strong> <a name="confuseraccount - Limiting the effective ID of external processes"></a>
<dl>
<p><dt><strong>Description:</strong><dd> Using the directive <code>useraccount</code>, the effective user and group
ID can be restricted. This comes into effect when Crossroads runs
external commands, such as:
<ul>
<li> Hooks for <code>onstart</code>, <code>onend</code> or <code>onfail</code>;
<li> External dispatchers, when <code>dispatchmode
externalhandler</code> is in effect.</ul>
Once a user name for external commands is specified, Crossroads
assumes the associated user ID and group ID before running those
commands.
<p><dt><strong>Syntax:</strong><dd> <code>useraccount</code> <em>username</em>
<p><dt><strong>Default:</strong><dd> None; when unspecified, external commands are run with the
ID that was in effect when Crossroads was started.
</dl>
<p>
<a name="l29"></a>
<h3>4.3: Backend definitions</h3>
<p>
Inside the service definitions as described in the previous
section, <em>backend definitions</em> must also occur. Backend definitions
are started by the keyword <code>backend</code>, followed by an identifier
(the back end name), and statements inside <code>{</code> and <code>}</code>:
<p>
<pre>
service myservice {
    ...
    ...   // statements that define the
    ...   // service named 'myservice'
    ...

    backend mybackend {
        ...
        ...   // statements that define the
        ...   // backend named 'mybackend'
        ...
    }
}
</pre>

<p>
Each service definition must have at least one backend
definition. There may be more (and probably will, if you want
balancing and fail over) as long as the backend names differ.
The statements in the backend definition blocks are described in the
following sections.
<p>
Some directives (<code>stickycookie</code> etc.) only have effect when
Crossroads treats the network traffic as a stream of HTTP messages;
i.e., when the service is declared with <code>type http</code>. In case of
<code>type any</code>, the HTTP-specific directives have no effect.
<p>
<a name="conf/server.yo"></a><a name="l30"></a>
<strong>4.3.1: server - Specifying the back end address</strong> <a name="confserver - Specifying the back end address"></a>
<dl>
<p><dt><strong>Description:</strong><dd> Each back end must be identified by the network name
(server name) where it is located. For example: <code>server
10.1.1.23</code>, or <code>server web.mydomain.org</code>. A TCP port specifier
can follow the server name, as in <code>server web.mydomain.org:80</code>.
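<p>
For illustration, two back ends of one service might be declared as
sketched below. The service and back end names are arbitrary, and the
addresses are simply the example values used above:
<p>
<pre>
service myservice {
    ...
    backend one {
        server 10.1.1.23;
    }
    backend two {
        server web.mydomain.org:80;
    }
}
</pre>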
<p><dt><strong>Syntax:</strong><dd> <ul>
<li> <code>server</code> <em>servername</em>, where <em>servername</em> is a
network name or IP address;
<li> <code>server</code> <em>servername:port</em></ul>
<p><dt><strong>Default:</strong><dd> There is no default. This is a required setting.
</dl>
<p>
<a name="conf/verbose-backend.yo"></a><a name="l31"></a>
<strong>4.3.2: verbosity - Controlling verbosity at the back end level</strong> <a name="confverbosity - Controlling verbosity at the back end level"></a>
<dl>
<p><dt><strong>Description:</strong><dd> Similar to <code>service</code> specifications, a
<code>backend</code> can have its own verbosity (<code>on</code> or <code>off</code>). When
<code>on</code>, traffic to and from this back end is reported.
<p><dt><strong>Syntax:</strong><dd> <ul>
<li> <code>verbosity</code> <em>setting</em>, or
<li> <code>verbose</code> <em>setting</em>, where <em>setting</em> is <code>true</code>,
<code>yes</code> or <code>on</code> to turn verbosity on; or <code>false</code>, <code>no</code>, <code>off</code> to turn it
off.</ul>
<p><dt><strong>Default:</strong><dd> <code>off</code>
</dl>
<p>
<a name="conf/weight"></a><a name="l32"></a>
<strong>4.3.3: weight - When a back end is more equal than others</strong> <a name="confweight - When a back end is more equal than others"></a>
<dl>
<p><dt><strong>Description:</strong><dd> To influence how backends are selected, a backend can specify its
'weight' in the process. The higher the weight, the less likely a
back end will be chosen. The default is 1.
<p>
The weighing mechanism only applies to the dispatch modes
<code>random</code>, <code>byconnections</code>, <code>bysize</code> and <code>byduration</code>.
The weight is in fact a penalty factor.
E.g., if backend A has
<code>weight 2</code> and backend B has <code>weight 1</code>, then backend B will
be selected all the time, until its usage parameter is twice as
large as the parameter of A. Think of it as a 'sluggishness'
statement.
<p><dt><strong>Syntax:</strong><dd> <code>weight</code> <em>number</em>; the higher the number, the more 'sluggish'
a back end is
<p><dt><strong>Default:</strong><dd> 1; all back ends have equal weight.
</dl>
<p>
<a name="conf/decay"></a><a name="l33"></a>
<strong>4.3.4: decay - Levelling out activity of a back end</strong> <a name="confdecay - Levelling out activity of a back end"></a>
<dl>
<p><dt><strong>Description:</strong><dd> To make sure that a 'spike' of activity doesn't
influence the perceived load of a back end forever, you may
specify a certain decay. E.g., the statement <code>decay 10</code> makes
sure that the load that crossroads computes for this back end (be
it in seconds or in bytes) is decreased by 10% each time that
<strong>another</strong> back end is hit. Decays are not applied to the count
of concurrent connections.
<p>
This means that when a given back end is hit, then its usage data
of the transferred bytes and the connection duration are updated
using the actual number of bytes and actual duration. However,
when a different back end is hit, then the usage data are
decreased by the specified decay.
<p><dt><strong>Syntax:</strong><dd> <code>decay</code> <em>number</em>, where <em>number</em> is a percentage that
decreases the back end usage data when other back ends are
hit.
<p><dt><strong>Default:</strong><dd> 0, meaning that no decay is applied to usage statistics.
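<p>
A worked example, under the assumption that each decay step is applied
to the current (already decayed) figure: with <code>decay 10</code>, a back
end whose recorded usage is 1000 bytes drops to 900 after another back
end is hit once, to 810 after a second such hit, and to 729 after a
third. The numbers are illustrative only. In a configuration this could
read as follows (service and back end names are arbitrary):
<p>
<pre>
service myservice {
    ...
    dispatchmode bysize;
    backend one {
        ...
        decay 10;
    }
    backend two {
        ...
        decay 10;
    }
}
</pre>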
</dl>
<p>
<a name="conf/onhooks"></a><a name="l34"></a>
<strong>4.3.5: onstart, onend, onfail - Action Hooks</strong> <a name="confonstart, onend, onfail - Action Hooks"></a>
<dl>
<p><dt><strong>Description:</strong><dd> The three directives <code>onstart</code>, <code>onend</code> and <code>onfail</code> can be
specified to start system commands (external programs) when a
connection to a back end starts, fails or ends:
<ul>
<li> <code>onstart</code> commands will be run when Crossroads
successfully connects to a back end, and starts servicing;
<li> <code>onend</code> commands will be run when a (previously
established) connection stops;
<li> <code>onfail</code> commands will be run when Crossroads tries to
contact a back end to serve a client, but the back end can't
be reached.</ul>
<p>
The format is always <code>on</code><em>type</em> <em>command</em>. The <em>command</em>
is an external program, optionally followed by arguments.
The command is expanded according to the following table:
<p>
<ul>
<li> <code>%a</code> is the availability of the current back end, when
a current back end is established;
<li> <code>%1a</code> is the availability of the first back end (0 when
unavailable, 1 if available); <code>%2a</code> is the availability of
the second back end, and so on;
<li> <code>%b</code> is the name of the current back end, when one is
established;
<li> <code>%1b</code> is the name of the first back end, <code>%2b</code> of the
second back end, and so on;
<li> <code>%e</code> is the count of seconds since start of epoch
(January 1st 1970 GMT);
<li> <code>%r</code> is the IP address of the client that requests a
connection and for whom the external dispatcher should compute
a back end;
<li> <code>%s</code> is the name of the current service that the client
connected to;
<li> <code>%t</code> is the current local time in ASCII format, in
<em>YYYY-MM-DD/hhh:mm:ss</em>;
<li> <code>%T</code> is the current GMT time in ASCII format;
<li> <code>%v</code> is the Crossroads version;
<li> Any other character following a <code>%</code> sign is taken
literally; e.g. <code>%z</code> is just a z.</ul>
<p>
<p><dt><strong>Syntax:</strong><dd> <ul>
<li> <code>onstart</code> <em>commandline</em>
<li> <code>onend</code> <em>commandline</em>
<li> <code>onfail</code> <em>commandline</em>
<li> <code>onsuccess</code> <em>commandline</em></ul>
<p><dt><strong>Default:</strong><dd> There is no default. Normally no external programs are run upon
connection, success or failure of a back end.
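<p>
To illustrate the expansion with made-up values: suppose that a client
at 10.1.1.15 connects to a service named <code>myservice</code> and is
handled by the back end named <code>one</code>. A command line written as
the first line below would then be run as shown on the second line. The
program <code>/usr/local/bin/notify</code> is hypothetical and only serves
as a placeholder:
<p>
<pre>
/usr/local/bin/notify %s %b %r
/usr/local/bin/notify myservice one 10.1.1.15
</pre>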
</dl>
<p>
<a name="conf/trafficlog"></a><a name="l35"></a>
<strong>4.3.6: trafficlog and throughputlog - Debugging and Performance Aids</strong> <a name="conftrafficlog and throughputlog - Debugging and Performance Aids"></a>
<dl>
<p><dt><strong>Description:</strong><dd> Two directives are available
to log network traffic to files. They are <code>trafficlog</code> and
<code>throughputlog</code>.
<p>
The <code>trafficlog</code> statement causes all traffic to be logged in
hexadecimal format. Each line is prefixed by <code>B</code> or <code>C</code>,
depending on whether the information was received from the back
end or from the client.
<p>
The <code>throughputlog</code> statement writes shorthand transmissions to
its log, accompanied by timings.
<p><dt><strong>Syntax:</strong><dd> <ul>
<li> <code>trafficlog</code> <em>filename</em>
<li> <code>throughputlog</code> <em>filename</em></ul>
<p><dt><strong>Default:</strong><dd> none
</dl>
<p>
<a name="conf/stickycookie"></a><a name="l36"></a>
<strong>4.3.7: stickycookie - Back end selection with an HTTP cookie</strong> <a name="confstickycookie - Back end selection with an HTTP cookie"></a>
<dl>
<p><dt><strong>Description:</strong><dd> The directive <code>stickycookie</code> <em>value</em>
causes Crossroads to unpack clients' requests, to check for
<em>value</em> in the cookies. When found, the message is routed to the
back end having the appropriate <code>stickycookie</code> directive.
<p>
E.g., consider the following configuration:
<p>
<pre>
service ... {
    ...
    backend one {
        ...
        stickycookie "BalancerID=first";
    }
    backend two {
        ...
        stickycookie "BalancerID=second";
    }
}
</pre>

<p>
When clients' messages contain cookies named <code>BalancerID</code> with
the value <code>first</code>, then such messages are routed to backend
<code>one</code>. When the value is <code>second</code> then they are routed to the
backend <code>two</code>.
<p>
There are basically two ways to provide such cookies to a browser. First, a
back end can insert such a cookie into the HTTP response. E.g.,
the webserver of back end <code>one</code> might insert a cookie named
<code>BalancerID</code>, having value <code>first</code>.
Second, Crossroads can insert such cookies using a carefully
crafted directive <code>addclientheader</code>.
<p><dt><strong>Syntax:</strong><dd> <code>stickycookie</code> <em>cookievalue</em>
<p><dt><strong>Default:</strong><dd> There is no default.
</dl>
<p>
<a name="conf/addclientheader"></a><a name="l37"></a>
<strong>4.3.8: HTTP Header Modification Directives</strong> <a name="confHTTP Header Modification Directives"></a>
<dl>
<p><dt><strong>Description:</strong><dd> Crossroads understands the following
header modification directives: <code>addclientheader</code>,
<code>appendclientheader</code>, <code>setclientheader</code>, <code>addserverheader</code>,
<code>appendserverheader</code>, <code>setserverheader</code>.
<p>
The directive names always consist of
<em>Action</em><em>Destination</em><code>header</code>, where:
<p>
<ul>
<li> The action is <code>add</code>, <code>append</code> or <code>set</code>.
<p>
<ul>
<li> Action <code>add</code> adds a header, even when headers with
the same name already are present in an HTTP
message. Adding headers is useful for e.g. <code>Set-Cookie</code>
headers; a message may contain several such headers.
<p>
<li> Action <code>append</code> adds a header if it isn't present
yet in an HTTP message. If such a header is already
present, then the value is appended to the pre-existing
header. This is useful for e.g. <code>Via</code> headers. Imagine
an HTTP message with a header <code>Via: someproxy</code>. Then the
directive <code>appendclientheader "Via: crossroads"</code> will
rewrite the header to <code>Via: someproxy; crossroads</code>.
<p>
<li> Action <code>set</code> overwrites headers with the same
name; or adds a new header if no pre-existing header is found.
This is useful for e.g. <code>Host</code> headers.</ul>
<p>
<li> The destination is one of <code>client</code> or <code>server</code>. When
the destination is <code>server</code>, then Crossroads will apply such
directives to HTTP messages that originate from the browser
and are being forwarded to back ends. When the destination is
<code>client</code>, then Crossroads will apply such directives to
backend responses that are shuttled to the browser.</ul>
<p>
The format of the directives is e.g. <code>addclientheader
"X-Processed-By: Crossroads"</code>. The directives expect one
argument: a string, consisting of a header name, a colon, and a
header value. As usual, the directive must end with a semicolon.
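<p>
The three actions can be compared on a small, made-up back end
response. In the sketch below each directive is applied on its own to
the same original headers; the header values and the position of newly
added headers are illustrative assumptions:
<p>
<pre>
Original response headers from the back end:
    Server: Apache
    Set-Cookie: lang=en

After  addclientheader "Set-Cookie: BalancerID=first";
    Server: Apache
    Set-Cookie: lang=en
    Set-Cookie: BalancerID=first

After  setclientheader "Server: WWW-Server";
    Server: WWW-Server
    Set-Cookie: lang=en

After  appendclientheader "Via: crossroads";
    Server: Apache
    Set-Cookie: lang=en
    Via: crossroads
</pre>
<p>
In the last case no <code>Via</code> header was present yet, so
<code>append</code> adds the whole header.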
<p>
The header value may contain one of the following formatting
directives:
<p>
<ul>
<li> <code>%a</code> is the availability of the current back end, when
a current back end is established;
<li> <code>%1a</code> is the availability of the first back end (0 when
unavailable, 1 if available); <code>%2a</code> is the availability of
the second back end, and so on;
<li> <code>%b</code> is the name of the current back end, when one is
established;
<li> <code>%1b</code> is the name of the first back end, <code>%2b</code> of the
second back end, and so on;
<li> <code>%e</code> is the count of seconds since start of epoch
(January 1st 1970 GMT);
<li> <code>%r</code> is the IP address of the client that requests a
connection and for whom the external dispatcher should compute
a back end;
<li> <code>%s</code> is the name of the current service that the client
connected to;
<li> <code>%t</code> is the current local time in ASCII format, in
<em>YYYY-MM-DD/hhh:mm:ss</em>;
<li> <code>%T</code> is the current GMT time in ASCII format;
<li> <code>%v</code> is the Crossroads version;
<li> Any other character following a <code>%</code> sign is taken
literally; e.g. <code>%z</code> is just a z.</ul>
<p>
The following examples show common uses of header modifications.
<p>
<dl>
<p><dt><strong>Enforcing session stickiness:</strong><dd> By combining
<code>stickycookie</code> and <code>addclientheader</code>, HTTP session
stickiness is enforced. Consider the following configuration:
<p>
<pre>
service ... {
    ...
    backend one {
        ...
        addclientheader "Set-Cookie: BalancerID=first; path=/";
        stickycookie "BalancerID=first";
    }
    backend two {
        ...
        addclientheader "Set-Cookie: BalancerID=second; path=/";
        stickycookie "BalancerID=second";
    }
}
</pre>

<p>
The first request of an HTTP session is balanced to either
backend <code>one</code> or <code>two</code>. The server response is enriched
using <code>addclientheader</code> with an appropriate cookie. A
subsequent request from the same browser now has that cookie
in place, and is therefore sent to the same back end where
its predecessors went.
<p>
<p><dt><strong>Hiding the server software version:</strong><dd> Many servers
(e.g. Apache) advertise their version, as in <code>Server: Apache
1.27</code>. This potentially provides information to attackers. The
following configuration hides such information:
<p>
<pre>
service ... {
    ...
    backend one {
        ...
        setclientheader "Server: WWW-Server";
    }
}
</pre>

<p>
<p><dt><strong>Informing the server of the clients' IP address:</strong><dd> Since
Crossroads sits 'in the middle' between a client and a back
end, the back end perceives Crossroads as its client. The
following sends the true client's IP address to the server, in
a header <code>X-Real-IP</code>:
<p>
<pre>
service ... {
    ...
    backend one {
        ...
        setserverheader "X-Real-IP: %r";
    }
}
</pre>

<p>
<p><dt><strong>Keep-Alive Downgrading:</strong><dd> The directives
<code>setclientheader</code> and <code>setserverheader</code> also play a key
role in downgrading Keep-Alive connections to
'single-shot'. E.g., the following configuration makes sure
that no Keep-Alive connections occur.
<p>
<pre>
service ... {
    ...
    backend one {
        ...
        setserverheader "Connection: close";
        setclientheader "Connection: close";
    }
}
</pre>
</dl>
<p><dt><strong>Syntax:</strong><dd> <ul>
<li> <code>addclientheader</code> <em>Headername: headervalue</em> to add a
header in the traffic towards the client, even when another
header <em>Headername</em> exists;
<li> <code>appendclientheader</code> <em>Headername: headervalue</em> to
append <em>headervalue</em> to an existing header <em>Headername</em>
in the traffic towards the client,
or to add the whole header altogether;
<li> <code>setclientheader</code> <em>Headername: headervalue</em> to
overwrite an existing header in the traffic towards the
client, or to add such a header;
<li> <code>addserverheader</code> <em>Headername: headervalue</em> to add a
header in the traffic towards the server, even when another
header <em>Headername</em> exists;
<li> <code>appendserverheader</code> <em>Headername: headervalue</em> to
append <em>headervalue</em> to an existing header <em>Headername</em>
in the traffic towards the server,
or to add the whole header altogether;
<li> <code>setserverheader</code> <em>Headername: headervalue</em> to
overwrite an existing header in the traffic towards the
server, or to add such a header.</ul>
<p><dt><strong>Default:</strong><dd> There is no default.
</dl>
<p>
<a name="l38"></a>
<h2>5: Tips, Tricks and Remarks</h2>
<a name="tips"></a>The following sections elaborate on the directives as described in
section <a href="crossroads.html#config">4</a> to illustrate how crossroads works and to help you
achieve the "optimal" balancing configuration.
<p>
<a name="l39"></a>
<h3>5.1: How back ends are selected in load balancing</h3><a name="howselected"></a>
<p>
In order to tune your load balancing, you'll need to understand how
crossroads computes usage, how weighing works, and so on. In this
section we'll focus on the dispatching modes <code>bysize</code>, <code>byduration</code>
and <code>byconnections</code> only. The other dispatching types are
self-explanatory.
<p>
<a name="l40"></a>
<strong>5.1.1: Bysize, byduration or byconnections?</strong>
<p>
As stated before, crossroads doesn't know 'what a service does' and
how to judge whether a given back end is very busy or not. You
must therefore give the right hints:
<p>
<ul>
<li> In general, a service which is CPU bound will be more
busy when it takes longer to process a request. The dispatch
mode <code>byduration</code> is appropriate here.
<p>
<li> In contrast, a service which is filesystem bound will be
more busy when more data are transferred. The dispatch mode
<code>bysize</code> is appropriate.
<p>
<li> The dispatch mode <code>byduration</code> can also be used when
network latency is an issue. E.g., if your balancer has back
ends that are geographically distributed, then <code>byduration</code>
would be a good way to select the best available back ends.
<p>
<li> Furthermore it is noteworthy that <code>dispatchmode
byduration</code> is not usable for interactive processes such as
SSH logins. Idle time of a
login adds to the duration, while causing (almost) no
load. Mode <code>byduration</code> should only be used for automated
processes that don't wait for user interaction (e.g., SOAP
calls and other HTTP requests).
<p>
<li> As a last remark, the dispatching mode <code>byconnections</code> can
    be used if you don't have other clues for load
    estimations.
<p>
    E.g., consider a database connection. What's
    heavier on the back end, time-consuming connections, or connections
    where loads of bytes are transferred? Well, that depends. A
    tough <code>select</code> query that joins multiple tables can be very
    heavy on the back end, though the response set, and hence the
    number of transferred bytes, can be quite small. That would suggest
    dispatching by duration. However, <code>byduration</code>
    balancing doesn't represent the real world when interactive
    connections can occur where users have an idle TCP connection to
    the database:
    this consumes time, but no bytes (see the SSH login example
    above). In this case, the dispatch mode <code>byconnections</code> may be
    your best bet.
<p>
</ul>
<p>
<a name="l41"></a>
<strong>5.1.2: Averaging size and duration</strong>
<p>
The configuration statement <code>dispatchmode bysize</code> or <code>byduration</code>
allows an optional modifier <code>over</code> <em>number</em>, where the stated
number represents a connection count. When this modifier is present, then
crossroads will use a moving average over the last <em>n</em> connections to
compute duration and size figures.
<p>
In the real world you'll always want this modifier. E.g., consider two
back ends that have been running for years, and one of them is suddenly
overloaded and very busy (it experiences a 'spike' in activity).
When the <code>over</code> modifier is absent, then
the sudden load will hardly show up in the usage figures -- it will
flatten out due to the large usage figures already stored in the years
of service.
<p>
In contrast, when e.g.
<code>over 3</code> is in effect, then a sudden load 1400 does show up -- because it highly contributes to the average of three 1401 connections. 1402 <p> 1403 <a name="l42"></a> 1404 <strong>5.1.3: Specifying decays</strong> 1405 <p> 1406 Decays are also only relevant when crossroads computes the 'next best 1407 back end' by size (bytes) or duration (seconds). E.g., imagine two 1408 back ends A and B, both averaged over say 3 connections. 1409 <p> 1410 Now when back end A is suddenly hit by a spike, 1411 its average would go up accordingly. But the back end would never 1412 again be used, unless B also received a similar spike, because A's 1413 'usage data' over its last three connections would forever be larger than 1414 B's data. 1415 <p> 1416 For that reason, you should in real situations probably always 1417 specify a decay, so that the backend selection algorithm recovers from 1418 spikes. Note that the usage data of the back end where a decay is 1419 specified, decay when <strong>other</strong> back ends are hit. The decay parameter 1420 is like specifying how fast your body regenerates when someone else 1421 does the work. 
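<p>
The interplay of averaging and decay can be tried out with a small
simulation. The sketch below is only an illustration, written in Python,
and not the Crossroads implementation: the function names are
hypothetical, B's usage figure is held constant for simplicity, and the
decay is assumed to multiply the stored figure by (100 - decay)/100 each
time the other back end is hit.

```python
def average_over(samples, n):
    """Mean of the last n samples: the idea behind 'over n'."""
    recent = samples[-n:]
    return sum(recent) / len(recent)


def rounds_until_reused(usage_a, usage_b, decay_percent, max_rounds=1000):
    """Count how many connections go to B before A's figure is no
    longer higher than B's. B's figure is held constant (assumption).
    Returns None when A is not reused within max_rounds."""
    for rounds in range(max_rounds):
        if usage_a <= usage_b:
            return rounds
        # A was not hit this round, so its usage figure decays.
        usage_a *= 1 - decay_percent / 100.0
    return None


# Durations in seconds; back end A ends with a spike of 30 seconds.
spiked = average_over([1.0, 1.0, 1.0, 1.0, 30.0], 3)   # (1 + 1 + 30) / 3
steady = average_over([1.0, 1.0, 1.0, 1.0, 1.0], 3)    # stays at 1.0

without_decay = rounds_until_reused(spiked, steady, 0)
with_decay = rounds_until_reused(spiked, steady, 10)
print(without_decay, with_decay)
```

<p>
Under these assumptions A is never selected again when the decay is 0,
while a decay of 10 brings A back into play after a limited number of
connections to B.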
<p>
The below configuration illustrates this:
<p>
<pre>
/* Definition of the service */
service soap {
    /* Local TCP port */
    port 8080;

    /* We'll select back ends by the processing
     * duration
     */
    dispatchmode byduration over 3;

    /* First back end: */
    backend A {
        /* Back end IP address and port */
        server 10.1.1.1:8080;

        /* When this back end is NOT hit because
         * the other one was less busy, then the
         * usage parameters decay 10% per connection
         */
        decay 10;
    }

    /* Second back end: */
    backend B {
        server 10.1.1.2:8080;
        decay 10;
    }
}
</pre>

<p>
<a name="l43"></a>
<strong>5.1.4: Adjusting the weights</strong>
<p>
The back end modifier <code>weight</code> is useful in situations where your
back ends differ with respect to performance. E.g., your back ends may
be geographically distributed, and you know that a given back end is
difficult to reach and often experiences network lag.
<p>
Or you may have
one primary back end, a system with a fast CPU and enough memory, and a
small fall-back back end, with a slow CPU and short on memory. In that
case you know in advance that the second back end should be used only
rarely. Most requests should go to the big server, up to a certain load.
<p>
In such cases you will know in advance that the best performing back ends
should be selected the most often. Here's where the <code>weight</code>
statement comes in: you can simply increase the weight of the back
ends with the least performance, so that they are selected less
frequently.
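<p>
One way to picture the effect of a weight is as a multiplier on the
usage figure before back ends are compared. The Python sketch below
rests on that assumption; the function name <code>pick_backend</code> and
the default weight of 1 for back ends without a <code>weight</code>
statement are hypothetical and only serve the illustration.

```python
def pick_backend(usage, weight):
    """Return the back end with the lowest usage * weight.
    Back ends missing from 'weight' count with weight 1 (assumption)."""
    return min(sorted(usage), key=lambda name: usage[name] * weight.get(name, 1))


weight = {"B": 2, "C": 4}          # A has no weight statement

# Weighted figures: A = 3.0, B = 4.0, C = 4.0, so A is chosen.
busy = pick_backend({"A": 3.0, "B": 2.0, "C": 1.0}, weight)

# Weighted figures: A = 3.0, B = 2.0, C = 4.0, so B is chosen.
idle_b = pick_backend({"A": 3.0, "B": 1.0, "C": 1.0}, weight)

print(busy, idle_b)
```

<p>
In this model a back end with weight 2 is only preferred once its raw
usage drops below half of the usage of an unweighted back end.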
<p>
E.g., consider the following configuration:
<p>
<pre>
service soap {
    port 8080;
    dispatchmode byduration over 3;
    backend A {
        server 10.1.1.1:8080;
        decay 20;
    }
    backend B {
        server 10.1.1.2:8080;
        weight 2;
        decay 10;
    }
    backend C {
        server 10.1.1.3:8080;
        weight 4;
        decay 5;
    }
}
</pre>

<p>
This will cause crossroads to select back ends by the processing time,
averaging over the last three connections. However, backend B will kick
in only when its usage is half of the usage of A (back end B is
probably only half as fast as A). Backend C will kick in only when its
usage is a quarter of the usage of A, that is, half of the usage of B
(back end C is probably very weak, and just a fall-back system in case
both A and B crash). Note also that A's usage data decay much faster
than B's and C's: we're assuming that this big server recovers quicker
than its smaller siblings.
<p>
<a name="l44"></a>
<strong>5.1.5: Throttling the number of concurrent connections</strong>
<p>
If you suspect that your service may occasionally receive 'spikes' of
activity (which you should always assume), then it might be a
good idea to protect your service by specifying a maximum number of
concurrent connections. This protection can be specified on two levels:
<p>
<dl>
<p><dt><strong>On the service level</strong><dd> a statement like <code>maxconnections 100;</code>
    states that the service as a whole will never
    serve more than 100 concurrent connections. This means that
    all your back ends and the crossroads balancer itself
    will be protected from being overloaded.
<p><dt><strong>On the back end level</strong><dd> a statement like <code>maxconnections 10;</code>
    states that this particular back end will never have more
    than 10 concurrent connections, regardless of the overall
    setting on the service level. This means that this
    particular back end will be protected from being
    overloaded (regardless of what other back ends may
    experience).</dl>
<p>
The <code>maxconnections</code> statement, combined with a back end selection
algorithm, allows very fine granularity. The <code>maxconnections</code> statement
on the back end level is like a hand brake: even when you specify a
back end algorithm that would protect a given back end from being used
too much, a situation may occur where that back end is about to be
hit. A <code>maxconnections</code> statement on the level of that back end may then
protect it.
<p>
<a name="l45"></a>
<h3>5.2: Using an external program to dispatch</h3>
<a name="externalhandler"></a>
<p>
As mentioned before, Crossroads supports several built-in dispatch
modes. However, you are always free to hook in your own dispatch mode
that determines the next back end using your own specific
algorithm. This section explains how to do it.
<p>
<a name="l46"></a>
<strong>5.2.1: Configuring the external handler</strong>
<p>
First, the <code>dispatchmode</code> statement needs to inform Crossroads that
an external program will do the job. The syntax is: <code>dispatchmode
externalhandler</code> <em>program arguments</em>. The <em>program</em> must point to
an executable program that will be started by Crossroads. The
specifier <em>arguments</em> can be anything you want; those will be the
arguments that Crossroads passes to the program.
You can however use the following special
format specifiers:
<p>
<ul>
<li> <code>%a</code> is the availability of the current back end, when
    a current back end is established;
<li> <code>%1a</code> is the availability of the first back end (0 when
    unavailable, 1 if available); <code>%2a</code> is the availability of
    the second back end, and so on;
<li> <code>%b</code> is the name of the current back end, when one is
    established;
<li> <code>%1b</code> is the name of the first back end, <code>%2b</code> of the
    second back end, and so on;
<li> <code>%e</code> is the count of seconds since start of epoch
    (January 1st 1970 GMT);
<li> <code>%r</code> is the IP address of the client that requests a
    connection and for whom the external dispatcher should compute
    a back end;
<li> <code>%s</code> is the name of the current service that the client
    connected to;
<li> <code>%t</code> is the current local time in ASCII format, in
    <em>YYYY-MM-DD/hhh:mm:ss</em>;
<li> <code>%T</code> is the current GMT time in ASCII format;
<li> <code>%v</code> is the Crossroads version;
<li> Any other character following a <code>%</code> sign is taken
    literally; e.g. <code>%z</code> is just a z.</ul>
<p>
Note that format specifiers such as <code>%b</code> don't make sense in the
phase in which an external handler is called, since there is no
current back end yet (the job of the handler is to supply one).
<p>
<a name="l47"></a>
<strong>5.2.2: Writing the external handler</strong>
<p>
The external handler is activated using the arguments that are
specified in <code>/etc/crossroads.conf</code>. The external handler can do
whatever it wants, but ultimately, it must write a back end name on
its <em>stdout</em>. Crossroads reads this, and if the back end is
available, uses that back end for the connection.
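<p>
The contract described above (arguments in, one back end name on
<em>stdout</em>) fits in a few lines. The following Python sketch is a
minimal illustration under stated assumptions: the handler is configured
with pairs such as <code>%1b %1a %2b %2a</code>, the function names are
hypothetical, and the non-zero return value for "nothing available" is a
guess, since the reaction of Crossroads to an empty reply is not
described here.

```python
def first_available(args):
    """args is a flat list of name/availability pairs, e.g. the
    expansion of '%1b %1a %2b %2a'. Returns a name or None."""
    for name, available in zip(args[0::2], args[1::2]):
        if available == "1":
            return name
    return None


def main(argv):
    """Handler entry point; argv[0] is the program name."""
    choice = first_available(argv[1:])
    if choice is None:
        return 1          # nothing to suggest (return value is an assumption)
    print(choice)         # the back end name goes to stdout
    return 0


# Example: back end A is unavailable, B is available.
print(first_available(["A", "0", "B", "1"]))
```

<p>
A real handler script would finish by passing <code>sys.argv</code> to
<code>main()</code> and exiting with its return value.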
<p>
<a name="l48"></a>
<strong>5.2.3: Examples of external handlers</strong>
<p>
This section shows some examples of Crossroads configurations
and their external handlers. The sample handlers that are shown here are
also included in the Crossroads distribution, under the directory
<code>etc/</code>. Also note that the examples shown here are just
quick-and-dirty Perl scripts, meant as illustrations only. Your
applications may need other external handlers, but you can use the
shown scripts as a starting point.
<p>
<p><strong>Round-robin dispatching</strong><br>
<p>
This example is trivial in the sense that round-robin dispatching is
already built into Crossroads, so
that using an external handler for this purpose only slows down
Crossroads. However, it's a good starting example.
<p>
The Crossroads configuration is shown below:
<p>
<pre>
service test {
    port 8001;
    verbosity on;
    revivinginterval 5;

    dispatchmode externalhandler
        /usr/local/src/crossroads/etc/dispatcher-roundrobin
        %1b %1a %2b %2a;

    backend testone {
        server localhost:3128;
        verbosity on;
    }
    backend testtwo {
        server localhost:3128;
        verbosity on;
    }
}
</pre>

<p>
The relevant <code>dispatchmode</code> statement invokes the external program
<code>dispatcher-roundrobin</code> with four arguments: the name of the first
back end (<code>testone</code>), its availability (0 or 1), the name of the
second back end (<code>testtwo</code>) and its availability (0 or 1).
<p>
The external handler, which is also included in the Crossroads
distribution, is shown below. It is a Perl script.
<p>
<pre>
#!/usr/bin/perl

use strict;

# Example of a round-robin external dispatcher.
This is totally 1654 # superfluous, Crossroads has this on-board; if you use the external 1655 # program for determining round-robin dispatching, then you'll only 1656 # slow things down. This script is just meant as an example. 1657 1658 # Globals / configuration 1659 # ----------------------- 1660 my $log = '/tmp/exthandler.log'; # Debug log, set to /dev/null to suppress 1661 my $statefile = '/tmp/rr.last'; # Where we keep the last used 1662 1663 # Logging 1664 # ------- 1665 sub msg { 1666 return if ($log eq '/dev/null' or $log eq ''); 1667 open (my $of, ">>$log") or return; 1668 print $of (scalar(localtime()), ' ', @_); 1669 } 1670 1671 # Read the last used back end 1672 # --------------------------- 1673 sub readlast() { 1674 my $ret; 1675 1676 if (open (my $if, $statefile)) { 1677 $ret = <$if>; 1678 chomp ($ret); 1679 close ($if); 1680 msg ("Last used back end: $ret\n"); 1681 return ($ret); 1682 } 1683 msg ("No last-used back end (yet)\n"); 1684 return (undef); 1685 } 1686 1687 # Write back the last used back end, reply to Crossroads and stop 1688 # --------------------------------------------------------------- 1689 sub reply ($) { 1690 my $last = shift; 1691 1692 if (open (my $of, ">$statefile")) { 1693 print $of ("$last\n"); 1694 } 1695 print ("$last\n"); 1696 exit (0); 1697 } 1698 1699 # Main starts here 1700 # ---------------- 1701 1702 # Collect the cmdline arguments. We expect pairs of backend-name / 1703 # backend-availablility, and we'll store only the available ones. 1704 msg ("Dispatch request received\n"); 1705 my @backend; 1706 for (my $i = 0; $i <= $#ARGV; $i += 2) { 1707 push (@backend, $ARGV[$i]) if ($ARGV[$i + 1]); 1708 } 1709 msg ("Available back ends: @backend\n"); 1710 1711 # Let's see what the last one is. If none found, then we return the 1712 # first available back end. Otherwise we need to go thru the list of 1713 # back ends, and return the next one in line. 
my $last = readlast();
if ($last eq '') {
    msg ("Returning first available back end $backend[0]\n");
    reply ($backend[0]);
}

# There **was** a last back end. Try to match it in the list,
# then return the next-in-line. The final list entry is not
# checked here: if it was the last one used, we wrap around below.
for (my $i = 0; $i < $#backend; $i++) {
    if ($last eq $backend[$i]) {
        msg ("Returning next back end ", $backend[$i + 1], "\n");
        reply ($backend[$i + 1]);
    }
}

# No luck.. run back to the first one.
msg ("Returning first back end $backend[0]\n");
reply ($backend[0]);
</pre>

<p>
The working of the script is basically as follows:
<p>
<ul>
<li> The argument list is scanned. Back ends that are
    available are collected in an array <code>@backend</code>.
<p>
<li> The script queries a state file <code>/tmp/rr.last</code>. If a
    back end name occurs there, then the next back end is looked
    up in <code>@backend</code> and returned to Crossroads. If the last back
    end is unknown or can't be matched, then the first available back
    end (first element of <code>@backend</code>) is returned to Crossroads.
<p>
<li> Informing Crossroads is done via the subroutine
    <code>reply()</code>. This code writes the selected back end to file
    <code>/tmp/rr.last</code> (for future usage) and prints the back end
    name to <em>stdout</em>.
<p>
<li> The script logs its actions to a file
    <code>/tmp/exthandler.log</code>. This log file can be inspected for
    the script's actions.</ul>
<p>
<p><strong>Dispatching by the client IP address</strong><br>
<p>
The following example shows a useful real-life situation.
The 1759 situation is as follows: 1760 <p> 1761 <ul> 1762 <li> Crossroads is used as a single-address point to forward 1763 Remote Desktop requests to a farm of Windows systems, where 1764 users can work via remote access; 1765 <p> 1766 <li> However, users may stop their session, and when they 1767 re-connect, they expect to be sent to the Windows system that 1768 they had worked on previously; 1769 <p> 1770 <li> Client PC's have their distinct IP addresses, which 1771 distinguishes them. 1772 <p> 1773 <li> Of four windows systems, two are large servers, and two 1774 are small ones. We'll want to assign large servers to clients 1775 when we have a choice.</ul> 1776 <p> 1777 The requirements resemble session stickiness in HTTP, except that the remote 1778 desktop protocol doesn't support stickiness. This situation is a 1779 perfect example of how an external handler can help: 1780 <p> 1781 <ul> 1782 <li> A suitable dispatch mode isn't yet available in 1783 Crossroads, but can be easily coded in an external handler; 1784 <p> 1785 <li> The potential delay due to the calling of an external 1786 handler won't even be noticed. This is a network service where 1787 the connection time isn't critical; we'd expect only a few 1788 (albeit lengthy) TCP connections.</ul> 1789 <p> 1790 The approach to the solution of this problem uses several external 1791 program hooks: 1792 <p> 1793 <ul> 1794 <li> An external dispatcher handler will be responsible for 1795 suggesting a back end, given a client IP and given the current 1796 timestamp. This handler will consult an internal 1797 administration to see whether the stated IP address should 1798 re-use a back end, or to determine which back end is free for usage. 1799 <li> An external hook <code>onstart</code> will be responsible for 1800 updating the internal administration; i.e., to flag a back end 1801 as 'occupied'. 
<li> The external hooks <code>onfail</code> and <code>onend</code> will be
    responsible for flagging a back end as 'free' again; i.e., for
    erasing any previous information that states that the back end
    was occupied.</ul>
<p>
The Crossroads configuration is shown below. Only four Windows back
ends are shown. Each back end is configured on a
given IP address, port 3389, and is limited to one concurrent connection
(otherwise a new user might 'steal' a running desktop session).
<p>
<pre>
service rdp {
    port 3389;
    revivinginterval 5;

    /* rdp-helper dispatch IP STAMP ... will suggest a back end to use,
     * arguments are for all back ends: name, availability, weight */
    dispatchmode externalhandler
        /usr/local/src/crossroads/etc/rdp-helper dispatch %r %e
        %1b %1a %1w
        %2b %2a %2w
        %3b %3a %3w
        %4b %4a %4w;

    backend win1 {
        server 10.1.1.1:3389;
        maxconnections 1;
        /* rdp-helper start IP STAMP BACKEND will log the actual start
         * of a connection;
         * rdp-helper end IP will log the ending of a connection */
        onstart /usr/local/src/crossroads/etc/rdp-helper start %r %e %b;
        onend /usr/local/src/crossroads/etc/rdp-helper end %r;
        onfail /usr/local/src/crossroads/etc/rdp-helper end %r;
    }
    backend win2 {
        server 10.1.1.2:3389;
        maxconnections 1;
        onstart /usr/local/src/crossroads/etc/rdp-helper start %r %e %b;
        onend /usr/local/src/crossroads/etc/rdp-helper end %r;
        onfail /usr/local/src/crossroads/etc/rdp-helper end %r;
    }
    backend win3 {
        server 10.1.1.3:3389;
        maxconnections 1;
        weight 2;
        onstart /usr/local/src/crossroads/etc/rdp-helper start %r %e %b;
        onend /usr/local/src/crossroads/etc/rdp-helper end %r;
        onfail /usr/local/src/crossroads/etc/rdp-helper end %r;
    }
    backend win4 {
        server 10.1.1.4:3389;
        maxconnections 1;
        weight 3;
        onstart /usr/local/src/crossroads/etc/rdp-helper start %r %e %b;
        onend /usr/local/src/crossroads/etc/rdp-helper end %r;
        onfail /usr/local/src/crossroads/etc/rdp-helper end %r;
    }
}
</pre>

<p>
Depending on the dispatcher stage, the external handler <code>rdp-helper</code>
is invoked in different ways:
<p>
<dl>
<p><dt><strong>During dispatching</strong><dd> the helper is called to suggest a back
    end. The arguments are an action indicator <code>dispatch</code>, the
    client's IP address, the timestamp, and four triplets that
    represent back ends: per back end its name, its availability,
    and its weight. The purpose of the helper is to tell
    Crossroads which back end to use.
<p>
<p><dt><strong>During connection start</strong><dd> the helper will be invoked to
    inform it of the start of a connection, given a client IP
    address.
<p>
<p><dt><strong>When a connection terminates</strong><dd> the helper will be invoked
    to inform it that the connection has ended.</dl>
<p>
Here's the external handler as a Perl script. It uses the module
<code>GDBM_File</code> which most likely will not be part of standard Perl
distributions, but can be added using CPAN. (Alternatively, any other
database module can be used.)
<p>
<pre>
#!/usr/bin/perl

use strict;
use GDBM_File;

# Global variables and configuration
# ----------------------------------
my $log = '/tmp/exthandler.log';        # Debug log, set to /dev/null to suppress
my $cdb = '/tmp/client.db';             # GDBM database of clients
my %db;                                 # ..
and memory representation of it 1897 my $timeout = 24*60*60; # Timeout of a connection in secs 1898 1899 # Logging 1900 # ------- 1901 sub msg { 1902 return if ($log eq '/dev/null' or $log eq ''); 1903 open (my $of, ">>$log") or return; 1904 print $of (scalar(localtime()), ' ', @_); 1905 close ($of); 1906 } 1907 1908 # Reply a back end to the caller and stop processing. 1909 # --------------------------------------------------- 1910 sub reply ($) { 1911 my $b = shift; 1912 msg ("Suggesting $b to Crossroads.\n"); 1913 print ("$b\n"); 1914 exit (0); 1915 } 1916 1917 # Is a value in an array 1918 # ---------------------- 1919 sub inarray { 1920 my $val = shift; 1921 for my $other (@_) { 1922 return (1) if ($other eq $val); 1923 } 1924 return (0); 1925 } 1926 1927 # A connection is starting 1928 # ------------------------ 1929 sub start { 1930 my ($ip, $stamp, $backend) = @_; 1931 msg ("Logging START of connection for IP $ip on stamp $stamp, ", 1932 "back end $backend\n"); 1933 $db{$ip} = "$backend:$stamp"; 1934 } 1935 1936 # A connection has ended 1937 # ---------------------- 1938 sub end { 1939 my $ip = shift; 1940 msg ("Logging END of connection for IP $ip\n"); 1941 $db{$ip} = undef; 1942 } 1943 1944 # Request to determine a back end 1945 # ------------------------------- 1946 sub dispatch { 1947 my $ip = shift; 1948 my $stamp = shift; 1949 1950 msg ("Request to dispatch IP $ip on stamp $stamp\n"); 1951 1952 # Read the next arguments. They are triplets of 1953 # backend-name / availability / weight. Store if the back end is 1954 # available. 1955 my (@backends, @weights); 1956 for (my $i = 0; $i < $#_; $i += 3) { 1957 if ($_[$i + 1] != 0) { 1958 push (@backends, $_[$i]); 1959 push (@weights, $_[$i + 2]); 1960 msg ("Candidate back end: $_[$i] with weight ", $_[$i + 2], "\n"); 1961 } 1962 } 1963 1964 # See if this is a reconnect by a previously seen client IP. We'll 1965 # treat this as a reconnect if the timeout wasn't yet exceeded. 
1966 if ($db{$ip} ne '') { 1967 my ($last_backend, $last_stamp) = split (/:/, $db{$ip}); 1968 msg ("IP $ip had last connected on $last_stamp to $last_backend\n"); 1969 if ($stamp < $last_stamp + $timeout) { 1970 msg ("Timeout not yet exceeded, this may be a reconnect\n"); 1971 # We'll allow a reconnect only if the stated last_backend is 1972 # free (sanity check). 1973 if (inarray ($last_backend, @backends)) { 1974 msg ("Last back end $last_backend is available, ", 1975 "letting through\n"); 1976 reply ($last_backend); 1977 } else { 1978 msg ("Last used back end isn't free, suggesting a new one\n"); 1979 } 1980 } else { 1981 msg ("Timeout exceeded, suggesting a new back end\n"); 1982 } 1983 } else { 1984 msg ("Np preveious connection data, suggesting a new back end\n"); 1985 } 1986 1987 my $bestweight = -1; 1988 my $bestbackend; 1989 for (my $i = 0; $i <= $#weights; $i++) { 1990 if ($bestweight == -1 or $bestweight > $weights[$i]) { 1991 $bestweight = $weights[$i]; 1992 $bestbackend = $backends[$i]; 1993 } 1994 } 1995 1996 msg ("Best back end: $bestbackend (given weight $bestweight)\n"); 1997 reply ($bestbackend); 1998 } 1999 2000 # Main starts here 2001 # ---------------- 2002 msg ("Start of run, attaching GDBM database '$cdb'\n"); 2003 tie (%db, 'GDBM_File', $cdb, &GDBM_WRCREAT, 0600); 2004 2005 # The first argument must be an action 'dispatch', 'start' or 'end'. 2006 # Depending on the action, we do stuff. 2007 my $action = shift (@ARGV); 2008 if ($action eq 'dispatch') { 2009 dispatch (@ARGV); 2010 } elsif ($action eq 'start') { 2011 start (@ARGV); 2012 } elsif ($action eq 'end') { 2013 end (@ARGV); 2014 } else { 2015 print STDERR ("Usage: rdp-helper {dispatch|start|end} args\n"); 2016 exit (1); 2017 } 2018 </pre> 2019 2020 <p> 2021 <a name="l49"></a> 2022 <h3>5.3: HTTP Session Stickiness</h3> 2023 <p> 2024 This section focuses on HTTP session stickiness. 
This term refers to
the ability of a balancer to route a conversation between browser and
a backend farm always to the same back end. In other words: once a
back end is selected by the balancer, it will remain the back end of
choice, even for subsequent connections.
<p>
<a name="l50"></a>
<strong>5.3.1: Don't use stickiness!</strong>
<p>
The rule of thumb as far as the balancer is concerned, is: <strong>Do not
use HTTP session stickiness unless you really have to.</strong> Enabling
session stickiness hampers failover, balancing and performance:
<p>
<ul>
<li> Failover is hampered because during the session,
    the balancer has to assign new connections to the same back
    end that was selected at the start of a session. If the back
    end suddenly goes 'down', then the session will most likely
    crash. (Actually, when a back end becomes unreachable in the
    middle of a session, Crossroads will assign a new back end to
    that session. This will most likely result in a malfunction
    of the underlying application.)
<li> Balancing is hampered because at the start of the session,
    the balancer has selected the next-best back end. But during
    the session, that back end may well become overloaded. The
    balancer however must continue to send the requests there.
<li> Performance is hampered because crossroads needs to 'unpack'
    messages as they are passed to and fro. That's because
    crossroads needs to check the HTTP headers in the messages
    for persistence cookies.</ul>
<p>
There are a number of measures that you can take to avoid using session
stickiness. E.g., session data can be 'shared' between web back
ends. PHP offers functionality to store session data in a database, so
that all PHP applications have access to these data.
Application
servers such as Websphere can be configured to replicate session data
between nodes.
<p>
<a name="l51"></a>
<strong>5.3.2: But if you must..</strong>
<p>
However, if you <strong>must</strong> use session stickiness, then proceed as
follows:
<p>
<ul>
<li> At the level of a <code>service</code> description, set the type to
    <code>http</code>.
<li> At the level of each back end description, configure the
    <code>stickycookie</code> and <code>addclientheader</code> directives.</ul>
<p>
Once crossroads sees that, it will examine each HTTP message that it
shuttles between client and back end:
<p>
<ul>
<li> If there is no persistence cookie in the HTTP headers of a
    client's request, then the message must be the first one and
    a new session should be established.
    Crossroads selects an appropriate back
    end, sends the message to that back end, catches the reply,
    and inserts a <code>Set-Cookie</code> directive.
<li> If there is a persistence cookie in the HTTP headers of a
    client's request, then the request is part of an already
    established session. Crossroads analyzes the cookie and
    forwards the request to the appropriate back end.</ul>
<p>
Below is a short example of a configuration.
<p>
<pre>
service www {
    port 80;
    type http;
    revivinginterval 15;
    dispatchmode byconnections;

    backend one {
        server 10.1.1.100:80;
        stickycookie XRID=100;
        addclientheader "Set-Cookie: XRID=100; Path=/";
    }

    backend two {
        server 10.1.1.101:80;
        stickycookie XRID=101;
        addclientheader "Set-Cookie: XRID=101; Path=/";
    }
}
</pre>

<p>
Note how the cookie names and values in the directives
<code>stickycookie</code> and <code>addclientheader</code> match. That is obviously a
prerequisite for stickiness.
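<p>
The cookie check described above can be pictured as a simple lookup. The
Python sketch below is an illustration only and makes no claim about how
Crossroads parses headers internally; the function name
<code>backend_for</code> and the dictionary layout are hypothetical, with
the cookie values taken from the sample configuration.

```python
def backend_for(cookie_header, sticky):
    """cookie_header: value of the client's Cookie header, or None.
    sticky: dict mapping a 'name=value' cookie to a back end name.
    Returns the back end of an established session, else None."""
    if not cookie_header:
        return None
    for part in cookie_header.split(";"):
        backend = sticky.get(part.strip())
        if backend is not None:
            return backend
    return None


# Mirrors the stickycookie statements of back ends 'one' and 'two'.
sticky = {"XRID=100": "one", "XRID=101": "two"}

print(backend_for("lang=en; XRID=101", sticky))   # established session
print(backend_for("lang=en", sticky))             # no persistence cookie
```

<p>
A result of <code>None</code> corresponds to the first case in the list
above: no persistence cookie was found, so a back end has to be selected
by the regular dispatch mode.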
2116 <p> 2117 <a name="l52"></a> 2118 <h3>5.4: Passing the client's IP address</h3> 2119 <p> 2120 Since Crossroads just shuttles bytes to and fro, meta-information of 2121 network connections is lost. As far as the back ends are concerned, 2122 their connections originate at the Crossroads junction. 2123 For example, standard Apache access logs will show the IP address of 2124 Crossroads. 2125 <p> 2126 In order to compensate for this, Crossroads can insert a special 2127 header in HTTP connections, to inform the back end of the original 2128 client's IP address. In order to enable this, the Crossroads 2129 configuration must state the following: 2130 <p> 2131 <ul> 2132 <li> The service type must be <code>http</code>, and not <code>any</code>; 2133 <li> In the back end definition, the following statement must 2134 occur: <br> 2135 <code>addserverheader "X-Real-IP: %r";</code> <br> 2136 You are of course free to choose the header name; the here 2137 used <code>X-Real-IP</code> is a common name for this purpose.</ul> 2138 <p> 2139 After this, HTTP traffic that arrives at the back ends has a new 2140 header: <code>X-Real-IP</code>, holding the client's IP address. 2141 <strong>Note that</strong> once the type is set to <code>http</code>, Crossroads' 2142 performance will be hampered -- all passing messages will have to be 2143 unpacked and analyzed. 
<p>
<a name="l53"></a>
<strong>5.4.1: Sample Crossroads configuration</strong>
<p>
The below sample configuration shows two HTTP back ends that receive
the client's IP address:
<p>
<pre>

service www {
    port 80;
    type http;
    revivinginterval 5;
    dispatchmode roundrobin;

    backend one {
        server 10.1.1.100:80;
        addserverheader "X-Real-IP: %r";
    }

    backend two {
        server 10.1.1.200:80;
        addserverheader "X-Real-IP: %r";
    }
}
</pre>

<p>
<a name="l54"></a>
<strong>5.4.2: Sample Apache configuration</strong>
<p>
The method by which each back end analyzes the header <code>X-Real-IP</code>
will obviously differ per server implementation. However, a
common method with the Apache webserver is to log the client's IP
address into the access log.
<p>
Often this is accomplished using the log format <code>common</code>, defined as
follows:
<p>
<pre>
LogFormat "%h %l %u %t %D \"%r\" %>s %b" common
CustomLog logs/access_log common
</pre>

<p>
The first line defines the format <code>common</code>, with the remote host
specified by <code>%h</code>. The second line sends access information to a log
file <code>logs/access_log</code>, using the previously defined format
<code>common</code>.
<p>
Fortunately, Apache's <code>LogFormat</code> allows one to log contents of
headers. By replacing the <code>%h</code> with <code>%{X-Real-IP}i</code>, the desired
information is sent to the log.
Therefore, normally you can simply 2197 redefine the <code>common</code> format to 2198 <p> 2199 <pre> 2200 LogFormat "%{X-Real-IP}i %l %u %t %D \"%r\" %>s %b" common 2201 </pre> 2202 2203 <p> 2204 <a name="l55"></a> 2205 <h3>5.5: Debugging network traffic</h3> 2206 <p> 2207 In case the traffic between 2208 client and backend 2209 must be debugged, the statement <code>trafficlog</code> <em>filename</em> can 2210 be issued. This causes the traffic to be dumped in hexadecimal 2211 format to the stated filename. 2212 <p> 2213 Traffic sent by the client is prefixed by a <strong>C</strong>, traffic sent by 2214 the back end is prefixed by a <strong>B</strong>. Below is a sample traffic 2215 dump of a browser trying to get an HTML page. The server replies 2216 that the page was not modified. 2217 <p> 2218 <pre> 2219 C 0000 47 45 54 20 68 74 74 70 3a 2f 2f 77 77 77 2e 63 GET http://www.c 2220 C 0010 73 2e 68 65 6c 73 69 6e 6b 69 2e 66 69 2f 6c 69 s.helsinki.fi/li 2221 C 0020 6e 75 78 2f 6c 69 6e 75 78 2d 6b 65 72 6e 65 6c nux/linux-kernel 2222 C 0030 2f 32 30 30 31 2d 34 37 2f 30 34 31 37 2e 68 74 /2001-47/0417.ht 2223 C 0040 6d 6c 20 48 54 54 50 2f 31 2e 31 0d 0a 43 6f 6e ml HTTP/1.1..Con 2224 C 0050 6e 65 63 74 69 6f 6e 3a 20 63 6c 6f 73 65 0d 0a nection: close.. 2225 . 2226 . etcetera 2227 . 2228 B 0000 48 54 54 50 2f 31 2e 30 20 33 30 34 20 4e 6f 74 HTTP/1.0 304 Not 2229 B 0010 20 4d 6f 64 69 66 69 65 64 0d 0a 44 61 74 65 3a Modified..Date: 2230 B 0020 20 54 75 65 2c 20 31 32 20 4a 75 6c 20 32 30 30 Tue, 12 Jul 200 2231 B 0030 35 20 30 39 3a 34 39 3a 34 37 20 47 4d 54 0d 0a 5 09:49:47 GMT.. 2232 B 0040 43 6f 6e 74 65 6e 74 2d 54 79 70 65 3a 20 74 65 Content-Type: te 2233 B 0050 78 74 2f 68 74 6d 6c 3b 20 63 68 61 72 73 65 74 xt/html; charset 2234 . 2235 . etcetera 2236 . 2237 </pre> 2238 2239 <p> 2240 Turning on traffic dumps will <em>significantly</em> 2241 slow down crossroads. 
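<p>
A traffic dump in the format shown above can be turned back into the raw byte streams with a small script. The following Python sketch is not part of Crossroads; the function name <code>parse_traffic_dump</code> is hypothetical, and the layout it assumes (direction letter, hexadecimal offset, up to 16 hexadecimal byte values, then a printable column) is inferred from the sample dump only.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def parse_traffic_dump(lines):
    """Reassemble client (C) and back end (B) byte streams from dump lines."""
    streams = {"C": bytearray(), "B": bytearray()}
    for line in lines:
        fields = line.split()
        # Skip blank lines and elision markers; they do not start with C or B.
        if len(fields) < 3 or fields[0] not in streams:
            continue
        # The second field is expected to be the hexadecimal offset.
        if not all(ch in HEX_DIGITS for ch in fields[1]):
            continue
        # At most 16 byte values follow; the rest is the printable column.
        # Caveat: on a short last line, printable text that happens to look
        # like a two-digit hex value would be misread as data.
        for token in fields[2:18]:
            if len(token) != 2 or not all(ch in HEX_DIGITS for ch in token):
                break
            streams[fields[0]].append(int(token, 16))
    return {side: bytes(data) for side, data in streams.items()}
```

<p>
Calling <code>parse_traffic_dump(open(filename))</code> on a dump file returns a dictionary with the keys <code>C</code> and <code>B</code>, each holding the bytes sent by that side in the order in which they were logged.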
2242 <p> 2243 Besides <code>trafficlog</code>, there is also a directive 2244 <code>throughputlog</code>. This directive also takes one argument, a 2245 filename. The file is appended, and the following information is 2246 logged: 2247 <p> 2248 <ul> 2249 <li> The process ID of the crossroads image that serves the 2250 TCP connection; 2251 <li> The time of the request, in seconds and microseconds 2252 since start of the run; 2253 <li> A <strong>C</strong> when the request originated at the client, or 2254 <strong>B</strong> when the request originated at the back end; 2255 <li> The first 100 bytes of the request.</ul> 2256 <p> 2257 As an example, consider the following (the lines are shortened for 2258 brevity and prefixed by line numbers for clarity): 2259 <p> 2260 <pre> 2261 2262 1 0000594 0.000001 C GET http://public.e-tunity.com/index.html... 2263 2 0000594 0.173713 B HTTP/1.0 200 OK..Date: Fri, 18 Nov 2005 0... 2264 3 0000594 0.278125 B width="100" bgcolor="#e0e0e0" valign="to... 2265 4 0000595 0.000001 C GET http://public.e-tunity.com/css/style/... 2266 5 0000594 0.944339 B /a></td>.. </tr>.</table>.</td><td class... 2267 6 0000594 0.946356 B smallboxdownl">Download</td>.. <td class... 2268 7 0000594 0.961102 B td><td class="smallboxodd" valign="top"><... 2269 8 0000595 0.698215 B HTTP/1.0 304 Not Modified..Date: Fri, 18 ... 2270 </pre> 2271 2272 <p> 2273 This tells us that: 2274 <p> 2275 <ul> 2276 <li> Line 1: PID 594 served a request that originated at 2277 the client. The corresponding time is (almost) 0 seconds, 2278 so this is really the start of the run. 2279 <li> Line 2: A back end replied 0.17 seconds later, and 2280 0.28 seconds later, it was still replying (this is the 2281 third line, again a <strong>B</strong>-type transmission). 2282 <li> Line 4: PID 595 served a request that originated 2283 at the client. Again, the corresponding time is (almost) 2284 0 seconds, since this is the first conversation part of 2285 this connection. 
2286 <li> Lines 5 to 7: This is the continuation of line 2. Line 7 2287 is the last line of the <strong>B</strong> series (not visible from 2288 the example, but trust me, it is), so that we may 2289 conclude that it took the back end 0.96 seconds to serve 2290 the file <code>index.html</code> requested in line 1. 2291 <li> Line 8: This is the answer to the client's request of 2292 line 4 (you can tell by the process ID number). 2293 So the back end took 0.70 seconds to confirm that 2294 the stylesheet requested in line 4 wasn't modified.</ul> 2295 <p> 2296 It is also worthwhile remembering that the start time of a <strong>C</strong> 2297 request is the time that crossroads sees the activity. Any latency 2298 between the true client and crossroads is obviously not 2299 included. This is illustrated by the simple ASCII art below: 2300 <p> 2301 <pre> 2302 2303 client ---->---->---->--->*crossroads ====>====>====> 2304 \ 2305 back end 2306 / 2307 client ----<----<----<---< crossroads ====<====<====< 2308 2309 </pre> 2310 2311 <p> 2312 This simple picture shows a typical HTTP request that originates 2313 at a client, travels to crossroads, and is relayed via the back 2314 end. The <strong>C</strong> entry in a throughput log is the time when 2315 crossroads sees the request, indicated by an asterisk. The <strong>B</strong> 2316 entries are the times that it takes the back end to answer, 2317 indicated by <code>===</code> style lines. Therefore, the true roundtrip 2318 time will be longer than the number of seconds that are logged in 2319 the throughput log: the latency between client and crossroads 2320 isn't included in that measurement. 2321 <p> 2322 Summarizing, the throughput times of a client-back end connection 2323 can be analyzed using the directive <code>throughputlog</code>. In a 2324 real-world analysis, you'd probably want to write up a script to 2325 analyze the output and to compute round trip times. 
Such scripts 2326 are not (yet) included in Crossroads. 2327 <p> 2328 <a name="l56"></a> 2329 <h3>5.6: Limiting Access to Crossroads by Client IP Address</h3> 2330 <p> 2331 <a name="l57"></a> 2332 <strong>5.6.1: General Examples</strong> 2333 <p> 2334 The directives <code>allowfrom</code>, <code>denyfrom</code>, <code>allowfile</code> and 2335 <code>denyfile</code> can be used to instruct Crossroads to specifically allow 2336 access by using a "whitelist" of IP addresses, or to specifically deny 2337 access by using a "blacklist". E.g., the following configuration 2338 allows access to service <code>webproxy</code> only to <em>localhost</em>: 2339 <p> 2340 <pre> 2341 service webproxy { 2342 port 8000; 2343 allowfrom 127.0.0.1; 2344 backend one { 2345 . 2346 . Back end definitions occur here 2347 . 2348 } 2349 . 2350 . Other back ends or other service directives 2351 . may occur here 2352 . 2353 } 2354 </pre> 2355 2356 <p> 2357 In this example there is a "whitelist" having only one entry: IP 2358 address 127.0.0.1, or <em>localhost</em>. (Incidentally, the same behaviour 2359 could be accomplished by stating <em>bindto 127.0.0.1</em>, in which case 2360 Crossroads would only listen to the local network device.) 2361 <p> 2362 In the same vein, the directive <code>allowfrom 127.0.0.1 192.168.1/24</code> 2363 would allow access to <em>localhost</em> and to all IP addresses that start 2364 with 192.168.1. The specifier <code>192.168.1/24</code> states that there are 2365 three network bytes (192, 168 and 1), and 24 bits (or 3 bytes) are 2366 relevant; so that the fourth network byte doesn't matter. 2367 <p> 2368 <a name="l58"></a> 2369 <strong>5.6.2: Using External Files</strong> 2370 <p> 2371 The directives <code>allowfile</code> and <code>denyfile</code> allow you to specify IP 2372 addresses in external files. The Crossroads configuration states 2373 e.g. <code>allowfile /tmp/allow.txt</code>, and the IP addresses are then in 2374 <code>/tmp/allow.txt</code>. 
The format of <code>/tmp/allow.txt</code> is as follows: 2375 <p> 2376 <ul> 2377 <li> The specifications again follow the form <em>p.q.r.s/mask</em>, where 2378 p, q, r and s are network bytes which can be left out on the 2379 right hand side when the mask allows it; 2380 <p> 2381 <li> The specifications must be separated by white space 2382 (spaces, tabs or newlines).</ul> 2383 <p> 2384 E.g., the following is a valid example of an external specification 2385 file: 2386 <p> 2387 <pre> 2388 127.0.0.1 2389 192.168.1/24 2390 10/8 2391 </pre> 2392 2393 <p> 2394 When external files are in effect, then the signal <code>SIGHUP</code> (1) 2395 causes Crossroads to reload the external file. E.g., while Crossroads 2396 is running, you may edit <code>/tmp/allow.txt</code>, and then issue <code>killall 2397 -1 crossroads</code>. The new contents of <code>/tmp/allow.txt</code> will be 2398 reloaded. 2399 <p> 2400 <a name="l59"></a> 2401 <strong>5.6.3: Mixing Directives</strong> 2402 <p> 2403 Crossroads allows you to mix all directives in one service 2404 description. However, some mixes are less meaningful than others. It's 2405 up to you to take this into account. 2406 <p> 2407 The following rules apply: 2408 <p> 2409 <ul> 2410 <li> Blacklisting and whitelisting can be used together. When 2411 combined, the blacklist will always be interpreted 2412 first. E.g., consider the following directives: 2413 <p> 2414 <pre> 2415 allowfrom 192.168.1/24 2416 denyfrom 192.168.1.100 2417 </pre> 2418 2419 <p> 2420 Given the fact that the deny list is checked first, client 2421 192.168.1.100 won't be able to access Crossroads. Then the 2422 allow list will be checked, stating that all clients whose IP 2423 address starts with 192.168.1 may connect. The effect will be 2424 that e.g., client 192.168.1.1 may connect, 192.168.1.2 may 2425 connect too, 192.168.1.100 will be blocked, and 10.1.1.1 will 2426 be blocked as well. 
2427 <p> 2428 Now consider the following directives: 2429 <p> 2430 <pre> 2431 allowfrom 192.168.1.100 127.0.0.1 2432 denyfrom 192.168.1/24 2433 </pre> 2434 2435 <p> 2436 This will first of all deny access to all IP addresses that 2437 start with 192.168.1. So the rule that allows 192.168.1.100 2438 won't ever be effective. The net result will be that access 2439 will be granted to 127.0.0.1, and IP addresses that don't 2440 match 192.168.1/24. 2441 <p> 2442 <li> Blacklisting or whitelisting can be left out. 2443 A list is considered empty when no appropriate directives 2444 occur in <code>/etc/crossroads.conf</code>, or when the directive 2445 points to an empty or non-existent external file. 2446 <p> 2447 <li> Using <code>*from</code> and <code>*file</code> statements together is allowed, but 2448 doesn't make sense. E.g., the following configuration sample 2449 is such a case: 2450 <p> 2451 <pre> 2452 allowfrom 127.0.0.1 192.168.1/24 2453 allowfile /tmp/allow.txt 2454 </pre> 2455 2456 <p> 2457 There is a technical reason for this. Once Crossroads 2458 processes the <code>allowfile</code> directive, then the whole 2459 whitelist is cleared (thereby removing the entries 127.0.0.1 2460 and 192.168.1/24), and new entries are reloaded from the 2461 file. The net result is that the <code>allowfrom</code> specification 2462 is overruled. 2463 <p> 2464 Crossroads doesn't check for such configurations, which are 2465 syntactically correct, but make no semantic sense.</ul> 2466 <p> 2467 <a name="l60"></a> 2468 <h3>5.7: Configuration examples</h3> 2469 <p> 2470 As a general hint, use <code>crossroads sampleconf</code> to view the most 2471 up-to-date examples of configurations. The description below shows a 2472 few examples too. 
2473 <p> 2474 <a name="l61"></a> 2475 <strong>5.7.1: A load balancer for three webserver back ends</strong> 2476 <p> 2477 The following configuration example binds crossroads to port 8000 of the 2478 current server, and distributes the load over three back ends. This 2479 configuration shows most of the possible settings. 2480 <p> 2481 <pre> 2482 service www { 2483 /* We don't need session stickiness. */ 2484 type any; 2485 2486 /* Port on which we'll listen in this service: required. */ 2487 port 8000; 2488 2489 /* On which IP address should this service listen? Default is 'any'. 2490 * Alternatively you can state an explicit IP address, such as 2491 * 127.0.0.1; that would bind the service only to 'localhost'. */ 2492 bindto any; 2493 2494 /* Verbose reporting or not. Default is off. */ 2495 verbosity on; 2496 2497 /* Dispatching mode, or: How to select a back end for an incoming 2498 * request. Possible values: 2499 * roundrobin: just the next back end in line 2500 * random: like roundrobin, but at random to make things more 2501 * confusing. Probably only good for testing. 2502 * bysize: The backend that transferred the least nr of bytes 2503 * is the next in line. As a modifier you can say e.g. 2504 * bysize over 10, meaning that the 10 last connections will 2505 * be used to compute the transfer size, instead of all 2506 * transfers. 2507 * byduration: The backend that was active for the shortest time 2508 * is the next in line. As a modifier you can say e.g. 2509 * byduration over 10 to compute over the last 10 connections. 2510 * byconnections: The back end with the least active connections 2511 * is the next in line. 2512 * byorder: The first available back end is always taken. 2513 */ 2514 dispatchmode byduration over 5; 2515 2516 /* Interval at which we'll check whether a temporarily unavailable 2517 * backend has woken up. 2518 */ 2519 revivinginterval 5; 2520 2521 /* TCP backlog of connections. Default is 0 (no backlog, one 2522 * connection may be active). 
2523 */ 2524 backlog 5; 2525 2526 /* For status reporting: a shared memory key. Default is the same 2527 * as the port number, OR-ed by a magic number. 2528 */ 2529 shmkey 8000; 2530 2531 /* This controls when crossroads should consider a connection as 2532 * finished even when the TCP sockets weren't closed. This is to 2533 * avoid hanging connections that don't do anything. NOTE THAT when 2534 * crossroads cuts off a connection because the timeout is exceeded, this is 2535 * not marked as a failure, but as a success. Default is 0: no timeout. 2536 */ 2537 connectiontimeout 300; 2538 2539 /* The max number of allowed client connections. When present, connections 2540 * won't be accepted if the max is about to be exceeded. When 2541 * absent, all connections will be accepted, which might be misused 2542 * for a DoS attack. 2543 */ 2544 maxconnections 300; 2545 2546 /* Now let's define a couple of back ends. Number 1: */ 2547 backend www_backend_1 { 2548 /* The server and its port, the minimum configuration. */ 2549 server httpserver1; 2550 port 9010; 2551 /* The 'decay' of usage data of this back end. Only relevant 2552 * when the whole service has 'dispatchmode bysize' or 2553 * 'byduration'. The number is a percentage by which the usage 2554 * parameter is decreased upon each connection of another back 2555 * end. 2556 */ 2557 decay 10; 2558 2559 /* To see what's happening in /var/log/messages: */ 2560 verbosity on; 2561 } 2562 2563 /* The second one: */ 2564 backend www_backend_2 { 2565 /* Server and port */ 2566 server httpserver2; 2567 port 9011; 2568 2569 /* Verbosity of reporting when this back end is active */ 2570 verbosity on; 2571 2572 /* Decay */ 2573 decay 10; 2574 2575 /* This back end is twice as weak as the first one */ 2576 weight 2; 2577 2578 /* Event triggers for system commands upon successful activation 2579 * and upon failure. 
2580 */ 2581 onsuccess echo 'success on backend 2' | mail root; 2582 onfailure echo 'failure on backend 2' | mail root; 2583 } 2584 2585 /* And yet another one.. this time we will dump the traffic 2586 * to a trace file. Furthermore we don't want more than 10 concurrent 2587 * connections here. Note that there's also a total maxconnections for the 2588 * whole service. 2589 */ 2590 backend www_backend_3 { 2591 server httpserver3; 2592 verbosity on; 2593 port 9000; 2594 verbosity on; 2595 decay 10; 2596 trafficlog /tmp/backend.3.log; 2597 maxconnections 10; 2598 } 2599 } 2600 </pre> 2601 2602 <p> 2603 <a name="l62"></a> 2604 <strong>5.7.2: An HTTP forwarder when travelling</strong> 2605 <p> 2606 As another example, here's my <code>crossroads.conf</code> that I use on my 2607 Unix laptop. The problem that I face is that I need many HTTP proxy 2608 configurations (at home, at customers' sites and so on) but I'm too 2609 lazy to reconfigure browsers all the time. 2610 <p> 2611 Here's how it used to be before crossroads: 2612 <p> 2613 <ul> 2614 <li> At home, I would surf through a squid proxy on my local 2615 machine. The browser proxy setting is then 2616 <code>http://localhost:3128</code>. 2617 <p> 2618 <li> Sometimes I start up an SSH tunnel to our offices. The 2619 tunnel has a local port 3129, and connects to a squid proxy on 2620 our e-tunity server. Hence, the browser proxy is then 2621 <code>http://localhost:3129</code>. 2622 <p> 2623 <li> At a customer's location I need the proxy 2624 <code>http://10.120.34.113:8080</code>, because they have configured it 2625 so. 2626 <p> 2627 <li> And in yet other instances, I use a HTTP diagnostic tool 2628 <a href="http://www.xk72.com/charles">Charles</a> 2629 that sits between browser and website and shows me 2630 what's happening. I run charles on my own machine and it 2631 listens to port 8888, behaving like a proxy. 
The browser 2632 configuration for the proxy is then 2633 <code>http://localhost:8888</code>.</ul> 2634 <p> 2635 Here's how it works with a crossroads configuration: 2636 <p> 2637 <ul> 2638 <li> I have configured my browsers to use 2639 <code>http://localhost:8080</code> as the proxy. For all situations. 2640 <p> 2641 <li> I use the following crossroads configuration, and let 2642 crossroads figure out which proxy backend works, and which 2643 doesn't. Note two particularities: 2644 <p> 2645 <ul> 2646 <li> The statement <code>dispatchmode byorder</code>. This 2647 makes sure that once crossroads determines which 2648 backend works, it will stick to it. This usage of 2649 crossroads doesn't need to balance over more than one 2650 back end. 2651 <p> 2652 <li> The statement <code>bindto 127.0.0.1</code> makes sure 2653 that requests from other interfaces than loopback 2654 won't get serviced.</ul> 2655 <p> 2656 <pre> 2657 service HttpProxy { 2658 port 8080; 2659 bindto 127.0.0.1; 2660 verbosity on; 2661 dispatchmode byorder; 2662 revivinginterval 15; 2663 2664 backend Charles { 2665 server localhost:8888; 2666 verbosity on; 2667 } 2668 2669 backend CustomerProxy { 2670 server 10.120.34.113:8080; 2671 verbosity on; 2672 } 2673 2674 backend SshTunnel { 2675 server localhost:3129; 2676 } 2677 2678 backend LocalSquid { 2679 server localhost:3128; 2680 } 2681 } 2682 </pre> 2683 </ul> 2684 <p> 2685 As a final note, the commandline argument <code>tell</code> can be used to 2686 override Crossroads' own detection of back end 2687 availability. E.g., if in the above example the back ends <code>SshTunnel</code> 2688 and <code>LocalSquid</code> are both active, then <code>crossroads tell httpproxy 2689 sshtunnel down</code> will 'take down' the back end <code>SshTunnel</code> -- and 2690 will automatically cause crossroads to switch to <code>LocalSquid</code>. 
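<p>
The effect of <code>dispatchmode byorder</code> in this setup can be modelled in a few lines. The following Python sketch is a simplified model only, not the Crossroads implementation: the function name <code>pick_byorder</code> is hypothetical, and back end availability is represented by a plain set of names instead of real connection attempts.

```python
def pick_byorder(backends, available):
    """Return the first back end, in configuration order, that is available.

    backends  -- back end names in the order they appear in the configuration
    available -- set of names currently considered up (assumed input)
    """
    for name in backends:
        if name in available:
            return name
    # No back end is available at all.
    return None


# Back ends in the order of the configuration above.
ORDER = ["Charles", "CustomerProxy", "SshTunnel", "LocalSquid"]

# Situation from the text: only the SSH tunnel and the local squid are up.
up = {"SshTunnel", "LocalSquid"}
first_choice = pick_byorder(ORDER, up)

# Modelling the effect of taking SshTunnel down by hand.
up.discard("SshTunnel")
second_choice = pick_byorder(ORDER, up)
```

<p>
In this model <code>first_choice</code> is <code>SshTunnel</code> and <code>second_choice</code> is <code>LocalSquid</code>, which mirrors the switch described above.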
2691 <p> 2692 <a name="l63"></a> 2693 <strong>5.7.3: SSH login with enforced idle logout</strong> 2694 <p> 2695 The following example shows how crossroads 'throttles' SSH 2696 logins. Connections are accepted on port 2697 22 (the normal SSH port) and forwarded to the actual SSH daemon 2698 which is running on port 2222. 2699 <p> 2700 Note the usage of the 2701 <code>connectiontimeout</code> directive. This makes sure that users are logged 2702 out after 10 minutes of inactivity. Note also the <code>maxconnections</code> 2703 setting, this makes sure that no more than 10 concurrent logins occur. 2704 <p> 2705 <pre> 2706 service Ssh { 2707 port 22; 2708 backlog 5; 2709 maxconnections 10; 2710 connectiontimeout 600; 2711 backend TrueSshDaemon { 2712 server localhost:2222; 2713 } 2714 } 2715 </pre> 2716 2717 <p> 2718 <a name="l64"></a> 2719 <h2>6: Benchmarking</h2> 2720 <a name="benchmarking"></a>This section shows how crossroads affects the 2721 transmitting of HTML data when used as an intermediate 'station' 2722 through which all data travels. 2723 <p> 2724 <a name="l65"></a> 2725 <h3>6.1: Benchmark 1: Accessing a proxy via crossroads or directly</h3> 2726 <p> 2727 The benchmark was run on a system where the following was varied: 2728 <p> 2729 <ol> 2730 <li> A website was recursively spidered through a local squid 2731 proxy. The spidering was repeated 10 times, the total was recorded. 2732 <p> 2733 <li> Crossroads was placed in front of the squid proxy, and 2734 the website was again recursively spidered. 
Again, the 2735 spidering was repeated 10 times and the total was recorded.</ol> 2736 <p> 2737 The crossroads configuration of the second alternative is shown below: 2738 <p> 2739 <pre> 2740 service HttpProxy { 2741 port 8080; 2742 verbosity on; 2743 backend LocalSquid { 2744 server 127.0.0.1; 2745 port 3128; 2746 verbosity on; 2747 } 2748 } 2749 </pre> 2750 2751 <p> 2752 <a name="l66"></a> 2753 <strong>6.1.1: Results</strong> 2754 <p> 2755 The results of this test are that crossroads causes a negligible 2756 delay, if it is statistically relevant at all. Without crossroads, the 2757 timing results are: 2758 <p> 2759 <pre> 2760 real 0m8.146s 2761 user 0m0.130s 2762 sys 0m0.253s 2763 </pre> 2764 2765 <p> 2766 When using crossroads as a middle station, the results are: 2767 <p> 2768 <pre> 2769 real 0m9.481s 2770 user 0m0.141s 2771 sys 0m0.230s 2772 </pre> 2773 2774 <p> 2775 <a name="l67"></a> 2776 <strong>6.1.2: Discussion</strong> 2777 <p> 2778 The above shown results are quite favorable to crossroads. However, 2779 one should know that situations will exist where crossroads leans 2780 towards the 'worst case' scenario, causing up to 50% 2781 delay. 2782 <p> 2783 E.g., imagine a test where a <code>wget</code> command retrieves a 2784 HTML document from an Apache server on <code>localhost</code>. Now we have 2785 (almost) no overhead due to network throttling, hostname lookups and 2786 so on. When this test would be run either with or without crossroads 2787 in between, then theoretically, crossroads would cause a much larger 2788 delay, because it has to read from the server, and then write the same 2789 information to <code>wget</code>. Each read/write occurs twice when crossroads 2790 sits in between. 
2791 <p> 2792 This worst case scenario will however (fortunately) occur only very 2793 seldom in the real world: 2794 <p> 2795 <ul> 2796 <li> Normally network issues, such as the above mentioned host 2797 name lookups or throughput restrictions, will add 2798 significantly to the duration of a request. The 'twice as 2799 many' read/writes caused by crossroads are then relatively 2800 irrelevant. 2801 <p> 2802 <li> Normally a significant amount of time will be spent in a 2803 back end, due to processing (e.g., when calling a servlet on a 2804 back end). Again, this processing time will weigh much more heavily 2805 than the multiple read/writes.</ul> 2806 <p> 2807 <a name="l68"></a> 2808 <h3>6.2: Benchmark 2: Crossroads versus Linux Virtual Server (LVS)</h3> 2809 <p> 2810 LVS is a kernel-based balancer that acts like a masquerading 2811 firewall: TCP packets that arrive at the balancer are sent to one of 2812 the configured back ends. LVS has the advantage over crossroads that 2813 there is no stop-and-go in the transmission; in contrast, crossroads 2814 needs to send data via an internal buffer. Crossroads has the 2815 advantage that it offers instantaneous failover because it tries to 2816 contact the back end upon each new TCP connection; in contrast, 2817 LVS isn't aware of downtime of back ends (unless one implements an 2818 external heartbeat). Also, crossroads offers more complex balancing 2819 than LVS. 2820 <p> 2821 <a name="l69"></a> 2822 <strong>6.2.1: Environment</strong> 2823 <p> 2824 On the balancer, LVS was run on port 80, its forwarding set up for two 2825 equally weighted back ends, using <code>ipvsadm</code>: 2826 <p> 2827 <pre> 2828 ipvsadm -a -t 192.168.1.250:http -r 10.1.1.100:http -m -w 1 2829 ipvsadm -a -t 192.168.1.250:http -r 10.1.1.101:http -m -w 1 2830 </pre> 2831 2832 <p> 2833 Crossroads was run on port 81. 
The configuration file is shown below: 2834 <p> 2835 <pre> 2836 service http { 2837 port 81; 2838 dispatchmode roundrobin; 2839 revivinginterval 5; 2840 backend one { 2841 server 10.1.1.100; 2842 port 80; 2843 } 2844 backend two { 2845 server 10.1.1.101; 2846 port 80; 2847 } 2848 } 2849 </pre> 2850 2851 <p> 2852 <a name="l70"></a> 2853 <strong>6.2.2: Tests and results</strong> 2854 <p> 2855 In the first test, ports 80 and 81 on the balancer were 'bombed' with 2856 50 concurrent clients, each requesting a small page 50 times. The 2857 following timings were measured: 2858 <p> 2859 <ul> 2860 <li> How long it takes to establish a connection; 2861 <li> How long it takes to retrieve the page.</ul> 2862 <p> 2863 The results of this test were: 2864 <p> 2865 <ul> 2866 <li> On average, each client took 0.12 seconds to connect 2867 to LVS, and each page was retrieved in 0.14 seconds; 2868 <li> On average, each client took 0.11 seconds to connect to 2869 crossroads, and each page was retrieved in 0.13 seconds.</ul> 2870 <p> 2871 In this setup there seems to be no difference between the performance 2872 of LVS and crossroads! 2873 <p> 2874 In a second test, the size of the retrieved page was varied from 2,000 2875 to 2,000,000 bytes. This test was taken to see whether crossroads would 2876 show performance degradation when transferring larger amounts of data. 2877 <p> 2878 For each page size, 30 concurrent clients were started, each of which retrieved 2879 the page 50 times. Again, the connect times and processing times were 2880 recorded. 
2881 <p> 2882 The results of the total time (connect time + retrieval time) 2883 are shown in the table below: 2884 <p> 2885 <table> 2886 2887 <tr><td colspan=3><hr></td></tr> 2888 2889 2890 <tr> 2891 2892 <td> <strong>Bytes</strong></td> <td> <strong>LVS timing</strong></td> <td> <strong>Crossroads timing</strong></td> 2893 2894 </tr> 2895 2896 2897 <tr> 2898 2899 <td> 2000</td> <td> 0.130741688</td> <td> 0.12739582</td> 2900 2901 </tr> 2902 2903 2904 <tr> 2905 2906 <td> 20000</td> <td> 0.490916224</td> <td> 0.50376901</td> 2907 2908 </tr> 2909 2910 2911 <tr> 2912 2913 <td> 200000</td> <td> 3.799440328</td> <td> 4.33125273</td> 2914 2915 </tr> 2916 2917 2918 <tr> 2919 2920 <td> 2000000</td> <td> 45.25090855</td> <td> 45.9600728</td> 2921 2922 </tr> 2923 2924 <tr><td colspan=3><hr></td></tr> 2925 2926 </table> 2927 <p> 2928 Again, the results show that crossroads performs just as effectively 2929 as LVS, even with large data chunks! 2930 <p> 2931 <a name="l71"></a> 2932 <h2>7: Compiling and Installing</h2> 2933 <a name="compiling"></a><a name="l72"></a> 2934 <h3>7.1: Prerequisites</h3> 2935 <p> 2936 The creation of crossroads requires: 2937 <p> 2938 <ul> 2939 <li> Standard Unix tools, such as <code>sed</code>, <code>awk</code>, <code>Perl</code> 2940 (5.00 or better); 2941 <p> 2942 <li> A POSIX-compliant C compiler; 2943 <p> 2944 <li> Support for SYSV IPC, networking and so on. 2945 </ul> 2946 <p> 2947 Basically a Linux or Apple MacOSX box will do nicely. To compile and install 2948 crossroads, follow these steps. 2949 <p> 2950 <a name="l73"></a> 2951 <h3>7.2: Compiling and installing</h3> 2952 <p> 2953 <ul> 2954 <li> Obtain the source distribution. It can be found on 2955 <a href="http://crossroads.e-tunity.com">http://crossroads.e-tunity.com</a>. The distribution comes as an 2956 archive <code>crossroads-</code><em>type</em><code>.tar.gz</code>, where <em>type</em> is 2957 <code>stable</code> or <code>devel</code>. 
2958 <p> 2959 <li> Unpack the archive in a sources directory using <code>tar 2960 xzf crossroads-</code><em>X.YY</em><code>.tar.gz</code>. The contents spill into a 2961 subdirectory <code>crossroads-</code><em>X.YY/</em>. 2962 <p> 2963 <li> Change-dir into the directory. 2964 <p> 2965 <li> Next, edit <code>etc/Makefile.def</code> and verify that all 2966 compilation settings are to your likings. The settings are 2967 explained in the file. <strong>Note that</strong> the default distribution 2968 of <code>Makefile.def</code> is suited for Linux or Apple MacOSX 2969 systems. On other Unices, or on non-Unix systems, you must 2970 particularly pay attention to <code>SET_PROC_TITLE_BY...</code>. When 2971 in doubt, comment out all <code>SET_PROC_TITLE...</code> 2972 settings. Crossroads will work nevertheless, but it won't show 2973 nice titles in <code>ps</code> listings. Also there's a macro 2974 <code>EXTRA_LIBS</code> to add linkage flags (an example for a Solaris 2975 build is included). 2976 <p> 2977 <li> Now crossroads is ready for compilation. Do a <code>make 2978 local</code> followed by <code>make install</code>. The latter step may have 2979 to be done by the user <code>root</code> if the <code>BINDIR</code> setting of 2980 <code>etc/Makefile.def</code> points to a root-owned directory. 2981 <p> 2982 <li> The documentation doesn't install in this process. If you 2983 want to install the documentation, then proceed as follows: 2984 <p> 2985 <ul> 2986 <li> Optionally, <code>cp doc/crossroads.html</code> 2987 <em>htmldirectory/</em>; where <em>htmldirectory</em> is the destination 2988 directory for your HTML manuals; 2989 <p> 2990 <li> Optionally, <code>cp doc/crossroads.pdf</code> 2991 <em>pdfdirectory/</em>; where <em>pdfdirectory</em> is the 2992 destination directory for your PDF manuals; 2993 <p> 2994 <li> Optionally, <code>cp doc/crossroads.man</code> 2995 <em>manualdirectory</em><code>/crossroads.1</code>, where 2996 <em>manualdirectory</em> is e.g. 
<code>/usr/man/man1</code>, 2997 <code>/usr/share/man1</code>, <code>/usr/local/man/man1</code>, 2998 <code>/usr/local/share/man1</code>. Any possibility is valid, as 2999 long as <em>manualdirectory</em> is one of the directories 3000 where manual pages are stored; 3001 <p> 3002 <li> If your manual page system supports compressed 3003 manual pages, then you can save some space with 3004 <code>gzip</code> <em>manualdirectory</em><code>/crossroads.1</code>.</ul> 3005 <p> 3006 </ul> 3007 <p> 3008 <a name="l74"></a> 3009 <h3>7.3: Configuring crossroads</h3> 3010 <p> 3011 Now that the binary is available on your system, you need to create a 3012 suitable <code>/etc/crossroads.conf</code>. Use this manual or the output of 3013 <code>crossroads sampleconf</code> to get started. 3014 <p> 3015 Once you have the configuration ready, start crossroads with 3016 <code>crossroads start</code>. Test the availability of your services and back 3017 ends. Monitor how crossroads is doing with: 3018 <p> 3019 <ul> 3020 <li> In one terminal, run the script: 3021 <pre> 3022 while [ 1 ] ; do 3023 tput clear 3024 crossroads status 3025 sleep 3 3026 done 3027 </pre> 3028 3029 <p> 3030 <strong>Note</strong> that depending on your system you might need 3031 <code>sleep 3s</code>, i.e., with an <code>s</code> appended. 3032 <p> 3033 <li> In another terminal, run: 3034 <pre> 3035 while [ 1 ] ; do 3036 tput clear 3037 ps ax | grep crossroads | grep -v grep 3038 sleep 3 3039 done 3040 </pre> 3041 3042 <p> 3043 <strong>Note</strong> that depending on your system you might need 3044 <code>ps -ef</code> instead of <code>ps ax</code>. 3045 <p> 3046 <li> In yet another terminal, run <code>tail -f 3047 /var/log/messages</code> (supply the appropriate system log file if 3048 <code>/var/log/messages</code> doesn't work for you).</ul> 3049 <p> 3050 Now thoroughly test the availability of your back ends through 3051 crossroads. 
The status display will show an updated view of which back 3052 ends are selected and how busy they are. The process list will show 3053 which crossroads daemons are running. Finally, the tailing of 3054 <code>/var/log/messages</code> shows what's going on -- especially if you have 3055 <code>verbosity on</code> statements in the configuration. 3056 <p> 3057 <a name="l75"></a> 3058 <h3>7.4: A boot script</h3> 3059 <p> 3060 Finally, you may want to create a boot-time startup script. The exact 3061 procedure depends on the Unix flavor used. 3062 <p> 3063 <a name="l76"></a> 3064 <strong>7.4.1: SysV Style Startup</strong> 3065 <p> 3066 On SysV style systems, there's a startup script directory 3067 <code>/etc/init.d</code> where bootscripts for all utilities are located. 3068 You may have the <code>chkconfig</code> utility to automate the task of 3069 inserting scripts into the boot sequence, but 3070 otherwise the steps will resemble the following. 3071 <p> 3072 <ul> 3073 <li> Create a script <code>crossroads</code> in <code>/etc/init.d</code> similar to the 3074 following: 3075 <p> 3076 <pre> 3077 #!/bin/sh 3078 /usr/local/bin/crossroads -v "$@" 3079 </pre> 3080 3081 <p> 3082 The stated directory <code>/usr/local/bin</code> must correspond with 3083 the installation path. The flag <code>-v</code> causes the startup to 3084 be more 'verbose'. However, once daemonized, the verbosity is 3085 controlled by the appropriate statements in the configuration. 3086 <p> 3087 <li> Determine your 'runlevel': usually 3 when your system is 3088 running in text-mode only, or 5 when you are using a graphical 3089 interface. If your runlevel is 3, then: 3090 <p> 3091 <pre> 3092 root> cd /etc/rc.d/rc3.d 3093 root> ln -s /etc/init.d/crossroads S99crossroads 3094 root> ln -s /etc/init.d/crossroads K99crossroads 3095 </pre> 3096 3097 <p> 3098 This creates startup (<code>S*</code>) and stop (<code>K*</code>) links that 3099 will be run when the system enters or leaves a given runlevel. 
3100 <p> 3101 If your runlevel is 5, then the right <code>cd</code> command is to 3102 <code>/etc/rc.d/rc5.d</code>. Alternatively, you can create the 3103 symlinks in both runlevel directories.</ul> 3104 <p> 3105 <a name="l77"></a> 3106 <strong>7.4.2: BSD Style Startup</strong> 3107 <p> 3108 On BSD style systems, daemons are booted directly from <code>/etc/rc</code> and 3109 related scripts. In case you have a file <code>/etc/rc.local</code>, edit it, 3110 and add the statement: 3111 <p> 3112 <pre> 3113 /usr/local/bin/crossroads start 3114 </pre> 3115 3116 <p> 3117 If your BSD system lacks <code>/etc/rc.local</code>, then you may need to start 3118 Crossroads from <code>/etc/rc</code>. Your mileage may vary. 3119 <p> 3120 </body> 3121 </html>