Sambar Server Documentation

Server Logging


By default, the Sambar Server logs all requests to an access.log file so that server activity can be tracked. Several standard file formats can be specified for the file layout, and the server includes a fairly detailed Log Analysis package that produces daily, weekly or monthly reports. The Apache custom log format can be used as well (in this case, the Log Analysis package cannot be used). In addition to the standard file logging, a SQL Logger can be enabled to log hits to a SQL server; this functionality uses the Apache custom log format to build the SQL insert statement and is detailed below.

Log File Formats
The Sambar Server supports three log file formats: common, combined and performance. The common and combined log formats are the ones widely used by the NCSA and Apache servers; the combined format is described below as the extended common log format. The performance log format is a derivative of the combined log format that adds page delivery performance measurements.



Common Log Format


The following is a sample common log format file:

sandbox.sambar.com - - [09/Sep/1997:10:42:45 -0800] "GET / HTTP/1.0" 200 1234
sandbox.sambar.com - - [09/Sep/1997:10:43:22 -0800] "GET /docs/index.htm HTTP/1.0" 304 0
sandbox.sambar.com - admin [09/Sep/1997:10:46:12 -0800] "GET /sysadmin/index.stm HTTP/1.0" 200 0
207.86.139.145 - - [09/Sep/1997:10:47:43 -0800] "GET /wwwping/index.htm HTTP/1.0" 200 954
207.86.139.145 - - [01/Jan/1997:13:06:51 -0600] "GET /session/wwwping HTTP/1.0" 200 0

The common log file format has the following fields:

remotehost Remote hostname or IP address number if DNS is not enabled/available.
rfc931 The remote login name of the user. (This is not implemented by the Sambar Server).
authuser The username of the authenticated user. This is available when using password-protected WWW pages. Important! The username in this field is the username passed by the client. Applications attempting to hack your server will sometimes pass invalid credentials; these are written to the server log file, so a name in this field does not necessarily mean the user was a valid, logged-in user on the system. This is how usernames of users who do not exist on your server can appear in this field.
[date] Date and time of the request.
"request" The HTTP request line as it came from the client.
status The HTTP response code returned to the client. Indicates whether or not the file was successfully retrieved, and if not, what error message was returned.
bytes The number of bytes transferred. If the status is 200 and bytes are 0, the dynamic page size could not be determined.
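
As an illustration of the field layout above, the following Python sketch splits one common log format line into its fields. The regular expression and the names CLF_PATTERN and parse_clf_line are assumptions made for this example and are not part of the Sambar Server. The date parsing assumes an English locale for the month abbreviation.

```python
import re
from datetime import datetime

# One Common Log Format line, in the field order listed above.
# Accepting "-" for bytes is an assumption borrowed from other servers.
CLF_PATTERN = re.compile(
    r'^(?P<remotehost>\S+) (?P<rfc931>\S+) (?P<authuser>\S+) '
    r'\[(?P<date>[^\]]+)\] "(?P<request>.*)" (?P<status>\d{3}) (?P<bytes>\d+|-)$'
)

def parse_clf_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = CLF_PATTERN.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Example timestamp: 09/Sep/1997:10:46:12 -0800
    fields["date"] = datetime.strptime(fields["date"], "%d/%b/%Y:%H:%M:%S %z")
    fields["status"] = int(fields["status"])
    fields["bytes"] = 0 if fields["bytes"] == "-" else int(fields["bytes"])
    return fields

sample = ('sandbox.sambar.com - admin [09/Sep/1997:10:46:12 -0800] '
          '"GET /sysadmin/index.stm HTTP/1.0" 200 0')
entry = parse_clf_line(sample)
print(entry["authuser"], entry["status"], entry["bytes"])
```

Because authuser holds whatever the client sent, a script like this should treat that field as untrusted input rather than as proof of a successful login.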



Extended Common Log Format


The extended common log format is a variant of the common log format; it adds two fields to the end of the log line, the referer and the user agent. The following is a typical log line:

sandbox.sambar.com - - [09/Sep/1997:10:42:45 -0800] "GET / HTTP/1.0" 200 1234 "http://www.skyweb.se/sambar/" "Mozilla/4.0 (Win95; I)"

The extended common log file format has the following fields:

remotehost Remote hostname or IP address number if DNS is not enabled/available.
rfc931 The remote login name of the user. (This is not implemented by the Sambar Server).
authuser The username of the authenticated user. This is available when using password protected WWW pages.
[date] Date and time of the request.
"request" The HTTP request line as it came from the client.
status The HTTP response code returned to the client. Indicates whether or not the file was successfully retrieved, and if not, what error message was returned.
bytes The number of bytes transferred. If the status is 200 and bytes are 0, the dynamic page size could not be determined.
"referer" The URL the client was on before requesting this URL.
"agent" The browser the client is using.



Performance Log Format


The performance log format is specific to the Sambar Server. It is a variant of the extended common log format; the time in milliseconds that it took to respond to the client's request is inserted into the line after the bytes field. The following is a typical log line:

sandbox.sambar.com - - [09/Sep/1997:10:42:45 -0800] "GET / HTTP/1.0" 200 1234 75 "http://www.skyweb.se/sambar/" "Mozilla/4.0 (Win95; I)"

The performance log file format has the following fields:

remotehost Remote hostname or IP address number if DNS is not enabled/available.
rfc931 The remote login name of the user. (This is not implemented by the Sambar Server).
authuser The username of the authenticated user. This is available when using password protected WWW pages.
[date] Date and time of the request.
"request" The HTTP request line as it came from the client.
status The HTTP response code returned to the client. Indicates whether or not the file was successfully retrieved, and if not, what error message was returned.
bytes The number of bytes transferred. If the status is 200 and bytes are 0, the dynamic page size could not be determined.
msec The time in milliseconds that it took the server to respond to the request.
"referer" The URL the client was on before requesting this URL.
"agent" The browser the client is using.

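The msec field makes it possible to rank requests by response time. The sketch below is one way to do that in Python, assuming the field order listed above. The names PERF_PATTERN and slowest_requests are hypothetical helpers written for this example, not part of the Sambar Server or its Log Analysis package.

```python
import re

# Field order follows the performance log format: msec sits between bytes and referer.
PERF_PATTERN = re.compile(
    r'^(?P<remotehost>\S+) (?P<rfc931>\S+) (?P<authuser>\S+) '
    r'\[(?P<date>[^\]]+)\] "(?P<request>.*?)" (?P<status>\d{3}) '
    r'(?P<bytes>\d+) (?P<msec>\d+) "(?P<referer>.*?)" "(?P<agent>.*)"$'
)

def slowest_requests(lines, limit=5):
    """Return up to `limit` (msec, request) pairs, slowest first; unmatched lines are skipped."""
    timings = []
    for line in lines:
        match = PERF_PATTERN.match(line.strip())
        if match:
            timings.append((int(match.group("msec")), match.group("request")))
    return sorted(timings, reverse=True)[:limit]

sample = ('sandbox.sambar.com - - [09/Sep/1997:10:42:45 -0800] "GET / HTTP/1.0" '
          '200 1234 75 "http://www.skyweb.se/sambar/" "Mozilla/4.0 (Win95; I)"')
print(slowest_requests([sample]))  # [(75, 'GET / HTTP/1.0')]
```
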
HTTP Error Codes

The Sambar Server logs the following HTTP error codes:



SQL Logger
To configure the server to use a SQL Logger, you must set three parameters in config/config.ini:

SQL Logger = true
SQL Log Cache = dbcache-name
SQL Log Query = insert into rawlog (host, login, tstamp, vhost, referer, browser, bytes, status, request) VALUES ('%h', '%u', '%t', '%v', '%{Referer}i', '%{User-Agent}i', %B, '%s', '%r')

The SQL Log Cache must be a SQL datasource configured using the database cache configuration, and the database engine must be enabled in order to use the cache it names. The SQL Log Query string uses the custom log formatting (outlined below) to build the SQL insert statement; the query above is a sample insert string. Single quotes in all values inserted into the log table are escaped to ensure the insert statement is not malformed.
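
To make the substitution and escaping step concrete, here is a minimal Python sketch of how a query template could be expanded. It is not the server's implementation: the helpers escape_sql_value and expand_query are hypothetical, and doubling the single quote (the standard SQL convention) is an assumption, since the exact escaping the server applies is not documented here.

```python
import re

# Matches %{Name}x style directives as well as single-character ones such as %h or %B.
DIRECTIVE = re.compile(r"%\{[^}]+\}[a-zA-Z$]|%[a-zA-Z0-9]")

def escape_sql_value(value):
    """Double each single quote so the value cannot end the SQL string literal (assumed rule)."""
    return str(value).replace("'", "''")

def expand_query(template, values):
    """Replace every directive in `template` with its escaped value; unknown ones become empty."""
    def substitute(match):
        return escape_sql_value(values.get(match.group(0), ""))
    return DIRECTIVE.sub(substitute, template)

template = "insert into rawlog (host, request) VALUES ('%h', '%r')"
values = {"%h": "sandbox.sambar.com", "%r": "GET /o'brien.htm HTTP/1.0"}
print(expand_query(template, values))
```

Without the escaping, the quote in the request path would terminate the string literal early and produce a malformed statement.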

Custom Log Format

The format argument to the SQL Log Query or Custom Log Format directives is a string. This string is written to the log file (or SQL server) for each request. It can contain literal characters, which are copied into the log unchanged, and the C-style control characters "\n" and "\t" to represent newlines and tabs. Literal quotes and backslashes should be escaped with backslashes.

The characteristics of the request itself are logged by placing "%" directives in the format string, which are replaced in the log file by the values as follows:

%a:          Remote IP-address
%A:          Local IP-address
%b:          Bytes sent.
%B:          Bytes sent, excluding HTTP headers.
%c:          Connection status when response is completed.
                'X' = connection aborted before the response completed.
                '+' = connection may be kept alive after the response is sent.
                '-' = connection will be closed after the response is sent.
%D:          The day of the request (dd).
%{FOOBAR}e:  The contents of the environment variable FOOBAR
%{FOOBAR}$:  The contents of the RC$ or RCS variable FOOBAR
%f:          Filename
%h:          Remote host
%H:          The request protocol
%{Foobar}i:  The contents of Foobar: header line(s) in the request
             sent to the server.
%I:          Log the server thread ID.
%l:          Remote logname (from identd, if supplied)
%m:          The request method
%M:          The month of the request (mm).
%N:          The username the user provided in authentication.
%p:          The canonical Port of the server serving the request
%P:          The password the user provided in authentication (plain text).
%1:          The password the user provided in authentication (sacrypt encrypted).
%2:          The password the user provided in authentication (MD5 encrypted).
%3:          The password the user provided in authentication (Unix crypt encrypted).
%q:          The query string (prepended with a ? if a query string exists,
             otherwise an empty string)
%r:          First line of request (i.e. GET /foo.htm HTTP/1.0)
%s:          Status.  For requests that got internally redirected, this is
             the status of the *original* request.
%t:          Time, in common log format time format (standard English format)
%{format}t:  The time, in the form given by format, which should
             be in strftime(3) format. (potentially localized)
%T:          The time taken to serve the request, in milliseconds.
%u:          Remote user (from auth; may be bogus if return status (%s) is 401)
%U:          The URL path requested, not including any query string.
%v:          The canonical ServerName of the server serving the request.
%V:          The server name according to the UseCanonicalName setting.
%Y:          The year of the request (yyyy).

Important! Unlike the Apache custom log format, there is no ability to attach "conditions" for inclusion of the item.

Note that there is no escaping performed on the strings from %...r, %...i and %...o. This is mainly to comply with the requirements of the Common Log Format. This implies that clients can insert control characters into the log, so care should be taken when dealing with raw log files.
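
One way to take that care when viewing or post-processing raw log files is to make control characters visible before the line reaches a terminal or another tool. The Python sketch below shows the idea; the function name neutralize_control_chars is a hypothetical helper for this example, and which characters to neutralize is a choice made here, not something the server prescribes.

```python
def neutralize_control_chars(line):
    """Strip the line ending, then show ASCII control characters as visible hex escapes."""
    body = line.rstrip("\r\n")
    return "".join(
        # Characters below space (including tab) and DEL are rewritten as \xNN.
        ch if ch >= " " and ch != "\x7f" else "\\x{:02x}".format(ord(ch))
        for ch in body
    )

# A request line carrying a terminal escape sequence sent by the client.
raw = 'GET /\x1b[2J HTTP/1.0\n'
print(neutralize_control_chars(raw))
```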

Some commonly used log format strings are:

Common Log Format (CLF)
"%h %l %u %t \"%r\" %s %b"
Common Log Format with Virtual Host
"%v %h %l %u %t \"%r\" %s %b"
NCSA extended/combined log format
"%h %l %u %t \"%r\" %s %b \"%{Referer}i\" \"%{User-agent}i\""
Referer log format
"%{Referer}i -> %U"
Agent (Browser) log format
"%{User-agent}i"

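When a custom format is in use, a reader of the log needs a matching parser. The Python sketch below derives a regular expression from a format string for a small subset of the directives. It is an illustration only: format_to_regex and DIRECTIVE_NAMES are hypothetical names, the format string is given as the server would see it after backslash unescaping, %t is assumed to be written with surrounding square brackets as in the sample log lines, and a directive may appear only once per format.

```python
import re

# Group names for a handful of directives; illustrative, not exhaustive.
DIRECTIVE_NAMES = {"h": "host", "l": "logname", "u": "user", "s": "status",
                   "b": "bytes", "v": "vhost", "U": "path"}

TOKEN = re.compile(r"%\{(?P<header>[^}]+)\}i|%(?P<letter>[a-zA-Z])")

def format_to_regex(fmt):
    """Build a compiled regex that matches lines written with the given format string."""
    parts, pos = [], 0
    for tok in TOKEN.finditer(fmt):
        parts.append(re.escape(fmt[pos:tok.start()]))  # literal text between directives
        letter = tok.group("letter")
        if tok.group("header"):
            name = "hdr_" + re.sub(r"\W", "_", tok.group("header").lower())
            parts.append('(?P<%s>[^"]*)' % name)
        elif letter == "t":
            parts.append(r"(?P<time>\[[^\]]+\])")
        elif letter == "r":
            parts.append(r'(?P<request>[^"]*)')
        else:
            name = DIRECTIVE_NAMES.get(letter, "f_" + letter)
            parts.append(r"(?P<%s>\S+)" % name)
        pos = tok.end()
    parts.append(re.escape(fmt[pos:]))
    return re.compile("^" + "".join(parts) + "$")

clf = format_to_regex('%h %l %u %t "%r" %s %b')
line = 'sandbox.sambar.com - - [09/Sep/1997:10:42:45 -0800] "GET / HTTP/1.0" 200 1234'
print(clf.match(line).groupdict())
```
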
© 1998-2001 Sambar Technologies. All rights reserved.