Summary
Log File Formats

Summary works with logs from the following servers (and many others):

Apache
When set to produce NCSA Common or NCSA Combined format logs. Other log formats Apache might produce are supported if you configure Summary appropriately.
Apple Personal Web Server
AppleShare IP
Version 5.0 or higher.
Boulevard
Demon Internet
FileMaker Pro
Prior to Version 5 only brief format is supported.
iPlanet Web Server
Lotus Domino
When set to NCSA Common format or Extended Log File format only.
MacHTTP
Mac OS X Server
Microsoft IIS
Microsoft Personal Web Server for Macintosh
NCSA httpd
Netscape web servers
Including FastTrack, Commerce, and Enterprise.
O'Reilly WebSite Professional
OS X builtin web server
QuidProQuo
Rumpus
Version 1.2 or higher; earlier versions work with some limitations.
Sun Java System Web Server
Web Server 4D
WebSTAR
Version 1.2.1 or higher.
WebTen
See the Apache comments.
WN
WU-FTP
Zeus
When set to produce NCSA Common or NCSA Combined format logs.
Others
Many others will work automatically. If they don't, see the user specified log formats section.

Summary will automatically recognize the following log formats:

Demon Internet native log format
MacHTTP
Microsoft IIS native format, version 3 or newer
NCSA Common, also called Common Log File Format
NCSA Combined, also called NCSA Extended
Netscape, also used by iPlanet and Sun Java System Web Server
O'Reilly WebSite Professional native format
WebSTAR, also used by several others: QuidProQuo, Web Server 4D, etc.
W3C Extended Log Format, also called ExLF. Support for the official, Microsoft, and WebSTAR variants.
WN Verbose
WU-FTP xferlog log format

Many servers can be configured to produce log files in several different formats. Additionally, you can sometimes configure which specific log fields appear. There is often a trade-off between large log files with lots of information and smaller log files that don't tell you everything but actually fit on your hard disk. The following comments will help you choose between the different options, and tell you how to best configure your server for use with Summary.

Comments on specific servers and log formats


AppleShare IP

AppleShare IP, version 5.0 and higher, writes a log file in WebSTAR format. The log file is stored in System Folder, Preferences, AppleshareIP Preferences, HTTP Logs, HTTP Log. The log format is not configurable, and AppleShare IP does not log referrer or agent information, so reports that depend on those fields will be blank.


Apache

Apache has a highly configurable log format which, unfortunately, is not self-documenting. That means that Summary cannot automatically determine the format of an Apache log file unless it is in one of the standard formats. The two most common formats used with Apache are NCSA Common and NCSA Combined. I highly recommend that you configure Apache to produce NCSA Combined format logs. The Apache configuration command for this format is:

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" combined
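To make the field layout concrete, here is a minimal Python sketch that splits one line written with the LogFormat above into named fields. It only illustrates the format and is not how Summary itself parses logs; the field names, the sample line, and the parse_combined helper are assumptions made for this example.

```python
import re

# One NCSA Combined line: host, ident, user, [time], "request", status, bytes,
# then the optional quoted referer and agent. Group names are illustrative.
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?$'
)

def parse_combined(line):
    """Return a dict of fields, or None if the line does not match."""
    m = COMBINED.match(line.strip())
    return m.groupdict() if m else None

# Hypothetical sample entry, not taken from a real server.
sample = ('192.0.2.1 - - [10/Oct/2000:13:55:36 -0700] '
          '"GET /index.html HTTP/1.0" 200 2326 '
          '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98)"')
entry = parse_combined(sample)
print(entry["host"], entry["status"], entry["referer"])
```

Because the referer and agent group is optional, the same pattern also accepts NCSA Common lines, in which case those two fields come back as None.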

When running multiple virtual domains with a single copy of Apache it is common practice to log each virtual domain into its own log file. This is important because NCSA Combined format logs don't contain the virtual domain name. It is impossible to determine which log entry goes with which virtual domain when using a single log file in NCSA Combined format with entries from multiple virtual domains. You can reconfigure Apache to put the virtual domain name in each log entry, but that makes the configuration process much more complex.
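If you do end up with a single log that carries the virtual domain name, one option is to split it into per-domain groups before analysis. The Python sketch below assumes the domain name was written as the first field of each entry (for instance by placing Apache's %v directive in front of the combined format); that layout, the sample lines, and the split_by_domain helper are assumptions for illustration only.

```python
from collections import defaultdict

def split_by_domain(lines):
    """Group log lines by their first field, assumed to be the virtual domain."""
    per_domain = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Everything before the first space is taken as the domain name;
        # the remainder is an ordinary NCSA Combined entry.
        domain, _, rest = line.partition(" ")
        per_domain[domain.lower()].append(rest)
    return dict(per_domain)

# Hypothetical entries from two virtual domains sharing one log file.
sample = [
    'www.example.com 192.0.2.1 - - [10/Oct/2000:13:55:36 -0700] "GET / HTTP/1.0" 200 2326 "-" "-"',
    'shop.example.com 192.0.2.7 - - [10/Oct/2000:13:55:40 -0700] "GET /cart HTTP/1.0" 200 512 "-" "-"',
    'www.example.com 192.0.2.9 - - [10/Oct/2000:13:56:02 -0700] "GET /about.html HTTP/1.0" 404 310 "-" "-"',
]
groups = split_by_domain(sample)
for domain, entries in sorted(groups.items()):
    print(domain, len(entries))
```

Each group could then be written to its own file and processed as a normal NCSA Combined log.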


FileMaker Pro

Version 5 or newer produces logs in a standard format that Summary will recognize automatically. With older versions of FileMaker Pro, Summary can only read the web logs if they are in brief format. To get Summary to read brief format, you will need to specify a "User log format definition" (entered on the Miscellaneous configuration page) of:

DATE-MDY TIME-12 HOST URI


Log-FM

Log-FM is not officially supported but you might want to try the following "User log format definition" (entered on the Miscellaneous configuration page):

DATE-MDY TIME-24 SKIP HOST URI BYTES SKIP AGENT METHOD SKIP SKIP REFERER TRAN-TIME-TICKS SKIP SKIP STATUS DOMAIN

Note that this must all be entered on a single line even though it probably appears as two lines above.


Microsoft IIS

IIS Versions 4.x and newer support Common Log Format, Microsoft Extended Format, and W3C Extended Format (ExLF). We recommend using the W3C ExLF format since that allows you to customize the tokens appearing in the log file. See the discussion of W3C ExLF for more information about the various tokens.

IIS Version 3.x supports Common Log Format and Microsoft Extended Format. Neither one provides referrer or agent information. Summary supports FlashLog Format, using a server plugin from Maximized Software (http://www.maximized.com/products/flashstats/flashlog.htm) which adds the referrer and agent information to the end of the line. Summary also supports the WebTrends modified format (http://www.webtrends.com/) if cookie logging is enabled.


Microsoft Personal Web Server for Macintosh

Microsoft Personal Web Server for Macintosh produces logs compatible with WebSTAR format containing the following log items:

DATE TIME CS-IP HOSTNAME URL RESULT USER BYTES_SENT

Older versions are not completely compatible with WebSTAR format. If your log can't be read, try the following "User log format definition" (entered on the Miscellaneous configuration page):

SKIP DAY MONTH-NAME YEAR TIME-24 SKIP HOST HOST URI STATUS USER BYTES

Note that this must all be entered on a single line even though it possibly appears as two lines above.


NCSA Common
Common Log File Format

This is a very popular format, supported by many servers. Unfortunately it does not provide referrer, agent, transfer time, domain name, or cookie information, which disables many reports. If possible you should reconfigure your server to use NCSA Combined format.


NCSA Combined
NCSA Extended

This is a very popular format, supported by many servers. It does contain all of the most important fields. It does not provide transfer time, domain name, or cookie information, which disables some reports.

The Apache and WebTen command to get NCSA Combined logs is:

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" combined


QuidProQuo format

QuidProQuo supports both NCSA Common Log Format and its own native QuidProQuo format. We recommend that you use QuidProQuo format. There is an error in their Common Log Format support that makes it not conform to the specification, at least through version 2.1.2. Use this custom format string to parse QuidProQuo logs in Common Log Format:

HOST SKIP SKIP USER SKIP DATE-CLF FULL-REQUEST STATUS BYTES

QuidProQuo native format is very similar to WebSTAR format. Summary fully supports QuidProQuo format. See the WebSTAR format description below for additional comments.


OS X builtin web server

OS X uses Apache as its standard web server. The log files are located in "/private/var/log/httpd". This folder is normally hidden from view. You can access it from the Finder when logged in as root, and create an alias to it, or you can create a symbolic link from the command line. Otherwise all of the Apache comments apply.


WebSTAR format

The WebSTAR server's log format is highly configurable. WebSTAR supports using Common Log Format (CLF) logs, Extended Log Format (ExLF) logs and WebSTAR Log Format (WLF). We recommend using WebSTAR Log Format.

When configuring your log format there are three issues to keep in mind. The more information you put in the log file, the more Summary will be able to report to you. At the same time, the more information you put in, the larger the log files will become, eventually filling your hard disk. Finally, some of the log tokens have become obsolete and have been replaced by newer tokens. The older tokens will still work but are not recommended due to various limitations.

Summary requires the following tokens in WebSTAR format logs:

DATE Date of request.
URL The requested item. Same as CS-URI and CS-URI-STEM.
HOSTNAME Name or IP address of the requesting computer. You can use C-IP if you always leave DNS lookups off in WebSTAR but it is slightly larger. In WebSTAR 5.x you will need to use C-IP, since HOSTNAME has been removed. You can use C-IP and C-DNS together in that order to keep all available information even though Summary won't take advantage of it and your log will be larger. CS-IP and CS-HOST together in that order work but they are not recommended. This field is required for a great number of different reports.

The following fields are very highly recommended:

TIME Time of request. Required for the Hourly, Time of Day, and Gaps in Service reports.
BYTES Bytes sent, same as BYTES_SENT. Required for all of the Bytes related columns in various reports.
SC-STATUS Result code, the same as STATUS. This provides just slightly more information than RESULT (which is also acceptable). CS-STATUS will work but it is not recommended. Required for the Bad Links and Failed Requests reports and the Errors column in various reports.
REFERRER Site and page that referred the visitor to your site. Slightly shorter than CS(REFERER), which also works. This field will increase the size of log files substantially. Required for all of the referrers and paths reports.

The following fields provide additional information for Summary, which enables additional reports. You can decide if they are worth it. Listed from most interesting to least interesting overall, although that is partly a personal preference.

AGENT Browser making the request. Slightly shorter than CS(USER-AGENT) which also works. This field will noticeably increase the size of the log file. It provides information for the Browser and computer reports and also improves visit tracking.
USER Authenticated user name entered into a name and password dialog when some portion of the site is restricted. Provides information for the Auth User report.
TRANSFER_TIME Time to send data in 1/60ths of a second. This field provides information for the Modem Speed report. Avoid TIME_TAKEN, which is sometimes in hours:minutes:seconds, sometimes in seconds, and sometimes in 1/60ths of a second, which can cause inaccurate results.
CS(HOST) The name of the domain the user sent the request to, the same as HOST and HOSTFIELD. This field provides information for the Virtual Domain report and can be very useful in configuring sub-reports.
SEARCH_ARGS CGI arguments. Same as CS-URI-QUERY. This field provides information for the CGI Arguments report, which must also be enabled in the Summary configuration. The value of this report will depend on your use of CGI and plug-in arguments.
METHOD The method from the request header, GET, PUT, etc. Same as CS-METHOD and slightly shorter than CS(METHOD). This field provides information for the Method Report. Fairly technical.
CS(COOKIE) Any cookies sent by the browser, the same as COOKIE. This field provides information for the Cookie report, which must also be enabled in the Summary configuration. Not used by most sites.
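As a small worked example of the TRANSFER_TIME unit, the following Python sketch converts a value in 1/60ths of a second to seconds and derives an approximate transfer rate from BYTES. How Summary actually computes the Modem Speed report is not described here; the function names and sample numbers are assumptions used only to show the arithmetic.

```python
TICKS_PER_SECOND = 60  # TRANSFER_TIME is logged in 1/60ths of a second

def ticks_to_seconds(ticks):
    """Convert a TRANSFER_TIME value to seconds."""
    return ticks / TICKS_PER_SECOND

def bytes_per_second(byte_count, ticks):
    """Approximate transfer rate; returns None when no time was recorded."""
    if ticks <= 0:
        return None
    return byte_count / ticks_to_seconds(ticks)

print(ticks_to_seconds(90))         # 1.5
print(bytes_per_second(6000, 120))  # 3000.0
```

A log that mixes units, as TIME_TAKEN can, would make this kind of calculation unreliable, which is one reason to prefer TRANSFER_TIME.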

There are a few other fields that WebSTAR supports, which might be of some use to someone, but Summary doesn't use them:

FROM Almost always empty. This field used to be the e-mail address of the visitor but privacy concerns caused browsers to stop sending it. Occasionally filled in by web robots.
CONNECTION_ID The internal WebSTAR id number associated with this connection. I can't imagine ever using this.
PATH_ARGS Portion of the request after a '$' character. This is a WebSTAR specific feature, designed to make programming CGI code easier but it is hardly ever used.


W3C Extended Log Format
(ExLF)

This is a highly configurable format, but not all servers allow all of the options.

Summary requires the following tokens in the log file:

DATE Date of the request.
CS-URI The requested item, essentially the same as CS-URI-STEM which also works.
C-IP Client IP address. Use along with C-DNS if you have DNS lookups turned on, as long as C-IP appears first. CS-IP and CS-HOST can be used together in that order instead but they are obsolete and not recommended. Required for a great number of different reports.

The following fields are very highly recommended:

TIME Time of the request. Required for the Hourly, Time of Day, and Gaps in Service reports.
BYTES Bytes sent. Same as SC-BYTES. Required for the Bytes related columns in various reports.
SC-STATUS Result code. CS-STATUS will work but it is obsolete and is not recommended. Required for the Bad Links and Failed Requests reports and the Errors column in various reports.
CS(REFERER) Site and page that referred the visitor to your site. This field will increase the size of log files substantially. Required for the referrers and paths reports.

The following fields provide additional information for Summary, which enables additional reports. You can decide if they are worth it. Listed from most interesting to least interesting overall, although that is partly a personal preference.

CS(USER-AGENT) Browser making the request. This field will noticeably increase the size of the log file. It provides information for the Browser and computer reports and is also used to improve visit tracking.
CS-USERNAME Authenticated user name entered into a name and password dialog when some portion of the site is restricted. Provides information for the Auth User report.
TIME_TAKEN (WebSTAR) or TIME-TAKEN (Microsoft) The time taken to respond to the request and transfer the data back to the visitor. This field provides information for the Modem Speed report.
CS(HOST) The name of the domain the user sent the request to. You can also use S-IP or S-DNS instead if they are available. This field provides information for the Virtual Domain report and can be very useful in configuring sub-reports.
CS-URI-QUERY CGI arguments. This field provides information for the CGI Arguments report, which must also be enabled in the Summary configuration. The value of this report will depend on your use of CGI and plug-in arguments.
CS-METHOD The method from the request header, GET, PUT, etc. Slightly shorter than CS(METHOD). This field provides information for the Method report. Fairly technical.
CS(COOKIE) Any cookies returned by the browser. This field provides information for the Cookie report, which must also be enabled in the Summary configuration. Not used by most sites.
CS-VERSION The HTTP protocol version number.

There are many fields that ExLF supports, which might be of some use to someone, but Summary doesn't use them. Here are a few of them:

CS-FROM Almost always empty. This field used to be an e-mail address of the visitor but privacy concerns caused browsers to stop sending it. Occasionally filled in by web robots.
CS-BYTES The number of bytes in the original request sent by the client to the server.
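W3C Extended logs describe their own layout in a "#Fields:" directive, so it is easy to check in advance whether a log contains the tokens listed as required above. The Python sketch below reads that directive and reports any required token that is missing. The helper names and the sample log are assumptions for illustration, and the simple whitespace split does not handle quoted values that contain spaces.

```python
REQUIRED = {"DATE", "CS-URI", "C-IP"}   # tokens this page lists as required
ALTERNATES = {"CS-URI-STEM": "CS-URI"}  # substitute this page says also works

def read_exlf(lines):
    """Return (field names, list of entries) from W3C Extended log lines."""
    fields, entries = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Only the #Fields directive matters for this sketch.
            if line.lower().startswith("#fields:"):
                fields = [f.upper() for f in line.split(":", 1)[1].split()]
            continue
        values = line.split()
        if len(values) == len(fields):
            entries.append(dict(zip(fields, values)))
    return fields, entries

def missing_required(fields):
    """List required tokens that are absent from the field names."""
    present = {ALTERNATES.get(f, f) for f in fields}
    return sorted(REQUIRED - present)

# Hypothetical log excerpt.
sample = [
    "#Version: 1.0",
    "#Fields: date time c-ip cs-method cs-uri sc-status bytes",
    "2000-10-10 13:55:36 192.0.2.1 GET /index.html 200 2326",
]
fields, entries = read_exlf(sample)
print(missing_required(fields))
print(entries[0]["C-IP"], entries[0]["SC-STATUS"])
```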


User Specified Log Formats

If your server produces logs with a format not listed here, you may be able to configure Summary to read the log file by specifying the format manually. You specify your log format by making a string with the following tokens in the order that the fields appear in your log file and entering that string into User log format definition.

DATE-CLF Full date/time, as used in Common Log File format
DATE-DMY Day, month, year
DATE-MDY Month, day, year
DATE-YMD Year, month, day
UNIX-TIME The number of seconds since January 1, 1970
TIME-24 Hour, minute, second
TIME-12 Hour, minute, second, AM/PM
YEAR Two or four digit year number
MONTH One or two digit month number
MONTH-NAME Three character month name
DAY One or two digit day of the month
HOUR One or two digit hour of the day
MINUTE One or two digit minute of the hour
SECOND One or two digit second of the minute
FULL-REQUEST The original HTTP request line
HOST Host name, the domain name or IP address of the visitor
URI The requested resource, with optional '?' portion
URI-QUERY The '?' portion of the request
STATUS Three digit HTTP response code
WEBSTAR-RESULT WebSTAR four character response code
BYTES Number of bytes transferred
TRAN-TIME-SECS The transfer time in seconds
TRAN-TIME-TICKS The transfer time in 1/60ths of a second
TRAN-TIME-MILLI The transfer time in milliseconds
TRAN-TIME-HMS The transfer time in HH:MM:SS, or 1/60ths
REFERER The referer from the HTTP header
AGENT The agent from the HTTP header
USER The user name from authorization
USER-BRACKET The user name from authorization, terminated by a " ["
METHOD The HTTP request method
PROTOCOL The transfer protocol (such as HTTP/1.0)
DOMAIN The domain name, often from HTTP host field
MAY-DOMAIN The server name, often from HTTP host field. This will not override a preceding DOMAIN value.
SERVER The physical server name
COOKIE The HTTP cookie field
SKIP Skip to the next field
IF-EOL If at the end of the line, return a valid entry, otherwise continue parsing
MUST-EOL Must be at the end of the line, otherwise it is a corrupt line
EOL Skip everything to the end of the line, must be last if used
W3SVC Field must start with "W3SVC". Log entries with values starting with "MSFTPSVC" for this field are skipped.
FIXUP Fix FlashLog parsing to WebTrends layout
WN Check for CLF, Combined, or WN Verbose log
CHAR Skip one character of input
WHITE Skip any number of spaces and tabs
DATE-APACHE Date and time in Apache error log format
SESSION The session key. Used to identify visits.
MAY-SESSION The session key, but don't replace one gotten from a SESSION field.
LANG The requested language code

Summary will automatically parse the log file into tokens, handle quoted strings, and find fields separated by a space, comma and a space, or tabs.
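The splitting rules just described can be pictured with a short Python sketch. This is only one reading of those rules, not Summary's actual tokenizer: the tokenize function and the sample lines are assumptions, bracketed dates are not treated specially, and repeated separators produce empty fields.

```python
def tokenize(line):
    """Split a log line on spaces, tabs, or ", ", keeping quoted strings whole."""
    tokens, i, n = [], 0, len(line)
    while i < n:
        if line[i] == '"':
            # Quoted string: take everything up to the closing quote.
            end = line.find('"', i + 1)
            if end == -1:
                end = n
            tokens.append(line[i + 1:end])
            i = end + 1
        else:
            # Bare field: runs until a space, a tab, or a comma plus space.
            j = i
            while j < n and line[j] not in " \t" and line[j:j + 2] != ", ":
                j += 1
            tokens.append(line[i:j])
            i = j
        # Consume exactly one separator after the field.
        if line[i:i + 2] == ", ":
            i += 2
        elif i < n and line[i] in " \t":
            i += 1
    return tokens

print(tokenize('192.0.2.1 - "GET /a b.html HTTP/1.0" 200'))
print(tokenize('10/10/00\t13:55:36\t192.0.2.1\t/index.html'))
print(tokenize('10/10/00, 13:55:36, 192.0.2.1, /index.html'))
```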

For example NCSA Common Log Format and NCSA Combined format would be specified with the single string:

HOST SKIP USER DATE-CLF FULL-REQUEST STATUS BYTES IF-EOL REFERER AGENT
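To show how one string can cover both formats, here is a hedged Python sketch that applies a format string to a log line and stops early at IF-EOL when the line has no more fields. It handles only the tokens used in the string above and reflects one plausible reading of their descriptions, not Summary's real parser; split_fields, apply_format, and the sample lines are assumptions for this example.

```python
import re

# A field is a "quoted string", a [bracketed date], or a bare word.
FIELD = re.compile(r'"([^"]*)"|\[([^\]]*)\]|(\S+)')

def split_fields(line):
    """Return the fields of a line with quotes and brackets removed."""
    return [next(g for g in m.groups() if g is not None)
            for m in FIELD.finditer(line)]

def apply_format(format_string, line):
    """Map fields to token names; returns None for a line that does not fit."""
    fields = split_fields(line)
    entry, pos = {}, 0
    for token in format_string.split():
        if token == "IF-EOL":
            if pos >= len(fields):
                return entry  # line ended here: accept what was parsed
            continue          # more fields follow: keep going
        if pos >= len(fields):
            return None       # ran out of fields: treat the line as corrupt
        if token != "SKIP":
            entry[token] = fields[pos]
        pos += 1
    return entry

fmt = "HOST SKIP USER DATE-CLF FULL-REQUEST STATUS BYTES IF-EOL REFERER AGENT"
# Hypothetical sample lines in Common and Combined format.
common = ('192.0.2.1 - frank [10/Oct/2000:13:55:36 -0700] '
          '"GET /index.html HTTP/1.0" 200 2326')
combined = common + ' "http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98)"'

print(apply_format(fmt, common))
print(apply_format(fmt, combined))
```

For the Common line the result stops after BYTES, while the Combined line also yields REFERER and AGENT.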


Copyright 1998-2004 by Summary.Net - Updated 3/3/034