AWStats is the premier web analytics reporting tool for analyzing data from server logs to report on services such as web, streaming media, mail, and FTP servers. We spoke with founder Laurent Destailleur (Eldy) about how it all began, what makes it superior to the other tools out there, and why he chose to share it freely as Open Source software.
This interview was originally published on April 19, 2020.
Please tell us a little bit about your background before founding AWStats.
In 1999, after receiving my engineering degree in electronics, I worked at an IT company providing development services for large companies. On the side, I developed my company’s web site as well as other sites on miscellaneous topics.
Needing a tool to analyze the traffic for these sites, I started testing existing tools like Analog, Webalizer, and other proprietary software. However, I found their results inaccurate or difficult to understand, so I began to develop my own tool, called AWStats.
What is AWStats?
AWStats is a tool for analyzing a web server’s technical log to compile data and report on the site’s traffic in a clear and easily understandable way.
Most other log analyzers just count the number of lines in the log file to determine the number of visits. For many reasons, this way of calculating traffic produces incorrect conclusions. AWStats is different as it uses several dedicated algorithms for more accurate results.
What information is analyzed by AWStats, and how is it beneficial to site owners?
AWStats needs only one web server log file to function but can compile data from several log files and several web servers as if they came from a single server. These log files contain a lot of information, including the number of pages visited and images downloaded as well as when, from which IP address, and with which browser the site was accessed. AWStats uses this information to deduce whether your site’s traffic comes from real visitors, worms, or robots, and presents the results in a graphical view.
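To make the fields mentioned above concrete, here is a minimal Python sketch that extracts them from one log line. It assumes the common Apache/NCSA "combined" log layout; the regular expression and the `parse_line` helper are illustrative only and are not taken from AWStats, which is written in Perl.

```python
import re
from typing import Optional

# Assumed layout: Apache/NCSA "combined" log format.
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line: str) -> Optional[dict]:
    """Return the fields of one log line, or None if it does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None

# A made-up sample line using documentation-only IP addresses.
sample = ('203.0.113.7 - - [19/Apr/2020:10:15:32 +0200] '
          '"GET /index.html HTTP/1.1" 200 5120 '
          '"https://example.org/" "Mozilla/5.0"')
fields = parse_line(sample)
print(fields["ip"], fields["time"], fields["url"], fields["agent"])
```

Each parsed line yields the "when, from which IP address, and with which browser" information that the later classification steps rely on.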
Modifying AWStats’s engine to analyze other log files was easy, so the tool was enhanced to analyze email server logs (reporting incoming and outgoing emails) and media server logs (detailing time spent on media files, from where, when, etc.).
Which operating systems are supported?
Since AWStats was developed in Perl, it runs on any platform or operating system for which a Perl interpreter is available.
What log formats can be analyzed, and is there a limit on log size?
AWStats can analyze any log file as long as it is a text file. Our configuration tool can define the format of the log file, so any format, even a custom one, is supported.
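One way to picture a configurable log format is a template of named fields that gets compiled into a parser. The Python sketch below illustrates that general idea only; the placeholder names (`%host`, `%time`, and so on) and the `compile_format` helper are hypothetical and do not reproduce the actual AWStats configuration syntax.

```python
import re

# Hypothetical placeholder names, chosen for this sketch only.
FIELD_PATTERNS = {
    "host": r"\S+",
    "time": r"\[[^\]]+\]",
    "request": r'"[^"]*"',
    "status": r"\d{3}",
    "bytes": r"\d+|-",
}

def compile_format(template):
    """Turn a template such as '%host %time %status' into a compiled regex."""
    parts = []
    for token in template.split():
        if token.startswith("%") and token[1:] in FIELD_PATTERNS:
            name = token[1:]
            parts.append(f"(?P<{name}>{FIELD_PATTERNS[name]})")
        else:
            parts.append(re.escape(token))  # literal text must match as-is
    # Fields are assumed to be separated by whitespace.
    return re.compile(r"\s+".join(parts))

# A custom layout in which the timestamp comes first.
custom = compile_format("%time %host %status %bytes")
m = custom.match("[19/Apr/2020:10:15:32 +0200] 203.0.113.7 404 -")
print(m.group("host"), m.group("status"), m.group("bytes"))
```

Because the field order lives in the template rather than in the parsing code, supporting a new layout only requires a new template string.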
In 2005, France’s leading TV and media company needed a solution to analyze their web sites’ traffic. All the existing tools were either too slow or unable to process their large log files. When they approached AWStats, I worked on a solution that could analyze these exceptionally large files without increasing memory consumption while still maintaining a high level of performance. Thanks to this upgrade, AWStats can analyze a log file of any size.
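The general technique behind processing very large logs with flat memory use is to stream the file line by line and keep only aggregated counters. The Python sketch below shows that idea under the assumption of a combined-format log; `count_status_codes` is a hypothetical helper and not how AWStats itself is implemented.

```python
from collections import Counter
from typing import Iterable

def count_status_codes(lines: Iterable[str]) -> Counter:
    """Tally HTTP status codes while holding only one line in memory."""
    counts = Counter()
    for line in lines:
        parts = line.split('"')
        # In the combined format the status code follows the quoted request.
        if len(parts) >= 3:
            after_request = parts[2].split()
            if after_request:
                counts[after_request[0]] += 1
    return counts

# Made-up sample lines; a real run would pass an open file object instead.
sample_lines = [
    '203.0.113.7 - - [19/Apr/2020:10:15:32 +0200] "GET / HTTP/1.1" 200 512',
    '203.0.113.7 - - [19/Apr/2020:10:15:33 +0200] "GET /x HTTP/1.1" 404 0',
    '192.0.2.4 - - [19/Apr/2020:10:16:01 +0200] "GET / HTTP/1.1" 200 512',
]
counts = count_status_codes(sample_lines)
print(counts)
```

Passing an open file (for example `with open(path) as log: count_status_codes(log)`) reads lazily, so memory grows with the number of distinct counters rather than with the size of the log.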
Is it possible to run AWStats if you do not have access to your site’s log files?
While AWStats does not require any changes to your web site, it does need access to the log file to run your reports. If necessary, log files can be downloaded, and statistics can be processed locally.
How is AWStats different from other log analyzers?
A lot of the features built into AWStats ensure more accurate results compared to other log analyzers. Some examples:
- If a hit is made on robots.txt, we know the access was made by a robot and not a human. Other log analyzers just discard the access to robots.txt. AWStats remembers the IP used to get the robots.txt file, so any other access before or after, even on public web pages, from this same IP can be interpreted as robot access.
- AWStats has a database of signatures to exclude not only robots but also worms or spider attacks. This database also evolves dynamically during the analysis of the log. Detection is enhanced by heuristic analysis.
- AWStats can differentiate between access to images and pages. So when images and only images have been accessed, we can determine that it is not a human visitor even if all the other information suggests it is.
- AWStats tries to consolidate page accesses with redirects, so that it can identify pages that were not actually read and avoid counting two pages when only one was read.
- AWStats can be enhanced with external plugins.
- AWStats has its own algorithm, optimized for log analysis, to sort results and provide a top 5, top 10, or top 20 much faster than traditional sort algorithms.
- AWStats uses a variable delay to estimate the beginning and end of sessions, allowing the algorithm to run much faster than conventional tools.
- AWStats uses both a database of rules and heuristic analysis to infer the keywords used to find your web site.
- There are a lot of other features found only in AWStats that ensure better accuracy and performance than any other log analyzer.
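Two of the ideas in the list above, flagging every hit from an IP that requested the robots file and treating image-only visitors as non-human, can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not AWStats code: the image suffix list, the `classify_visitors` and `top_pages` helpers, and the sample hits are all hypothetical, and `heapq.nlargest` is a standard partial selection used here in place of AWStats's own sorting algorithm.

```python
import heapq
from collections import Counter, defaultdict

IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".ico")  # assumed list

def classify_visitors(hits):
    """hits: list of (ip, url) pairs. Returns {ip: label}."""
    urls_by_ip = defaultdict(list)
    for ip, url in hits:
        urls_by_ip[ip].append(url)

    labels = {}
    for ip, urls in urls_by_ip.items():
        if any(u == "/robots.txt" for u in urls):
            # One robots.txt request marks every hit from this IP,
            # whether it came before or after that request.
            labels[ip] = "robot"
        elif all(u.lower().endswith(IMAGE_SUFFIXES) for u in urls):
            labels[ip] = "image-only"
        else:
            labels[ip] = "human"
    return labels

def top_pages(hits, labels, n=10):
    """Top-n pages viewed by human visitors, without sorting every page."""
    counts = Counter(
        url for ip, url in hits
        if labels[ip] == "human" and not url.lower().endswith(IMAGE_SUFFIXES)
    )
    return heapq.nlargest(n, counts.items(), key=lambda item: item[1])

# Made-up hits using documentation-only IP addresses.
hits = [
    ("198.51.100.1", "/robots.txt"),
    ("198.51.100.1", "/index.html"),
    ("203.0.113.7", "/index.html"),
    ("203.0.113.7", "/logo.png"),
    ("203.0.113.7", "/about.html"),
    ("203.0.113.9", "/logo.png"),
    ("192.0.2.4", "/index.html"),
]
labels = classify_visitors(hits)
print(labels)
print(top_pages(hits, labels, n=1))
```

Note that the robot's request for `/index.html` is excluded from the page counts even though it targets an ordinary public page, which is the behavior described in the first bullet.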
How often can AWStats be run?
AWStats can be run as often as you like. AWStats saves the last position analyzed in the log file using both a binary pointer and a timestamp pointer. This way, when running subsequent analyses, it can quickly pick up the process from where it last ended, even if the log file has been purged or rotated. The more often you run AWStats, the less new data each run has to process, so the faster each run is.
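Incremental processing of this kind can be sketched in Python as follows. The sketch stores a byte offset and, as a stand-in for the timestamp pointer, a copy of the last line read; if that line is no longer found just before the saved offset, the file is assumed to have been rotated or purged and is read again from the start. The `read_new_lines` helper and the `state` keys are hypothetical and do not reflect AWStats's actual storage format.

```python
import os

def read_new_lines(log_path, state):
    """Return lines added since the last run and update `state` in place.

    `state` holds two hypothetical keys: 'offset' (byte position after the
    last line read) and 'last_line' (that line's bytes, used as a check).
    """
    offset = state.get("offset", 0)
    last_line = state.get("last_line", b"")
    with open(log_path, "rb") as log:
        size = os.fstat(log.fileno()).st_size
        # The saved position is only plausible if it fits inside the file.
        valid = 0 < len(last_line) <= offset <= size
        if valid:
            log.seek(offset - len(last_line))
            # Reading the remembered line leaves the cursor at `offset`.
            valid = log.read(len(last_line)) == last_line
        if not valid:
            # First run, or the file was rotated or purged: start over.
            log.seek(0)
        new_lines = log.readlines()
        if new_lines:
            state["offset"] = log.tell()
            state["last_line"] = new_lines[-1]
    return [line.decode("utf-8", "replace") for line in new_lines]
```

Calling `read_new_lines(path, state)` repeatedly with the same `state` dictionary returns only the lines appended since the previous call, which is why frequent runs stay short.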
Why did you choose to share AWStats as Open Source software?
AWStats was released as Open Source software on SourceForge in 2000 for several reasons.
- It was the beginning of the Open Source movement, and I thought that the best way to learn and understand it was to be part of it myself.
- I spent a lot of time developing AWStats and felt that the time spent was not a good investment if I was the only user. Releasing it as Open Source software significantly lowered the time I spent per user.
- It was a good way for AWStats to gain popularity and get help with its tools.
- When I first searched for web traffic analysis tools, I was disappointed in the poor choices available. Sharing AWStats as Open Source made sure this did not happen to others.
What significant changes or additions were made to your project by contributors?
The main contributions to AWStats were the enhancement of its rules and signature databases, which are used to detect keywords, worms, viruses, browsers, operating systems, smartphones, etc.
Bug fix contributions are also very constructive. It is always a pleasure to receive bug fixes, especially for those bugs that were unknown to me!
Plugins built on the plugin architecture, such as the GeoIP plugin, were also mostly handled by external contributors.