[BlueOnyx:25851] Re: Real Time access analysis

Michael Stauber mstauber at blueonyx.it
Sat Dec 24 19:05:08 -05 2022


Hi Dirk,

> Have you ever heard of https://goaccess.io/?

Let me tell you: I'm now officially in love with this thing. :p

Whoever wrote that is my kind of "mad scientist coder" and knows his 
stuff and then some.

We all know that the built-in web access statistics in the BlueOnyx GUI 
suck. Webalizer was added as a solution that "just works", but the other 
(older) "Usage Information" / "Web" page never worked right. The 
underlying problem is a mix of changed log formats, changes in Analog 
and the whole code for it being so ancient and convoluted that my head 
explodes whenever I try to make sense of it.

I'll now replace the entire existing "Web" statistics outright with 
GoAccess. This will require some fiddling, but the results will be well 
worth it.

I'll keep part of the old statistics approach and combine it with 
GoAccess in this way:

We still let the daily logrotate/logsplit chew through the central 
Apache access logfile and spit out Vsite sized chunks with the relevant 
data into the logs directory of the individual Vsites.

But instead of Analog we will then use GoAccess to chew through the 
individual web.log files of all Vsites and spit out the results as JSON 
files, which we store in the same directory tree as before:

/home/sites/<Vsite>/var/logs/<Year>/<Month>/<Day>/web.json
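As a rough illustration of that step (a Python sketch, not the actual BlueOnyx code), the per-Vsite run could look like this. Assumptions: month and day are zero-padded, the logs are in Apache combined format, and the location of web.log is passed in by the caller. The function names are made up for this example; --log-format, --anonymize-ip and -o are regular GoAccess options.

```python
from datetime import date
from pathlib import Path

SITES_ROOT = Path("/home/sites")  # root of the Vsite tree, as in the path above


def json_path(vsite: str, day: date, root: Path = SITES_ROOT) -> Path:
    """Build <root>/<Vsite>/var/logs/<Year>/<Month>/<Day>/web.json.

    Zero-padding of month and day is an assumption of this sketch.
    """
    return (root / vsite / "var" / "logs"
            / f"{day.year:04d}" / f"{day.month:02d}" / f"{day.day:02d}"
            / "web.json")


def goaccess_command(web_log: Path, out_file: Path) -> list[str]:
    """Return the GoAccess call that turns one web.log into one JSON report.

    GoAccess picks the output format from the file extension of -o.
    COMBINED is assumed here; adjust to the real Apache log format.
    """
    return [
        "goaccess", str(web_log),
        "--log-format=COMBINED",
        "--anonymize-ip",
        "-o", str(out_file),
    ]
```

To actually run it, create the target directory with out_file.parent.mkdir(parents=True, exist_ok=True) and hand the list to subprocess.run(cmd, check=True).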

Next I'll rebuild the GUI page for the "Web" statistics of Vsites with a 
date selector on top like we have it for the "Email" statistics. By 
default it shows the newest data, but you can (provided we have JSON 
files for that date) choose any date you want and it will display the 
stats of that day.
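The date selector boils down to scanning the tree for existing web.json files and falling back to the newest one. A minimal Python sketch, assuming the <Year>/<Month>/<Day> directories have numeric names (available_dates and pick_date are hypothetical helpers, not GUI code):

```python
from datetime import date
from pathlib import Path
from typing import Optional


def available_dates(logs_root: Path) -> list[date]:
    """Return all days below logs_root that have a web.json, oldest first."""
    found = []
    for report in logs_root.glob("*/*/*/web.json"):
        year, month, day = report.parts[-4:-1]
        try:
            found.append(date(int(year), int(month), int(day)))
        except ValueError:
            continue  # skip directories that are not a valid date
    return sorted(found)


def pick_date(dates: list[date], requested: Optional[date] = None) -> Optional[date]:
    """Use the requested day if we have data for it, else the newest day."""
    if not dates:
        return None
    if requested in dates:
        return requested
    return dates[-1]
```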

How to present the results? Turns out when GoAccess renders a static 
HTML page (instead of a JSON file) it inserts the JSON data into that 
HTML page. So I just create a GUI page that uses the GoAccess HTML page 
as a template and inserts the loaded JSON data of the selected date. I 
had to de-minify the HTML page and it boiled down to around 15,000 lines 
of HTML and inline scripts, plus the JSON payload. I can simply 
template that in CodeIgniter and build a display class around it.
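The templating idea itself is simple. The real thing will be PHP in CodeIgniter; the following is only a Python sketch of the principle. The {{GOACCESS_JSON}} placeholder is invented for this example and stands for the spot in the de-minified page where GoAccess normally embeds its data.

```python
import json

# Hypothetical marker placed where the embedded JSON used to sit in the
# GoAccess HTML page. Not something GoAccess itself emits.
PLACEHOLDER = "{{GOACCESS_JSON}}"


def render_report(template_html: str, report: dict) -> str:
    """Insert the stored report of one day into the HTML template."""
    if PLACEHOLDER not in template_html:
        raise ValueError("template has no JSON placeholder")
    payload = json.dumps(report, separators=(",", ":"))
    # The payload ends up inside a <script> block, so a literal "</script>"
    # in the data (e.g. from a request URL) must not close it early.
    # "<\/" is still valid JSON and valid JavaScript.
    payload = payload.replace("</", "<\\/")
    return template_html.replace(PLACEHOLDER, payload, 1)
```

Loading the data for the selected day is then just json.loads() on the contents of that day's web.json before calling render_report().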

That's nice, clean and easy. Anonymization of IPs is also already built 
in (we need that for GDPR/DSGVO), the stored JSON files are lean and 
take up little space, and we can reuse the method that we already have 
in place for the old statistics to expire JSON files that are past 
their data retention time.
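Expiring old reports could look roughly like this. This is a Python sketch under the same path assumptions as above, not the existing BlueOnyx expiry code; retention_days is a made-up parameter and the real retention setting may work differently.

```python
from datetime import date, timedelta
from pathlib import Path


def expire_old_reports(logs_root: Path, retention_days: int, today: date) -> list[Path]:
    """Delete web.json files older than the retention window.

    Returns the list of removed files. Empty day directories are left
    in place in this sketch.
    """
    cutoff = today - timedelta(days=retention_days)
    removed = []
    for report in logs_root.glob("*/*/*/web.json"):
        try:
            year, month, day = (int(p) for p in report.parts[-4:-1])
            report_day = date(year, month, day)
        except ValueError:
            continue  # not a <Year>/<Month>/<Day> directory, leave it alone
        if report_day < cutoff:
            report.unlink()
            removed.append(report)
    return removed
```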

This will be a really nice addition.

Many thanks for the recommendation, Dirk!

-- 
With best regards

Michael Stauber


