Apache Raw Logs & Awstats
Hi, I wrote a small script that parses the Apache raw logs to find out which search engine queries led visitors to a page on my site, and it works pretty well. While using it, I noticed that some rows in the logs are repeated. For example, in the requests below the IP is the same, the request times are within a few seconds of each other, and the request URL and referer are identical; the only difference is the response size (and even that is the same for some rows).
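To make clear what kind of parsing is meant, here is a minimal sketch of that step (an illustration, not the actual script). It assumes the log uses Apache's combined log format and that the search terms arrive in the referer's `q` parameter, as in the Google referers in this log. The names `LOG_RE`, `parse_line`, and `search_query` are made up for the example.

```python
import re
from urllib.parse import urlparse, parse_qs

# Combined log format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    m = LOG_RE.match(line)
    return m.groupdict() if m else None

def search_query(referer, param="q"):
    """Return the decoded search terms from a referer URL, or None if absent."""
    values = parse_qs(urlparse(referer).query).get(param)
    return values[0] if values else None
```

Other search engines may use a different parameter name, which is why `param` is left configurable.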
Do you have any idea why this single request is recorded as multiple requests? Could it be some kind of browser optimization, where the browser sends several concurrent requests for the same page (to retrieve different chunks) in order to get a faster response?
Also, given that awstats generates its stats from these logs, how does this affect the awstats reports? Are these requests counted as 8 page hits in awstats?
Thanks...
68.152.132.25 - - [30/Jul/2005:05:34:24 -0400] "GET /content/blogcategory/57/82/ HTTP/1.1" 200 35042 "http://www.google.com/search?hl=en&q=search+engine+search&meta=" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
68.152.132.25 - - [30/Jul/2005:05:34:25 -0400] "GET /content/blogcategory/57/82/ HTTP/1.1" 200 25270 "http://www.google.com/search?hl=en&q=search+engine+search&meta=" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
68.152.132.25 - - [30/Jul/2005:05:34:25 -0400] "GET /content/blogcategory/57/82/ HTTP/1.1" 200 35042 "http://www.google.com/search?hl=en&q=search+engine+search&meta=" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
68.152.132.25 - - [30/Jul/2005:05:34:26 -0400] "GET /content/blogcategory/57/82/ HTTP/1.1" 200 20974 "http://www.google.com/search?hl=en&q=search+engine+search&meta=" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
68.152.132.25 - - [30/Jul/2005:05:34:29 -0400] "GET /content/blogcategory/57/82/ HTTP/1.1" 200 35042 "http://www.google.com/search?hl=en&q=search+engine+search&meta=" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
68.152.132.25 - - [30/Jul/2005:05:34:32 -0400] "GET /content/blogcategory/57/82/ HTTP/1.1" 200 32430 "http://www.google.com/search?hl=en&q=search+engine+search&meta=" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
68.152.132.25 - - [30/Jul/2005:05:34:32 -0400] "GET /content/blogcategory/57/82/ HTTP/1.1" 200 20974 "http://www.google.com/search?hl=en&q=search+engine+search&meta=" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
68.152.132.25 - - [30/Jul/2005:05:34:32 -0400] "GET /content/blogcategory/57/82/ HTTP/1.1" 200 20974 "http://www.google.com/search?hl=en&q=search+engine+search&meta=" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
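One way to keep repeated rows like these from inflating the search-referral counts in my own script would be to collapse them before counting. The sketch below is only an assumption about how that could be done, and says nothing about how awstats itself counts them: rows with the same IP, URL, and referer are treated as one visit while each arrives within `window_seconds` of the previous one. The function name and the 30-second default are arbitrary, and the log lines are assumed to be in chronological order.

```python
import re
from datetime import datetime, timedelta

# Captures IP, timestamp, request URL and referer from a combined-format line.
LINE_RE = re.compile(
    r'(\S+) \S+ \S+ \[([^\]]+)\] "\S+ (\S+) [^"]*" \d{3} \S+ "([^"]*)"'
)

def count_distinct_visits(lines, window_seconds=30):
    """Count visits, merging repeats of the same (ip, url, referer) that
    follow the previous matching row within window_seconds."""
    window = timedelta(seconds=window_seconds)
    last_seen = {}  # (ip, url, referer) -> time of the most recent matching row
    visits = 0
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip lines that are not in the expected format
        ip, stamp, url, referer = m.groups()
        when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
        key = (ip, url, referer)
        previous = last_seen.get(key)
        if previous is None or when - previous > window:
            visits += 1  # first row for this key, or the gap exceeds the window
        last_seen[key] = when  # sliding window: measured from the latest row
    return visits
```

Applied to the eight rows above, which all fall within nine seconds, this would count a single visit.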