Recently I got into a situation where I had to analyze the error logs on a production instance and list the top 20 most frequently thrown errors. Demandware Log Center was my first bet, but sadly it does not provide such statistics.
I assumed that the errors we get are pretty much the same every day, so I just downloaded all the error logs from the last few days and wrote this small bash script to analyze them. It extracts the first line of every error stack trace, sorts the results, removes duplicates, and aggregates the counts.
#!/bin/bash

# Scratch file where the extracted stack-trace lines are collected
combined=./combined.tmp

# Start from a clean scratch file
if [ -f "$combined" ]; then
    rm "$combined"
fi

for log in "$@"
do
    if [ ! -f "$log" ]; then
        echo "Warning: $(basename "$log") is not a regular file (skipped)" >&2
        continue
    fi
    if [ "${log: -4}" != ".log" ]; then
        echo "Warning: $(basename "$log") is not a log file (skipped)" >&2
        continue
    fi
    echo "$log"
    # Print the line that follows every "Stack trace" marker, then blank out
    # bracketed numeric tokens (e.g. dates) so identical errors aggregate
    awk '/Stack trace/{getline; print}' "$log" | sed -E 's/\[[0-9-]+\]/ /g' >> "$combined"
done

# Count identical lines, sort by frequency, keep the 20 most common
sort "$combined" | uniq -c | sort -rn | head -n 20 > top20errors.txt
rm "$combined"
echo 'Done! :)'
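To see what the extraction step does, here is a minimal sketch run on a hypothetical two-line error entry (the exact Demandware error-log layout may differ): awk prints the line that follows the "Stack trace" marker, and sed blanks out bracketed numeric tokens such as dates, so the same error logged on different days collapses into one line.
# Hypothetical error entry, piped through the same awk/sed stage as the script
printf '%s\n' \
  'ERROR PipelineCallServlet ... Stack trace:' \
  'com.example.SomeException: payment failed [2016-03-01]' \
| awk '/Stack trace/{getline; print}' \
| sed -E 's/\[[0-9-]+\]/ /g'
# prints: com.example.SomeException: payment failed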
Example usage:
bash parser.sh logs/*.log
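The resulting top20errors.txt follows the standard uniq -c layout: an occurrence count, then the normalized first line of the stack trace. With made-up counts and exception names it looks roughly like this:
    342 com.example.SomeException: payment failed
    118 java.lang.NullPointerException
     45 com.example.OtherException: request timed out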