It's Better Than Bad, It's Good

I still don’t want to turn into the Coding Horror rebuttal site, but Phil Windley’s comments on a recent post of mine brought up another of Jeff’s posts that bothered me at some level. Phil mentions at the end of his post that his company’s been using Puppet and Nagios to improve their operational efficiency. That’s an extremely important point. I’m a big believer in automation and monitoring. So “The Problem With Logging” rubbed me the wrong way. I think the key point is here:
bq. if the information you’re logging is at all valuable, it deserves to be surfaced in the application itself, not buried in an anonymous logfile somewhere. Even if it’s just for administrators. Logfiles are all too often where useful data goes to die, alone, unloved and ignored.
This is exactly the problem Nagios is trying to solve, and it does it well. Reading Jeff’s post, it sounded to me like he was using the wrong tools (or none at all).
One problem is targeting your log output. Where do you log things? A file? A relational DB? An email? You need to think about the tools a bit before you jump to a conclusion here. The main thing is not to dump your logs somewhere nobody will ever look. Pick a tool, whether Nagios, Splunk, or something else; scanning logfiles in Notepad won’t cut it. You should have a logging story that doesn’t involve reading through individual logfiles, because that won’t scale past a single process on a single machine. If you have logfiles you’re not looking at, that’s your fault, not log4net’s.
In general, I think Jeff’s closing advice is pretty good:
bq. Resist the tendency to log everything. Start small and simple, logging only the most obvious and critical of errors.
For instance, in a .NET application, one of the first places you should put logging is the unhandled exception handler. That’s just obvious. But there are other places you should log that break the mold of “small and simple”. If you have an unhandled case in code, or a branch where there’s a question (“would this ever happen?”), log it when the condition arises. It’s a simple way to get feedback. It’s not a bad idea to log every exception you catch, other than the ones where an exception is part of the non-exceptional behavior of the class (which is lousy design, but, for example, .NET 1.0 gave the programmer only int.Parse with a catch clause as a way to try to parse an integer out of a string). Sure, it makes for a noisy logfile. So make it your job to get the logfile quiet. You should be finding out why your app is throwing so many exceptions to begin with, particularly in .NET, where throwing exceptions is fairly expensive.
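Here’s a rough sketch of those three habits: a last-chance handler, a logged “would this ever happen?” branch, and a logged catch. It’s in Python rather than .NET just to keep it short (in .NET the hook would be the AppDomain.UnhandledException event). The logger name and the functions `shipping_cost` and `read_settings` are made up for illustration, not taken from any real application.

```python
import logging
import sys

log = logging.getLogger("app")  # hypothetical logger name


def log_unhandled(exc_type, exc_value, exc_tb):
    """Last-chance hook: record any exception nothing else caught."""
    log.critical("Unhandled exception", exc_info=(exc_type, exc_value, exc_tb))


# Python's counterpart to an application-wide unhandled exception handler.
sys.excepthook = log_unhandled


def shipping_cost(region):
    """Hypothetical function with a 'would this ever happen?' branch."""
    if region == "domestic":
        return 5
    if region == "international":
        return 20
    # Unhandled case: log it so we find out if it ever actually occurs.
    log.warning("Unexpected region %r, using default cost", region)
    return 20


def read_settings(path):
    """Log a caught exception instead of swallowing it silently."""
    try:
        with open(path, encoding="utf-8") as fh:
            return fh.read()
    except OSError:
        # log.exception records the message plus the full traceback.
        log.exception("Could not read settings from %s", path)
        return ""
```

Where those records end up (a file, a collector, a monitoring tool) is a separate decision, made by whatever handlers you attach to the logger.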
That said, Jeff’s example of the coder who logs method parameters at DEBUG level is really too extreme. First of all, there’s an enormous overhead, both in code and performance. Secondly, at that level, you may as well attach a debugger to the process. It may be the case that you operate in an environment where you can’t run a debugger. But even in that case, I wouldn’t put in that level of debugging until it was proven necessary.
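To make the performance half of that overhead concrete, here’s a small sketch, again in Python as a stand-in for .NET. The `Expensive` class is a hypothetical argument whose string form is costly to build. The point is only that a parameter dump built eagerly costs you on every call, even when DEBUG is switched off.

```python
import logging

log = logging.getLogger("app.debugdemo")  # hypothetical logger name
log.setLevel(logging.INFO)  # DEBUG disabled, as it would be in production


class Expensive:
    """Hypothetical argument whose string form is costly to build."""

    def __init__(self):
        self.renders = 0

    def __str__(self):
        self.renders += 1
        return "large object dump"


def eager(arg):
    # The f-string is built on every call, even though DEBUG is off.
    log.debug(f"eager called with {arg}")


def lazy(arg):
    # %-style arguments are only formatted if the record is actually emitted.
    log.debug("lazy called with %s", arg)
```

Deferring the formatting removes most of the runtime cost, but the extra line in every method is still there, which is the code half of the overhead.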

— Gordon Weakliem

---

Comment

  1. The place to log is, obviously, to the underlying distributed filesystem of your mapreduce runtime. Then you run MR jobs to postmortem stuff later. Things like Apache Chukwa and stuff from Facebook are going this way, though the real secret sauce is the mining algorithms, which I haven’t seen much of yet.

    — Steve Loughran · 28 December 2008, 15:26 · #

  2. Steve, the key thing being “your mapreduce runtime”. Sounds like a good idea, but that’s a big architectural assumption. I know that you do a lot of work on MR/Hadoop, so it definitely makes sense that you’d use that option.

    That’s an interesting idea on AWS, use S3 as the log target (at least for some level of messages). Something like what’s described here

    Gordon Weakliem · 29 December 2008, 15:16 · #
