Note
We are no longer accepting new customers or work orders at this time. Thank you for your interest.
Consider evaluation forms with multiple choice options. Sometimes you want to give your users a "N/A" (not applicable) choice, which will not affect the statistics. How do you store that in the database, in such a way that when you want to average all of the scores, the individual N/A's do not sway the average? If you're storing the values as integer numbers (say, 1-5) and you choose to store N/A's as the value 0, it will strongly affect the average - and that's bad.
The trick is to make those values NULL. SQL is smart about this - the AVG() function knows the difference between NULL and 0. It knows that NULL is not a number, and should not be counted in the final average. It knows that 0 is a number, and should be counted in the average.
A Tricky Question
Here's a question for you. If you're averaging up 10 rows of data like this:
SELECT COUNT(id) as cnt, AVG(score) as score FROM ... |
and 3 of those rows have NULL for the "score" field, how many rows were actually averaged? What will "cnt" be?
It turns out that 7 rows will get averaged; in other words, the database engine will add the 7 non-null values together and divide that total by 7, to get the average! And cnt will be 10, because 10 rows were dealt with, regardless of the averaging going on nearby.
"No answer" versus "N/A answer"
Now consider this. What if you want to record the concept of, "the person didn't answer the question" as a NULL value in the database. You'll have to pick some other value for "the person answered the question, by choosing the N/A answer, indicating they didn't want to answer the question."
I had exactly this situation come up. I chose to store "0" for the "N/A" value in the database, but I didn't want it to sway my averaging. Here's how I solved it:
SELECT COUNT(eval.id) AS cnt, AVG(IF(eval.q1>0,eval.q1,NULL)) AS q1, AVG(IF(eval.q2>0,eval.q2,NULL)) AS q2, AVG(IF(eval.q3>0,eval.q3,NULL)) AS q3, AVG(IF(eval.q4>0,eval.q4,NULL)) AS q4, AVG(IF(eval.q5>0,eval.q5,NULL)) AS q5, AVG(IF(eval.q6>0,eval.q6,NULL)) AS q6, AVG(IF(eval.q7>0,eval.q7,NULL)) AS q7, AVG(IF(eval.q8>0,eval.q8,NULL)) AS q8, AVG(IF(eval.q9>0,eval.q9,NULL)) AS q9, AVG(IF(eval.q10>0,eval.q10,NULL)) AS q10 FROM eval ... |
As you can see, I'm making AVG() see any numeric value 1 or larger, but any value of 0 (or less) is translated to NULL.
And, yes, I should probably break the questions q1..q10 out from the eval table into their own table, and make it a 1-to-many relationship. I'm ignoring that for the purposes of this Trick. :)
It's good to understand the inner workings of a language like SQL. If you already knew this behavior, I apologize. But if you hadn't thought about this before, study it - learn to think in this frame of mind when designing your database schemas. Doing so has helped me a lot in my projects.
Books
MySQL Cookbook from O'Reilly and Associates
Chapter 8 talks about NULL values and functions like AVG(), SUM(), MIN(), MAX(), the DISTINCT keyword, and more.
Don't miss the latest sql tips and tricks!
Subscribe to our low-volume mailing list:
Privacy Policy
| Copyright © 2006 - 2010 Keith Smith Internet Marketing LLC, all rights reserved. |
| Problem with this web site? please let us know |