I wanted to write about another bit of code that I created for the What Color Is...? website I talked about in my previous post.
Displaying the Results
Probably the most interesting and challenging part of this project was to come up with a way to generate and display the results in an efficient way. Colors were all saved in the database as 3 separate fields (Red, Green, and Blue). This made it easy to select a color average for all responses. Here is the SQL:
SELECT t.name ,
COUNT(*) AS Count,
AVG(IsNull(red,255)) as Red ,
AVG(IsNull(green,255)) as Green ,
AVG(IsNull(blue,255)) as Blue
FROM WCIResponse r
INNER JOIN WCIThing t ON r.ThingID = t.ThingID
GROUP BY t.name
One of the options in the survey is to select "No Color", which is saved in the database as null values. These are converted to 255 so that it is the same as if they selected white.
That was pretty easy, but I also collect some basic, anonymous demographic information to enable breaking down the results a bit. I ask for age, for instance. What I wanted to do was to break down the results by age group, i.e. 20-29, 30-39, etc. This is more complicated, of course, and I didn't want to get into doing a massive number of queries and loops, etc, to build up the results. Fortunately I have recently gotten to know more about the SQL Server statement PIVOT. PIVOT is made for creating just such reports as this. This is somewhat more complicated than a simple PIVOT, because I don't want a column for every age, but I want to group the ages by decade.
I'll run through the ColdFusion/SQL code I used to make this work.
First, I set a variable with an SQL expression to create the name of the age group, e.g. "40-49":
<cfset var AgeRange = "CONVERT(VARCHAR,Age/10*10) + '-' + CONVERT(VARCHAR,Age/10*10+9)">
I only want to include the age ranges for which I actually have data, so I do an initial query to get the age ranges in the data. Note that the AgeRange is wrapped in brackets so that it can be used as the column name in the next query, e.g. "[40-49]".
<cfquery name="GetAges" datasource="#application.dsn#">
SELECT DISTINCT
'[' + #preserveSingleQuotes(AgeRange)# + ']' AS AgeRange
FROM WCIPerson
WHERE Age IS NOT NULL
ORDER BY AgeRange
</cfquery>
I can then build the PIVOT query including all of the age ranges. I'm still not comfortable with the PIVOT syntax. I've only used it a few times, and it's still hard for me to wrap my head around without looking at documentation and examples. If you're new to PIVOT too, this site has a few good examples. As you can see below, I reuse the AgeRange variable I set above twice.
<cfquery name="GetResults" datasource="#application.dsn#">
SELECT t2.name
<cfloop query="GetAges">
,ISNULL(t2.#GetAges.AgeRange#,'0,255,255,255') AS #GetAges.AgeRange#
</cfloop>
FROM ( SELECT t.ThingID ,
t.name ,
#preserveSingleQuotes(AgeRange)# AS AgeRange,
CONVERT(VARCHAR, COUNT(*)) + ','
+ CONVERT(VARCHAR, AVG(ISNULL(red, 255))) + ','
+ CONVERT(VARCHAR, AVG(ISNULL(green, 255))) + ','
+ CONVERT(VARCHAR, AVG(ISNULL(blue, 255))) AS Color
FROM WCIResponse r
INNER JOIN WCIThing t ON r.ThingID = t.ThingID
INNER JOIN WCIPerson p ON r.PersonID = p.PersonID
WHERE p.Age IS NOT NULL
GROUP BY t.ThingID ,
t.name ,
#preserveSingleQuotes(AgeRange)#
) AS t PIVOT ( MAX(color) FOR AgeRange IN ( #ValueList(GetAges.AgeRange)# ) ) AS t2
</cfquery>
Although the What Color Is...? site doesn't have a lot of data at this point, I believe this code should scale pretty well. Of course in the unlikely event that the site becomes an internet phenomenon and gets millions of responses I may have to revisit this code.
Posted on January 31, 2011 5:09:23 PM EST by David Hammond