About Me

My photo
Northglenn, Colorado, United States
I'm primarily a BI Developer on the Microsoft stack. I do sometimes touch upon other Microsoft stacks ( web development, application development, and sql server development).

Wednesday, May 18, 2011

Another SQL Trick/Tip

I had a request to change a view to display information as a result of "Zones" instead of "Observation". So, a little background...

For each processed file, after inserting, I set an observation number based on the row in the file. This is easily done with the ROW_NUMBER() function in SQL-Server:

ROW_NUMBER() OVER (ORDER BY ObservationSQN) AS ROWNUM

,and some other unique tricks I use to separate it out per file.

The request then asked, instead of displaying Observation Numbers, they wanted an average; and if the number is less than or equal to 0 then don't include that observation in the average.

Zone 1 has the average of observations 1,2,3,4,6
Zone 2 has the average of observations 5,8,11
Zone 3 has the average of observations 10,12,13,14,15

Well, thankfully SQL-Server doesn't include a value in it's average function if it's null. Just have to set the value to NULL if <= 0

So at first I did this brute force technique (which ends up being ugly and slow):
SELECT
avg(zone1.value) as zone1Avg,
avg(zone2.value) as zone2Avg,
avg(zone3.value) as zone3Avg,
zone1.FileSQN
FROM
(
SELECT
FileSQN,
CASE WHEN value <= 0 THEN NULL ELSE value END AS value
FROM Table1
WHERE Observation in (1,2,3,4,6)
AND DateDiff(day, ProcessEndDate, GETDATE()) <= 1)
)zone1
INNER JOIN
(
SELECT
FileSQN,
CASE WHEN value <= 0 THEN NULL ELSE value END AS value
FROM Table1
WHEREObservation in (5,8,11)
AND DateDiff(day, ProcessEndDate, GETDATE()) <= 1)
)zone2 on zone2.FileSQN = zone1.FileSQN
INNER JOIN
(
SELECT
FileSQN,
CASE WHEN  value <= 0 THEN NULL ELSE value END AS value
FROM Table1
WHERE Observation in (10,12,13,14,15)
AND DateDiff(day, ProcessEndDate, GETDATE()) <= 1)
)zone3 on zone3.FileSQN = zone2.FileSQN
GROUP BY FileSQN

The technique I ended up using, used the WITH common table expression:

WITH WithTable (FileSQN, value, ObservationNumber)
AS
(
SELECT
FileSQN,
CASE WHEN value <= 0 THEN NULL ELSE value END AS value,
ObservationNumber
FROMTable1
WHERE DateDiff(day, ProcessEndDate, GETDATE()) <= 1)
)

SELECT
zone1.zone1Avg,
zone2.zone2Avg,
zone3.zone3Avg,
FileSQN
FROM
(
SELECT
avg(value) as zone1Avg,
FileSQN
FROM
WithTable
WHERE ObservationNumber in (1,2,3,4,6)
GROUP BY FileSQN
) zone1
INNER JOIN
(
SELECT
avg(value) as zone2Avg,
FileSQN
FROM
WithTable
WHEREObservationNumber in (5,8,11)
GROUP BY FileSQN
)zone2 on zone2.FileSQN = zone1.FileSQN
INNER JOIN
(
SELECT
avg(value) as zone3Avg,
FileSQN
FROM
WithTable
WHEREObservationNumber in (10,12,13,14,15)
GROUP BY FileSQN
)


Summary: The first technique was slower, since I was basically pulling the same info 3 times, compared to the second technique's once.

Note: The names of the tables have been change to protect the innocent. I wrote this in line, so not everything might be syntactically correct