I had a request to change a view to display information as a result of "Zones" instead of "Observation". So, a little background...
For each processed file, after inserting, I set an observation number based on the row in the file. This is easily done with the ROW_NUMBER() function in SQL-Server:
ROW_NUMBER() OVER (ORDER BY ObservationSQN) AS ROWNUM
,and some other unique tricks I use to separate it out per file.
The request then asked, instead of displaying Observation Numbers, they wanted an average; and if the number is less than or equal to 0 then don't include that observation in the average.
Zone 1 has the average of observations 1,2,3,4,6
Zone 2 has the average of observations 5,8,11
Zone 3 has the average of observations 10,12,13,14,15
Well, thankfully SQL-Server doesn't include a value in it's average function if it's null. Just have to set the value to NULL if <= 0
So at first I did this brute force technique (which ends up being ugly and slow):
SELECT
avg(zone1.value) as zone1Avg,
avg(zone2.value) as zone2Avg,
avg(zone3.value) as zone3Avg,
zone1.FileSQN
FROM
(
SELECT
FileSQN,
CASE WHEN value <= 0 THEN NULL ELSE value END AS value
FROM Table1
WHERE Observation in (1,2,3,4,6)
AND DateDiff(day, ProcessEndDate, GETDATE()) <= 1)
)zone1
INNER JOIN
(
SELECT
FileSQN,
CASE WHEN value <= 0 THEN NULL ELSE value END AS value
FROM Table1
WHEREObservation in (5,8,11)
AND DateDiff(day, ProcessEndDate, GETDATE()) <= 1)
)zone2 on zone2.FileSQN = zone1.FileSQN
INNER JOIN
(
SELECT
FileSQN,
CASE WHEN value <= 0 THEN NULL ELSE value END AS value
FROM Table1
WHERE Observation in (10,12,13,14,15)
AND DateDiff(day, ProcessEndDate, GETDATE()) <= 1)
)zone3 on zone3.FileSQN = zone2.FileSQN
GROUP BY FileSQN
The technique I ended up using, used the WITH common table expression:
WITH WithTable (FileSQN, value, ObservationNumber)
AS
(
SELECT
FileSQN,
CASE WHEN value <= 0 THEN NULL ELSE value END AS value,
ObservationNumber
FROMTable1
WHERE DateDiff(day, ProcessEndDate, GETDATE()) <= 1)
)
SELECT
zone1.zone1Avg,
zone2.zone2Avg,
zone3.zone3Avg,
FileSQN
FROM
(
SELECT
avg(value) as zone1Avg,
FileSQN
FROM
WithTable
WHERE ObservationNumber in (1,2,3,4,6)
GROUP BY FileSQN
) zone1
INNER JOIN
(
SELECT
avg(value) as zone2Avg,
FileSQN
FROM
WithTable
WHEREObservationNumber in (5,8,11)
GROUP BY FileSQN
)zone2 on zone2.FileSQN = zone1.FileSQN
INNER JOIN
(
SELECT
avg(value) as zone3Avg,
FileSQN
FROM
WithTable
WHEREObservationNumber in (10,12,13,14,15)
GROUP BY FileSQN
)
Summary: The first technique was slower, since I was basically pulling the same info 3 times, compared to the second technique's once.
Note: The names of the tables have been change to protect the innocent. I wrote this in line, so not everything might be syntactically correct
For each processed file, after inserting, I set an observation number based on the row in the file. This is easily done with the ROW_NUMBER() function in SQL-Server:
ROW_NUMBER() OVER (ORDER BY ObservationSQN) AS ROWNUM
,and some other unique tricks I use to separate it out per file.
The request then asked, instead of displaying Observation Numbers, they wanted an average; and if the number is less than or equal to 0 then don't include that observation in the average.
Zone 1 has the average of observations 1,2,3,4,6
Zone 2 has the average of observations 5,8,11
Zone 3 has the average of observations 10,12,13,14,15
Well, thankfully SQL-Server doesn't include a value in it's average function if it's null. Just have to set the value to NULL if <= 0
So at first I did this brute force technique (which ends up being ugly and slow):
SELECT
avg(zone1.value) as zone1Avg,
avg(zone2.value) as zone2Avg,
avg(zone3.value) as zone3Avg,
zone1.FileSQN
FROM
(
SELECT
FileSQN,
CASE WHEN value <= 0 THEN NULL ELSE value END AS value
FROM Table1
WHERE Observation in (1,2,3,4,6)
AND DateDiff(day, ProcessEndDate, GETDATE()) <= 1)
)zone1
INNER JOIN
(
SELECT
FileSQN,
CASE WHEN value <= 0 THEN NULL ELSE value END AS value
FROM Table1
WHEREObservation in (5,8,11)
AND DateDiff(day, ProcessEndDate, GETDATE()) <= 1)
)zone2 on zone2.FileSQN = zone1.FileSQN
INNER JOIN
(
SELECT
FileSQN,
CASE WHEN value <= 0 THEN NULL ELSE value END AS value
FROM Table1
WHERE Observation in (10,12,13,14,15)
AND DateDiff(day, ProcessEndDate, GETDATE()) <= 1)
)zone3 on zone3.FileSQN = zone2.FileSQN
GROUP BY FileSQN
The technique I ended up using, used the WITH common table expression:
WITH WithTable (FileSQN, value, ObservationNumber)
AS
(
SELECT
FileSQN,
CASE WHEN value <= 0 THEN NULL ELSE value END AS value,
ObservationNumber
FROMTable1
WHERE DateDiff(day, ProcessEndDate, GETDATE()) <= 1)
)
SELECT
zone1.zone1Avg,
zone2.zone2Avg,
zone3.zone3Avg,
FileSQN
FROM
(
SELECT
avg(value) as zone1Avg,
FileSQN
FROM
WithTable
WHERE ObservationNumber in (1,2,3,4,6)
GROUP BY FileSQN
) zone1
INNER JOIN
(
SELECT
avg(value) as zone2Avg,
FileSQN
FROM
WithTable
WHEREObservationNumber in (5,8,11)
GROUP BY FileSQN
)zone2 on zone2.FileSQN = zone1.FileSQN
INNER JOIN
(
SELECT
avg(value) as zone3Avg,
FileSQN
FROM
WithTable
WHEREObservationNumber in (10,12,13,14,15)
GROUP BY FileSQN
)
Summary: The first technique was slower, since I was basically pulling the same info 3 times, compared to the second technique's once.
Note: The names of the tables have been change to protect the innocent. I wrote this in line, so not everything might be syntactically correct
No comments:
Post a Comment