Median is not a default aggregate in SQL-Server, but is sometime a perferable statistical function than Average. So here is quick tip on how I got the median:
Define: Median -
For Example:
ROW_NUMBER() OVER (PARTITION BY Date, Array, Inverter ORDER BY Current ASC
-and-
ROW_NUMBER() OVER (PARTITION BY Date, Array, Inverter ORDER BY Current DESC
By using these two values I can pull the absolute value from their subtraction
ABS(ASCRow - DescRow)
To get an approximation of where it is associated with the median. Now if the values you are sorting are distinct, then you can find the median by looking for numbers that are less than or equal to 1 and then taking their average.
ABS(ASCRow - DescRow) <=1
In my case I can get more than a count of 2 numbers that are at the median, in this case I did another ROW_NUMBER function but this time order on the absolute difference and then I selected the top row in an outer query:
ROW_NUMBER() OVER (PARTITION BY Date, Array, Inverter ORDER BY MedianDistance) as ROW_NUM
This is where I stopped, even though there is still the chance of not getting the "true" Median. I would basically need to go further and pull all data points with the same value as the MedianDistance for those partitions and do an average.
Define: Median -
"The median of a finite list of numbers can be found by arranging all the observations from lowest value to highest value and picking the middle one. If there is an even number of observations, then there is no single middle value; the median is then usually defined to be the mean of the two middle values" - Wikipedia http://en.wikipedia.org/wiki/MedianBy using ROW_NUMBER() function twice I can get my approximation of where the median is located.
For Example:
ROW_NUMBER() OVER (PARTITION BY Date, Array, Inverter ORDER BY Current ASC
-and-
ROW_NUMBER() OVER (PARTITION BY Date, Array, Inverter ORDER BY Current DESC
By using these two values I can pull the absolute value from their subtraction
ABS(ASCRow - DescRow)
To get an approximation of where it is associated with the median. Now if the values you are sorting are distinct, then you can find the median by looking for numbers that are less than or equal to 1 and then taking their average.
ABS(ASCRow - DescRow) <=1
In my case I can get more than a count of 2 numbers that are at the median, in this case I did another ROW_NUMBER function but this time order on the absolute difference and then I selected the top row in an outer query:
ROW_NUMBER() OVER (PARTITION BY Date, Array, Inverter ORDER BY MedianDistance) as ROW_NUM
This is where I stopped, even though there is still the chance of not getting the "true" Median. I would basically need to go further and pull all data points with the same value as the MedianDistance for those partitions and do an average.
No comments:
Post a Comment