This is my contribution to the “recurring, revolving blog party” that is T-SQL Tuesday. Thanks to this month’s host, Jes Borland, for the topic!
On a consulting project some months ago, our client had a database full of daily financial data. For every day that the markets were open, we had a row with some data about the trades that had happened that day. If we ignore all the sticky details, it looked something like this:
date | high | low |
---|---|---|
3/8/2011 | 10 | 8 |
3/9/2011 | 11.5 | 9 |
3/10/2011 | 10.5 | 8.5 |
One of the reports we wanted to generate from this data was a day-on-day comparison, where each day’s high and low were compared with the previous day’s to see how it was changing. The goal was something that looked like:
date | today’s high | yesterday’s high | change | today’s low | yesterday’s low | change |
---|---|---|---|---|---|---|
3/9/2011 | 11.5 | 10 | +1.5 | 9 | 8 | +1 |
3/10/2011 | 10.5 | 11.5 | -1 | 8.5 | 9 | -.5 |
This is simple enough, right? I fired up my editor and, quick as could be, dashed off something like the following query:
select today.[date], today.high, yesterday.high, today.high - yesterday.high as dhigh,
today.low, yesterday.low, today.low-yesterday.low as dlow
from data as today join data as yesterday on yesterday.[date] = DATEADD(day, - 1 , today.date)
|
Swell! Yesterday is today minus a day. That makes perfect sense. Everyone, from historians to small children and especially bright squirrels will tell you that’s what yesterday is.
Except, it doesn’t work.
Because in the markets we were dealing with, ‘yesterday’ isn’t just the day before today, it’s the trading day before today. So if I extend our sample data set back a few days, it might look like this:
date | high | low |
---|---|---|
3/3/2011 | 11 | 9 |
3/4/2011 | 10.5 | 8 |
3/7/2011 | 10 | 7.5 |
3/8/2011 | 10 | 8 |
3/9/2011 | 11.5 | 9 |
3/10/2011 | 10.5 | 8.5 |
And there’s the hitch. When the query above looks for the yesterday to match 3/7, it’s not going to find it, because there was no trading on 3/6 (a Sunday), and so no data. And now, whoops, I’ve just removed Monday’s from our report. Pro tip: despite only taking up 20% of the week, Mondays are kind of important.
But surely this is something I can correct for, I thought. Mondays happen on a pretty regular schedule, I should be able to have a condition somewhere that’ll switch my dateadd(day, -1, today.date to dateadd(day, -3, today.date, and there, solved.
Except, I realized, that won’t work either. Because now Mondays are back, unless there’s a holiday on Friday. And holidays in the middle of the week are going to screw up the days that come after them. But surely there are a finite, predetermined number of holidays. We should be able to figure out when they’re going to be and have some logic to support that, right? In short, no. As a taste of why not, take a look at the method for calculating the date of Easter.
And so, we come finally to the aggregate method portion of our story. If we can’t determine a priori when yesterday is, the only thing left to do is to pull out a nested aggregate query and let the data tell us:
select today.[date], today.high, yesterday.high, today.high - yesterday.high as dhigh,
today.low, yesterday.low, today.low-yesterday.low as dlow
from data as today join data as yesterday on yesterday.[date] = (select MAX([date]) from data
where [date] <= DATEADD(day, - 1 , today.date))
|
It’s certainly not the most efficient date calculation I’ve ever done, but it does satisfy our requirements: it includes every day that has data for both that day and the previous trading day, regardless of weekends, holidays and other assorted calendar oddness. And that’s the story of how I ended up using a nested aggregate query to figure out when yesterday was. What would you do?
Share the post "T-SQL Tuesday 016: When is yesterday, exactly?"