RSS Feed http://www.damba.net

Flexible T-SQL:  Bending T-SQL to Get Around Everyday Programming Obstacles Part IV

This is a last blog in the “Flexible T-SQL” series and will be presented at the June 15, 2009 Macon .NET User’s group.  The other entries can be found here: Bitwise T-SQL Part I, Bitwise T-SQL PartII, Using TSQL Parameter Arrays

This post concerns the Partition Over function of T-SQL combined with Row_Number.  Incidentally, this topic was supposed to be the first blog post, but I happened to have projects that used the other topics so I wound up writing them first.

First we need to create a table and populate it with some test data.  We’ll create a log account that stored daily data on client balances.  Refer to the Create Table Test code at the bottom of the blog.  Now we’ve got three clients with three entries on three separate days.  What we want is the Balance of the account at the start of the day.  The standard Group By and Order by commands won’t return just three entries.  That’s where Partition Over and Row_Number comes in.

The Row_Number function does exactly what it says – it creates a Row Number as part of a select statement.  This function does not stand alone – it must be used in conjunction with Partition Over.  Use the Partition Over command to group the result set in the desired way. 

The format for the command is as follows:

Select Row_Number() Over (Partition by [Column], Order By [Column]) as [RowName] From [Table]

So we take that format and apply it against our data as follows:

SELECT   ClientID, Balance , SystemDateTime
, Row_Number() OVER(PARTITION BY ClientID, CONVERT(varchar(10), SystemDateTime, 101) Order By SystemDateTime asc) AS RowNumber
 FROM Test

ClientID    Balance               SystemDateTime          RowNumber
----------- --------------------- ----------------------- --------------------
1           1526.9761             2009-06-01 06:00:01.000 1
1           366.0195              2009-06-01 06:41:01.000 2
1           473.294               2009-06-01 16:04:59.000 3
1           1342.6356             2009-06-02 02:30:11.000 1
1           672.7358              2009-06-02 10:13:01.000 2
1           1442.8226             2009-06-02 17:06:01.000 3
1           1518.724              2009-06-03 03:08:31.000 1
1           1291.6467             2009-06-03 07:24:03.000 2
1           1657.9617             2009-06-03 16:31:35.000 3
2           1449.6962             2009-06-01 03:58:51.000 1
2           849.5199              2009-06-01 05:41:07.000 2
2           1356.5366             2009-06-01 08:36:49.000 3
2           67.9302               2009-06-02 10:09:53.000 1
2           1675.2756             2009-06-02 14:46:58.000 2
2           930.3946              2009-06-02 15:59:09.000 3
2           1574.3981             2009-06-03 01:36:33.000 1
2           401.0359              2009-06-03 09:47:22.000 2
2           652.8249              2009-06-03 12:15:56.000 3
3           1277.0665             2009-06-01 08:21:38.000 1
3           1257.7315             2009-06-01 09:03:16.000 2
3           1253.5927             2009-06-01 09:11:05.000 3
3           1473.4946             2009-06-02 07:44:24.000 1
3           130.657               2009-06-02 08:12:40.000 2
3           143.3953              2009-06-02 16:54:04.000 3
3           689.1654              2009-06-03 00:19:17.000 1
3           556.0171              2009-06-03 03:33:03.000 2
3           1182.3077             2009-06-03 09:58:59.000 3

That’s a lot of data.  Now let’s narrow it down to just the first record for each day by setting that select table as an inner table in a join.

--Use the Previous table as a join and get the first record for each client
Select dtA.ClientId
 , dtA.Balance
 , CONVERT(varchar(10), SystemDateTime, 101) as Date
From (
 SELECT
  ClientID
  , Balance
  , SystemDateTime
  , Row_Number() OVER(PARTITION BY ClientID, CONVERT(varchar(10), SystemDateTime, 101) Order By SystemDateTime asc) AS RowNumber
 FROM Test
 
 ) dtA
Where RowNumber = 1
Order by  Date, ClientId

Now we have the balance at the beginning of the day for each client.
ClientId    Balance               Date
----------- --------------------- ----------
1           1526.9761             06/01/2009
2           1449.6962             06/01/2009
3           1277.0665             06/01/2009
1           1342.6356             06/02/2009
2           67.9302               06/02/2009
3           1473.4946             06/02/2009
1           1518.724              06/03/2009
2           1574.3981             06/03/2009
3           689.1654              06/03/2009


And that’s all there is to it.  Using Paritition Over with Row_Number gives more flexibility in sorting similar information than just using Order By or Group By. 

Supporting Code:
CREATE TABLE [dbo].[Test](
 [TestId] [char](36) NULL CONSTRAINT [DF_Test_TestId]  DEFAULT (newid()),
 [Description] [varchar](50) NULL,
 [ClientId] [int] NULL,
 [Balance] [money] NULL,
 [SystemDateTime] [datetime] NULL CONSTRAINT [DF_Test_SystemDateTime]  DEFAULT (getdate())
) ON [PRIMARY]

GO
Declare @cash money
 ,@Count Int
 ,@LoopCount Int
 ,@CurrentDate datetime

Select @Count = 3
Set @LoopCount = 1
set @currentdate = '6/1/2009'

While @LoopCount <= @Count

Begin

--create 3 records for 3 days for client replace Client Id with 1, 2, 3
 Insert Into Test (ClientId, Balance, SystemDateTime)
 select 1 as ClientId
 , CAST(CAST(newid() AS binary(3)) AS money) as Balance
 , dateadd(s,  CAST(CAST(newid() AS binary(2)) AS int), @currentdate) as newdate

 Insert Into Test (ClientId, Balance, SystemDateTime)
 select 1 as ClientId
 , CAST(CAST(newid() AS binary(3)) AS money) as Balance
 , dateadd(s,  CAST(CAST(newid() AS binary(2)) AS int), @currentdate) as newdate

 Insert Into Test (ClientId, Balance, SystemDateTime)
 select 1 as ClientId
 , CAST(CAST(newid() AS binary(3)) AS money) as Balance
 , dateadd(s,  CAST(CAST(newid() AS binary(2)) AS int), @currentdate) as newdate

 set @currentdate = dateadd(d,  1, @currentdate)

           Set @LoopCount=@LoopCount + 1

End