Partitioning Questions

Last Post 16 Jun 2011 08:39 PM by rerichards. 4 Replies.
AddThis - Bookmarking and Sharing Button
Author Messages Not Resolved
rerichards
New Member
New Member

--
16 Jun 2011 11:04 AM
Running SQL Server 2008 Enterprise 64 bit on Windows 7.

Before I get to my partitioning questions I need to lay some groundwork...

The tables in our databases all have this same general layout:
CREATE TABLE dbo.MyTable(
  MyTable_UID int identity(1,1),
  ClientID int,
  ...,
  ...,
  CreatedAt datetime,
  ChangedAt datetime,
  CONSTRAINT PK_MyTable_UID PRIMARY KEY CLUSTERED (MyTable_UID, ClientID))

Every client is given a unique ClientID to uniquely differentiate one client from another. Therefore, multiple clients can share the same database.

However, some clients are so large that we put these clients into their own database. In this case, the same ClientID exists in every record, so their is no selectivity on ClientID.

Most (nearly all) of our stored procedure statements have the following in their statements:
WHERE ClientID = @ClientID

A much lower percentage of stored procedure statements contain CreatedAt or ChangedAt:
WHERE ClientID = @ClientID
  AND CreatedAt = @CreatedAt

WHERE ClientID = @ClientID
  AND ChangedAt = @ChangedAt

Question #1: It seems to me that a partitioning key on ClientID would be a good candidate for those databases containing multiple clients. Does that sound reasonable?

Question #2: For those clients residing in their own database, then partion on CreatedAt or ChangedAt?

Question #3: It would be great if I could have the same partitioning key for all databases. If I partitioned all databases based on CreatedAt or ChangedAt, would a statement that does not involve CreatedAt or ChangedAt (such as the following) make use of partition elimination?
SELECT FullName
FROM MyTable
WHERE ClientID = @ClientID
  AND FullName LIKE 'SM%'
rm
New Member
New Member

--
16 Jun 2011 01:03 PM
You may ask yourself why partition that table first. If you partition table by column other than ClientID and query the table by ClientID, may end up to go through all partitions.
rerichards
New Member
New Member

--
16 Jun 2011 03:54 PM
Sorry, but your response does not make sense. I gave three very explicit questions and your response is so vague I cannot tell what point you are trying to make.
gunneyk
New Member
New Member

--
16 Jun 2011 05:27 PM
I am not sure what partitioning under those conditions buys you. Partitioning is first and foremost a data management feature not a performance feature. It is great for things such as sliding windows when you only keep x many partitions around or to maintian indexes on a partition by partition basis IF the indexes are aligned. IN your examples I am not sure why you think normal indexing won't work properly for you. You say that most sps will have WHERE ClientID = @ClientID in them but you don't say what else the WHERE clause will contain except for sometimes the dates. I find it hard to believe that msot sps will return all the rows for a given client so you must have other SARGS in the WHERE clause that narrow down the rows you are looking for. If those SARGS are selective enough then partitioning won't buy you much if anything in terms of performance. A clustered index on ClientID will order the rows such that you would only partially scan the clustered index when specifying the ClientID which is pretty effecient if you wanted all the rows for a given client. Lets take your questions one at a time:

1. Again it may not buy you anything depending on the WHERE clause over a clustered index.
2. How many rows will there be for a given created or ChangedAt value? If this is relatively small then partition elimination will not buy you any more than a normal non-clustered index will get you.
3. If you don't specify the partition key in the WHERE clause how can you expect it to do partition elimination? The partition key is what tells the engine which partions the data is in so if you don't specify it then it will read all partitions to find the correct data.


rerichards
New Member
New Member

--
16 Jun 2011 08:39 PM
We are researching options to scale our growth more vertically (adding more clients to our existing databases) while maintaining our service level agreements and server operations.

Sorry I wasn't clear, but certainly we do have other sargeable arguments in our filters, but the one given is that we always have ClientID.

Even though the clustered index contains ClientID as a key column, it is not the leading column. The identity column is the leading column in the clustered index (MyTable_UID), largely to avoid fragmentation on inserts, since it is ever increasing and ClientID is not ever increasing. We hardly ever use MyTable_UID as a filter, and so the Clustered Index is not used that often for seeks when we have WHERE ClientID = @ClientID, as other non clustered indexes are used. Depending on the need we may create a unique constraint, switching the key column order to ClientID, MyTable_UID.



Acceptable Use Policy
---