Naming conventions ... (taking the bait)

Last Post 21 Sep 2006 03:16 AM by Radagast7. 9 Replies.
AddThis - Bookmarking and Sharing Button
Author Messages
jbrown
New Member
New Member

--
08 Sep 2006 12:40 PM
I'm going to take up Michelle's challenge from her September 2006 SQL Mag column.

I name tables containing data about things using singular nouns. Sometimes you need a verb, especially if the table contains more abstract data, but I never use plural forms in any case. Table names should describe what is represented by a single row. Naming tables according to what you might call all of the contents of the table collectively is just forcing people to wear out their "S" keys without adding information.

I never use class words or hungarian type prefixes in SQL object names. Conversly, I always use hungarian notation for symbols (i.e. variables of one sort or another) in my program code. You can easily tell what type of a thing any given SQL object is from how it is used in its context. Adding scope or type prefixes to schemata, tables, and columns (or even stored procedures, triggers etc.) is more wasted typing and making people read past text that contains no added information. The same is not true for VB or C# or even procedural TSQL. The only exception to this convention would be using a suffix on constraint names to indicate the purpose of the constraint. For example, is it a PK or FK or a check constraint etc.

(If you haven't already gotten out your flame throwers, warm 'em up now...)

For column names, I never prefix column names with their table names. In my 17+ years in I.T., I have rarely found people who agree with me on this point. The supposed benefit of prefixing column names with table names is that it makes the symbol clearer. In her column, Michelle expressed this sentiment as "strengthening the identity of the attribute". I believe that what is really at play here is that 60's vintage network databases had technical limitations that required unique names for all database elements. This got picked up by COBOL programmers and propagated out of habit into the 70's era RDBMS world and then picked up by dBase programmers from COBOL programmers in the 80's and the cycle continues.

I believe in the concept of domain name integrity in which a column's name is not just the name of the column, but in fact all of: (SERVER).(Database).(Schema).(Table).(Column). With apologies to Itzik, we don't generally repeat parent names in child names.

The other facet of this same domain name integrity issue is the even more contentious PK/FK naming issue. In my tables, which generally contain a surrogate key, I call the PK "syskey". This has the advantage of avoiding namespace collisions and is relatively readable. My FK columns are generally of the form (ReferencedTableName)_key. If there are multiple FKs to the same parent table, then a qualifying suffix is required. This way, the CHILD.parent_key column joins to the PARENT.syskey column. I couldn't count the number of hours that I've spent arguing with programmers that insist the only proper way to handle this is to have a parent_id column as the PK of PARENT and a parent_id column as the FK in CHILD. They argue that this lends itself to "natural" joins and makes it easy for automated systems that analyze databases etc. etc. etc. In my view, you should never (let's say 98% never) use a natural join. All of your meaningful relationships should be defined declaratively. Your database analysis tools should never have to guess what your relationships are. Also, if it makes sense to join any parent_id to any other parent_id why doesn't it make equal sense to join any quantity column to any other quantity colum, or effective_date to effective_date, and so on and so on.

One last convention that I'll note is stored procedures (and triggers). For these, I use the form: (TableName) _ _ (Verb) _ (Qualifier if Needed)

Assuming that a procedure effects, or uses one table primarily, the idea is to use that table's name, a double underscore, and then a verb describing the natu
GoldenGal
New Member
New Member

--
13 Sep 2006 10:36 AM
Hi Joel,

I'm impressed that you have such a thoroughly-thought-out and organized naming convention -- congratulations on a job well done! I'm not going to take exception to anything that you've said, because you've obviously thought about the impact your naming convention might have on program maintenance, etc.

I have to smile as I read your comments on column name prefixes -- yup, I started working with computers a hundred years ago (seems like), and mostly I was taught by people who programmed in the 60s and 70s, even though I didn't commit to working with databases until the late 80s. I guess their prejudices really did rub off on me! However, I would not have a problem working in your world, with your naming scheme. It's organized, consistent, and meaningful, and to me, that's the most important aspect of a naming convention.

(P.S. that doesn't mean that I've changed my mind about column naming conventions -- I still like my scheme better than yours. However, the primary key name 'syskey' IS a nice idea. I've worked on a job where the pkey name for every table was 'objid' -- it REALLY made stored proc coding EASY, I didn't have to keep looking up pkey column names!)

>>> If there are multiple FKs to the same parent table, then a qualifying suffix is required.

Do you use the Role name as the suffix? It's a little awkward, but, for instance, in a table called SALE, if you're using a supertype table called PERSON to hold records for both customers and sales staff, then the fkey for customer would be named PERSON_key_Customer, and the fkey for salesperson would be named PERSON_key_Salesperson. Very meaningful...

Hold the line with your programmers on the fkey naming conventions -- it's easy, it's clear, it's straightforward, and it sounds like you're the DBA.

>>> I couldn't count the number of hours that I've spent arguing with programmers that insist the only proper way to handle this is to have a parent_id column as the PK of PARENT and a parent_id column as the FK in CHILD.

Sounds like they've been using Visio as a data modeling tool... I don't like that scheme at all.

>>> They argue that this lends itself to "natural" joins and makes it easy for automated systems that analyze databases etc. etc. etc. In my view, you should never (let's say 98% never) use a natural join. All of your meaningful relationships should be defined declaratively.

Amen, brother! NOT enforcing meaningful relationships in an RDBMS is like trying to eat soup with a fork...why bother? If you don't enforce the relationships, then you have to write and schedule a procedure that sweeps through the entire database, looking for orphaned records and inconsistencies between related tables, and then what do you do with them once you find them? Do you just issue a report? a warning? does your stored proc actually remove these records or record these inconsistencies? How much resource would a job like this be sucking up on a daily basis? Or do you just leave the database as is, never check, ignore orphans and inconsistencies? -- that is, until it's time to publish reports, and all of a sudden the numbers are wierd. Why bother using an RDBMS if you're not going to maintain the data integrity by enforcing the relationships?

And don't tell me that you're going to enforce relationships via client-side or middle-tier programs...over time, that idea falls apart. The relationships are between tables in the database; the place to enforce those relationships is in the database, either by DRI or by triggers -- take your pick. Personally, I, too, prefer DRI (declarative referential integrity).

Whoops, sorry, didn't mean to go off like that, but hey, this is a forum, so I can do that.

Thank you, Joel, for sharing with us. Appreciate it!


Radagast7
New Member
New Member

--
21 Sep 2006 03:16 AM
Hello.

Very interesting points here.

Domain Name Integrity

I agree 100% concerning the domain name integrity point, mainly in regards to not having the table name as prefix for column names. In particular I like the Customers.Name over Customers.CustomerName (yes, I do like the 's' suffix - though only because it makes for better pronunciation, in my opinion). I think very few people would prefix properties in eg. a .Net class with the class name. Doesn't help readability - on the contrary.

I can see the point in having syskey/objid, although I do think it's a matter of definitions. Since you'd have to know the name of the table anyway, knowing the name of the key in a ParentID scheme would be simple, and I just like the ParentID semantics better.

Referential Integrity

Always a dividing topic. I totally agree that using a RDBMS ought to commit one to referential integrity, but the argument I hear to do otherwise is 'it's just easier to do modifications, if we don't' - type of arguments.
Of course, one could argue that implementing RI might make modifications needless, but that argument is very hard to make across all domains and solutions. And I find, that the argument about 'why bother with RDBMS if not using RI' rarely bites on those convinced of it's shortcomings.

Hungarian Notation

One final, somewhat off-topic, note on hungarian notation: I really don't see the need for it in .Net code (C# is mentioned). Any notation should always add clarity, where clarity cannot be found, and I don't think this is the case with the .Net environment (finding the type of a variable is simply a matter of hovering over it). It does give some meaning in SPs, though, if using an interface which doesn't help a lot in this area.

My thoughts - on the whole I agree with what's already here, only with slight modifications.

Regards
Jesper Kirk Jensen
SQLUSA
New Member
New Member

--
30 Sep 2006 12:50 PM
I am using Hungarian - style notation not because I am Hungarian, but because I like it.

By the way, it was named after Charles Simonyi, a Microsoft billionaire, the father of MS Windows, a friend of mine.

Table groupings with short prefixes:

mnfInventory
mnfProduct
mnfProductType
mnfProductPicture

mnf for manufacturing

I like to use SINGULAR names, admitting frequently the plural may sound more natural.

For functions "fn" prefix.

Stored proc naming;

<highest category> <next highest>........<detail>

Self-documenting as much as possible

Examples

rptAnnualCorporateSalesSummary
rptAnnualCorporateSalesDetail
rptAnnualDepartmentSalesSummary
rptAnnualDepartmentInventorySummary

Kalman Toth, Database, Data Warehouse and BI Architect
http://www.sqlusa.com/businessintelligence
The Best SQL Server 2005 Training in the World












xfonhe
New Member
New Member

--
10 Oct 2006 06:08 AM
Wow... good stuff here!

I agree with Joel re: pluralization; I find it is almost always utilized inconsistently in a design. Some tables will be pluralized, while others will not. This lack of consistency quickly leads to frustration in coding. My design refrains from pluralization. (note: I really liked Joel's take that "Table names should describe what is represented by a single row"! I'm stealing that one for future arguments )

In naming I use a "modified" Pascal case; by "modified" I mean I use underscores to separate upper case characters within a name. An example would be a column for CPU SKUs. Since both words are all-upper case, the column name would be CPU_SKU, rather than CPUSKU or cpuSKU or CpuSku, etc. "True" Pascal case is supposed to avoid underscores, period... but I have found this particular exception to be beneficial. An example of one of my typical table names: OrgHierarchy. (Note: I do a significant amount of development in Oracle and DB2, both of which have different limitations on naming conventions. Oracle stores all table names in upper case, thus the use of underscores all over the place in an Oracle environment)

There are a couple of places where I consistently use underscores in object names: DRI, Stored Procedures, and User Defined Functions (UDF). I use the following naming convention in DRI: FK_childtable_parenttable; PK_tablename_column(s). For stored procedures, usp_ for "generic" procedures, rpt_ for procedures used in reporting, adm_ for administrative procedures, etc. The pattern is a lower case three-character acronym, followed by an underscore, then a descriptive name for the procedure. ALL of my UDFs are prefixed by ufn_

I have seen a lot of designs that use a prefix for Views; I am not a fan of this, as I use views to obscure underlying table architecture from users. I have not found the view prefix useful in coding either, as I always develop procedures, functions, etc. using the most selective statements possible; which means accessing the tables themselves. Views typically contain references to columns unnecessary to the query I am writing... which is simply inefficient.

Names should be descriptive, but not ridiculously long. A balance must be struck. In many instances, that balance is achieved via acronyms. THIS, I have found, is a major-league nightmare in design!! The problem again is consistency, or lack thereof. This is why a DOCUMENTED library of acceptable acronyms is crucial to any design. Example: Date = DT, Organization = Org, Server = Svr, etc. Deviation from the documented acronyms is NOT TOLERATED! (Note: one gripe I have about use of acronyms is they go too far. Some words that don't need to be shortened are shortened. Best example = DATE. Changing DATE to DT drops a whole two letters... whoop-dee-doo! That is unnecessary IMO. Acronyms should be used sparingly, and only where there is a real benefit).

Another method I have seen used to keep object names short is by dropping non-leading vowels from the object name. Using my OrgHierarchy example from before, the name becomes OrgHrchry. I have found this naming convention to be fairly solid, again, as long as it is used consistently. (starting to see a pattern here?)

Well there's my two cents

X
mwesch
New Member
New Member

--
10 Oct 2006 03:14 PM
Actually X, the rules is that "AEIOU and sometimes Y" are vowels -- see this link.

So using your dropping non-leading vowels system that would make OrgHierarchy into OrgHrrch - I think a rather horrible naming convention.
xfonhe
New Member
New Member

--
11 Oct 2006 06:54 AM
mwesh - eh, it takes some getting used-to. I guess I should have pointed out that in each case I've seen the rule employed "Y" is not counted as a vowel (NOT "sometimes Y").

I prefer to use Pascal case with short, descriptive names; acronyms only where beneficial... and only where the acronym is documented.

X
mwesch
New Member
New Member

--
11 Oct 2006 01:51 PM
I was just having a little fun with the example you gave. I agree in principle that any naming convention is better than none. Personally, I like really long names, not liking to abbreviate anything.
xfonhe
New Member
New Member

--
12 Oct 2006 05:20 AM


Really long names can be a pain when it comes to coding. Of course drag & drop in Management Studio/Query Analyzer offsets that a bit.

After working in Oracle and DB2/U2, and dealing with their limitations on name lengths (sometimes VERY restrictive rules e.g. 18char max), one becomes accustomed to the practice of keeping names short & simple. I find the longer names to be beneficial in that the objects become self-describing, which is always a good thing IMO, but I also find that developers scream bloody murder if the object name is > 10 characters

The one point I tried to drive home is CONSISTENCY. Long names, short names, acronyms, or combinations therof, always be consistent. There is nothing worse (IMO) than an inconsistent, undocumented design... it quickly becomes "spaghetti code", begging to be scrapped and re-architected. I suppose that's a good thing, for consultants like myself. Scratch everything I just said: be inconsistent!! Refrain from documentation! Then call 'ol X... I'll save the day

jbrown
New Member
New Member

--
12 Oct 2006 05:47 AM
I don't mind long names if they are truly descriptive and not just redundant. I also don't mind abbreviations, especially if they are used consistently and even more so if there is a published dictionary of official abbreviations - as long as this doesn't get out of hand!

I had a client many years ago that was very stringent about their abbreviations. They had an extensive abbreviation dictionary and rules for further shortening object names if the abbreviations weren't enough to get under the name length cap. A colleague of mine was team-programming with an employee of this organization who was dutifully zealous about the shop's rules. This colleague thought that the abbreviations had passed from the sublime to the ridiculous (i.e. ACCOUNT becomes ACT, not ACCT). In order to register his protest, my colleague used to write code with really long names for the employee to trip over, such as: boolUnsavedChangesNeedSavingFlag.

Regards,

Joel Brown
Mooseware Ltd.


Acceptable Use Policy
---