Thursday, 8 February 2007

Modern databases have too many tables

We have been using the rules of normal form to design databases for the past 20 years. The problem is that these rules quite often result in databases with far too many tables.

Why so many tables

Let's take a configuration management database (CMDB) for a bank where we store data about all the items within a bank and how they relate to each other. We’d have tables for people, computers, printers, software, departments, projects, locations etc. Let's make a conservative estimate of 30 different entity types to store information on.

Unfortunately this list of tables is only just beginning as we would also want:
  • tables containing lookup values
  • tables to join M:M relationships
  • cache tables (where required)
  • a user table
  • permissions tables
  • a log table
  • a system parameters table
  • temporary work tables
  • data transfer tables for file loading
  • some people also like to split the main tables into current and history tables
The number of tables quickly grows and in our example above will probably reach around 200 tables.

Why is this a bad thing?



More tables mean more code

Every table in your database needs code to support it:
  • Code to add data to the table. This might be a form or a feed.
  • Code to check the integrity of the data. You do check your data don't you?
  • Code to view the data and provide reports.
  • Code to search the data.
The more tables in your design then the more code you need.



More code is always bad

More code means:
  • it will take longer to develop the system
  • it will cost more
  • it will cost more to maintain
  • it will have more errors
  • users will have to wait longer for a solution

The last point is hurts in many ways as the longer your users wait then the higher are their expectations and the less likely you are to meet these increased expectations.

It also means that there is more chance that the needs of the application will have changed radically by the time your solution is ready to roll out.


More tables means fewer features

Let's imagine we have a number of lookup tables in our database. The data in these tables would have been entered by hand as it's probably a waste of time providing a form to enter data into each table.

If we store all these lookup values in just one table then we could easily provide a screen to maintain the data in that one table. We could now spend the time to provide features such as default values, display sequence and HTML code generation as these features will automatically appear on every lookup value in the database.

The problem is that rules of normal form tell us we can't have different types of values being stored in the same fields.

The question is - "does your design really need that many tables?"

2 comments:

psteve said...

Just a gentle, I hope, note to say that you should work on your spelling and grammar a bit. Always is always spelled always and never allways. "The last point is hurts" is not right. "More tables means less features" actually, it means fewer fetaures. "Radicaly" has two l's.

Your point is interesting, and I look forward to seeing you develop it, but you'll get more readers who care about what you say if you take the time to write carefully.

Cheers, and don' tfeel you have to approve this, and you can remove it if you like.

Mike Robinson said...

Thanks for the feedback - spelling changed. I didn't think I'd get any feedback on such an arcane subject.