Creating and Naming Database Tables and Fields
This series of articles is based on an outline I made while reading
"Database
Design for Mere Mortals" by Michael J. Hernandez. Mike graciously
gave permission for its use in this derived work. I highly recommend
the book to anyone who wants a good grasp of database fundamentals.
The series is intended to familiarize the uninitiated with the concepts
of database design. I'll include an extensive bibliography for those
interested in more technical or in-depth coverage.
Why Use A Database?
Data overload is a common problem in business today. Corporations and
individuals have plenty of raw data, but can't always find it or aren't
aware that they even have it. Raw data must be filtered and organized
to become useful information. Databases are a primary tool for the task;
a tool which takes advantage of the speed and power of modern computers.
Goals
Creating a database can be a daunting task to a beginner. As with most
complex projects, good database design can be broken down into discrete
steps which are easier to manage.
In a future article, we'll discuss the information-gathering process
necessary before database design even begins. For now, let's talk about
something less abstract — tables and fields. For the examples
in this article I'll try to provide samples which reflect real-world
situations.
Terminology
First, some terminology. We won't be using the highly technical terms
"relations" and "tuples"; even in the manuals they introduce the terms
and then go right back to "tables" and "rows" so we'll skip the interim
step. However, the technical term "attributes" works well to describe
the concept embodied in "columns" so we'll use whichever term seems
more understandable in context. And each "record" makes up one "row,"
another familiar term. Again, we'll use whichever fits best conceptually.
Finally, for each row there will be an "attribute" in each row. This
is a "field."
A database is made up of one or more tables. Each table is made up
of one or more rows of information. Each row contains various attributes,
one in each column.
From Flat Spreadsheet to Relational Database
If you're familiar with spreadsheet software, you're familiar with
tables of rows and columns. For instance, you might have columns called
"Name," "Company," "Address," "Specialty," and "Phone." This works fine
for a while, but you soon realize that if you have multiple entries
from the same company, the same specialty, or perhaps even the same
name, you're duplicating entries. One of the geeky things database administrators
get excited about is efficiency. It's not efficient to store the company
name "Spinhead Web Design" twelve times; it's the same every time, right?
And if you have two contacts at Spinhead who answer the main phone number,
you've entered that twice, haven't you? And look; you have "Database
Administrator" listed for three different contacts. This obviously isn't
a big issue with small databases, but in your plans for world domination,
small databases will rapidly be replaced by huge databases. Plan
ahead. Build efficiently. [1]
How can we make the spreadsheet more efficient?
If there were a way to use multiple spreadsheets, one for each type
of data, and have the spreadsheets share information, you could have
the "People" spreadsheet linked to the "Skills" and "Companies" spreadsheets.
Each spreadsheet would only contain unique records,
not duplicates [2]. If two people
had the same skill or worked for the same company, they could both refer
to the same record the appropriate spreadsheet.
Congratulations. You now understand the basic concept behind a relational
database.
Tables and Fields
Each of our spreadsheets compares to a table in our "Contacts" database.
But how exactly do we decide what each field should contain? And which
fields go in which table?
In the real world, a certain amount of research precedes any real database
design work. We'll discuss it further in a future article, but essentially,
through interviews with the intended users and analysis of the existing
processes, we have collected a long and probably disorganized list of
possible candidates. Through two processes, we'll decide which will
become fields, which will become tables, and which fields go in which
tables.
First, look for nouns which identify an object or event but do not
modify some other object or event. This list is our preliminary list
of Tables. At the same time, we're doing the opposite; looking for nouns
which do modify another object or event. This list becomes
our preliminary list of Fields. For example, "vendor" would go on the
"Tables" list, but "phone number" is likely a characteristic of "vendor"
and belongs on the "Fields" list. "Tables" usually refer to collections
of things; something that can be broken down into smaller components.
"Fields" would be those components. Don't worry about precise names
yet. We'll stick to simple descriptive words or phrases for now.
So far we haven't done anything to organize our fields into tables.
Assigning Fields to Tables
Once we're certain the our "Tables" and "Fields" lists are complete,
it's time to assign each field to the appropriate table. For each entry
on the preliminary field list, choose the table which it describes or
to which it applies. If some fields don't seem to fit any table, set
them aside for the moment. We'll be altering our lists as we proceed,
so don't worry about orphaned tables or fields.
Now, let's take a closer look to ensure that our tables and fields
are as efficient as possible at this stage. A short checklist for an
ideal table:
- Represents only a single subject
- Has a field (or pair of fields) which uniquely identify each row
(the Primary Key; more on that later)
- Does not contain unnecessary duplicate fields
- No repetition of the same type of value
- No fields which belong in other tables (we'll address links between
tables later)
And for an ideal field:
- Represents a characteristic of a table subject
- Contains a single value
- Is atomic; that is, not multi-part [3]
- Is not calculated or concatenated
- Is unique throughout the database structure
Naming Tables and Fields
Once we've reached this point, let's establish our table and field
names. Names for all database elements should be:
- Unique
- Clear, meaningful, unambiguous
- Short
Some restrictions for naming tables:
- No acronyms or abbreviations
- No proper names or words which unduly limit the data which can be
entered
- Should not imply more than one subject
- Should be plural
And for naming fields:
- No acronyms
- Use abbreviations only if clear and meaningful
- Should not imply more than one subject
- Should be singular
You'll probably see some duplicate field names, such as "Name" in both
the "Companies" and "People" tables. Let's make them unique across the
database. You might choose "ContactName" and "CompanyName" but whatever
you use, stick to the guidelines above, and be consistent.
Review
Database tables and fields take the simple spreadsheet concept to another
dimension. Establishing tables which refer to a single subject, populated
with fields which refer specifically to that subject, will provide a
sound foundation for future growth.
You'll be using table and field names frequently in your database work.
Make sure they're succinct, but not arcane.
Give it a try. On paper, make up some sample lists of subjects and
characteristics. Group the characteristics by subject, and try your
hand at meaningful names.