A small database has the following business rules:

Joe Celko, Contributor

August 20, 2007

5 Min Read

For this month's puzzler, consider a posting in a Newsgroup by Patrick L. Nolan at Stanford University. He has a small database with the following business rules:

1) Every person in the database is uniquely defined by a single key, their user_id.

2) Everyone is assigned a job category, call them 'A', 'B' and 'X'.

3) Everyone in job category 'X' has a supervisor who must be in either job category 'A' or job category 'B'.

4) Nobody can be their own supervisor.

One proposal was to divide job category 'X' into two, call them 'XA' and 'XB' respectively. All the 'XA' people would have 'A' supervisors, and all the 'XB' people would have 'B' supervisors.

Nolan immediately noticed that there is redundancy and the possibility of inconsistency. Suppose somebody in job category 'XA' somehow gets assigned to a supervisor in job category 'B', contrary to the definition of 'XA'. Can you think of a way to do this in pure DDL?ANSWER:

The first temptation is create a look up table for the job categories like this:

CREATE TABLE JobCategories (job_cat CHAR(2) NOT NULL PRIMARY KEY, -- {'A', 'B', 'X', 'XA', 'XB'} job_cat_description VARCHAR(50) NOT NULL);

Now I need a users table that has the constraints:

CREATE TABLE Users (user_id INTEGER NOT NULL PRIMARY KEY, job_cat CHAR(2) NOT NULL REFERENCES JobCategories(job_cat), super_job_cat CHAR(2) NOT NULL REFERENCES JobCategories(job_cat), CHECK (CASE WHEN job_cat IN ('XA', 'XB') AND super_job_cat = 'X' THEN 'T' ELSE 'F' END = 'T') Etc);

This is not a good answer. I don't have any knowledge of who the supervisor is. I don't know what to do with a 'non-X' situation.

The better answer lies in the observation that Users and Job Assignments are fundamentally different. Users are entities and Job Assignments are relations thus we need two tables. The job categories are so short you could put them in a CHECK() constraint, but what the heck, let's allow for expansion and flexibility.

The first trick is to have a super key in the Users table that can be referenced by the job assignments. This adds the business rule that a user has one and only one job category.

CREATE TABLE Users (user_id INTEGER NOT NULL PRIMARY KEY, -- key job_cat CHAR(1) NOT NULL REFERENCES JobCategories(job_cat), UNIQUE (user_id, job_cat), -- super key! etc.);

-- this could be a CHECK() in Users table CREATE TABLE JobCategories (job_cat CHAR(1) NOT NULL PRIMARY KEY, -- {'A', 'B', 'X'} job_cat_description VARCHAR(50) NOT NULL);

The job assignments use the super key as their foreign key. Notice the use of a role prefix on the data element names.

CREATE TABLE JobAssignments (sub_user_id INTEGER NOT NULL, sub_job_cat CHAR(1) NOT NULL, FOREIGN KEY (sub_user_id, sub_job_cat) REFERENCES Users(user_id, job_cat),

super_user_id INTEGER NOT NULL, super_job_cat CHAR(1) NOT NULL, FOREIGN KEY (super_user_id, super_job_cat) REFERENCES Users(user_id, job_cat),

-- the tricky part! CHECK (sub_user_id <> super_user_id), -- assumed

CHECK (CASE WHEN sub_job_cat = 'X' AND super_job_cat IN ('A', 'B') THEN 'T' WHEN sub_job_cat = 'A' AND < THEN 'T' WHEN sub_job_cat = 'B' AND < THEN 'T' ELSE 'F' END = 'T'), PRIMARY KEY (sub_user_id, super_user_id), etc.); This assures nobody is his own supervisor and that everyone in job category 'X' has a supervisor, who must be in either job category 'A' or job category 'B'. But again we do not know what to do about 'A' and 'B' users. You can easily expand the CASE expression to as complicated a set of rules as you wish. CASE expression can also be nested inside each other, too. The question is whether to use positive or negative logic. That is, should the WHEN clauses test for TRUE conditions and accept a row, or test for FALSE conditions and reject a row. For example, in this problem, what if we only reject an 'X' category user without a proper supervisor and accept any other situation? CHECK (CASE WHEN sub_job_cat = 'X' AND super_job_cat NOT IN ('A', 'B') THEN 'F' ELSE 'T' END = 'T') In this example, we have a more compact CASE expression, but that is not always true. When you have really complicated rules, I strongly recommend getting a copy of Logic Gem, a Windows-based decision table tool. You fill in a spreadsheet-like form with conditions that that create your business rules. Once you've defined the rules, the editor will automatically analyze them. It will add missing rules and remove rules that are redundant or contradictory. You know for certain that you have logically complete business rules from which you can automatically generate source code. Joe Celko is an independent consultant in Austin, Texas, and the author of SQL Puzzles and Answers (2006), Joe Celko's SQL for Smarties: Advanced SQL Programming (2005), and Joe Celko's Trees and Hierarchies in SQL for Smarties (2004).A small database has the following business rules: 1) Every person in the database is uniquely defined by a single key, their user_id; 2) Everyone is assigned a job category, call them 'A', 'B' and 'X'; 3) Everyone in job category 'X' has a supervisor who must be in either job category 'A' or job category 'B'; 4) Nobody can be their own supervisor. One proposal was to divide job category 'X' into two, call them 'XA' and 'XB' respectively. All the 'XA' people would have 'A' supervisors, and all the 'XB' people would have 'B' supervisors... This presents a redundancy and the possibility of inconsistency...

About the Author(s)

Never Miss a Beat: Get a snapshot of the issues affecting the IT industry straight to your inbox.

You May Also Like


More Insights