Celko's 'Can't Be Your Own Supervisor' SQL Puzzle - InformationWeek

InformationWeek is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them.Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.

IoT
IoT
Software // Information Management
Commentary
8/20/2007
11:35 AM
Joe Celko
Joe Celko
Commentary
50%
50%

Celko's 'Can't Be Your Own Supervisor' SQL Puzzle

A small database has the following business rules: 1) Every person in the database is uniquely defined by a single key, their user_id; 2) Everyone is assigned a job category, call them 'A', 'B' and 'X'; 3) Everyone in job category 'X' has a supervisor who must be in either job category 'A' or job category 'B'; 4) Nobody can be their own supervisor. One proposal was to divide job category 'X' into two, call them 'XA' and 'XB' respectively. All the 'XA' people would have 'A' supervisors, and all

For this month's puzzler, consider a posting in a Newsgroup by Patrick L. Nolan at Stanford University. He has a small database with the following business rules:

1) Every person in the database is uniquely defined by a single key, their user_id.

2) Everyone is assigned a job category, call them 'A', 'B' and 'X'.

3) Everyone in job category 'X' has a supervisor who must be in either job category 'A' or job category 'B'.

4) Nobody can be their own supervisor.

One proposal was to divide job category 'X' into two, call them 'XA' and 'XB' respectively. All the 'XA' people would have 'A' supervisors, and all the 'XB' people would have 'B' supervisors.

Nolan immediately noticed that there is redundancy and the possibility of inconsistency. Suppose somebody in job category 'XA' somehow gets assigned to a supervisor in job category 'B', contrary to the definition of 'XA'. Can you think of a way to do this in pure DDL?ANSWER:

The first temptation is create a look up table for the job categories like this:

CREATE TABLE JobCategories (job_cat CHAR(2) NOT NULL PRIMARY KEY, -- {'A', 'B', 'X', 'XA', 'XB'} job_cat_description VARCHAR(50) NOT NULL);

Now I need a users table that has the constraints:

CREATE TABLE Users (user_id INTEGER NOT NULL PRIMARY KEY, job_cat CHAR(2) NOT NULL REFERENCES JobCategories(job_cat), super_job_cat CHAR(2) NOT NULL REFERENCES JobCategories(job_cat), CHECK (CASE WHEN job_cat IN ('XA', 'XB') AND super_job_cat = 'X' THEN 'T' ELSE 'F' END = 'T') Etc);

This is not a good answer. I don't have any knowledge of who the supervisor is. I don't know what to do with a 'non-X' situation.

The better answer lies in the observation that Users and Job Assignments are fundamentally different. Users are entities and Job Assignments are relations thus we need two tables. The job categories are so short you could put them in a CHECK() constraint, but what the heck, let's allow for expansion and flexibility.

The first trick is to have a super key in the Users table that can be referenced by the job assignments. This adds the business rule that a user has one and only one job category.

CREATE TABLE Users (user_id INTEGER NOT NULL PRIMARY KEY, -- key job_cat CHAR(1) NOT NULL REFERENCES JobCategories(job_cat), UNIQUE (user_id, job_cat), -- super key! etc.);

-- this could be a CHECK() in Users table CREATE TABLE JobCategories (job_cat CHAR(1) NOT NULL PRIMARY KEY, -- {'A', 'B', 'X'} job_cat_description VARCHAR(50) NOT NULL);

The job assignments use the super key as their foreign key. Notice the use of a role prefix on the data element names.

CREATE TABLE JobAssignments (sub_user_id INTEGER NOT NULL, sub_job_cat CHAR(1) NOT NULL, FOREIGN KEY (sub_user_id, sub_job_cat) REFERENCES Users(user_id, job_cat),

super_user_id INTEGER NOT NULL, super_job_cat CHAR(1) NOT NULL, FOREIGN KEY (super_user_id, super_job_cat) REFERENCES Users(user_id, job_cat),

-- the tricky part! CHECK (sub_user_id <> super_user_id), -- assumed

CHECK (CASE WHEN sub_job_cat = 'X' AND super_job_cat IN ('A', 'B') THEN 'T' WHEN sub_job_cat = 'A' AND < THEN 'T' WHEN sub_job_cat = 'B' AND < THEN 'T' ELSE 'F' END = 'T'),

PRIMARY KEY (sub_user_id, super_user_id), etc.);

This assures nobody is his own supervisor and that everyone in job category 'X' has a supervisor, who must be in either job category 'A' or job category 'B'. But again we do not know what to do about 'A' and 'B' users. You can easily expand the CASE expression to as complicated a set of rules as you wish. CASE expression can also be nested inside each other, too.

The question is whether to use positive or negative logic. That is, should the WHEN clauses test for TRUE conditions and accept a row, or test for FALSE conditions and reject a row. For example, in this problem, what if we only reject an 'X' category user without a proper supervisor and accept any other situation?

CHECK (CASE WHEN sub_job_cat = 'X' AND super_job_cat NOT IN ('A', 'B') THEN 'F' ELSE 'T' END = 'T')

In this example, we have a more compact CASE expression, but that is not always true. When you have really complicated rules, I strongly recommend getting a copy of Logic Gem, a Windows-based decision table tool. You fill in a spreadsheet-like form with conditions that that create your business rules. Once you've defined the rules, the editor will automatically analyze them. It will add missing rules and remove rules that are redundant or contradictory. You know for certain that you have logically complete business rules from which you can automatically generate source code.

Joe Celko is an independent consultant in Austin, Texas, and the author of SQL Puzzles and Answers (2006), Joe Celko's SQL for Smarties: Advanced SQL Programming (2005), and Joe Celko's Trees and Hierarchies in SQL for Smarties (2004).A small database has the following business rules:

1) Every person in the database is uniquely defined by a single key, their user_id; 2) Everyone is assigned a job category, call them 'A', 'B' and 'X'; 3) Everyone in job category 'X' has a supervisor who must be in either job category 'A' or job category 'B'; 4) Nobody can be their own supervisor. One proposal was to divide job category 'X' into two, call them 'XA' and 'XB' respectively. All the 'XA' people would have 'A' supervisors, and all the 'XB' people would have 'B' supervisors... This presents a redundancy and the possibility of inconsistency...

We welcome your comments on this topic on our social media channels, or [contact us directly] with questions about the site.
Comment  | 
Print  | 
More Insights
Slideshows
IT Leadership: 10 Ways to Unleash Enterprise Innovation
Lisa Morgan, Freelance Writer,  6/8/2021
Commentary
Preparing for the Upcoming Quantum Computing Revolution
John Edwards, Technology Journalist & Author,  6/3/2021
News
How SolarWinds Changed Cybersecurity Leadership's Priorities
Jessica Davis, Senior Editor, Enterprise Apps,  5/26/2021
White Papers
Register for InformationWeek Newsletters
Video
Current Issue
Planning Your Digital Transformation Roadmap
Download this report to learn about the latest technologies and best practices or ensuring a successful transition from outdated business transformation tactics.
Slideshows
Flash Poll