Reference Data Management: What, Why, and How

Razza Dimension Server 5.0 offers a modern approach to reference data management.



At first, hierarchy and reference data management may not sound like something to get excited about, assuming you even grasp what the term means, since it sounds esoteric and complex. Fortunately, reference (or master) data management is exactly what you would imagine it to be: the management of data that typically resides in "master" tables, such as customer, location, product, and, of course, the innumerable "type" tables that clutter up our databases. This data also sometimes goes by the rather fancy alias of "dimensions," particularly in the context of data warehousing.

Mastering the Reference Dimension

Reference data can legitimately claim to be the Rodney Dangerfield of data — it just doesn't get any respect. And frankly, who's too bothered about all those dimensions? It's the facts that we're after. Yet managing reference data — particularly hierarchical reference data such as product and geographic hierarchies — has been the bane of many an application database, whether for custom OLTP applications, ERP, or data warehousing and business intelligence. Architects, modelers, and managers of data know that in the fairyland of data, reference data is the imp: hard to control and constantly creating mischief. And, although it is facts such as dollar sales or revenue that users pursue, these facts are meaningless without the context provided by their dimensions — in other words, the reference data. The number 1,000,000 tells us nothing — until we learn it's a sales amount, stated in currency C, for product X, in the region R of nation N, achieved by salesperson P, in the year Y and month M — each of which is reference data! Reference data is tremendously important because it provides a frame of reference to information, without which the information is meaningless.

Why is reference data so hard to manage? Because there's no single definition of such data. The profusion of system-of-record applications and data sources in today's world leads, more often than not, to multiple sources of reference data, each of which is true to its own domain, but may or may not agree with others. This situation is usually further confounded by a pervasive lack of coordination and standards for reference data, at both the business process and technology levels. Every IT solution that needs reference data typically builds containers and presentation components for it or builds custom bridges to other existing data sources — and thus adds more threads to the spaghetti of reference data. All this complexity, effort, and cost could largely be avoided by defining a single source of truth for reference data. The Razza Dimension Server from Razza Solutions exists to build such a single source of truth. (See Figure 1.)

Figure 1. Razza Dimension Server intervenes between source and reporting/analysis systems to ensure data hierarchies are clean.

Introducing Razza Dimension Server

The stated purpose of the Razza Dimension Server (hereafter, RazzaDS) is "to greatly simplify master data harmonization across multiple enterprise systems." RazzaDS is conceptually a simple solution. Recognizing that most dimensions are basically a hierarchy, RazzaDS lets users develop their own hierarchies of business dimensions, without regard for the underlying metadata aspects such as data types and lengths. For example, say that users want to create a geographical hierarchy consisting of countries, regions, states, and postal codes. The usual data modeling approach would be to define these as entities, connected by identifying or nonidentifying relationships. In RazzaDS, users do away with all metadata concerns and directly enter the hierarchy values into a simple, intuitive, hierarchical folderlike user interface. (See Figure 2.) Thus, users could directly enter the value "USA," followed by four child regions: West, Midwest, South, and East. Then, under the South region, users could enter states, such as Florida, Georgia, and Louisiana. Finally, postal codes would be entered under each of these states. The nodes (called either limbs or leaves in RazzaDS) have properties, which characterize the nodes. (Actually, properties can also be defined at the hierarchy or version level. More on versions later.)

Figure 2. Users enter hierarchy values into a simple, folderlike user interface.
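To make the folder metaphor concrete, here is a minimal Python sketch of the geographic example as a plain tree of named values, with no data types or lengths declared anywhere. It is not RazzaDS code: the Node class, the render helper, the sample postal code, and the rule that a node with children is a limb and one without is a leaf are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One hierarchy value; a hypothetical stand-in for a RazzaDS limb or leaf."""
    name: str
    children: list["Node"] = field(default_factory=list)

    def add(self, name: str) -> "Node":
        """Create a child under this node and return it."""
        child = Node(name)
        self.children.append(child)
        return child

    @property
    def kind(self) -> str:
        # Assumption: a node with children is a limb, one without is a leaf.
        return "limb" if self.children else "leaf"


def render(node: Node, depth: int = 0) -> list[str]:
    """Return the folder-style outline as indented lines."""
    lines = ["  " * depth + f"{node.name} ({node.kind})"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# Enter values top-down, the way a user would in the folder view.
usa = Node("USA")
regions = {name: usa.add(name) for name in ("West", "Midwest", "South", "East")}
for state in ("Florida", "Georgia", "Louisiana"):
    regions["South"].add(state)
regions["South"].children[0].add("33101")  # illustrative postal code under Florida

outline = render(usa)
print("\n".join(outline))
```

The point of the sketch is that the hierarchy is defined entirely by the values entered and where they sit, which is the modeling shortcut the review describes.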

One of the strongest features of RazzaDS is that properties can be local or global, primary or derived (my terms), declared or inherited, newly defined or predefined (such as for Essbase), and more. In data modeling terms, properties are the non-key attributes of the dimensional entities.
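The declared-versus-inherited distinction can be sketched in a few lines of Python. The review does not say how RazzaDS resolves an inherited property, so this sketch assumes the simplest rule: a node uses its own declared value if it has one, and otherwise takes the value from its nearest ancestor. The class, method, and property names are hypothetical.

```python
class Node:
    """Hierarchy node with declared properties; inheritance rule is an assumption."""

    def __init__(self, name, parent=None, **declared):
        self.name = name
        self.parent = parent
        self.declared = dict(declared)  # properties set directly on this node

    def lookup(self, prop):
        """Return (value, origin), where origin says how the value was found."""
        if prop in self.declared:
            return self.declared[prop], "declared"
        ancestor = self.parent
        while ancestor is not None:  # walk up until some ancestor declares it
            if prop in ancestor.declared:
                return ancestor.declared[prop], "inherited"
            ancestor = ancestor.parent
        return None, "undefined"


usa = Node("USA", currency="USD")
south = Node("South", parent=usa, region_code="S")
florida = Node("Florida", parent=south)

print(florida.lookup("currency"))     # found on USA, two levels up
print(south.lookup("region_code"))    # set directly on South
print(florida.lookup("population"))   # nobody declares it
```

Under this assumed rule, a value entered once at the top of a hierarchy is available everywhere beneath it unless a lower node declares its own.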



RazzaDS enforces referential integrity by means of two rules: A node can't occur more than once in a single hierarchy (thus preventing circular references), and if a node participates in more than one hierarchy, it must have the same children in all the hierarchies. Additionally, user-defined validations, ranging from simple to complex, can be applied to the hierarchies in real-time or batch mode.
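The two integrity rules are easy to state as checks. The following Python sketch is one way to express them and is not taken from the product: it assumes each hierarchy is stored as a mapping from a parent name to its list of child names, and it treats "same children" as the same set of names regardless of order. Function names are hypothetical.

```python
from collections import Counter

# Assumed representation: parent name -> list of child names.
Hierarchy = dict[str, list[str]]


def repeated_nodes(root: str, hierarchy: Hierarchy) -> list[str]:
    """Rule 1: list nodes that occur more than once in a single hierarchy.

    The root counts once, and every listing as a child counts once, so a
    node that loops back to an ancestor is reported as a repeat.
    """
    counts = Counter([root])
    for children in hierarchy.values():
        counts.update(children)
    return sorted(name for name, n in counts.items() if n > 1)


def inconsistent_nodes(hierarchies: list[Hierarchy]) -> list[str]:
    """Rule 2: list shared nodes whose children differ between hierarchies."""
    first_seen: dict[str, frozenset[str]] = {}
    bad: set[str] = set()
    for hierarchy in hierarchies:
        members = set(hierarchy) | {c for kids in hierarchy.values() for c in kids}
        for name in members:
            kids = frozenset(hierarchy.get(name, []))
            # Remember the first child set seen for a node; flag any later mismatch.
            if first_seen.setdefault(name, kids) != kids:
                bad.add(name)
    return sorted(bad)


by_region = {"USA": ["South", "East"], "South": ["Florida", "Georgia"]}
by_territory = {"Sales": ["South"], "South": ["Florida"]}
looped = {"USA": ["South"], "South": ["USA"]}

print(repeated_nodes("USA", by_region))
print(repeated_nodes("USA", looped))
print(inconsistent_nodes([by_region, by_territory]))
```

In the example, South appears in both hierarchies but with different children, so the second check flags it; the looped hierarchy fails the first check because USA shows up both as the root and as a child.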

Hierarchies can be managed as versions. For example, the 2004Q2 version of the Accounts hierarchy can be in production while 2004Q3 is being finalized (from a copy of the 2004Q2 version, of course).
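The copy-then-edit workflow can be sketched as follows. This is an illustration only, assuming versions are held in a dictionary keyed by name; the status labels, the copy_version function, and the sample account names are invented for the example. The key detail is that the copy must be deep, so that edits to the new version never leak into the one in production.

```python
import copy

# Hypothetical store: version name -> status plus its hierarchies.
versions = {
    "2004Q2": {
        "status": "production",
        "hierarchies": {"Accounts": {"Assets": ["Cash", "Receivables"]}},
    }
}


def copy_version(store: dict, source: str, target: str) -> dict:
    """Create a working version as an independent deep copy of an existing one."""
    draft = copy.deepcopy(store[source])  # nested lists and dicts are duplicated too
    draft["status"] = "draft"
    store[target] = draft
    return draft


draft = copy_version(versions, "2004Q2", "2004Q3")
draft["hierarchies"]["Accounts"]["Assets"].append("Inventory")

print(versions["2004Q2"]["hierarchies"]["Accounts"]["Assets"])
print(versions["2004Q3"]["hierarchies"]["Accounts"]["Assets"])
```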

Hierarchies can be imported and exported. In fact, when you deploy RazzaDS, you mostly import hierarchies from existing systems of record (using XML) rather than have users enter them manually.

Pros

  1. Competent solution for reference data management
  2. Effective and user-friendly interface
  3. Support for XML/SOAP

Cons

  1. No support for modeling of reference data
  2. Can be disruptive to existing apps
  3. Small, privately held single-product company

RazzaDS enforces security at the version, hierarchy, node, and property levels. In addition, it has function-level security: users can be granted or denied access to specific functions.

RazzaDS comes in two architectural flavors: a small, traditional client/server version and a larger, enterprise-level Web solution based on Microsoft Windows technology. It doesn't support Unix. Deploying RazzaDS is relatively easy; however, it potentially requires the critical step of cleansing and integrating existing reference data sources in order to load the data into RazzaDS. The effort required for this step will depend on the quality and cohesion of existing data and, I think, will largely determine the effort and cost of deploying RazzaDS.

Building Core (Data) Competency

Overall, I see RazzaDS as a conceptually simple, focused, and common-sense product — which is a compliment, not a critique. RazzaDS focuses on and has the potential to vastly improve upon a core competency, if you will, of application databases: that of effectively managing reference data. This focus, in turn, improves the ability of IT to serve facts and analytic data to business users and directly contributes to the organization's overall data quality. Either of these benefits alone could make RazzaDS a worthwhile investment.

PRODUCT SPEC SHEET

Razza Dimension Server 5.0
Razza Solutions
7209 Ellaview Lane
Austin, TX 78759
972.293.4343
www.razza.com

Minimum Requirements:

  • Operating system: Windows NT/2000, 95/98/ME
  • Database compatibility: Oracle and Microsoft SQL Server
  • Hardware: Pentium processor, 20MB disk space, 64MB RAM

Pricing: $2,500 per named user; server prices start at $59,500.

Using a solution such as RazzaDS isn't without risks, notably the possibility of disruption to existing applications caused by introducing an off-the-shelf reference data management solution to an existing stable of applications. It's also somewhat risky to rely on Razza's survival as a company and product because Razza is essentially a small, privately held, single-product company. While organic growth is possible, it may also be an unaffordable luxury for Razza, particularly if large vendors in the database, data quality, data warehousing, and business intelligence space decide to grow or acquire similar capabilities.

That said, I have no hesitation in recommending serious consideration of RazzaDS for your reference and dimensional data management. RazzaDS is an elegant, effective, and user-friendly solution, and, if positioned correctly within the enterprise, is likely to bring a significant return on investment through enhanced data quality and manageability.

Rajan Chandras is a principal consultant with the New York offices of CSC Consulting (www.csc.com). The opinions expressed here are his own.


Copyright © 2020 UBM Electronics, A UBM company, All rights reserved.