What is Reference Data?

Reference data is a fairly hot topic in the financial markets today, but this was not always the case. Until relatively recently, it had not attracted the same amount of study as its cousin, real-time financial market data. The picture has changed with the advent of new regulatory frameworks in the US and Europe, along with the growing intricacy and cost of managing reference data, all of which have drawn renewed attention to it.

For reference data, it is the dimension that matters most. A bare figure such as 1,000,000 does not strike a chord until one is told that it refers to a sales amount, stated in currency C, for product P, in region R of nation N, achieved by salesperson S in year Y and month M. Each of these qualifiers is an example of reference data. Reference data is immensely important because it provides a frame of reference for information; without it, the information is meaningless. That is why it has become so important for business: reference data lets the business analyze figures and put them in perspective, which in turn helps chart the road ahead.
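To make the point concrete, here is a minimal Python sketch of a figure that only becomes meaningful once its dimensions are attached. The class name `SalesFact`, the helper `describe`, and all the sample values (currency, product, salesperson, dates) are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SalesFact:
    """A bare amount plus the reference data that gives it meaning."""
    amount: int        # the fact itself
    currency: str      # currency C
    product: str       # product P
    region: str        # region R
    nation: str        # nation N
    salesperson: str   # salesperson S
    year: int          # year Y
    month: int         # month M


def describe(fact: SalesFact) -> str:
    """Render the fact together with its frame of reference."""
    return (
        f"{fact.amount:,} {fact.currency} of {fact.product} sold in "
        f"{fact.region}, {fact.nation} by {fact.salesperson} "
        f"in {fact.year}-{fact.month:02d}"
    )


# Hypothetical sample values, not taken from any real data set.
fact = SalesFact(1_000_000, "USD", "Widget", "South", "USA", "A. Smith", 2012, 3)
print(describe(fact))
```

Strip away every field except `amount` and the number 1,000,000 is all that remains, which is exactly the situation described above.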

However, few people seem to be interested in dimensions; it is the facts and figures that hold their attention. Yet managing reference data, especially hierarchical reference data such as product and geographical hierarchies, has always been a source of exasperation for many application databases, be they custom OLTP applications, ERP systems, or data warehousing and standard business intelligence. Data architects, modelers and managers are well aware that reference data is the most notorious kind of data: difficult to control and a constant source of complications.

But why is reference data so difficult to manage? Is it because there is no single definition for this type of data, or because there is no single source from which it can be obtained? The abundance of system-of-record applications and data sources today leads to multiple sources of reference data, each true to its own domain yet often disagreeing with the others. This variety in the source and nature of the data creates complications that business managers need to handle expertly. Misinterpreting the data can lead the business to mistaken conclusions, and the consequences can be severe.
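The disagreement between sources can be illustrated with a short Python sketch. It assumes two hypothetical systems of record, here called `crm` and `billing`, that each hold their own country names for the same codes; the function `find_conflicts` is an illustrative helper, not part of any particular product.

```python
from collections import defaultdict


def find_conflicts(sources):
    """Return the codes whose values differ between sources.

    `sources` maps a source name to a dict of code -> value.
    The result maps each conflicting code to its value per source.
    """
    seen = defaultdict(dict)
    for source_name, records in sources.items():
        for code, value in records.items():
            seen[code][source_name] = value
    return {
        code: by_source
        for code, by_source in seen.items()
        if len(set(by_source.values())) > 1
    }


# Hypothetical sources: each is valid in its own domain,
# yet they disagree on how the code "US" should be named.
sources = {
    "crm": {"US": "United States", "GB": "United Kingdom"},
    "billing": {"US": "USA", "GB": "United Kingdom"},
}
conflicts = find_conflicts(sources)
print(conflicts)
```

A report like this only detects the disagreement; deciding which source is authoritative remains a business decision.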

The situation is made worse by a persistent lack of coordination and standards, both at the business-process level and at the technology level. Perhaps for this very reason, every IT solution that requires reference data typically builds its own containers and presentation tools for it, or creates custom bridges to other existing data sources, adding yet more threads to the tangle of reference data.

Let us see how some of the most common dimension servers handle reference data.

Most dimension servers try to deeply simplify master data harmonization across multiple enterprise systems. Recognizing that the majority of dimensions are truly hierarchical, the servers let users build their own hierarchies of business dimensions, without regard for underlying metadata aspects such as data types or lengths. Consider a user who wants to create a geographical hierarchy comprising countries, regions, states, postal codes and so on, which is often necessary for a large business that spans several countries and needs to gather and analyze data across them. The standard data modeling approach would be to identify these levels as entities, linked by identifying or non-identifying relationships.

Some dimension servers do away with all metadata aspects and let users enter the hierarchy values directly into a straightforward, intuitive, folder-like user interface. In that case, the user could enter the value ‘USA’, followed by four geographical regions: West, Midwest, South and East. Under the South region, the user could then enter states such as Florida, Louisiana and Georgia, and postal codes could be entered under each of these states.
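The folder-like entry described here can be sketched in Python as a nested dictionary with no data types or lengths declared up front. This is only an illustration of the idea, not how any specific dimension server is implemented; the helper names `add_path` and `render` and the sample postal code are assumptions made for the example.

```python
def add_path(tree, path):
    """Insert a chain of values (e.g. country > region > state) into the tree."""
    node = tree
    for name in path:
        # Create the child folder if it does not exist yet, then descend.
        node = node.setdefault(name, {})
    return tree


def render(tree, depth=0):
    """Return the hierarchy as indented lines, like a folder view."""
    lines = []
    for name, children in tree.items():
        lines.append("  " * depth + name)
        lines.extend(render(children, depth + 1))
    return lines


geography = {}
for region in ["West", "Midwest", "South", "East"]:
    add_path(geography, ["USA", region])
for state in ["Florida", "Louisiana", "Georgia"]:
    add_path(geography, ["USA", "South", state])
# Illustrative postal code entered under one of the states.
add_path(geography, ["USA", "South", "Florida", "33101"])

print("\n".join(render(geography)))
```

Because every level is just a name under its parent, a new level can be added without changing any schema, which is the convenience the folder-like interface trades against the rigor of the entity-based model.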

Although the realization that reference data is important for business is fairly recent, now that it has dawned on business administrators and managers, companies are using it for various purposes. Reference data is in use across corporate enterprises today, and we can expect its business use to grow in the years to come.
