Data Integration Tips

Data is the foundation on which every business operates. Here, data refers to the body of information, real-time and/or historical, that is available to a company. Based on this data, the company performs the functions it needs, be it data modeling, sampling or simple analysis.
However, raw data is usually voluminous and hard to manage. For effective analysis and examination, it needs to be pooled, both logically and in terms of volume. This process of pooling raw data so that it becomes suitable for analysis is broadly called data integration (DI). A data integration system is formally defined as a triple (G, S, M), where G is the global (or mediated) schema, S is the heterogeneous set of source schemas, and M is the mapping that translates queries between the source schemas and the global schema. The data that companies have access to generally comes from different sources and needs to be unified in order to give end users a comprehensive picture. Achieving this is the main function of data integration.
The need for data integration arises more and more often as the volume of existing data, and the need to share it, grow. It has been the focus of extensive theoretical work, and numerous open problems remain to be solved. It has also been put to successful use for commercial, scientific and management purposes (in the enterprise setting it is known as Enterprise Information Integration). When research results from different bioinformatics repositories need to be combined, or when two similar companies need to merge their databases, data integration is the process experts turn to.
The data integration procedure has to integrate data models and establish common terms of reference. For this, integration techniques need to go beyond simple consolidation of application databases. Data integration comprises three separate layers: the data transport interface, the data exchange services, and the user/application interface. All three need to be managed well for integration to be effective.
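The triple (G, S, M) described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two customer sources, their field names, the `IntegrationSystem` class and the mapping functions are hypothetical and do not come from any particular integration product. The mapping M is modelled here as one function per source that translates a source row into the global schema, which is only one of several possible ways to express such mappings.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]

# S: heterogeneous source schemas (hypothetical data from two merged companies).
SOURCE_A: List[Row] = [
    {"cust_id": 1, "full_name": "Ada Lovelace", "city": "London"},
]
SOURCE_B: List[Row] = [
    {"id": "B-7", "first": "Alan", "last": "Turing", "town": "Wilmslow"},
]

# G: the global (mediated) schema that end users query.
GLOBAL_SCHEMA: Tuple[str, ...] = ("customer_id", "name", "city")


# M: mappings from each source schema to the global schema.
def map_source_a(row: Row) -> Row:
    return {
        "customer_id": f"A-{row['cust_id']}",
        "name": row["full_name"],
        "city": row["city"],
    }


def map_source_b(row: Row) -> Row:
    return {
        "customer_id": row["id"],
        "name": f"{row['first']} {row['last']}",
        "city": row["town"],
    }


@dataclass
class IntegrationSystem:
    """Illustrative container for the triple (G, S, M)."""

    global_schema: Tuple[str, ...]
    sources: Dict[str, List[Row]]
    mappings: Dict[str, Callable[[Row], Row]]

    def query(self, predicate: Callable[[Row], bool] = lambda r: True) -> List[Row]:
        """Answer a query over G by translating every source row through M."""
        unified: List[Row] = []
        for name, rows in self.sources.items():
            to_global = self.mappings[name]
            for row in rows:
                global_row = to_global(row)
                # Reject mappings that do not produce exactly the global attributes.
                if set(global_row) != set(self.global_schema):
                    raise ValueError(f"mapping for {name!r} does not match G")
                if predicate(global_row):
                    unified.append(global_row)
        return unified


system = IntegrationSystem(
    global_schema=GLOBAL_SCHEMA,
    sources={"a": SOURCE_A, "b": SOURCE_B},
    mappings={"a": map_source_a, "b": map_source_b},
)

# End users see one unified view, regardless of where each record came from.
for customer in system.query():
    print(customer)
```

A query such as `system.query(lambda r: r["city"] == "London")` is phrased purely in terms of the global schema; the caller never needs to know that one source calls the field `city` and the other `town`.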
Data integration can be of two types. Internal integration pools data available within the databases of a single company. External integration connects the company with external constituencies in a highly controlled environment: generally among parties that have strong, well-established partnerships, when one or more of the parties impose standards for data exchange, or when data is exchanged frequently and in very high volume. Certain basic requirements need to be fulfilled for the data integration process to succeed. The major factors that influence internal data integration can be tabulated as under:
The requirements for external data integration are slightly different, so the factors that influence it also differ from those that affect internal data integration. These factors may be listed as follows:
Apart from taking care of the factors that are crucial to the performance of internal and external data integration respectively, all integration procedures must comply with the organization's security policies. Data integration also comes at a cost that can be estimated in advance: company heads need to consider the expense of both the initial setup and the subsequent long-run deployment. Over recent years, integration has become an immensely useful technique for managing data in the corporate framework. Physical integration of data (involving tools and technology) has often taken center stage, and a good volume of theoretical study has been devoted to this aspect. However, integrating the available data effectively also requires a comprehensive study of the strategies, designs, standards and governance policies involved. The nature of the data that flows through a company's information system has to be understood thoroughly as well. For the purpose of clarification and understanding, the data integration procedure can be explained in terms of a simple framework. This framework comprises:
Hence, as businesses grow, the sheer volume of data that needs to be handled and analyzed grows enormously. Data integration answers this requirement by providing procedures for pooling the data effectively. Integration also needs to be comprehensive, accommodating both internal and external needs.

