Tips for Data Integration

Within most business environments there are multiple IT systems.  Each system is likely designed to fit the process of a certain area of the business eg. Sales processing or Production.  It is rare that one business system covers the end to end processes of a business including the finance and HR requirements that all businesses have to manage.  The consequence is that records are often repeated in the various systems.  And when a record is updated it may need to be updated across a number of systems.  Data integration allows for a controlled mechanism to be put in place to keep systems across the business in line with each other.  As with most projects, planning is important and the setup of a data integration is no different.  Covered below are tips on some aspects of a data integration that should be considered at the planning stage.

Check requirements

A mapping exercise to map the flow of data around the business is a good starting point.  But also, a follow up to verify that all those data flows are necessary is also important.  Especially with the introduction of GDPR rules it is important to ensure only necessary data is stored.

One-way or two-way

Establish if a one-way or two-way integration is required.  Do new records only ever get generated from one start point?  Can the detail of records get updated in multiple systems?  What pieces of data might get updated?  Are those pieces of data essential to the other systems?  Even where records are changed, if the other systems do not make use of that information is there a need to send an update?  Keep in mind that a one-way data integration is far simpler to manage.

Determine the Master System

With a one-way integration, the master system is easily identified as the source system.  With a two-way integration, it is important that one of the systems is identified as the master.  Dependent upon the setup of the integration, if a delay or error in processing occurs in one direction it is feasible that the most current value does not get stored in one system.  Where such instances occur and values come into question it is important that a master system has already been identified so that an agreed value is identified and data conflicts resolved.

Frequency / triggers

During the initial requirements gathering exercise the timeliness of data delivery should be noted.  Whilst a user may have re-keyed data twice historically –  is the second record actually needed in the second system immediately?  Could updates be stored up and sent as a batch?  When real time updates are requested it is worth digging further into what “real time” really means to the user and what tolerances can be afforded.  Answering these questions will also help with determining the most suitable integration tool.  Once the frequency has been determined and the data integration put in place it is also important to ensure that the users are made aware of timings so that their expectations are correct.

1:1 Matching

When integrating systems, it is clearly imperative that the correct corresponding records are updated.  To do this it is better to work with ID fields, as opposed to matching on text values.  If the same ID field can be used across all the systems this is useful.  Otherwise cross-reference lists can be utilised to match record IDs.  The situation that must be avoided is matching one to many records across systems.  If duplicates have been noticed in a system before the integration is setup, then an exercise to eradicate them should be undertaken.  Assistance in that area can be provided by QGate, see here.

Data Types

Having decided what fields need updating it will be worthwhile checking that the underlying data types of them are the same or similar enough.  The reason for this is that if they are not the same then values stored in one system may error when an attempt to post them in the second system is made. For example, a text value cannot be posted to a number type field.

Dates

In a similar vein to data types above, a check on the date formats should be made and a transformation during integration carried out if necessary eg. UK to US format.  Whilst systems may display the dates in the same manner, the way it is stored at database level may differ.  Further considerations are needed for dates to include time zones and how each system stores the values.  Some systems will always store the dates in UTC (Coordinated Universal Time) and display the dates in line with a user’s regional settings.  The times will be especially important to some businesses where service level agreements are closely monitored.  Even where a business is only sensitive to the day on which an event occurred the storage of date/times must be checked as an hour adjustment for daylight savings can have the effect of changing the date if the hour is 2300hrs.

Error reporting

When an integration is put in place some monitoring should be established as well.  It can put IT on the back foot if they are reliant upon the users to flag up issues.  Some monitoring and error reporting that notifies a technical team is recommended.

 Related Resources: