Data migration is the process of selecting, preparing, extracting, and transforming data and permanently transferring it from one computer storage system to another. In practice, that means moving data from one location, format, or application to another, generally as the result of introducing a new system or location for the data.
The business driver is usually an application migration or consolidation in which legacy systems are replaced or augmented by new applications that will share the same dataset. These days, data migrations are often started as firms move from on-premises infrastructure and applications to cloud-based storage and applications to optimize or transform their company.
Why Is Data Migration Seen as Difficult and Risky?
The short answer is "data gravity." Although the concept of data gravity has been around for some time, the challenge is becoming more significant because of data migrations to cloud infrastructures. In brief, data gravity is a metaphor that describes:
- How data attracts other data to it as it grows
- How data is integrated into a business
- How data becomes customized over time
To move applications and data to more advantageous environments, Gartner recommends "disentangling" data and applications as a means of overcoming data gravity. By making time at the beginning of the project to sort out data and application complexities, firms can improve their data management, enable application mobility, and improve data governance.
The main issue is that every application complicates data management by introducing elements of application logic into the data management tier, and each one is indifferent to the next data use case. Business processes use data in isolation and then output their own formats, leaving integration for the next process.
Therefore, application design, data architecture, and business processes must all respond to each other, but often one of these groups is unable or unwilling to change. This forces application administrators to sidestep ideal and simple workflows, resulting in suboptimal designs. Although the workaround may have been necessary at the time, this technical debt must eventually be addressed during data migration or integration.
Given this complexity, consider promoting data migration to "strategic weapon" status so that it gets the right level of awareness and resources. To ensure that the project gets the attention it needs, focus on the most provocative element of the migration – the fact that the legacy system will be turned off – and you’ll have the attention of key stakeholders, guaranteed.
Types of Data Migration
There are numerous business advantages to upgrading systems or extending a data center into the cloud. For many firms, this is a very natural evolution. Companies using the cloud hope to focus their staff on business priorities, fuel top-line growth, increase agility, reduce capital expenses, and pay for only what they need on demand. However, the type of migration undertaken will determine how much IT staff time can be freed to work on other projects.
First, let’s define the types of migration:
- Storage migration. The process of moving data off existing arrays into more modern ones that enable other systems to access it. It offers significantly faster performance and more cost-effective scaling while enabling expected data management features such as cloning, snapshots, and backup and disaster recovery.
- Cloud migration. The process of moving data, applications, or other business elements either from an on-premises data center to a cloud or from one cloud to another. In many cases, it also entails a storage migration.
- Application migration. The process of moving an application program from one environment to another. It may include moving the entire application from an on-premises IT center to a cloud, moving between clouds, or simply moving the application's underlying data to a new form of the application hosted by a software provider.
How to Plan a Data Migration
Data migration involves three basic steps:
- Extract data
- Transform data
- Load data
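The three steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a production pipeline: it uses the standard-library sqlite3 module, and the table names (legacy_customers, customers), the column names, and the DD/MM/YYYY source date format are all assumptions made up for the example.

```python
import sqlite3


def extract(source: sqlite3.Connection) -> list[tuple]:
    """Extract: read every row from the (hypothetical) legacy table."""
    query = "SELECT id, full_name, signup_date FROM legacy_customers"
    return source.execute(query).fetchall()


def transform(rows: list[tuple]) -> list[tuple]:
    """Transform: trim names and convert DD/MM/YYYY dates to ISO 8601."""
    converted = []
    for customer_id, name, signup in rows:
        day, month, year = signup.split("/")
        converted.append((customer_id, name.strip(), f"{year}-{month}-{day}"))
    return converted


def load(target: sqlite3.Connection, rows: list[tuple]) -> None:
    """Load: insert all rows in one transaction so a failure rolls back."""
    with target:  # commits on success, rolls back on an exception
        target.executemany(
            "INSERT INTO customers (id, name, signup_date) VALUES (?, ?, ?)",
            rows,
        )


def migrate(source: sqlite3.Connection, target: sqlite3.Connection) -> int:
    """Run extract, transform, and load; return the number of rows moved."""
    rows = transform(extract(source))
    load(target, rows)
    return len(rows)


if __name__ == "__main__":
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE legacy_customers (id INTEGER, full_name TEXT, signup_date TEXT)"
    )
    source.executemany(
        "INSERT INTO legacy_customers VALUES (?, ?, ?)",
        [(1, "  Ada Lovelace ", "10/12/2019"), (2, "Alan Turing", "23/06/2020")],
    )
    target = sqlite3.connect(":memory:")
    target.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, signup_date TEXT)"
    )
    print(migrate(source, target))  # prints 2
    print(target.execute("SELECT * FROM customers ORDER BY id").fetchall())
```

Loading inside a single transaction is one reasonable design choice here: if any row fails, the target is left unchanged and the run can be repeated.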
Moving important or sensitive data and decommissioning legacy systems can put stakeholders on edge. Having a solid plan is a must; however, you don’t have to reinvent the wheel. Here is a typical 7-phase process:
- Evaluate the data being moved for stability.
- Identify and brief key stakeholders.
- Establish a robust data quality rules management process and brief the business on the goals of the project, including shutting down legacy systems.
- Determine what data to move, and the quality of that data before and after the move.
- Build & Test. Code the migration logic and test the migration with a mirror of the production environment.
- Execute & Validate. Demonstrate that the migration has complied with requirements and that the data moved is viable for business use.
- Decommission & Monitor. Shut down and dispose of old systems.
This may appear to be an overwhelming amount of work, but not all these steps are needed for every migration. Each situation is unique, and each company approaches the task differently.
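One common way to support the validation phase is to compare row counts and content checksums between source and target. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, rows are plain Python tuples, and the source rows are assumed to have already been put into the target's shape so that they are directly comparable.

```python
import hashlib


def table_fingerprint(rows):
    """Return (row_count, checksum) for an iterable of row tuples.

    Per-row digests are sorted before being combined, so the result
    does not depend on the order in which rows were read.
    """
    digests = sorted(
        hashlib.sha256(repr(row).encode("utf-8")).hexdigest() for row in rows
    )
    combined = hashlib.sha256("".join(digests).encode("ascii")).hexdigest()
    return len(digests), combined


def validate_migration(source_rows, target_rows):
    """Return a list of problems found; an empty list means the data matches."""
    problems = []
    source_count, source_sum = table_fingerprint(source_rows)
    target_count, target_sum = table_fingerprint(target_rows)
    if source_count != target_count:
        problems.append(
            f"row count mismatch: source={source_count}, target={target_count}"
        )
    if source_sum != target_sum:
        problems.append("checksum mismatch: row contents differ")
    return problems


if __name__ == "__main__":
    source = [(1, "Ada"), (2, "Alan")]
    target = [(2, "Alan"), (1, "Ada")]  # same rows, different order
    print(validate_migration(source, target))  # prints []
```

A check like this shows that the data arrived intact; whether it is viable for business use still has to be confirmed with the people who depend on it.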
Top 10 Data Migration Challenges
Even though data migration has been a fact of IT life for decades, horror stories are still reported every year. Here are the top 10 challenges that firms encounter in moving data:
- Not contacting key stakeholders. No matter the size of the migration, there is someone, somewhere who cares about the data you’re moving. Track them down and explain the need for this project and the impact on them before you get going on the task. If you don’t, you’ll certainly hear from them at some stage, and chances are good that they’ll disrupt your timeline.
- Not communicating with the business. Once you’ve explained the project to the stakeholders, be sure to keep them informed of your progress. It’s best to provide a status report on the same day every week, especially if things get off track. Regular communication goes a long way in building trust with all those affected.
- Lack of data governance. Be sure you’re clear on who has the rights to create, approve, edit, or remove data from the source system, and document that in writing as part of your project plan.
- Lack of expertise. Although moving data can look like a straightforward task, there's a lot of complexity involved. Having an experienced professional with excellent references helps the process go smoothly.
- Lack of planning. On average, families spend 10 to 20 hours planning their vacation, while IT teams may spend as little as half that time planning a small data migration. Hours spent planning don't always guarantee success, but having a solid data migration plan does save hours when it comes to actually moving the data.
- Insufficient data prep software and skills. If this is a large migration (millions of records or hundreds of tables), invest in first-class data quality software and consider hiring a specialist firm to assist. Good news: An outside firm will probably rent you the software to help conserve costs.
- Waiting for perfect specs for the target. If the implementation team is still sorting out design criteria, press on with phases 2 and 3 of the migration plan. Target readiness will matter later in the project, but don’t let it stop you now.
- Unproven migration methodology. Do some research to be sure that the data movement procedure has worked well for other firms like yours. Resist the temptation to just accept the generic procedure offered by a vendor.
- Supplier and project management. Vendors and projects must be managed. If you're still doing your day job too, be sure that you have the time to manage the project and any related suppliers.
- Cross-object dependencies. Even with the technology and capabilities of the data management tools available today, it's still common to learn late about a dependent dataset that wasn’t included in the original plan. Because cross-object dependencies often are not discovered until very late in the migration process, be sure to build in a contingency for them so that your entire delivery date isn’t thrown off.
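For the last challenge, cross-object dependencies, it can help to write the dependencies down explicitly and let a tool work out a safe load order. The following sketch uses Python's standard-library graphlib module (Python 3.9 or later). The table names and the dependency map are hypothetical; in a real project the map would come from foreign keys or from interviews with the data owners.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each table lists the tables it references.
DEPENDENCIES = {
    "customers": set(),
    "products": set(),
    "orders": {"customers"},
    "order_items": {"orders", "products"},
}


def load_order(dependencies):
    """Return a load order in which every table follows the tables it needs."""
    return list(TopologicalSorter(dependencies).static_order())


def missing_from_plan(dependencies, planned_tables):
    """Return tables that planned tables reference directly but the plan omits.

    Only direct references are checked; rerun after adding the missing
    tables to the plan to surface dependencies further down the chain.
    """
    planned = set(planned_tables)
    needed = set()
    for table in planned:
        needed |= dependencies.get(table, set())
    return needed - planned


if __name__ == "__main__":
    print(load_order(DEPENDENCIES))
    print(missing_from_plan(DEPENDENCIES, ["orders", "order_items"]))
```

Several load orders can be valid for the same map, so the exact sequence printed may vary; what matters is that referenced tables always come first. TopologicalSorter raises CycleError when tables depend on each other in a loop, which is itself a useful early warning.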
Data Migration vs. Data Conversion vs. Data Integration
The terms data migration and data conversion are sometimes used interchangeably on the internet, so let’s clear this up: They mean different things. As pointed out earlier, data migration is the process of moving data between locations, formats, or systems. Data migration includes data profiling, data cleansing, data validation, and the ongoing data quality assurance process in the target system. In a typical data migration scenario, data conversion is only the first step in a complex process.
The term data conversion refers to the process of transforming data from one format to another. This is necessary when moving data from a legacy application to an upgraded version of the same application or an entirely different application with a new structure. To convert it, data must be extracted from the source, altered, and loaded into the new target system based on a set of requirements.
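As a small illustration of conversion on its own, the sketch below maps one record from a made-up legacy layout to a new structure. Everything specific in it is an assumption for the example: the legacy field names (CUST_NO, NAME, SIGNUP), the YYYYMMDD date format, and the CustomerV2 target type.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class CustomerV2:
    """Hypothetical target structure used by the new application."""

    customer_id: int
    first_name: str
    last_name: str
    signup_date: date


def convert_record(legacy: dict) -> CustomerV2:
    """Convert one legacy record (all string fields) to the new structure."""
    # Split the single legacy name field at the first space.
    first, _, last = legacy["NAME"].strip().partition(" ")
    return CustomerV2(
        customer_id=int(legacy["CUST_NO"]),  # "0042" becomes 42
        first_name=first,
        last_name=last.strip(),
        signup_date=datetime.strptime(legacy["SIGNUP"], "%Y%m%d").date(),
    )


if __name__ == "__main__":
    old = {"CUST_NO": "0042", "NAME": "Ada Lovelace", "SIGNUP": "20191210"}
    print(convert_record(old))
```

The requirements mentioned above would decide the harder cases this sketch ignores, such as names with no space or dates that fail to parse.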
Another term that is sometimes confused with data migration is data integration. Data integration refers to the process of combining data residing at different sources to provide users with a unified view of all the data. Integrating data from multiple sources is essential for data analytics. Examples of data integration include data warehouses, data lakes, and Pools, which automate data tiering between on-premises data centers and clouds or automatically tier data between AWS EBS block storage and AWS S3 object stores.
Move to Infrastructure as a Service (IaaS):
- Rehost (lift and shift). Redeploy data and applications on an IaaS without making changes.
- Revise (rearchitect). Modify or extend existing application code to fit the new cloud environment.
- Replace. Retire legacy applications hosted and managed on premises with a comparable application hosted in the cloud; for example, Office 365.
Move to Platform as a Service (PaaS):
- Refactor. Inject your code and run your application on top of the cloud.
- Rebuild. Discard code for an existing application and rearchitect the application on top of the cloud.
Choosing a deployment model that aligns with business requirements is essential to make sure that any data migration is both smooth and successful and delivers business value in terms of performance, security, and ROI.