Data Migration Testing
Data Migration
Data migration is the process of transferring data between storage types, formats, or computer systems. Data migration is usually performed programmatically to achieve an automated migration, freeing up human resources from tedious tasks. It is required when organizations or individuals change computer systems or upgrade to new systems, or when systems merge (such as when the organizations that use them undergo a merger or takeover).
To achieve an effective data migration procedure, data on the old system is mapped to the new system, providing a design for data extraction and data loading. The design relates old data formats to the new system's formats and requirements. Programmatic data migration may involve many phases, but at a minimum it includes data extraction, where data is read from the old system, and data loading, where data is written to the new system.
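The mapping design described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed tool: the legacy field names (`CUST_NM`, `DOB`, `BAL`), the target field names, and the in-memory lists standing in for the old and new systems are all hypothetical.

```python
from datetime import datetime

# Hypothetical mapping design: legacy field -> (new field, transform function).
FIELD_MAP = {
    "CUST_NM": ("customer_name", str.strip),
    "DOB": ("date_of_birth",
            lambda s: datetime.strptime(s, "%d/%m/%Y").date().isoformat()),
    "BAL": ("balance_cents", lambda s: round(float(s) * 100)),
}


def extract(legacy_rows):
    """Extraction: read records from the old system (here, a list)."""
    for row in legacy_rows:
        yield row


def transform(row):
    """Apply the mapping design to one legacy record."""
    return {new: fn(row[old]) for old, (new, fn) in FIELD_MAP.items()}


def load(rows, target):
    """Loading: write records to the new system (here, a list)."""
    for row in rows:
        target.append(row)


legacy = [{"CUST_NM": " Ada Lovelace ", "DOB": "10/12/1990", "BAL": "12.50"}]
target = []
load((transform(r) for r in extract(legacy)), target)
print(target)
```

Keeping the mapping in one table-like structure makes it easier to review against the migration specification and to reuse when verifying the loaded data.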
After loading into the new system, results are subjected to data verification to determine whether data was accurately translated, is complete, and supports processes in the new system. During verification, there may be a need for a parallel run of both systems to identify areas of disparity and forestall erroneous data loss.
Automated and manual data cleaning is commonly performed in migration to improve data quality, eliminate redundant or obsolete information, and match the requirements of the new system.
For applications of moderate to high complexity, the data migration phases (design, extraction, cleansing, load and verification) are commonly repeated several times before the new system is deployed.
Types of Data Migration
Testing Options and Strategies
The de facto approach to testing data and content migrations relies upon sampling, where some subset of random data or content is selected and inspected to ensure the migration was completed “as designed”. Those who have tested migrations using this approach are familiar with the typical iterative test, debug and retest method, where subsequent executions of the testing process reveal different error conditions as new samples are reviewed.
Sampling works, but it relies on an acceptable level of error and on an assumption of repeatability. An acceptable level of error implies that less than 100% of the data will be migrated without error, and that the level of error decreases as the number of samples tested increases (refer to sampling standards such as ANSI/ASQ Z1.4). As for the assumption of repeatability, the fact that many migrations require four, five or more iterations of testing with differing results implies that one of the key tenets of sampling is not upheld, i.e., “non-conformities occur randomly and with statistical independence…”.
Even with these shortcomings, sampling has a role in a well-defined testing strategy, but what are the other testing options? The following lists options for testing by the phase of the migration process:
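To make the trade-off between sample size and residual error concrete, the sketch below computes the chance that a random sample contains no defective record at all. It assumes errors are random and statistically independent, which is exactly the assumption questioned above, so treat the numbers as a best case. The function names are illustrative and are not taken from ANSI/ASQ Z1.4.

```python
import math


def prob_sample_misses(error_rate, sample_size):
    """Probability that a random sample contains no bad record,
    assuming independent errors occurring at `error_rate`."""
    return (1.0 - error_rate) ** sample_size


def sample_size_for(error_rate, confidence):
    """Smallest sample size that finds at least one bad record
    with the given confidence, under the same assumption."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - error_rate))


# With 1 bad record in 1,000, a sample of 500 still misses
# every error roughly 61% of the time.
print(prob_sample_misses(0.001, 500))

# Sample size needed to catch at least one such error with 95% confidence.
print(sample_size_for(0.001, 0.95))
```

Rare errors therefore require samples in the thousands before sampling is likely to surface even one of them.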
Pre-migration testing
These tests occur early in the migration process, before any migration, even migration for testing purposes, is completed. The pre-migration testing options include:
Formal Design Review
Conduct a formal design review of the migration specification when the pre-migration testing is near complete, or during the earliest stages of the migration tool configuration.
The specification should include:
The formal design review should include representatives from the appropriate user communities, IT and management. The outcome of a formal design review should include a list of open issues, the means to close each issue and approve the migration specification, and a process to keep the specification in sync with the migration tool configuration (which tends to change continuously until the production migration).
Post-Migration Testing
Once a migration has been executed, additional end-to-end testing can be performed. Expect a significant number of errors to be identified during the initial test runs, although this number will be smaller if sufficient pre-migration testing has been well executed.
Post-migration testing is typically performed in a test environment and includes:
The advantages of the automated approach include the ability to identify errors that are less likely to occur (the proverbial needles in a haystack). Additionally, as an automated testing tool can be configured in parallel with the configuration of the migration tool, the ability to test 100% of the migrated data is available immediately following the first test migration. When compared to sampling approaches, it is easy to see that automated testing saves significant time and minimizes the typical iterative test, debug and retest found with sampling.
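A minimal sketch of what testing 100% of the migrated data can look like: every source record is transformed according to the specification and compared with the record found in the destination. The `id` key, the `expected_target` rule, and the sample rows are hypothetical. A real tool would read from both systems rather than from in-memory lists.

```python
def reconcile(source_rows, target_rows, transform):
    """Compare every source record against the migrated result.

    `transform` encodes the migration specification: it maps a source
    row to the row expected in the destination system.
    """
    expected = {row["id"]: transform(row) for row in source_rows}
    actual = {row["id"]: row for row in target_rows}
    return {
        "missing": sorted(expected.keys() - actual.keys()),
        "unexpected": sorted(actual.keys() - expected.keys()),
        "mismatched": sorted(
            k for k in expected.keys() & actual.keys()
            if expected[k] != actual[k]
        ),
    }


def expected_target(row):
    """Hypothetical specification rule: names are upper-cased."""
    return {"id": row["id"], "name": row["name"].upper()}


source = [{"id": 1, "name": "ada"}, {"id": 2, "name": "bob"},
          {"id": 3, "name": "cy"}]
migrated = [{"id": 1, "name": "ADA"}, {"id": 2, "name": "bob"},
            {"id": 4, "name": "DEE"}]

report = reconcile(source, migrated, expected_target)
print(report)
```

Because the comparison rule is derived from the specification rather than from the migration tool, it can be written while the tool is still being configured and run as soon as the first test migration completes.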
Migrated content has special considerations. For those cases where content is being migrated without change, testing should verify that the integrity of the content is maintained and that the content is associated with the correct destination record. This can be completed using sampling or, as already described, automated tools can be used to verify 100% of the result.
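For content migrated without change, one common way to check both integrity and record association is to compare cryptographic digests per record. The sketch below uses SHA-256 from Python's standard `hashlib` module. The record identifiers and the dictionaries standing in for the source and destination content stores are hypothetical.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Hex digest of a content item's bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_content(source_docs, target_docs):
    """Return record ids whose content is missing or altered.

    Both arguments map a record id to the raw bytes attached to that
    record, so a file attached to the wrong record also shows up as a
    failure for that id.
    """
    failures = []
    for record_id, original in source_docs.items():
        migrated = target_docs.get(record_id)
        if migrated is None or sha256_of(migrated) != sha256_of(original):
            failures.append(record_id)
    return sorted(failures)


source_docs = {"R1": b"contract v1", "R2": b"invoice 42", "R3": b"memo"}
target_docs = {"R1": b"contract v1", "R2": b"memo"}  # R2 wrong, R3 missing

print(verify_content(source_docs, target_docs))
```

In practice the source digests would usually be computed once before migration and stored, so the check does not require both systems to be online at the same time.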
User Acceptance Testing
Functional subtleties related to the co-mingling of migrated data and data created in the destination system may be difficult to identify early in the migration process. User acceptance testing provides an opportunity for the user community to interact with legacy data in the destination system prior to production release, and most often, this is the first such opportunity for the users. Attention should be given to reporting, downstream feeds, and other system processes that rely on migrated data.
Production Migration
Testing completed prior to the production migration does not guarantee that the production process will complete without error. Challenges seen at this point include procedural errors and, at times, production system configuration errors. If an automated testing tool has been used for post-migration testing of data and content, executing another testing run is straightforward and recommended. If an automated approach has not been used, some level of sampling or summary verification is still recommended.
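Summary verification can be as simple as comparing record counts and control totals between the two systems after the production run. The following is a minimal sketch under assumed names: the `amount` field and the sample rows are hypothetical, and `Decimal` is used so that monetary totals are compared exactly.

```python
from decimal import Decimal


def summarize(rows, amount_field):
    """Record count and control total for one side of the migration."""
    total = sum((Decimal(r[amount_field]) for r in rows), Decimal("0"))
    return {"count": len(rows), "total": total}


def summary_differences(source_rows, target_rows, amount_field):
    """Return the summary measures that differ between the systems."""
    src = summarize(source_rows, amount_field)
    tgt = summarize(target_rows, amount_field)
    return {k: (src[k], tgt[k]) for k in src if src[k] != tgt[k]}


source_rows = [{"amount": "10.50"}, {"amount": "4.25"}, {"amount": "1.00"}]
target_rows = [{"amount": "10.50"}, {"amount": "4.25"}]  # one row lost

print(summary_differences(source_rows, target_rows, "amount"))
```

Matching summaries do not prove that every record is correct, since offsetting errors can cancel out, but a mismatch is a cheap and fast signal that the production run needs investigation.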
Recommendations for Designing a Data Migration Test Strategy
In the context of data and content migrations, business and compliance risks are a direct result of migration error. A thorough testing strategy minimizes the likelihood of data and content migration errors.
The list below provides a set of recommendations to define such a testing strategy for a specific system: