Nagarjuna Prabhu, part of the Assurance Services Unit (ASU) at TCS, blogs about deploying a test data management (TDM) framework.
How many times have you hurriedly entered ‘ABC’, ‘PQR’ or ‘XYZ’ in fields in online forms? When hard-pressed for time, people often take such shortcuts. As a result, corporate databases fill up with invalid data in name and location fields. What if your testers, for lack of sample input values, took the same approach, putting product functionality and performance at risk? Sounds scary?
In many development projects, testing teams rely on manually generated test data scattered across uncontrolled spreadsheets in project repositories. This dependency is often a cause of cost and schedule overruns.
Testing needs the right data
Your testers should be spending time on testing, not collecting data. To make this possible, you need to provision test data along with test cases. If your testers don’t have the right test data while executing test cases, they will hunt for it, and if they don’t find it in time, they may resort to alternatives such as the ‘ABC’, ‘PQR’ and ‘XYZ’ approach, putting your project and business at risk.
Merely provisioning test environments and having exhaustive test cases is not enough. Testing also needs the right data, a challenge in itself, especially in this age of agile, which brings business benefits but also demands continuous testing. Every release must now go through a series of rigorous functional, performance, security and other tests, and at a rapid pace.
With every phase of the software lifecycle becoming a specialty in itself, copies of datasets need to be provisioned separately for development, testing and training. For example, functionality tests require test data and input values that represent live business scenarios. Load, performance and stress tests require large-volume data sets that can withstand automated scripts and hold up through several hours of stressful usage.
Such large volumes of test data are usually not available in existing business data repositories. Many businesses take the easy route to address the test data challenge and use live production data for testing. If your testing teams are doing this too, your product and business are at risk. Current production data is often not representative of future business scenarios and emerging trends. These trends bring in varying transaction types and complex business models, which require equally flexible data structures that are often not available in existing production databases. As a result, testers tend to steer towards current happy paths and miss many opportunities for negative testing. The product gets tested for bugs, but not bottlenecks!
Further, the technology landscape in most businesses combines legacy and current technologies, leaving data scattered across systems in different formats. Then there’s the need to protect customer confidentiality and data privacy. Using live production data for testing could expose sensitive and private customer data through unprotected workstations, a threat to customer data privacy. Regulations in certain industries such as banking and healthcare explicitly forbid the use of production data for testing. If you are thinking of manual ‘data masking’ as a quick, stopgap fix, sorry, that is just not an option anymore. With corporate databases growing beyond terabyte sizes, Big Data getting bigger, and demand for voluminous test data on the rise, manual data masking is error-prone, risky and prohibitively expensive.
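To make the contrast with manual masking concrete, here is a minimal sketch of automated, repeatable masking in Python. It is an illustration only, not a prescribed TDM tool: the field names, the sample rows and the key are all hypothetical, and a keyed hash (HMAC-SHA256) is just one possible masking technique. The point it shows is that the same input always maps to the same masked value, so relationships between records survive while the original values do not appear in the test data.

```python
import hashlib
import hmac

# Hypothetical key for illustration; a real key would live in a secrets store.
MASKING_KEY = b"example-key-not-for-real-use"

# Hypothetical set of fields treated as sensitive in this sketch.
SENSITIVE_FIELDS = {"name", "email", "account_number"}


def mask_value(value: str, key: bytes = MASKING_KEY) -> str:
    """Replace a value with a keyed, deterministic token."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "masked_" + digest[:12]


def mask_record(record: dict) -> dict:
    """Mask only the sensitive fields; leave all other fields untouched."""
    return {
        field: mask_value(str(value)) if field in SENSITIVE_FIELDS else value
        for field, value in record.items()
    }


# Made-up rows standing in for an extract of production data.
rows = [
    {"name": "Asha Rao", "email": "asha@example.com", "city": "Pune", "balance": 1200.50},
    {"name": "Asha Rao", "email": "asha@example.com", "city": "Mumbai", "balance": 75.00},
]

masked_rows = [mask_record(row) for row in rows]
```

Because the masking is deterministic, both rows above end up with the same masked name and email, which keeps joins and duplicate checks meaningful in the test environment.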
So what do you do?
How about deploying a test data management (TDM) framework that can generate the right volume of test data on demand? Such a framework could comprise standards for data masking, strategies for mitigating security risks, blueprints for synchronising data across business applications, and tools for generating synthetic, production-like data for testing. Specifically, such a framework should help address compliance, automation, management, and data validation challenges by incorporating the following:
- Regulatory requirements and their conformance through regular reviews.
- Data touch-points and their easy accessibility.
- Test data management tools.
- Key performance indicators (KPIs).
- Change management – course correction and changes based on KPI results.
- Processes for data validation, verification and metadata changes.
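As one illustration of what generating the right volume of test data on demand could look like, the sketch below produces synthetic customer records in Python. Everything in it is an assumption made for the example: the `generate_customers` function, the record fields, the sample names and cities, and the share of deliberately invalid rows. It is a sketch of the idea, not a reference implementation of any TDM product.

```python
import random
from datetime import date, timedelta

# Hypothetical sample values; a real framework would draw on richer sources.
NAMES = ["Asha", "Ravi", "Meera", "John", "Li", "Fatima"]
CITIES = ["Pune", "Mumbai", "London", "Singapore"]


def generate_customers(count, seed=42, invalid_ratio=0.1):
    """Generate `count` synthetic customer rows.

    A fixed seed makes runs reproducible. Roughly `invalid_ratio` of the
    rows are deliberately broken to support negative testing.
    """
    rng = random.Random(seed)
    rows = []
    for i in range(count):
        signup = date(2020, 1, 1) + timedelta(days=rng.randrange(1500))
        row = {
            "customer_id": i + 1,
            "name": rng.choice(NAMES),
            "city": rng.choice(CITIES),
            "signup_date": signup.isoformat(),
            "balance": round(rng.uniform(1, 10000), 2),
            "expect_valid": True,
        }
        if rng.random() < invalid_ratio:
            # Negative test case: empty name and a negative balance.
            row["name"] = ""
            row["balance"] = -row["balance"]
            row["expect_valid"] = False
        rows.append(row)
    return rows


# Small set for functional tests, larger set for load tests.
functional_data = generate_customers(50)
load_data = generate_customers(100_000, seed=7)
```

The same function serves both functional and load tests simply by changing `count`, and the `expect_valid` flag lets a test assert that the application rejects the invalid rows rather than only exercising happy paths.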
Remember, systems are only as intelligent as the data they hold. When implemented with the right components and controls, a test data framework has the potential to change the face of assurance through automation, speed, cost optimisation and tester efficiency. It can also help you scale from bug detection to bottleneck prevention.
An earlier version of this blog was published at #ThinkAssurance, the quality assurance (QA) and testing blog of Tata Consultancy Services.