DataOps Insights for Testers
Defining the Business Problems
A factory analogy for software is a powerful way to describe the varied problems testers face, and to highlight the assumptions entailed by our current thinking. Like other overworked members of key technical staff, testers are dubious of any claim that the newest way of thinking will actually amount to anything, because they've been burned too many times. The fact remains, however, that disruption does occur – and sometimes at massive scale. We've got to be ready to follow good data and sensible deductions to consider new things, because insight always arises out of what we haven't yet considered.
“Provision speed, data freshness, data completeness, data synchronicity and consistency within and among datasets, distribution speed, resource availability, reproducibility, and repeatability [can] all contribute to longer deployment frequency.” And in the midst of these problems, competitors are finding ways to go faster and faster. “The best companies are deploying 30x faster, delivery times are dropping as much as 83%, and unicorns like Amazon are now deploying new software every second.” Here's a short (and admittedly incomplete) list of tester challenges DataOps can address:
- Context Switch Speed/Cost: What do the time and cost to shut down or pause one thing and start up another do to tester productivity?
- Cost of Complexity: Our test data and test data delivery have too many steps, too much effort per step, and too many components.
- Cost of Optimizing Locally: Optimizing our own workflow often creates both the illusion of progress and an increase in cost.
- Time to Test Data/Test Environment Ready: How long does it take to return to a golden baseline – or to any secure, full-size baseline at all?
If we connect the dots between a day in the life of a tester and all the ways that data affects the speed and efficiency of that tester, the case is clear: test data is at the heart of many of the challenges testers face each day. These problems have been so intractable for so long that it's hard for testers to accept that different assumptions about how testing works are even possible. But, as we'll see, these assumptions are ripe for challenge.
Insight: “Blazing fast code deployment doesn’t solve the test data bottleneck”
Context Switch Speed/Cost
A few months back, I was chatting with an SVP of software development, and he commented that the frequency and cost of context switching was enemy number one for his development organization. In test data management, we need to context switch our testing for a variety of reasons: to change the release we're working on, to load new test data, to swap out data to keep from compromising someone else's dataset, to triage a production problem, and so on. But often we stop ourselves from context switching, even when it's beneficial, because of the speed and cost implications.
For example, “reproducing defects can also slow down deployment. Consider that quite often developers complain that they can't reproduce the defect that a tester has found. This often leads to a full halt in the testing critical path as the tester must 'hold' her environment to let the developer examine it. In some cases, whole datasets are held hostage while triage occurs.”
Or, “platforms that could be re-purposed as test-ready environments are fenced in by context-switching costs. Testers know the high price of a context switch, and the real possibility that switching back will fail, so they simply hold their environment for 'testing' rather than risk it. Behaviors driven by the cost of context switching create increased serialization and more sub-setting; ironically, by optimizing their part of the product/feature delivery pipeline, testers end up contributing to one of the bottlenecks that prevent that pipeline from moving faster globally.”[9]
A simple realization is that the state of a dataset has value in proportion to the time and expense it takes to reproduce it. When that dataset is at hand, that cost is almost nothing. But when you want to switch between two states, the cost can rise significantly. Moreover, when more than one person is using a dataset at the same time, fencing costs come in as we invest to keep the work of one person from clobbering the work of another. Sometimes these fencing schemes work; sometimes they fail, causing all sorts of delays as people “freeze” environments, are forced to wait, or find alternate methods to test.
The capabilities of a DataOps solution massively deflate these costs. First, the collection of all of the states of a dataset (and its associated configurations) is held in a rapidly accessible and provisionable form. Second, since these datasets are lightweight, they are also rapidly mobile, meaning it's simple to decouple the dataset from the host. Third, since DataOps solutions also power libraries of datasets, it can be simple to switch dataset contexts rapidly: activate one dataset, then activate a different one in its place, then go back to the original exactly as it was, because DataOps solutions can perfectly preserve data state. The DataOps virtualization capability provides incredible speed, and the massive sharing of common data across versions of datasets provides significantly lower cost. Together, these break the reasons behind the sort of behaviors that make both environments and workflows brittle and serialized.
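To make the "library of dataset states" idea concrete, here is a minimal Python sketch. The DatasetLibrary class, its method names, and the deep-copy stand-in for copy-on-write cloning are illustrative assumptions rather than any vendor's actual API; the point is only that activating a previously captured state becomes a cheap lookup instead of a lengthy restore.

```python
# Minimal sketch of a "library of dataset states" (illustrative names only,
# not a vendor API). Each snapshot preserves a labeled state so that a
# context switch becomes a lookup rather than a multi-hour restore.
import copy
from datetime import datetime


class DatasetLibrary:
    def __init__(self):
        self._snapshots = {}   # label -> (timestamp, preserved state)
        self.active = None     # the dataset state currently provisioned

    def snapshot(self, label, state):
        """Capture the current state under a label (e.g. 'release-5.2-baseline')."""
        self._snapshots[label] = (datetime.now(), copy.deepcopy(state))

    def activate(self, label):
        """Context switch: provision a previously captured state exactly as it was."""
        _, state = self._snapshots[label]
        self.active = copy.deepcopy(state)  # stand-in for a copy-on-write clone
        return self.active


# Usage: hold a defect-triage state, switch away, and come back unchanged.
lib = DatasetLibrary()
lib.snapshot("release-5.2-baseline", {"orders": 1_000_000, "masked": True})
lib.snapshot("bug-4711-repro", {"orders": 1_000_042, "masked": True})

lib.activate("bug-4711-repro")        # developer triages the defect
lib.activate("release-5.2-baseline")  # tester keeps testing meanwhile
assert lib.activate("bug-4711-repro")["orders"] == 1_000_042  # exact state preserved
```

In a real DataOps platform the snapshot would be a thin, block-sharing clone rather than a full copy, which is what makes switching back and forth nearly free.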
Insight: DataOps capabilities cause the cost of context switching to plummet.
Cost of Complexity
Where does complexity arise in the context of test data? Certainly in making sure you have the right test cases to provide code coverage. Or in building rules to generate synthetic data. Or, at least for some, in the premise that subsets are preferable, and that we therefore need rules to make subsets consistent within and across the different datasets of an application or business flow under test. In the world of manufacturing, complexity often comes down to a few simple but powerful observations: fewer steps, fewer components, tools that fit the user's scale, components that are standard and easy to place, and a design for manufacture or maintainability each play a part in reducing overall complexity. Consider complexity in the context of these common test data tasks:
- Build (Creating Test Data): Taking a complete set, or building a subset, of an application's data that maintains referential integrity (or a subset of many dataset components in a composite app that maintains distributed referential integrity) – and, often, doing either one with masked data.
- Update (Maintaining Continuous Data): Fetching, and continuing to fetch, a changed copy of a dataset (a new build, a new set of test data, data from production for a bug fix) to maintain currency with the codebase, with changes in the test suite, or with a reference dataset like production.
- Provision (Provisioning Test Data): Provisioning any of a number of versions of application data (by release, by bug number, by condition – masked/subsetted/synthetic), etc.
With any scheme that involves selecting and transforming data to arrive at a test dataset, complexity rises because the scheme must rely on the rules embedded in the dataset's ontology (i.e., its data model), the code coverage approach (e.g., covering every branch of code), and any hidden or implicit relationships within the data. Despite improvements in AI and automation, the fact remains that these rules, approaches, and relationships are often only discernible by experts. Further, since it's the nature of applications to evolve, this level of complexity is sustained in proportion to each change in the code. Translating that into our language of complexity, we observe that creating test data via sub-setting exhibits complexity on every count: discovering the rules that define the data involves many steps, defining the rules to subset the data involves many steps and components, the need for experts makes process compression and standardization difficult, and the complexity remains proportional to the change, suggesting it is not designed for maintainability. Unfortunately, poorly crafted subsets are rife with mismatches because they fail to maintain referential integrity. And they often result in hard-to-diagnose performance errors that crop up much later in the release cycle.
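The referential-integrity pitfall is easy to see in miniature. The sketch below uses hypothetical table and column names (customers, orders, customer_id) to show how a naive subsetting rule quietly orphans child rows, which is exactly the class of mismatch described above.

```python
# Illustrative only: a naive subset rule and a referential-integrity check.
# Table and column names are hypothetical.
customers = [{"id": i, "region": "EU" if i % 2 else "US"} for i in range(1, 11)]
orders = [{"id": 100 + i, "customer_id": (i % 10) + 1} for i in range(30)]

# Naive rule: keep only EU customers, but keep every order (a common mistake).
subset_customers = [c for c in customers if c["region"] == "EU"]
subset_orders = list(orders)


def orphaned_rows(parents, children, fk="customer_id"):
    """Return child rows whose foreign key no longer resolves inside the subset."""
    parent_ids = {p["id"] for p in parents}
    return [c for c in children if c[fk] not in parent_ids]


broken = orphaned_rows(subset_customers, subset_orders)
print(f"{len(broken)} of {len(subset_orders)} orders lost referential integrity")
```

Real subsetting tools work hard to avoid exactly this, but the rules they need grow with every relationship in the schema, which is the complexity cost described above.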
Masking, of course, has a different purpose. And while it shares the need to understand the domain and rules of the dataset's ontology, it has much less to do with the actual code being executed. However, since the cost to understand the ontology is already sunk, it was natural to see these two capabilities dovetail in so many of the products on the market today.
Sub-setting remains popular. But what happens when we ask why it remains popular? It turns out that “to solve the pain of provisioning large test datasets, test leaders often turn to sub-setting to save storage and improve execution speed.” There's a whole industry built around the notion that this is the best solution, and entire consulting organizations implicitly support that same proposition because it drives so much consulting revenue. DataOps challenges both the need for and the value of subsets.
The recent TechWell survey concluded that, among testers, the primary motivation behind sub-setting was storage savings. But one key capability of DataOps – data virtualization – erases virtually all of the value found in sub-setting, because data already lives in its most compressed and shared state. What does that mean? It means that, all things being equal, the net storage for ten fresh, full-size copies of a dataset using a DataOps solution is typically LESS than the storage needed for ten 10%-size subsets of that same dataset without a DataOps solution. So what? Consider three factors in the lifecycle cost of your test data: build, update, and provision.
In the build phase, a solution that doesn't need to subset doesn't need all of those rules, approaches, and schemes, yet still arrives at the same cost savings. Further, referential integrity is, by definition, as good as the referential integrity of the application from the moment of inception.
In the update phase, rules discovery still doesn’t need to happen. And, DataOps solutions can integrate incremental data from large datasets in minutes.
In the provision phase, as the points on the timeline grow, the cost of the dataset doesn't continue to rise as it does with subsets. Consider that with two distinct point-in-time subsets, you might need 2x the storage. But the incremental capability of DataOps solutions permits them to integrate and share common data (and in most cases, data is 90% common).
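To see why ten full virtual copies can undercut ten 10% subsets, here is some back-of-the-envelope arithmetic. The dataset size, compression ratio, and share rate below are illustrative assumptions, not measured figures.

```python
# Back-of-the-envelope storage comparison (all figures are illustrative
# assumptions, not benchmarks).
full_size_tb = 1.0        # size of the source dataset
copies = 10               # number of test copies needed

# Without DataOps: ten 10% subsets, each stored in full.
subset_fraction = 0.10
subset_storage = copies * subset_fraction * full_size_tb

# With virtualization: one compressed baseline plus each copy's unique blocks
# (data assumed ~90% common across copies, ~3x block-level compression).
compression_ratio = 3.0
unique_fraction = 0.10
virtual_storage = (full_size_tb / compression_ratio) \
    + copies * (unique_fraction * full_size_tb / compression_ratio)

print(f"Ten 10% subsets:         {subset_storage:.2f} TB")
print(f"Ten full virtual copies: {virtual_storage:.2f} TB")
```

Under these assumptions the ten subsets consume about 1 TB while the ten full virtual copies consume roughly 0.67 TB; the exact numbers will vary by environment, but the shape of the comparison is the point.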
That “Library of Versions” concept within DataOps extends to distributed referential integrity as well. Placing large datasets into containers with common timelines makes synchronicity easy. And synchronicity is typically the biggest time culprit in preparing large and complex datasets.
Typically, here, a subset proponent will raise his or her hand and offer a well-reasoned defense for using subsets to keep the size of development databases down so that they don't “stuff the box”. Fair enough; there is some business value there. But the counter-contention is that the value is far less than you think. First, anecdotally we find that only about 10% of applications reach the scale where the cost of production-scale hardware is in the same ballpark as the cost of building and maintaining a subset scheme. Second, there is a shift right that occurs when performance tests are conducted too late in the cycle, and full-size datasets mitigate some of that risk. Third, the hidden assumption is often that data cannot be agile at scale, so time slicing, dataset and test environment agility, and other ways to treat an application more like an on-premises cloud haven't been factored in. Are there situations where a subset is absolutely necessary? Sure. But the cost and agility premises behind most subset approaches are so fully mitigated by DataOps capabilities that the real need is more like 5%.
Insight: DataOps’ storage and agility advantages kill the case for subsets.
Cost of Optimizing Locally
It's natural to optimize our own environment – to arrange things so that we can be the most productive. Two problems arise. First, if we work at a speed of delivery faster than the next step in the assembly process can absorb, we create inventory. In fact, whenever we are working at a pace out of step with the speed of the bottleneck, the system is not optimized. Second, artificial constraints can warp the way we do work. When it comes to test data, what does this mean?
What is “inventory” when it comes to a developer or tester? It could be code that has been unit tested but has not yet begun integration or other testing. It could be test data that's in the process of being built, or that is ready but hasn't been provisioned to an environment yet.
Or, it could be rules and configurations for defining data subsets, or, for administrators, masking rules that have yet to be applied. The real kicker is the second issue: when it takes too long to get good data, we find other avenues to get our work done. Instead of a refresh, we may write our own scripts to generate fresh test data. Or, instead of a refresh, we may use the old dataset even though it exposes us to untested corner cases – and that gets worse as integration gets more complex. Keeping data fresh can mean expensive investments in masking and data pipelines.
Environment availability also prevents the right data from getting to the right place just in time. Many testers share a limited number of environments, forcing platforms to be overloaded with streams; the resulting sharing and serialization force delay, rework, and throwaway work. Some testers wait until an environment is ready. Others write new test cases rather than wait, and still others write test cases they know will be thrown away.
Insight: “Speeding up” by not using fresh data creates lots of hidden cost/risk.
Time to Test Data or Test Environment Ready
“Feature delivery can take a hard shift right as errors pile up from stale data or as rework enters because new data breaks the test suite. Why is the data out of date? Most companies fail to provision multi-TB test datasets in anywhere near the timeframes in which they can build their code. For example, 30% of companies take more than a day and 10% more than a week to provision new databases.”
Testers are often at the mercy of an application owner, a backup schedule, or a resource constraint that forces them to gather their copy of the dataset at different times. These time differences create consistency problems the tester has to solve, because without strict consistency, distributed referential integrity problems can suddenly scale up factorially. This leads to solutions with even more complex rulesets and time logic. Compounding federation with sub-setting can mean a whole new world of hurt, as subset rules must be made consistent across the federated app.
These problems are all subsumed into a tester's most basic need: to restart and repeat her test using the right data. Consider, then, that repeating the work to restore an app (or, worse, a federated app), synchronize it, subset it, mask it, and distribute it scales up the entire testing burden in proportion to the number of test runs. That's manageable within a single app, but it can quickly grow unwieldy at the scale of a federated app.
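A small sketch of the synchronicity problem: given snapshot timestamps from each component of a federated app (the system names and the 15-minute tolerance window below are hypothetical), verifying that they fall inside a common consistency window is trivial; producing snapshots that actually do so, without a shared timeline, is the hard part described above.

```python
# Illustrative consistency check for a federated app's snapshots.
# System names and the tolerance window are hypothetical.
from datetime import datetime, timedelta

snapshot_times = {
    "orders_db":    datetime(2024, 3, 1, 2, 0),
    "billing_db":   datetime(2024, 3, 1, 2, 5),
    "inventory_db": datetime(2024, 3, 1, 23, 30),  # refreshed on a different schedule
}


def within_window(snapshots, window=timedelta(minutes=15)):
    """True if every component snapshot falls inside one consistency window."""
    times = list(snapshots.values())
    return max(times) - min(times) <= window


if not within_window(snapshot_times):
    spread = max(snapshot_times.values()) - min(snapshot_times.values())
    print(f"Inconsistent federated baseline: snapshots span {spread}")
```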
In a recent study, 76% of respondents claimed that “Slow deployment to test and development environments” was a major challenge in the adoption of continuous delivery. Test environment delivery is a major challenge. And self-service data delivery and on-command provisioning in very little time are two knockout capabilities that DataOps brings to the table right now.
Insight: The time to test/environment ready is a key predictor of test maturity.