Migrating to Snowflake? Here's What You Need to Test - Tricentis

Page created by Herbert Edwards
 
Migrating to Snowflake?
Here’s What You Need to Test

Migrating legacy data (e.g., from IBM Netezza, Oracle, MSSQL, PostgreSQL…) to Snowflake is not a simple “lift
and shift.” It can’t happen all at once, and it must be tested to ensure the tens of thousands of data reports
continue to operate properly on the new model. Organizations must ensure the data is moved efficiently by
performing extensive validation and reconciliation across the old and new worlds.
Today, migration and ongoing Snowflake testing typically rely on manual tests built around a dependency
matrix. This process is error prone, so many iterations are usually required. As a result, timelines can
easily slip from days or weeks to months, adding considerable delays and costs to the migration project.
This paper outlines the top challenges that enterprise organizations typically encounter during a Snowflake
migration. For each challenge, we briefly explain how Tricentis Data Integrity has been used to address the
challenge. We conclude with a proactive approach to eliminating data integrity issues—before, during, and
after Snowflake migration.

Challenges of Snowflake Migration
No database migration is simple. Gartner reported that 83% of data migration projects either fail to meet
budgets and schedule expectations…or fail altogether. Migrating workloads from on-prem solutions to cloud
databases is even more complex. Forrester estimates that an average Snowflake migration requires full-time
involvement from 3 DBAs/IT staff for 6 months—plus considerable consulting time.
Here are the top Snowflake migration challenges we’ve encountered at customer and prospect sites, along
with ways to address them.

Data must be migrated incrementally
Snowflake is so popular that it limits transactions to its system. You can be limited to transferring only
10 GB a day unless you get special permission. For a 5-terabyte system, that could mean 500 days, and that
is if you do it perfectly. To avoid drawing out an already lengthy process, automatically reconcile and
validate each transfer as it happens.
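The 500-day figure is simple division. The short check below is only a sketch: it assumes decimal units (1 TB = 1,000 GB) and the 10 GB daily cap quoted above, and the function name is illustrative.

```python
import math


def days_to_transfer(total_gb: float, daily_cap_gb: float) -> int:
    """Minimum number of whole days needed to move total_gb at daily_cap_gb per day."""
    if daily_cap_gb <= 0:
        raise ValueError("daily_cap_gb must be positive")
    # Round up: a partially used final day still counts as a day.
    return math.ceil(total_gb / daily_cap_gb)


# 5 TB expressed in GB (assumed decimal units), at 10 GB per day
print(days_to_transfer(5 * 1000, 10))  # 500
```

Any failed or repeated transfer adds days on top of this minimum, which is why each increment is worth validating as it lands.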
Tricentis Data Integrity verifies that the data moves efficiently and accurately. Automated reconciliation tests
provide instant insight into which transformation requirements have been tested and whether those tests
succeeded or failed.
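To make the idea of per-transfer reconciliation concrete, here is a minimal sketch of one common technique: comparing a row count and an order-independent checksum for each batch. It is not a description of how Tricentis Data Integrity works internally. The function names are hypothetical, and the sketch assumes both sides return rows with the same column order and comparable Python types (in practice, values such as decimals and timestamps usually need normalizing first).

```python
import hashlib
from typing import Iterable, Sequence, Tuple


def batch_fingerprint(rows: Iterable[Sequence]) -> Tuple[int, str]:
    """Return (row_count, order-independent checksum) for a batch of rows."""
    count = 0
    combined = 0
    for row in rows:
        # Hash a canonical text form of the row; repr keeps None distinct from "None".
        digest = hashlib.sha256(repr(tuple(row)).encode("utf-8")).digest()
        # Adding digests modulo 2**256 ignores row order and, unlike XOR,
        # does not let a pair of duplicate rows cancel each other out.
        combined = (combined + int.from_bytes(digest, "big")) % (1 << 256)
        count += 1
    return count, f"{combined:064x}"


def reconcile_batch(source_rows: Iterable[Sequence], target_rows: Iterable[Sequence]) -> bool:
    """True when both batches contain the same rows, regardless of order."""
    return batch_fingerprint(source_rows) == batch_fingerprint(target_rows)


source = [(1, "alice", None), (2, "bob", 10.5)]
target = [(2, "bob", 10.5), (1, "alice", None)]  # same rows, different order
print(reconcile_batch(source, target))
```

Running a check like this after every increment localizes a mismatch to a single batch instead of the whole data set.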

Organizations don’t want to move and store bad data
Since Snowflake charges per terabyte, it’s in your best interest to clean “garbage data” and duplicate records
before moving it over. However, few organizations have the time or resources to do this at any point, much
less when they’re preparing for a massive migration project. Automatically-generated tests that expose data
errors will not only save time, but also enable a much more thorough and accurate inspection than manual
efforts ever could.
Tricentis Data Integrity’s “pre-screening” tests expose data that isn’t fit for migration. For instance, they
find missing values, duplicates, data format issues, data beyond the acceptable range, etc.
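The four kinds of checks named above can be sketched in a few lines. This is an illustrative example only, not the product's implementation: the field names (`id`, `order_date`, `amount`), the ISO date format, and the accepted amount range are all assumptions chosen for the demonstration.

```python
import re
from collections import Counter

# Assumed format rule for the demo: ISO dates such as 2021-03-01
DATE_FORMAT = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def prescreen(rows, key="id", date_field="order_date",
              amount_field="amount", amount_range=(0, 1_000_000)):
    """Return the row indexes that fail each pre-screening check.

    rows is a list of dicts; all field names are hypothetical.
    """
    low, high = amount_range
    issues = {"missing": [], "duplicate": [], "bad_format": [], "out_of_range": []}
    key_counts = Counter(row.get(key) for row in rows)
    for index, row in enumerate(rows):
        # Missing values: any field that is None or an empty string
        if any(value is None or value == "" for value in row.values()):
            issues["missing"].append(index)
        # Duplicates: the key value appears on more than one row
        if key_counts[row.get(key)] > 1:
            issues["duplicate"].append(index)
        # Format: a date is present but does not match the expected pattern
        date_value = row.get(date_field)
        if date_value and not DATE_FORMAT.match(str(date_value)):
            issues["bad_format"].append(index)
        # Range: a numeric amount falls outside the accepted bounds
        amount = row.get(amount_field)
        if isinstance(amount, (int, float)) and not (low <= amount <= high):
            issues["out_of_range"].append(index)
    return issues


sample = [
    {"id": 1, "order_date": "2021-03-01", "amount": 10.0},
    {"id": 2, "order_date": "03/01/2021", "amount": 20.0},
    {"id": 2, "order_date": "2021-03-02", "amount": -5},
    {"id": 3, "order_date": None, "amount": 30.0},
]
print(prescreen(sample))
```

Reporting row indexes per check, rather than a single pass/fail flag, makes it easier to decide which records to clean and which to leave behind.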

Migrating workloads is tedious and error-prone
Moving workloads from legacy environments to Snowflake is an error-prone activity with a high risk of
disrupting business as usual. Migrating code, business logic, and analytics jobs each comes with its own set
of challenges. For example, every workload needs an exact equivalent on the target that meets production
performance SLAs. To achieve this, enterprises must perform all the following steps before putting new
workloads into production:

     1.   Thoroughly assess the existing inventory of workloads to identify the chain of workloads to be moved.
     2.   Match the source and target data.
     3.   Convert scripts, business logic, reporting logic, etc.
     4.   Validate the migrated logic.
Tricentis Data Integrity is used to validate the migrated logic. Typically, we find that 60% of the legacy data
workloads can be migrated as-is, 20% of workloads might require some additional optimization, and 20% of
workloads require total re-engineering. In all cases, testing can be automated with our end-to-end suite of
data integrity tests. These tests span from pre-screening, to vital checks for consistency and correctness,
through any data transformations, and finally to the analytics and report checks that verify the process was
completed correctly.
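Step 4, validating the migrated logic, usually comes down to running the legacy query and its converted equivalent and comparing the results. The sketch below illustrates that pattern under loud assumptions: two in-memory SQLite databases stand in for the legacy system and Snowflake, and the `sales` table, both queries, and all function names are hypothetical.

```python
import sqlite3
from collections import Counter


def make_db(rows):
    """Build an in-memory database with a hypothetical sales(region, amount) table."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    connection.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return connection


def compare_logic(source_conn, source_query, target_conn, target_query):
    """Run both queries and compare their results as multisets of rows."""
    source = Counter(source_conn.execute(source_query).fetchall())
    target = Counter(target_conn.execute(target_query).fetchall())
    # Rows only in source were lost or altered; rows only in target are unexpected.
    return source == target, source - target, target - source


# Stand-ins for a legacy report query and its rewritten equivalent
LEGACY_QUERY = "SELECT region, SUM(amount) FROM sales GROUP BY region"
CONVERTED_QUERY = (
    "WITH s AS (SELECT region, amount FROM sales) "
    "SELECT region, SUM(amount) FROM s GROUP BY region"
)

rows = [("east", 10), ("east", 20), ("west", 5)]
matches, missing, unexpected = compare_logic(
    make_db(rows), LEGACY_QUERY, make_db(rows), CONVERTED_QUERY
)
print(matches, dict(missing), dict(unexpected))
```

Comparing multisets rather than sets means a row that appears twice on one side and once on the other is still reported as a difference.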

Data processes are deeply embedded
In a legacy RDBMS environment, existing ETL pipelines push data into the warehouse, customized visualization
tools pull data out of it, and custom applications depend closely on it. When you move to Snowflake, all
these processes must be re-engineered... and tested.
Tricentis Data Integrity can effectively deal with the reconciliation and validation required to make the
migration risk-free.

  Tricentis                                                                                        www.tricentis.com


Rethink Data Testing: Before, During, and
After Your Snowflake Migration
Rather than tackle these challenges with a “whack-a-mole” approach, use Snowflake migration as an
opportunity to modernize and transform your overall approach to data integrity—just like you’re modernizing
and transforming your approach to data management.
Tricentis Data Integrity’s end-to-end data reconciliation and validation has helped top organizations unleash
the full power and speed of Snowflake.

    •    Before the migration, take the opportunity to assess the data, identify issues, and fix them so your
         Snowflake data is streamlined and accurate from the start.

    •    During the migration, automatically detect unintentional changes from the old data to the new
         Snowflake stores and processes. This automated regression testing can run throughout the migration
         period to expose change impacts the moment they are introduced—which is when they are 10X faster
         to find and fix.

    •    Once you’re up and running on Snowflake, reuse the same tests to identify when ongoing system
         modifications compromise your processes and data. These tests are extensible, reusable, and
         resilient, and you can embed them into the DevOps toolchain of your choice. With this baseline,
         you can expose unintentional data impacts as soon as they occur.
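One simple way to picture the baseline idea in the last two points is a data profile that is captured once and then compared against every later run. The following is a minimal sketch with hypothetical function names and metrics, not the product's test format; a pipeline step could fail whenever the returned list is non-empty.

```python
def profile(rows, columns):
    """Summarize a table: row count plus per-column nulls, distinct count, min, max."""
    result = {"row_count": len(rows)}
    for column in columns:
        values = [row[column] for row in rows if row[column] is not None]
        result[column] = {
            "nulls": len(rows) - len(values),
            "distinct": len(set(values)),
            "min": min(values) if values else None,
            "max": max(values) if values else None,
        }
    return result


def diff_profiles(baseline, current):
    """Return (metric_path, baseline_value, current_value) for every changed metric."""
    changes = []
    for key, base_value in baseline.items():
        current_value = current.get(key)
        if isinstance(base_value, dict):
            # Per-column metrics are compared one by one
            for metric, expected in base_value.items():
                actual = (current_value or {}).get(metric)
                if actual != expected:
                    changes.append((f"{key}.{metric}", expected, actual))
        elif current_value != base_value:
            changes.append((key, base_value, current_value))
    return changes


before = [{"id": 1, "amount": 10}, {"id": 2, "amount": 20}]
after = [{"id": 1, "amount": 10}, {"id": 2, "amount": None}]  # a value was lost
baseline = profile(before, ["id", "amount"])
for change in diff_profiles(baseline, profile(after, ["id", "amount"])):
    print(change)
```

Because the same profile function runs before, during, and after the migration, the baseline doubles as the regression test.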

For a deeper dive into what’s involved in this strategy—including a look at how we approach each step—
watch our webinar.

About Tricentis Data Integrity
Tricentis Data Integrity is the industry’s top end-to-end data testing solution for enterprise organizations. Our
end-to-end automation covers everything from the integrity of the data fed into your system, to the accuracy
of integrations, transformations, and migrations, to verification of report logic and presentation.
Tricentis Data Integrity takes advantage of the unique capabilities of Snowflake. For example, with Time
Travel, we create tests that profile and monitor changes in the data as it enters the process, not at the
end when bad data is already impacting end users in the business units.
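As a rough illustration of what a Time Travel based check can look like, the sketch below builds a Snowflake query that lists rows present in a table now but absent from the same table a given number of seconds ago. It relies on Snowflake's documented `AT(OFFSET => ...)` clause and the `MINUS` set operator. The helper name and the table name are hypothetical, the query is only constructed here rather than executed, and this is not a description of the tests Tricentis generates.

```python
import re

# Conservative check for unquoted identifiers, optionally qualified as db.schema.table
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*(\.[A-Za-z_][A-Za-z0-9_$]*){0,2}$")


def time_travel_diff_query(table: str, seconds_ago: int) -> str:
    """Build SQL returning rows that exist now but did not exist seconds_ago."""
    if not _IDENTIFIER.match(table):
        raise ValueError(f"unsupported table identifier: {table!r}")
    if seconds_ago <= 0:
        raise ValueError("seconds_ago must be positive")
    # AT(OFFSET => -N) reads the table as it was N seconds before the current time.
    return (
        f"SELECT * FROM {table}\n"
        f"MINUS\n"
        f"SELECT * FROM {table} AT(OFFSET => -{seconds_ago})"
    )


print(time_travel_diff_query("sales.orders", 3600))
```

Swapping the two halves of the query gives the opposite view: rows that existed earlier and have since been removed or changed.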
What sets Tricentis Data Integrity apart?


    •    End-to-end: Automates end-to-end data testing covering all reconciliation and validation tasks from
         sources to stores to reporting and visualizations.

    •    Any technology: Sits on top of any data landscape, covering structured, unstructured, and message
         data from any source or technology as well as reports in any analytics tool via UI, API, and PDF.

    •    Snowflake enrichment: Allows you to create tests utilizing Snowflake’s unique capabilities (such as
         Time Travel) to pinpoint data regression issues as they happen at the source.

    •    Accessible automation: Enables Business Analysts, Data Stewards, Data Engineers, etc. to automate
         testing, replacing spotty “stare and compare” checking as well as complex, unscalable SQL scripting.

    •    CI/CD integration: Integrates into CI/CD pipelines to ensure frequent application changes don’t
         inadvertently alter ETL processes and compromise data quality.

    •    Enterprise grade: Delivers a mature enterprise-grade solution with highly-scalable performance and
         enterprise-grade global support to help you achieve your goals, fast.

    •    Risk-based: Guides teams to focus limited testing resources on top business risks; reveals whether a
         release candidate is sufficiently tested and fit for release.

Next Steps
Learn more about how Tricentis can help your organization simplify your migration to Snowflake and ensure
that ongoing system modifications in Snowflake don’t compromise data integrity. Contact your organization’s
Tricentis representative to schedule a briefing with our data integrity specialists.
