Value-Focused Due Diligence with Data Analytics

Private equity funds are shifting away from asset due diligence toward value-focused due diligence. Historically, the due diligence (DD) process centered on an audit of a portfolio company’s assets. Now, private equity (PE) firms are adopting value-focused DD strategies that are more comprehensive in scope and focus on revealing the potential of an asset.

Data analytics is key to supporting private equity groups as they conduct value-focused due diligence. Investors recognize the power of data analytics technologies to accelerate deal throughput, reduce portfolio risk, and streamline the entire DD process. Data and analytics are essential enablers for any kind of value creation, and with them, PE firms can precisely quantify the opportunities and risks of an asset.

The Importance of Taking a Value-Focused Approach to Due Diligence

Due diligence is an integral phase in the merger and acquisition (M&A) lifecycle. It is the critical stage that grants prospective investors a view of everything happening under the hood of the target business. What is discovered during DD will ultimately impact the deal negotiation phase and inform how the sale and purchase agreement is drafted.

The traditional due diligence approach inspects the state of assets, and it is comparable to a home inspection before the house is sold. There is a checklist to tick off: someone evaluates the plumbing, another looks at the foundation, and another person checks out the electrical. In this analogy, the portfolio company is the house, and the inspectors are the DD team.

Asset-focused due diligence has long been the preferred method because it simply has worked. However, we are now contending with an ever-changing, unpredictable economic climate. As a result, investors and funds are forced to embrace a DD strategy that adapts to the changing macroeconomic environment.

With value-focused DD, partners at PE firms are not only using the time to discover cracks in the foundation, but also to identify and quantify significant opportunities that can be realized during the ownership period. Returning to the house analogy: during DD, partners can find the leaky plumbing and also scope out the investment opportunities (and costs) of converting the property into a short-term rental.

The shift from traditional asset due diligence to value-focused due diligence largely comes from external pressures, like an uncertain macroeconomic environment and stiffening competition. These challenges place PE firms in a race to find ways to maximize their upside and execute their ideal investment thesis. The more opportunities a PE firm can identify, the more competitive it can be for assets and the more aggressive it can be in its bids.

Value-Focused Due Diligence Requires Data and Analytics

As private equity firms increasingly adopt value-focused due diligence, they are crafting a more complete picture using data they are collecting from technology partners, financial and operational teams, and more. Data is the only way partners and investors can quantify and back their value-creation plans.

During the DD process, there will be mountains of data to sift through. Partners at PE firms must analyze it, discover insights, and draw conclusions from it. From there, they can execute specific value-creation strategies that are tracked with real operating metrics, rooted in technological realities, and modeled accurately to the profit and loss statements.

This makes data analytics an important and powerful tool during the due diligence process. Data analytics can come in different forms:

  • Data Scientists: PE firms can hire data science specialists to work with the DD team. Data specialists can process and present data in a digestible format for the DD team to extract key insights while remaining focused on key deal responsibilities.
  • Data Models: PE firms can use a robustly built data model to create a single source of truth. The data model can combine a variety of key data sources into one central hub. This enables the DD team to easily access the information they need for analysis directly from the data model.
  • Data Visuals: Data visualization can aid DD members in creating more succinct and powerful reports that highlight key deal issues.
  • Document AI: Harnessing the power of document AI, DD teams can glean insights from a portfolio company’s unstructured data to create an ever more well-rounded picture of a potential acquisition.

Data Analytics Technology Powers Value

Value-focused due diligence requires digital transformation. Digital technology is the primary differentiating factor that can streamline operations and power performance during the due diligence stage. Moreover, the right technology can increase the value of a company, while the wrong technology can diminish it.

Data analytics ultimately allows PE partners to find operationally relevant data and KPIs needed to determine the value of a portfolio company. There will be enormous amounts of data for teams to wade through as they embark on the DD process. However, savvy investors only need the right pieces of information to accomplish their investment thesis and achieve value creation. Investing in robust data infrastructure and technologies is necessary to implement the automated analytics needed to more easily discover value, risk, and opportunities. Data and analytics solutions include:

  • Financial Analytics: Financial dashboards can provide a holistic view of portfolio companies. DD members can access on-demand insights into key areas, like operating expenses, cash flow, sales pipeline, and more.
  • Operational Metrics: Operational data analytics can highlight opportunities and issues across all departments.
  • Executive Dashboards: Leaders can access the data they need in one place. This dashboard is highly tailored to present hyper-relevant information to executives involved with the deal.

Conducting value-focused due diligence requires timely and accurate financial and operating information available on demand. 2nd Watch partners with private equity firms to develop and execute the data, analytics, and data science solutions PE firms need to drive these results in their portfolio companies. Schedule a no-cost, no-obligation private equity whiteboarding session with one of our private equity analytics consultants.


Modern Data Warehouses and Machine Learning: A Powerful Pair

Artificial intelligence (AI) technologies like machine learning (ML) have changed how we handle and process data. However, AI adoption isn’t simple. Most companies utilize AI only for the tiniest fraction of their data because scaling AI is challenging. Typically, enterprises cannot harness the power of predictive analytics because they don’t have a fully mature data strategy.

To scale AI and ML, companies must have a robust information architecture that executes a company-wide data and predictive analytics strategy. This requires businesses to apply their data beyond cost reduction and operations alone. Fully embracing AI will require enterprises to make judgment calls and face challenges in assembling a modern information architecture that readies company data for predictive analytics.

A modern data warehouse is the catalyst for AI adoption and can accelerate a company’s data maturity journey. It’s a vital component of a unified data and AI platform: it collects and analyzes data to prepare the data for later stages in the AI lifecycle. Utilizing your modern data warehouse will propel your business past conventional data management problems and enable your business to transform digitally with AI innovations.

What is a modern data warehouse?

On-premises and legacy data warehouses are not sufficient for a competitive business. Today’s market demands that organizations rely on massive amounts of data to best serve customers, optimize business operations, and increase their bottom lines. On-premises data warehouses are not designed to handle this volume, velocity, and variety of data and analytics.

If you want to remain competitive in the current landscape, your business must have a modern data warehouse built on the cloud. A modern data warehouse automates data ingestion and analysis, which closes the loop that connects data, insight, and analysis. It can run complex queries to be shared with AI technologies, supporting seamless ML and better predictive analytics. As a result, organizations can make smarter decisions because the modern data warehouse captures and makes sense of organizational data to deliver actionable insights company-wide.

How does a modern data warehouse work with machine learning?

A modern data warehouse operates at different levels to collect, organize, and analyze data to be utilized for artificial intelligence and machine learning. These are the key characteristics of a modern data warehouse:

Multi-Model Data Storage

Data is stored in the warehouse using the model that best optimizes performance and integration for each type of business data.

Data Virtualization

Data that is not stored in the data warehouse is accessed and analyzed at the source, which reduces complexity, risk of error, cost, and time in data analysis. 

Mixed Workloads

This is a key feature of a modern data warehouse: mixed workloads support real-time warehousing. Modern data warehouses can concurrently and continuously ingest data and run analytic workloads.

Hybrid Cloud Deployment

Enterprises choose hybrid cloud infrastructure to move workloads seamlessly between private and public clouds for optimal compliance, security, performance, and costs. 

A modern data warehouse can collect and process the data to make the data easily shareable with other predictive analytics and ML tools. Moreover, these modern data warehouses offer built-in ML integrations, making it seamless to build, train, and deploy ML models.

What are the benefits of using machine learning in my modern data warehouse?

Modern data warehouses employ machine learning to adjust and adapt to new patterns quickly. This empowers data scientists and analysts to receive actionable insights and real-time information, so they can make data-driven decisions and improve business models throughout the company. 

Let’s look at how this applies to the age-old question, “how do I get more customers?” We’ll discuss two different approaches to answering this common business question.

The first methodology is the traditional approach: develop a marketing strategy that appeals to a specific audience segment. Your business can determine the segment to target based on your customers’ buying intentions and your company’s strength in providing value. Coming to this conclusion requires asking inductive questions about the data:

  • What is the demand curve?
  • What product does our segment prefer?
  • When do prospective customers buy our product?
  • Where should we advertise to connect with our target audience?

There is no shortage of business intelligence tools and services designed to help your company answer these questions. This includes ad hoc querying, dashboards, and reporting tools.

The second approach utilizes machine learning within your data warehouse. With ML, you can harness your existing modern data warehouse to discover the inputs that impact your KPIs most. You simply feed information about your existing customers into a statistical model, and the algorithms will profile the characteristics that define an ideal customer. We can then ask questions about specific inputs:

  • How do we advertise to women with annual income between $100,000 and $200,000 who like to ski?
  • What are the indicators of churn in our self-service customer base?
  • What frequently observed characteristics can be used to create a market segmentation?

ML builds models within your data warehouse to enable you to discover your ideal customer via your inputs. For example, you can describe your target customer to the model, and it will find potential customers that fall under that segment. Or, you can feed the model data on your existing customers and have the machine learn the most important characteristics.
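
As an illustration of this second approach, here is a minimal sketch in Python using pandas and scikit-learn. The customer table, column names (annual_income, likes_skiing, monthly_visits, converted), and the choice of a random forest are all assumptions for illustration; in practice, the frame would come from a query against your modern data warehouse.

```python
# A minimal sketch: profile the characteristics of an ideal customer
# from data already sitting in the warehouse. Column names are hypothetical.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# In practice this frame would be pulled from the modern data warehouse.
customers = pd.DataFrame({
    "annual_income":  [120_000, 45_000, 180_000, 95_000, 60_000, 150_000],
    "likes_skiing":   [1, 0, 1, 0, 0, 1],
    "monthly_visits": [8, 2, 12, 5, 1, 9],
    "converted":      [1, 0, 1, 1, 0, 1],   # label: did they become a high-value customer?
})

features = customers.drop(columns="converted")
labels = customers["converted"]

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(features, labels)

# Rank the inputs that most influence the "ideal customer" profile.
importance = pd.Series(model.feature_importances_, index=features.columns)
print(importance.sort_values(ascending=False))
```

The same pattern works for the churn question above: swap the label column for a churn flag, and the feature importances point to the indicators of churn in your self-service customer base.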

Conclusion

A modern data warehouse is essential for ingesting and analyzing data in our data-heavy world. AI and predictive analytics become more effective the more data they can draw on, making your modern data warehouse the ideal environment for these algorithms to run and enabling your enterprise to make intelligent decisions. Data science technologies like artificial intelligence and machine learning take it one step further and allow you to leverage the data to make smarter enterprise-wide decisions.

2nd Watch offers a Data Science Readiness Assessment to provide you with a clear vision of how data science will make the greatest impact on your business. Our assessment will get you started on your data science journey, harnessing solutions such as advanced analytics, ML, and AI. We’ll review your goals, assess your current state, and design preliminary models to discover how data science will provide the most value to your enterprise.

-Ryan Lewis | Managing Consultant at 2nd Watch

Get started with your Data Science Readiness Assessment today to see how you can stay competitive by automating processes, improving operational efficiency, and uncovering ROI-producing insights.


3 Data Priorities for Organic Value Creation

Organic value creation focuses on a few main areas: improving the current financial and operational performance of your companies, establishing a pattern of consistent growth, strengthening your organizational leadership team, and building the potential for a brighter future through product and competitive positioning. All of these are supported, at least in part, by the data foundation you create in your companies. At exit, your buyers want to be confident that the organic value you created is sustainable and will endure. Data and analytics are key to proving that.

Companies that solely focus on competition will ultimately die. Those that focus on value creation will thrive. — Edward De Bono

To organically create and drive value, there are a few key data priorities you should consider:

  1. A starting point is data quality, which underpins all you will ever do and achieve with data in your organization. Achieving better-quality data is an unrelenting task, one that many organizations overlook.
  2. Data monetization is a second priority and is also not top-of-mind for many organizations. The adage that “data is the new oil” is at least partially true, and most companies have ways and means to leverage the data they already possess to monetize and grow revenue for improved financial returns.
  3. A third data priority is to focus on user adoption. Having ready data and elite-level analytical tools is not sufficient. You need to be sure the data and tools you have invested in are broadly used – and not just in the short term. You also need to continue to evolve and enhance both your data and your tools to grow that adoption for future success.

Data Quality

Data quality is a complicated topic worthy of a separate article. Let’s focus this discussion on two things: trust in your data and the process that sustains data quality.

If you are organically growing your companies and increasing the use of and reliance upon your data, you had better make sure you can trust that data. The future of your analytics solutions and broad adoption across your operational management teams depend on your data being trustworthy. That trust means the data is accurate, consistent across the organization, timely, and maintained by a process that keeps it that way. There is also an assumption that your data aligns with external data sources. You can measure the accuracy of your portfolio company’s data in many ways, but the single best measure is going to be how your operating executives answer the question, “How much do you trust your data?”

Data quality is never stagnant. There are always new data sources, changes in the data itself, outside influences on the data, etc. You cannot just clean the data once and expect it to stay clean. The best analogy is a stream that can get polluted from any source that feeds into the stream. To maintain high data quality over time, you need to build and incorporate processes and organizational structures that monitor, manage, and own the quality of your company’s data.

One “buzzwordy” term often applied to good data governance is data stewardship – the idea being that someone within your enterprise has the authority and responsibility to keep your data of the highest quality. There are efficient and effective ways to dramatically improve your company data and to keep it of the highest quality as you grow the organization. Simply put, do something about data quality, make sure that someone or some group is responsible for data quality, and find ways to measure your overall data quality over time.
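
One lightweight way to “measure your overall data quality over time” is to score a few simple metrics on every load. The sketch below, in Python with pandas, checks completeness, uniqueness, and freshness for a hypothetical customer extract; the file name, columns, and thresholds are assumptions to adapt to your own sources.

```python
# A minimal data quality scorecard for a hypothetical customer extract.
# Metrics and thresholds are illustrative; adapt them to your own data.
from datetime import datetime, timedelta
import pandas as pd

customers = pd.read_csv("customer_extract.csv", parse_dates=["last_updated"])

metrics = {
    # Completeness: share of rows with no missing values in key fields.
    "completeness": customers[["customer_id", "email"]].notna().all(axis=1).mean(),
    # Uniqueness: share of customer_id values that are not duplicated.
    "uniqueness": 1 - customers["customer_id"].duplicated().mean(),
    # Freshness: share of rows updated within the last 7 days.
    "freshness": (customers["last_updated"] > datetime.now() - timedelta(days=7)).mean(),
}

for name, score in metrics.items():
    status = "OK" if score >= 0.95 else "INVESTIGATE"
    print(f"{name:>12}: {score:.1%}  [{status}]")
```

Run on a schedule and charted over time, scores like these give your data stewards an objective trend line rather than a gut feeling.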

A leading equipment distributor found new revenue sources and a sharper competitive edge by leveraging the cloud data warehouse that 2nd Watch built for their growing company to share data on parts availability in their industry. Using the centralized data, they can grow revenue, increase customer service levels, and gain more industry leverage from data they already own. Read this private equity case study here.

Data Monetization

Organic value creation can also come from creating value out of the data your portfolio companies already own. Data monetization can take several forms:

Enriching your internal data – Seek ways to make your data more valuable internally. This most often comes from cross-functional data creation (e.g., taking costing data and marrying it with sales/marketing data to infer lifetime customer value). The unique view that this enriched internal data offers will often lead to better internal decision-making and will drive more profitable analytics as you grow your analytics solutions library.

Finding private value buyers – Your data, cleansed and anonymized, is highly valuable. Your suppliers will pay for access to more data and information that helps them customize their offerings and prices to create value for customers. Your own customers would pay for enhanced information about your products and services if you can add value to them in the process. Within your industry, there are many ways to anonymize and sell the data that your portfolio companies create.

Finding public value buyers – Industry trade associations, consultancies, conference organizations, and the leading advisory firms are all eager to access unique insights and statistics they can use and sell to their own clients to generate competitive advantage.

Building a data factory mindset – Modern cloud data warehouse solutions make the technology to monetize your data quite easy. There are simple ways to make the data accessible and a marketplace for selling such data from each of the major cloud data warehouse vendors. The hardest part is not finding buyers or getting them the data; it is building an internal mindset that your internal data is a valuable asset that can be easily monetized. 

User Adoption

Our firm works with many private equity clients to design, build, and implement leading analytics solutions. A consistent lesson across our project work is that user adoption is a critical success factor.

More accurate, more timely, or more enriched data won’t necessarily increase the adoption of advanced analytical solutions in your portfolio companies. Not all of your operating executives are data-driven, nor are they all analytically driven. Just because they capably produce their monthly reporting package and get it to you on time does not mean they are acting on the issues and opportunities they should be able to discern from the data. Better training, organizational change techniques, internal data sharing, and many other approaches can dramatically increase the speed and depth of user adoption in your companies.

You know how to seek value when you invest. You know how to grow your companies post-close. Growing organically during your hold period will drive increased exit valuations and let you outperform your investment thesis. Focus on data quality, data monetization, and broad user adoption as your analytics priorities for strong organic value creation across your portfolio.

Contact us today to set up a complimentary private equity data whiteboarding session. Our analytics experts have a template for data monetization and data quality assessments that we can run through with you and your team.


What Is the Difference Between Snowflake and Amazon Redshift?

The modern business world is data-centric. As more businesses turn to cloud computing, they must evaluate and choose the right data warehouse to support their digital modernization efforts and business outcomes. Data warehouses can increase the bottom line, improve analytics, enhance the customer experience, and optimize decision-making. 

A data warehouse is a large repository of data businesses utilize for deep analytical insights and business intelligence. This data is collected from multiple data sources. A high-performing data warehouse can collect data from different operational databases and apply a uniform format for better analysis and quicker insights.

Two of the most popular data warehouse solutions are Snowflake and Amazon Web Services (AWS) Redshift. Let’s look at how these two data warehouses stack up against one another. 

What is Snowflake?

Snowflake is a cloud-based data warehousing solution that uses third-party cloud compute resources, such as Azure, Google Cloud Platform, or Amazon Web Services (AWS). It is designed to provide users with a fully managed, cloud-native database solution that can scale up or down as needed for different workloads. Snowflake separates compute from storage: a non-traditional approach to data warehousing. With this method, data remains in a central repository while compute instances are managed, sized, and scaled independently.
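
To make the separation of compute and storage concrete, here is a hedged sketch using the snowflake-connector-python package to create and resize a virtual warehouse independently of the data it queries. The connection values, warehouse name, and table name are placeholders, not a prescribed setup.

```python
# Sketch: compute (virtual warehouses) is managed independently of storage.
# Connection values and object names are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="your_account",
    user="your_user",
    password="your_password",
)
cur = conn.cursor()

# Create (or resize) a compute cluster without touching the stored data.
cur.execute("CREATE WAREHOUSE IF NOT EXISTS analytics_wh WAREHOUSE_SIZE = 'XSMALL' AUTO_SUSPEND = 60")
cur.execute("ALTER WAREHOUSE analytics_wh SET WAREHOUSE_SIZE = 'MEDIUM'")  # scale up for a heavy job

# The same centrally stored data can be queried from any warehouse.
cur.execute("USE WAREHOUSE analytics_wh")
cur.execute("SELECT COUNT(*) FROM analytics_db.public.orders")
print(cur.fetchone())

cur.close()
conn.close()
```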

Snowflake is a good choice for companies that are conscious of their operational overhead and need to quickly deploy applications into production without worrying about managing hardware or software. It is also the ideal platform to use when query loads are lighter and the workload requires frequent scaling.

The benefits of Snowflake include:

  • Easy integration with most components of data ecosystems
  • Minimal operational overhead: companies are not responsible for installing, configuring, or managing the underlying warehouse platform
  • Simple setup and use
  • Abstracted configuration for storage and compute instances
  • Robust and intuitive SQL interface

What is Amazon Redshift?

Amazon Redshift is an enterprise data warehouse built on Amazon Web Services (AWS). It provides organizations with a scalable, secure, and cost-effective way to store and analyze large amounts of data in the cloud. Its cloud-based compute nodes enable businesses to perform large-scale data analysis and storage. 

Amazon Redshift is ideal for enterprises that require quick query outputs on large data sets. Additionally, Redshift has several options for efficiently managing its clusters, including the AWS CLI, the Amazon Redshift console, the Amazon Redshift Query API, and the AWS SDKs. Redshift is a great solution for companies already using AWS services and running applications with a high query load.

The benefits of Amazon Redshift include:

  • Seamless integration with the AWS ecosystem
  • Multiple data output formatting support
  • Easy console to extract analytics and run queries
  • Customizable data and security models

Comparing Data Warehouse Solutions

Snowflake and Amazon Redshift both offer impressive performance capabilities, like scalability across multiple servers and high availability with minimal downtime. There are some differences between the two that will determine which one is the best fit for your business.

Performance

Both data warehouse solutions harness massively parallel processing (MPP) and columnar storage, which enables advanced analytics and efficiency on massive jobs. Snowflake boasts a unique architecture that supports structured and semi-structured data. Storage, compute, and cloud services are abstracted to optimize independent performance. Redshift recently unveiled concurrency scaling, coupled with machine learning, to compete with Snowflake’s ability to scale for concurrent workloads.

Maintenance

Snowflake is a pure SaaS platform that doesn’t require any maintenance. All software and hardware maintenance is handled by Snowflake. Amazon Redshift’s clusters require manual maintenance from the user.

Data and Security Customization

Snowflake supports fewer customization choices in data and security. Snowflake’s security relies on always-on encryption and enforces strict security checks. Redshift supports data flexibility via partitioning and distribution. Additionally, Redshift allows you to tailor its end-to-end encryption and set up your own identity management system to manage user authentication and authorization.

Pricing

Both platforms offer on-demand pricing but are packaged differently. Snowflake doesn’t bundle compute and storage in its pricing structure; it treats them as separate entities. Redshift bundles the two in its pricing. Snowflake tiers its pricing based on which features you need, so your company can select the tier that best fits its feature needs. Redshift rewards businesses with discounts when they commit to longer-term contracts.

Which data warehouse is best for my business?

To determine the best fit for your business, ask yourself the following questions in these specific areas:

  • Do I want to bundle my features? Snowflake splits compute and storage, and its tiered pricing provides more flexibility to your business to purchase only the features you require. Redshift bundles compute and storage to unlock the immediate potential to scale for enterprise data warehouses. 
  • Do I want a customizable security model? Snowflake grants security and compliance options geared toward each tier, so your company’s level of protection is relevant to your data strategy. Redshift provides fully customizable encryption solutions, so you can build a highly tailored security model. 
  • Do I need JSON storage? Snowflake’s JSON storage support wins over Redshift’s. With Snowflake, you can store and query JSON with native functions (see the sketch after this list). With Redshift, JSON is split into strings, making it difficult to query and work with. 
  • Do I need more automation? Snowflake automates maintenance tasks like data vacuuming and compression. Redshift requires hands-on maintenance for these sorts of tasks. 
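
As a quick illustration of the JSON point above, Snowflake can hold raw JSON in a VARIANT column and query nested fields with its colon/dot path notation. The sketch below uses the Python connector against a hypothetical raw_events table; the object names and payload are illustrative, not a production pattern.

```python
# Sketch: storing and querying JSON natively in Snowflake via the Python connector.
# Connection values, table, and column names are hypothetical.
import snowflake.connector

conn = snowflake.connector.connect(account="your_account", user="your_user", password="your_password",
                                    warehouse="analytics_wh", database="analytics_db", schema="public")
cur = conn.cursor()

# Land raw JSON into a VARIANT column...
cur.execute("CREATE TABLE IF NOT EXISTS raw_events (payload VARIANT)")
cur.execute("""INSERT INTO raw_events
               SELECT PARSE_JSON('{"customer": {"id": 42, "region": "midwest"}, "amount": 19.99}')""")

# ...and query nested fields directly, no string parsing required.
cur.execute("""
    SELECT payload:customer.id::INT        AS customer_id,
           payload:customer.region::STRING AS region,
           payload:amount::FLOAT           AS amount
    FROM raw_events
""")
print(cur.fetchall())

cur.close()
conn.close()
```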

Conclusion

A data warehouse is necessary to stay competitive in the modern business world. The two major data warehouse players – Snowflake and Amazon Redshift – are both best-in-class solutions. One product is not superior to the other, so choosing the right one for your business means identifying the one best for your data strategy.

2nd Watch is an AWS Certified Partner and an Elite Snowflake Consulting Partner. We can help you choose the right data warehouse solution and support your business regardless of which data warehouse you choose.

We have been recognized by AWS as a Premier Partner since 2012, as well as an audited and approved Managed Service Provider and Data and Analytics Competency partner for our outstanding customer experiences, depth and breadth of our products and services, and our ability to scale to meet customer demand. Our engineers and architects are 100% certified on AWS, holding more than 200 AWS certifications.

Our full team of certified SnowPros has proven expertise to help businesses implement modern data solutions using Snowflake. From creating a simple proof of concept to developing an enterprise data warehouse to customized Snowflake training programs, 2nd Watch will help you to utilize Snowflake’s powerful cloud-based data warehouse for all of your data needs.

Contact 2nd Watch today to help you choose the right data warehouse for your business!


4 Data Principles for Operational Resilience

Scaling your portfolio companies creates value, and increasing their native agility multiplies the value created. The foundation of better resilience in any company is often based on the ready availability of operational data. Access to the data you need to address problems or opportunities is necessary if you expect your operating executives and management teams to run the business more effectively than their competitors.

Resilience is the strength and speed of our response to adversity – and we can build it. It isn’t about having a backbone. It’s about strengthening the muscles around our backbone. — Sheryl Sandberg

You need and want your portfolio companies to be operationally resilient – to be ready and able to respond to changes and challenges in their operations. We all have seen dramatic market changes in recent years, and we all should expect continued dynamic economic and competitive pressures to challenge even the best of our portfolio companies. Resilient companies will respond better to such challenges and will outperform their peers.

This post highlights four areas that you and your operating executives should consider as you strive to make yourself more operationally resilient:

  1. Data engineering takes time and effort. You can do a quick and dirty version of data engineering, also called loading it into a spreadsheet, but that won’t be sufficient to achieve what you really need in your companies.
  2. Building a data-driven culture takes time. Having the data ready is not enough; you need to change the way your companies use the data in their tactical and strategic decision-making. And that takes some planning and some patience to achieve.
  3. Adding value to the data takes time. Once you have easily accessible data, as an organization you should strive to add or enrich the data. Scoring customers or products, cleaning or scrubbing your source data, and adding external data are examples of ways you can enrich your data once you have it in a centrally accessible place.
  4. Get after it. You need and want better analytics in every company you own or manage. This is a journey, not a single project. Getting started now is paramount to building agility and resiliency over time on that journey.

Data Engineering Can Be Laborious

Every company has multiple application source systems that generate and store data. Those systems store the data in their own proprietary databases, in a format that best suits transactional processing, and likely store common reference data, such as customer number, customer name, and address, redundantly. Getting all that data, standardizing it, scrubbing it, and modeling it in the way you need to manage your business takes months. You likely must hire consultants to build the data pipelines, create a data warehouse to store the data, and then build the reports and dashboards for data analysis.

On most of our enterprise analytics projects, data engineering consumes 60-70% of the time and effort put into the project. Ask any financial analyst or business intelligence developer – most of their time is spent getting their hands on the right, clean data. Dashboards and reports are quickly built once the data is available.
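
To give a feel for the “standardize and scrub” portion of that work, the sketch below (Python with pandas) conforms customer reference data that is stored redundantly in two hypothetical source systems. The system names, columns, and matching rules are assumptions; real pipelines repeat this across dozens of entities and sources.

```python
# Sketch: conforming redundant customer reference data from two source systems.
# System names, columns, and matching rules are illustrative only.
import pandas as pd

erp = pd.DataFrame({"cust_no": ["C001", "C002"], "cust_name": ["ACME Corp.", "Globex "]})
crm = pd.DataFrame({"account_id": ["c001", "C003"], "account_name": ["Acme Corp", "Initech"]})

def standardize(df, id_col, name_col, source):
    # Conform keys and names to one convention so records can be matched across systems.
    return pd.DataFrame({
        "customer_id": df[id_col].str.strip().str.upper(),
        "customer_name": df[name_col].str.strip().str.title().str.replace(".", "", regex=False),
        "source_system": source,
    })

conformed = pd.concat([standardize(erp, "cust_no", "cust_name", "ERP"),
                       standardize(crm, "account_id", "account_name", "CRM")])

# Deduplicate on the conformed key so each customer appears once in the warehouse.
customer_dim = conformed.drop_duplicates(subset="customer_id", keep="first").reset_index(drop=True)
print(customer_dim)
```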

CASE STUDY

The CEO of a large manufacturing company wanted to radically increase the level of data-driven decision-making in his company. Working with his executive team, we quickly realized that functional silos, prior lack of easy data access, and ingrained business processes were major inhibitors to achieving their vision. 2nd Watch incorporated extensive organizational change work while we built a new cloud-based analytics warehouse to facilitate and speed the pace of change. Read the full case study.

A Data-Driven Culture Needs to Be Nurtured and Built

Giving your executives access to data and reports is only half the battle. Most executives are used to making decisions without the complete picture and without a full set of data. Resiliency comes from having the data and from using it wisely. If you build it, not all will come to use it.

Successful analytics projects incorporate organizational change management elements to drive better data behaviors. Training, better analytics tools, collaboration, and measuring adoption are just some of the best practices that you can bring to your analytics projects to drive better use of the data and analysis tools that will lead to more resilience in your portfolio companies.

Data Collaboration Increases the Value of Your Data

We consistently find that cross-functional sharing of data and analytics increases the value and effectiveness of your decision-making. Most departments and functions have access to their own data – finance has access to the GL and financial data, marketing has access to marketing data, etc. Building a single data model that incorporates all of the data, from all of the silos, increases the level of collaboration that lets your executives from all functions simultaneously see and react to the performance of the business.

Let’s be honest, most enterprises are still managed through elaborate functional spreadsheets that serve as the best data source for quick analysis. Spreadsheets are fine for individual analysis and reporting, and for quick ad-hoc analytics. They are not a viable tool for extensive collaboration and won’t ever enable the data value enhancement that comes from a “single source of truth.”

Operating Executives Need to Build Resilience as They Scale Their Companies

Change is constant, markets evolve, and today’s problems and opportunities are not tomorrow’s problems and opportunities. Modern data and analytics solutions can radically improve your portfolio companies’ operational resilience and drive higher value. These solutions can be technically and organizationally complex and will take time to implement and achieve results. Start building resiliency in your portfolio companies by mapping out a data strategy and creating the data foundation your companies need.

Contact us today to set up a complimentary whiteboarding session. Our analytics experts will work through a high-level assessment with you.


How a Dedicated Data Warehouse Yields Better Insight than Your CRM or ERP

What percent of your enterprise data goes completely untapped? It’s far more than most organizations realize. Research suggests that as much as 68% of global enterprise data goes unused. The reasons are varied (we can get to the root cause with a current state assessment), but one growing problem stems from misconceptions about CRMs, ERPs, EHRs, and similar operational software systems.

The right operational software systems are valuable tools with their own effective reporting functions. However, the foundation of any successful reporting or analytics initiative depends on two factors: a centralized source of truth and a unified source format. All operational software systems struggle to satisfy either of those criteria.

Believe it or not, one of the most strategic systems for data-driven decision-making is still a dedicated data warehouse. Here is the value a data warehouse brings to your organization, along with the implementation steps necessary to enhance the accuracy and insight of your analytics.

Download Now: Modern Data Warehouse Comparison Guide [Snowflake, Redshift, Azure Synapse, and Google BigQuery]

CRMs and ERPs Are Data Silos with Disparate Formats

Operational software systems are often advertised as offering a unified view, but that’s only true for their designed purpose. CRMs offer a comprehensive view of customers, ERPs of operations, and EHRs of patient or member medical history. Outside of their defined parameters, these systems are data silos.

In an HBR blog post, Edd Wilder-James captures the conundrum perfectly: “You can’t cleanly separate the data from its intended use. Depending on your desired application, you need to format, filter, and manipulate the data accordingly.”

Some platforms are enabled to integrate outside data sources, but even that provides you with a filtered view of your data, not the raw and centralized view necessary to generate granular and impactful reports. It’s the difference between abridged and unabridged books – you might glean chunks of the big picture but miss entire sections or chapters that are crucial to the overall story.

Building a dedicated data warehouse removes the question of whether your data sets are complete. You can extract, transform, and load data from source systems into star schemas with a unified format optimized for business users to leverage. The data is formatted around the business process rather than the limitations of the tool. That way, you can run multifaceted reports or conduct advanced analytics when you need to – without anchoring yourself to any specific technology.

Tracking Down Your Data Sources

In all honesty, organizations not familiar with the process often overlook vital information sources. There might be a platform used to track shipping that only one member of your team uses. Maybe there’s a customer service representative who logs feedback in an ad hoc document. Or it’s possible there’s HIPAA-compliant software in use that isn’t automatically loading into your EHR. Regardless of your industry, there are likely gaps in your knowledge well outside of the CRMs, ERPs, EHRs, and other ostensibly complete data sources.

How do you build a single source of truth? It’s not as simple as shifting around a few sources. Implementing a dedicated data warehouse requires extensive planning and preparation. The journey starts with finding the invisible web of sources outside of your primary operational software systems. Those organizations that choose to forgo a full-fledged current state assessment to identify those hidden sources only achieve fragmentary analytics at best.

Data warehouse implementations need guidance and buy-in at the corporate level. That starts with a well-defined enterprise data strategy. Before you can create your strategy, you need to ask yourself questions such as these:

  • What are your primary business objectives?
  • What are your key performance indicators?
  • Which source systems contribute to those goals?
  • Which source systems are we currently using across the enterprise?

By obtaining the answers to these and other questions from decision-makers and end users, you can clarify the totality of your current state. Otherwise, hunting down those sources is an uphill battle.

Creating Data Warehouse Value that Lasts

Consolidating your dispersed data sources is just a starting point. Next, you need to extract the data from each source system and load it into the data warehouse framework itself. A key component of this step is to test data within your warehouse to verify quality and completeness.

If data loss occurs during the ETL process, the impact of your work and the veracity of your insights will be at risk. Running a variety of different tests (e.g., data accuracy, data completeness, data transformation, etc.) will reduce the possibility of any unanticipated biases in your single source of truth.
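
A few of those tests can be automated as lightweight assertions that run after every load. The sketch below compares a hypothetical source extract to the corresponding warehouse extract and checks row counts, key completeness, and one derived-column rule; file names, columns, and the business rule are assumptions.

```python
# Sketch: post-load checks for data completeness, accuracy, and transformation.
# File names, columns, and the derived-column rule are hypothetical.
import pandas as pd

source = pd.read_csv("orders_source_extract.csv")        # pulled from the source system
warehouse = pd.read_csv("orders_warehouse_extract.csv")  # pulled from the data warehouse

checks = {
    # Completeness: no rows lost between extract and load.
    "row_count_matches": len(source) == len(warehouse),
    # Accuracy: every business key made it into the warehouse.
    "all_keys_present": source["order_id"].isin(warehouse["order_id"]).all(),
    # Transformation: derived column obeys its rule (total = quantity * unit_price).
    "totals_derived_correctly": (
        (warehouse["order_total"] - warehouse["quantity"] * warehouse["unit_price"]).abs() < 0.01
    ).all(),
}

failures = [name for name, passed in checks.items() if not passed]
if failures:
    raise ValueError(f"ETL validation failed: {failures}")
print("All ETL validation checks passed.")
```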

What about maintaining a healthy and dynamic data warehouse? How often should you load new data? The answer depends on the frequency of your reporting needs. As a rule of thumb, think in terms of freshness. If your data has gone stale by the time you’re loading it into your data warehouse, increase the frequency of your data refresh. Opt for real-time analytics if it will provide you with a strategic advantage, not because you want to keep current with the latest buzzword.

Improving Your Results with an Outsourced Partner

Each step in the process comes with its own complications. It’s easy to fall into common data warehousing pitfalls unless you have internal resources with experience pinpointing hidden data sources, selecting the right data model, and maintaining your data warehouse post-implementation.

One of our clients in the healthcare software space was struggling to transition to a dynamic data warehousing model that could enhance their sales. Previously, they had a reporting application that they were using on a semi-annual basis. Though they wanted to increase the frequency of their reporting and enable multiple users to run reports simultaneously, they didn’t have the internal expertise to confidently navigate these challenges.

Working with 2nd Watch made a clear difference. Our client was able to leverage a data warehouse architecture that provided daily data availability (in addition to the six-month snapshot) and self-service dashboards that didn’t require changes or updates on their part. We also set them on the right path to leverage a single source of truth through future developments.

Our strategies in that project prioritized our client’s people instead of a specific technology. We considered the reporting and analytics needs of their business users rather than pigeonholing their business into a specific tool. Through our tech-agnostic approach, we guided them toward a future state that provided strategic advantage and a clear ROI that might have otherwise gone unachieved.

Want your data warehouse to provide you with a single source of truth? Schedule a whiteboard session to review your options and consolidate your data into actionable insight.


Where Does a Modern Data Warehouse Fit in an Organization?

In part 1 and part 2 of our modern data warehouse series, we laid out the benefits of a data warehouse and compared the different types of modern data warehouses available. In part 3, we take a step back and see how the modern data warehouse fits in your overall data architecture.

A modern data warehouse is just one piece of the puzzle of a modern data architecture that will ultimately provide insights to the business via reporting, dashboarding, and advanced analytics.

There are many factors to consider when it comes to modern data warehousing, and it’s important to understand upfront that it’s a huge endeavor. With that in mind, a well-designed modern data warehouse will help your organization grow and stay competitive in our ever-changing world.

Download Now: Modern Data Warehouse Comparison Guide [Snowflake, Redshift, Azure Synapse, and Google BigQuery]

The ultimate goal of modern architecture is to facilitate the movement of data not only to the data warehouse but also to other applications in the enterprise. The truth of the matter is that a modern data architecture is designed very similarly to how we at 2nd Watch would design an on-premises or traditional data architecture, though with some major differences. Some of the benefits of a modern data architecture are as follows:

  • Tools and technology available today allow the development process to speed up tremendously.
  • Newer data modeling methodologies can be used to track the history of data efficiently and cost-effectively.
  • Near real-time scenarios are much more cost-effective and easier to implement utilizing cloud technologies.
  • With some SaaS providers, you can worry much less about the underlying hardware, indexing, backups, and database maintenance and more about the overall business solution.
  • While technology advances have removed some of the technical barriers experienced in on-premises systems, data must still be modeled in a way that supports goals, business needs, and specific use cases.

Below you will find a high-level diagram of a modern data architecture we use at 2nd Watch, along with a description of the core components of the architecture:

Technical details aside, 2nd Watch’s architecture provides key benefits that will add value to any business seeking a modern data warehouse. The raw data layer enables the ingestion of all forms of data, including unstructured data. In addition, the raw layer keeps your data safe by eliminating direct user access and creating historical backups of your source data. This historical record of data can be accessed for data science use cases as well as modeled for reports and dashboards to show historical trends over time.

The transformation-focused data hub enables easy access to data from various source systems. For example, imagine you have one customer that can be tracked across several subsidiary companies. The business layer would enable you to track their activity across all of your business lines by conforming the various data points into one source of truth. Furthermore, the business layer allows your organization to add additional data sources without disrupting your current reporting and solutions.

The enterprise data warehouse provides a data layer structured with reporting in mind. It ensures that any reports and dashboards update quickly and reliably, and it provides data scientists with reliable data structured for use in models. Overall, the modern data warehouse architecture enables you to provide your end users with near real-time reporting, allowing them to act on insights as they occur. Each component of the architecture provides unique business value that translates into a competitive advantage.

If you depend on your data to better serve your customers, streamline your operations, and lead (or disrupt) your industry, a modern data platform built on the cloud is a must-have for your organization.

Contact us for a complimentary whiteboarding session to learn what a modern data warehouse would look like for your organization.


Blockchain: The Basics

Blockchain is one of those once-in-a-generation technologies that has the potential to really change the world around us. Despite this, blockchain is something that a lot of people still know nothing about. Part of that, of course, is because it’s such a new piece of technology that really only became mainstream within the past few years. The main reason, though, (and to address the elephant in the room) is because blockchain is associated with what some describe as “fake internet money” (i.e., Bitcoin). The idea of a decentralized currency with no guarantor is intimidating, but let’s not let that get in the way of what could be a truly revolutionary technology. So, before we get started, let’s remove the Bitcoin aspect and simply focus on blockchain. (Don’t worry, we’ll pick it back up later on.)

Blockchain, at its very core, is a database. But blockchains are different from traditional databases in that they are immutable, unable to be changed. Imagine this: Once you enter information into your shiny new blockchain, you don’t have to worry about anybody going in and messing up all your data. “But how is this possible?” you might ask.

Blockchains operate by taking data and structuring it into blocks (think of a block like a record in a database). This can be any kind of information, from names and numbers all the way to executable code scripts. There are a few essential pieces of information that should be placed in all blocks, those being an index (the block number), a timestamp, and the hash (more on this later) of the previous block. All of this data is compiled into a block, and a hashing algorithm is applied to the information.

After the hash is computed, the information is locked, and you can’t change the information without recomputing the hash. This hash is then passed on to the next block, where it gets included in its data, creating a chain. The second block then compiles its own data along with the hash of the previous block, creates a new hash, and sends it to the next block in the chain. In this way, a blockchain is created by “chaining” together blocks by means of a block’s unique hash. In other words, the hash of one block is reliant on the hash of the previous block, which is reliant on that of the one before it, ad infinitum.

And there you go, you have a blockchain! Before we move on to the next step (which will really blow your mind), let’s recap:

You have Block-0. Information is packed into Block-0 and hashed, giving you Hash-0. Hash-0 is passed to Block-1, where it is combined with the data in Block-1. So, Block-1’s data now includes its own information and Hash-0. This is then hashed to produce Hash-1, which is passed to the next block.
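
The recap above maps almost directly onto code. Here is a minimal, illustrative Python sketch of that chaining behavior using SHA-256; real blockchains add consensus, proof-of-work, and networking on top of this, none of which is shown here.

```python
# Minimal illustration of chaining blocks by hash (no consensus or networking).
import hashlib
import json
import time

class Block:
    def __init__(self, index, data, previous_hash):
        self.index = index                # block number
        self.timestamp = time.time()      # when the block was created
        self.data = data                  # any information: names, numbers, even code
        self.previous_hash = previous_hash
        self.hash = self.compute_hash()   # locks the block's contents

    def compute_hash(self):
        payload = json.dumps({"index": self.index, "timestamp": self.timestamp,
                              "data": self.data, "previous_hash": self.previous_hash},
                             sort_keys=True)
        return hashlib.sha256(payload.encode()).hexdigest()

# Block-0 is hashed, Hash-0 becomes part of Block-1's data, and so on.
chain = [Block(0, {"note": "genesis"}, previous_hash="0")]
chain.append(Block(1, {"amount": 50}, previous_hash=chain[-1].hash))
chain.append(Block(2, {"amount": 25}, previous_hash=chain[-1].hash))

# Any change to an earlier block breaks every hash after it.
def chain_is_valid(blocks):
    for prev, current in zip(blocks, blocks[1:]):
        if current.previous_hash != prev.hash or current.hash != current.compute_hash():
            return False
    return True

print(chain_is_valid(chain))        # True
chain[1].data["amount"] = 5000      # tamper with Block-1
print(chain_is_valid(chain))        # False
```

The final two lines preview the immutability argument below: once Block-1 is altered, its stored hash no longer matches its contents, so the chain fails validation.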

The second major aspect of blockchain is that it is distributed. This means that the entire protocol is operated across a network of nodes at the same time. All of the nodes in the network store the entire chain, along with all new blocks, at the same time and in real time.

Secure Data Is Good Data

Remember earlier when we said a blockchain is immutable? Let’s go back to that.

Suppose you have a chain 100 blocks long and running on 100 nodes at once. Now let’s say you want to stage an attack on this blockchain to change Block-75. Because the chain is run and stored across 100 nodes simultaneously, you have to instantaneously change Block-75 in all 100 nodes at the same time. Let’s imagine somehow you are able to hack into those other nodes to do this; now you have to rehash everything from Block-75 to Block-100 (and remember, rehashing is extremely computationally difficult). So while you (the singular malicious node) are trying to rehash all of those blocks, the other 99 nodes in the network are working to hash new blocks, thereby extending the chain. This makes it impossible for a compromised chain to become valid because it will never reach the same length as the original chain.

About That Bitcoin Thing…

Now, there are two types of blockchains. Most popular blockchains are public, meaning anybody in the world is able to join and contribute to the network. This requires some incentive, as without it nobody would join the network, and that incentive comes in the form of “tokens” or “coins” (e.g., Bitcoin). In other words, Bitcoin is an incentive for people to participate and ensure the integrity of the chain. Then there are permissioned chains, which are run by individuals, organizations, or conglomerates for their own reasons and internal uses. In permissioned chains, only nodes with certain permissions are able to join and be involved in the network.

And there you go, you have the basics of blockchain. At a fundamental level, it’s an extremely simple yet ingenious idea with applications for supply chains, smart contracts, auditing, and many more to come. However, like any promising new technology, there are still questions, pitfalls, and risks to be explored. If you have any questions about this topic or want to discuss the potential for blockchain in your organization, contact us here.


Benefits of a Data Vault Model for 3PLs and How It Can Help Drive Better Decisions

As many third-party logistics (3PL) companies transition to a data-driven approach, it’s essential to underscore the importance of your data management practices. The way you choose to organize and store data impacts everything from how fast you can access information to which metrics are available. Many data-forward 3PL companies have begun implementing a data vault model to address this strategic decision. The data vault model allows them to address industry-wide challenges such as disparate data, lack of visibility into what is happening, reworking of analytics when acquisitions occur, and slow retrieval or transfer of information.

To assist you in determining the best possible way to organize your data, we will outline the benefits of a data vault model for 3PLs and highlight four use cases to illustrate the benefits for better decision-making.

What is data vault?

A data vault model is known for its practice of separating your data’s primary keys, relationships, and attributes from each other. Let’s say you want to analyze which customers are moving the most loads through you. The relationship between the customer and the load would be stored in one table, while the details about each load and customer would be stored in two separate, but related tables.

Structuring data in this manner helps you account for changing relationships within your data and seamlessly integrate new data sources when acquisitions occur or business rules inevitably change. Additionally, it enables quicker data loading through parallel streams and automatically stores historical data. For more details on what a data vault model is and the benefits it provides, check out this blog by 2nd Watch.
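
To make the customer/load example concrete, the sketch below (Python with pandas) splits a flat shipment extract into hubs, a link, and a satellite, which is the core structural idea of data vault. The table and column names are illustrative, not a prescribed model.

```python
# Sketch: splitting a flat load extract into data vault hubs, a link, and a satellite.
# Column and table names are illustrative.
import pandas as pd

loads = pd.DataFrame({
    "load_id":     ["L100", "L101", "L102"],
    "customer_id": ["C001", "C001", "C002"],
    "weight_lbs":  [4200, 3100, 5050],
    "status":      ["delivered", "in transit", "delivered"],
    "load_date":   ["2023-01-05", "2023-01-07", "2023-01-09"],
})

# Hubs: just the business keys.
hub_customer = loads[["customer_id"]].drop_duplicates().reset_index(drop=True)
hub_load = loads[["load_id"]].drop_duplicates().reset_index(drop=True)

# Link: the relationship between customer and load, stored on its own.
link_customer_load = loads[["customer_id", "load_id"]].drop_duplicates().reset_index(drop=True)

# Satellite: descriptive attributes of the load, stamped for history tracking.
sat_load_details = loads[["load_id", "weight_lbs", "status", "load_date"]].copy()
sat_load_details["record_loaded_at"] = pd.Timestamp.now(tz="UTC")

# "Which customers move the most loads?" now reads from the link table alone.
print(link_customer_load.groupby("customer_id").size().sort_values(ascending=False))
```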

Data vault makes it easier to build a data warehouse with accurate, centralized data

The built-in relationships between data vault entities (hubs, satellites, links) make it easier to build a data warehouse. Structuring your data model around flexible but integrated primary keys allows you to combine data from various source systems easily in your data warehouse. It helps you ensure the data loaded into your reporting is not duplicated or out of date.

A lack of a data governance strategy often means that reporting is inconsistent and inaccurate. It reduces executives’ visibility into departments throughout the organization and limits your ability to create effective reporting because data is disjointed. Implementing a data vault model inherently accounts for centralizing your source data and enforcing primary keys. This will not only allow you to offer better reporting to customers, but it has also been found that accurate data is key to shipping accuracy. A strong data warehouse will further your internal analytics abilities by unlocking dashboards that highlight key metrics from revenue to cost-per-pound or on-time performance.

Data vault models make it easy to add new data sources and update business rules without interrupting access to data

A data vault model enables you to centralize data from various sources, while still addressing their differences such as load frequency and metadata. This is accomplished by storing the primary keys for an entity in one table, then creating attribute tables (satellites) specific to separate source systems.

Under a traditional model, most of this data would be held in one table and would require changes to the table structures, and therefore interruptions to data in production, each time a new source system is added. A scalable data model, like data vault, allows you to quickly adjust data delivery and reporting if your customers expand to new markets or merge with another company. Not only will this satisfy your current customers, but it is additionally a quality many logistics companies seek when choosing a 3PL partner. Accommodating multiple source systems and implementing business rules flexibly is key for any 3PL company’s data solution.

Data vault models allow for parallel loading, which gets you and your customers access to data faster

Data vault separates its source systems and data components into different tables. In doing so, it eliminates dependencies within your data and allows for parallel loading, meaning that multiple tables can be loaded at once rather than in a sequence. Parallel loading dramatically reduces the time it takes to access refreshed data.

Many 3PL companies offer customers access to high-quality reporting. Implementing a data vault model to load data quicker allows customers to gain insights in near-real-time. Furthermore, key metrics such as order accuracy, return rates, and on-time shipping percentage rely on timely data. They either require you to respond to a problem or could become inaccurate if your data takes too long to load. The faster you access your data, the more time you have to address your insights. This ultimately enables you to increase your accuracy and on-time shipments, leading to more satisfied customers.
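
Because hubs, links, and satellites have no load-order dependencies on one another, each can be loaded in its own stream. Here is a hedged sketch of that idea using Python's concurrent.futures; the load_table function is a stand-in for whatever loader your pipeline actually uses.

```python
# Sketch: loading independent data vault tables in parallel rather than in sequence.
# The loader function and table list are placeholders for a real pipeline.
from concurrent.futures import ThreadPoolExecutor, as_completed
import time

def load_table(table_name):
    # Stand-in for extracting from a source and loading the target table.
    time.sleep(1)  # simulate I/O-bound load work
    return f"{table_name} loaded"

tables = ["hub_customer", "hub_load", "link_customer_load", "sat_load_details"]

start = time.time()
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(load_table, t) for t in tables]
    for future in as_completed(futures):
        print(future.result())

# Four 1-second loads complete in roughly 1 second instead of 4.
print(f"elapsed: {time.time() - start:.1f}s")
```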

Data vault models automatically save historic data required for advanced analytics

Whether you are looking for more advanced forecasting or planning to implement machine learning analytics, you will need to rely on historical data. Satellite tables, mentioned previously, store attribute information. Each time a feature of an order, a shipment, an employee, etc., changes, it is recorded in a satellite table with a timestamp of when the change occurred. The model tracks the same information for changing relationships. This data allows you to automatically tie larger events to the specific attribute values in place when those events occurred.

3PL companies without data vault models often lose this history of attributes and relationships. When they pursue initiatives to find nuanced trends within their data through advanced analytics, their implementation is roadblocked by the task of generating adequate data. Alternatively, 3PL companies with a data vault model are ready to hit the ground running. Having historical data at your fingertips makes you prepared for any advanced analytics strategy.

2nd Watch has vast experience integrating 3PL companies’ key financial and operational data into a centralized hub. This immediately enables quick, reliable, and holistic insights to internal stakeholders and customers. Furthermore, it lays the groundwork for advanced predictive analytics that allow your teams to proactively address key industry challenges, including late deliveries, volatile market rates, and equipment failure.

Reach out to 2nd Watch for assistance getting started with data vault or evaluating how it may fit in with your current data strategy.


A CTO’s Guide to a Modern Data Platform: Data Strategy and Governance

In our previous blog post on how to build a data warehouse in 6-8 weeks, we showed you how to get lightning-fast results and effectively create a working data warehouse with Snowflake. Future state integrations and governance needs are coming, though. This is why 2nd Watch highly recommends executing a data strategy and governance project in parallel with your Snowflake proof-of-concept. Knowing how to leverage Snowflake’s strengths to avoid common pitfalls will save you time, money, and re-work.

Consider one company that spent a year using the data discovery layer-only approach. With data sources all centralized in the data warehouse and all transformations occurring at run-time in the BI tool, the data team was able to deliver a full analytical platform to its users in less time than ever before. Users were happy, at first, until the logic became more mature and more complex and ultimately required more compute power (translating to higher cost) to meet the same performance expectations. For some, however, this might not be a problem but an expected outcome.

For this company, enabling analytics and reporting was the only need for the first year, but integration of data across applications was coming full steam ahead. The primary line-of-business applications needed near-real-time updates from one another. For example, marketing automation didn’t rely 100% on humans; it needed data to execute its rules, from creating ad campaigns to sending email blasts based on events occurring in other systems.

This one use case poked a big hole in the architecture – you can’t just have a data warehouse in your enterprise data platform. There’s more to it. Even if it’s years away, you need to effectively plan for it or you’ll end up in a similar, costly scenario. That starts with data strategy and governance.

ETL vs. ELT in Snowflake

Identify where your transformations occur and how they impact your downstream systems.

The new paradigm is that you no longer need ETL (Extract, Transform, Load) – you need ELT (Extract, Load, Transform). This is true, but sometimes misleading. Some will interpret ELT as no longer needing to build and manage the expensive pipelines and business logic that delay speed-to-insight, are costly to maintain, and require constant upkeep for changing business rules. In effect, it’s interpreted as removing the “T” and letting Snowflake solve for this. Unfortunately, someone has to write the code and business logic, and it’s best to not have your business users trying to do this when they’re better served working on your organization’s core goals.

In reality, you are not removing the “T” – you are moving it to a highly scalable and performant database after the data has been loaded. This is still going to require someone to understand how your customer data in Salesforce ties to a customer in Google Analytics that corresponds to a sale in your ERP. You still need someone who knows both the data structures and the business rules. Unfortunately, the “T” will always need a place to go – you just need to find the right place.

Ensure your business logic is defined only once in the entire flow. If you’ve written complex transformation code to define what “customer” means, then when that business logic inevitably changes, you’re guaranteed that the updated definition of “customer” flows the same way to your BI users as it does to your ERP and CRM. When data science and machine learning enter the mix, you’ll also avoid time spent in data prep and instead focus on delivering predictive insights.
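
One hedged way to keep that definition in a single place is a view over the raw, already-loaded data: the “T” lives in Snowflake, and BI, application syncs, and data science all read the same object. The sketch below uses the Python connector; connection values, table names, and join keys are illustrative.

```python
# Sketch: the "T" defined once, in Snowflake, over already-loaded raw data.
# Connection values, source tables, and join keys are illustrative.
import snowflake.connector

conn = snowflake.connector.connect(account="your_account", user="your_user", password="your_password",
                                    warehouse="transform_wh", database="analytics_db", schema="discovery")
cur = conn.cursor()

cur.execute("""
    CREATE OR REPLACE VIEW dim_customer AS
    SELECT sf.account_id        AS customer_id,
           sf.account_name      AS customer_name,
           ga.client_id         AS web_client_id,
           SUM(erp.sale_amount) AS lifetime_sales
    FROM raw.salesforce_accounts sf
    LEFT JOIN raw.ga_clients ga  ON ga.account_id  = sf.account_id
    LEFT JOIN raw.erp_sales  erp ON erp.customer_id = sf.account_id
    GROUP BY sf.account_id, sf.account_name, ga.client_id
""")

# BI dashboards, the marketing automation feed, and ML data prep all query dim_customer;
# when the definition of "customer" changes, it changes here and only here.
cur.close()
conn.close()
```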

You might be thinking that this all sounds even more similar to the data warehouse you’ve already built and are trying to replace. There’s some good news: Snowflake does make this easier, and ELT is still exactly the right approach.

Defining and Adjusting the Business Logic and Views

Snowflake enables an iterative process of data discovery, proof-of-concept, business value, and long-term implementation.

Perhaps you’ve defined a sales hierarchy and a salesperson compensation metric. The developer can take that logic, put it into SQL against the raw data, and refresh the dashboard, all while the business user is sitting next to them. Is the metric not quite what the user expected, or is the hierarchy missing something they hadn’t thought of in advance? Tweak the SQL in Snowflake and refresh. Iterate like this until the user is happy and signs off, excited to start using the new dashboard in their daily routine.

By confirming the business logic in the salesperson compensation example above, you’ve removed a major part of what made ETL so painful in the past: developing, waiting for a load to finish, and then showing the results to business users. That gap between a load finishing and the next development cycle is a considerable amount of lost time and money. With this approach, however, you’ve confirmed the business logic is correct, and you already have the SQL written in Snowflake’s data discovery views.

Developing your initial logic in views in Snowflake’s data discovery layer allows you to validate and “certify” it for implementation into the physical model. When the physical model is complete, you can point the BI tool for each completed subject area to the physical layer instead of the data discovery layer.
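
A hedged sketch of that “certification” step: once the discovery view’s logic is signed off, materialize it into the physical layer and repoint consumers. Schema and object names below are illustrative.

```python
# Sketch: promoting a certified data discovery view into the physical layer.
# Connection values, schemas, and object names are illustrative.
import snowflake.connector

conn = snowflake.connector.connect(account="your_account", user="your_user", password="your_password",
                                    warehouse="transform_wh", database="analytics_db")
cur = conn.cursor()

# Materialize the signed-off logic from the discovery layer into a physical table.
cur.execute("""
    CREATE OR REPLACE TABLE edw.dim_customer AS
    SELECT * FROM discovery.dim_customer
""")

# Repoint consumers: a thin reporting view lets the BI tool switch layers without breaking reports.
cur.execute("CREATE OR REPLACE VIEW reporting.dim_customer AS SELECT * FROM edw.dim_customer")

cur.close()
conn.close()
```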

If you have any questions about data strategy and governance, or if you want to learn more about how Snowflake can fit into your organization, contact us today.

This blog originally appeared as a section of our eBook, “Snowflake Deployment Best Practices: A CTO’s Guide to a Modern Data Platform.” Click here to download the full eBook.

Related Content:

What is Snowflake, How is it Different, and Where Does it Fit in Your Ecosystem?

How to Build a Data Warehouse in 6-8 Weeks

Methods to Implement a Snowflake Project