In 2014, Snowflake disrupted analytics by introducing the Snowflake Elastic Data Warehouse, the first data warehouse built from the ground up for the cloud, with patented architecture that would revolutionize the data platform landscape. Four years later, Snowflake disrupted the industry again with Snowflake Data Sharing, an innovation for data collaboration that eliminated the barriers of traditional data sharing methods, enabling enterprises to easily share live data in real time without moving the shared data. This year, in 2022, under the bright Sin City lights, Snowflake intends to disrupt application development by unveiling a unified platform for developing data applications, from coding to monetization.
Today, data teams looking to add value, whether by improving their team's analytical efficiency or by reducing costs in their enterprise's processes, build internal data products such as ML-powered product categorization models or C-suite dashboards in whichever flavor their team is savvy in. To produce an external data product that brings value to their enterprise, however, there is only one metric executives truly care about: revenue.
To bridge the value gap between internal and external data products comes the promise of the Snowflake Native Application Framework. This framework will enable developers to build, distribute, and deploy applications natively in the Data Cloud through Snowflake. Moreover, these applications can be monetized on the Snowflake Marketplace, where consumers can securely purchase, install, and run them natively in their own Snowflake environments, with no data movement required. It's important to note that Snowflake's goal is not to compete with Oracle for OLTP workloads, but rather to disrupt how cloud applications are built by seamlessly blending the transactional and analytical capabilities Snowflake has to offer.
To round out the Snowflake Native Application Framework, a series of product announcements were made at the Summit:
Unistore (Powered by Hybrid Tables): To bridge transactional and analytical data in a single platform, Snowflake developed a new workload called Unistore. At its core, the new workload enables customers to unify their datasets across multiple solutions and streamline application development while retaining all the simplicity, performance, security, and governance customers expect from the Snowflake Data Cloud platform. To power this workload, Snowflake developed Hybrid Tables. This new table type supports fast single-row operations, driven by a ground-breaking row-based storage engine, that will allow transactional applications to be built entirely in Snowflake. Hybrid Tables will also support primary key enforcement to protect against duplicate record inserts.
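Snowflake hasn't published Hybrid Table internals, but the hybrid pattern itself, fast keyed single-row writes with primary key enforcement on the same table you later query analytically, can be illustrated with a stdlib sketch. SQLite stands in for Snowflake here purely for illustration; Hybrid Tables use standard Snowflake SQL:

```python
# Conceptual sketch only: SQLite stands in for a Snowflake Hybrid Table to
# show transactional single-row writes (with primary key enforcement) and
# analytical aggregates over the very same table.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL)")

# Transactional side: fast single-row inserts.
for order_id, amount in [(1, 19.99), (2, 5.00), (3, 42.50)]:
    conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, amount))

# Primary key enforcement rejects a duplicate-key insert.
duplicate_rejected = False
try:
    conn.execute("INSERT INTO orders VALUES (1, 99.99)")
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Analytical side: aggregate over the same table, no data movement.
total, count = conn.execute("SELECT SUM(amount), COUNT(*) FROM orders").fetchone()
print(duplicate_rejected, count, round(total, 2))  # True 3 67.49
```

The point of the sketch is the single-table duality: the row-oriented write path serves the application, while the analytical query runs in place against the same data.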
Snowpark for Python: Snowpark is a development framework designed to bridge the skill sets of engineers, data scientists, and developers. Previously, Snowpark only supported Java and Scala, but Snowflake knew what the people wanted: Python, the language of choice for data engineers, data scientists, and application developers alike. Bringing Python workloads into Snowflake also lifts the security burden that developers often face when running Python within their enterprises.
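A Snowpark for Python query reads like a familiar DataFrame pipeline but executes inside the warehouse. The sketch below uses the public Snowpark API (`Session`, `col`, `filter`, `select`, `sort`, `collect`); the table and column names are hypothetical, and a real `Session` would be created with `Session.builder.configs({...}).create()` using your account credentials:

```python
# Hedged sketch of a Snowpark for Python query. Requires the
# snowflake-snowpark-python package and a live session; the table and
# column names (CUSTOMERS, NAME, REVENUE) are invented for the example.
def top_customers_by_revenue(session, min_revenue=10_000):
    """Return customers above a revenue threshold, sorted descending."""
    # Deferred import so the sketch can be read without the package installed.
    from snowflake.snowpark.functions import col

    return (
        session.table("CUSTOMERS")             # lazy DataFrame over a table
        .filter(col("REVENUE") > min_revenue)  # pushed down to Snowflake
        .select("NAME", "REVENUE")
        .sort(col("REVENUE"), ascending=False)
        .collect()                             # executes in the warehouse
    )
```

Nothing is pulled to the client until `collect()`; the filter, projection, and sort all compile to SQL that runs where the data lives.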
Snowflake Worksheets for Python: Though currently in private preview, Snowflake will support Python development natively within Snowsight Worksheets for building pipelines, ML models, and applications. This brings streamlined development features like auto-complete and lets developers write custom Python logic in seconds.
Streamlit Acquisition: To democratize access to data, a vision both Snowflake and Streamlit share, Snowflake acquired Streamlit, an open-source Python project for building data-based applications. Streamlit helps fill the void of bringing a simplified app-building experience for data scientists who want to quickly translate an ML model into a visual application that anyone within their enterprise can access.
Large Memory Warehouses: Still in development (in preview on AWS in the EU-Ireland region), Snowflake will soon allow consumers to access 5x and 6x larger warehouses. These larger warehouses will enable developers to execute memory-intensive operations, such as training ML models on large datasets, through open-source Python libraries natively available via the Anaconda integration.
On top of all those features released for application development, Snowflake also released key innovations to improve data accessibility, such as:
Snowpipe Streaming: To eliminate the boundaries between batch and streaming pipelines, Snowflake introduced Snowpipe Streaming. This new feature will simplify stitching together real-time and batch data into one single system. Users can now ingest aggregated log data from IoT devices via a client API endpoint, without adding event hubs, and even ingest CDC streams at lower latency.
External Apache Iceberg Tables: Originally developed at Netflix, Apache Iceberg is an open-source table format that supports a variety of file formats (e.g., Parquet, ORC, Avro). Snowflake will now allow consumers to query Iceberg tables in place, without moving the table data or existing metadata. In practice, this means customers can access their own storage buckets containing Iceberg tables without compromising on security, while taking advantage of the consistent governance of the Snowflake platform.
External On-Prem Storage Tables: For many enterprises, moving data into the Data Cloud is not a reality for a variety of reasons, including size, security concerns, and cost. To overcome this, Snowflake has released in private preview the ability to create External Stages and External Tables on storage systems, such as Dell or Pure Storage appliances, that expose a highly S3-compliant API. This will let customers access a variety of storage devices through Snowflake without worrying about concurrency issues or the effort of maintaining compute platforms.
Between the Native Application Framework and the new additions for data accessibility, Snowflake has taken a forward-thinking approach to disrupting application development. Developers should be keen to take advantage of all the new features this year, while understanding that key features such as Unistore and Snowpipe Streaming, still in public or private preview, will have bumps along the road.
Sisu is a fairly new, relatively unique tool that applies a user-friendly interface to robust and deep-diving business analytics, such as the example of big data analytics in the telecom industry we’ll cover in this blog post. With well-defined KPIs and a strong grasp of the business decisions relying on the analytics, even non-technical users are able to confidently answer questions using the power of machine learning through Sisu.
Below, we’ll detail the process of using Sisu to uncover the main drivers of customer churn for a telecom company, showing you what kind of data is appropriate for analysis in Sisu, what analysis 2nd Watch has performed using Sisu, and what conclusions our client drew from the data analysis. Read on to learn how Sisu may offer your organization the competitive advantage you’re looking for.
What is Sisu?
Sisu uses a high-level declarative query model to allow users to tap into existing data lakes and identify the key features impacting KPIs, even enabling users who aren’t trained data analysts or data scientists. Analysis improves with time as data increases and more users interact with Sisu’s results.
Sisu moves from user-defined objectives to relevant analysis in five steps:
Querying and Processing Data: Sisu ingests data from a number of popular platforms (e.g., Amazon Redshift, BigQuery, Snowflake) with light transformation, and can ingest updates over time.
Data Quality, Enrichment, and Featurization: Automated, human-readable featurization exposes the most relevant statistical factors.
Automated Model and Feature Selection: Sisu trains multiple models to investigate KPIs on a continuous or categorical basis.
Personalized Ranking and Relevance: Sisu ranks facts by several measures that prioritize human time and attention, improving the personalized model over time.
Presentation and Sharing: To dig into facts, Sisu offers natural language processing (NLP), custom visualization, supporting statistics, and related facts that illustrate why a fact was chosen.
How does Sisu help users leverage data to make better data-driven decisions?
Sisu can help non-technical users analyze data from various data sources (anything from raw data in a CSV file to an up-and-running database), improving data-driven decision-making across your organization. A couple of things to keep in mind: the data should already be cleaned and of high integrity; and Sisu works best with numerical data, not text-based data.
Once the data is ready for analysis, you can easily create a simple visualization:
Identify your key variable.
Choose a tracking metric.
Select the time frame, if applicable.
Run the visualization and apply to A/B groups as necessary.
With Sisu, users don’t need to spend time on feature selection. When a user builds a metric, Sisu queries the data, identifies high-ranking factors, and presents a list of features with the most impact. This approach subverts the traditional OLAP and BI process, making it easier and faster to ask the right questions and get impactful answers – requiring less time while offering more value.
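Sisu's ranking algorithm is proprietary, but the underlying idea, scoring candidate factors by how strongly they move a KPI and surfacing the strongest first, can be shown with a toy, stdlib-only sketch. The churn dataset and factor names below are invented for illustration:

```python
# Toy illustration (not Sisu's actual algorithm): rank candidate factors by
# the strength of their linear correlation with a KPI. All data is made up.
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def rank_factors(factors, kpi):
    """Return factor names sorted by |correlation| with the KPI, strongest first."""
    scores = {name: abs(pearson(values, kpi)) for name, values in factors.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical example: monthly churn rate vs. two candidate drivers.
kpi = [0.02, 0.04, 0.05, 0.08, 0.10]              # churn rate per month
factors = {
    "support_tickets": [1, 2, 3, 4, 5],           # strongly tracks churn
    "region_code":     [3, 1, 4, 1, 5],           # mostly noise
}
print(rank_factors(factors, kpi))                 # support_tickets ranks first
```

A real engine like Sisu's considers far richer statistics (subpopulations, interactions, relevance to the user), but the user-facing result is the same shape: a ranked list of the factors most associated with the metric.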
Simplicity and speed are key contributors to why Sisu is so advantageous, from both a usability standpoint and a financial point of view. Sisu can help you increase revenue and decrease expenses with faster, more accurate analytics. Plus, because Sisu puts the ability to ask questions in the hands of non-technical users, it creates more flexibility for teams throughout your organization.
How did 2nd Watch use Sisu to reduce customer churn for a telecom company?
Being able to pick out key drivers in any set of data is essential for users to develop specific business-impacting insights. Instead of creating graphics from scratch or analyzing data through multiple queries like other analytical tools require, Sisu allows your teams to query their data in a user-friendly way that delivers the answers they need.
For our client in the telecommunications industry, group comparisons were crucial in determining who would likely become long-standing customers and who would have a higher rate of churn. Filtering and grouping the demographics of our client’s customer base allowed them to outline their target market and begin understanding what attracts individuals to stay longer. Of course, this then enables the company to improve customer retention – and ultimately revenue.
Sisu can also be employed in other areas of our client’s organization. In addition to customer churn data, they can investigate margins, sales, network usage patterns, network optimization, and more. With the large volumes of data in the telecom industry, our client has many opportunities to improve their services and solutions through the power of Sisu’s analytics.
How can Sisu benefit your organization?
Sisu reduces barriers to high-level analytical work because its automated factor selection and learning capabilities make analytics projects more efficient. Using Sisu to focus on who is driving business-impacting events (like our telecom client’s customer churn) allows you to create user profiles, monitor those profiles, and track goals and tweak KPIs accordingly. In turn, this allows you to be more agile, move from reactive to proactive, and ultimately increase revenue.
Because feature selection is outsourced to Sisu’s automated system, Sisu is a great tool for teams lacking in high-level analytics abilities. If you’re hoping to dive into more advanced analytics or data science, Sisu could be the stepping stone your team needs.
Data and analytics are a major driver and source of great value for private equity firms. The best private equity firms know the full power of data and analytics; they realize that portfolio company enterprise data is typically the crown jewel of an acquisition target.
Data and analytics are also the foundation of financial and operational transformation. Quickly pulling data from their portfolio companies, and consolidating it into actionable information, will enable and accelerate financial and operational value opportunities, driving up EBITDA. Even better, the creation of data monetization revenue opportunities unlocks hidden sources of value creation. And down the road, a data-driven organization will always yield much higher financial valuations and returns to their investors.
Most firms evaluating potential targets perform only basic due diligence, focusing on confirming financial valuation and assessing risk. Accordingly, most PE firms conduct standard IT due diligence: analyzing expense budgets, hardware and software capital assets, license and service contracts, and headcount/staffing. They seek to understand the IT architecture and assess network capability. Because it is top of mind, the effort also focuses heavily on cyber and network security and the architecture built to protect the portfolio company and its data. At that point, they typically declare the due diligence effort complete.
Beyond classical IT due diligence, most dealmakers only try to understand their data assets once the deal has closed and they begin operating the acquired company. Best practice says otherwise: accelerating the data and analytics value creation curve starts with data due diligence. Precise data due diligence serves as the foundation for portfolio data strategy and uncovers hidden sources of potential strategic value. Doing data due diligence gives the PE firm and portfolio company a running start on data value creation once the deal has closed.
What should deal firms look for when doing data and analytics due diligence? Here are key areas and questions for investigation and analysis when investigating a target portfolio company.
Step 1: Determine the target company’s current overall approach to managing and analyzing its data.
Develop an understanding of the target company’s current approach to accessing and analyzing their data. Understanding their current approach will let you know the effort needed to accelerate potential data value creation.
Does the target company have a comprehensive data strategy to transform the company into a data-driven enterprise?
Does the company have a single source of truth for data, analytics, and reporting?
What is the target company’s usage of data-driven business decisions in operations, marketing, sales, and finance?
What cloud services, architectures, and tools does the company use to manage its data?
What is the on-prem data environment and architecture?
What kind of cloud data and analytics proofs-of-concept does the company have in place to build out its capabilities?
Has the company identified and implemented value prop use cases for data and analytics, realizing tangible ROI?
Where is the target company on the data and analytics curve?
Step 2: Identify the data sources, what data they contain, and how clean the data is.
Data value depends on breadth and quality of the target company’s data and data sources. Document what the data sources are, what purpose they serve, how the target company currently integrates data sources for analytics, the existing security and data governance measures, and the overall quality of the data.
Inventory all of the company’s data sources, including a data dictionary, size, physical and logical location, data architecture, data model, etc.
How many of the data sources have an API for ETL (extract, transform, load) to pull data into the data warehouse?
Does the target company have a data warehouse, and are all of its data sources feeding the data warehouse?
How much history does each data source have? Obviously, the longer the history, the greater the value of the data source.
What kind of data security is in place to protect all data sources?
What kind of data quality assessment for each source has been conducted?
Step 3: Assess the quality of the target company’s analytics and reporting.
Review how the target company approaches reporting and analytics. This step should include a review of their tools and technologies, KPIs and metrics, and reporting (i.e., self-service, interactive, dashboards, Excel reports, reports delivered by IT, etc.).
What kind of reporting does the company use?
Does the portfolio company have a heavy dependence on Excel for producing reports?
Describe the KPIs that are in place for each functional area. How has the company been tracking against these KPIs?
Does the company enable self-service analytics across the enterprise?
What is the inventory of all reports generated by the company?
What percentage of the reports are delivered by way of dashboarding?
Step 4: Review the people and processes involved in data management and analytics.
Determine the extent of the target company as a data-driven organization by examining the people and processes behind the data strategy. Document which FTEs are involved with data and analytics, how much time is dedicated to reporting and report development, as well as the current processes for analytics.
How many FTEs are engaged in financial and operational report development?
What does the data and analytics team consist of, in terms of data engineers, data scientists, data administrators, and others with data titles?
What kind of data governance is in place for the target company to regulate the structure of data, as well as where and how data can flow through the organization?
Step 5: Find opportunities for target company data value creation.
Assess, understand, and determine the opportunities for marketing and operational improvements, cost reduction, untapped areas of growth, data monetization, cash flow improvement, and more.
Which of the following advanced data and analytics use cases does the portfolio company have in place?
Marketing channel excellence
Working capital rationalization
Fixed asset deployment and maintenance
Operational labor transformation
Forecasting predictive analytics
Automated customer reporting
Supply chain optimization
What use cases does the company conduct for data science predictive and prescriptive analytics?
What is the target company’s data monetization strategy, and where are they with implementation?
What is the company’s usage of big data to enhance marketing, sales, and customer service understanding and strategies?
What third-party data does the company use to supplement internal data to drive enhanced insights into marketing and operating?
To accelerate data and analytics value creation for a portfolio company target, start the process during due diligence. Gaining tremendous insight into the potential for data will accelerate the plan once the deal is closed and allow for a running start on data analytics value creation. With these insights, the PE firm, in partnership with their portfolio company, will generate fast data ROI and enable financial and operational transformation, EBITDA growth, and enhanced cash flow.
At 2nd Watch, we help private equity firms implement comprehensive data analytics solutions from start to finish. Our data experts guide, oversee, and implement focused analytics projects to help clients attain more value from modern analytics. Contact us for a complimentary 90-minute whiteboard session to get started.
–Jim Anfield, Principal and Health Care Practice Leader