Data Clean Rooms: Share Your Corporate Data Fearlessly

Data sharing has become more complex, both in its application and in our relationship to it. There is a tension between the need for personalization and the need for privacy. Businesses must share data to operate effectively and, ultimately, to provide tailored customer experiences. However, legislation and practices regarding data privacy have tightened, and data sharing is now fraught with greater compliance constraints than ever before. The challenge for enterprises is reconciling the increased demand for data with increased data protection.

The modern world runs on data. Companies share data to facilitate their daily operations. Data distribution occurs between business departments and external third parties. Even something as innocuous as exchanging Microsoft Excel and Google Sheets spreadsheets is data sharing!

Data collaboration is entrenched in our business processes. Therefore, rather than avoiding it, we must find the tools and frameworks to support secure and privacy-compliant data sharing. So how do we govern the flow of sensitive information from our data platforms to other parties?

The answer: data clean rooms. Data clean rooms are the modern vehicle for various data sharing and data governance workflows. Across industries – including media and entertainment, advertising, insurance, private equity, and more – a data clean room can be the difference-maker in your data insights.

Ready to get started with a data clean room solution? Schedule time to talk with a 2nd Watch data expert.

What is a data clean room?

There is a classic thought experiment, known as Yao's Millionaires' Problem, in which two millionaires want to find out who is richer without revealing how much money each is actually worth. A data clean room solves this class of problem by allowing parties to ask approved questions, which require external data to answer, without actually sharing the sensitive information itself!

In other words, a data clean room is a framework that allows two parties to securely share and analyze data by granting both parties control over when, where, and how said data is used. The parties involved can pool together data in a secure environment that protects private details. With data clean rooms, brands can access crucial and much-needed information while maintaining compliance with data privacy policies.
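To make the idea concrete, here is a minimal, hypothetical Python sketch of the millionaires scenario: both private values live inside the "room," and the only output is the answer to a pre-approved question. Production clean rooms achieve this with secure infrastructure or cryptographic protocols rather than a simple class, but the contract is the same: approved questions in, privacy-safe answers out.

```python
# A minimal, hypothetical sketch: private values stay inside the "room,"
# and only the answer to a pre-approved question ever comes out.

class CleanRoomQuery:
    def __init__(self, party_a_worth: float, party_b_worth: float):
        # Private inputs stay inside the room; neither party can read them.
        self._a = party_a_worth
        self._b = party_b_worth

    def who_is_richer(self) -> str:
        """An approved question: returns only the comparison result."""
        if self._a == self._b:
            return "tied"
        return "party_a" if self._a > self._b else "party_b"

room = CleanRoomQuery(party_a_worth=12_000_000, party_b_worth=9_500_000)
print(room.who_is_richer())  # "party_a" -- neither net worth is revealed
```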

Data clean rooms have been around for about five years; Google was the first company to launch a data clean room solution (Google Ads Data Hub) in 2017. The era of user privacy kicked off in 2018, when data protection and privacy regulations took effect, most notably the General Data Protection Regulation (GDPR).

This was a huge shake-up for most brands. Businesses had to adapt their data collection and sharing models to operate within the scope of the new legislation and the walled gardens that became popular among the tech giants. With user privacy now a priority, data sharing has become stricter and more scrutinized, making it more difficult than ever to measure and optimize marketing campaigns across the customer journey.

Data clean rooms are crucial for brands navigating the era of consumer protection and privacy. Brands can still gain meaningful marketing insights and operate within data privacy laws in a data clean room.

Data clean rooms work because the parties involved have full control over their data. Each party agrees upon access, availability, and data usage, while a trusted data clean room provider oversees data governance. This yields the secure framework needed to ensure that one party cannot access the other's data, and it upholds the foundational rule that individual, user-level data cannot be shared between different parties without consent.

Personally identifiable information (PII) remains anonymized and is processed and stored in a way that is not exposed to any of the parties involved. Thus, data sharing within a data clean room complies with privacy regulations such as the GDPR and the California Consumer Privacy Act (CCPA).
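As a simplified illustration of how identifiers can be anonymized for matching, consider salted hashing: each party replaces raw email addresses with tokens before anything leaves its environment. The shared salt below is hypothetical, and commercial clean rooms typically use stronger keyed schemes, but the principle is the same.

```python
import hashlib

# Hypothetical shared salt, negotiated between the parties in advance.
SHARED_SALT = b"agreed-upon-secret"

def tokenize_email(email: str) -> str:
    """Replace a raw email address with a salted SHA-256 matching token."""
    normalized = email.strip().lower().encode("utf-8")
    return hashlib.sha256(SHARED_SALT + normalized).hexdigest()

# Both parties can match on the token without exchanging the raw address.
print(tokenize_email("Jane.Doe@example.com"))
```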

How does a data clean room work?

Let’s take a deeper dive into the functionality of a data clean room. Four components are involved with a data clean room:

#1 – Data ingestion
Data is funneled into the data clean room. This can be first-party data (generated from websites, applications, CRMs, etc.) or second-party data from collaborating parties (such as ad networks, partners, and publishers).

#2 – Connection and enrichment
The ingested data sets are matched at the user level. Third-party data enrichment can then be used to complement the matched data sets.

#3 – Analytics
The pooled data is analyzed for audience intersections and overlaps, measurement and attribution, and propensity scoring. Data is only shared where the data points intersect between the two parties.

#4 – Application
Once the data has finished its data clean room journey, each party receives aggregated data outputs. These outputs provide the business insights needed to accomplish crucial tasks such as optimizing the customer experience, measuring reach and frequency, building effective cross-platform journeys, and conducting deep marketing campaign analyses.
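The four steps can be illustrated with a small, hypothetical Python sketch: two tokenized audience lists are ingested, matched at the user level, analyzed for overlap, and only aggregates are released. The minimum-audience threshold is an assumed privacy control, not a specific vendor's rule.

```python
# 1. Ingestion: each party contributes a tokenized audience list.
brand_audience     = {"tok_01", "tok_02", "tok_03", "tok_04", "tok_05"}
publisher_audience = {"tok_03", "tok_04", "tok_05", "tok_06"}

# 2. Connection: match the data sets at the (tokenized) user level.
overlap = brand_audience & publisher_audience

# 3. Analytics: compute aggregate measurements on the intersection only.
MIN_AUDIENCE = 3  # hypothetical threshold to block tiny, identifying segments
if len(overlap) >= MIN_AUDIENCE:
    # 4. Application: each party receives aggregates, never user-level rows.
    print(f"overlap size: {len(overlap)}")
    print(f"share of brand audience reached: {len(overlap) / len(brand_audience):.0%}")
else:
    print("segment too small to release")
```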

What are the benefits of a data clean room?

Data clean rooms can benefit businesses in any industry, including media, retail, and advertising. In summary, data clean rooms are beneficial for the following reasons:

You can enrich your partner’s data set.
With data clean rooms, you can collaborate with your partners to produce and consume data regarding overlapping customers. You can pool common customer data with your partners, find the intersection between your business and your partners, and share the data upstream without sharing sensitive information with competitors. An example would be sharing demand and sales information with an advertising partner for better-targeted marketing campaigns.

You can create governance within your enterprise.
Data clean rooms provide the framework to achieve the elusive "single source of truth." You can create a golden record encompassing all the data in every system of record within your organization. This includes sensitive PII such as Social Security numbers, passport numbers, financial account numbers, transactional data, etc.

You can remain policy compliant.
In a data clean room environment, you can monitor where the data lives, who has access to it, and how it is used. Think of the clean room as an automated middleman that validates requests for data. This allows you to share data while remaining compliant with all the important acronyms: GDPR, HIPAA, CCPA, FCRA, ECPA, etc.

But you have to do it right…

With every data security and analytics initiative, there is a set of risks if the implementation is not done correctly. A truly “clean” data clean room will allow you to unlock data for your users while remaining privacy compliant. You can maintain role-based access, tokenized columns, and row-level security – which typically lock down particular data objects – and share these sensitive data sets quickly and in a governed way. Data clean rooms satisfy the need for efficient access and the need for the data producer to limit the consumer to relevant information for their use case.
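As a rough sketch of how those controls compose, the hypothetical Python below applies a role-based column allowlist, a row-level filter, and column tokenization before any rows are shared. Real platforms implement these as database policies rather than application code; the roles and fields here are invented for illustration.

```python
import hashlib

ROLE_POLICIES = {
    # role: (columns the role may see, row filter the role is restricted to)
    "partner_analyst": ({"region", "spend"}, lambda row: row["region"] == "US"),
    "internal_admin":  ({"region", "spend", "email"}, lambda row: True),
}
TOKENIZED_COLUMNS = {"email"}  # sensitive columns are never released in the clear

def share(rows: list, role: str) -> list:
    """Apply role-based access, row-level security, and tokenization."""
    columns, row_filter = ROLE_POLICIES[role]
    shared = []
    for row in filter(row_filter, rows):
        out = {}
        for col in columns:
            value = row[col]
            if col in TOKENIZED_COLUMNS:
                value = hashlib.sha256(str(value).encode()).hexdigest()[:12]
            out[col] = value
        shared.append(out)
    return shared

data = [
    {"region": "US", "spend": 120.0, "email": "a@example.com"},
    {"region": "EU", "spend": 80.0,  "email": "b@example.com"},
]
print(share(data, "partner_analyst"))  # US rows only, no email column
```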

Of course, there are consequences if your data clean room is actually "dirty." Your data must be federated, and you need clarity on how and where it is stored. If your room is dirty, you risk:

  • Loss of customer trust
  • Fines from government agencies
  • Inadvertently oversharing proprietary information
  • Locking out valuable data requests due to a lack of process

Despite these risks, a data clean room remains the most promising solution to the challenge of sharing data in a privacy-compliant way.

Conclusion

To get the most out of your data, your business needs to create secure processes to share data and decentralize your analytics. This means pooling together common data with your partners and distributing the work to create value for all parties involved.

However, you must govern your data. It is imperative to treat your data like an asset, especially in the era of user privacy and data protection. With data clean rooms, you can reconcile the need for data collaboration with the need for data ownership and privacy.

2nd Watch can be your data clean room guide, helping you to establish a data mesh that enables sharing and analyzing distributed pools of data, all while maintaining centralized governance. Schedule time to get started with a data clean room.

Fred Bliss – CTO Data Insights 2nd Watch 


3 Data Integration Best Practices Every Successful Business Adopts

Here’s a hypothetical situation: Your leadership team is on a conference call, and the topic of conversation turns to operational reports. The head of each line of business (LOB) presents a conflicting set of insights, but each one is convinced that the findings from their analytics platform are the gospel truth. With data segregated across the LOBs, there’s no clear way to determine which insights are correct or make an informed, unbiased decision.

What Do You Do?

In our experience, the best course of action is to create a single source of truth for all enterprise analytics. Organizations that do so achieve greater data consistency and quality data sources, increasing the accuracy of their insights – no matter who is conducting analysis. Since the average organization draws from 400 different data sources (and one in five needs to integrate more than 1,000 disparate data sources), it’s no surprise that many organizations struggle to integrate their data. Yet with these data integration best practices, you’ll find fewer challenges as you create a golden source of insight.

Take a Holistic Approach

The complexity of different data sources and niche analytical needs within the average organization makes it difficult for many to home in on a master plan for data integration. As a result, there are plenty of instances in which the tail ends up wagging the dog.

Maybe it's an LOB with greater data maturity pushing for an analytics layer that aligns with its existing analytics platform to the detriment of others. Or maybe the organization is familiar with a particular stack or solution and tries to force the resulting data warehouse to match those source schemas. Whatever the reason, a non-comprehensive approach to data integration will hamstring your reporting.

In our experience, organizations see the best results when they design their reporting capabilities around their desired insight – not a specific technology. Take our collaboration with a higher education business. They knew from the outset that they wanted to use their data to convert qualified prospects into more enrollees. They trusted us with the logistics of consolidating their more than 90 disparate data sources (from a variety of business units across more than 10 managed institutions) into reports that helped them analyze the student journey and improve their enrollment rate as a whole.

With their vision in mind, we used an Alooma data pipeline to move the data to the target cloud data warehouse, where we transformed the data into a unified format. From there, we created dashboards that allowed users to obtain clear and actionable insight from queries capable of impacting the larger business. By working toward an analytical goal rather than conforming to their patchwork of source systems, we helped our client lay the groundwork to increase qualified student applications, reduce the time from inquiry to enrollment, and even increase student satisfaction.
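At its core, that transformation step maps each source's schema onto one shared model. Here is a simplified, hypothetical Python sketch of the idea (the field names are invented; the actual pipeline consolidated more than 90 sources):

```python
# Map each source's schema onto one unified prospect record.

def normalize_crm_record(rec: dict) -> dict:
    return {"prospect_id": rec["ContactID"], "stage": rec["Status"].lower(),
            "institution": rec["School"]}

def normalize_web_record(rec: dict) -> dict:
    return {"prospect_id": rec["visitor_id"], "stage": "inquiry",
            "institution": rec["campus"]}

unified = [
    normalize_crm_record({"ContactID": "C-100", "Status": "Applied", "School": "North"}),
    normalize_web_record({"visitor_id": "V-221", "campus": "South"}),
]
# With every source mapped to one schema, a single dashboard can report
# on the full student journey across all institutions.
print(unified)
```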

Win Quickly with a Manageable Scope

When people hear the phrase “single source of truth” in relation to their data, they imagine their data repository needs to enter the world fully formed with an enterprise-wide scope. For mid-to-large organizations, that end-to-end data integration process can take months (if not years) before they receive any direct ROI from their actions.

One particular client of ours entered the engagement with that boil-the-ocean mentality. A previous vendor had proposed a three-year timeline, suggesting a data integration strategy that would:

  • Map their data ecosystem
  • Integrate disparate data sources into a centralized hub
  • Create dashboards for essential reporting
  • Implement advanced analytics and data science capabilities

Though we didn't necessarily disagree with the projected capabilities, the waiting period before the client would experience any ROI undercut the potential value. Instead, we're planning out a quick win for their business, focusing on a mission-critical component that can provide rapid ROI. From there, we will scale up the breadth of their target data system and the depth of their analytics.

This approach has two added benefits. One, you can test the functionality and accessibility of your data system in real time, making enhancements and adjustments before you expand to the enterprise level. Two, you can develop a strong and clear use case early in the process, lowering the difficulty bar as you try to obtain buy-in from the rest of the leadership team.

Identify Your Data Champion

The shift from dispersed data silos to a centralized data system is not a turnkey process. Your organization is undergoing a monumental change. As a result, you need a champion within the organization to foster the type of data-driven culture that ensures your single source of truth lives up to the comprehensiveness and accuracy you expect.

What does a data champion do? They act as an advocate for your new data-driven paradigm. They communicate the value of your centralized data system to different stakeholders and end users, encouraging them to transition from older systems to more efficient dashboards. Plus, they motivate users across departments and LOBs to follow data quality best practices that maintain the accuracy of insights enterprise wide.

It's not essential that this person be a technical expert. They do need to be passionate and build trust with members of the team, showcasing the new possibilities your data integration solution unlocks. The technical elements of data integration, and of navigating your ELT/ETL tool, can be handled by a trusted partner like 2nd Watch.

Schedule a whiteboard session with our team to discuss your goals, source systems, and data integration solutions.


A High-Level Overview of Looker: An Excerpt from Our BI Tool Comparison Guide

Looker is one of several leading business intelligence (BI) tools that can help your organization harness the power of your data and glean impactful insights, allowing you to make the best decisions for your business.

Keep reading for a high-level overview of Looker’s key features, pros and cons of Looker versus competitors, and a list of tools and technologies that easily integrate with Looker to augment your reporting.

Overview of Looker

Looker is a powerful BI tool that can help a business develop insightful visualizations. Among other benefits, users can create interactive and dynamic dashboards, schedule and automate the distribution of reports, set custom parameters to receive alerts, and utilize embedded analytics.
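Many of these capabilities are also scriptable through Looker's official Python SDK. The snippet below is a minimal sketch that assumes API credentials are configured in a looker.ini file and that a saved Look with ID 42 exists (the ID is hypothetical):

```python
# Minimal sketch using Looker's official Python SDK (pip install looker_sdk).
# Assumes API credentials in a looker.ini file; the Look ID is hypothetical.
import looker_sdk

sdk = looker_sdk.init40()      # authenticate against the Looker API 4.0
print(sdk.me().display_name)   # sanity-check the connection

# Pull the results of an existing saved Look as JSON for downstream use.
results = sdk.run_look(look_id="42", result_format="json")
print(results)
```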

Why Use Looker

If you’re looking for a single source of truth, customized visuals, collaborative dashboards, and top-of-the-line customer support, Looker might be the best BI platform for you. Being fully browser-based cuts down on confusion as your team gets going, and customized pricing means you get exactly what you need.

Pros of Looker

  • Looker offers performant and scalable analytics on a near-real-time basis.
  • Because you need to define logic before creating visuals, it enforces a single-source-of-truth semantic layer.
  • Looker is completely browser-based, eliminating the need for desktop software.
  • It facilitates dashboard collaboration, allowing parallel development and publishing with out-of-the-box Git integration.

Cons of Looker

  • Looker can be more expensive than competitors like Microsoft Power BI; so while adding Looker to an existing BI ecosystem can be beneficial, you will need to take costs into consideration.
  • Compared to Tableau, Looker's visuals aren't as elegant and the platform isn't as intuitive.
  • Coding in LookML is unavoidable, which may present a roadblock for report developers who have minimal experience with SQL.

Select Complementary Tools and Technologies for Looker

  • Any SQL database
  • Amazon Redshift
  • AWS
  • Azure
  • Fivetran
  • Google Cloud
  • Snowflake

Was this high-level overview of Looker helpful? If you’d like to learn more about Looker reporting or discuss how other leading BI tools, like Tableau and Power BI, may best fit your organization, contact us to learn more.

The content of this blog is an excerpt of our Business Intelligence Tool Comparison Guide. Click here to download a copy of the guide.