Data ingestion is the first step in any analytical undertaking. It is the process of gathering data from one or many sources and importing it into one place. Data can be imported in real time (like POS data) or in batches (like billing systems).
Why It Matters for Marketers:
The process of data ingestion consolidates all of the relevant information from across your data sources into a single, centralized storage system. Through this process, you can begin to convert disparate data created in your CRM, POS, and other source systems into a unified format that is ready for real-time or batch analysis.
Marketing teams pull data from a wide variety of sources, including Salesforce, Marketo, Facebook, Twitter, Google, Stripe, Zendesk, Shopify, Mailchimp, mobile devices, and more. It’s incredibly time-consuming to combine these data sources manually, but by using tools to automate some of these processes, you can get data into the hands of your team faster.
This empowers marketers to answer more sophisticated questions about customer behavior, such as:
Why are customers leaving a specific product in their online shopping carts?
What is the probability that we’ll lose a customer early in the customer journey?
Which messaging pillar is resonating most with customers in the middle of the sales funnel who live in Germany?
Image 1: In this image, three source systems with varying formats and content are ingested into a central location in the data warehouse.
ETL vs. ELT
ETL and ELT are both data integration methods that make it possible to take data from various sources and move it into a singular storage space like a data warehouse. The difference is in when the transformation of data takes place.
As your business scales, ELT tools are better equipped to handle the volume and variety of marketing data on hand. However, a robust data plan will make use of both ELT and ETL tools.
For example, a member of your team wants to know which marketing channels are the most effective at converting customers with the highest average order value. The data you need to answer that question is likely spread across multiple structured data sources (e.g., referral traffic from Google Analytics, transaction history from your POS or e-commerce system, and customer data from your CRM).
Through your ETL process, you can extract relevant data from the above sources, transform it (e.g., updating customer contact info across files for uniformity and accuracy), and load the clean data into one final location. This enables your team to run your query in a streamlined way with limited upfront effort.
In comparison, your social media marketing team wants to see whether email click-through rates or social media interactions lead to more purchases. The ELT process allows them to extract and load all of the raw data in real time from the relevant source systems and run ad-hoc analytics reports, making adjustments to campaigns on the fly.
Extract, Transform, Load (ETL)
This method of data movement first extracts data from the original source systems and then converts it into a singular format in a staging area. Lastly, the transformed data is loaded into a data warehouse for analytics.
When You Should Use ETL:
ETL processes are preferable for moving small amounts of structured data with no rush on when that data is available for use. A robust ETL process would clean and integrate carefully selected data sources to provide a single source of truth that delivers faster analytics and makes understanding and using the data extremely simple.
Image 2: This image shows four different data sources with varying data formats being extracted from their sources, transformed to all be formatted the same, and then loaded into a data warehouse. Having all the data sources formatted the same way allows you to have consistent and accurate data in the chart that is built from the data in the data warehouse.
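The ETL flow described above can be sketched in a few lines of Python. Everything here is illustrative: the source records, field names, and the "warehouse" list are hypothetical stand-ins for real systems.

```python
# Minimal ETL sketch: extract from two hypothetical sources, transform
# into one uniform schema, then load into a single "warehouse" table.

def extract():
    # Hypothetical source records with inconsistent formats.
    crm = [{"Email": "Ana@Example.com", "spend": "120.50"}]
    pos = [{"email": "bo@example.com ", "spend": 80.0}]
    return crm + pos

def transform(records):
    # Normalize field names, trim/lowercase emails, coerce spend to float.
    return [
        {"email": r.get("email", r.get("Email", "")).strip().lower(),
         "spend": float(r["spend"])}
        for r in records
    ]

def load(rows, warehouse):
    # Append clean rows to the central store (a list stands in for a table).
    warehouse.extend(rows)

warehouse = []
load(transform(extract()), warehouse)
```

The key point is the ordering: the data is already clean and uniform by the time it lands in the warehouse, so downstream queries need no further preparation.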
Extract, Load, Transform (ELT)
Raw data is read from the source databases and then loaded into the target system in its raw form. That raw data is usually stored in a cloud-based data lake or data warehouse, allowing you to transform only the data you need.
When You Should Use ELT:
ELT processes shine when there are large amounts of complex structured and unstructured data that need to be made available more immediately. ELT processes also upload and store all of your data in its raw format, making data ingestion faster. However, performing analytics on that raw data is a more complex process because cleaning and transformation happen post-upload.
Image 3: This image is showing four different data sources with the data formatted in different ways. The data is being extracted from the various sources, loaded into the data warehouse, and then transformed within the data warehouse to all be formatted the same. This allows for accurate reporting of the data in the chart seen above.
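A minimal ELT sketch, using Python’s built-in sqlite3 as a stand-in for a cloud warehouse: raw rows are loaded untouched, and the transformation happens later, in SQL, only when an analysis needs it. Table and column names are invented for illustration.

```python
import sqlite3

# ELT sketch: load raw rows first, transform later with SQL inside the store.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE raw_events (channel TEXT, amount TEXT)")

# Load step: raw data lands as-is, inconsistent formats included.
raw = [("Email", "10.0"), ("email", "5.5"), ("Social", "7.25")]
con.executemany("INSERT INTO raw_events VALUES (?, ?)", raw)

# Transform step: runs only when (and on what) the analysis needs.
totals = dict(con.execute(
    "SELECT lower(channel), SUM(CAST(amount AS REAL)) "
    "FROM raw_events GROUP BY lower(channel)"
).fetchall())
```

Notice that the cleanup (lowercasing channels, casting amounts) lives in the query, not in the load step, which is what makes ad-hoc, on-the-fly analysis possible.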
A data pipeline is a series of steps in an automated process that moves data from one system to another, typically using ETL or ELT practices.
Why It Matters for Marketers:
The automated nature of a data pipeline removes the burden of data manipulation from your marketing team. There’s no need to chase down the IT team or manually download files from your marketing automation tool, CRM, or other data sources to answer a single question. Instead, you can focus on asking the questions and homing in on strategy while the technology takes care of tracking down, manipulating, and refreshing the information.
Say under the current infrastructure, your sales data is split between your e-commerce platform and your in-store POS systems. The different data formats are an obstacle to proper analysis, so you decide to move them to a new target system (such as a data warehouse).
A data pipeline would automate the process of selecting data sources, prioritizing the datasets that are most important, and transforming the data without any micromanagement of the tool. When you’re ready for analysis, the data will already be available in one destination and validated for accuracy and uniformity, enabling you to start your analysis without delay.
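One way to picture such a pipeline is as an ordered chain of automated steps, each consuming the previous step’s output. This Python sketch uses hypothetical step names and data; a real pipeline would be driven by an orchestration tool rather than a plain loop.

```python
# A data pipeline as an ordered chain of automated steps. The sources,
# fields, and numbers are illustrative placeholders.

def pull_ecommerce(_):
    return [{"order_id": 1, "total": "25.00"}]

def pull_pos(batch):
    return batch + [{"order_id": 2, "total": "40.00"}]

def unify(batch):
    # Standardize formats so both sources look the same for analysis.
    return [{"order_id": r["order_id"], "total": float(r["total"])} for r in batch]

def validate(batch):
    # Simple accuracy check before the data reaches analysts.
    assert all(r["total"] >= 0 for r in batch)
    return batch

PIPELINE = [pull_ecommerce, pull_pos, unify, validate]

def run(pipeline):
    data = None
    for step in pipeline:
        data = step(data)
    return data

sales = run(PIPELINE)
```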
Data Storage Options
Databases, data warehouses, and data lakes are all systems for storing and using data, but there are differences to consider when choosing a solution for your marketing data.
A database is a centralized, structured, and organized collection of data stored on a computer and accessed by various applications such as Mailchimp, Rollworks, and Marketo, or used for more traditional campaigns like direct mail. It is not meant for large-scale analytics.
A data warehouse is a specific way of structuring your data in database tables so that it is optimized for analytics. A data warehouse brings together all of your various data sources under one roof and structures them for analysis.
A data lake is a vast repository of structured and unstructured data. It handles all types of data, and there is no hierarchy or organization to the storage.
Why It Matters for Marketers:
There are benefits and drawbacks to each type of data structure, and marketers should have a say in how data gets managed throughout the organization. For example, with a data lake, you will need to have a data scientist or other technical resource on staff to help make sense of all the data, but your marketing team can be more self-sufficient with a database or data warehouse.
Without organization and structure, the insights your data holds can be unreliable and hard to find. Pulling data from various source systems is often time-consuming and requires tedious and error-prone reformatting of the data in order to tell a story or answer a question. A database can help to store data from multiple sources in an organized central location.
Without databases, your team would have to use multiple Excel sheets and manual manipulation to store the data needed for analysis. This means your team would have to manually match up or copy/paste each Excel sheet’s data in order to create one place to analyze all of your data.
A data warehouse delivers an extra layer of organization across all databases throughout your business. Your CRM, sales platform, and social media data differ in format and complexity but often contain data about similar subjects. A data warehouse brings together all of those varying formats into a standardized and holistic view structured to optimize reporting. When that data is consolidated from across your organization, you can obtain a complete view of your customers, their spending habits, and their motivations.
You might hear people say “enterprise data warehouse” or “EDW” when they talk about data. This is a way to structure data that makes answering questions via reports quick and easy. More importantly, EDWs often contain information from the entire company, not just your function or department. Not only can you answer questions about your customer or marketing-specific topics, but you can understand other concepts such as the inventory flow of your products. With that knowledge, you can determine, for example, how inventory delays are correlated to longer shipping times, which often result in customer churn.
A data lake is a great option for organizations that need more flexibility with their data. The ability for a data lake to hold all data—structured, semi-structured, or unstructured—makes it a good choice when you want the agility to configure and refigure models and queries as needed. Access to all the raw data also makes it easier for data scientists to manipulate the data.
You want to get real-time reports from each step of your SMS marketing campaign. Using a data lake enables you to perform real-time analytics on the number of messages sent, the number of messages opened, how many people replied, and more. Additionally, you can save the content of the messages for later analysis, delivering a more robust view of your customer and enabling you to increase personalization of future campaigns.
So, how do you choose?
You might not have to pick just one solution. In fact, it might make sense to use a combination of these systems. Remember, the most important thing is that you’re thinking about your marketing data, how you want to use it, what makes sense for your business, and the best way to achieve your results.
Hopefully this information has helped you better understand your options for data ingestion and storage. Feel free to contact us with any questions or to learn more about data ingestion and storage options for your marketing data.
In this age of ever-expanding data security challenges, which have only increased with the mass move to remote workforces, data-centric organizations need to easily but securely access data. Enter ALTR: a cloud-native platform delivering Data Security as a Service (DSaaS) and helping companies to optimize data consumption governance.
Not sure you need another tool in your toolkit? We’ll dive into ALTR’s benefits so you can see for yourself how this platform can help you get ahead of the next changes in data security, simplify processes and enterprise collaboration, and maximize your technology capabilities, all while staying in control of your budget.
How Does ALTR Work?
With ALTR, you’re able to track data consumption patterns and limit how much data can be consumed. Even better, it’s simple to implement, immediately adds value, and is easily scalable. You’ll be able to see data consumption patterns from day one and optimize your analytics while keeping your data secure.
ALTR delivers security across three key stages:
Observe – ALTR’s DSaaS platform offers critical visibility into your organization’s data consumption, including an audit record for each request for data. Observability is especially critical as you determine new levels of operational risk in today’s largely remote world.
Detect and Respond – You can use ALTR’s observability to understand typical data consumption for your organization and then determine areas of risk. With that baseline, you’re able to create highly specific data consumption policies. ALTR’s cloud-based policy engine then analyzes data requests to prevent security incidents in real time.
Protect – ALTR can tokenize data at its inception to secure data throughout its lifecycle. This ensures adherence to your governance policies. Plus, ALTR’s data consumption reporting can minimize existing compliance scope by assuring auditors that your policies are solid.
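Tokenization in general (not ALTR’s actual API, which is proprietary) can be illustrated with a short sketch: sensitive values are swapped for random tokens at ingestion, and the token-to-value mapping lives in a separate, tightly controlled vault.

```python
import secrets

# Generic tokenization sketch (NOT ALTR's API): sensitive values are replaced
# with random tokens at inception; the mapping lives in a separate vault.
vault = {}

def tokenize(value):
    token = "tok_" + secrets.token_hex(8)
    vault[token] = value
    return token

def detokenize(token):
    # In practice, only authorized, policy-checked callers would reach this.
    return vault[token]

record = {"name": tokenize("Jane Doe"), "ssn": tokenize("123-45-6789")}
```

Because the record itself carries only tokens, it can flow through analytics systems, logs, and backups without exposing the underlying values.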
What Other Benefits Does ALTR Offer?
ALTR offers various integrations to enhance your data consumption governance:
Share data consumption records and security events with your favorite security information and event management (SIEM) software.
View securely shared data consumption information in Snowflake.
Analyze data consumption patterns in Domo.
ALTR delivers undeniable value through seamless integration with technologies like these, which you may already have in place; paired with the right consultant, the ROI is even more immediate. ALTR may be new to you, but an expert data analytics consulting firm like 2nd Watch is always investigating new technologies and can ease the implementation process. (And if you need more convincing, ALTR was selected as a finalist for Bank Director’s 2020 Best of FinXTech Awards.)
Dedicated consultants can more quickly integrate ALTR into your organization while your staff stays on top of daily operations. Consultants can then put the power in the hands of your business users to run their own reports, analyze data, and make data-driven decisions. Secure in the knowledge your data is protected, you can encourage innovation by granting more access to data when needed.
As a tech-agnostic company, 2nd Watch helps you find the right tools for your specific needs. Our consultants have a vast range of product expertise to make the most of the technology investments you’ve already made, to implement new solutions to improve your team’s function, and to ultimately help you compete with the companies of tomorrow. Reach out to us directly to find out if ALTR, or another DSaaS platform, could be right for your organization.
Data sharing has become more complex, both in its application and our relationship to it. There is a tension between the need for personalization and the need for privacy. Businesses must share data to be effective and ultimately provide tailored customer experiences. However, legislation and practices regarding data privacy have tightened, and data sharing is tougher and fraught with greater compliance constraints than ever before. The challenge for enterprises is reconciling the increased demand for data with increased data protection.
The modern world runs on data. Companies share data to facilitate their daily operations. Data distribution occurs between business departments and external third parties. Even something as innocuous as exchanging Microsoft Excel and Google Sheets spreadsheets is data sharing!
Data collaboration is entrenched in our business processes. Therefore, rather than avoiding it, we must find the tools and frameworks to support secure and privacy-compliant data sharing. So how do we govern the flow of sensitive information from our data platforms to other parties?
The answer: data clean rooms. Data clean rooms are the modern vehicle for various data sharing and data governance workflows. Across industries – including media and entertainment, advertising, insurance, private equity, and more – a data clean room can be the difference-maker in your data insights.
There is a classic thought experiment wherein two millionaires want to find out who is richer without actually sharing how much money they are individually worth. The data clean room solves this issue by allowing parties to ask approved questions, which require external data to answer, without actually sharing the sensitive information itself!
In other words, a data clean room is a framework that allows two parties to securely share and analyze data by granting both parties control over when, where, and how said data is used. The parties involved can pool together data in a secure environment that protects private details. With data clean rooms, brands can access crucial and much-needed information while maintaining compliance with data privacy policies.
Data clean rooms have been around for about five years, with Google being the first company to launch a data clean room solution (Google Ads Data Hub) in 2017. The era of user privacy kicked off in 2018, when data protection and privacy became law, most notably with the General Data Protection Regulation (GDPR).
This was a huge shake-up for most brands. Businesses had to adapt their data collection and sharing models to operate within the scope of the new legislation and the walled gardens that became popular amongst all tech giants. With user privacy becoming a priority, data sharing has become stricter and more scrutinized, which makes marketing campaign measurements and optimizations in the customer journey more difficult than ever before.
Data clean rooms are crucial for brands navigating the era of consumer protection and privacy. Brands can still gain meaningful marketing insights and operate within data privacy laws in a data clean room.
Data clean rooms work because the parties involved have full control over their data. Each party agrees upon access, availability, and data usage, while a trusted data clean room offering oversees data governance. This yields the secure framework needed to ensure that one party cannot access the other’s data, and it upholds the foundational rule that individual, user-level data cannot be shared between parties without consent.
Personally identifiable information (PII) remains anonymized and is processed and stored in a way that is not exposed to any of the parties involved. Thus, data sharing within a data clean room complies with privacy regulations such as the GDPR and the California Consumer Privacy Act (CCPA).
How does a data clean room work?
Let’s take a deeper dive into the functionality of a data clean room. Four components are involved with a data clean room:
#1 – Data ingestion
Data is funneled into the data clean room. This can be first-party data (generated from websites, applications, CRMs, etc.) or second-party data from collaborating parties (such as ad networks, partners, publishers, etc.)
#2 – Connection and enrichment
The ingested data sets are matched at the user level. Tools like third-party data enrichment complement the data sets.
#3 – Analytics
The data is analyzed to determine if there are intersections/overlaps, measurement/attribution, and propensity scoring. Data will only be shared where the data points intersect between the two parties.
#4 – Application
Once the data has finished its data clean room journey, each party will have aggregated data outputs. These outputs create the business insights necessary to accomplish crucial tasks such as optimizing the customer experience, performing reach and frequency measurements, building effective cross-platform journeys, and conducting deep marketing campaign analyses.
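The connection and analytics steps can be sketched with hashed identifiers: each party hashes its customer emails with an agreed salt, and only the hashes are compared, so neither side sees the other’s raw identifiers. This is a simplified illustration, not how any specific clean room vendor implements matching.

```python
import hashlib

# Clean room matching sketch: both parties hash identifiers with a salt
# agreed inside the clean room; only hash overlap is computed and shared.

def hashed(emails, salt="shared-salt"):  # salt is a hypothetical shared secret
    return {hashlib.sha256((salt + e.lower()).encode()).hexdigest()
            for e in emails}

brand_customers = {"ana@example.com", "bo@example.com"}
publisher_audience = {"BO@example.com", "cy@example.com"}

# Data is shared only where the two parties' data points intersect.
overlap = hashed(brand_customers) & hashed(publisher_audience)
overlap_count = len(overlap)
```

Here the only result that crosses the boundary is the aggregated overlap count, which mirrors the rule that user-level data never leaves its owner.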
What are the benefits of a data clean room?
Data clean rooms can benefit businesses in any industry, including media, retail, and advertising. In summary, data clean rooms are beneficial for the following reasons:
You can enrich your partner’s data set.
With data clean rooms, you can collaborate with your partners to produce and consume data regarding overlapping customers. You can pool common customer data with your partners, find the intersection between your business and your partners, and share the data upstream without sharing sensitive information with competitors. An example would be sharing demand and sales information with an advertising partner for better-targeted marketing campaigns.
You can create governance within your enterprise.
Data clean rooms provide the framework to achieve the elusive “single source of truth.” You can create a golden record encompassing all the data in every system of records within your organization. This includes sensitive PII such as social security numbers, passport numbers, financial account numbers, transactional data, etc.
You can remain policy compliant.
In a data clean room environment, you can monitor where the data lives, who has access to it, and how it is used. Think of it as an automated middleman that validates requests for data. This allows you to share data while remaining compliant with all the important acronyms: GDPR, HIPAA, CCPA, FCRA, ECPA, etc.
But you have to do it right…
With every data security and analytics initiative, there is a set of risks if the implementation is not done correctly. A truly “clean” data clean room will allow you to unlock data for your users while remaining privacy compliant. You can maintain role-based access, tokenized columns, and row-level security – which typically lock down particular data objects – and share these sensitive data sets quickly and in a governed way. Data clean rooms satisfy the need for efficient access and the need for the data producer to limit the consumer to relevant information for their use case.
Of course, there are consequences if your data clean room is actually “dirty.” Your data must be federated, and you need clarity on how it is stored. If your room is dirty, you risk:
Loss of customer trust
Fines from government agencies
Inadvertently oversharing proprietary information
Locking out valuable data requests due to a lack of process
Despite the potential risks of utilizing a data clean room, it is the most promising solution to the challenges of data-sharing in a privacy-compliant way.
To get the most out of your data, your business needs to create secure processes to share data and decentralize your analytics. This means pooling together common data with your partners and distributing the work to create value for all parties involved.
However, you must govern your data. It is imperative to treat your data like an asset, especially in the era of user privacy and data protection. With data clean rooms, you can reconcile the need for data collaboration with the need for data ownership and privacy.
2nd Watch can be your data clean room guide, helping you to establish a data mesh that enables sharing and analyzing distributed pools of data, all while maintaining centralized governance. Schedule time to get started with a data clean room.
Here’s a hypothetical situation: Your leadership team is on a conference call, and the topic of conversation turns to operational reports. The head of each line of business (LOB) presents a conflicting set of insights, but each one is convinced that the findings from their analytics platform are the gospel truth. With data segregated across the LOBs, there’s no clear way to determine which insights are correct or make an informed, unbiased decision.
What Do You Do?
In our experience, the best course of action is to create a single source of truth for all enterprise analytics. Organizations that do so achieve greater data consistency and quality data sources, increasing the accuracy of their insights – no matter who is conducting analysis. Since the average organization draws from 400 different data sources (and one in five needs to integrate more than 1,000 disparate data sources), it’s no surprise that many organizations struggle to integrate their data. Yet with these data integration best practices, you’ll find fewer challenges as you create a golden source of insight.
Take a Holistic Approach
The complexity of different data sources and niche analytical needs within the average organization makes it difficult for many to home in on a master plan for data integration. As a result, there are plenty of instances in which the tail ends up wagging the dog.
Maybe it’s an LOB with greater data maturity pushing for an analytics layer that aligns with their existing analytics platform to the detriment of others. Or maybe the organization is familiar with a particular stack or solution and is trying to force the resulting data warehouse to match those source schema. Whatever the reason, a non-comprehensive approach to data integration will hamstring your reporting.
In our experience, organizations see the best results when they design their reporting capabilities around their desired insight – not a specific technology. Take our collaboration with a higher education business. They knew from the outset that they wanted to use their data to convert qualified prospects into more enrollees. They trusted us with the logistics of consolidating their more than 90 disparate data sources (from a variety of business units across more than 10 managed institutions) into reports that helped them analyze the student journey and improve their enrollment rate as a whole.
With their vision in mind, we used an Alooma data pipeline to move the data to the target cloud data warehouse, where we transformed the data into a unified format. From there, we created dashboards that allowed users to obtain clear and actionable insight from queries capable of impacting the larger business. By working toward an analytical goal rather than conforming to their patchwork of source systems, we helped our client lay the groundwork to increase qualified student applications, reduce the time from inquiry to enrollment, and even increase student satisfaction.
Win Quickly with a Manageable Scope
When people hear the phrase “single source of truth” in relation to their data, they imagine their data repository needs to enter the world fully formed with an enterprise-wide scope. For mid-to-large organizations, that end-to-end data integration process can take months (if not years) before they receive any direct ROI from their actions.
One particular client of ours entered the engagement with that boil-the-ocean mentality. A previous vendor had proposed a three-year timeline, suggesting a data integration strategy that would:
Map their data ecosystem
Integrate disparate data sources into a centralized hub
Create dashboards for essential reporting
Implement advanced analytics and data science capabilities
Though we didn’t necessarily disagree with the projected capability, the waiting period before they experienced any ROI undercut the potential value. Instead, we’re planning out a quick win for their business, focusing on a mission-critical component that can provide a rapid ROI. From there, we will scale up the breadth of their target data system and the depth of their analytics.
This approach has two added benefits. One, you can test the functionality and accessibility of your data system in real time, making enhancements and adjustments before you expand to the enterprise level. Two, you can develop a strong and clear use case early in the process, lowering the difficulty bar as you try to obtain buy-in from the rest of the leadership team.
Identify Your Data Champion
The shift from dispersed data silos to a centralized data system is not a turnkey process. Your organization is undergoing a monumental change. As a result, you need a champion within the organization to foster the type of data-driven culture that ensures your single source of truth lives up to the comprehensiveness and accuracy you expect.
What does a data champion do? They act as an advocate for your new data-driven paradigm. They communicate the value of your centralized data system to different stakeholders and end users, encouraging them to transition from older systems to more efficient dashboards. Plus, they motivate users across departments and LOBs to follow data quality best practices that maintain the accuracy of insights enterprise wide.
It’s not essential that this person be a technical expert. They do need to be passionate and build trust with members of the team, showcasing the new possibilities your data integration solution unlocks. All of the technical elements of data integration, such as navigating your ELT/ETL tool, can be handled by a trusted partner like 2nd Watch.
Analyzing raw data without a singular, standardized format is as fruitful as trying to understand all 193 UN delegates shouting in their native tongues. Something important is being said, but good luck figuring out what that is. But reformat that raw data and shift them from their disparate sources into a single data warehouse, and the message rings through as clear as a bell.
That is the benefit that extract, transform, load (ETL) processes provide to organizations. Yet before you can access the hidden patterns and meanings in your data, you need to decide how you want to acquire your ETL tool: build one from scratch or buy an automated solution. Here’s what to consider as you make your decision.
Often, a small project scope with simple data flow benefits from a custom build, allowing your organization to calibrate your ETL tool to your precise needs and spend less in the process. Small shops may have fewer technical resources, but they will spend as much time integrating a pre-built ETL tool as building up simple data flows from the ground up.
When the scope is a massive enterprise-level ETL framework, it makes more sense to engage with a preexisting ETL tool and accelerate your analytics timeline. Even then, we recommend a data management partner experienced in ETL processes, one that’s done the technical work of hooking the sources together and transforming the data numerous times. They know the accelerators to get your program up and running enterprise-wide.
What technology are you using?
Your current tech stack is always a consideration. For example, if you prefer open source technology or depend on a web of legacy systems for daily operations, building your own system eliminates the worry that your integration won’t work. Building your own ETL program is also a preferred option for organizations with a custom or niche development environment that aims to use fewer computing resources or accelerate performance.
On the other hand, organizations that value ease of use and GUI environments are better suited to buying their ETL program. For example, we had an online e-commerce client with a few internal technical resources. They understood their current state and their source systems but did not want to deal with setting up the actual workflows. In that scenario, we determined that integrating a preexisting ETL solution into their ecosystem would help their team load data and run reports more effectively.
What’s the shelf-life of your proposed solution?
How long you’ll use a specific ETL solution has significant influence on the decision to build or buy. If you need a quick-and-dirty, one-off load, it doesn’t make sense to invest $15,000+ per year for an ETL solution. If you have resources capable of scripting an ad hoc solution, utilizing their talents will achieve faster results.
On the other hand, companies that need a scalable or long-term strategic solution tend to lean toward a prepackaged ETL tool. Due to the evolving data sources and streams available in these organizations, a preexisting ETL tool in which ongoing development and integration are handled by the vendor is ideal. The only major challenge is ensuring that your team is maximizing your investment by using the full capacity of your vendor’s ETL solution. Fortunately, it’s a feat that’s more manageable if you work with a technical consulting partner like 2nd Watch.
What’s your budget?
This one is a little deceptive. Though there is an initial investment in building your own solution, you avoid the ongoing subscription fees and the initial integration costs that are frequently overlooked in preliminary estimates. Additionally, buying an ETL solution often means you’ll be charged per source system transformed and loaded into your data warehouse. So depending on the number of disparate sources and the volume of data, the build option is a good way to avoid overspending on data ingestion.
Though large enterprises will still end up paying these costs, they can justify them as a trade-off for greater traceability and cataloging for the sake of compliance. The ability to track business data and smoothly conduct audits is more than enough for some organizations to defend the elevated price tag, especially if those organizations are in the healthcare or financial sectors.
Who will manage the ETL process?
Control is a significant consideration for plenty of organizations. For those who want to own the ETL system, building is the right choice. Often, this makes the most sense when you already have a custom infrastructure, legacy storage system, or niche analytics needs.
Yet not every organization wants to divert attention from their primary business. Let’s say you’re a healthcare organization that wants to build a comprehensive data warehouse from a myriad of data sources while still maintaining compliance. Trusting an experienced vendor removes a considerable amount of risk.
Do you need flexibility in your analytics?
What types of reports will you be running? Standard ones for your industry or business environment? Or reports that are particular to your own unique needs? Your answer heavily influences the choices you make about your ETL tool.
If you feel your demands upon a data warehouse will be uncommon, then building is the ideal choice. That way, your reporting isn’t constrained to fit a preconceived notion of your needs. Hand-coding your own ETL program enables you to write scripts for whatever schemas or parameters you have in mind. The only limitation is your own technical capability or that of your data management consulting partner.
If performance outranks customization, buying an ETL tool like Attunity, Talend, or others is the superior option. As we’ve said before, you’ll lose some level of flexibility and back-end control, but these enterprise-level ETL solutions allow you to gather, cleanse, and refine data with very minimal effort. Who said data transformation needed to be difficult?
Do you have access to technical experts?
Effective ETL processes require a skilled workforce to deliver maximum results. Moreover, that workforce needs to know how to build a data warehouse. You either need internal resources, a data management partner, or a proficient solutions provider involved in the development, auditing, and testing processes.
Internal resources who can hand-code scripts and manage data workflows allow you to build and launch your own ETL program. Additionally, you don’t need to hire outside resources to monitor ongoing performance or troubleshoot issues. The trade-off is that their work on your ETL solution and data integration can divert their attention from long-term strategic projects or operations. An effective compromise is having an internal resource take ownership of the project while outsourcing the scripting, loading, and data migration to a technical partner.
For organizations without spare technical talent, buying a prepackaged ETL tool simplifies a portion of the initial technical investment. However, most organizations still need assistance with current state audits to verify all the source systems, hands-on integration support to get reporting up and running, and training on the new reporting processes. Choosing the right technical consulting partner enables you to deliver results in reasonable timetables without hiring new IT talent to handle the ETL process.
The advantage of a data management partner like 2nd Watch is that we’re proficient in both build and buy situations. If you decide to build, we can help with the scripting and create a support team. If you decide to buy, we can help integrate the tool and teach your internal team how to maximize all of the ETL tool’s features. That way, you can prioritize other more strategic and/or inflexible considerations while still implementing your disparate data sources into a single data warehouse.
Can you provide internal training?
What happens after the implementation? Will your team be able to grab the baton and confidently sprint forward without any impediments? Or are you at risk from the “bus factor,” where the loss of a single person wipes out your organization’s knowledge of the ETL solution? The success of both building an ETL platform and buying a cloud-based subscription depends on the effectiveness of the associated training process.
Going with a custom build means you’re dependent on your own knowledge sharing. You may encounter a bottleneck where only a handful of key employees understand how to run reports once the ETL processes are in place. And if a tool is time-consuming or frustrating, you’ll struggle to encourage buy-in.
However, with a purchased ETL tool, resources outside of your team should easily understand the logistics of the workflows and be able to support your system. Your organization can then recruit or contract staff that is already familiar with the technical function of your tool without painfully reverse-engineering your scripting. Beware, though! You will encounter the same problems as a built system if you integrate the tool poorly. (Don’t just write custom scripting within your workflows if you want to get the benefits from a purchased option.)
The right data management partner can avoid this situation entirely. For example, the 2nd Watch team is skilled at sharing organizational process changes and communicating best practices to users and stakeholders. That way, there’s no barrier to usage of any ETL tool across your organization.
Whether you build or buy your ETL tool, 2nd Watch can help you implement the right solution. Schedule a whiteboard session to review your options and start on the path to better data analytics.
The widespread adoption of the value-based care model is encouraging more healthcare organizations to revisit the management of their data. Increased emphasis on the quality of service, elevating care outcomes along the way, means that organizations depend more than ever on consistent, accessible, and high-quality data.
The problem is that the current state of data management is inconsistent and disorganized. Less than half of healthcare CIOs trust the current quality of their clinical, operational, and financial data. In turn, the low credibility of their data sources calls into question their reporting and analytics, which ripples outward and inhibits their decision-making as a whole. Clinical diagnoses, operational assessments, insurance policy designs, and patient/member satisfaction reports all suffer under poor data governance.
Fortunately, most healthcare organizations can take straightforward steps to improve their data governance – if they are aware of what’s hindering their reporting and analytics. With that goal in mind, here are some of the most common challenges and oversights for data governance and what your organization can do to overcome them.
Data Silos
Most healthcare organizations are now aware of the idea of data silos. As a whole, the industry has made commendable progress breaking down these barriers and unifying large swaths of raw data into centralized repositories. Yet the ongoing addition of new data sources can lead to the return of analytical blind spots if your organization doesn’t create permanent protocols to prevent them.
Consider this situation: Your billing department just implemented a live chat feature on your website or app, providing automated answers to a variety of patient or member questions. If there is not an established protocol automatically integrating data from these interactions into your unified view, then you’ll miss valuable pieces of each patient or member’s overall story. The lack of data might result in missed opportunities for outreach campaigns or even expanded services.
Adding any new technology (e.g., live chat, healthcare diagnostic devices, virtual assistants) creates a potential threat to the comprehensiveness of your insights. Yet by creating a data pipeline and a data-centric culture, you can prevent data siloing from reasserting itself. Remember that your data ecosystem is dynamic, and your data governance practices should be too.
Lack of Uniformity
None of the data within a healthcare organization exists in a vacuum. Even if the data within your EHR or medical practice management (MPM) software is held to the highest quality standards, a lack of consistency between these or other platforms can diminish the overall accuracy of analytics. Worst of all, this absence of standardization can impact your organization in a number of ways.
When most people think of inconsistencies, they think of the accuracy of the data itself. There are the obviously harmful clinical inconsistencies (e.g., a pathology report indicates cancerous cells are acute while a clinical report labels them chronic) and less glaring but damaging organizational inconsistencies (e.g., two or more different contact numbers that hamper communication). In these examples and others, data inaccuracies muddy the waters and impair the credibility of your analytics. The other issue is more subtle, sneaking under the radar: mismatched vocabulary, terminology, or representations.
Here’s an example. Let’s say a healthcare provider is trying to analyze data from two different sources, their MPM and their EHR. Both deal with patient demographics but might define demographic categories differently. Their age brackets might vary (one system might end a bracket at ages 18 to 29 while another draws the line at 18 to 35), which can prevent seamless integration for demographic analysis. Though less harmful than inaccurate data, this lack of uniformity can prevent departments from reaching a common understanding and deriving meaningful business intelligence from their shared data.
In all of the above instances, establishing a single source of truth with standardized information and terminology is essential if you’re going to extract accurate and meaningful insights during your analyses.
To combat these problems, your organization needs to decide upon a standardized representation of core data entities that create challenges upon analysis. Then, rather than cleansing the data in their respective source systems, you can use an ELT process to extract and load structured and unstructured data into a centralized repository. Once the data has been centralized, you can evaluate the data for inaccuracies, standardize the data by applying data governance rules against it, and finally normalize the data so your organization can analyze it with greater uniformity.
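The post-load standardization step can be sketched in a few lines. This is a minimal illustration, not a production pipeline: the field names (`age`, `age_bracket`) and the bracket definitions are assumptions, since real EHR and MPM schemas differ widely.

```python
# Minimal sketch of post-load standardization in an ELT pipeline.
# Field names and bracket boundaries are illustrative assumptions.

# Standardized age brackets the organization has agreed on
STANDARD_BRACKETS = [(0, 17, "0-17"), (18, 29, "18-29"),
                     (30, 49, "30-49"), (50, 200, "50+")]

def standardize_age(age):
    """Map a raw age onto the organization-wide bracket vocabulary."""
    for low, high, label in STANDARD_BRACKETS:
        if low <= age <= high:
            return label
    return "unknown"

# Records loaded (unchanged) from two source systems into the repository
mpm_record = {"patient_id": "P001", "age": 26}
ehr_record = {"patient_id": "P001", "age": 26, "phone": "555-0100"}

# Apply the governance rule after loading, not in the source systems
for record in (mpm_record, ehr_record):
    record["age_bracket"] = standardize_age(record["age"])

print(mpm_record["age_bracket"])  # both systems now report "18-29"
```

The key design point is that the rule lives in one place, downstream of the load, so every source system inherits the same vocabulary without being modified itself.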
Restricted Data Accessibility
Even when your data is high-quality and consistent, your organization might still fall short of data governance best practices. The reason why? The accessibility of your data might not thread the needle between HIPAA compliance and appropriate end-user authorization.
Some organizations, dedicated to protecting the protected health information (PHI) of their patients or members, clip their own wings when the time comes to analyze data. In an attempt to avoid expensive HIPAA violations, they restrict stakeholders, analysts, or other covered entities from accessing the data. Though it’s essential to remain HIPAA compliant, data analysis can be conducted in ways that safeguard PHI while also improving treatment quality or reducing the cost of care.
Your organization can de-identify records (removing names, geographic indicators, contact info, social security numbers, etc.) in a specific data warehouse. Presenting scrubbed files to authorized users can help them gain a wide range of insights that can transform care outcomes, reduce patient outmigration, reduce waste, and more.
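The de-identification step can be sketched as a simple field filter. The list below follows the spirit of HIPAA's Safe Harbor identifiers but is abbreviated for illustration; a production system must cover all 18 identifier categories and handle indirect identifiers as well.

```python
# Minimal sketch of record de-identification before analysis.
# PHI_FIELDS is an abbreviated, illustrative list, not a complete
# Safe Harbor implementation.

PHI_FIELDS = {"name", "address", "zip_code", "phone", "email", "ssn"}

def deidentify(record):
    """Return a copy of the record with direct identifiers removed."""
    return {k: v for k, v in record.items() if k not in PHI_FIELDS}

patient = {
    "name": "Jane Doe",
    "ssn": "000-00-0000",
    "zip_code": "60601",
    "diagnosis_code": "E11.9",
    "visit_count": 4,
}

scrubbed = deidentify(patient)
print(scrubbed)  # {'diagnosis_code': 'E11.9', 'visit_count': 4}
```

Analysts querying the scrubbed warehouse can still study diagnoses, utilization, and outcomes without ever touching the identifying fields.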
Elevating Your Overall Data Governance
With all of these challenges in sight, it’s easy to get overwhelmed by the next steps. Though we’ve provided some actions your organization can take, it’s important to recognize that effective data governance is as much a change in your mindset as it is a series of best practices. Here are some additional considerations to keep in mind as you work to improve your data governance:
You Need a Defined Data Governance Strategy.
An ad hoc approach to data governance will fail in the long run. There needs to be agreement among your data stakeholders about data availability, consistency, and quality. Often, it helps to start with a pilot project on a single line of business or department to ensure that all of the kinks of the transition are ironed out before your data governance strategy is taken enterprise wide.
Even then, compromise between standardization and distributed action is important so users within your organization are following the same best practices as they conduct dispersed analytics.
Your Culture Likely Needs to Change.
Eliminating data inconsistencies or correcting inaccuracies is only a temporary fix if your executives are the only ones committed to making a change. Employees across your organization need to embrace the ideals of effective data governance if your organization is going to gain useful and accurate intelligence from its data.
Is your organization suffering from poor data governance? Find out the ways you can improve your data management by scheduling a whiteboard session with a member of the 2nd Watch team.
Jim Anfield – Principal, Healthcare Practice Leader 2nd Watch