A healthcare client had been working on a local SQL server to start this development process but were having issues with scaling the operation up as the scope of the project expanded.
The solution 2nd Watch proposed involved setting up a full ETL pipeline that would take the data through multiple transformation steps. The end product of this solution essentially streamlined the ingestion/transformation process through the use of 3 tools set up to eventually run entirely on automation.
This will continue to have benefits in cost savings, as a lot of these practices apply to the best ways to run a data warehouse from a finance perspective, as well as from a data analyst perspective.
About the Business
The client is an independent healthcare navigator, working with employees and employers to simplify the healthcare experience. Not providing insurance themselves, they work with providers to field requests from customers and point them towards the best solution. They operate as the leaders in the healthcare navigation industry, which they were instrumental in building.
The Business Challenges
Working with a constantly growing number of customers and insurance clients, the company has maintained an enormous inflow of data from various sources. This includes external data, such as customer and provider information, as well as internal data, including call logs and employee information. This data comes from several different sources, all with individual resources for viewing and extracting it. Many of these sources also have unifying information, such as ID fields, which are shared across different platforms. This led to a desire to start centralizing all this incoming data into one location.
An internal team started to work on bringing this data out of the source systems and into a centralized server with API calls and other similar methods. They knew that they wanted to eventually build out dashboards and metrics they could use internally to analyze their business efficiency but were unsure how best to proceed with building that out. They had been working on a local SQL server to start this development process but were having issues with scaling the operation up as the scope of the project expanded. This led them to explore external solutions, which brought us into the picture.
The 2nd Watch Solution
The solution we proposed involved setting up a full ETL pipeline that would take the data through multiple transformation steps. The first step in this process was utilizing AWS to trigger API calls to all their individual data sources, funneling it all into one cloud storage system (s3). Bringing data through AWS would increase the ease of access for the data and allow for more versatility with the additional steps of the ETL process. The data would then be brought into Snowflake, a cloud data warehouse, to take the raw files and start building tables for that data. Snowflake would handle the ingestion from AWS s3 by utilizing triggers that send notifications when new files are ingested into s3.
After the data would move through s3 and Snowflake, an instance of dbt would be added on top of that. Dbt is an ETL tool that allows for SQL querying, taking raw data and transforming it into more user-accessible tables that can be accessed by BI tools like Tableau and ThoughtSpot. Scheduled jobs that create and re-populate these tables are created within dbt Cloud and output the results to Snowflake. From here, the internal BI team at the client can take these finished tables and utilize them to create dashboards and other business metrics that help them determine internal metrics. The end product of this solution essentially streamlines the ingestion/transformation process through the use of 3 tools set up to eventually run entirely on automation.
The Business Benefits
The end goal with this project was to create a reporting suite that would give our client business insights into their consumers and their own internal proceedings. While we are not building them these reports ourselves, we have assisted them in giving them access to a much larger base to pull data from. This data, after going through our new pipeline, takes several disparate sources and links them together with keys and joins that the client can use to create visuals with a much higher ease of use. Rather than having to spend time building out every metric and view within a BI tool, the work is already done and being refreshed regularly within Snowflake. This solution is saving time, in the long run, as data is now easier for stakeholders to access and analyze compared to the earlier raw state. Not to mention the benefits of cloud storage allowing more versatility than their previous local server.
The reports being built off this data have also proved to be very effective within the company. Their BI team has been reporting on new developments within the company that were discovered using our new system. Tables that we are building have helped point them towards gaps in their data collection, points of pain with their regular users, and places to improve efficiency with their call center staff. These insights will continue to grow as more data is ingested and tables are developed, but what they have seen so far from these builds have been incredibly effective. Our client is now learning more about how to improve the efficiency of their business model, as well as simplifying their data handling coming from collection sources.
We believe we were able to create a better atmosphere for client learning within the team we have been working with. We have gone through and continued to give them a greater understanding of how to format and organize their data into a more user accessible and effective system. This will continue to have benefits in cost savings, as a lot of these practices apply to the best ways to run a data warehouse from a finance perspective, as well as from a data analyst perspective. While they are working more on the physical deliverables of this endeavor, we have been focused on creating a knowledge base within the company that allows them to continue growing even after we are finished with this project.