A Developer’s Guide to Power BI

There are many options when it comes to data analytics tools. Choosing the right one for your organization will depend on a number of factors. Since many of the reviews and articles on these tools are focused on business users, the 2nd Watch team wanted to explore these tools from the developer’s perspective. In this developer’s guide to Power BI, we’ll go over the performance, interface, customization, and more to help you get a full understanding of this tool.

Why Power BI?

Power BI is a financially attractive alternative to the likes of Tableau and Looker, which either offer custom-tailored pricing models or a large initial per-user cost followed by an annual fee after the first year. However, don’t conflate cost with quality; getting the most out of Power BI is more dependent on your data environment and who is doing the data discovery. Companies already relying heavily on Microsoft tools should look to add Power BI to their roster, as it integrates seamlessly with SQL Server Analysis Services to facilitate faster and deeper analysis.

Performance for Developers

When working with large datasets, developers will experience some slowdown as they customize and publish their reports. Developing on Power BI works best with small-to-medium-sized data sets. At the same time, Microsoft has come out with more optimization options such as drill-through functionality, which allows for deeper analytical work for less processing power.

Performance for Users

User performance through Power BI Services is controlled through row-level security implementation. For any sized dataset, the number of rows can be limited depending on the user’s role. Overviews and executive dashboards may run somewhat slowly, but as the user’s role becomes more granular, dashboards will operate more quickly.

User Interface: Data Layer

Data is laid out in a tabular form; clicking any measure column header reveals a drop-down menu with sorting options, filtering selections, and the Data Analysis Expressions (DAX) behind the calculation.

User Interface: Relationship Layer

The source tables are draggable objects with labeled arrows between tables denoting the type of relationship.

Usability and Ease of Learning

Microsoft Power BI documentation is replete with tutorials, samples, quickstarts, and concepts for the fundamentals of development. For a more directed learning experience, Microsoft also put out the Microsoft Power BI Guided Learning set, which is a freely available collection of mini courses on modeling, visualization, and exploration of data through Power BI. It also includes an introduction to DAX development as a tool to transform data in the program. Additionally, the Power BI community forums almost always have an answer to any technical question a developer might have.

Modeling

Power BI can easily connect to multiple data sources including both local folders and most major database platforms. Data can be cleaned and transformed using the Query Editor; the Editor can change data type, add columns, and combine data from multiple sources. Throughout this transformation process, the Query Editor records each step so that every time the query connects to the data source, the data is transformed accordingly. Relationships can be created by specifying a from: table and to table, the keys to relate, a cardinality, and a cross-filter direction.

Customization

In terms of data transformation, Power Query is a powerful language for ensuring that your report contains the exact data and relationships you and your business user are looking to understand. Power Query simplifies the process of data transformation with an intuitive step-by-step process for joining, altering, or cleaning your tables within Power BI. For actual report building, Power BI contains a comprehensive list of visualizations for almost all business needs; if one is not found within the default set, Microsoft sponsors a visual gallery of custom user-created visualizations that anyone is free to explore and download.

Permissions and User Roles

Adding permissions to workspaces, datasets, and reports within your org is as simple as adding an email address and setting an access level. Row-level security is enabled in Power BI Desktop; role management allows you flexibly customize access to specific data tables using DAX functions to specify conditional filters. Default security filtering is single-directional; however, bi-directional cross-filtering allows for the implementation of dynamic row-level security based on usernames and/or login IDs.

Ease of Dev Opp and Source Control

When users have access to a data connection or report, source and version control are extremely limited without external GitHub resources. Most of the available activities are at the macro level: viewing/editing reports, adding sources to gateways, or installing the application. There is no internal edit history for any reports or dashboards.

Setup and Environment

Setup is largely dependent on whether your data is structured in the cloud, on-premises, or a hybrid. Once the architecture is established, you need to create “data gateways” and assign them to different departments and data sources. This gateway acts as a secure connection between your data source and development environments. From there, security and permissions can be applied to ensure the right people within your organization have access to your gateways. When the gateways are established, data can be pulled into Power BI via Power Query and development can begin.

Implementation

The most common implementation of Power BI utilizes on-premises source data and Power BI Desktop for data preparation and reporting, with Power BI Service used in the cloud to consume reports and dashboards, collaborate, and establish security. This hybrid implementation strategy takes advantage of the full range of Power BI functionality by leveraging both the Desktop and Service versions. On-premises data sources connect to Power BI Desktop for development, leading to quicker report creation (though Power BI also supports cloud-based data storage).

Summary and Key Points

Power BI is an extremely affordable and comprehensive analytics tool. It integrates seamlessly with Excel, Azure, and SQL Server, allowing for established Microsoft users to start analyzing almost instantly. The tool is easy to learn for developers and business users alike, and there are many available resources, like Microsoft mini-courses and community forums.

A couple things to be aware of with Power BI: It may lack some of the bells and whistles as compared to other analytics tools, and it’s best if you’re already in the Microsoft ecosystem and are coming in with a solid data strategy.

If you want to learn more about Power BI or any other analytics tools, contact us today to schedule a no-obligation whiteboard session.


A Developer’s Guide to Tableau

Why Tableau?

Tableau gets a good reputation for being sleek and easy to use and by bolstering an impeccable UI/UX. It’s by and large an industry leader due to its wide range of visualizations and ability to cohesively and narratively present data to end users. As a reliable, well-established leader, Tableau can easily integrate with many sources, has extensive online support, and does not require a high level of technical expertise for users to gain value.

Want better Tableau dashboards? Our modern data and analytics experts are here to help. Learn more about our modern cloud analytics solutions with Snowflake and Tableau.

Performance for Developers

One of the easiest ways to ensure good performance with Tableau is to be mindful of how you import your data. Utilizing extracts rather than live data and performing joins or unions in your database reduces a lot of the processing that Tableau would otherwise have to do. While you can easily manipulate data without any coding, these capabilities reduce performance significantly, especially when dealing with large volumes of information. All data manipulation should be done in your database or data warehouse prior to adding it as a source. If that isn’t an option, Tableau offers a product called Tableau Prep that enables data manipulation and enhanced data governance capabilities.

Performance for Users

Dashboard performance for users depends almost entirely on practices employed by developers when building out reports. Limiting the dataset to information required for the goals of the dashboard reduces the amount of data Tableau processes as well as the number of filters included for front-end users. Cleaning up workbooks to reduce unnecessary visualizations will enhance front-end performance as well.

User Interface: Data Source

After connecting to your source, Tableau presents your data using the “Data Source” tab. This is a great place to check that your data was properly loaded and doesn’t have any anomalies. Within this view of the data, you have the chance to add more sources and the capability to union and join tables together as well as filter the data to a specific selection and exclude rows that were brought in.

User Interface: Worksheet

The “Worksheet” tabs are where most of the magic happens. Each visualization that ends up on the dashboard will be developed in separate worksheets. This is where you will do most of the testing and tweaking as well as where you can create any filters, parameters, or calculated fields.

User Interface: Dashboards

In the “Dashboard” tab, you bring together all of the individual visualizations you have created. The drag-and-drop UI allows you to use tiles predetermined by Tableau or float the objects to arrange them how you please. Filters can be applied to all of the visualizations to create a cohesive story or to just a few visualizations to break down information specific to a chart or table. It additionally allows you to toggle between different device layouts to ensure end-user satisfaction.

User Interface: Stories

One of the most unique Tableau features is its “Stories” capability. Stories work great when you need to develop a series of reports that present a narrative to a business user. By adding captions and placing visualizations in succession, you can convey a message that speaks for itself.

Usability and Ease of Learning

The Tableau basics are relatively easy to learn due to the intuitive point-and-click UI and vast amount of educational resources such as their free training videos. Tableau also has a strong online community where answers to specific questions can be found either on the Help page or third-party sites.

Creating an impressive variety of simple visualizations can be done without a hitch. This being said, there are a few things to watch out for:

  • Some tricks and more niche capabilities can easily remain undiscovered.
  • Complex features such as table calculations may confuse new users.
  • The digestible UI can be deceiving – visualizations often appear correct when the underlying data is not. One great way to check for accuracy is to right-click on the visualization and select “View Data.”

Modeling

Unlike Power BI, Tableau does not allow users to create a complicated semantic layer within the tool. However, users can establish relationships between different data sources and across varied granularities through a method called data blending. One way to implement this method is by selecting the “Edit Relationships” option in the data drop-down menu.

Data blending also eliminates duplicates that may occur by using a function that returns a single value for the duplicate rows in the secondary source. Creating relationships among multiple sources in Tableau requires attention to detail as it can take some manipulation and may have unintended consequences or lead to mistakes that are difficult to spot.

Customization

The wide array of features offered by Tableau allows for highly customizable visualizations and reports. Implementing filter actions (which can apply to both worksheets and dashboards), parameters, and calculated fields empowers developers to modify the source data so that it better fits the purpose of the report. Using workarounds for calculations not explicitly available in Tableau frequently leads to inaccuracy; however, this can be combated by viewing the underlying data. Aesthetic customizations such as importing external images and the large variety of formatting capabilities additionally allow developers boundless creative expression.

Permissions and User Roles

The type of license assigned to a user determines their permissions and user roles. Site administrators can easily modify the site roles of users on the Tableau Server or Tableau Online based on the licenses they hold. The site role determines the most impactful action (e.g., read, share, edit) a specific user can make on the visualizations. In addition to this, permissions range from viewing or editing to downloading various components of a workbook. The wide variety of permissions applies to various components within Tableau. A more detailed guide to permissions capabilities can be found here.

Ease of Dev Opp and Source Control

Dev opp and source control improved greatly when Tableau implemented versioning of workbooks in 2016. This enables users to select the option to save a history of revisions, which saves a version of the workbook each time it is overwritten. This enables users to go back to previous versions of the workbook and access work that may have been lost. When accessing prior versions, keep in mind that if an extract is no longer compatible with the source, its data refresh will not work.

Setup and Environment

With all of the necessary information on your sources, setup in Tableau is a breeze. It has built-in connectors with a wide range of sources and presents your data to you upon connection. You also have a variety of options regarding data manipulation and utilizing live or static data (as mentioned above). Developers utilize the three Tableau environments based primarily on the level of interactions and security they desire.

  • Tableau Desktop: Full developer software in a silo; ability to connect to databases or personal files and publish work for others to access
  • Tableau Server: Secure environment accessed through a web browser to share visualizations across the organization; requires a license for each user
  • Tableau Online: Essentially the same as Tableau Server but based in the cloud with a wider range of connectivity options

Implementation

Once your workbook is developed, select the server and make your work accessible for others either on Tableau Online or on Tableau Server by selecting “publish.” During this process, you can determine the specific project you are publishing and where to make it available. There are many other modifications that can be adjusted such as implementing editing permissions and scheduling refreshes of the data sources.

Summary and Key Points

Tableau empowers developers of all skill levels to create visually appealing and informative dashboards, reports, and storytelling experiences. As developers work, there is a wealth of customization options to tailor reports to their specific use case and draw boundless insights for end users. To ensure that Tableau gleans the best results for end users, keep these three notes in mind:

  1. Your underlying data must be trustworthy as Tableau does little to ensure data integrity. Triple-check the numbers in your reports.
  2. Ensure your development methods don’t significantly damage performance for both developers and end users.
  3. Take advantage of the massive online community to uncover vital features and leverage others’ knowledge when facing challenges.

If you have any questions on Tableau or need help getting better insights from your Tableau dashboards, contact us for an analytics assessment.


Here’s Why Your Data Science Project Failed (and How to Succeed Next Time)

87% of data science projects never make it beyond the initial vision into any stage of production. Even some that pass-through discovery, deployment, implementation, and general adoption fail to yield the intended outcomes. After investing all that time and money into a data science project, it’s not uncommon to feel a little crushed when you realize the windfall results you expected are not coming.

Yet even though there are hurdles to implementing data science projects, the ROI is unparalleled – when it’s done right.

Looking to get started with ML, AI, or other data science initiatives? Learn how to get started with our Data Science Readiness Assessment.

Opportunities

You can enhance your targeted marketing.

Coca-Cola has used data from social media to identify its products or competitors’ products in images, increasing the depth of consumer demographics and hyper-targeting them with well-timed ads.

You can accelerate your production timelines.

GE has used artificial intelligence to cut product design times in half. Data scientists have trained algorithms to evaluate millions of design variations, narrowing down potential options within 15 minutes.

With all of that potential, don’t let your first failed attempt turn you off to the entire practice of data science. We’ve put together a list of primary reasons why data science projects fail – and a few strategies for forging success in the future – to help you avoid similar mistakes.

Hurdles

You lack analytical maturity.

Many organizations are antsy to predict events or decipher buyer motivations without having first developed the proper structure, data quality, and data-driven culture. And that overzealousness is a recipe for disaster. While a successful data science project will take some time, a well-thought-out data science strategy can ensure you will see value along the way to your end goal.

Effective analytics only happens through analytical maturity. That’s why we recommend organizations conduct a thorough current state analysis before they embark on any data science project. In addition to evaluating the state of their data ecosystem, they can determine where their analytics falls along the following spectrum:

Descriptive Analytics: This type of analytics is concerned with what happened in the past. It mainly depends on reporting and is often limited to a single or narrow source of data. It’s the ground floor of potential analysis.

Diagnostic Analytics: Organizations at this stage are able to determine why something happened. This level of analytics delves into the early phases of data science but lacks the insight to make predictions or offer actionable insight.

Predictive Analytics: At this level, organizations are finally able to determine what could happen in the future. By using statistical models and forecasting techniques, they can begin to look beyond the present into the future. Data science projects can get you into this territory.

Prescriptive Analytics: This is the ultimate goal of data science. When organizations reach this stage, they can determine what they should do based on historical data, forecasts, and the projections of simulation algorithms.

Your project doesn’t align with your goals.

Data science, removed from your business objectives, always falls short of expectations. Yet in spite of that reality, many organizations attempt to harness machine learning, predictive analytics, or any other data science capability without a clear goal in mind. In our experience, this happens for one of two reasons:

1. Stakeholders want the promised results of data science but don’t understand how to customize the technologies to their goals. This leads them to pursue a data-driven framework that’s prevailed for other organizations while ignoring their own unique context.

2. Internal data scientists geek out over theoretical potential and explore capabilities that are stunning but fail to offer practical value to the organization.

Outside of research institutes or skunkworks programs, exploratory or extravagant data science projects have a limited immediate ROI for your organization. In fact, the odds are very low that they’ll pay off. It’s only through a clear vision and practical use cases that these projects are able to garner actionable insights into products, services, consumers, or larger market conditions.

Every data science project needs to start with an evaluation of your primary goals. What opportunities are there to improve your core competency? Are there any specific questions you have about your products, services, customers, or operations? And is there a small and easy proof of concept you can launch to gain traction and master the technology?

The above use case from GE is a prime example of having a clear goal in mind. The multinational company was in the middle of restructuring, reemphasizing its focus on aero engines and power equipment. With the goal of reducing their six- to 12-month design process, they decided to pursue a machine learning project capable of increasing the efficiency of product design within their core verticals. As a result, this project promises to decrease design time and budget allocated for R&D.

Organizations that embody GE’s strategy will face fewer false starts with their data science projects. For those that are still unsure about how to adapt data-driven thinking to their business, an outsourced partner can simplify the selection process and optimize your outcomes.

Your solution isn’t user-friendly.

The user experience is often an overlooked aspect of viable data science projects. Organizations do all the right things to create an analytics powerhouse customized to solve a key business problem, but if the end users can’t figure out how to use the tool, the ROI will always be weak. Frustrated users will either continue to rely upon other platforms that provided them with limited but comprehensible reporting capabilities, or they will stumble through the tool without unlocking its full potential.

Your organization can avoid this outcome by involving a range of end users in the early stages of project development. This means interviewing both average users and extreme users. What are their day-to-day needs? What data are they already using? What insight do they want but currently can’t obtain?

An equally important task is to determine your target user’s data literacy. The average user doesn’t have the ability to derive complete insights from the represented data. They need visualizations that present a clear-cut course of action. If the data scientists are only thinking about how to analyze complex webs of disparate data sources and not whether end users will be able to decipher the final results, the project is bound to struggle.

You don’t have data scientists who know your industry.

Even if your organization has taken all of the above considerations into mind, there’s still a chance you’ll be dissatisfied with the end results. Most often, it’s because you aren’t working with data science consulting firms that comprehend the challenges, trends, and primary objectives of your industry.

Take healthcare, for example. Data scientists who only grasp the fundamentals of machine learning, predictive analytics, or automated decision-making can only provide your business with general results. The right partner will have a full grasp of healthcare regulations, prevalent data sources, common industry use cases, and what target end users will need. They can address your pain points and already know how to extract full value for your organization.

And here’s another example from one of our own clients. A Chicago-based retailer wanted to use their data to improve customer lifetime value, but they were struggling with a decentralized and unreliable data ecosystem. With the extensive experience of our retail and marketing team, we were able to outline their current state and efficiently implement a machine-learning solution that empowered our client. As a result, our client was better able to identify sales predictors and customize their marketing tactics within their newly optimized consumer demographics. Our knowledge of their business and industry helped them to get the full results now and in the future.

Is your organization equipped to achieve meaningful results through data science? Secure your success by working with 2nd Watch. Schedule a whiteboard session with our team to get you started on the right path.


5 Ways Insurance Companies Are Driving ROI through Analytics

Insurance providers are rich with data far beyond what they once had at their disposal for traditional historical analysis. The quantity, variety, and complexity of that data enhance the ability of insurers to gain greater insights into consumers, market trends, and strategies to improve their bottom line. But which projects offer you the best return on your investment? Here’s a glimpse at some of the most common insurance analytics project use cases that can transform the capabilities of your business.

Want better dashboards? Our data and analytics insurance team are here to help. Learn more about our data visualization starter pack.

Issuing More Policies

Use your historical data to predict when a customer is most likely to buy a new policy.

Both traditional insurance providers and digital newcomers are competing for the same customer base. As a result, acquiring new customers requires targeted outreach with the right message at the moment a buyer is ready to purchase a specific type of insurance.

Predictive analytics allows insurance companies to evaluate the demographics of the target audience, their buying signals, preferences, buying patterns, pricing sensitivity, and a variety of other data points that forecast buyer readiness. This real-time data empowers insurers to reach policyholders with customized messaging that makes them more likely to convert.

Quoting Accurate Premiums

Provide instant access to correct quotes and speed up the time to purchase.

Consumers want the best value when shopping for insurance coverage, but if their quote fails to match their premium, they’ll take their business elsewhere. Insurers hoping to acquire and retain policyholders need to ensure their quotes are precise – no matter how complex the policy.

For example, one of our clients wanted to provide ride-share drivers with four-hour customized micro policies on-demand. Using real-time analytical functionality, we enabled them to quickly and accurately underwrite policies on the spot.

Improving Customer Experience

Better understand your customer’s preferences and optimize future interactions.

A positive customer experience means strong customer retention, a better brand reputation, and a reduced likelihood that a customer will leave you for the competition. In an interview with CMSWire, the CEO of John Hancock Insurance said many customers see the whole process as “cumbersome, invasive, and long.” A key solution is reaching out to customers in a way that balances automation and human interaction.

For example, the right analytics platform can help your agents engage policyholders at a deeper level. It can combine the customer story and their preferences from across customer channels to provide more personalized interactions that make customers feel valued.

Detecting Fraud

Stop fraud before it happens.

You want to provide all of your customers with the most economical coverage, but unnecessary costs inflate your overall expenses. Enterprise analytics platforms enable claims analysis to evaluate petabytes of data to detect trends that indicate fraud, waste, and abuse.

See for yourself how a tool like Tableau can help you quickly spot suspicious behavior with visual insurance fraud analysis.

Improving Operations and Financials

Access and analyze financial data in real time.

In 2019, ongoing economic growth, rising interest rates, and higher investment income were creating ideal conditions for insurers. However, that’s only if a company is maximizing their operations and ledgers.

Now, high-powered analytics has the potential to provide insurers with a real-time understanding of loss ratios, using a wide range of data points to evaluate which of your customers are underpaying or overpaying.

Are you interested in learning how a modern analytics platform like Tableau, Power BI, Looker, or other BI technologies can help you drive ROI for your insurance organization? Schedule a no-cost insurance whiteboarding strategy session to explore the full potential of your insurance data.


Improve Dashboard Performance when Using Tableau, Power BI, and Looker

As dashboards and reports become more and more complex, slow run times can present major roadblocks. Here’s a collection of some of the top tips on how to improve dashboard performance and cut slow run times when using Tableau, Power BI, and Looker.

Universal Tips

Before getting into how to improve dashboard performance within the three specific tools, here are a few universal principles that will lead to improved performance in almost any case.

Limit logic used in the tool itself: If you’re generating multiple calculated tables/views, performing complex joins, or adding numerous calculations in the BI tool itself, it’s a good idea for performance and governance to execute all those steps in the database or a separate business layer. The more data manipulation done by your BI tool, the more queries and functions your tool has to execute itself before generating visualizations.

Note: This is not an issue for Looker, as Looker offloads all of its computing onto the database via SQL.

Have the physical data available in the needed format: When the physical data in the source matches the granularity and level of aggregation in the dashboard, the BI tool doesn’t need to execute a function to aggregate it. Developing this in the data mart/warehouse can be a lot of work but can save a lot of time and pain during dashboard development.

Keep your interface clean and dashboards focused: Consolidate or delete unused report pages, data sources, and fields. Limiting the number of visualizations on each dashboard also helps cut dashboard refresh time.

Simplify complex strings: In general, processing systems execute functions with strings much more slowly than ints or booleans. Where possible, convert fields like IDs to ints and avoid complex string calculations.

Tableau

Take advantage of built-in performance tracking: Always the sleek, powerful, and intuitive leading BI tool, Tableau has a native function that analyzes performance problem areas. The performance recorder tells you which worksheets, queries, and dashboards are slow and even shows you the query text.

Execute using extracts rather than live connections: Tableau performs much faster when executing queries on extracts versus live connections. Use extracts  whenever possible,  and keep them trimmed down to limit query execution time. If you want to stream data or have a constantly refreshing dataset, then extracts won’t be an option.

Again, limit logic: Tableau isn’t built to handle too much relational modeling or data manipulation – too many complex joins or calculations really slow down its processing. Try to offload as many of these steps as possible onto the database or a semantic layer.

Limit marks and filters: Each mark included on a visualization means more parsing that Tableau needs to perform, and too many filters bog down the system. Try instead to split complex worksheets/visualizations into multiple smaller views and connect them with filter actions to explore those relationships more quickly.

Further Sources: Tableau’s website has a succinct and very informative blog post that details most of these suggestions and other specific recommendations. You can find it here.

Power BI

Understand the implications of DirectQuery: Similar in concept to Tableau’s extract vs. live connection options, import and DirectQuery options for connecting to data sources have different impacts on performance. It’s important to remember that if you’re using DirectQuery, the time required to refresh visuals is dependent on how long the source system takes to execute Power BI’s query. So if your database server is flushed with users or operating slowly for some other reason, you will have slow execution times in Power BI and the query may time out. (See other important considerations when using DirectQuery here.)

Utilize drillthrough: Drillthrough pages are very useful for data exploration and decluttering reports, but they also have the added benefit of making sure your visuals and dashboards aren’t overly complex. They cut down query execution time and improve runtime while still allowing for in-depth exploration.

Be careful with row-level security: Implementing row-level security has powerful and common security use cases, but unfortunately, its implementation has the tendency to bog down system performance. When RLS is in place, Power BI has to query the backend and generate caching separately for each user role. Try to create only as many roles as absolutely necessary, and be sure to test each role to know the performance implications.

Further Sources: Microsoft’s Power BI documentation has a page dedicated to improving performance that further details these options and more. Check it out here.

Looker

Utilize dashboard links: Looker has a wonderful functionality that allows for easy URL linking in their drill menus. If you’re experiencing long refresh times, a nifty remedy is to split up your dashboard into different dashboards and provide links between them in drill menus.

Improve validation speed: LookML validation checks the entire project –  all model, view, and LookML dashboard files. Increased complexity and crossover between logic in your files lead to longer validation time. If large files and complex relationships make lag in validation time problematic, it can be a good idea to break up your projects into smaller pieces where possible. The key here is handling complex SQL optimally by utilizing whatever methods will maximize SQL performance on the database side.

Pay attention to caching: Caching is another important consideration with Looker performance. Developers should be very intentional with how they set up caching and the conditions for dumping and refreshing a cache, as this will greatly affect dashboard runtime. See Looker’s documentation for more information on caching.

Optimize performance with Persistent Derived Tables (PDTs) and Derived Tables (DTs): Caching considerations come into play when deciding between using PDTs and DTs. A general rule of thumb is that if you’re using constantly refreshing data, it’s better to use DTs. If you’re querying the database once and then developing heavily off of that query, PDTs can greatly increase your performance. However, if your PDTs themselves are giving you performance issues, check out this Looker forum post for a few remedies.

Further Sources: Looker’s forums are rich with development tips. These two forum pages are particularly helpful to learn more about how to improve dashboard performance using Looker:

Want to learn more about how to improve dashboard performance? Our data and analytics experts are here to help. Learn about our data visualization starter pack.


How Insurance Fraud Analytics Can Protect Your Business from Fraudulent Claims

With your experience in the insurance industry, you understand more than most about how the actions of a smattering of people can cause disproportionate damage. The $80 billion in fraudulent claims paid out across all lines of insurance each year, whether soft or hard fraud, is perpetrated by lone individuals, sketchy auto mechanic shops, or the occasional organized crime group. The challenge for most insurers is that detecting, investigating, and mitigating these deceitful claims is a time-consuming and expensive process.

Rather than accepting loss to fraud as part of the cost of doing business, some organizations are enhancing their detection capabilities with insurance analytics solutions. Here is how your organization can use insurance fraud analytics to enhance fraud detection, uncover emerging criminal strategies, and still remain compliant with data privacy regulations.

Recognizing Patterns Faster

When you look at exceptional claim’s adjusters or special investigation units, one of the major traits they all share is an uncanny ability to recognize fraudulent patterns. Their experience allows them to notice the telltale signs of fraud, whether it’s frequent suspicious estimates from a body shop or complex billing codes intended to hide frivolous medical tests. Though you trust adjusters, many rely on heuristic judgments (e.g., trial and error, intuition, etc.) rather than hard rational analysis. When they do have statistical findings to back them up, they struggle to keep up with the sheer volume of claims.

This is where machine learning techniques can help to accelerate pattern recognition and optimize the productivity of adjusters and special investigation units. An organization starts by feeding a machine learning model a large data set that includes verified legitimate and fraudulent claims. Under supervision, the machine learning algorithm reviews and evaluates the patterns across all claims in the data set until it has mastered the ability to spot fraud indicators.

Let’s say this model was given a training set of legitimate and fraudulent auto insurance claims. While reviewing the data for fraud, the algorithm might spot links in deceptive claims between extensive damage in a claim and a lack of towing charges from the scene of the accident. Or it might notice instances where claims involve rental cars rented the day of the accident that are all brought to the same body repair shop. Once the algorithm begins to piece together these common threads, your organization can test the model’s unsupervised ability to create a criteria for detecting deception and spot all instances of fraud.

What’s important in this process is finding a balance between fraud identification and instances of false positives. If your program is overzealous, it might create more work for your agents, forcing them to prove that legitimate claims received an incorrect label. Yet when the machine learning model is optimized, it can review a multitude of dimensions to identify the likelihood of fraudulent claims. That way, if an insurance claim is called into question, adjusters can comb through the data to determine if the claim should truly be rejected or if the red flags have a valid explanation.

Detecting New Strategies

The ability of analytics tools to detect known instances of fraud is only the beginning of their full potential. As with any type of crime, insurance fraud evolves with technology, regulations, and innovation. With that transformation comes new strategies to outwit or deceive insurance companies.

One recent example has emerged through automation. When insurance organizations began to implement straight through processing (STP) in their claim approvals, the goal was to issue remittances more quickly, easily, and cheaply than manual processes. For a time, this approach provided a net positive, but once organized fraudsters caught wind of this practice, they pounced on a new opportunity to deceive insurers.

Criminals learned to game the system, identifying amounts that were below the threshold for investigation and flying their fraudulent claims under the radar. In many cases, instances of fraud could potentially double without the proper tools to detect these new deception strategies. Though most organizations plan to enhance their anti-fraud technology, there’s still the potential for them to lose millions in errant claims – if their insurance fraud analytics are not programmed to detect new patterns.

In addition to spotting red flags for common fraud occurrences, analytics programs need to be attuned to any abnormal similarities or unlikely statistical trends. Using cluster analysis, an organization can detect statistical outliers and meaningful patterns that reveal potential instances of fraud (such as suspiciously identical fraud claims).

Even beyond the above automation example, your organization can use data discovery to find hidden indicators of fraud and predict future incidents. Splitting claims data into various groups through a few parameters (such as region, physician, billing code, etc., in healthcare) can help in detecting unexpected correlations or warning signs for your automation process or even human adjusters to flag as fraud.

Safeguarding Personally Identifiable Information

As you work to improve your fraud detection, there’s one challenge all insurers face: protecting the personally identifiable information (PII) of policyholders while you analyze your data. The fines related to HIPAA violations can amount to $50,000 per violation, and other data privacy regulations can result in similarly steep fines. The good news is that insurance organizations can balance their fraud prediction and data discovery with security protocols if their data ecosystem is appropriately designed.

Maintaining data privacy compliance and effective insurance fraud analytics requires some maneuvering. Organizations that derive meaningful and accurate insight from their data must first bring all of their disparate data into a single source of truth. Yet, unless they also implement access control through a compliance-focused data governance strategy, there’s a risk of regulatory violations while conducting fraud analysis.

One way to limit your exposure is to create a data access layer that tokenizes the data, replacing any sensitive PII with unique identification symbols to keep data separate. Paired with clear data visualization capabilities, your adjusters and special investigation units can see clear-cut trends and evolving strategies without revealing individual claimants. From there, they can take their newfound insights into any red flag situation, saving your organization millions while reducing the threat of noncompliance.

Want to learn more about how the right analytics solutions can help you reduce your liability, issue more policies, and provide better customer service? Check out our insurance analytics solutions page for use cases that are transforming your industry.


What Real-Time Analytics Looks Like for Real-World Businesses

Real-time analytics. Streaming analytics. Predictive analytics. These buzzwords are thrown around in the business world without a clear-cut explanation of their full significance. Each approach to analytics presents its own distinct value (and challenges), but it’s tough for stakeholders to make the right call when the buzz borders on white noise.

Which data analytics solution fits your current needs? In this post, we aim to help businesses cut through the static and clarify modern analytics solutions by defining real-time analytics, sharing use cases, and providing an overview of the players in the space.

TL;DR

  • Real-time or streaming analytics allows businesses to analyze complex data as it’s ingested and gain insights while it’s still fresh and relevant.
  • Real-time analytics has a wide variety of uses, from preventative maintenance and real-time insurance underwriting to improving preventive medicine and detecting sepsis faster.
  • To get the full benefits of real-time analytics, you need the right tools and a solid data strategy foundation.

What is Real-Time Analytics?

In a nutshell, real-time or streaming analysis allows businesses to access data within seconds or minutes of ingestion to encourage faster and better decision-making. Unlike batch analysis, data points are fresh and findings remain topical. Your users can respond to the latest insight without delay.

Yet speed isn’t the sole advantage of real-time analytics. The right solution is equipped to handle high volumes of complex data and still yield insight at blistering speeds. In short, you can conduct big data analysis at faster rates, mobilizing terabytes of information to allow you to strike while the iron is hot and extract the best insight from your reports. Best of all, you can combine real-time needs with scheduled batch loads to deliver a top-tier hybrid solution.

Stream Analytics Overview Courtesy of Microsoft

Streaming Analytics in Action

How does the hype translate into real-world results? Depending on your industry, there is a wide variety of examples you can pursue. Here are just a few that we’ve seen in action:

Next-Level Preventative Maintenance

Factories hinge on a complex web of equipment and machinery working for hours on end to meet the demand for their products. Through defects or standard wear and tear, a breakdown can occur and bring production to a screeching halt. Connected devices and IoT sensors now provide technicians and plant managers with warnings – but only if they have the real-time analytics tools to sound the alarm.

Azure Stream Analytics is one such example. You can use Microsoft’s analytics engine to monitor multiple IoT devices and gather near-real-time analytical intelligence. When a part needs a replacement or it’s time for routine preventative maintenance, your organization can schedule upkeep with minimal disruption. Historical results can be saved and integrated with other line-of-business data to cast a wider net on the value of this telemetry data.

Real-Time Insurance Underwriting

Insurance underwriting is undergoing major changes thanks to the gig economy. Rideshare drivers need flexibility from their auto insurance provider in the form of modified commercial coverage for short-term driving periods. Insurance agencies prepared to offer flexible micro policies that reflect real-time customer usage have the opportunity to increase revenue and customer satisfaction.

In fact, one of our clients saw the value of harnessing real-time big data analysis but lacked the ability to consolidate and evaluate their high-volume data. By partnering with our team, they were able to create real-time reports that pulled from a variety of sources ranging from driving conditions to driver ride-sharing scores. With that knowledge, they’ve been able to tailor their micro policies and enhance their predictive analytics.

Healthcare Analytics

How about this? Real-time analytics saves lives. Death by sepsis, an excessive immune response to infection that threatens the lives of 1.7 million Americans each year, is preventable when diagnosed in time. The majority of sepsis cases are not detected until manual chart reviews conducted during shift changes – at which point, the infection has often already compromised the bloodstream and/or vital tissues. However, if healthcare providers identified warning signs and alerted clinicians in real time, they could save multitudes of people before infections spread beyond treatment.

HCA Healthcare, a Nashville-based healthcare provider, undertook a real-time healthcare analytics project with that exact goal in mind. They created a platform that collects and analyzes clinical data from a unified data infrastructure to enable up-to-the-minute sepsis diagnoses. Gathering and analyzing petabytes of unstructured data in a flash, they are now able to get a 20-hour early warning sign that a patient is at risk of sepsis. Faster diagnoses results in faster and more effective treatment.

That’s only the tip of the iceberg. For organizations in the healthcare payer space, real-time analytics has the potential to improve member preventive healthcare. Once again, real-time data from smart wearables, combined with patient medical history, can provide healthcare payers with information about their members’ health metrics. Some industry leaders even propose that payers incentivize members to make measurable healthy lifestyle choices, lowering costs for both parties at the same time.

Getting Started with Real-Time Analysis

There’s clear value produced by real-time analytics but only with the proper tools and strategy in place. Otherwise, powerful insight is left to rot on the vine and your overall performance is hampered in the process. If you’re interested in exploring real-time analytics for your organization, contact us for an analytics strategy session. In this session lasting 2-4 hours, we’ll review your current state and goals before outlining the tools and strategy needed to help you achieve those goals.


Supply Chain Industry Using Predictive Analytics to Boost Their Competitive Edge

Professionals in the supply chain industry need uncanny reflexes. The moment they get a handle on raw materials, labor expenses, international legislation, and shipping conditions, the ground shifts beneath them and all the effort they put into pushing their boulder up the hill comes undone. With the global nature of today’s supply chain environment, the factors governing your bottom line are exceptionally unpredictable. Fortunately, there’s a solution for this problem: predictive analytics for supply chain management.

This particular branch of analytics offers an opportunity for organizations to anticipate challenges before they happen. Sounds like an indisputable advantage, yet only 30% of supply chain professionals are using their data to forecast their future.

Want to improve your supply chain operations and better understand your customer’s behavior? Learn about our demand forecasting data science starter kit.

Though most of the stragglers plan to implement predictive analytics in the next 10 years, they are missing incredible opportunities in the meantime. Here are some of the competitive advantages companies are missing when they choose to ignore predictive operational analytics.

Enhanced Demand Forecasting

How do you routinely hit a moving goalpost? As part of an increasingly complex global system, supply chain leaders are faced with an increasing array of expected and unexpected sales drivers from which they are pressured to determine accurate predictions about future demand. Though traditional demand forecasting yields some insight from a single variable or small dataset, real-world supply chain forecasting requires tools that are capable of anticipating demand based on a messy, multifaceted assembly of key motivators. Otherwise, they risk regular profit losses as a result of the bullwhip effect, buying far more products or raw materials than are necessary.

For instance, one of our clients, an international manufacturer, struggled to make accurate predictions about future demand using traditional forecasting models. Their dependence on the historical sales data of individual SKUs, longer order lead times, and lack of seasonal trends hindered their ability to derive useful insight and resulted in lost profits. By implementing machine learning models and statistical packages within their organization, we were able to help them evaluate the impact of various influencers on the demand of each product. As a result, our client was able to achieve an 8% increase in weekly demand forecast accuracy and 12% increase in monthly demand forecast accuracy.

This practice can be carried across the supply chain in any organization, whether your demand is relatively predictable with minor spikes or inordinately complex. The right predictive analytics platform can clarify the patterns and motivations behind complex systems to help you to create a steady supply of products without expensive surpluses.

Smarter Risk Management

The modern supply chain is a precise yet delicate machine. The procurement of raw materials and components from a decentralized and global network has the potential to cut costs and increase efficiencies – as long as the entire process is operating perfectly. Any type of disruption or bottleneck in the supply chain can create a massive liability, threatening both customer satisfaction and the bottom line. When organizations leave their fate up to reactive risk management practices, these disruptions are especially steep.

Predictive risk management allows organizations to audit each component or process within their supply chain for its potential to destabilize operations. For example, if your organization currently imports raw materials such as copper from Chile, predictive risk management would account for the threat of common Chilean natural disasters such as flooding or earthquakes. That same logic applies to any country or point of origin for your raw materials.

You can evaluate the cost and processes of normal operations and how new potentialities would impact your business. Though you can’t prepare for every possible one of these black swan events, you can have contingencies in place to mitigate losses and maintain your supply chain flow.

Formalized Process Improvement

As with any industry facing internal and external pressures to pioneer new efficiencies, the supply chain industry cannot rely on happenstance to evolve. There needs to be a twofold solution in place. One, there needs to be a culture of continuous organizational improvement across the business. Two, there need to be apparatuses and tools in place to identify opportunities and take meaningful action.

For the second part, one of the most effective tools is predictive analytics for supply chain management. Machine learning algorithms are exceptional at unearthing inefficiencies or bottlenecks, giving stakeholders the fodder to make informed decisions. Because predictive analytics removes most of the grunt work and exploration associated with process improvement, it’s easier to create a standardized system of seeking out greater efficiencies. Finding new improvements is almost automatic.

Ordering is an area that offers plenty of opportunities for improvement. If there is an established relationship with an individual customer (be it retailer, wholesaler, distributor, or the direct consumer), your organization has stockpiles of information on individual and demographic customer behavior. This data can in turn be leveraged alongside other internal and third-party data sources to anticipate product orders before they’re made. This type of ordering can accelerate revenue generation, increase customer satisfaction, and streamline shipping and marketing costs.


3 Benefits of Machine Learning for Retail Benefits

You already know that data is a gateway for retailers to improve customer experiences and increase sales. Through traditional analysis, we’ve been able to combine a customer’s purchase history with their browser behavior and email open rates to help pinpoint their current preferences and meet their precise future needs. Yet the new wave of buzzwords such as “machine learning” and “AI” promise greater accuracy and personalization in your forecasts and the marketing actions they inform.

What distinguishes the latest predictive analytics technology from the traditional analytics approach? Here are three of the numerous examples of this technology’s impact on addressing retail challenges and achieving substantial ROI.

Want better dashboards? Our data and analytics experts are here to help. Learn more about our data visualization starter pack.

1. Increase customer lifetime value.

Repeat customers contribute to 40% of a brand’s revenue. But how do you know where to invest your marketing dollars to increase your customer return rate? All of this comes down to predicting which customers are most likely to return and factors that influence the highest customer lifetime value (CLV) for these customers, which are both great use cases for machine learning.

Consider this example: Your customer is purchasing a 4K HD TV and you want to predict future purchases. Will this customer want HD accessories, gaming systems, or an upgraded TV in the near future? If they are forecasted to buy more, which approach will work to increase their chances of making the purchase through you? Predictive analytics can provide the answer.

One of the primary opportunities is to create more personalized sales process without mind-boggling manual effort. The sophistication of machine learning algorithms allows you to quickly review large inputs on purchase histories, internet and social media behavior, customer feedback, production costs, product specifications, market research, and other data sources with accuracy.

Historically, data science teams had to run one machine-learning algorithm at a time. Now, modern solutions from providers like DataRobot allows a user to run hundreds of algorithms at once and even identify the most applicable ones. This vastly increases the time-to-market and focuses your expensive data science team’s hours on interpreting results rather than just laying groundwork for the real work to begin.

2. Attract new customers.

Retailers cannot depend on customer loyalty alone. HubSpot finds that consumer loyalty is eroding, with 55% of customers no longer trusting the companies they buy from. With long-running customers more susceptible to your competitors, it’s important to always expand your base. However, as new and established businesses vie for the same customer base, it also appears that customer acquisition costs have risen 50% in five years.

Machine learning tools like programmatic advertising offer a significant advantage. For those unfamiliar with the term, programmatic advertising is the automated buying and selling of digital ad space using intricate analytics. For example, if your business is attempting to target new customers, the algorithms within this tool can analyze data from your current customer segments, page context, and optimal viewing time to push a targeted ad to a prospect at the right moment.

Additionally, businesses are testing out propensity modeling to target consumers with the highest likelihood of customer conversion. Machine learning tools can score consumers in real time using data from CRMs, social media, e-commerce platforms, and other sources to identify the most promising customers. From there, your business can personalize their experience to better shepherd them through the sales funnel – even going as far as reducing cart abandon rates.

3. Automate touch points.

Often, machine learning is depicted as a way to eliminate a human workforce. But that’s a mischaracterization. Its greatest potential lies in augmenting your top performers, helping them automate routine processes to free up their time for creative projects or in-depth problem-solving.

For example, you can predict customer churn based on irregularities in buying behavior. Let’s say that a customer who regularly makes purchases every six weeks lapses from their routine for 12 weeks. A machine learning model can identify if their behavior is indicative of churn and flag customers likely not to return. Retailers can then layer these predictions with automated touch points such as sending a reminder about the customer’s favorite product – maybe even with a coupon – straight to their email to incentivize them to return.

How to Get Started

Though implementing machine learning can transform your business in many ways, your data needs to be in the right state before you can take action. That involves identifying a single customer across platforms, cleaning up the quality of your data, and identifying specific use cases for machine learning. With the right partner, you can not only make those preparations but rapidly reap the rewards of powering predictive analytics with machine learning.

Want to learn how the 2nd Watch team can apply machine learning to your business? Contact us now.


Benefits of a Data Vault Model for 3PLs and How It Can Help Drive Better Decisions

As many third-party logistics (3PL) companies transition to a data-driven approach, it’s essential to underscore the importance of your data management practices. The way you choose to organize and store data impacts everything from how fast you can access information to which metrics are available. Many data-forward 3PL companies have begun implementing a data vault model to address this strategic decision. The data vault model allows them to address industry-wide challenges such as disparate data, lack of visibility into what is happening, reworking of analytics when acquisitions occur, and slow retrieval or transfer of information.

To assist you in determining the best possible way to organize your data, we will outline the benefits of a data vault model for 3PLs and highlight four use cases to illustrate the benefits for better decision-making.

What is data vault?

A data vault model is known for its practice of separating your data’s primary keys, relationships, and attributes from each other. Let’s say you want to analyze which customers are moving the most loads through you. The relationship between the customer and the load would be stored in one table, while the details about each load and customer would be stored in two separate, but related tables.

Structuring data in this manner helps you account for changing relationships within your data and seamless integration of new data sources when acquisitions or business rules inevitably change. Additionally, it enables quicker data loading through parallel streams and automatically stores historical data. For more details on what a data vault model is and the benefits it provides, check out this blog by 2nd Watch.

Data vault makes it easier to build a data warehouse with accurate, centralized data

The built-in relationships between data vault entities (hubs, satellites, links) make it easier to build a data warehouse. Structuring your data model around flexible but integrated primary keys allows you to combine data from various source systems easily in your data warehouse. It helps you ensure the data loaded into your reporting is not duplicated or out of date.

A lack of a data governance strategy often means that reporting is inconsistent and inaccurate. It reduces executives’ visibility into departments throughout the organization and limits your ability to create effective reporting because data is disjointed. Implementing a data vault model inherently accounts for centralizing your source data and enforcing primary keys. This will not only allow you to offer better reporting to customers, but it has also been found that accurate data is key to shipping accuracy. A strong data warehouse will further your internal analytics abilities by unlocking dashboards that highlight key metrics from revenue to cost-per-pound or on-time performance.

Data vault models make it easy to add new data sources and update business rules without interrupting access to data

A data vault model enables you to centralize data from various sources, while still addressing their differences such as load frequency and metadata. This is accomplished by storing the primary keys for an entity in one table, then creating attribute tables (satellites) specific to separate source systems.

Under a traditional model, most of this data would be held in one table and would require changes to the table structures, and therefore interruptions to data in production, each time a new source system is added. A scalable data model, like data vault, allows you to quickly adjust data delivery and reporting if your customers expand to new markets or merge with another company. Not only will this satisfy your current customers, but it is additionally a quality many logistics companies seek when choosing a 3PL partner. Accommodating multiple source systems and implementing business rules flexibly is key for any 3PL company’s data solution.

Data vault models allow for parallel loading, which gets you and your customers access to data faster

Data vault separates its source systems and data components into different tables. In doing so, it eliminates dependencies within your data and allows for parallel loading, meaning that multiple tables can be loaded at once rather than in a sequence. Parallel loading dramatically reduces the time it takes to access refreshed data.

Many 3PL companies offer customers access to high-quality reporting. Implementing a data vault model to load data quicker allows customers to gain insights in near-real-time. Furthermore, key metrics such as order accuracy, return rates, and on-time shipping percentage rely on timely data. They either require you to respond to a problem or could become inaccurate if your data takes too long to load. The faster you access your data, the more time you have to address your insights. This ultimately enables you to increase your accuracy and on-time shipments, leading to more satisfied customers.

Data vault models automatically save historic data required for advanced analytics

Whether you are looking for more advanced forecasting or planning to implement machine learning analytics, you will need to rely on historical data. Satellite tables, mentioned previously, store attribute information. Each time a feature of an order, a shipment, an employee, etc., changes, it is recorded in a satellite table with a timestamp when the change occurred. The model tracks the same information for changing relationships. This data allows you to automatically tie larger events to specific attributes involved when the events occurred.

3PL companies without data vault models often lose this history of attributes and relationships. When they pursue initiatives to find nuanced trends within their data through advanced analytics, their implementation is roadblocked by the task of generating adequate data. Alternatively, 3PL companies with a data vault model are ready to hit the ground running. Having historical data at your fingertips makes you prepared for any advanced analytics strategy.

2nd Watch has vast experience integrating 3PL companies’ key financial and operational data into a centralized hub. This immediately enables quick, reliable, and holistic insights to internal stakeholders and customers. Furthermore, it lays the groundwork for advanced predictive analytics that allow your teams to proactively address key industry challenges, including late deliveries, volatile market rates, and equipment failure.

Reach out to 2nd Watch for assistance getting started with data vault or evaluating how it may fit in with your current data strategy.