So, you’ve been tasked with building an analytics dashboard. It’s tempting to jump into development straight away, but hold on a minute! There are numerous pitfalls that are easy to fall into and can ruin your plans for an attractive, useful dashboard. Here are five important principles for dashboard development to keep in mind every time you open up Power BI, Tableau, Looker, or any other BI tool.
1. Keep it focused and defined.
Before you start answering questions, you need to know exactly what you’re trying to find out. The starting point of almost any dashboarding project should be a whiteboarding session with the end users; the dashboard then becomes a collection of visuals built to answer their questions.
For every single visual you create, make sure you’re answering a specific question. Each graph needs to be intentional and purposeful, and it’s very important to have your KPIs clearly defined well before you start building. If you don’t include your stakeholders from the very beginning, you’ll almost certainly have a lot more reworking to do after initial production is complete.
2. A good data foundation is key.
Generating meaningful visualizations is nearly impossible without a good data foundation. Unclean data means holes and inconsistencies that must be patched further down the pipeline. Many BI tools have functions that can format and prepare your data and generate some level of relational modeling for building your visualizations. However, too much modeling and logic in the tool itself will lead to serious performance issues, and most BI tools aren’t specifically built with data wrangling in mind. A well-modeled semantic layer in a separate tool that handles all the necessary business logic is often essential for performance and governance.
Don’t undervalue the semantic layer!
The semantic layer is the step in preparation where the business logic is performed, joins are defined, and data is formatted from its raw form so it’s understandable and logical for users going forward. For Power BI users, for example, you would likely generate tabular models within SSAS. With a strong semantic layer in place before you even get to the BI tool, there will be little to no data management to be done in the tool itself. This means there is less processing the BI tool needs to handle and a much cleaner governance system.
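In practice, much of a semantic layer boils down to views like the one below: joins and business logic are defined once, upstream of the BI tool, and users see clean, friendly names. This is only an illustrative sketch — the table and column names are hypothetical:

```sql
-- Hypothetical semantic-layer view: joins raw tables, applies
-- business logic once, and exposes user-friendly names.
CREATE VIEW analytics.fct_orders AS
SELECT
    o.order_id,
    c.customer_name,
    o.order_date,
    o.quantity * o.unit_price               AS gross_revenue,
    o.quantity * o.unit_price - o.discount  AS net_revenue,  -- business logic lives here, not in the BI tool
    CASE WHEN o.status = 'X' THEN 'Cancelled' ELSE 'Active' END AS order_status
FROM raw.orders o
JOIN raw.customers c
  ON c.customer_id = o.customer_id;
```

Because every dashboard reads from the same view, “net revenue” means the same thing everywhere, and the BI tool never has to recompute the joins or the logic itself.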
In many BI tools, you can load in a raw dataset and have a functional dashboard in 10 minutes. However, building a semantic layer forces you to slow down and put some time in upfront for definition, development, and reflection about what the data is and what insights you’re trying to get for your business. This ensures you’re actually answering the right questions.
This is one of the many strengths of Looker, which is built specifically to handle the semantic layer as well as create visualizations. It forces you to define the logic in the tool itself before you start creating visuals.
It’s often tempting to skip the data prep steps in favor of putting out a finished product quickly, but remember: Your dashboard is only as good as the data underneath it.
3. PLEASE de-clutter.
Cluttered dashboards suffer from numerous, obvious problems, but there is one lesson many developers forget: Embrace white space! White space wants to be your friend. As in web development, trying to pack too many visuals into the same dashboard is a recipe for disaster. Edward Tufte addresses this with the “data-ink ratio” in his book The Visual Display of Quantitative Information, one of the first and most influential resources on data visualization.
Basically, just remove anything that isn’t essential, or move information that is important but not immediately relevant to a different page of the dashboard or report.
4. Think before using that overly complicated visual.
About to use a treemap to demonstrate relationships among three variables at once? What about a 3D, three-axis representation of sales? Most of the time: don’t. Visualizing data isn’t about making something flashy – it’s about creating something simple that someone can gain insight from at a glance. For almost any complex visualization, there is a simpler solution available, like splitting the graph into multiple, more focused graphs.
5. Keep your interface clean, understandable, and consistent.
In addition to keeping your data clean and your logic well-defined, make sure everything is understandable from start to finish and easy for end users to interpret. This starts with defining dimensions and measures logically and uniformly, and hiding excess or unused columns in the end product. A selection panel with 10 well-named column options is much easier to work with than one with 30, especially if end users will be doing their own exploration and alterations.
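One common way to hide excess columns and enforce uniform naming is a curated reporting view. As a minimal sketch (all names here are hypothetical), a wide source table can be narrowed to just the well-named columns end users actually need:

```sql
-- Hypothetical: expose a curated, well-named subset of columns
-- instead of the full 30-column source table.
CREATE VIEW reporting.customer_summary AS
SELECT
    cust_nm     AS customer_name,
    rgn_cd      AS region,
    ttl_rev_amt AS total_revenue,
    last_ord_dt AS last_order_date
FROM warehouse.customers_wide;  -- remaining columns stay hidden from the BI tool
```

Pointing the BI tool at the view rather than the raw table keeps the selection panel short and the naming consistent across every dashboard.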
You may notice a theme with most of these principles for dashboard development: Slow down and plan. It’s tempting to jump right into creating visuals, but never underestimate the value of planning and defining your steps first. Doing that will help ensure your dashboard is clean, consistent, and most important, valuable.
If you need help planning, implementing, or finding insights in your dashboards, the 2nd Watch team can help. Our certified consultants have the knowledge, training, and experience to help you drive the most value from your dashboard tool. Contact us today to learn about our data visualization starter pack.
Using a modern data warehouse, like Snowflake, can give your organization improved access to your data and dramatically improved analytics. When paired with a BI tool, like Tableau, or a data science platform, like Dataiku, you can gain even faster access to impactful insights that help your organization fuel innovation and drive business decisions.
In this post, we’ll provide a high-level overview of Snowflake, including a description of the tool, why you should use it, pros and cons, and complementary tools and technologies.
Overview of Snowflake
Snowflake was built from the ground up for the cloud, initially starting on AWS and scaling to Azure and GCP. With no servers to manage and near-unlimited scale in compute, Snowflake separates compute from storage and charges based on the size and length of time that compute clusters (known as “virtual warehouses”) are running queries.
Cross-cloud support lets organizations choose their preferred cloud provider
Dynamic compute scaling saves on cost
Micro-partitioned storage with automatic maintenance
Rapid auto-scaling of compute nodes allows for increased cost savings and high concurrency on demand
Built for massively parallel processing (MPP)
Optimized for read via a columnar backend
Ability to assign dedicated compute per team or workload, eliminating concurrency issues
High visibility into spend
Native support for JSON, XML, Avro, Parquet, and ORC semi-structured data formats
SnowSQL has only slight syntax differences from standard SQL
Introduction of Snowpark for Snowflake native development
Full visibility into queries executed, by whom, and how long they ran
Precision point-in-time restore available via “time-travel” feature
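Several of these features surface directly in Snowflake SQL. The following is an illustrative sketch — the warehouse, table, and column names are hypothetical:

```sql
-- Elastic compute: a virtual warehouse that suspends when idle
-- and scales out under concurrent load.
CREATE WAREHOUSE reporting_wh
  WAREHOUSE_SIZE    = 'SMALL'
  AUTO_SUSPEND      = 60     -- seconds of inactivity before suspending
  AUTO_RESUME       = TRUE
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 3;     -- multi-cluster scaling for high concurrency

-- Semi-structured data: JSON lands in a VARIANT column and is
-- queried with path notation, no upfront schema required.
SELECT payload:customer.id::STRING AS customer_id,
       payload:order.total::NUMBER AS order_total
FROM   raw_events;

-- Time travel: query a table as it existed 30 minutes ago.
SELECT * FROM orders AT (OFFSET => -60*30);
```

Because the warehouse suspends itself after 60 idle seconds and resumes on the next query, you pay only for the time compute is actually running.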
Why Use Snowflake
Decoupled from any single cloud vendor, Snowflake allows a true multi-cloud experience. You can deploy on Azure, AWS, GCP, or any combination of the three. With near-unlimited scale and minimal management, it offers a best-in-class data platform with a pay-for-what-you-use consumption model.
Pros of Snowflake
Allows for a multi-cloud experience built on top of existing AWS, Azure, or GCP resources, depending on your preferred platform
Easy implementation of security and role definitions for a less frustrating user experience and easier delineation of cost, while keeping data secure
Integrated ability to share data to partners or other consumers outside of an organization and supplement data with publicly available datasets within Snowflake
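Data sharing is done in SQL via secure shares. As a hedged sketch (the database, table, and account names below are hypothetical), sharing a table with a partner account looks roughly like this:

```sql
-- Share a table with another Snowflake account, no data copying required.
CREATE SHARE sales_share;
GRANT USAGE  ON DATABASE sales_db            TO SHARE sales_share;
GRANT USAGE  ON SCHEMA   sales_db.public     TO SHARE sales_share;
GRANT SELECT ON TABLE    sales_db.public.orders TO SHARE sales_share;

-- Make the share visible to the consumer account.
ALTER SHARE sales_share ADD ACCOUNTS = partner_account;
```

The consumer queries the shared data in place; because nothing is copied, the data they see is always current.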
Cons of Snowflake
Ecosystem of tooling continues to grow as adoption expands, but some features are not readily available
Due to the paradigm shift in a cloud-born architecture, taking full advantage of Snowflake’s advanced features requires a good understanding of cloud data architecture
Select Complementary Tools and Technologies for Snowflake
Azure Data Factory
We hope you found this high-level overview of Snowflake helpful. If you’re interested in learning more about Snowflake or other modern data warehouse tools like Amazon Redshift, Azure Synapse, and Google BigQuery, contact us to learn more.
To remain competitive, organizations are increasingly moving toward modern data warehouses, also known as cloud-based data warehouses or modern data platforms, instead of traditional on-premises systems. Modern data warehouses differ from traditional warehouses in the following ways:
There is no need to purchase physical hardware.
They are less complex to set up.
It is much easier to prototype and provide business value without having to build out the ETL processes right away.
They require no capital expenditure and only low operational expenditure.
It is quicker and less expensive to scale a modern data warehouse.
Modern cloud-based data warehouse architectures can typically perform complex analytical queries much faster because of how the data is stored and their use of massively parallel processing (MPP).
Modern data warehousing is a cost-effective way for companies to take advantage of the latest technology and architectures without the upfront cost to purchase, install, and configure the required hardware, software, and infrastructure.
Comparing Modern Data Warehousing Options
Infrastructure as a service (IaaS): Requires customers to install traditional data warehouse software on computers provided by a cloud provider (e.g., Azure, AWS, Google Cloud).
Platform as a service (PaaS): The cloud provider manages the hardware deployment, software installation, and software configuration. However, the customer is responsible for managing the environment, tuning queries, and optimizing the data warehouse software.
Software as a service (SaaS): The cloud provider supplies all hardware and software as part of its service and handles upgrades, security, availability, data protection, and optimization for you.
In all of the above scenarios, the tasks of purchasing, deploying, and configuring the hardware to support the data warehouse environment fall on the cloud provider instead of the customer.
IaaS, PaaS, and SaaS – What Is the Best Option for My Organization?
Infrastructure as a service (IaaS) is an instant computing infrastructure, provisioned and managed over the internet. It helps you avoid the expense and complexity of buying and managing your own physical servers and other data center infrastructure. In other words, if you’re prepared to buy the engine and build the car around it, the IaaS model may be for you.
In the scenario of platform as a service (PaaS), a cloud provider merely supplies the hardware and its traditional software via the cloud; the solution is likely to resemble its original, on-premises architecture and functionality. Many vendors offer a modern data warehouse that was originally designed and deployed for on-premises environments. One such technology is Amazon Redshift. Amazon acquired the rights to ParAccel’s technology, named it Redshift, and hosted it in the AWS cloud environment. Redshift is a highly successful modern data warehouse service. It is easy to instantiate a Redshift cluster in AWS, but then you need to handle all of the administrative tasks yourself.
You have to reclaim space after rows are deleted or updated (the “vacuuming” process in Redshift), manage capacity planning, provision compute and storage nodes, choose your distribution keys, and so on. Everything you had to do with ParAccel (or with any traditional architecture), you have to do with Redshift.
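To make this concrete, here is a sketch of the kind of routine Redshift maintenance that a built-for-cloud SaaS warehouse handles automatically (the table and column names are hypothetical):

```sql
-- Routine Redshift administration:
VACUUM FULL sales;   -- reclaim space from deleted/updated rows and re-sort
ANALYZE sales;       -- refresh the planner's statistics

-- Distribution and sort keys must be chosen up front at table creation:
CREATE TABLE sales (
    sale_id   BIGINT,
    store_id  INT,
    sale_date DATE
)
DISTKEY (store_id)    -- how rows are distributed across compute nodes
SORTKEY (sale_date);  -- physical sort order for range scans
```

Choosing these keys well requires knowing your query patterns in advance, and revisiting them later typically means rebuilding the table — exactly the class of work a fully managed SaaS architecture removes.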
Alternatively, a data warehouse built for the cloud with a true software as a service (SaaS) architecture lets the cloud provider include all hardware and software as part of its service, along with the work of managing them. One such technology, which requires no management and features separate compute, storage, and cloud services that can scale and change independently, is Snowflake. It differentiates itself from IaaS and PaaS cloud data warehouses because it was built from the ground up on cloud architecture.
All administrative tasks, tuning, patching, and management of the environment fall on the vendor. Unlike the IaaS and PaaS architectures we have seen in the market today, Snowflake uses a multi-cluster, shared-data architecture that essentially makes the administrative headache of maintaining these solutions go away. However, that doesn’t mean it’s the absolute right choice for your organization – that’s where an experienced consulting partner like 2nd Watch comes in.