When paired with a modern data warehouse, implementing a leading business intelligence (BI) tool, like Power BI, can help your organization use your data to gain powerful insights to help you make the strongest decisions for your business.
In this post, we’ll provide a high-level overview of Power BI, including a description of the tool, why you should use it, pros and cons, and easily integrated tools and technologies.
Overview of Power BI
Power BI is a financially attractive alternative to the likes of Tableau and Looker, and it’s easy to learn for developers and business users alike. Companies already relying heavily on Microsoft tools may want to add Power BI to their toolkit to facilitate faster and deeper analysis.
Why Use Power BI
Power BI is initially cost-effective, especially if you’re already part of the Microsoft ecosystem. After upfront technical user assistance, business users can easily self-serve to build data visualizations and gain insights. However, dashboard collaboration is limited.
Pros of Power BI
Pricing is more reasonable than other big-name options, though add-on features can bump up the cost.
Power BI offers row-level security.
Drillthrough pages are useful for in-depth data exploration and decluttering of reports, while cutting down query execution time and improving runtime.
Power BI documentation is replete with tutorials, samples, quickstarts, and concepts, in addition to strong community forums to answer technical questions.
Cons of Power BI
A desktop component is required in most instances.
When working with large datasets, developers will experience some slowdown as they customize and publish reports.
Row-level security can slow down system performance as Power BI has to query the backend and generate caching for each user role.
You need to be all-in on a Microsoft ecosystem to maximize the benefits of Power BI.
Select Complementary Tools and Technologies for Power BI
We hope you found this high-level overview of Power BI helpful. If you’re interested in learning more about Power BI or other leading BI tools, like Looker and Tableau, contact us to learn more.
To get your source data ingested and loaded, or for a deep dive into how you can build a fully automated data integration process in Snowflake on Azure, schedule a Snowflake whiteboarding session with our team of data architects.
If you have a Microsoft ecosystem but have been wanting to take advantage of more tools on the market, Snowflake on Azure means additional opportunities for you to upgrade your analytics platform while not throwing away the investment in and keeping the uniformity of your current environment.
For those who are not familiar, Snowflake is a cloud-based, massively parallel processing (MPP), columnar storage database. It’s a newer option for data warehousing that is set up for more efficient querying of large data volumes. It also consumes structured and semi-structured data in a way that traditional relational databases are not designed to do as effectively – think “big data” without the “big overhead.”
With this release, Microsoft companies can evaluate using Snowflake to increase the performance and flexibility of their analytics environment by adding it on top of their existing data integration process. To determine where Snowflake might be a good fit, our 2nd Watch consultants took a dive into where Snowflake could sit in our current and prospective Microsoft clients’ ecosystems.
Where does Snowflake fit in your current Azure environment?
Snowflake is best used as the home of an analytical layer (or dimensional model, for the more technical) that enables reporting. Think of this as the substitute for products like SQL Server Analysis Services (SSAS).
While we still recommend that you maintain a data integration hub and related process for all of your consuming application needs (i.e., sending consolidated and cleansed data back to each source system to keep them in sync), Snowflake can sit right at the end of that process. Because it’s optimized for read activities, it complements the needs of business intelligence tools like Power BI, Looker, Tableau, etc., making it faster for business users to grab the information they need via those tools. Integration and ETL are possible with many tools and services, including Azure Data Factory.
When would you want to use Snowflake?
There are two primary ideas behind Snowflake’s competitive advantage when it comes to data warehousing platforms: its automatic optimization of query execution and the hands-off nature of its maintenance.
We recommend Snowflake for two main use cases:
Developing a more efficient analytics platform
Creating a platform for flexible data discovery
Our clients that fit within the above use cases usually had:
Multiple business users – The more people you have querying your database at one time, the more the database has to be configured to handle that load so as to not lock up other processes. Traditional databases can be scaled up to handle larger reads (think of a query that produces data), but this takes a decent amount of time and effort to achieve, and they are more often optimized for writes (think of loading data into a table). In Snowflake, a user can spin up resources for just the one query, then spin it back down right after. This allows for a more modular use of higher power resources.
Lots of data – If you’ve ever tried to perform a huge query on your current system, you likely noticed a slowdown from your usual processing. Traditional databases are not as optimized for read activities as columnar databases are. This makes options like Snowflake more attractive to those performing heavier analytical queries on a regular basis.
“Database systems have traditionally optimized performance for write-intensive workloads. Recently, there has been renewed interest in architectures that optimize read performance by using column-oriented data representation and light-weight compression. This previous work has shown that under certain broad classes of workloads, column-based systems can outperform row-based systems. ” – MIT Computer Science and Artificial Intelligence Laboratory
A mix of structured and semi-structured data – Though many traditional databases offer options for consuming semi-structured data (e.g., JSON, XML, etc.), they aren’t optimized to do so. If you have a mix of structured and semi-structured data, options like Snowflake might be more efficient for your process.
What’s the tradeoff?
Snowflake can be a great option for clients who need a true analytical platform where they can perform data discovery or need to read out a lot of data at once. That said, the following are situations where an alternative to Snowflake might make more sense:
You need a data hub to keep your various applications in sync (application integration and API endpoints).
You aren’t trying to perform complex aggregations or data discovery.
You’re invested in an existing solution that doesn’t have many performance or management overhead concerns.
So what’s the gist?
Snowflake makes it easy to kick off a data warehouse project with its ease of use and start-small, scale-big approach. Despite this innovative architecture, we still recommend applying the same data warehousing fundamentals that you would have in the past.
Yes, Snowflake will allow you to create views on top of messy raw source data, but you will ultimately lose the battle on performance as your data grows in complexity. We suggest approaching a new Snowflake deployment with a vision of your data platform as a whole – not just the analytics layer. Performing all your complex business logic in Snowflake is possible, but due to the columnar architecture, integration use cases are better served in complementary cloud database platforms, such as Azure SQL.
With a roadmap and strategy in place for both your integration and analytics needs, you’ll be better able to select the right mix of tools and technologies for your solution. Learn more about Snowflake deployment best practices in our eBook.
Dealing with Windows patching can be a royal pain as you may know. At least once a month Windows machines are subject to system security and stability patches, thanks to Microsoft’s Patch Tuesday. With Windows 10 (and its derivatives), Microsoft has shifted towards more of a Continuous Delivery model in how it manages system patching. It is a welcome change, however, it still doesn’t guarantee that Windows patching won’t require a system reboot.
Rebooting an EC2 instance that is a member of an Auto Scaling Group (depending upon how you have your Auto Scaling health-check configured) is something that will typically cause an Elastic Load Balancing (ELB) HealthCheck failure and result in instance termination (this occurs when Auto Scaling notices that the instance is no longer reporting “in service” with the load balancer). Auto Scaling will of course replace the terminated instance with a new one, but the new instance will be launched using an image that is presumably unpatched, thus leaving your Windows servers vulnerable.
The next patch cycle will once again trigger a reboot and the vicious cycle continues. Furthermore, if the patching and reboots aren’t carefully coordinated, it could severely impact your application performance and availability (think multiple Auto Scaling Group members rebooting simultaneously). If you are running an earlier version of Windows OS (e.g. Windows Server 2012r2), rebooting at least once a month on Patch Tuesday is an almost certainty.
Another major problem with utilizing the AWS stock Windows AMIs with Auto Scaling is that AWS makes those AMIs unavailable after just a few months. This means that unless you update your Auto Scaling Launch Configuration to use the newer AMI IDs on a continual basis, future Auto Scaling instance launches will fail as they try to access an AMI that is no longer accessible. Anguish.
Automatically and Reliably Patch your Auto-Scaled Windows instances
Given the aforementioned scenario, how on earth are you supposed to automatically and reliably patch your Auto-Scaled Windows instances?!
One approach would be to write some sort of an orchestration layer that detects when Auto Scaling members have been patched and are awaiting their obligatory reboot, suspend Auto Scaling processes that would detect and replace perceived failed instances (e.g. HealthCheck), and then reboot the instances one-by-one. This would be rather painful to orchestrate and has a potentially severe drawback that cluster capacity is reduced by N-1 during the rebooting (maybe more if you don’t take into account service availability between reboots).
Reducing capacity to N-1 might not be a big deal if you have a cluster of 20 instances but if you are running a smaller cluster of something— say 4, 3, or 2 instances—then that has a significant impact to your overall cluster capacity. And, if you are running on an Auto Scaling group with a single instance (not as uncommon as you might think) then your application is completely down during the reboot of that single member. This of course doesn’t solve the issue of expired stock AWS AMIs.
Another approach is to maintain and patch a “golden image” that the Auto Scaling Launch Configuration uses to create new instances from. If you are unfamiliar with the term, a golden-image is an operating system image that has everything pre-installed, configured, and saved in a pre-baked image file (an AMI in the case of Amazon EC2). This approach requires a significant amount of work to make this happen in a reasonably automated fashion and has numerous potential pitfalls.
While it prevents a service outage by replacing the unavailable public AMI with a stock AMI, you still need a way to reliably and automatically handle this process. Using a tool like Hashicorp’s Packer can get you partially there, but you would still have to write a number of Providers to handle the installation of Windows Update and anything else you need to do in order to prep the system for imaging. In the end, you would still have to develop or employ a fair number of tools and processes to completely automate the entire process of detecting new Windows Updates, creating a patched AMI with those updates, and orchestrating the update of your Auto Scaling Groups.
A Cloud-Minded Approach
I believe that Auto Scaling Windows servers intelligently requires a paradigm shift. One assumption we have to make is that some form of configuration management (e.g. Puppet, Chef)—or at least a basic bootstrap script executed via cfn-init/UserData—is automating the configuration of the operating system, applications, and services upon instance launch. If configuration management or bootstrap scripts are not in play, then it is likely that a golden-image is being utilized. Without one of these two approaches, you don’t have true Auto Scaling because it would require some kind of human interaction to configure a server (ergo, not “auto”) every time a new instance was created.
Both approaches (launch-time configuration vs. golden-image) have their pros and cons. I generally prefer launch-time configuration as it allows for more flexibility, provides for better governance/compliance, and enables pushing changes dynamically. But…(and this is especially true of Windows servers) sometimes launch-time configuration simply takes longer to happen than is acceptable, and the golden-image approach must be used to allow for a more rapid deployment of new Auto Scaling group instances.
Either approach can be easily automated using a solution like to the one I am about to outline, and thankfully AWS publishes new stock Windows Server AMIs immediately following every Patch Tuesday. This means, if you aren’t using a golden-image, patching your instances is as simple as updating your Auto Scaling Launch Configuration to use the new AMI(s) and preforming a rolling replacement of the instances. Even if you are using a golden-image or applying some level of customization to the stock AMI, you can easily integrate Packer into the process to create a new patched image that includes your customizations.
At a high level, the solution can be summarized as:
An Orchestration Layer (e.g. AWS SNS and Lambda, Jenkins, AWS Step Functions) that detects and responds when new patched stock Windows AMIs have been released by Amazon.
A Packer Launcher process that manages launching Packer jobs in order to create custom AMIs. Note: This step is only required If copying AWS stock AMIs to your own AWS account is desired OR if you want to apply customization to the stock AMI. Either use case requires that the custom images are available indefinitely. We solved this problem by creating a Packer Launcher process by creating an EC2 instance with a Python UserData script that launches Packer jobs (in parallel) to create copies of the new stock AMIs into our AWS account. Note: if you are using something like Jenkins, this could be handled by having Jenkins launch a local script or even a Docker container to manage launching Packer jobs.
A New AMI Messaging Layer (e.g. Amazon SNS) to publish notifications when new/patched AMIs have been created
Some form of an Auto Scaling Group Rolling Updater will be required to replace exiting Auto Scaling Group instances with new ones based on the Patched AMI.
Great news for anyone using AWS CloudFormation… CFT inherently supports Rolling Updates for Auto Scaling Groups! Utilizing it requires attaching an UpdatePolicy and adding a UserData or cfn-init script to notify CloudFormation when the instance has finished its configuration and is reporting as healthy (e.g. InService on the ELB). There are some pretty good examples of how to accomplish this using CloudFormation out there, but here is one specifically that AWS provides as an example.
If you aren’t using CloudFormation, all hope is not lost. With Hashicorp Terraform’s ever increasing popularity for deploying and managing AWS infrastructure as code, Terraform has still yet to implement a Rolling Update feature for AWS Auto Scaling Groups. There is a Terraform feature request from a few years ago for this exact feature, but as of today, it is not yet available, nor do the Terraform developers have any short-term plans to implement it. However, several people (including Hashicorp’s own engineers) have developed a number of ways to work around the lack of an integrated Auto Scaling Group Rolling Updater in Terraform. Here are a few I like:
Of course, you can always roll your own solution using a combination of AWS services (e.g. SNS, Lambda, Step Functions), or whatever tooling best fits your needs. Creating your own solution will allow you added flexibility if you have additional requirements that can’t be met by CloudFormation, Terraform, or other orchestration tool.
The following is an example framework for performing automated Rolling Updates to Auto Scaling Groups utilizing AWS SNS and AWS Lambda:
a. An Auto Scaling Launch Config Modifier worker that subscribes to the New AMI messaging layer performs an update to the Auto Scaling Launch Configuration(s) when a new AMI is released. In this use case, we are using an AWS Lambda function to subscribe to an SNS topic. Upon notification of new AMIs, the worker must then update the predefined (or programmatically derived) Auto Scaling Launch Configurations to use the new AMI. This is best handled by using infrastructure templating tools like CloudFormation or Terraform to make updating the Auto Scaling Launch Configuration ImageId as simple as updating a parameter/variable in the template and performing an update/apply operation.
b. An Auto Scaling Group Instance Cycler messaging layer (again, an Amazon SNS topic) to be notified when an Auto Scaling Launch Configuration ImageId has been updated by the worker.
c. An Auto Scaling Group Instance Cycler worker that will perform replacing the Auto Scaling Group instances in a safe, reliable, and automated fashion. For example, another AWS Lambda function that will subscribe to the SNS topic and trigger new instances by increasing the Auto Scaling Desired Instance count to a value of twice the current number of ASG instances.
d. Once the scale-up event generated by the Auto Scaling Group Instance Cycler worker has completed and the new instances are reporting as healthy, another message will be published to the Auto Scaling Group Instance Cycler SNS topic indicating scale-up has completed.
e. The Auto Scaling Group Instance Cycler worker will respond to the prior event and return the Auto Scaling group back to its original size which will terminate the older instances leaving the Auto Scaling Group with only the patched instances launched from the updated AMI. This assumes that we are utilizing the default AWS Auto Scaling Termination Policy which ensures that instances launched from the oldest Launch Configurations are terminated first.
NOTE: The AWS Auto Scaling default termination policy will not guarantee that the older instances are terminated first! If the Auto Scaling Group is spanned across multiple Availability Zones (AZ) and there is an imbalance in the number of instances in each AZ, it will terminate the extra instance(s) in that AZ before terminating based on the oldest Launch Configuration. Terminating on Launch Configuration age will certainly ensure that the oldest instances will be replaced first. My recommendation is to use the OldestInstance termination policy to make absolutely certain that the oldest (i.e. unpatched) instances are terminated during the Instance Cycler scale-down process. Consult the AWS documentation on the Auto Scaling termination policies for more on this topic.
Whichever solution you choose to implement to handle the Rolling Updates to your Auto Scaling Group, the solution outlined above will provide you with a sure-fire way to ensure your Windows Auto Scaled servers are always patched automatically and minimize the operational overhead for ensuring patch compliance and server security. And the good news is that the heavy lifting is already being handled by AWS Auto Scaling and Hashicorp Packer. There is a bit of trickery to getting the Packer configs and provisioners working just right with the EC2 Config service and Windows Sysprep, but there are a number of good examples out on github to get you headed in the right direction. The one I referenced in building our solution can be found here.
One final word of caution... if you do not disable the EC2Config Set Computer Name option when baking a custom AMI, your Windows hostname will ALWAYS be reset to the EC2Config default upon reboot. This is especially problematic for configuration management tools like Puppet or Chef which may use the hostname as the SSL Client Certificate subject name (default behavior), or for deriving the system role/profile/configuration.
Here is my ec2config.ps1 Packer provisioner script which disables the Set Computer Name option:
Hopefully, at this point, you have a pretty good idea of how you can leverage existing software, tools, and services—combined with a bit of scripting and automation workflow—to reliably and automatically manage the patching of your Windows Auto Scaling Group EC2 instances! If you require additional assistance, are resource-bound for getting something implemented, or you would just like the proven Cloud experts to manage Automating Windows Patching of your EC2 Autoscaling Group Instances, contact 2nd Watch today!
We strongly advise that processes like the ones described in this article be performed on a environment prior to production to properly validate that the changes have not negatively affected your application’s functionality, performance, or availability.
This is something that your orchestration layer in the first step should be able to handle. This is also something that should integrate well with a Continual Integration and/or Delivery workflow.
-Ryan Kennedy, Principal Cloud Automation Architect, 2nd Watch
The outbreak of a virulent strain of ransomware, alternately known as WannaCry or WannaCrypt, is finally winding down. A form of malware, the WannaCry attack exploited certain vulnerabilities in Microsoft Windows and infected hundreds of thousands of Windows computers worldwide. As the dust begins to settle, the conversation inevitably turns to what could have been done to prevent it.
The first observation is that most organizations could have been protected simply by following best practices—most notably, the regular installation of known security and critical patches that help to minimize vulnerabilities. WannaCry was not an exotic “zero day” incident. The patch for the underlying vulnerabilities (MS17-010) has been available since March. Companies like 2nd Watch maintain a regular patch schedule to protect their systems from these and similar attacks. It should be noted that due to the prolific nature of this malware and the active attack vectors, 2nd Watch is requiring that all Windows systems must be patched by 5/31/2017.
Other best practices include:
Maintaining support contracts for out-of-date operating systems
Enabling firewalls, in addition to intrusion detection and prevention systems
Proactively monitoring and validating traffic going in and out of the network
Implementing security mechanisms for other points of entry attackers can use, such as email and websites
Deploying application control to prevent suspicious files from executing in addition to behavior monitoring that can thwart unwanted modifications to the system
Employing data categorization and network segmentation to mitigate further exposure and damage to data
Backing up important data. This is the single, most effective way of combating ransomware infection. However, organizations should ensure that backups are appropriately protected or stored off-line so that attackers can’t delete them.
The importance of regularly scheduled patching and keeping systems up-to-date cannot be overemphasized. It may not be sexy, but it is highly effective.
All of these recommendations seem simple enough, but why did the outbreak spread so quickly if the vulnerabilities were known and patches were readily available? It spread because the patches were released for currently supported systems, but the vulnerability has been present in all versions of Windows dating back to Windows XP. For these older systems – no longer supported by Microsoft but still widely used – the patches weren’t there in the first place. One of the highest profile victims, Britain’s National Health Service, discovered that 90 percent of NHS trusts run at least one Windows XP device, an operating system Microsoft first introduced in 2001 and hasn’t supported since 2014. In fact, it was only because of the high-profile nature of this malware that Microsoft took the rare step this week of publishing a patch for Windows XP, Windows Server 2003 and Windows 8.
This brings us to the challenging topic of “technical debt”—the extra cost and effort to continue using older technology. The WannaCry/WannaCrypt outbreak is simply the most recent teachable moment about those costs.
A big benefit of moving to cloud computing is its ability to help rid one’s organization of technical debt. By migrating workloads into the cloud, and even better, by evolving those workloads into modern, cloud-native architectures, the issue of supporting older servers and operating systems is minimized. As Gartner pointed out in the 2017 Gartner Magic Quadrant for Public Cloud Infrastructure Managed Service Providers, Worldwide, through 2018, the cloud managed service market will remain relatively immature, and more than 75% of fully successful implementations will be delivered by highly skilled, forward-looking, boutique managed service providers with a cloud-native, DevOps-centric service delivery approach, like 2nd Watch. A free download of the report can be found here.
Partners like 2nd Watch can also help reduce your overall management cost by tailoring solutions to manage your infrastructure in the cloud. The best practices mentioned above can be automated in many environments– regular patching, resource isolation, traffic monitoring, etc. – are all done for you so you can focus on your business.
Even more important, companies like 2nd Watch help ensure the ongoing optimization of your workloads, both from a cost and a performance point of view. The life-cycle of optimization and modernization of your cloud environments is perhaps the single grea mechanism to ensure that you never take on and retain high levels of technical debt.