An Introduction to AWS Proton

As a business scales, so does its software and infrastructure. As desired outcomes evolve and become more complex, platform teams can quickly accumulate overhead that is difficult to manage over time, and these challenges often limit the benefits of embracing containers and serverless. Shared services offer many advantages in these scenarios, providing a consistent developer experience while increasing productivity and making governance and cost management more effective.

In December 2020, Amazon Web Services announced the general availability of AWS Proton: a service aimed at providing tooling to manage complex environments while bridging infrastructure provisioning and deployment for developers. In this blog we take a closer look at the benefits of the AWS Proton service offering.

What is AWS Proton?

AWS Proton is a fully managed delivery service, targeted at container and serverless workloads, that gives engineering teams the tooling to automate provisioning and deployments while providing observability and enforcing compliance and best practices. With AWS Proton, development teams consume standardized infrastructure resources and use them to deploy their code. This increases developer productivity by letting them focus on their code and software delivery, reduces management overhead, and increases release frequency. Teams can use AWS Proton through the AWS Console and the AWS CLI, allowing them to get started quickly and automate complicated operations over time.

How does it work?

The AWS Proton framework allows administrators to define versioned templates which standardize infrastructure, enforce guardrails, leverage Infrastructure as Code with CloudFormation, and provide CI/CD with CodePipeline and CodeBuild to automate provisioning and deployments. Once service templates are defined, developers can choose a template and use it to deploy their software. As new code is released, the CI/CD pipelines automatically deploy the changes. Additionally, as new template versions are defined, AWS Proton provides a “one-click” interface which allows administrators to roll out infrastructure updates across all outdated template versions.

When is AWS Proton right for you?

AWS Proton is built for teams looking to centrally manage their cloud resources. The service interface is built for teams to provision, deploy, and monitor applications. AWS Proton is worth considering if you are using cloud-native services like serverless applications or if you utilize containers in AWS. The benefits grow when working with a service-oriented architecture, microservices, or distributed software, as it eases release management, reduces lead time, and creates an environment where teams operate within a set of rules with little to no additional overhead. AWS Proton is also a good option if you are looking to introduce Infrastructure as Code or CI/CD pipelines to new or even existing software, as AWS Proton supports linking existing resources.

Getting Started is easy!

Platform Administrators

Since AWS Proton itself is free and you only pay for the underlying resources, you are only a few steps away from giving it a try! First, a member of the platform infrastructure team creates an environment template. An environment defines infrastructure that is foundational to your applications and services, including compute, networking (VPCs), CodePipeline, security, and monitoring. Environments are defined via CloudFormation templates and use Jinja for parameters rather than the conventional Parameters section in standard CloudFormation templates. You can find template parameter examples in the AWS documentation. You can create, view, update, and manage your environment templates and their versions in the AWS Console.
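
If you prefer to script this step rather than use the Console, the following is a minimal sketch using boto3's Proton client. The bucket, key, and template names are illustrative, and the parameter names follow the Proton API but should be verified against the current boto3 documentation.

```python
# A minimal sketch of registering an environment template with boto3.
# All names are illustrative placeholders.
import boto3

proton = boto3.client("proton")

# Register the template itself (metadata only).
proton.create_environment_template(
    name="etl-environment",
    displayName="ETL Environment",
    description="VPC, ECS cluster, and shared monitoring for ETL services",
)

# Publish a version whose CloudFormation/Jinja bundle was uploaded to S3.
proton.create_environment_template_version(
    templateName="etl-environment",
    source={"s3": {"bucket": "my-proton-templates", "key": "etl-environment/v1.tar.gz"}},
)
```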

Once an environment template is created, the platform administrator creates a service template, which defines all resources that are logically related to a service. For example, if we had a container which performs some ETL, the service template could contain an ECR Repository, ECS Cluster, ECS Service Definition, ECS Task Definition, IAM roles, and the ETL source and target storage.

In another example, we could have an asynchronous Lambda function which performs some background tasks, along with its corresponding execution role. You could also consider using schema files for parameter validation! Like environment templates, you can create, view, update, and manage your service templates and their versions in the AWS Console.
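
A hedged sketch of registering a service template for the ETL container example above, under the same caveats (all names are illustrative and the exact parameters should be checked against the boto3 documentation):

```python
# A minimal sketch of registering a service template with boto3's Proton client.
import boto3

proton = boto3.client("proton")

proton.create_service_template(
    name="etl-service",
    displayName="ETL Container Service",
    description="ECR repo, ECS service/task definitions, IAM roles, and storage",
)

proton.create_service_template_version(
    templateName="etl-service",
    source={"s3": {"bucket": "my-proton-templates", "key": "etl-service/v1.tar.gz"}},
    # A service template version declares which environment templates it can deploy into.
    compatibleEnvironmentTemplates=[{"templateName": "etl-environment", "majorVersion": "1"}],
)
```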

Once the templates have been created, the platform administrator can publish the templates and provision the environment. Since services also include CI/CD pipelines, platform administrators should also configure repository connections by creating the GitHub app connector. This is done in the AWS Developer Tools service, or a link can be found on the AWS Proton page in the Console.

Once authorized, the GitHub app is automatically created and integrated with AWS, and CI/CD pipelines will automatically detect available connections during service configuration.
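
For teams that automate this step, the connection itself is created through the CodeStar Connections API in AWS Developer Tools; a minimal sketch, with an illustrative connection name:

```python
# A hedged sketch of creating the repository connection that Proton's pipelines use.
import boto3

codestar = boto3.client("codestar-connections")

response = codestar.create_connection(
    ProviderType="GitHub",
    ConnectionName="proton-github-connection",
)

# The connection starts in PENDING status; a platform administrator still has to
# finish the GitHub app authorization in the Console before pipelines can use it.
print(response["ConnectionArn"])
```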


At this point, platform administrators should see a stack that contains the environment’s resources. They can validate each resource and review interconnectivity, security, auditing, and operational readiness.

Developers

At this point, developers can choose which service template version they will use to deploy their service. Available services can be found in the AWS Console, and developers can review the template and requirements before deployment. Once they have selected the target template, they choose the repository that contains their service code, the GitHub app connection created by the platform administrator, and any parameters required by the service and CodePipeline.
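
For illustration, here is roughly what that selection looks like if a developer drives it through the Proton API instead of the Console. The spec, repository, and ARNs are placeholders, and the exact parameters should be confirmed against the boto3 documentation.

```python
# A hedged sketch of deploying a service from a published service template.
import boto3

proton = boto3.client("proton")

# Illustrative spec; the real input schema comes from the chosen service template.
spec = """
proton: ServiceSpec
instances:
  - name: etl-prod
    environment: etl-environment-prod
    spec:
      image_tag: latest
"""

proton.create_service(
    name="customer-etl",
    templateName="etl-service",
    templateMajorVersion="1",
    spec=spec,
    repositoryConnectionArn="arn:aws:codestar-connections:us-east-1:111122223333:connection/EXAMPLE",
    repositoryId="my-org/customer-etl",
    branchName="main",
)
```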

After some time, developers should be able to see their application stack in CloudFormation, their application’s CodePipeline resources, and the resources for their application accordingly!

In Closing

AWS Proton is a new and exciting service available for those looking to adopt Infrastructure as Code, enable CI/CD pipelines for their products, and enforce compliance, consistent standards, and best practices across their software and infrastructure. Here we explored a simple use case, but real-world scenarios likely require a more thorough examination and implementation.

AWS Proton may require a transition for teams that already utilize IaC, CI/CD, or that have created processes to centrally manage their platform infrastructure. 2nd Watch has over 10 years’ experience in helping companies move to the cloud and implement shared services platforms to simplify modern cloud operations. Start a conversation with a solution expert from 2nd Watch today and together we will assess and create a plan built for your goals and targets!

-Isaiah Grant, Cloud Consultant

The Most Popular and Fastest-Growing AWS Products of 2021

Enterprise IT departments are increasing cloud usage at an exponential rate. These tools and technologies enable greater innovation, cost savings, flexibility, productivity, and faster time to market, ultimately facilitating business modernization and transformation.

Amazon Web Services (AWS) is a leader among IaaS vendors, and every year around this time, we look back at the most popular AWS products of the past year, based on the percentage of 2nd Watch clients using them. We also evaluate the fastest-growing AWS products, based on how much spend our clients are putting towards various AWS products compared to the year before.

We’ve categorized the lists into the “100%s” and the “Up-and-Comers.” The 100%s are products that were used by all of our clients in 2020 – those products and services that are nearly universal and necessary in a basic cloud environment. The Up-and-Comers are the five fastest-growing products of the past year. We also highlight a few products that didn’t make either list but are noteworthy and worth watching.

12 Essential AWS Products

In 2020, there were 12 AWS products that were used by 100% of our client base:

  • AWS CloudTrail
  • AWS Key Management Service
  • AWS Lambda
  • AWS Secrets Manager
  • Amazon DynamoDB
  • Amazon Elastic Compute Cloud
  • Amazon Relational Database Service
  • Amazon Route 53
  • Amazon Simple Notification Service
  • Amazon Simple Queue Service
  • Amazon Simple Storage Service
  • Amazon CloudWatch

Why were these products so popular in 2020? For the most part, products that are universally adopted reflect the infrastructure that is required to run a modern AWS cloud footprint today.

Products in the 100%s club also demonstrate how AWS has made a strong commitment to the integration and extension of the cloud-native management tools stack, so external customers can have access to many of the same features and capabilities used in their own internal services and infrastructure.

AWS Trending Products and Services

The following AWS products were the fastest growing in 2020:

  • AWS Systems Manager
  • Amazon Transcribe
  • Amazon Comprehend
  • AWS Support BJS (Business)
  • AWS Security Hub

The fastest-growing products in 2020 seem to be squarely focused on digital applications in some form, whether text and speech analysis using machine learning (Comprehend and Transcribe) or protection of those applications and better security management overall (Security Hub). This is a bit of a change from 2019, when the fastest-growing products were focused on application orchestration (AWS Step Functions) or infrastructure topics with products like Cost Explorer, Key Management Service, or Container Service.

With a huge demand for data analytics and machine learning across enterprise organizations, utilizing services such as Comprehend and Transcribe allows you to gather insights into customer sentiment when examining customer reviews, support tickets, social media, etc. Businesses can use the services to extract key phrases, places, people, brands, or events, and, with the help of machine learning, gain an understanding of how positive or negative conversations were conducted. This provides a company with a lot of power to modify practices, offerings, and marketing messaging to enhance customer relationships and improve sentiment.
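
As a small, hedged illustration of that workflow, the snippet below runs a made-up customer review through Amazon Comprehend for sentiment and key phrases (boto3 credentials and region are assumed to be configured):

```python
# Analyze a single customer review with Amazon Comprehend.
import boto3

comprehend = boto3.client("comprehend")

review = "The checkout flow was confusing, but support resolved my issue quickly."

sentiment = comprehend.detect_sentiment(Text=review, LanguageCode="en")
phrases = comprehend.detect_key_phrases(Text=review, LanguageCode="en")

print(sentiment["Sentiment"])                      # e.g. MIXED
print([p["Text"] for p in phrases["KeyPhrases"]])  # key phrases extracted from the review
```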

Emerging Technology

The following products were new to our Most Popular list in 2020 and therefore are worth watching:

AWS X-Ray allows users to understand how their application and its underlying services are performing in order to identify and troubleshoot the root cause of performance issues and errors. One factor contributing to its rising popularity is the growth of distributed systems such as microservices, where traceability becomes increasingly important.

Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run. Increased use of Athena indicates more analysis is happening using a greater number of data sources, which signifies companies are becoming more data driven in their decision making.
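
A minimal sketch of that pattern, with placeholder database, table, and bucket names:

```python
# Run a standard SQL query over data in S3 with Athena and print the results.
import time
import boto3

athena = boto3.client("athena")

query = athena.start_query_execution(
    QueryString="SELECT region, COUNT(*) AS orders FROM sales_2020 GROUP BY region",
    QueryExecutionContext={"Database": "analytics"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
)

# Poll until the query finishes, then fetch the result set.
qid = query["QueryExecutionId"]
while athena.get_query_execution(QueryExecutionId=qid)["QueryExecution"]["Status"]["State"] in ("QUEUED", "RUNNING"):
    time.sleep(1)

for row in athena.get_query_results(QueryExecutionId=qid)["ResultSet"]["Rows"]:
    print([col.get("VarCharValue") for col in row["Data"]])
```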

A surge in the number of companies using EC2 Container Service and EC2 Container Registry demonstrates growing interest in containers and greater cloud maturity across the board. Companies are realizing the benefits of consistent/isolated environments, flexibility, better resource utilization, better automation and DevOps practices, and greater control of deployments and scaling.

Looking Ahead

For 2021, we expect there to be a continued focus on adoption of existing and new products focused on security, data, application modernization and cloud management. In our own client interactions, these are the constant topics of discussion and services engagements we are executing as part of cloud modernization across industries.

-Joey Yore, Principal Consultant

Takeaways from AWS re:Invent 2020

AWS re:Invent 2020 was a little different, to say the least. The line for the restroom was way shorter, if you entered the Tatonka Challenge you were guaranteed to win or at least have a really good shot at the trophy depending on how many of your family members joined in and how many wings your air fryer can handle, and the lines for shuttle buses were non-existent as the commute time between sessions was reduced from hours to seconds depending on how fast you can click. However, in typical AWS fashion, they made lemonade out of lemons and put on one of the best public cloud virtual events of the year.

Instead of the typical action-packed, sleepless week in Vegas, AWS broke it up into three weeks sprinkled with all of their major announcements throughout. Vendors set up breakout sessions and virtual booths to discuss solutions and products and hold one-on-one sessions with potential leads via chat and live demos. Hunters of the precious SWAG had to engage with vendors as well as participate in specific activities to obtain their various rewards. With all of the turmoil going on in the world, AWS was still able to announce over 140 new products and features at re:Invent 2020. Here are just a few of the highlights.

Week 1

For the first time ever, re:Invent was opened to the world free of charge and attracted over 500,000 participants. Andy Jassy’s overall keynote theme was centered around the customer driving innovation within AWS based on solving their needs. In part due to the pandemic, cloud adoption has accelerated this year and has fueled AWS’ continued growth.

AWS announced new compute innovations including macOS instances (literally integrating a Mac mini into a server chassis) as well as making tremendous investments in the processor space with their Graviton2 processors and Trainium chips. If you didn’t catch week 1, here’s what you missed:

Reinventing Compute:

  • EC2: macOS instances, with Intel/AMD/ARM/Graviton2 options
  • New C6g Graviton2 instances announced, with savings of almost 50%
  • Lower cost for AWS Inferentia, used by Alexa
  • Habana Gaudi-based EC2 instances: accelerator-based machine learning training instances
  • AWS Trainium: an AWS-designed ML chip used in EC2 and SageMaker

Reinventing Storage:

  • gp3 for EBS, allowing 4x peak throughput
  • io2 Block Express: the first SAN built for the cloud

The mindset of “100% in the cloud all the time” is slowly shifting to include new options for hybrid environments with the announcements of ECS Anywhere and EKS Anywhere, allowing customers to run their workloads in their own data centers. Taking it a step further is the announcement of Amazon Monitron, which uses machine learning to help predict failures in industrial equipment. Placing compute closer to the customer (edge computing) has become more important, especially as connectivity providers roll out 5G. To allow for this evolution, AWS has released AWS Wavelength. Also, additional options for Outposts (1U and 2U server sizes) have been released for customers not requiring a full cabinet of hardware.

Data science, AI, and machine learning have become front and center as customers continue to take advantage of cloud native technologies. Making the best use of your data and making it work for your business have been a huge focus this year. Some of the highlights include:

  • Amazon SageMaker Data Wrangler: Clean and aggregate data to prepare it for machine learning.
  • AWS Glue Elastic Views: Easily combine and replicate data from different data stores.
  • Amazon CodeGuru: Automate code reviews and identify your most expensive lines of code.
  • Amazon DevOps Guru: Automatically detect operational issues and recommend actions to fix them.
  • Amazon QuickSight Q: Ask any question in natural language and get answers in seconds.
  • Amazon Connect Wisdom: Reduce the time agents spend finding answers for customers.

AWS partner relationships continue to be a central focus as well, and this was highlighted by Doug Yeum in his keynote:

  • Cohesity DMaaS (Data Management as a Service) service announcement.
  • AWS SaaS Boost: Open source SaaS reference environment to accelerate traditional applications to SaaS on AWS.
  • AWS ISV Partner Path: More access to millions of active AWS customers with AWS field sellers globally.
  • Managed entitlements for AWS Marketplace: Automate 3rd party software license distribution and simplify entitlement tracking.
  • AWS Service Catalog App Registry: Define and associate resources to better manage applications.
  • AWS Energy Competency: Helping customers accelerate their transition to a more balanced and sustainable energy future.

Week 2

Kicking off week two was an infrastructure-specific deep dive with Peter DeSantis. Given my background in the data center space, I found his keynote to be extremely interesting, as I have noticed over the past few years that questions and conversations around how cloud services are actually provided are very common. Back before the “cloud” and even virtual machines existed, servers were deployed into data centers and enterprises ran their mission-critical workloads on them. Some companies deployed and managed their own physical infrastructure, and some outsourced the management of those environments to MSPs, but the overall principles have not changed over the years. Yes, your workloads run “in the cloud,” but behind that are still data centers housing servers, networking gear, storage, cooling, water chillers, power distribution, connectivity, etc.

AWS has taken those principles and scaled them to another level, focusing on redundancy and sustainability to ensure that, if built properly, their customers’ workloads have no single point of failure and can keep running should an outage occur. AWS has not only made strides in the disk storage and processor space, but they have also designed and integrated their own switching gear control systems and custom-designed, rack-installed UPS infrastructure.

These are items that users of the cloud don’t have to deal with, and that is one of the major selling points of moving to the cloud. You don’t have to worry about rack space, power, cooling, hardware purchases, maintenance contracts, and the list goes on and on. But rest assured that the man behind the curtain is very aware of these items and is taking best-in-class steps to ensure that the infrastructure behind the scenes is always on.

Next on the list was the machine learning Keynote with Swami Sivasubramanian. This was more of a deep dive into some of the announcements made by Andy Jassy in week one, and he did not disappoint. As customers continue the shift to cloud native, ML and AI services have become front and center in their Application Modernization journey. Out of the 250+ new products and product enhancements announced by AWS in 2020, most of those were centered around SageMaker and 11 other AI and ML products.

ML Frameworks and Infrastructure

AWS announced AWS Inferentia, a high-performance machine learning chip that powers EC2 Inf1 instances. Inferentia boasts 45% lower costs and 30% higher throughput than comparable GPU-based instances and helps Alexa achieve 25% lower end-to-end latency. AWS Trainium is another high-performance machine learning chip, with the most teraflops of compute power for ML, that enables a broader set of ML applications.

Amazon SageMaker

AWS had several announcements around Amazon SageMaker.

“Thus, we need a platform where the data scientist will be able to leverage his existing skills to engineer and study data, train and tune ML models and finally deploy the model as a web-service by dynamically provisioning the required hardware, orchestrating the entire flow and transition for execution with simple abstraction and provide a robust solution that can scale and meet demands elastically.” – Jojo John Moolayil, AWS AI Research Scientist

  • SageMaker Data Wrangler is a faster way to prepare data for ML without a single line of code.
  • SageMaker Clarify provides machine learning developers with greater visibility into their training data and models so they can identify and limit bias and explain predictions.
  • SageMaker Debugger helps identify bottlenecks, visualize system resources like GPU, CPU, I/O, memory and provides adjustment recommendations.

AI Services:

The most important take-away from this keynote is AWS’ goal of the democratization of machine learning, or the transparent embedding of ML functionality into other AWS services.

“The company’s overall aim is to enable machine learning to be embedded into most applications before the decade is out by making it accessible to more than just experts.” – Andy Jassy, AWS CEO

With that goal in mind, AWS announced Redshift ML, which imports trained models into the data warehouse and makes them accessible using standard SQL queries. Use SQL statements to create and train Amazon SageMaker machine learning models using your Redshift data and embed them directly in reports.
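
As an illustrative sketch of that workflow (cluster, table, role, and bucket names are placeholders, and the exact SQL options should be checked against the Redshift ML documentation):

```python
# Create and train a Redshift ML model from warehouse data using SQL,
# submitted here through the Redshift Data API.
import boto3

redshift_data = boto3.client("redshift-data")

create_model_sql = """
CREATE MODEL customer_churn
FROM (SELECT age, tenure_months, monthly_spend, churned FROM customer_activity)
TARGET churned
FUNCTION predict_churn
IAM_ROLE 'arn:aws:iam::111122223333:role/RedshiftMLRole'
SETTINGS (S3_BUCKET 'my-redshift-ml-bucket');
"""

redshift_data.execute_statement(
    ClusterIdentifier="analytics-cluster",
    Database="dev",
    SecretArn="arn:aws:secretsmanager:us-east-1:111122223333:secret:redshift-creds",
    Sql=create_model_sql,
)

# Once training completes, the model is just another SQL function, e.g.:
# SELECT customer_id, predict_churn(age, tenure_months, monthly_spend) FROM customer_activity;
```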

Aurora ML enables you to add ML-based predictions to applications via the familiar SQL programming language, so you don’t need to learn separate tools or have prior machine learning experience. It provides simple, optimized, and secure integration between Aurora and AWS ML services without having to build custom integrations or move data around.

Neptune ML brings predictions to their fully managed graph database service in the form of graph neural networks and the Deep Graph Library.

For companies involved with handling medical data, Amazon HealthLake is worth looking at. With built-in data query, search, and ML capabilities, you can seamlessly transform data to understand meaningful medical information at petabyte scale.

Week 3

Wrapping up the final week of re:Invent 2020 was Werner Vogels rocking his typical iconic t-shirt, however not announcing who would be playing at re:Play this year, unfortunately. Presenting from the Netherlands in the historic SugarCity factory, he masterfully wove in the story of transforming and adapting to external events. To say that COVID has impacted all aspects of our lives in 2020 would be an understatement, but when presented with challenges, innovators continue to find ways to overcome those obstacles.

Collaboration and remote working were beyond challenging for everyone in 2020. AWS CloudShell was announced to provide users browser-based access to AWS resources through the AWS Console and AWS CLI, along with 1GB of persistent storage, at no additional cost. In addition, enhancements to AWS Cloud9 were announced that enable users to develop, run, and debug code from a browser.

To help mitigate potential issues in the future, AWS announced the Fault Injection Simulator, which sounds like a load test on steroids built around chaos engineering. Chaos engineering pushes an application or environment to its limits to surface potential issues, bottlenecks, or failures before they reach production and affect end users.

Additionally, Werner focused on helping the community and sustainability. The pandemic has financially hurt millions of people and AWS has developed the re:Start program designed to help the unemployed develop new skills that will allow them to pursue new career paths.

In summary, AWS continues to dominate the public cloud market and rapidly innovates based on their customer requirements. We may not have been standing elbow to elbow with 60,000 of our closest friends, navigating the miles and miles of casino floors, or enjoying all of the surprises of re:Invent in-person this year, but AWS did a stellar job of bringing us together virtually. Hopefully in a year’s time, we will all be back together and enjoying the wonderful craziness that is AWS re:Invent, Vegas style.

-Jeff Collins, Optimization Product Manager, 2nd Watch

Catch our re:Invent Breakout Session ‘Reality Check: Moving the Data Lake from Storage to Strategic’

Watch our AWS re:Invent session ANT283-S “Reality Check: Moving the Data Lake from Storage to Strategic” for your chance to win a Sony PlayStation 5!

Many organizations have created data lakes to store both relational and non-relational data to enable faster decision making. All too often, these data lakes move from proof of concept to production and quickly become just another data repository, never becoming the strategic business dependency they were meant to be. Watch Reality Check: Moving Your Data Lake from Storage to Strategic to get a reality check on how the data lake management approaches you’ve employed have led to failure. Discover the steps needed to build strategic importance and restore data dependency, and learn how cloud native creates efficiency along with a long-term competitive advantage.

Winning is easy:

  1. Watch our breakout session, ‘ANT283-S Reality Check: Moving the Data Lake from Storage to Strategic​’
  2. Share what you learned from the session on social by 12/18
  3. Tag @2nd Watch, and you’ll be entered into the drawing held on 12/22!

WATCH NOW

Meet us at AWS re:Invent and Get Your Free 2nd Watch Sweatpants!

AWS re:Invent 2020 is off to a great virtual start, and we want to meet you here! Visit the 2nd Watch re:Invent Sponsor Page now through December 18 to speak with one of our cloud experts, watch our session “ANT283-S Reality Check: Moving the Data Lake from Storage to Strategic” (and don’t forget to comment on the session on social for your chance to win a Sony PlayStation 5), access a ton of downloadable content, and claim your free 2nd Watch re:Invent sweatpants.

Your Trusted Cloud Advisor

As a cloud native AWS Premier Partner, we orchestrate your cloud transformation from strategy to execution, fueling business growth. Our focus is on enabling accelerated cloud migration, application modernization, IT optimization and data engineering to facilitate true business transformation.

Keep up your Redshift Lake House Property Values

When you deployed Redshift a few years ago, your new data lake was going to allow your organization to make better, faster, more informed business decisions.  It would break down data silos allowing your Data Scientists to have greater access to all data sources, quickly, enabling them to be more efficient in delivering consumable data insights.

Now that some time has passed, though, there is a good chance your data lake may no longer be returning the value it initially did. It has turned into a catch-all for your data, maybe even a giant data mess, with your clusters filling up too quickly and forcing you to constantly delete data or scale up. Teams are blaming one another for consuming too many resources, even though they are split and shouldn’t be impacting one another. Slow queries have resulted from a table structure that was decided upon at deployment and no longer fits the business and the data you are generating today. All of this leaves your expensive data scientists and analysts less productive than when you initially deployed Redshift.

Keep in mind, though, that the Redshift you deployed a few years ago is not the same Redshift today.  We all know that AWS is continuously innovating, but over the last 2 years they have added more than 200 new features to Redshift that can address many of these problems, such as:

  • Utilizing AQUA nodes, which can deliver a 10x performance improvement
  • Refreshing instance families that can lower your overall spend
  • Federated query, which allows you to query across Redshift, S3, and relational database services to come up with aggregated data sets, which can then be put back into the data lakes to be consumed by other analytic services
  • Concurrency scaling, which automatically adds and removes capacity to handle unpredictable demand from thousands of concurrent users, so you do not take a performance hit
  • The ability to take advantage of machine learning with automatic workload management (WLM) to dynamically manage memory and concurrency, helping maximize query throughput

As a matter of fact, clients repeatedly tell us there have been so many innovations with Redshift that it’s hard for them to determine which ones will benefit them, let alone be aware of them all.

Having successfully deployed and maintained AWS Redshift for years here at 2nd Watch, we have packaged our best practice learnings to deliver the AWS Redshift Health Assessment.  The AWS Redshift Health Assessment is designed to ensure your Redshift Cluster is not inhibiting the productivity of your valuable and costly specialized resources.

At the end of our 2-3 week engagement, we deliver a lightweight prioritized roadmap of the best enhancements to be made to your Redshift cluster that will deliver immediate impact to your business.  We will look for ways to not only improve performance but also save you money where possible, as well as analyze your most important workloads to ensure you have an optimal table design deployed utilizing the appropriate and optimal Redshift features to get you the results you need.

AWS introduced the Lake House analogy to better describe what Redshift has become. A lake house is prime real estate that everyone wants because it gives you a view of something beautiful, with limitless opportunities for enjoyment. With the ability to use a common query or dashboard across your data warehouse and multiple data lakes, Redshift, like a lake house, provides you the beautiful sight of all your data and limitless possibilities. However, every lake house needs ongoing maintenance to ensure it brings you the enjoyment you desired when you first purchased it, and a lake house built with Redshift is no different.

Contact 2nd Watch today to maximize the value of your data, like you intended when you deployed Redshift.

-Rob Whelan, Data Engineering & Analytics Practice Manager

Amazon Redshift Stands Strong Despite Maintenance Challenges

AWS says Amazon Redshift is the world’s fastest cloud data warehouse, allowing customers to analyze petabytes of structured and semi-structured data at high speeds that allow for exploratory analysis. According to a 2018 Forrester report, Redshift is the most popular cloud data warehouse for enterprises.

To better understand how enterprises are using Redshift, 2nd Watch surveyed Redshift users at large companies. A majority of respondents (57%) said their Redshift implementation had delivered on corporate expectations, while another 26% said it had “somewhat” delivered.

With all the benefits Redshift enables, it’s no wonder tens of thousands of customers use it. Benefits like three times the performance of any cloud data warehouse or being 50% less expensive than all other cloud data warehouses make it an attractive service to Fortune 500 companies and startups alike, including McDonald’s, Lyft, Comcast, and Yelp, among others.

Overall Findings:

Despite its apparent success in the market, not all Redshift deployments have gone according to plan. 45% of respondents said queries stacking up in queues was a recurring problem in their Redshift deployment; 30% said some of their data analysts’ time was unproductive as a result of tuning Redshift queries; and 34% said queries were taking more than one minute to return results. Meanwhile, 33% said they were struggling to manage requests for permissions, and 25% said their Redshift costs were higher than anticipated.

Query and Queuing Learnings:

Queuing of queries is not a new problem. Redshift has a long-underutilized feature called Workload Management queues, or WLM. These queues are like different entrances to a baseball stadium. They all go to the same baseball game, but with different ways to get in. WLM queues divvy up compute and processing power among groups of users so no single “heavy” user ends up dominating the database and preventing others from accessing. It’s common to have queries stack up in the Default WLM queue. A better pattern is to have at least three or four different workload management queues:

  1. ETL processes
  2. Administration
  3. Ad hoc exploration
  4. Data loading and unloading
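
If you manage WLM manually, that queue split maps onto the cluster parameter group's wlm_json_configuration parameter. A hedged sketch follows, with illustrative concurrency and memory settings that should be validated against the Redshift WLM documentation before applying them to a real cluster:

```python
# Define four manual WLM queues and apply them to a cluster parameter group.
import json
import boto3

redshift = boto3.client("redshift")

wlm_queues = [
    {"query_group": ["etl"],   "query_concurrency": 3, "memory_percent_to_use": 35},
    {"query_group": ["admin"], "query_concurrency": 2, "memory_percent_to_use": 10},
    {"query_group": ["adhoc"], "query_concurrency": 5, "memory_percent_to_use": 25},
    {"query_group": ["load"],  "query_concurrency": 3, "memory_percent_to_use": 30},
]

redshift.modify_cluster_parameter_group(
    ParameterGroupName="analytics-wlm",
    Parameters=[{
        "ParameterName": "wlm_json_configuration",
        "ParameterValue": json.dumps(wlm_queues),
    }],
)
```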

As for time lost due to performance tuning, this is a tradeoff with Redshift: it is inexpensive on the compute side but takes some care and attention on the human side. Redshift is extremely high-performing when designed and implemented correctly for your use case. It’s common for Redshift users to design tables at the beginning of a data load, then not return to the design until there is a problem, after other data sets enter the warehouse. It’s a best practice to routinely run ANALYZE and have auto-vacuum turned on, and to know how your most common queries are structured, so you can sort tables accordingly.

If queries are taking a long time to run, you need to ask whether the latency is due to the heavy processing needs of the query, or if the tables are designed inefficiently with respect to the query. For example, if a query aggregates sales by date, but the timestamp for sales is not a sort key, the query planner might have to traverse many different tables just to make sure it has all the right data, therefore taking a long time. On the other hand, if your data is already nicely sorted but you have to aggregate terabytes of data into a single value, then waiting a minute or more for data is not unusual.
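
As a concrete illustration of that point, a sales table whose most common queries aggregate by date benefits from declaring the timestamp as its sort key. A minimal sketch with made-up table and column names, submitted through the Redshift Data API:

```python
# Create a table whose sort key matches the most common aggregation pattern.
import boto3

redshift_data = boto3.client("redshift-data")

ddl = """
CREATE TABLE sales (
    sale_id      BIGINT,
    customer_id  BIGINT,
    amount       DECIMAL(12, 2),
    sale_ts      TIMESTAMP
)
DISTKEY (customer_id)   -- co-locate rows commonly joined on customer_id
SORTKEY (sale_ts);      -- range-restricted scans for date aggregations
"""

redshift_data.execute_statement(
    ClusterIdentifier="analytics-cluster",
    Database="dev",
    SecretArn="arn:aws:secretsmanager:us-east-1:111122223333:secret:redshift-creds",
    Sql=ddl,
)
```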

Permissions

Some survey respondents mentioned that permissions were difficult to manage. There are several options for configuring access to Redshift. Some users create database users and groups internal to Redshift and manage authentication at the database level (for example, logging in via SQL Workbench). Others delegate permissions with an identity provider like Active Directory.

Implementation and Cost Savings

Enterprise IT directors are working to overcome their Redshift implementation challenges. 30% said they are rewriting queries, and 28% said they have compressed their data in S3 as part of a LakeHouse architecture. Query tuning was having the greatest impact on the performance of Redshift clusters.

When Redshift costs exceed the plan, it is a good practice to assess where the costs are coming from. Is it from storage, compute, or something else? Generally, if you are looking to save on Redshift spend, you should explore a LakeHouse architecture, which is a storage pattern that shifts data between S3 and your Redshift cluster. When you need lots of data for analysis, data is loaded into Redshift. When you don’t need that data anymore, it is moved back to S3 where storage is much cheaper. However, the tradeoff is that analysis is slower when data is in S3.
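
A hedged sketch of one half of that pattern: cold data is unloaded from the cluster to S3 in Parquet, where it can still be reached through Redshift Spectrum or Athena while freeing cluster storage for hot data. Names and the date cutoff are placeholders.

```python
# Unload historical rows from Redshift to S3 as Parquet.
import boto3

redshift_data = boto3.client("redshift-data")

unload_sql = """
UNLOAD ('SELECT * FROM sales WHERE sale_ts < ''2019-01-01''')
TO 's3://my-lakehouse-archive/sales/2018/'
IAM_ROLE 'arn:aws:iam::111122223333:role/RedshiftSpectrumRole'
FORMAT AS PARQUET;
"""

redshift_data.execute_statement(
    ClusterIdentifier="analytics-cluster",
    Database="dev",
    SecretArn="arn:aws:secretsmanager:us-east-1:111122223333:secret:redshift-creds",
    Sql=unload_sql,
)
```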

Another place to look for cost savings is in the instance size. It is possible to have over-provisioned your Redshift nodes. Look for metrics like CPU utilization; if it is consistently 25% or even 30% or lower, then you have too much headroom and might be over-provisioned.
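
One quick way to check that signal is to pull a couple of weeks of average CPU utilization from CloudWatch (the cluster name below is a placeholder):

```python
# Compute a 14-day average of Redshift CPU utilization from CloudWatch.
from datetime import datetime, timedelta, timezone
import boto3

cloudwatch = boto3.client("cloudwatch")

stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/Redshift",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "ClusterIdentifier", "Value": "analytics-cluster"}],
    StartTime=datetime.now(timezone.utc) - timedelta(days=14),
    EndTime=datetime.now(timezone.utc),
    Period=3600,  # hourly data points
    Statistics=["Average"],
)

datapoints = stats["Datapoints"]
avg = sum(d["Average"] for d in datapoints) / max(len(datapoints), 1)
print(f"14-day average CPU: {avg:.1f}%")  # consistently under ~30% suggests over-provisioning
```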

Popular Features

Challenges aside, enterprise IT directors seem to love Redshift. The top Redshift features, according to our survey, are query monitoring rules (cited by 44% of respondents), federated queries (35%), and custom-built ETL workflows (33%).

Query Monitoring Rules are custom rules that track bad or slow queries. Customers love Query Monitoring Rules because they are simple to write and give you great visibility into queries that will disrupt operations. You can choose obvious metrics like query_execution_time, or more subtle things like query_blocks_read, which would be a proxy for how much searching the query planner has to do to get data. Customers like these features because the reporting is central, and it frees them from having to manually check queries themselves.
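
For illustration, a rule along those lines looks roughly like the sketch below. QMR rules live inside the wlm_json_configuration parameter alongside the queue definitions; the field names follow the documented JSON shape but should be verified, and the thresholds are made up.

```python
# Build a QMR rule that logs long-running or scan-heavy queries.
import json

slow_query_rule = {
    "rule_name": "log_heavy_queries",
    "predicate": [
        {"metric_name": "query_execution_time", "operator": ">", "value": 60},
        {"metric_name": "query_blocks_read", "operator": ">", "value": 100000},
    ],
    "action": "log",  # documented actions also include hop and abort
}

# The rule rides along with a queue definition inside wlm_json_configuration.
adhoc_queue = {
    "query_group": ["adhoc"],
    "query_concurrency": 5,
    "rules": [slow_query_rule],
}

print(json.dumps(adhoc_queue, indent=2))
```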

Federated queries allow you to bring in live, external data to join with your internal Redshift data. You can query, for example, an RDS instance in the same SQL statement as a query against your Redshift cluster. This allows for dynamic and powerful analysis that normally would take many time-consuming steps to get the data in the same place.
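
A hedged sketch of that flow: an RDS PostgreSQL database is mapped in as an external schema, after which live rows can be joined against local Redshift tables in a single statement (endpoints, ARNs, and table names are placeholders):

```python
# Register an external schema for federated queries against RDS PostgreSQL.
import boto3

redshift_data = boto3.client("redshift-data")

setup_sql = """
CREATE EXTERNAL SCHEMA orders_live
FROM POSTGRES
DATABASE 'orders' SCHEMA 'public'
URI 'orders-db.cluster-example.us-east-1.rds.amazonaws.com'
IAM_ROLE 'arn:aws:iam::111122223333:role/RedshiftFederatedRole'
SECRET_ARN 'arn:aws:secretsmanager:us-east-1:111122223333:secret:orders-db-creds';
"""

resp = redshift_data.execute_statement(
    ClusterIdentifier="analytics-cluster",
    Database="dev",
    SecretArn="arn:aws:secretsmanager:us-east-1:111122223333:secret:redshift-creds",
    Sql=setup_sql,
)

# Once the schema exists (poll describe_statement on resp["Id"]), live RDS rows
# can be joined with local Redshift tables in one statement, for example:
#
#   SELECT c.segment, COUNT(o.order_id) AS live_orders
#   FROM customer_dim c            -- local Redshift table
#   JOIN orders_live.orders o      -- live RDS data via the external schema
#     ON o.customer_id = c.customer_id
#   GROUP BY c.segment;
```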

Finally, custom-built ETL workflows have become popular for several reasons. One, the sheer compute power sitting in Redshift makes it a very popular source for compute resources. Unused compute can be used for ongoing ETL. You would have to pay for this compute whether or not you use it. Two, and this is an interesting twist, Redshift has become a popular ETL tool because of its capabilities in processing SQL statements. Yes, ETL written in SQL has become popular, especially for complicated transformations and joins that would be cumbersome to write in Python, Scala, or Java.
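
As a small example of that pattern, the sketch below runs a nightly SQL transformation entirely inside Redshift via the Data API (table names are made up; scheduling the job, for example with EventBridge, is out of scope here):

```python
# Rebuild yesterday's reporting rows from staging tables using only SQL.
import boto3

redshift_data = boto3.client("redshift-data")

delete_sql = "DELETE FROM daily_revenue WHERE revenue_date = CURRENT_DATE - 1;"

insert_sql = """
INSERT INTO daily_revenue (revenue_date, region, gross_revenue, order_count)
SELECT
    TRUNC(o.ordered_at) AS revenue_date,
    c.region,
    SUM(o.amount)       AS gross_revenue,
    COUNT(*)            AS order_count
FROM stage_orders o
JOIN stage_customers c ON c.customer_id = o.customer_id
WHERE TRUNC(o.ordered_at) = CURRENT_DATE - 1
GROUP BY 1, 2;
"""

# batch_execute_statement runs the statements in order as a single transaction.
redshift_data.batch_execute_statement(
    ClusterIdentifier="analytics-cluster",
    Database="dev",
    SecretArn="arn:aws:secretsmanager:us-east-1:111122223333:secret:redshift-creds",
    Sqls=[delete_sql, insert_sql],
)
```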

Conclusion

Redshift’s place in the enterprise IT stack seems secure, though how IT departments use the solution will likely change over time – significantly, perhaps. The reason for persisting in all the maintenance tasks listed above is that Redshift is increasingly becoming the centerpiece of a data-driven analytics program. Data volume is not shrinking; it is always growing. If you take advantage of these performance features, you will make the most of your Redshift cluster and therefore your analytics program.

Download the infographic on our survey findings.

-Rob Whelan, Data Engineering & Analytics Practice Director


You’re on AWS. Now What? 5 Strategies to Increase Your Cloud’s Value

Now that you’ve migrated your applications to AWS, how can you take the value of being on the cloud to the next level? To provide guidance on next steps, here are 5 things you should consider to amplify the value of being on AWS.

You’re on AWS, now what? Five things you should consider now.

You migrated your applications to AWS for a reason. Maybe it was for the unlimited scalability, powerful computing capability, ease and flexibility of deployment, movement from CapEx to OpEx model, or maybe it was simply because the boss told you to. However you got there, you’re there. So, what’s next? How do you take advantage of your applications and data that reside in AWS? What should you be thinking about in terms of security and compliance? Here are 5 things you should consider in order to amplify the value of being on AWS:

  1. Create competitive advantage from your AWS data
  2. Accelerate application development
  3. Increase the security of your AWS environment
  4. Ensure cloud compliance
  5. Reduce cloud spend without reducing application deployment

Create competitive advantage from your data

You have a wealth of information in the form of your AWS datasets. Finding patterns and insights not just within these datasets, but across all datasets is key to using data analysis to your advantage. You need a modern, cloud-native data lake.

Data lakes, though, can be difficult to implement and require specialized, focused knowledge of data architecture. Utilizing a cloud expert can help you architect and deploy a data lake geared toward your specific business needs, whether it’s making better-informed decisions, speeding up a process, reducing costs or something else altogether.

Download this datasheet to learn more about transforming your data analytics processes into a flexible, scalable data lake.

Accelerate application development

If you arrived at AWS to take advantage of the rapid deployment of infrastructure to support development, you understand the power of bringing applications to market faster. Now may be the time to fully immerse your company in a DevOps transformation.

A DevOps Transformation involves adopting a set of cultural values and organizational practices that improve business outcomes by increasing collaboration and feedback between business stakeholders, Development, QA, IT Operations, and Security. This includes an evolution of your company culture, automation and tooling, processes, collaboration, measurement systems, and organizational structure—in short, things that cannot be accomplished through automation alone.

To learn more about DevOps transformation, download this free eBook about the Misconceptions and Challenges of DevOps Transformation.

Increase the security of your AWS environment

How do you know if your AWS environment is truly secure? You don’t, unless you deploy a comprehensive security assessment of your AWS environment that measures it against the latest industry standards and best practices. This type of review provides a list of vulnerabilities and actionable remediations, an evaluation of your Incident Response Policy, and a comprehensive consultation on the system issues that are causing these vulnerabilities.

To learn more, review this Cloud Security Rapid Review document and learn how to gain protection from immediate threats.

Ensure cloud compliance

Deploying and managing cloud infrastructure requires new skills, software and management to maintain regulatory compliances within your organization. Without the proper governance in place, organizations can be exposed to security vulnerabilities and potentially compromise confidential information.

A partner like 2nd Watch can be a great resource in this area. The 2nd Watch Compliance Assessment and Remediation service is designed to evaluate, monitor, auto-remediate, and report on compliance of your cloud infrastructure, assessing industry standard policies including CIS, GDPR, HIPAA, NIST, PCI-DSS, and SOC2.

Download this datasheet to learn more about our Compliance Assessment & Remediation service.

Reduce cloud spend without reducing application deployment

Need to get control of your cloud spend without reducing the value that cloud brings to your business? This is a common discussion we have with clients. To reduce your cloud spend without decreasing the benefits of your cloud environment, we recommend examining the Pillars of Cloud Cost Optimization to prevent over-expenditure and wasted investment. The pillars include:

  • Auto-parking and on-demand services
  • Cost models
  • Rightsizing
  • Instance family / VM type refresh
  • Addressing waste
  • Shadow IT

For organizations that incorporate cloud cost optimization into their cloud infrastructure management, significant savings can be found, especially in larger organizations with considerable cloud spend.

Download our A Holistic Approach to Cloud Cost Optimization eBook to learn more.

After you’ve migrated to AWS, the next logical step in ensuring IT satisfies corporate business objectives is knowing what’s next for your organization in the cloud. Moving to the cloud was the right decision then and can remain the right decision going forward. Implement any of the five recommendations and accelerate your organization forward.

-Michael Elliott, Sr Director of Product Marketing

What to Ask Yourself When Considering VMware Cloud on AWS

Deciding on the best cloud strategy for your business can be overwhelming, especially if you’re new to the cloud. If you’re considering VMware Cloud on AWS (VMC on AWS), ask yourself these questions to find out if it’s the best solution for your needs.

1. Is it cost-effective for your business?

VMware is a premium brand and if you’re just looking at the compute cost, it may seem out of budget. To get an accurate comparison, you need to evaluate the compute cost against the expenses incurred in an on-prem environment – real estate, line pull, hardware, software maintenance, headcount, management, upgrades, and travel costs. Because it can be difficult to estimate these operational costs ahead of implementation, VMware provides some tools to help.

  • Production Pricing Calculator: Post a roadmap of the features you need in the cloud, along with workload sizing to get a cost calculation, or post-sizing calculation, that includes software overhead.
  • Operations Manager in VMware: Get a granular estimate of the cost for a sub-segment of your workload using this VMware management tool. Best for larger organizations where workload has a bigger impact on costs.
  • Network Insight in VMware: Another VMware management tool, Network Insight tracks traffic flow, something often neglected when comparing on-prem and cloud costs.

2. Do you use proof of concept environments?

Proof of concept (POC) environments let you evaluate a product in your architecture and demonstrate its capabilities. As opposed to POCs on hardware, where someone has to unrack the hardware, unplug it, find the original box it came in, and ship it back once you’ve completed your trial, closing a POC with VMC on AWS takes as few as three clicks. This might not seem like a big deal, but it’s a huge time and resource saver for technicians. Additionally, it makes everyone more willing to try new products, ensuring your environment is best equipped for your business.

3. Do you want to add hosts easily?

Adding hosts to your environment increases computing and storage capacity. With a datacenter, you buy hardware based on an estimation of capacity alongside your budget. After getting a quote and a purchase order, it can take six months to get your hardware. Then you need to rack and stack it and depend on the datacenter guys to give you a report. Over the next three to five years, you amortize the cost of the hardware and your effort.

With VMC on AWS, you input how many hosts you want and nine minutes later, an additional host is added to the cluster. When you no longer need the host, you can turn it off and only be billed for the time it was used. This quick control over your capacity needs keeps costs low, productivity high, and resource use optimized.

4. Do you need disaster recovery?

Using VMC for disaster recovery (DR) is becoming more popular with larger companies and those needing virtual desktop infrastructure (VDI), failover, and burst capability. This allows you to get started on VMC without it being heavily utilized until you’re ready.

Smaller companies considering DR on VMC need to consider the size versus cost model to determine what’s best for them. If you’re doing a business continuity case using VMC as a pilot light, then you can layer on Site Recovery Manager (SRM), VMware’s DR solution, very easily. In fact, you may be able to use VMC on AWS for more than just DR, including cloud strategy, business continuity or the pilot light, and potentially bursting capability for your on-prem. When you can rely on one solution for multiple purposes, you save time and resources through simplicity and standardization.

5. Do you just want it to work?

Professionals outside of tech have one simple goal – they just need this stuff to run reliably. They need a solution that allows them to focus on their responsibilities, rather than navigating issues, set-up, and dealing with other distractions.

One of the best things about VMC on AWS is the hands-off, ‘set it and forget it’ capability. The hardware and the upgrades are no longer your concern. There’s no need to spend so much money, time, and effort reinventing the wheel. It’s the bill versus pay model and it can put a lot of people in your organization at ease.

Building your cloud strategy, determining what products to use, and creating the architecture is all unique to your individual company. Our VMware Cloud experts can help you navigate your options for the best long-term results. Contact Us to take the next step in your cloud journey.