When you deployed Redshift a few years ago, your new data lake was going to allow your organization to make better, faster, more informed business decisions. It would break down data silos, giving your data scientists quick access to all of your data sources and enabling them to deliver consumable data insights more efficiently.
Now that some time has passed, though, there is a good chance your data lake is no longer returning the value it once did. It has turned into a catch-all for your data, maybe even a giant data mess: clusters fill up too quickly, forcing you to constantly delete data or scale up. Teams blame one another for consuming too many resources, even though their workloads are split and shouldn't be impacting one another. Queries run slowly because the table structure chosen at initial deployment no longer fits the business and the data you generate today. All of this leaves your expensive data scientists and analysts less productive than when you first deployed Redshift.
Keep in mind, though, that the Redshift you deployed a few years ago is not the same Redshift today. We all know that AWS is continuously innovating, but over the last 2 years they have added more than 200 new features to Redshift that can address many of these problems, such as:
Utilizing AQUA (Advanced Query Accelerator) nodes, which can deliver up to a 10x performance improvement
Refreshing instance families that can lower your overall spend
Federated query, which allows you to query across Redshift, S3, and relational database services to build aggregated data sets, which can then be written back to the data lake for consumption by other analytics services
Concurrency scaling, which automatically adds and removes capacity to handle unpredictable demand from thousands of concurrent users, so you do not take a performance hit
The ability to take advantage of machine learning with automatic workload management (WLM) to dynamically manage memory and concurrency, helping maximize query throughput
As a matter of fact, clients repeatedly tell us there have been so many innovations in Redshift that it's hard for them to determine which ones will benefit them, let alone keep track of them all.
Having successfully deployed and maintained Amazon Redshift for years here at 2nd Watch, we have packaged our best-practice learnings into the AWS Redshift Health Assessment. The AWS Redshift Health Assessment is designed to ensure your Redshift cluster is not inhibiting the productivity of your valuable and costly specialized resources.
At the end of our 2-3 week engagement, we deliver a lightweight prioritized roadmap of the best enhancements to be made to your Redshift cluster that will deliver immediate impact to your business. We will look for ways to not only improve performance but also save you money where possible, as well as analyze your most important workloads to ensure you have an optimal table design deployed utilizing the appropriate and optimal Redshift features to get you the results you need.
AWS introduced the Lake House analogy to better describe what Redshift has become. A lake house is prime real estate that everyone wants because it gives you a view of something beautiful, with limitless opportunities for enjoyment. With the ability to use a common query or dashboard across your data warehouses and multiple data lakes, Redshift, like a lake house, provides you the beautiful sight of all your data and limitless possibilities. However, every lake house needs ongoing maintenance to ensure it brings you the enjoyment you desired when you first purchased it, and a lake house built on Redshift is no different.
Contact 2nd Watch today to maximize the value of your data, like you intended when you deployed Redshift.
-Rob Whelan, Data Engineering & Analytics Practice Manager
AWS says Amazon Redshift is the world’s fastest cloud data warehouse, allowing customers to analyze petabytes of structured and semi-structured data at high speeds that allow for exploratory analysis. According to a 2018 Forrester report, Redshift is the most popular cloud data warehouse for enterprises.
To better understand how enterprises are using Redshift, 2nd Watch surveyed Redshift users at large companies. A majority of respondents (57%) said their Redshift implementation had delivered on corporate expectations, while another 26% said it had “somewhat” delivered.
With all the benefits Redshift enables, it's no wonder tens of thousands of customers use it. AWS-reported benefits like up to three times the price performance of other cloud data warehouses, at up to 50% lower cost, make it an attractive service to Fortune 500 companies and startups alike, including McDonald's, Lyft, Comcast, and Yelp, among others.
Despite its apparent success in the market, not all Redshift deployments have gone according to plan. 45% of respondents said queries stacking up in queues was a recurring problem in their Redshift deployment; 30% said some of their data analysts' time was lost to tuning Redshift queries; and 34% said queries were taking more than one minute to return results. Meanwhile, 33% said they were struggling to manage requests for permissions, and 25% said their Redshift costs were higher than anticipated.
Query and Queuing Learnings:
Queuing of queries is not a new problem. Redshift has a long-underutilized feature called Workload Management (WLM) queues. These queues are like different entrances to a baseball stadium: they all lead to the same game, but through different ways in. WLM queues divvy up memory and concurrency among groups of users so no single "heavy" user ends up dominating the database and locking others out. It's common to have queries stack up in the default WLM queue. A better pattern is to have at least three or four different workload management queues:
Ad hoc exploration
Data loading and unloading
Scheduled reports and dashboards
Long-running transformations
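As a rough sketch of how a queue split like this is applied: WLM is configured as a JSON document on the cluster's parameter group (the `wlm_json_configuration` parameter, which can be set via boto3's `modify_cluster_parameter_group`). The queue-to-user-group mapping, concurrency, and memory splits below are illustrative assumptions, not tuned production values.

```python
import json

# Illustrative manual WLM configuration with separate queues for different
# workload types. User groups, concurrency, and memory splits are
# placeholder assumptions, not recommended values.
wlm_config = [
    {"user_group": ["analysts"], "query_concurrency": 5,
     "memory_percent_to_use": 20},   # ad hoc exploration
    {"user_group": ["etl"], "query_concurrency": 2,
     "memory_percent_to_use": 40},   # data loading and unloading
    {"user_group": ["bi"], "query_concurrency": 10,
     "memory_percent_to_use": 25},   # dashboards and reports
    {"query_concurrency": 1,
     "memory_percent_to_use": 15},   # default queue for everything else
]

# This JSON string is what would be supplied as the
# wlm_json_configuration parameter value.
print(json.dumps(wlm_config))
```

Keeping the default queue small forces workloads to land in a named queue, which makes it obvious when one group is starving the others.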
As for time lost due to performance tuning, this is a tradeoff with Redshift: it is inexpensive on the compute side but takes some care and attention on the human side. Redshift is extremely high-performing when designed and implemented correctly for your use case. It's common for Redshift users to design tables at the beginning of a data load, then not return to the design until there is a problem, after other data sets have entered the warehouse. It's a best practice to routinely run ANALYZE, keep automatic vacuuming turned on, and know how your most common queries are structured so you can sort tables accordingly.
If queries are taking a long time to run, you need to ask whether the latency is due to the heavy processing needs of the query, or whether the tables are designed inefficiently with respect to the query. For example, if a query aggregates sales by date, but the timestamp for sales is not a sort key, the query planner might have to scan many more blocks of the table just to make sure it has all the right data, therefore taking a long time. On the other hand, if your data is already nicely sorted but you have to aggregate terabytes of data into a single value, then waiting a minute or more for data is not unusual.
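To make the sort-key effect concrete, here is a toy simulation, not Redshift internals: Redshift keeps per-block min/max metadata (zone maps), so a date-range filter on a table sorted by date can skip most blocks, while the same filter on unsorted data must touch nearly all of them. Block layouts and dates are invented for illustration.

```python
from datetime import date

def blocks_scanned(blocks, start, end):
    """Count blocks whose [min, max] date range overlaps the filter."""
    return sum(1 for lo, hi in blocks if hi >= start and lo <= end)

# Sorted table: each block covers one narrow, contiguous month.
sorted_blocks = [(date(2020, m, 1), date(2020, m, 28)) for m in range(1, 13)]
# Unsorted table: every block spans nearly the whole year.
unsorted_blocks = [(date(2020, 1, 1), date(2020, 12, 28))] * 12

# Query: aggregate sales for June 2020.
q_start, q_end = date(2020, 6, 1), date(2020, 6, 30)
print(blocks_scanned(sorted_blocks, q_start, q_end))    # 1 block touched
print(blocks_scanned(unsorted_blocks, q_start, q_end))  # all 12 blocks touched
```

The same filter does twelve times the I/O on the unsorted layout, which is the kind of gap a well-chosen sort key closes.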
Some survey respondents mentioned that permissions were difficult to manage. There are several options for configuring access to Redshift. Some users create database users and groups internal to Redshift and manage authentication at the database level (for example, logging in via SQL Workbench). Others delegate permissions with an identity provider like Active Directory.
Implementation and Cost Savings
Enterprise IT directors are working to overcome their Redshift implementation challenges. 30% said they are rewriting queries, and 28% said they have compressed their data in S3 as part of a Lakehouse architecture. Query tuning was reported to have the greatest impact on the performance of Redshift clusters.
When Redshift costs exceed the plan, it is a good practice to assess where the costs are coming from. Is it from storage, compute, or something else? Generally, if you are looking to save on Redshift spend, you should explore a LakeHouse architecture, which is a storage pattern that shifts data between S3 and your Redshift cluster. When you need lots of data for analysis, data is loaded into Redshift. When you don’t need that data anymore, it is moved back to S3 where storage is much cheaper. However, the tradeoff is that analysis is slower when data is in S3.
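The hot/cold decision described above can be sketched as a simple tiering rule: tables queried recently stay in Redshift, while stale ones become candidates to unload to S3 and query via Spectrum when needed. The 30-day threshold and table names are illustrative assumptions, not an AWS recommendation.

```python
from datetime import date

def tier_tables(tables, today, hot_days=30):
    """Split (name, last_queried) pairs into Redshift-resident vs S3 candidates."""
    hot, cold = [], []
    for name, last_queried in tables:
        (hot if (today - last_queried).days <= hot_days else cold).append(name)
    return hot, cold

tables = [("daily_sales", date(2020, 11, 1)),
          ("clickstream_2018", date(2020, 1, 15))]
hot, cold = tier_tables(tables, today=date(2020, 11, 10))
print(hot)   # ['daily_sales']
print(cold)  # ['clickstream_2018']
```

In practice the "cold" list feeds an UNLOAD-to-S3 job, and those tables remain queryable, just more slowly, through an external schema.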
Another place to look for cost savings is in the instance size. It is possible to have over-provisioned your Redshift nodes. Look for metrics like CPU utilization; if it is consistently 25% or even 30% or lower, then you have too much headroom and might be over-provisioned.
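The headroom check above can be sketched as a one-line rule over utilization samples (for example, CloudWatch `CPUUtilization` datapoints for the cluster). The threshold and sample values are illustrative assumptions.

```python
def possibly_overprovisioned(cpu_samples, threshold=30.0):
    """Flag a cluster whose average CPU utilization sits below the threshold."""
    return sum(cpu_samples) / len(cpu_samples) < threshold

# A week of hypothetical daily-average CPU utilization percentages.
week_of_samples = [22.0, 18.5, 25.0, 31.0, 20.0, 16.5, 24.0]
print(possibly_overprovisioned(week_of_samples))  # True -> consider downsizing
```

A flag here is a prompt to investigate, not an automatic resize: bursty workloads can average low while still needing the peak capacity.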
Challenges aside, enterprise IT directors seem to love Redshift. The top three Redshift features, according to our survey, are query monitoring rules (cited by 44% of respondents), federated queries (35%), and custom-built ETL workflows (33%).
Query Monitoring Rules are custom rules that track bad or slow queries. Customers love Query Monitoring Rules because they are simple to write and give you great visibility into queries that will disrupt operations. You can choose obvious metrics like query_execution_time, or more subtle things like query_blocks_read, which would be a proxy for how much searching the query planner has to do to get data. Customers like these features because the reporting is central, and it frees them from having to manually check queries themselves.
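A query monitoring rule lives inside the WLM configuration as a small JSON fragment; here is a minimal sketch of one that aborts any query running longer than two minutes. The rule name, threshold, and action are illustrative choices, though `query_execution_time` is a real QMR metric.

```python
import json

# Illustrative QMR rule: abort queries that exceed 120 seconds of
# execution time. Available actions also include "log" and "hop".
qmr_rule = {
    "rule_name": "kill_long_running",
    "predicate": [{"metric_name": "query_execution_time",
                   "operator": ">", "value": 120}],
    "action": "abort",
}
print(json.dumps(qmr_rule, indent=2))
```

Starting with `"action": "log"` is a gentler rollout: you see which queries would have been killed before enforcing the rule.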
Federated queries allow you to bring in live, external data to join with your internal Redshift data. You can query, for example, an RDS instance in the same SQL statement as a query against your Redshift cluster. This allows for dynamic and powerful analysis that normally would take many time-consuming steps to get the data in the same place.
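The SQL shape of such a query is roughly as follows; the external schema would first be created with `CREATE EXTERNAL SCHEMA ... FROM POSTGRES`, and every schema, table, and column name here is a placeholder.

```python
# Hedged sketch of a federated query: joining a live table in an RDS
# PostgreSQL instance (exposed as the external schema "apg") with a
# local Redshift table in one statement.
federated_sql = """
SELECT o.order_id, o.total, c.segment
FROM analytics.orders AS o   -- local Redshift table
JOIN apg.customers AS c      -- live RDS table via federation
  ON o.customer_id = c.customer_id;
"""
print(federated_sql)
```

The join executes against the current contents of the RDS table, so there is no extract-and-load step to keep in sync.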
Finally, custom-built ETL workflows have become popular for several reasons. One, the sheer compute power sitting in Redshift makes it a very popular source for compute resources. Unused compute can be used for ongoing ETL. You would have to pay for this compute whether or not you use it. Two, and this is an interesting twist, Redshift has become a popular ETL tool because of its capabilities in processing SQL statements. Yes, ETL written in SQL has become popular, especially for complicated transformations and joins that would be cumbersome to write in Python, Scala, or Java.
Redshift's place in the enterprise IT stack seems secure, though how IT departments use the solution will likely change over time, perhaps significantly. The reason for persisting in all the maintenance tasks listed above is that Redshift is increasingly becoming the centerpiece of a data-driven analytics program. Data volume is not shrinking; it is always growing. If you take advantage of these performance features, you will make the most of your Redshift cluster and therefore your analytics program.
Download the infographic on our survey findings.
-Rob Whelan, Data Engineering & Analytics Practice Director
Well, it’s that time of year again. Where I live, the leaves are changing color, temperature is diving, and turkeys are starting to fear for their lives. These signs all point to AWS re:Invent being right around the corner. This year, AWS re:Invent will kick off its 9th annual conference on November 30th, 2020 with a couple major caveats. It will be 3 weeks long, 100% virtual, and free to all. This year will be a marathon, not a sprint, so make sure to pace yourself. As always, 2nd Watch is here to help prepare you with what we think you can expect this year, so let’s get to it!
At the time I am writing this article, things are a bit unclear on how everything will work at re:Invent this year. We can definitely count on live keynotes from the likes of Andy Jassy, Peter DeSantis, Werner Vogels, and more. For the hundreds of sessions, it's unclear whether they will be broadcast live at a scheduled time and then rebroadcast, or whether everything will be on-demand. We do know sponsor-provided sessions will be pre-recorded and available on-demand on the sponsor pages. I am sure this will be fleshed out once the session catalog is released in mid-November. Regardless, all sessions are recorded and then posted later on various platforms such as the AWS YouTube channel. Per the AWS re:Invent 2020 FAQs, "You can continue logging in to the re:Invent platform and viewing sessions until the end of January 2021. After January, you will be able to view the sessions on the AWS YouTube channel."
AWS expects somewhere in the range of a mind-boggling 250,000+ people to register this year, so we can all temper our expectations of getting that famous re:Invent hoodie. The event will be content-focused, so each sponsor will get its own sponsor page, the equivalent of a sponsor booth. Sponsor pages are sure to have downloadable content, on-demand videos, and other goodies available to attendees, but again, how you're going to fill up your swag bag is yet to be seen. Let's move on to our advice and predictions, and then we will take the good with the bad to wrap it up.
Be humble – Hold off on boasting to your colleagues this year that you are part of the elite who get sent to re:Invent. News flash: they are going this year too, along with 250,000+ other people.
Pace yourself – You will not be able to attend every session that you are interested in. Pick one learning track and try to get the most out of it.
No FOMO – Fear not, all the sessions are recorded and posted online for you to view on-demand, at your convenience.
Stay connected – Take advantage of any virtual interactive sessions that you can to meet new people and make new connections in the industry.
Get hands-on – Take advantage of the Jams and GameDays to work with others and get hands-on experience with AWS services.
Let’s take a quick look at some of the predictions our experts at 2nd Watch have for service release announcements this year at re:Invent.
AWS Glue will have some serious improvements around the graphical user interface.
Better datatype detection and automatic handling of datatype conflicts.
Glue job startup will also speed up significantly.
Amazon SageMaker algorithms will become easier to use – data integration will be smoother and less error-prone.
AWS will release managed implementations of text generation algorithms like GPT-2.
Some kind of automatic visualization or analysis feature for Amazon Redshift will be released, so you don’t have to build analyses from scratch every time.
Expanded/enhanced GPU instance type offerings will be made available.
Lambda integration with AWS Outposts will be made available.
Aurora Serverless goes fully on-demand with pay-per-request and, potentially, global deployment.
Make sure to check back here on the 2nd Watch blog or the 2nd Watch Facebook, LinkedIn, and Twitter pages for weekly re:Invent recaps. We'll also be live-tweeting AWS announcements during the keynotes, so keep your eye on our Twitter feed for all the highlights!
Finally, we thought it would be fun to highlight some of the good that comes with the changes this year.
Take the Good with the Bad
No crowded hallways
No rush commutes from hotel to hotel
No limits on sessions
No need to wear a mask
Potential for some fun, virtual after-hours events
No hangovers (maybe)
No in-person events or after-parties
Yet to be determined swag giveaways
No feeling special because you registered for a session before your friends or colleagues
No going out for amazing food and drink every night
No legendary 2nd Watch party
Nothing is as we are used to this year, and re:Invent falls right in line with that sentiment. We are eager with anticipation of a great event, nevertheless, and hope you are too. Since we won’t get to see you in-person at our booth this year, please visit our pre-re:Invent site at offers.2ndwatch.com/aws-reinvent-2020 now to pre-schedule a meeting with us and find out about all the fun giveaways and contests we have this year. Don’t miss out on your free 2nd Watch re:Invent sweatpants, a chance to win a Sony PlayStation 5, a great virtual session on taking your data lake from storage to strategic, and a lot more! Then, make sure to visit our re:Invent sponsor page 11/30-12/18 on the re:Invent portal.
We would love to meet you and discuss all the possibilities for your cloud journey. Have a fantastic re:Invent 2020 and stay safe!
-Dustin Snyder, Director of Cloud Infrastructure & Architecture
A colleague of mine postulated that the IT department would eventually go the way of the dinosaur. He put forward that as the Everything-as-a-Service model becomes the norm, IT would no longer provide meaningful value to the business. My flippant response was to point out that they have been saying mainframes are dead for decades.
This of course doesn’t get to the heart of the conversation. What is the future role of IT as we move towards the use of Everything-as-a-Service? Will marketing, customer services, finance and other departments continue to look to IT for their application deployment? Will developers and engineers move to containerization to build and release code, turning to a DevOps model where the Ops are simply a cloud provider?
We've already proven that consumers can adapt to very complex applications. Every day when you deploy and use an application on your phone, you are operating at a level of complexity that once required IT assistance. And yes, the development of intuitive UXs has enabled this trend; however, the same principle is occurring at the enterprise level. Cloud, in many ways, has already brought this simplification forward. It has democratized IT.
So, what is the future of IT? What significant disruptions to operations processes will occur through democratization? I liken it to playing Madden NFL. You don't manage each player on the field. You choose the skill players for the team, then run the plays. The only true decision you make is which offensive play to run, or which defensive scheme to set. In IT terms, you review the field (operations), orchestrate the movement of resources, and ensure the continuation of the applications, looking for any potential issues and resolving them before they escalate. This is the future of IT.
What are the implications? I believe IT evolves into a higher-order (read: more business value) function. They enable digital transformation, not from a resource perspective, but from a strategic business empowerment perspective. They get out of the job that keeps them from being strategic, the tactical day-to-day of managing resources, and move to enabling and implementing business strategy. However, that takes a willingness to specifically allocate how IT is contributing to business value output at some very granular levels. To achieve this, it might require reengineering teams, architectures, and budgets to tightly link specific IT contributions to specific business outputs. The movement to modern cloud technology supports this fundamental shift and, over time, will start to solve chronic problems of underfunding and lack of support for ongoing improvement. IT is not going the way of the dinosaur. They're becoming the fuel that enables business to grow strategically.
Want more tips on how to empower IT to contribute to growing your business strategy? Contact us
-Michael Elliott, Sr Director of Product Marketing
Historically, as is common among enterprise IT processes, the 2nd Watch optimization team pulled cost and usage reports from Amazon and stored them in S3 buckets. The data was then loaded into Redshift, Amazon's cloud data warehouse, where it could be manipulated and analyzed for client optimization. Unfortunately, the Redshift cluster filled up quickly and regularly, forcing us to spend unnecessary time and resources on maintenance and cleanup. Additionally, Redshift requires a large cluster to work with, which made the process of accessing and using data slow and inefficient.
Of course, to solve for this we could have doubled the size, and therefore the cost, of our Redshift usage, but that went against our commitment to provide cost-effective options for our clients. We also could have considered moving to a different type of node that is storage optimized, instead of compute optimized.
Lakehouse Architecture for speed improvements and cost savings
The better solution we uncovered, however, was to follow the Lakehouse Architecture pattern, improving our use of Redshift so we could move faster and with more visibility, without additional storage fees. The Lakehouse Architecture strikes a balance between cost and agility by selectively moving data in and out of Redshift depending on the processing speed needed for the data. Now, after a data dump to S3, we use AWS Glue crawlers to create external tables in the Glue Data Catalog. The external schemas are linked to the Redshift cluster, allowing our optimization team to read from S3 through Redshift Spectrum.
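The wiring on the Redshift side is a single DDL statement: an external schema pointing at a Glue Data Catalog database makes the crawled S3 tables queryable in place via Spectrum. The database name and IAM role ARN below are placeholders.

```python
# Hedged sketch of linking a Glue Data Catalog database to Redshift as
# an external schema. Schema name, database name, and role ARN are
# illustrative placeholders, not real resources.
external_schema_ddl = """
CREATE EXTERNAL SCHEMA IF NOT EXISTS cur_s3
FROM DATA CATALOG
DATABASE 'cost_usage_reports'
IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftSpectrumRole';
"""
print(external_schema_ddl)
```

Once created, tables the Glue crawler discovers appear under `cur_s3` and can be joined against local Redshift tables like any other schema.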
Our cloud data warehouse remains tidy without dedicated clean-up resources, and we can query the data in S3 via Redshift without having to move anything. Even though we’re using the same warehouse, we’ve optimized its use for the benefit of both our clients and 2nd Watch best practices. In fact, our estimated savings are $15,000 per month, or 100% of our previous Redshift cost.
How we’re using Redshift today
With our new model and the benefits afforded to clients, 2nd Watch is applying Redshift for a variety of optimization opportunities.
Discover new opportunities for optimization. By storing and organizing data related to our clients’ AWS, Azure, and/or Google Cloud usage versus spend data, the 2nd Watch optimization team can see where further optimization is possible. Improved data access and visibility enables a deeper examination of cost history, resource usage, and any known RIs or savings plans.
Increase automation and reduce human error. The new model allows us to use DBT (data build tool) to complete SQL transforms on all data models used to feed reporting. These reports go into our dashboards and are then presented to clients for optimization. DBT empowers analysts to transform warehouse data more efficiently, and with less risk, by relying on automation instead of spreadsheets.
Improve efficiency from raw data to client reporting. Raw data that lives in a data lake in S3 is transformed and organized into a structured data lake, ready to be defined in AWS Glue Data Catalog tables. This gives analysts access to query the data from Redshift and use DBT to format it into useful tables. From there, the optimization team can make data-based recommendations and generate complete reports for clients.
In the future, we plan to feed a business intelligence dashboard directly from Redshift, further increasing efficiency for both our optimization team and our clients.
Client benefits with Redshift optimization
Cost savings: Only pay for the S3 storage you use, without any storage fees from Redshift.
Unlimited data access: Large amounts of old data are available in the data lake, which can be joined across tables and brought into Redshift as needed.
Increased data visibility: Greater insight into data enables us to provide more optimization opportunities and supports decision making.
Improved flexibility and productivity: Analysts can get historical data within one hour, rather than waiting 1-2 weeks for requests to be fulfilled.
Reduced compute cost: By shifting the compute cost of loading data to Amazon EKS.
Today, we’re excited to announce a new enhancement to our Managed Optimization service – Spot Instance and Container Optimization – for enterprise IT departments looking to more thoughtfully allocate cloud resources and carefully manage cloud spend.
Enterprises using cloud infrastructure and services today are seeing higher cloud costs than anticipated due to factors such as cloud sprawl, shadow IT, improper allocation of cloud resources, and a failure to use the most efficient resource for each workload. To address these concerns, we take a holistic approach to optimization and have partnered with Spot by NetApp to enhance our Managed Optimization service.
The service works by recommending workloads that can take advantage of the cost savings associated with running instances, VMs and containers on “spot” resources. A spot resource is an unused cloud resource that is available for sale in a marketplace for less than the on-demand price. Because spot resources enable users to request unused EC2 instances or VMs to run their workloads at steep discounts, users can significantly lower their cloud compute costs, up to 90% by some measures. To deliver its service, we’re partnering with Spot, whose cloud automation and optimization solutions help companies maximize return on their cloud investments.
“Early on, spot resources were difficult to manage, but the tasks associated with managing them can now be automated, making the use of spot a smart approach to curbing cloud costs,” says Chris Garvey, EVP of Product at 2nd Watch. “Typically, non-mission critical workloads such as development and staging have been able to take advantage of the cost savings of spot instances. By combining 2nd Watch’s expert professional services, managed cloud experience and solutions from Spot by NetApp, 2nd Watch has been able to help companies use spot resources to run production environments.”
“Spot by NetApp is thrilled to be working with partners like 2nd Watch to help customers maximize the value of their cloud investment,” says Amiram Shachar, Vice President and General Manager of Spot by NetApp. “Working together, we’re helping organizations go beyond one-off optimization projects to instead ensure continuous optimization of their cloud environment using Spot’s unique technology. With this new offering, 2nd Watch demonstrates a keen understanding of this critical customer need and is leveraging the best technology in the market to address it.”
A cloud center of excellence (CCoE) is essential for successful, efficient, and effective cloud implementation across your organization. Although the strategies look different for each business, there are three areas of focus, and four phases of maturity within those areas, that are important markers for any CCoE.
1. Financial Management
As you move to the public cloud and begin tapping into its innovation and agility, you also open the door to budget overruns. Without proper planning and inclusion of financial leaders, you may find you're not only paying for datacenters but also racking up large, and growing, public cloud bills. Financial management needs to be centrally governed, but extremely deliberate, because it touches hundreds of thousands of places across your organization.
You may think involving finance will be painful, but bringing all stakeholders to the table equally has proven highly effective. Over the last five years, there's been a revolution in how finance can effectively engage in cloud and infrastructure management. This emerging model, guided by the CCoE, enables organizations to justify leveraging the cloud not only based on agility and innovation, but also cost. Increasingly, organizations are achieving both better economics and the ability to do things in the cloud that cannot be done inside datacenters.
2. Operations
To harness the power and scale possible in the cloud, you need to put standards and best practices in place. These often start around configuration – tagging policies, reference architectures, workloads, virtual machines, storage, and performance characteristics. Standardization is a prerequisite to repeatability and is the driving force behind gaining the best ROI from the cloud.
Today, we’re actually seeing that traditional application of the cloud does not yield the best economic benefits available. For decades, we accepted an architectural model where the operating system was central to the way we built, deployed, and managed applications. However, when you look beyond the operating system, whether it’s containers or the rich array of platform services available, you start to see new opportunities that aren’t available inside datacenters.
When you’re not consuming the capital expenditure for the infrastructure you have available to you, and you’re only consuming it when you need it, you can really start to unlock the power of the cloud. There are many more workloads available to take advantage of as well. The more you start to build cloud native, or cloud centric architecture, the more potential you have to maximize financial benefits.
3. Security and Compliance
Cloud speed is fast. Much faster than what’s possible in datacenters. Avoid a potentially fatal breach, data disruption, or noncompliance penalty with strict security and compliance practices. You should be confident in the tools you implement throughout your organization, especially where the cloud is being managed day to day and changes are being driven. With each change and new instance, make sure you’re following the CCoE recommendations with respect to industry, state, and federal compliance regulations.
4-Phase Cloud Maturity Model
CloudHealth put forward a cloud maturity model based on patterns observed in over 10,000 customer interactions in the cloud. Like a traditional maturity model, the bottom left represents immaturity in the cloud, and the upper right signifies high maturity. Within each of the three foundational areas – financial management, operations, and security and compliance – an organization needs to scale and mature through the following four phases.
Phase 1: Visibility
Maturity starts at the most basic level by gaining visibility into your current architecture. Visibility gives you the connective tissue necessary to make smart decisions – although it doesn’t actually make those decisions obvious to you. First, know what you’re running, why you’re running it, and the cost. Then, analyze how it aligns with your organization from a business perspective.
Phase 2: Optimization
The goal here is all around optimization within each of the three areas. In regards to financial management and operations, you need to size a workload appropriately to support demand, but without going over capacity. In the case of security, optimization is proactively monitoring all of the hundreds of thousands of changes that occur across the organization each day. The strategy and tools you use to optimize must be in accordance with the best practices in your standards and policies.
Phase 3: Governance and Automation
In this phase you’re moving away from just pushing out dashboards, notification alerts, or reports to stakeholders. Now, it’s about strategically monitoring for the ideal state of workloads and applications in your business services. How do you automate the outcomes you want? The goal is to keep it in the optimum state all the time, or nearly all the time, without manual tasks and the risks of human error.
Phase 4: Business Integration
This is the ultimate state where the cloud gets integrated with your enterprise dashboards and service catalogue, and everything is connected across the organization. You’re no longer focused on the destination of the cloud. Instead, the cloud is just part of how you transact business.
As you move through each phase, establish measurements of cloud maturity using KPIs and simple metrics. Enlist the help of a partner like 2nd Watch that can provide expertise, automation, and software so you can achieve better business outcomes regardless of your cloud goals. Contact Us to understand how our cloud optimization services are maximizing returns.
You’ve migrated to the cloud and are using cloud services within your own team, but how do you scale that across the organization? A Cloud Center of Excellence (CCoE) is the best way to scale your usage of the cloud across multiple teams, especially when navigating organizational complexity.
What is a CCoE?
A Cloud Center of Excellence, or CCoE, is a group of cross-functional business leaders who collaboratively drive the best practices and standards that govern the cloud implementation strategy across their organization. It is a construct developed in response to how the cloud has changed IT. Pre-cloud, infrastructure, usage, and application deployments were controlled by central IT: the IT department both made infrastructure and applications available and controlled their management. In the post-cloud world, management in large enterprises happens in hundreds or thousands of places across the organization rather than solely in central IT. Today’s cloud moves at a pace much faster than the traditional datacenter, and that speed requires a new approach to governance.
This seismic shift in responsibility and business-wide impact has brought agility and innovation across organizations, but it can also introduce a fair amount of risk. A CCoE is a way to manage that risk with clear strategy development, governance, and buy-in from the top down. Drawing on stakeholders from finance, operations, architecture, and security, a CCoE does not dictate or control cloud implementation; it uses best practices and standards throughout the organization to make cloud management more effective.
Getting started with a CCoE
First and foremost, a CCoE cannot start without recognizing the need for it. If you’re scaling in the public cloud and do not require and reinforce best practices and standards, you will hit a wall. Without a CCoE, there will be a tipping point at which the easy agility and innovation you gained from the public cloud suddenly turns against you. A CCoE is not a discretionary mechanism; it’s a prerequisite to scaling in the cloud successfully.
Once you know the significance and meaning of your CCoE, you can adapt it to the needs of your business and the state of your maturity. You need a clear understanding of both how you’re currently using the cloud, as well as how you want to use it going forward.
In doing that, you also need to set appropriate expectations. Over time, what you need and expect from a CCoE will change based on size, market, goals, compliance regulations, stakeholder input, and more, but its job remains managing cloud implementation while avoiding risk. The key to a successful CCoE is providing agility, innovation, and the full benefits of the cloud in a way that does not adversely impact your teams’ ability to get things done. Even though the CCoE drives strategy from the top, your employees need the freedom to make day-to-day management decisions, provision what they need, and use the agility of the cloud to be creative. It’s a fluid process, much different from the rigid rack-and-stack infrastructure planning of a decade ago.
Create an ongoing process that delivers returns by partnering with a company that knows what you need not only today, but in the future. The right partner will provide the products, people, and services that enable you to be successful. With all the complexity in the cloud, it’s extremely difficult to navigate and scale without an experienced expert.
2nd Watch Cloud Advisory Services include a Cloud Readiness Assessment to evaluate your current IT estate, as well as a Cloud Migration Cost Assessment that estimates costs across various cloud providers. As a trusted advisor, we’re here to answer key questions, define strategy, manage change, and provide impartial advice on a wide range of issues critical to successful cloud modernization. Contact Us to see how we can make your CCoE an organizational success.
Now that you’ve migrated your applications to AWS, how can you take the value of being on the cloud to the next level? To provide guidance on next steps, here are five things you should consider to amplify that value.
Cloud optimization is a continuous process specific to a company’s goals, but there are some staple best practices all optimization projects should follow. Here are our top 10.
1. Begin with the end in mind.
Business leaders and stakeholders throughout the organization should know exactly what they’re trying to achieve with a cloud optimization project. This goal should also be revisited on a regular basis to make sure you remain on track. Create measures to gauge success at different points and follow the agreed-upon order of operations to complete the process.
2. Create structure around governance and responsibility.
Overprovisioning is one of the most common issues adding unnecessary costs to your bottom line. Implement specific and regulated structure around governance and responsibility for all teams involved in optimization to control any unnecessary provisioning. Check in regularly to make sure teams are following the structure and you only have the tools you need and are actively using.
3. Get all the data you need.
Cloud optimization is a data-driven exercise. To be successful, you need insight into a range of data pieces. Not only do you need to identify what data you need and be able to get it, but you also need to know what data you’re missing and figure out how to get it. Collaborate with internal teams to make sure essential data isn’t siloed or already being collected. Additionally, regularly clean and validate data to ensure reliability for data-based decision making.
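Regularly cleaning and validating data can start as simply as rejecting records that are missing the fields your analysis depends on. A minimal sketch, assuming billing-style records as dicts; the field names are illustrative, not a fixed schema:

```python
# Hypothetical required fields for a cost record
REQUIRED = {"account_id", "service", "cost_usd"}

def validate(records):
    """Split records into (clean, rejected) based on required fields."""
    clean, rejected = [], []
    for rec in records:
        missing = REQUIRED - rec.keys()
        if missing or not isinstance(rec.get("cost_usd"), (int, float)):
            rejected.append(rec)
        else:
            clean.append(rec)
    return clean, rejected

records = [
    {"account_id": "a1", "service": "ec2", "cost_usd": 12.5},
    {"account_id": "a2", "service": "s3"},  # missing cost field
]
clean, rejected = validate(records)
print(len(clean), len(rejected))  # -> 1 1
```

The point is not the specific checks but the habit: run validation on a schedule, and track the rejected count so data quality problems surface before they distort a decision.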
4. Implement tagging practices.
To best utilize the data you have, organizing and maintaining it with strict tagging practices is necessary. Implement a system that works from more than just a technical standpoint. You can also use tagging to launch instances, control your auto-parking methodology, or drive scheduling. Tagging helps you understand the data and see what is driving spend. Whether it’s an environment tag, owner tag, or application tag, tagging provides clarity into spend, which is the key to optimization.
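A tagging standard is only useful if it is enforced. As a minimal sketch, a compliance check might scan resource metadata for the tags you mandate; the tag keys below mirror the environment, owner, and application examples in the text but are otherwise illustrative:

```python
# Hypothetical mandatory tag keys for every resource
MANDATORY_TAGS = {"Environment", "Owner", "Application"}

def untagged(resources):
    """Map each resource id to the mandatory tag keys it is missing."""
    report = {}
    for res in resources:
        missing = MANDATORY_TAGS - res.get("tags", {}).keys()
        if missing:
            report[res["id"]] = missing
    return report

resources = [
    {"id": "i-0abc", "tags": {"Environment": "prod", "Owner": "data-team",
                              "Application": "etl"}},
    {"id": "i-0def", "tags": {"Environment": "dev"}},
]
print(untagged(resources))  # only i-0def is flagged
```

In practice the resource list would come from your cloud provider's inventory APIs, and the report would feed a dashboard or a remediation workflow.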
5. Gain visibility into spend.
Tagging is one way to see where your spend is going, but it’s not the only one you need. Manage accounts regularly to make sure inactive accounts aren’t continuing to be billed. Set up an internal mechanism to review with your app teams and hold them accountable. It can be as simple as a dashboard with tag grading, as long as it lets the data speak for itself.
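Once resources are tagged, letting the data speak for itself can start with a simple roll-up of cost per tag value. A sketch assuming cost line items as dicts with hypothetical amounts:

```python
from collections import defaultdict

def spend_by_tag(line_items, tag_key):
    """Sum cost per value of one tag; untagged spend lands in 'untagged'."""
    totals = defaultdict(float)
    for item in line_items:
        owner = item.get("tags", {}).get(tag_key, "untagged")
        totals[owner] += item["cost_usd"]
    return dict(totals)

items = [
    {"cost_usd": 120.0, "tags": {"Owner": "data-team"}},
    {"cost_usd": 45.0,  "tags": {"Owner": "web-team"}},
    {"cost_usd": 30.0,  "tags": {}},
    {"cost_usd": 80.0,  "tags": {"Owner": "data-team"}},
]
print(spend_by_tag(items, "Owner"))
# -> {'data-team': 200.0, 'web-team': 45.0, 'untagged': 30.0}
```

The size of the "untagged" bucket doubles as a tagging-compliance grade: the larger it is, the less the rest of the report can be trusted.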
6. Hire the right technical expertise.
Get more out of your optimization with the right technical expertise on your internal team. Savvy technicians should work alongside the business teams to drive the goals of optimization throughout the process. Without collaboration between these departments, you risk moving in differing directions with multiple end goals in mind. For example, one team might be acting with performance or a technical aspect in mind without realizing the implications for optimization. Partnering with optimization experts can also keep teams aligned and moving toward the same goal.
7. Select the right tools and stick with them.
Tools are a part of the optimization process, but they can’t solve problems alone. Additionally, there are an abundance of tools to choose from, many of which have similar functionality and outcomes. Find the right tools for your goals, facilitate adoption, and give them the time and data necessary to produce results. Don’t get distracted by every new, shiny tool available and the “tool champions” fighting for one over another. Avoid the costs of overprovisioning by checking usage regularly and maintaining the governance structure established throughout your teams.
8. Make sure your tools are working.
Never assume a tool or a process you’ve put in place is working. In fact, it’s better to assume it’s not working and consistently check its efficiency. This regular practice of confirming the tools you have are both useful and being used will help you avoid overprovisioning and unnecessary spending. For tools to be effective and serve their purpose, you need enough visibility to determine how the tool is contributing to your overall end goal.
9. Empower someone to drive the process.
The number one call to action for anyone diving into optimization is to appoint a leader. Without someone specific, qualified, and active in managing the project with each stakeholder and team involved, you won’t accomplish your goals. Empower this leader internally to gain the respect and attention necessary for employees to understand the importance of continuous optimization and contribute on their part.
10. Partner with experts.
Finding the right partner to help you optimize efficiently and effectively will make the process easier at every turn. Bringing in an external driver who has the know-how and experience to consult on strategy through implementation, management, and replication is a smart move with fast results.
2nd Watch takes a holistic approach to cloud optimization with a team of experienced data scientists and architects who help you maximize performance and returns on your cloud assets. Are you ready to start saving? Let us help you define your optimization strategy to meet your business needs and maximize your results. Contact Us to take the next step in your cloud journey.