
Gartner Report: Don’t Fail Fast in Production; Embed Monitoring Earlier in Your DevOps Cycle

Gartner says, “DevOps initiatives improve speed and agility, but monitoring often starts during production. To provide superior customer experiences, infrastructure and operations leaders need to build instrumentation into the preproduction phase, tracking metrics on availability, performance and service health.”

“How can I&O leaders leverage monitoring practices to continually improve DevOps deployments and performance against business key performance indicators (KPIs)? This research identifies monitoring practices that I&O leaders should embed in the preproduction phase of DevOps cycles to address needs across application development and release management.” says the report.

Access the Gartner report to learn more

Gartner Don’t Fail Fast in Production; Embed Monitoring Earlier in Your DevOps Cycle, 16 July 2019, Pankaj Prasad, George Spafford, Charley Rich


Managed Cloud, AMS, and the Enterprise – The Hows and Whys

Read this article on ChannelPartnerInsights

It’s easy to forget that when enterprises first started moving to the cloud, it was a largely simple process that saw only a handful of people within an organization using the technology. But as its usage has become more prevalent, on-site infrastructure and IT operations teams have found themselves having to manage cloud environments, which has not only created a skills gap in many enterprises, but also given rise to cost inefficiencies as teams have either become spread more thinly, or, more likely, organizations have had to hire additional staff to manage their cloud environments. All of this can be compounded by trying to successfully integrate a cloud environment into an existing operation’s security structure.

The good news is that as cloud offerings have developed, all of these challenges can be addressed by managed cloud services. They help remove additional costs by negating the need for additional staff, as well as removing the complexity of trying to run a cloud environment for a large enterprise that wants to focus on running its business rather than running its infrastructure.

As managed cloud services continue their reach into the mainstream, customers will need to be educated on the myriad benefits the offering presents. Services such as AWS Managed Services (AMS) can offer enterprises a much easier cloud experience that doesn’t have to impinge upon the day-to-day running of the business.

Why managed cloud?

For clients questioning why they would benefit from a managed cloud offering, the first thing to note is the clear reduction in the operational costs of cloud. Enterprises no longer have to hire staff or spend time training existing staff to manage their cloud infrastructure. Alongside this, with a managed cloud services offering, enterprises have direct access to a highly skilled cloud services team that will handle that portion of the organization’s infrastructure. Aspects like logging, monitoring, event management, continuity management, security and access management, patching, provisioning, incidents and reporting are all included in a managed cloud service offering.

AMS in particular is a highly automated offering, meaning that implementation is straightforward and much quicker than regular cloud implementations. It also features out-of-the-box compliance, such as PCI, HIPAA and GDPR, meaning that security postures won’t be disrupted during or after implementation. The service’s automation also allows for requests for change to be done within minutes, versus having to wait for an in-house IT infrastructure team to approve something before it can be changed.

And managed cloud services can have a significant impact upon an enterprise’s operations. For example, one of our clients – an ISV – was experiencing considerable challenges when evolving its product into a SaaS offering. While it was able to service the product, it wasn’t able to service the cloud infrastructure hosting the SaaS product. Using a managed cloud service – in this case AMS – meant the organization no longer had to manage that infrastructure itself and has since been able to decrease its time to resolution, as well as its cost of operations.

Further, the change enabled the ISV to better predict its cost of goods sold, given that the AMS bill is relatively steady from month to month. This allows ISVs to consistently measure margin on their SaaS product offering.

Making the move to AMS

Migrating to AMS from on-premise infrastructure or an existing AWS environment is a straightforward process that consists of four key stages:

  1. Discovering what exists today and what needs to migrate
  2. Identifying the architecture to migrate (single account or multi-account landing zone)
  3. Identifying the migration plan (scheduled app migration in ‘waves’)
  4. Migrating to AMS

For customers on alternative cloud infrastructures, such as Google Cloud or Microsoft Azure, the migration to AMS is similar. The only bit of heavy lifting (for customers on any cloud platform) can come in integrating an existing operations team with the AMS operations teams so that they know how to work together if there’s a request, an update, or a problem.

Preparing for and performing this people-and-process integration upfront considerably reduces the complexity of cloud operations. This merger of operations usually flows from discovery and doesn’t end until the migration has been tested and the team is operating efficiently.

The path to AMS is a very structured, concrete process, which means clients don’t have to make myriad new decisions on their own. The onboarding process is streamlined and enables us as AMS partners to provide a true timeline for onboarding – something that can often be difficult when you’re dealing with a very large cloud migration.

For example, with AMS we know that discovery and planning take about three weeks, and building out the AMS landing zone takes about three weeks, and you can’t run these steps concurrently. We’ve received client feedback telling us that offering these timescales has been key to their comfort in engaging with this process and knowing they can get it done – clients don’t want an open-ended project that takes six years to migrate.

When it comes down to it, the cloud goal for the majority of customers is to streamline business processes and, ultimately, improve their bottom line. Using a managed cloud service like AMS can reduce costs, reduce operational challenges and increase security, making for a much smoother and easier experience for the enterprise, and a lucrative, open-ended opportunity for channel partners.

-Contributed article by Stefana Muller, Sr Product Manager


AWS Outposts Overview – Deep Dive

AWS Outposts are fully managed and configurable compute and storage racks built with AWS-designed hardware that allow customers to run compute and storage on-premises, while seamlessly connecting to AWS’ broad array of services in the cloud. Here’s a deeper look at the service.

As an AWS Outposts Partner, 2nd Watch is able to help AWS customers overcome challenges that exist due to managing and supporting infrastructures both on-premises and in cloud environments and delivering positive outcomes at scale. Our team is dedicated to helping companies achieve their technology goals by leveraging the agility, breadth of services, and pace of innovation that AWS provides.


A Case for Enterprises to Leverage Managed Cloud Services

Cloud Adoption is almost mainstream. What are you doing to get on board?

If you follow the hype, you’d think that every enterprise has migrated their applications to the cloud and that you’re ‘behind the times’ when it comes to your on-premise or co-located datacenter. The truth is, many cloud computing technologies are a few years away from mainstream adoption. Companies find the prospect of moving the majority of their workloads to cloud daunting, not only due to the cost to migrate, but because their IT organization isn’t ready to operate in this new world. The introduction of new standards like Infrastructure as Code, CI/CD, serverless, containers, and the concern over security and compliance can place IT operations teams in a state of flux for years, which causes uptime, reliability, and costs to suffer.

Despite the challenges, Gartner predicts that Cloud Computing and Software as a Service (SaaS) are less than 2 years from mainstream adoption (Gartner, Hype Cycle for Cloud Computing, 2018, published July 31, 2018, by David Smith & Ed Anderson).

One expected group of early adopters of cloud technologies and IaaS is Independent Software Vendors (ISVs). Delivering their software as a service, enabling their customers to pay as they go, has become a requirement of the industry. The majority of ISVs are not dealing with green-field technology. They have legacy code and monolithic architectures to contend with, which in many cases require a rewrite to function effectively in the cloud. I remember a time when my team (at a multi-national ISV) thought it was ‘good enough’ to fork-lift our executable into Docker and call it a day. This method of delivery will not compete with the Salesforces, ServiceNows, and Splunks of the world.

But how do ISVs compete when Cloud or SaaS Ops isn’t their core competency; when SaaS Ops has now become a distinct part of their product value stream?

The answer is Managed Cloud Services – outsourcing daily IT management for cloud-based services and technical support to automate and enhance your business operations.

Gartner says 75% of fully successful implementations will be delivered by highly skilled, forward-looking boutique managed services providers with a cloud-native, DevOps-centric services delivery approach.

Though this has traditionally been considered a solid solution for small to medium-sized companies looking to adopt cloud without the operational overhead, it has proven to be a game-changer for large enterprises, especially ISVs who can’t ramp up qualified SaaS operations staff fast enough to meet customer demand.

AWS has jumped on board with their own managed services offering called AWS Managed Services (AMS), which provides companies with access to AWS infrastructure, allowing them to scale their software deployments for end-users without increasing resources to manage their operations. The result is a reduction in operational overhead and risk as the company scales up to meet customer demand.

The AMS offering includes:

  • Logging, Monitoring, and Event Management
  • Continuity Management
  • Security and Access Management
  • Patch Management
  • Change Management
  • Provisioning Management
  • Incident Management
  • Reporting

In addition, if the ISV leverages AWS Marketplace to sell their SaaS solution, their billing, order processing, and fulfillment can be automated from start to finish, letting them focus on their software and features rather than the minutiae of operating a SaaS business and infrastructure, further reducing the strain of IT management. An example of an integration between AWS Marketplace and AMS that our team at 2nd Watch built for Cherwell Software is pictured here:

An example of an integration between AWS Marketplace and AMS

This AMS/AWS Marketplace integration is a win-win for any ISV looking to up their game with a SaaS offering. According to 451 Research, 41% of companies indicate they are lacking the platform expertise required to fully adopt hosting and cloud services within their organization. If this is the case, for companies whose core competency is not infrastructure or cloud, a managed service is a sure fit.

If you’re really looking to get up to speed quickly, our new onboarding service for AWS Managed Services (AMS) helps enterprises accelerate the process to assess, migrate, and operationalize their applications from on-premises to AWS. In addition, our Managed Cloud solutions help clients save 42% more than managing cloud services alone. Schedule a Discovery Workshop to learn more or get started.

I’ll throw one more stat at you: 72% of companies globally, across industries, will adopt cloud computing by 2022, based on the latest Future of Jobs Survey by the World Economic Forum (WEF). If you want to beat the “mainstream” crowd, start your migration now, knowing there are MSPs like 2nd Watch who can help with the transition while minimizing strain on your IT Operations team.

-Stefana Muller, Sr Product Manager


Cloud Autonomics and Automated Management and Optimization: Update

The holy grail of IT Operations is to achieve a state where all mundane, repeatable remediations occur without intervention, with a human only being woken for any action that simply cannot be automated. This allows not only for many restful nights, but it also allows IT operations teams to become more agile while maintaining a proactive and highly-optimized enterprise cloud. That state may seem like something found only in the greatest online fantasy game, but the growing popularity of “AIOps” gives great hope that it may actually be closer to reality than once thought.

Skeptics will tell you that automation, autonomics, orchestration, and optimization have been alive and well in the datacenter for more than a decade now. Companies like Microsoft with System Center, IBM with Tivoli, and ServiceNow are just a few examples of autonomic platforms that harness the ability to collect, analyze and make decisions on how to act against sensor data derived from physical/virtual infrastructure and appliances. But when you couple these capabilities with advancements brought through AIOps, you are able to take advantage of the previously missing components by incorporating big data analytics along with artificial intelligence (AI) and Machine Learning (ML).

As you can imagine, these advancements have brought an explosion of new tooling and services from cloud ISVs intended to make the once-utopian autonomic cloud a reality. Palo Alto Networks’ Prisma Public Cloud product is a great example of a technology that functions with autonomic capabilities. The security and compliance features of Prisma Public Cloud are pretty impressive, but it also has a component known as User and Entity Behavior Analytics (UEBA). UEBA analyzes user activity data from logs, network traffic and endpoints and correlates this data with security threat intelligence to identify activities (or behaviors) likely to indicate a malicious presence in your environment. After analyzing the current vulnerability and risk landscape, it reports the current state and derives a set of guided remediations. These can be performed manually against the infrastructure in question or automated for a proactive, hands-off response, so that security and compliance are continuously maintained.

Another ISV focused on AIOps is Moogsoft, which is bringing a next-generation platform for IT incident management to life for the cloud. Moogsoft has purpose-built machine learning algorithms that are designed to better correlate alerts and reduce much of the noise associated with all the data points. When you marry this with their Artificial Intelligence capabilities for IT operations, they are helping DevOps teams operate smarter, faster and more effectively in terms of automating traditional IT operations tasks.

As we move forward, expect to see more and more AI and ML-based functionality move into the core cloud management platforms as well. Amazon recently released AWS Control Tower to aid your company’s journey towards AIOps. Along with some pretty incredible features for new account creation and increased multi-account visibility, it uses service control policies (SCPs) based upon established guardrails (rules and policies). As new resources and accounts come online, Control Tower can force compliance with the policies automatically, preventing “bad behavior” by users and eliminating the need to have IT configure resources after they come online. Once AWS Control Tower is being utilized, these guardrails can apply to multi-account environments and new accounts as they are created.
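To make the idea of a guardrail expressed as an SCP a little more concrete, here is a minimal sketch that builds a policy document denying requests outside a set of approved regions. This is an illustration only, not one of Control Tower's own guardrail policies: the function name, the Sid, and the region list are assumptions, and a production region-deny policy would normally also exempt global services.

```python
import json


def region_deny_scp(allowed_regions):
    """Build an illustrative SCP that denies requests outside the approved regions.

    Hypothetical helper: the Sid and the region list are examples, and the
    exemptions a real policy needs for global services are omitted for brevity.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideApprovedRegions",
                "Effect": "Deny",
                "Action": "*",
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {
                        "aws:RequestedRegion": sorted(allowed_regions)
                    }
                },
            }
        ],
    }


policy = region_deny_scp({"us-east-1", "us-west-2"})
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

Because the policy is attached at the organization level rather than configured per resource, new accounts inherit it as they are created, which is the "guardrail" behavior described above.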

It is an exciting time for autonomic platforms and autonomic systems capabilities in the cloud, and we are excited to help customers realize the many potential capabilities and benefits which can help automate, orchestrate and proactively maintain and optimize your core cloud infrastructure.

To learn more about autonomic systems and capabilities, check out Gartner’s AIOps research and reach out to 2nd Watch. We would love to help you realize the potential of autonomic platforms and autonomic technologies in your cloud environment today!

-Dusty Simoni, Sr Product Manager





Operating and maintaining systems at scale with automation

Managing numerous customers with unique characteristics and tens of thousands of systems at scale can be challenging. Here, I want to pull back the curtain on some of the automation and tools that 2nd Watch develops to solve these problems. Below I will outline our approach to this problem and its 3 main components: Collect, Model, and React.

Collect: The first problem facing us is an overwhelming flood of data. We have CloudWatch metrics, CloudTrail events, custom monitoring information, service requests, incidents, tags, users, accounts, subscriptions, alerts, etc. The data is all structured differently, tells us different stories, and is collected at an unrelenting pace. We need to identify all the sources, collect the data, and store it in a central place so we can begin to consume it and make correlations between various events.

Most of the data I described above can be gathered from the AWS & Azure APIs directly, while others may need to be ingested with an agent or by custom scripts. We also need to make sure we have a consistent core set of data being brought in for each of our customers, while also expanding that to include some specialized data that perhaps only certain customers may have. All the data is gathered and sent to our Splunk indexers. We build an index for every customer to ensure that data stays segregated and secure.
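As a rough illustration of the Collect step, the sketch below normalizes differently shaped records into a consistent core set of fields and groups them under one index name per customer. It does not talk to Splunk or any cloud API; the record shapes, field names, and the `customer_<name>` index naming are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical raw records from different sources; field names are illustrative.
RAW_EVENTS = [
    {"source": "cloudwatch", "customer": "acme", "metric": "CPUUtilization", "value": 91.5},
    {"source": "cloudtrail", "customer": "acme", "eventName": "RunInstances"},
    {"source": "ticketing", "customer": "globex", "incident": "INC-1", "severity": "high"},
]

# The consistent core set of fields every record must carry.
CORE_FIELDS = ("source", "customer")


def normalize(record):
    """Split a record into the shared core fields and a free-form payload."""
    missing = [field for field in CORE_FIELDS if field not in record]
    if missing:
        raise ValueError(f"record is missing core fields: {missing}")
    payload = {key: value for key, value in record.items() if key not in CORE_FIELDS}
    return {"source": record["source"], "customer": record["customer"], "payload": payload}


def route_by_customer(records):
    """Group normalized events under one index name per customer."""
    indexes = defaultdict(list)
    for record in records:
        event = normalize(record)
        indexes[f"customer_{event['customer']}"].append(event)
    return dict(indexes)


indexes = route_by_customer(RAW_EVENTS)
for name, events in sorted(indexes.items()):
    print(name, len(events))
```

Keeping the customer-specific fields in a separate payload is one way to hold the core schema steady while still allowing specialized data for individual customers.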

Model: Next we need to present the data in a useful way. The modeling of the data can vary depending on who is using it or how it is going to be consumed. A dashboard with a quick look at several important metrics can be very useful to an engineer to see the big picture. Seeing this data daily or throughout the day will make anomalies very apparent. This is especially helpful because gathering and organizing the data by hand at scale can be time consuming, and thus could reasonably only be done during periodic audits.

Modeling the data in Splunk allows for a low overhead view with up-to-date data so the engineer can do more important things. A great example of this is provisioned resources by region. If the engineer looks at the data on a regular basis, they would quickly notice that the number of provisioned resources has drastically changed. A 20% increase in the number of EC2 resources could mean several things: perhaps the customer is doing a large deployment, or maybe Justin accidentally put his AWS access key and secret key on GitHub (again).
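The kind of check an engineer does by eye on that dashboard can be sketched in a few lines. The example below compares per-region resource counts between two snapshots and flags any region whose count moved by 20% or more. The counts, the function names, and the choice to treat a newly appearing region as an infinite increase are assumptions for illustration, not a description of our Splunk searches.

```python
def percent_change(previous, current):
    """Percentage change from previous to current; a new region counts as infinite growth."""
    if previous == 0:
        return float("inf") if current > 0 else 0.0
    return (current - previous) / previous * 100.0


def flag_regions(previous_counts, current_counts, threshold_pct=20.0):
    """Return the regions whose resource count changed by at least threshold_pct."""
    flagged = {}
    for region in sorted(set(previous_counts) | set(current_counts)):
        change = percent_change(
            previous_counts.get(region, 0), current_counts.get(region, 0)
        )
        if abs(change) >= threshold_pct:
            flagged[region] = change
    return flagged


# Hypothetical EC2 counts by region for two consecutive days.
yesterday = {"us-east-1": 100, "us-west-2": 40}
today = {"us-east-1": 125, "us-west-2": 41, "eu-west-1": 3}

flagged = flag_regions(yesterday, today)
print(flagged)
```

Flagging a region that had no resources the day before is deliberate here: resources appearing somewhere a customer has never deployed is often the more interesting signal.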

We provide our customers with regular reports and reviews of their cloud environments, and we draw on the data collected and modeled in this tool to produce them. Historical data trended over a month, quarter, and year can help you ask questions or tell a story. It can help you forecast your business, or the number of engineers needed to support it. We recently used the historical trending data to show progress of a large project that included waste removal and a resource tagging overhaul for a customer. Not only were we able to show progress throughout the project, but we used that same view to ensure that waste did not creep back up and that the new tagging standards were being applied going forward.

React: Finally, it’s time to act on the data we collected and modeled. Using Splunk alerts we can provide conditional logic to the data patterns and act upon them. From Splunk we can call our ticketing system’s API and create a new incident for an engineer to investigate concerning trends or to notify the customer of a potential security risk. We can also call our own APIs that trigger remediation workflows. A few common scenarios are encrypting unencrypted S3 buckets, deleting old snapshots, restarting failed backup jobs, requesting cloud provider limit increases, etc.
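A minimal sketch of that routing logic might look like the following. The alert types, handler names, and message strings are hypothetical, and the handlers return a description of what they would do instead of calling a real ticketing system or remediation API.

```python
def open_incident(alert):
    """Stand-in for a ticketing-system API call; returns a description only."""
    return f"incident opened: {alert['summary']}"


def encrypt_bucket(alert):
    """Stand-in for a remediation workflow that enables bucket encryption."""
    return f"remediation started: enable encryption on {alert['resource']}"


def delete_snapshot(alert):
    """Stand-in for a remediation workflow that removes an old snapshot."""
    return f"remediation started: delete snapshot {alert['resource']}"


# Alert types with a known automated fix; anything else goes to an engineer.
REMEDIATIONS = {
    "s3_bucket_unencrypted": encrypt_bucket,
    "snapshot_expired": delete_snapshot,
}


def react(alert):
    """Run the matching remediation, or fall back to opening an incident."""
    handler = REMEDIATIONS.get(alert["type"], open_incident)
    return handler(alert)


alerts = [
    {"type": "s3_bucket_unencrypted", "resource": "example-bucket",
     "summary": "Bucket without default encryption"},
    {"type": "ec2_count_spike", "resource": "us-east-1",
     "summary": "EC2 count up 25% in us-east-1"},
]

results = [react(alert) for alert in alerts]
for line in results:
    print(line)
```

Defaulting to an incident for unknown alert types keeps a human in the loop for anything that has not been explicitly approved for automation.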

Because we have several independent data sources providing information, we can also correlate events and have more advanced conditional logic. If we see that a server is failing status checks, we can also look to see if it recently changed instance families or if it has all the appropriate drivers. This data can be included in the incident and available for the engineer to review without having to check it themselves.
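The correlation described above can be sketched as a small enrichment step: when a status check fails, look for instance type changes on the same instance within a recent window and attach them to the incident. The event shapes, the 24-hour window, the instance IDs, and the timestamps are assumptions chosen for the example.

```python
from datetime import datetime, timedelta


def recent_type_changes(change_events, instance_id, failed_at, window=timedelta(hours=24)):
    """Return instance type changes for this instance within the window before the failure."""
    return [
        event
        for event in change_events
        if event["instance_id"] == instance_id
        and event["event"] == "instance_type_changed"
        and timedelta(0) <= failed_at - event["time"] <= window
    ]


def build_incident(status_failure, change_events):
    """Create an incident record that already carries the correlated context."""
    related = recent_type_changes(
        change_events, status_failure["instance_id"], status_failure["time"]
    )
    return {
        "title": f"Status check failed on {status_failure['instance_id']}",
        "context": related,
        "hint": "check drivers for the new instance family" if related else None,
    }


# Hypothetical events from two independent data sources.
failure = {"instance_id": "i-0abc", "time": datetime(2024, 1, 2, 12, 0)}
changes = [
    {"instance_id": "i-0abc", "event": "instance_type_changed", "time": datetime(2024, 1, 2, 9, 0)},
    {"instance_id": "i-0abc", "event": "instance_type_changed", "time": datetime(2023, 12, 20, 9, 0)},
    {"instance_id": "i-0def", "event": "instance_type_changed", "time": datetime(2024, 1, 2, 10, 0)},
]

incident = build_incident(failure, changes)
print(incident["title"], len(incident["context"]), incident["hint"])
```

Only the change three hours before the failure is attached; the older change and the change on a different instance are filtered out, so the engineer sees just the context that is likely to matter.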

The entire premise of this idea and the solution it outlines is about efficiency and using data and automation to make quicker and smarter decisions. Operating and maintaining systems at scale brings forth numerous challenges and if you are unable to efficiently accommodate the vast amount of information coming at you, you will spend a lot of energy just trying to keep your head above water.

For help getting started in automating your systems, contact us.

-Kenneth Weinreich, Managed Cloud Operations