The exponential growth of big data is pushing companies to process massive amounts of information as quickly as possible, which is often unrealistic, impractical, or simply not achievable on standard CPUs. In a nutshell, High Performance Computing (HPC) allows you to scale performance to process and report on the data more quickly, and it can be the solution to many of your big data problems.
However, this still relies on your cluster’s capabilities. By using AWS for your HPC needs, you no longer have to worry about designing and adjusting your job to meet the capabilities of your cluster. Instead, you can quickly design and change your cluster to meet the needs of your jobs. There are several tools and services available to help you do this, like the AWS Marketplace, AWS APIs, or AWS CloudFormation templates.
Today, I’d like to focus on one aspect of running an HPC cluster in AWS that people tend to forget about – placement groups.
Placement groups are a logical grouping of instances in a single Availability Zone. This allows you to take full advantage of a low-latency 10 Gbps network, which in turn lets you transfer up to 4 TB of data per hour between nodes. However, because of the low-latency 10 Gbps network, a placement group cannot span multiple Availability Zones. This may scare some people away from using them, but it shouldn’t. As a workaround, you can create multiple placement groups in different Availability Zones, and with enhanced networking you can still connect the different HPC clusters.
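As a rough check on the throughput figure above, here is a small sketch of the arithmetic. It assumes the network is a 10 gigabit-per-second link, uses decimal units (1 TB = 10^12 bytes), and ignores protocol overhead; the function names are made up for this illustration.

```python
def terabytes_per_hour(link_gbps: float) -> float:
    """Convert a link speed in gigabits per second to decimal terabytes per hour."""
    bytes_per_second = link_gbps * 1e9 / 8   # bits per second -> bytes per second
    bytes_per_hour = bytes_per_second * 3600
    return bytes_per_hour / 1e12


def required_gbps(tb_per_hour: float) -> float:
    """Sustained link speed (Gbps) needed to move the given number of TB in one hour."""
    bits_per_hour = tb_per_hour * 1e12 * 8
    return bits_per_hour / 3600 / 1e9


line_rate = terabytes_per_hour(10)   # theoretical ceiling of a 10 Gbps link
needed = required_gbps(4)            # sustained rate behind "4 TB per hour"
print(f"{line_rate:.1f} TB/hour at line rate, {needed:.1f} Gbps needed for 4 TB/hour")
```

Under these assumptions, moving 4 TB in an hour means sustaining a little under nine tenths of the link's line rate, which is why the figure is quoted as an upper bound.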
One of the greatest benefits of AWS HPC is that you can run your High Performance Computing clusters with no up-front costs and scale out to hundreds of thousands of cores within minutes to meet your computing needs. Learn more about Big Data and HPC solutions on AWS or Contact Us to get started with a workload workshop.
-Shawn Bliesner, Cloud Architect
Business intelligence (BI) is an umbrella term that refers to a variety of software applications used to analyze an organization’s raw data. BI as a discipline is made up of several related activities including data mining, online analytical processing, querying and reporting. Analytics is the discovery and communication of meaningful patterns in data. This blog will look at a few areas of BI that will include data mining and reporting, as well as talk about using analytics to find the answers you need to make better business decisions.
Data mining is an analytic process designed to explore data. Companies of all sizes continuously collect data, often in very large amounts, in order to solve complex business problems. Data collection can range in purpose from finding out which types of soda your customers like to drink to tracking genome patterns. Processing these large amounts of data quickly takes a lot of processing power, so a system such as Amazon Elastic MapReduce (EMR) is often needed. AWS EMR can handle most use cases, from log analysis to bioinformatics, but it can only report on data that has been collected, so make sure the collected data is accurate and complete.
Reporting accurate and complete data is essential for good BI. Tools like Splunk’s Hunk and Hive work very well with AWS EMR for modeling, reporting, and analyzing data. Hive is data warehouse software that lets you query and report on meaningful patterns in the data using a SQL-like language, while Hunk helps you interactively review logs with real-time alerts. Using the correct tools is the difference between data no one can use and data that provides meaningful BI.
Why do we collect all this data? To find answers, of course! Finding answers in your data, from marketing data to application debugging, is why we collect the data in the first place. AWS EMR is great for processing all that data, with the right tools reporting on it. But beyond knowing what happened, we need to find out how it happened. Interactive queries on the data are required to drill down and find the root causes or customer trends. Tools like Impala and Tableau work great with AWS EMR for these needs.
Business Intelligence and Analytics boil down to collecting accurate and complete data. That includes having a system that can process that data, having the ability to report on that data in a meaningful way, and using that data to find answers. By provisioning the storage, computation and database services you need to collect big data into the cloud, we can help you manage big data, BI and analytics while reducing costs, increasing speed of innovation, and providing high availability and durability so you can focus on making sense of your data and using it to make better business decisions. Learn more about our BI and Analytics Solutions here.
-Brent Anderson, Senior Cloud Engineer
Batch computing isn’t necessarily the most difficult thing to design a solution around, but there are a lot of moving parts to manage, and building in elasticity to handle fluctuations in demand certainly cranks up the complexity. It might not be particularly exciting, but it is one of those things that almost every business has to deal with in some form or another.
The on-demand and ephemeral nature of the Cloud makes batch computing a pretty logical use of the technology, but how do you best architect a solution that will take care of this? Thankfully, AWS has a number of services geared towards just that. Amazon SQS (Simple Queue Service) and SWF (Simple Workflow Service) are both very good tools to assist in managing batch processing jobs in the Cloud. Elastic Transcoder is another tool that is geared specifically around transcoding media files. If your workload is geared more towards analytics and processing petabyte-scale big data, then tools like EMR (Elastic MapReduce) and Kinesis could be right up your alley (we’ll cover that in another blog). In addition to not having to manage any of the infrastructure these services ride on, you also benefit from the streamlined integration with other AWS services like IAM for access control, S3, SNS, DynamoDB, etc.
For this article, we’re going to take a closer look at using SQS and SWF to handle typical batch computing demands.
Simple Queue Service (SQS), as the name suggests, is relatively simple. It provides a queuing system that allows you to reliably populate and consume queues of data. Queued items in SQS are called messages and are either a string, number, or binary value. Messages are variable in size but can be no larger than 256KB (at the time of this writing). If you need to queue data larger than 256KB, the best practice is to store the data elsewhere (e.g. S3, DynamoDB, Redis, MySQL) and use the message body as a pointer to the actual data. Messages are stored redundantly by the SQS service, providing fault tolerance and guaranteed delivery. SQS doesn’t guarantee delivery order or that a message will be delivered only once. That sounds like it could be problematic, but SQS provides something called a Visibility Timeout, which ensures that once a message has been retrieved it will not be delivered again for a given period of time. You (well, your application really) have to tell SQS when you have consumed a message by issuing a delete on that message. The important thing is to do this within the Visibility Timeout; otherwise you may end up processing a single message multiple times. SQS does not simply delete a message once it has been read from the queue because it has no visibility into your application and cannot know whether the message was processed completely, or even read successfully for that matter.
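The Visibility Timeout behavior described above can be illustrated with a small in-memory model. This is only a sketch of the semantics, not the SQS API: the `ToyQueue` class, its method names, and the explicit `now` argument (used instead of a real clock so the example is deterministic) are all assumptions made for this illustration.

```python
import itertools


class ToyQueue:
    """In-memory model of a queue with a visibility timeout (not the real SQS API)."""

    def __init__(self, visibility_timeout: float):
        self.visibility_timeout = visibility_timeout
        self._messages = {}          # message id -> body
        self._invisible_until = {}   # message id -> time it becomes visible again
        self._ids = itertools.count(1)

    def send(self, body: str) -> int:
        msg_id = next(self._ids)
        self._messages[msg_id] = body
        self._invisible_until[msg_id] = 0.0   # visible immediately
        return msg_id

    def receive(self, now: float):
        """Return (id, body) of one visible message and hide it, or None if none is visible."""
        for msg_id, body in self._messages.items():
            if self._invisible_until[msg_id] <= now:
                self._invisible_until[msg_id] = now + self.visibility_timeout
                return msg_id, body
        return None

    def delete(self, msg_id: int) -> None:
        """Called by the consumer once it has finished processing the message."""
        self._messages.pop(msg_id, None)
        self._invisible_until.pop(msg_id, None)


queue = ToyQueue(visibility_timeout=30)
queue.send("resize image 42")

first = queue.receive(now=0)     # delivered, then hidden for 30 seconds
hidden = queue.receive(now=10)   # nothing visible: still inside the timeout
again = queue.receive(now=31)    # never deleted, so it is delivered a second time
queue.delete(again[0])           # the consumer confirms it is done
empty = queue.receive(now=100)   # deleted messages do not come back
```

The second delivery at `now=31` is the case the paragraph warns about: a consumer that takes longer than the timeout, or forgets to delete, will see the same message again, so processing should be safe to repeat.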
Where SQS is designed to be data-centric and remove the burden of managing a queuing application and infrastructure, Simple Workflow Service (SWF) takes it a step further and allows you to better manage the entire workflow around the data. While SWF implies simplicity in its name, it is a bit more complex than SQS (though that added complexity buys you a lot). With SQS you are responsible for managing the state of your workflow and processing of the messages in the queue, but with SWF, the workflow state and much of its management is abstracted away from the infrastructure and application you have to manage. The initiators, workers, and deciders have to interface with the SWF API to trigger state changes, but the state and logical flow are all stored and managed on the backend by SWF. SWF is quite flexible too in that you can use it to work with AWS infrastructure, other public and private cloud providers, or even traditional on-premises infrastructure. SWF supports both sequential and parallel processing of workflow tasks.
Note: if you are familiar with or are already using JMS, you may be interested to know SQS provides a JMS interface through its Java Messaging Library.
One major thing SWF buys you over using SQS is that the execution state of the entire workflow is stored by SWF, separate from the initiators, workers, and deciders. So not only do you not have to concern yourself with maintaining the workflow execution state, it is completely abstracted away from your infrastructure. This makes the SWF architecture highly scalable in nature and inherently very fault-tolerant.
There are a number of good SWF examples and use-cases available on the web. The SWF Developer Guide uses a classic e-commerce customer order workflow (i.e. place order, process payment, ship order, record completed order). The SWF console also has a built-in demo workflow that processes an image and converts it to either grayscale or sepia (requires AWS account login). Either of these is a good example to walk through to gain a better understanding of how SWF is designed to work.
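To make the division of labor concrete, the customer order workflow mentioned above can be sketched as a stateless decider plus a loop that stands in for the service. This is a toy model and does not use the SWF API: the step names, `decide`, and `run_workflow` are hypothetical, and the `history` list plays the part of the execution state that SWF would store for you.

```python
from typing import Callable, Dict, List, Optional

# Steps of the order workflow, in the order they must run (names are illustrative).
ORDER_WORKFLOW = ["place_order", "process_payment", "ship_order", "record_completed_order"]


def decide(history: List[str]) -> Optional[str]:
    """Decider: look at what has completed so far and pick the next activity.

    It keeps no state of its own; everything it needs arrives in the history.
    """
    for step in ORDER_WORKFLOW:
        if step not in history:
            return step
    return None  # nothing left to schedule, so the workflow is complete


def run_workflow(workers: Dict[str, Callable[[], None]]) -> List[str]:
    """Stand-in for the service: holds the history and alternates decider and workers."""
    history: List[str] = []
    step = decide(history)
    while step is not None:
        workers[step]()        # a worker performs the activity
        history.append(step)   # the completion is recorded in the history
        step = decide(history)
    return history


performed = []
workers = {step: (lambda s=step: performed.append(s)) for step in ORDER_WORKFLOW}
history = run_workflow(workers)
```

Because the decider only reads the history it is handed, any decider process can pick up the next decision, which is the property that makes this style of workflow easy to scale and to recover after a failure.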
Contact 2nd Watch today to get started with your batch computing workloads in the cloud.
-Ryan Kennedy, Sr. Cloud Architect
As a digital business, one of the essential platforms you are leveraging today is your ecommerce platform as a way to interact, engage and sell to your customers. 2nd Watch offers Amazon Web Services hosting for ecommerce platforms for large businesses that want a flexible, secure, highly scalable, global and low-cost solution for online sales and retailing.
The architecture and management of the configuration is vital because every second counts to your customers, especially during peak hours, days and seasonal traffic. In today’s highly connected world, forecasting demand can be difficult, and demand often reaches new peaks through social awareness of deals or offers. Consumers are impatient, and their expectations for how fast they get information are increasing. Any performance issues can affect your brand, conversions, sales and ultimately your top line performance. In order for ecommerce platforms to be highly responsive and meet your customer demand, you must design for change so that you can meet your customers where they want, quickly.
Whether your enterprise is running BlueCherry with Microsoft Dynamics AX or Magento, AWS offers the most powerful infrastructure that can scale globally to meet your customers’ demands. The essential part of running in the cloud is the architecture and engineering that will allow your business to scale efficiently to avoid unnecessary costs. With the proper configuration and management, your business can handle millions of catalog views and hundreds of thousands of orders easily to meet your top line objectives.
Enterprise essentials for running on AWS
- Security – At a high level, 2nd Watch has taken the following approach to secure the AWS infrastructure:
- User access. Managing user access and data is one of the most important aspects of a digital business. Enterprises need to control secure access for users. AWS Identity and Access Management (IAM) allows enterprises to control access to AWS services and resources. When an account is properly set up and managed, users and groups have controls and permissions that allow or deny them access to any particular AWS resource. The proper account structure and management are required to ensure security and governance.
Manage IAM users and their access – You can create users in IAM, assign them individual security credentials (in other words, access keys, passwords, and multi-factor authentication devices), or request temporary security credentials to provide users access to AWS services and resources. You can manage permissions in order to control which operations a user can perform.
Manage IAM roles and their permissions – You can create roles in IAM and manage permissions to control which operations can be performed by the entity, or AWS service, that assumes the role. You can also define which entity is allowed to assume the role.
Manage federated users and their permissions – You can enable identity federation to allow existing identities (e.g. users) in your enterprise to access the AWS Management Console, to call AWS APIs, and to access resources, without the need to create an IAM user for each identity.
- Data Privacy. Encrypting data in transit and at rest is extremely important in the public cloud. AWS provides the essential platform enhancements to easily implement an end-to-end encryption solution. Many AWS services use SSL connections by default, and AWS enables users to securely and easily manage custom SSL certificates for their applications. Data encryption for personal or business data at rest within AWS can be easily and transparently implemented using AWS- or user-supplied encryption keys. AWS maintains platform certification compliance for many of the most important data protection and privacy certifications your business requires, and publishes backup and redundancy procedures for services so that customers can gain greater understanding of how their data flows throughout AWS. For more information on the data privacy and backup procedures for each service in the AWS cloud, consult the Amazon Web Services: Overview of Security Processes whitepaper.
- Reports, Certifications, and Independent Attestations. AWS has, in the past, successfully completed multiple SAS70 Type II audits, and now publishes a Service Organization Controls 1 (SOC 1) report, published under both the SSAE 16 and the ISAE 3402 professional standards. In addition, AWS has achieved ISO 27001 certification, and has been successfully validated as a Level 1 service provider under the Payment Card Industry (PCI) Data Security Standard (DSS). In the realm of public sector certifications, AWS has received authorization from the U.S. General Services Administration to operate at the FISMA Moderate level, and is also the platform for applications with Authorities to Operate (ATOs) under the Defense Information Assurance Certification and Accreditation Program (DIACAP). AWS will continue to obtain the appropriate security certifications and conduct audits to demonstrate the security of its infrastructure and services. For more information on risk and compliance activities in the AWS cloud, consult the Amazon Web Services: Risk and Compliance whitepaper.
- Physical Security. Amazon has many years of experience in designing, constructing, and operating large-scale data centers. AWS infrastructure is housed in Amazon-controlled data centers throughout the world. Only those within Amazon who have a legitimate business need to have such information know the actual location of these data centers, and the data centers themselves are secured with a variety of physical controls to prevent unauthorized access.
- Secure Services. Each of the services within the AWS cloud is architected to be secure and contains a number of capabilities that restrict unauthorized access or usage without sacrificing the flexibility that customers demand. For more information about the security capabilities of each service in the AWS cloud, consult the Amazon Web Services: Overview of Security Processes whitepaper referenced above.
- Amazon Elastic Compute Cloud (EC2)
- Auto Scaling
- Elastic Load Balancing
- Amazon CloudFront (CDN)
- Amazon Relational Database Service (RDS)
- Amazon Route 53
- Amazon ElastiCache
- Amazon Simple Storage Service (Amazon S3)
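To make the allow/deny permissions described under User access above more concrete, here is a minimal sketch of how a policy decision can be modeled. The policy document follows the general shape of an IAM policy (`Version`, `Statement`, `Effect`, `Action`, `Resource`), but the evaluator is a deliberate simplification written for this illustration: it covers only default deny and explicit deny overriding allow, it matches wildcards with Python's `fnmatch` rather than IAM's own rules, and the bucket name `example-catalog` is hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical policy: read and write the catalog bucket, but never touch private/.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": "arn:aws:s3:::example-catalog/*",
        },
        {
            "Effect": "Deny",
            "Action": "s3:*",
            "Resource": "arn:aws:s3:::example-catalog/private/*",
        },
    ],
}


def _as_list(value):
    """Policy fields may hold a single string or a list of strings."""
    return value if isinstance(value, list) else [value]


def is_allowed(policy: dict, action: str, resource: str) -> bool:
    """Simplified evaluation: start from deny, any explicit Deny wins over any Allow."""
    allowed = False
    for statement in policy["Statement"]:
        action_matches = any(fnmatchcase(action, a) for a in _as_list(statement["Action"]))
        resource_matches = any(fnmatchcase(resource, r) for r in _as_list(statement["Resource"]))
        if not (action_matches and resource_matches):
            continue
        if statement["Effect"] == "Deny":
            return False   # an explicit deny cannot be overridden
        allowed = True
    return allowed


public_read = is_allowed(policy, "s3:GetObject", "arn:aws:s3:::example-catalog/images/shoe.jpg")
private_read = is_allowed(policy, "s3:GetObject", "arn:aws:s3:::example-catalog/private/keys.txt")
delete_attempt = is_allowed(policy, "s3:DeleteObject", "arn:aws:s3:::example-catalog/images/shoe.jpg")
```

The third case is denied without any Deny statement matching it: anything not explicitly allowed is refused, which is why a well-structured account starts from minimal permissions and adds only what each user or role needs.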
Only proper configuration of enterprise ecommerce platforms and the management of user access, data management and infrastructure (IaaS) management will lead to a successful implementation in the public cloud. With the 2nd Watch solution you get the best practices for architecture, configuration, security, and performance. This allows your platform to accommodate daily, weekly, monthly or yearly cyclical performance requirements that are easily expanded globally.
We are an AWS Premier Partner with over 400 projects on AWS and highly recommend hosting your ecommerce platform on AWS, whether it is BlueCherry with Microsoft Dynamics AX, Magento or another solution. Learn more about 2nd Watch Digital Marketing Solutions on Amazon Web Service Benefits.
Are you interested in a High Performance Solution for an ecommerce platform?
A digital business starts with automation. Learn the latest at our blog, or download our Digital Business Whitepaper.
-Jeff Aden – EVP Marketing & Strategic Business Development
We’ve all been there: Surfing the internet for, well, everything, and then BOOM! The website you land on serves up text, but the static and dynamic images fail to appear, leaving nothing but blank, barren real estate and feelings of frustration. Or perhaps you’ve been trying to download the latest episode of Game of Thrones only to be thwarted by delivery speeds that make Tyrion’s journey to Volantis seem like it was taken aboard the Concorde.
Until recently, I never really stopped to consider, or appreciate, the technology that delivers consumer-facing web content like images, media, games and software downloads. Like most consumers, I’ve been taking for granted that the content I am searching for just appears (like magic) with a single click of a mouse and rapid load of a browser.
What Are CDNs?
Content Delivery Networks (CDNs) have been around since the early days of the web. They are the key technology that enables websites to deliver content to consumers and gives content owners and publishers the ability to scale to meet increasing global demand from consumers using multiple devices and a variety of platforms.
How Do CDNs Work?
In order to achieve optimal delivery performance and accuracy, CDNs maintain a large network of globally distributed servers that are connected to the internet and store or connect to local copies of the customer’s content. By caching the content on the server closest to the end user, a CDN improves the experience by decreasing the amount of time needed to deliver the content to the user’s device.
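The caching behavior described above can be sketched with a toy model: pick the edge server with the lowest latency to the user, serve from its cache when possible, and go back to the origin only on a miss. Everything here is an assumption made for illustration, including the `ToyCDN` class, the edge names, and the latency numbers; a real CDN also handles expiry, invalidation, and request routing, none of which is modeled.

```python
class ToyCDN:
    """Toy model of edge caching; latencies are invented numbers in milliseconds."""

    def __init__(self, origin: dict, edge_latency_ms: dict, origin_latency_ms: int):
        self.origin = origin                        # path -> content held by the origin
        self.edge_latency_ms = edge_latency_ms      # edge name -> {user region: latency}
        self.origin_latency_ms = origin_latency_ms  # extra cost of an edge-to-origin fetch
        self.caches = {edge: {} for edge in edge_latency_ms}

    def get(self, path: str, user_region: str):
        """Return (content, edge used, total latency in ms) for one request."""
        # Route the request to the edge with the lowest latency to this user.
        edge = min(self.edge_latency_ms, key=lambda e: self.edge_latency_ms[e][user_region])
        latency = self.edge_latency_ms[edge][user_region]
        cache = self.caches[edge]
        if path not in cache:
            # Cache miss: fetch from the origin once and keep a local copy.
            cache[path] = self.origin[path]
            latency += self.origin_latency_ms
        return cache[path], edge, latency


origin = {"/logo.png": b"PNG bytes"}
edges = {
    "frankfurt": {"eu": 20, "us": 110},
    "virginia": {"eu": 95, "us": 15},
}
cdn = ToyCDN(origin, edges, origin_latency_ms=150)

first_eu = cdn.get("/logo.png", "eu")    # miss at the nearest edge, pays the origin trip
second_eu = cdn.get("/logo.png", "eu")   # hit: served from the edge's local copy
first_us = cdn.get("/logo.png", "us")    # a different edge has its own cache, so it misses
```

Only the first request through each edge pays for the trip to the origin; every later request for the same object in that region is served at the edge's latency alone.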
Why CDNs are Important
As we discussed in our previous post about websites and web hosting, your website is one of the most visible and valuable ways of communicating with your current and potential customers. While there are several ways your business can benefit from building and hosting its website in the cloud, one key benefit is increased performance, a benefit that is realized when your customers receive the information they want, when they want it, with little to no latency and high data transfer speeds.
Most websites contain a mix of static and dynamic content. Static content includes images or style sheets, while dynamic or application-generated content includes elements of your site that are personalized to each viewer. Previously, developers who wanted to improve the performance and reliability of their dynamic content had limited options, as the solutions offered by traditional CDNs are expensive, hard to configure and difficult to manage.
Public cloud services like Amazon CloudFront are a perfect example of how successful consumer-facing websites like PBS are achieving optimal content delivery speeds that delight visitors and improve the overall customer experience.
In terms of enterprise-related benefits, Amazon CloudFront allows developers to get started in minutes and without long term commitments for use, monthly platform fees or additional costs to deliver dynamic content to your end users. It works seamlessly with dynamic web applications running in Amazon EC2 or your origin running outside of AWS (example: on-premises data center) without any custom coding or proprietary configurations. This makes Amazon CloudFront easy to deploy and manage. Plus, you can use one, single Amazon CloudFront distribution to deliver your entire website, allowing you to use a single domain name for your entire website without the need to separate your static and dynamic content or manage multiple domain names on your website.
An AWS-sponsored whitepaper by Frost & Sullivan that compared the performance of four tested CDNs discusses the benefits for enterprises:
“For enterprise companies in particular, Amazon CloudFront allows them to deliver large volumes of content with reliable performance to a global audience at a fraction of the cost of trying to deliver the content themselves using their own in-house infrastructure. Instead of a content owner having to buy their own servers, rent co-location space, buy bandwidth, enter into long-term contracts with a variety of vendors or worry about traffic spikes and delivery performance, the content owner can use Amazon CloudFront. By using Amazon CloudFront, the content owner can focus their time and resources on their core product and services, not infrastructure.”
The whitepaper also presents its findings from multiple comparison tests that included top CDNs: Amazon CloudFront, Akamai, Level 3 and Limelight. The results show that Amazon CloudFront is, on average, seven percent faster than the next closest CDN and 51 percent faster than the third CDN tested.
There are many kinds of CDNs that deliver everything from small objects like images on websites, to larger pieces of content like software and media downloads. While the type of content can vary, the main goal (and central benefit) of a CDN remains the same: Improving end-user experience by more rapidly and accurately delivering content.
Migrating to Amazon CloudFront
For enterprises, choosing the best CDN partner for their business can be challenging. At 2nd Watch, our digital marketing capabilities are flexible, highly scalable, elastic and enable you to deliver valuable marketing content to your growing customer base—without the need for upfront investments or long-term contracts. It’s a low cost solution that allows you to manage your digital marketing assets (from static and dynamic content to live streaming video and gaming) with ease and agility. Whether you’re migrating your CDN from Akamai or Limelight to Amazon CloudFront, 2nd Watch’s public cloud environments enable you to focus on delivering relevant content that your current and potential customers want, when they want it. Contact us to get started.
-Katie Ellis, Marketing
Cloud technology has proven to create many benefits for companies large and small. The agility that is created by using public-cloud resources has allowed companies to act sooner and respond faster to market conditions and changes. The cost savings of running workloads in the cloud have also confirmed that the on-demand, pay-for-what-you-use model is allowing development shops and enterprise IT departments to save money for their organizations like we have never seen before. We’ve talked about websites and webhosting in the cloud, but the most widely used and now probably overlooked use case for the public cloud is the ability to implement cloud resources for test/dev, and this couldn’t be more relevant for companies that are looking to change their business with digital marketing.
Digital marketing continues to evolve as companies look to leverage technology to enhance their brand, reach new target audiences and nurture the existing customer base. Those customers are using multiple devices on a host of different platforms, and the goal is to reach them in a meaningful way. From email to web apps, to social media and retargeted marketing, there are hundreds of ways to extend your digital marketing efforts, but you want to make sure that it’s done right, on time and cost-effectively. The public cloud gives developers the chance to cultivate new ways of reaching customers on these devices, as well as an inexpensive option for testing the viability of digital marketing campaigns.
Leveraging a public cloud provider like Amazon Web Services (AWS) creates the perfect environments for running development cycles and testing new offerings. Companies like Skytap are creating software that gives development/test teams even more flexibility and efficiency by allowing companies to create Environments-as-a-Service for enterprises in a compliant and secure manner. Analytics companies like New Relic are allowing us to compile application performance data, as well as perform synthetic transactions within our campaigns, so we can grab real-time data on where our customers are going and what they are looking for, and answer the question “Is my application effective in reaching our target audience?” Never before have we seen so many tools to empower our development teams to make amazing marketing campaigns with flawless code deployments because of the ability to leverage public-cloud providers’ infrastructure resources that are easy to set up, always available, and flexible enough to test any software we like.
Digital marketing has come a long way over the years, but we are about to embark on a new era of technology offerings that allow us to reach our customers and track their behavior in a meaningful way. The concept of a digital business using digital marketing efforts is not something that should be considered only for high-tech firms or flashy start-ups. It should be leveraged by all businesses, from the small regional bank to the large multi-national conglomerate. The ability to accept and implement this strategy will define the new Fortune 500 of tomorrow, and it all starts by creating a culture of innovation and change today. Download our white paper, The Digital Enterprise: Transforming Business in the Cloud, to learn more about Dev/Test in the public cloud.
-Blake Diers, Alliance Manager