How To Keep SageMaker AI Cost Under Control and Avoid Bad Billing Surprises when doing Machine Learning in AWS

Amazon SageMaker AI is AWS’ managed service for automating Machine Learning tasks and it’s a great option to build, train and deploy ML models in the cloud. However, because ML tasks are data- and compute-intensive, they can incur very high AWS cost (it’s not uncommon to see thousands of dollars spent in AWS on ML-heavy processes). In this article I’ll cover a number of important considerations and strategies to keep AWS SageMaker cost under control.

How To Cut Your AWS Bill With Savings Plans (and avoid some common mistakes)

Savings Plans are a very effective way for AWS customers to reduce costs (more than 60% savings in some cases). Similar to Reserved Instances, Savings Plans deliver cost reductions in exchange for a long-term commitment. However, when comparing Savings Plans and Reserved Instances, there are many different dynamics to consider. These are often not clear or easy to navigate. I’ve written this article to cover all of these important areas as well as a straightforward process I’ve used to find the best fit for specific workloads and requirements.

How to Operate Reliable AWS Lambda Applications in Production

Are you already using AWS Lambda, or planning to launch your next application using AWS Lambda? How do you make sure your application reliably serves your customers? Operating a Serverless application in a production environment brings some familiar challenges, but also new ones. In this article I cover some points that will make your life easier once your Lambda function runs in a production environment.

Configure your Lambda functions like a champ and let your code sail smoothly to Production

Do you manage applications that rely on AWS Lambda functions? If so, how do you deploy your functions across different stages? In this post we'll take a look at different methods we can use to decouple code and configurations in AWS Lambda, in a reliable and secure way. This is particularly important in an agile development cycle, when code is constantly moving from development to test environments and eventually to Production.
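
As a minimal sketch of what decoupling code from configuration can look like, the snippet below reads stage-specific settings from Lambda environment variables, so the same code package can move from development to Production unchanged. This is one common approach and not necessarily the method the post recommends; the variable names (STAGE, TABLE_NAME, LOG_LEVEL) and the AppConfig class are hypothetical, and secrets would normally need stronger protection than plain environment variables.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AppConfig:
    # Hypothetical settings, chosen only for illustration.
    stage: str
    table_name: str
    log_level: str


def load_config(env=None):
    """Build the configuration from environment variables.

    `env` defaults to os.environ; passing a plain dict makes the
    function easy to test outside of Lambda.
    """
    env = os.environ if env is None else env
    return AppConfig(
        stage=env.get("STAGE", "dev"),          # optional, defaults to "dev"
        table_name=env["TABLE_NAME"],           # required, raises KeyError if missing
        log_level=env.get("LOG_LEVEL", "INFO"), # optional, defaults to "INFO"
    )


def handler(event, context):
    # The handler never hard-codes stage-specific values.
    config = load_config()
    return {"stage": config.stage, "table": config.table_name}
```

With this layout, each stage sets its own values on the function configuration and the deployed code stays identical across environments.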

Save yourself a lot of pain (and money) by choosing your AWS Region wisely

Choosing an AWS region is not a trivial decision. There are many variables that affect the price, performance and availability of your application as well as the AWS services you can use. If you choose the wrong region you could end up paying more than double and waiting several months before you can take advantage of new products and features.

Querying 15 Billion Records - a TPC-H/TPC-DS Performance Comparison between Starburst Enterprise and a Data Lakehouse platform

In this article, I will focus on two popular Data Analytics tools: Starburst Enterprise and a known Lakehouse platform based on Apache Arrow. I’ll focus on deploying these tools on AWS infrastructure and cover areas such as performance, infrastructure setup and maintenance. The article covers TPC-H and TPC-DS benchmark results, with data in Parquet format.

Querying 6.35 Billion Records - a TPC-DS Performance and Cost Comparison between Big Data platforms Starburst Enterprise and EMR SQL engines

In this article, I’ll compare performance, infrastructure setup, maintenance and cost related to 4 Data Analytics solutions: Starburst Enterprise, EMR Presto, EMR Spark and EMR Hive, leveraging the TPC-DS benchmark. As in previous articles, I want to answer the following: "What do I need to do in order to run this workload, how fast will it be and how much will I pay for it?"

How To Fix Your AWS Cost Problems In 5 Simple Steps

It's the beginning of the month and your latest AWS bill is in. You are not happy with your high AWS cost. How do you make sure you reduce your AWS bill in the coming months? Reducing AWS cost is not an easy task - you have to thoroughly assess your business needs to make sure you're focusing on the right areas and not overspending. In this article I'll walk you through a process I follow in order to reduce AWS cost, without putting any applications at risk.

Part IV: Redshift - The Ultimate Guide to Saving Money with AWS Reserved "Anything"

AWS Reserved purchases are a very effective way to significantly reduce AWS cost. In three previous articles, I wrote about EC2, RDS and EMR. In the fourth article in this series, I write about potentially one of the most expensive AWS services: Redshift. I'm also including live price calculations, tips and steps that apply specifically to Redshift.

Part III: EMR - The Ultimate Guide to Saving Money with AWS Reserved "Anything"

An EMR cluster can easily cost you thousands of dollars per month. In two previous articles, I wrote about how to save money by purchasing Reserved capacity for EC2 and RDS - however, there are other services where Reserved purchases can help you reduce AWS cost. In the third article in this series, I take a look at EMR, a potentially very expensive service. I'm also including live price calculations and tips that apply specifically to EMR.

Part II: RDS - The Ultimate Guide to Saving Money with AWS Reserved "Anything"

In a previous post, I wrote about how to save money using EC2 Reserved Instances. But there's more, since AWS Reserved Pricing applies not only to EC2 instances. That's why in the second article in this series, I write about RDS - a potentially expensive item in your AWS bill. I'm also including live price calculations and tips that apply specifically to RDS.

Part I: EC2 - The Ultimate Guide to Saving Money with AWS Reserved "Anything"

Most AWS customers have at least heard of AWS Reserved Pricing. While the concept is popular mainly because of EC2 Reserved Instances, it actually applies to many other AWS Services. In this series, I'll walk you through what to look out for and tips on how to save money by purchasing Reserved capacity in AWS, including live price calculations. I've also included a repeatable, step-by-step process to maximize savings on your reserved purchases. We'll start with EC2.

Querying 8.66 Billion Records, part II - a Performance and Cost Comparison between Starburst Presto and EMR SQL Engines

I recently wrote an article comparing three tools that you can use on AWS to analyze large amounts of data: Starburst Presto, Redshift and Redshift Spectrum. I compared Performance and Cost using data and queries from the TPC-H benchmark, on a 1TB dataset (which adds up to 8.66 billion records!). But as you probably know, there are more data analysis tools that one can use in AWS. One in particular I’m going to take a look at is Elastic MapReduce (EMR). In this article I compare the following tools: Starburst Presto, EMR Presto, EMR Spark and EMR Hive.

Querying 8.66 Billion Records - a Performance and Cost Comparison between Starburst Presto and Redshift

If you're an application owner, sooner or later you'll need to analyze large amounts of data. The good news? Whatever your needs are, you’ll likely be covered. The problem? Handling and analyzing large amounts of data is inherently complicated, which is why it's very important to understand the options out there. In this article, I will focus on three very interesting tools designed to analyze large amounts of data: Starburst Presto, Redshift and Redshift Spectrum. I compare Performance and Cost using data and queries from the TPC-H benchmark, on a 1TB dataset (which adds up to 8.66 billion records!).

How to Cut your S3 Cost in Half by Using the S3 Infrequent Access Storage Class

Most of us accumulate things over time, whether we want to or not. The same happens with data stored in S3. You might have files that were once popular, but now they are only filling up space and making your AWS bill higher than it should be. Thanks to the S3 Infrequent Access storage class, you can save money storing files that are not accessed frequently but that you still want to keep accessible. See how you could reduce your S3 cost by about 50% a year.

How to use AWS Elastic File System to Finally Migrate your Web Applications to the Cloud

Do you want to run your web applications in AWS, but are worried about potential code changes or vendor lock-in? Then AWS Elastic File System (EFS) might be the solution! With EFS you can run your web applications in AWS with minimal or zero code changes and at the same time enjoy all the advantages of using the cloud, such as elasticity, high availability and pay-as-you-go. EFS, however, is a tricky service. It's very easy to run into performance traps. In this article, I show you how to avoid common issues with EFS, based on my own project experience migrating and launching applications using EFS.

Try out MiserBot - a fun and effective way to save money on your AWS bill

Keeping track of and managing AWS cost is a difficult and time-consuming task for AWS customers. Nobody likes doing it, but if you ignore it you could easily experience a bad AWS billing surprise. MiserBot is a Slack and e-mail chatbot designed to make it fun and easy to stay on top of your AWS cost, and save money!

Now you can calculate AWS cost in near real-time for your serverless applications

Calculating AWS cost at scale is a critical task before launching an application. It's one thing to pay a few dollars for a development environment, and quite another to pay for a Production application your business will depend on. That's why I created the AWS Near Real-time Price Calculator tool: an easy, automated way to estimate AWS cost in near real time, using real usage metrics. I just extended this tool's capabilities for serverless applications running on AWS Lambda, Kinesis and DynamoDB.

Use These Tools to Keep your AWS Lambda Cost Under Control.

AWS Lambda is extremely convenient and cheap to get started with, but you have to keep an eye on cost once your applications run at scale. That's why I've built some tools to help with the monitoring and optimization of AWS Lambda cost. If you're planning to run AWS Lambda functions at scale, reading this post can save you thousands of dollars.

Using Athena to Save Money on your AWS Bill

AWS announced Athena back at re:Invent 2016. Athena is a very handy service that lets you query data that is stored in S3, without you having to launch any infrastructure. Just put data files in S3, use SQL syntax and let Athena do its magic. It's awesome. That's why it's a great tool for doing some detailed analysis on AWS Cost and Usage reports. But it's not as straightforward as it sounds, which is why I wrote some tools to simplify the whole process.

How to use AWS QuickSight to do AWS Cost Optimization (and save a lot of money)

AWS cost optimization is one of the most important tasks for any application owner. It's no secret that AWS pricing can be complicated, but thankfully there are many ways in which you can keep cost under control. AWS QuickSight is a great way to analyze billing reports, understand where your money is going and find ways to cut cost. In this article I take a close look at how to use AWS QuickSight to analyze AWS billing reports.

Turbocharge your Locust load tests by exporting results to CloudWatch

You're executing load tests using Locust. Wouldn't it be nice to have your test results in a single dashboard, together with system metrics such as CPU Usage, memory, Disk I/O, etc? In this article I show you how to export your Locust load test results in real time, to CloudWatch Logs and CloudWatch Metrics.

How to know if an AWS service is right for you

You're building a new application that will run on AWS, or migrating an existing application to AWS. If you're considering AWS, you want to pick the right components to power your application. With more than 200 AWS services available today, how do you choose one with confidence? How do you identify advantages and disadvantages? How do you uncover and address critical gaps, as early as possible? In this article I walk you through essential steps for choosing the right AWS services for your application.

Know how much your EC2 application WILL cost you, in near real-time, using this Lambda function.

Do you want to know as soon as possible when you're heading for a very large AWS bill? Like in 10 minutes, not 6, 12 or 24 hours later, when there's not much you can do about it. Or how about executing performance tests and seeing not only response times, but also AWS price metrics, in near real-time, without doing any manual calculations? This article describes a way to calculate monthly EC2 pricing in near real-time, based on your current usage, using the AWS Price List API, AWS Lambda and CloudWatch Events. CloudFormation template included.
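
To give a rough idea of the arithmetic behind a monthly projection, here is a minimal sketch that extrapolates current usage to a full month. It is not the Lambda function from the article and it does not call the AWS Price List API; the function name, the 730-hour month and the example hourly price are all assumptions made for illustration.

```python
# Assumption: an "average" month of 730 hours (8,760 hours / 12).
HOURS_PER_MONTH = 730


def projected_monthly_cost(hourly_price_usd, instance_count, hours=HOURS_PER_MONTH):
    """Extrapolate the current running footprint to a monthly cost.

    hourly_price_usd is a hypothetical per-instance price; in practice
    it would come from a pricing source for the instance type and region.
    """
    return hourly_price_usd * instance_count * hours


if __name__ == "__main__":
    # Hypothetical example: 4 instances at $0.10 per hour each.
    print(f"${projected_monthly_cost(0.10, 4):,.2f} per month")
```

Running the same calculation every few minutes against live usage is what turns a static estimate into an early warning.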

Are you hiring AWS cloud engineers? Here are some tips on what to look for...

Are you hiring engineers to work on your AWS applications? How do you know which candidates are a good fit? Traditional software engineering skills are a must, but there are also specific skills that engineers must have in today's world of cloud development and operations. In this article I write about what to look for when hiring software engineers for your AWS cloud projects.

Do you grant third parties access to your AWS account... Do you also want to know what's going on? Use CloudTrail and the AWS Elasticsearch Service

One of the most important things you should do before working with an external tool or service provider is to make sure you know which operations they are executing on your AWS resources. CloudTrail is AWS' standard auditing mechanism; it logs all API activity that takes place in your account. But one problem is that once you have CloudTrail data, it's difficult to analyze it. In this post I show you how to use CloudFormation to automatically set up CloudTrail and Elasticsearch for easy visualization of your activity data.

How to find an optimal EC2 configuration in 5 steps (with actual performance tests and results)

Ever wondered which EC2 configuration is optimal for your application? Have you ever tried different configurations and found there are a lot of knobs to turn in AWS? If you want to find a configuration in AWS that will support your business growth, you have to test and iterate. In this post I describe the steps I followed to test different EC2 instance types and determine which one best met my requirements. The steps I describe here can be applied to any application type.

How much time do I have left before my instance runs out of CPU credits?

T2 EC2 instance types are a great way to save money if you run an application that typically is not too busy, but that needs to handle occasional bursts in traffic. That being said, you need to understand CPU credits and make sure your application always has a healthy CPU credit balance. If you run out of credits, the CPU in your instance will be capped, putting your customer experience at risk. The table in this post tells you how much time you have left before you run out of CPU credits.
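
The time-left question comes down to simple arithmetic, sketched below. One CPU credit equals one vCPU running at 100% utilization for one minute, so an instance spends credits in proportion to its utilization while earning them at a fixed hourly rate. The function name is hypothetical and this is not the table from the post; the example values (a t2.micro earning 6 credits per hour with a balance of 144) should be checked against the current AWS documentation for your instance type.

```python
def hours_until_credits_depleted(balance, earn_per_hour, vcpus, utilization_pct):
    """Estimate hours left before the CPU credit balance reaches zero.

    Assumes utilization stays constant. Returns None when the instance
    earns credits at least as fast as it spends them.
    """
    # One credit = one vCPU at 100% for one minute, so 60 per vCPU-hour.
    spend_per_hour = vcpus * utilization_pct * 60 / 100
    net_burn = spend_per_hour - earn_per_hour
    if net_burn <= 0:
        return None  # balance is stable or growing
    return balance / net_burn


if __name__ == "__main__":
    # Example values for a t2.micro pinned at 100% CPU.
    hours = hours_until_credits_depleted(
        balance=144, earn_per_hour=6, vcpus=1, utilization_pct=100
    )
    print(f"{hours:.2f} hours left")
```

At or below the baseline utilization the function returns None, which matches the idea of a healthy credit balance.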

Takeaways from the S3 outage on February 28th, 2017.

As you probably know, Amazon S3 suffered a big outage on February 28th, 2017. This affected pretty much all AWS services in AWS' biggest region, N. Virginia, and a big portion of the internet. Here are some key takeaways for the rest of us, as application and business owners.

Publish JMeter results to AWS CloudWatch and get ready for performance test automation.

Do you want to automate tasks around your JMeter performance tests? If you want to know whether your tests passed or failed, the first thing you need is a set of metrics to monitor. In this post I show you how to feed your JMeter test results into CloudWatch Logs and generate test result metrics in real-time. As a bonus, I'm also including a CloudFormation template.