Welcome to Week 4, where you will learn about the benefits of monitoring on AWS, and how to optimize solutions on AWS. You will also learn about the function of Elastic Load Balancing (ELB), and how to differentiate between vertical scaling and horizontal scaling.
Learning Objectives
- Configure high availability for your application
- Differentiate between vertical and horizontal scaling
- Route traffic with Amazon Elastic Load Balancing
- Describe the function of Elastic Load Balancing (ELB)
- Discover how to optimize solutions on AWS
- Describe the function of Amazon CloudWatch on AWS
- Define the benefits of monitoring on AWS
Monitoring on AWS
Video: Introduction to Week 4
Addressing Application Performance and Scalability
The next phase of the course will address two key issues to improve the employee directory application:
- Monitoring: Gaining visibility into application performance and resource usage is crucial. The course will introduce Amazon CloudWatch as a tool to provide these insights.
- Scalability: Since application demand can fluctuate, the course will cover:
- Automation: Adding or removing resources automatically based on demand to optimize performance and cost.
- Load Balancing: Distributing traffic across multiple resources to handle changes in demand.
Goal: The aim is to create a more robust and cost-effective application that can efficiently adapt to varying user needs.
- Hello everyone. Good to see you’re here
and ready to learn. You’re almost done with the course. Now that we’ve added a
storage and database layer, the employee directory
application is fully functional but we still have two more issues. The first issue is that we have no insight into how our application is performing and how it utilizes resources. So to fix this, we will
start off the next lesson with a discussion of a monitoring tool called Amazon CloudWatch. After we understand what the demand for your application looks like, we’ll address the second
problem, scalability. You may find that your application demand
isn’t always constant. So in the last section of the course, you’ll learn how to automate
the process of adding resources as demand increases and reducing capacity as demand declines to reduce cost. You’ll also learn how to deliver traffic and load balance across a
changing amount of resources. Let’s get started.
Video: Monitoring on AWS
Why Monitoring Matters
- Proactive Problem Solving: Don’t wait for users to report problems! With monitoring, you can identify and address performance issues before they significantly impact the user experience.
- Understanding Root Causes: User reports only tell you that there’s a problem, not why. Monitoring provides insights into where the problem originates (database, network, EC2, etc.).
The Monitoring Process
- Collect Data: Gather metrics (e.g., CPU usage) and logs (e.g., network traffic) from various cloud services involved in your application.
- Establish Baseline: Analyze historical data to determine normal operating conditions.
- Set Alerts: Define thresholds that, when exceeded, trigger automatic notifications for investigation or remediation.
The Role of Amazon CloudWatch
- Centralized Monitoring: Collects and displays data from various AWS services (RDS, DynamoDB, EC2, etc.) in a single place.
- Automation: Enables automated responses based on incoming data.
Key Takeaway: In dynamic cloud environments, effective monitoring is essential for maintaining a smoothly running application, providing a positive user experience, and quickly pinpointing the source of issues.
- When you have a solution built out of many different pieces, like how we build solutions on AWS, it’s important to be able to see how different services
are operating over time and react to operational
events as they’re happening. Let’s consider the Employee
Directory application. It’s Monday morning and the users are seeing
latency on page loads. It’s probably not good enough to wait until a user sees the slowdown, calls, or enters a ticket saying, “Hello, your application is running slow.” If you receive a call or
a ticket from your users, you can then react and
troubleshoot the issue. Waiting for users to
notice and report issues to investigate will generally
lead to unhappy end users. Ideally, you’d be able to respond to operational issues before
your end users notice. On top of that, the end user
can only provide information about their experience, and they cannot give insight into the inner workings of your solution. So, they can report the issue to you. But where’s that issue coming from? Is it an EC2 problem? Is it a database problem? Is it a code change that
has recently been deployed? Without monitoring, you have to do some digging
to figure all of that out. So what do we need to do? The first step is to put
proper monitoring in place. Monitoring is a verb. It’s something we do. We collect metrics, we collect logs, we watch network traffic. The data needed to help
you pinpoint problems comes from the different
services and infrastructure hosting the application,
and monitoring tools help you collect the data being
generated by these systems. In a cloud environment
that is ever-changing and ever-scaling, it’s even more important to collect various types of
information about the systems as they scale up and down
and change over time. The different services
that make up your solution generate data points that we call metrics. Metrics that are monitored over
time are called statistics. And metric is a data point like
the current CPU utilization of an EC2 instance, where
other data you monitor could come from different
forms like log files. For example, your network will
generate data like flow logs so you can see how network
traffic is coming into and out of your VPC. The servers will be generating metrics such as how much CPU is
currently being used, or how much network traffic the instance is accepting at any given moment. Then finally, one more example
is your database layer, which will generate
metrics such as the number of simultaneous connections
to your database. So you need a way to collect
all of this information. Once it’s collected, it can be
used to establish a baseline, and this baseline can be used to determine if things are operating smoothly or not. If the information
collected deviates too far from the baseline, you would
then trigger automatic alerts to go out to someone or something to try to remediate the issue. A good monitoring solution gathers data in one centralized location
so you can proactively monitor and operate your system to
keep your end users happy as well as allows for
automated tasks to be triggered based on the data coming in. This is where Amazon CloudWatch comes in. CloudWatch allows you to
monitor your solutions all in one place. Data will be collected from
your cloud-based infrastructure so you can see things like
database metrics coming from RDS or DynamoDB,
alongside EC2 metrics, as well as metrics coming from other services making
up your AWS solution. Coming up next, we’ll dive into some CloudWatch
features and use cases.
Reading 4.1: Monitoring on AWS
Reading
When operating a website like the Employee Directory Application on AWS you may have questions like:
- How many people are visiting my site day to day?
- How can I track the number of visitors over time?
- How will I know if the website is having performance or availability issues?
- What happens if my Amazon Elastic Compute Cloud (EC2) instance runs out of capacity?
- Will I be alerted if my website goes down?
You need a way to collect and analyze data about the operational health and usage of your resources. The act of collecting, analyzing, and using data to make decisions or answer questions about your IT resources and systems is called monitoring. Monitoring enables you to have a near real-time pulse on your system and answer the questions listed above. You can use the data you collect to watch for operational issues caused by events like over-utilization of resources, application flaws, resource misconfiguration, or security-related events. Think of the data collected through monitoring as outputs of the system, or metrics.
Use Metrics to Solve Problems
The resources that host your solutions on AWS all create various forms of data that you might be interested in collecting. You can think of each individual data point that is created by a resource as a metric. Metrics that are collected and analyzed over time become statistics, such as average CPU utilization over time. Consider this: one way to evaluate the health of an Amazon EC2 instance is through CPU utilization. Generally speaking, if an EC2 instance has a high CPU utilization, it can mean a flood of requests. Or it can reflect a process that has encountered an error and is consuming too much of the CPU. When analyzing CPU utilization, look for a process that exceeds a specific threshold for an unusual length of time. Use that abnormal event as a cue to either manually or automatically resolve the issue through actions like scaling the instance. This is one example of a metric. Other examples of metrics for EC2 instances are network utilization, disk performance, memory utilization, and the logs created by the applications running on top of EC2.
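As a concrete illustration of turning raw metrics into statistics, here is a minimal boto3 (Python SDK) sketch that pulls average CPU utilization for a single EC2 instance. The instance ID and time window are placeholders, not values from the course environment.

```python
import boto3
from datetime import datetime, timedelta, timezone

cloudwatch = boto3.client("cloudwatch")

# Average CPU utilization for one instance over the last 24 hours,
# one data point per 5-minute period (basic monitoring granularity).
response = cloudwatch.get_metric_statistics(
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],  # placeholder ID
    StartTime=datetime.now(timezone.utc) - timedelta(hours=24),
    EndTime=datetime.now(timezone.utc),
    Period=300,
    Statistics=["Average"],
)

# Sort the returned data points by time and print them; a sustained run of
# high values here is the kind of abnormal event worth acting on.
for point in sorted(response["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], round(point["Average"], 2))
```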
Know the Different Types of Metrics
Different resources in AWS create different types of metrics. An Amazon Simple Storage Service (S3) bucket would not have CPU utilization like an EC2 instance does. Instead, S3 creates metrics related to the objects stored in a bucket like the overall size, or the number of objects in a bucket. S3 also has metrics related to the requests made to the bucket such as reading or writing objects. Amazon Relational Database Service (RDS) creates metrics such as database connections, CPU utilization of an instance, or disk space consumption. This is not a complete list for any of the services mentioned, but you can see how different resources create different metrics. You could be interested in a wide variety of metrics depending on the types of resources you are using, the goals you have, or the types of questions you want answered.
Understand the Benefits of Monitoring
Monitoring gives you visibility into your resources, but the question now is, “Why is that important?” The following are some of the benefits of monitoring.
Respond to operational issues proactively before your end users are aware of them. It’s a bad practice to wait for end users to let you know your application is experiencing an outage. Through monitoring, you can keep tabs on metrics like error response rate or request latency, over time, that help signal that an outage is going to occur. This enables you to automatically or manually perform actions to prevent the outage from happening—fixing the problem before your end users are aware of it.
Improve the performance and reliability of your resources. Monitoring the different resources that comprise your application provides you with a full picture of how your solution behaves as a system. Monitoring, if done well, can illuminate bottlenecks and inefficient architectures. This enables you to drive performance and reliability improvement processes.
Recognize security threats and events. When you monitor resources, events, and systems over time, you create what is called a baseline. A baseline defines what activity is normal. Using a baseline, you can spot anomalies like unusual traffic spikes or unusual IP addresses accessing your resources. When an anomaly occurs, an alert can be sent out or an action can be taken to investigate the event.
Make data-driven decisions for your business. Monitoring is not only to keep an eye on IT operational health. It also helps drive business decisions. For example, let’s say you launched a new feature for your cat photo app, and want to know whether it’s being used. You can collect application-level metrics and view the number of users who use the new feature. With your findings, you decide whether to invest more time into improving the new feature.
Create more cost-effective solutions. Through monitoring, you can view resources that are being underutilized and rightsize your resources to your usage. This helps you optimize cost and make sure you aren’t spending more money than necessary.
Enable Visibility
AWS resources create data you can monitor through metrics, logs, network traffic, events, and more. This data is coming from components that are distributed in nature, which can lead to difficulty in collecting the data you need if you don’t have a centralized place to review it all. AWS has already done that for you with a service called Amazon CloudWatch.
Amazon CloudWatch is a monitoring and observability service that collects data like those mentioned in this module. CloudWatch provides actionable insights into your applications, and enables you to respond to system-wide performance changes, optimize resource utilization, and get a unified view of operational health. This unified view is important. You can use CloudWatch to:
- Detect anomalous behavior in your environments.
- Set alarms to alert you when something’s not right.
- Visualize logs and metrics with the AWS Management Console.
- Take automated actions like scaling.
- Troubleshoot issues.
- Discover insights to keep your applications healthy.
Video: Introduction to Amazon CloudWatch
What is Amazon CloudWatch?
- A powerful monitoring service within AWS that collects and visualizes metrics from various AWS resources.
- Enables you to understand the health and performance of your applications and infrastructure.
Key Features Demonstrated
- Dashboards:
- Customizable pages to monitor multiple resources in one view.
- Example: Created a dashboard showing CPU utilization of the EC2 instance.
- Alarms:
- Set thresholds for metrics and trigger automated actions when those thresholds are crossed.
- Example: Created an alarm to send an email if CPU utilization exceeds 70% for over 5 minutes.
Other Important Notes
- Automatic Metrics: Many AWS services send metrics to CloudWatch by default.
- Custom Metrics: You can programmatically send additional application-specific metrics to CloudWatch.
- Actions with Alarms: Besides email via SNS, CloudWatch alarms can trigger other actions (e.g., scaling resources), which will be explored later.
Key Takeaway: CloudWatch is a vital tool for keeping track of the health of your AWS infrastructure, alerting you to potential issues, and facilitating automated responses.
- In this video, I’m
gonna run through some of the features that Amazon
CloudWatch has to offer. We’ve already deployed
resources into our AWS account with the employee directory
application, so to get a handle on what CloudWatch is and how it works, let’s dive into the AWS console
and see what we can see. The end goal of this demo
is to set up a dashboard that shows us the CPU utilization of the EC2 instance over time and set an alert that will be sent out if the CPU utilization of
the instance goes over 70% for a period of time. I’m already in the console and I will navigate to Amazon CloudWatch. A lot of the AWS services
begin reporting metrics to CloudWatch automatically
when you create resources. Let’s explore what types
of metrics are available from the resources that
we have already created in a previous lesson. To do that, let’s first
create a dashboard. A dashboard in CloudWatch
is a customizable page in the console that you can use to monitor your different types of resources in a single view. This includes resources that are spread across different AWS regions. I want to create a dashboard that shows me the CPU utilization over
time for the EC2 instance hosting the employee
directory application. So now I’m going to click dashboards in the left-hand navigation and then click create a dashboard and we can then name this
dashboard mydashboard. AWS resources automatically send metrics to CloudWatch and it’s
built into the service. I’m going to select
which widget we can add to the dashboard, and I will
select a line graph here. Next, we can take a look at
what the data source will be for this widget, and I will
select metric for this. Now we can browse through
all of the available metrics. This is organized by
service, so I will find EC2. I will select EC2, and then I want to view per instance metrics. From here, I can scroll through the different EC2-specific metrics that are being reported to CloudWatch and I want to select CPU utilization. Click save, and now we are
brought back to the dashboard with one line graph for one specific
instance’s CPU utilization. This gives us visibility
into one aspect of the health of our EC2 instance. You can explore in your own
account what other metrics are available for you
through CloudWatch by default but you can also report
some custom metrics to CloudWatch programmatically. This is good to know because
with EC2 and CloudWatch, you only get visibility into
the health of the instance by default, which doesn’t really
give you a holistic picture of the health of your application. The application running on the instance might not be operating correctly, but the CPU utilization could be fine. So keep in mind that you may
choose to use custom metrics to get a more detailed and
accurate view of the health in these dashboards. Once a dashboard is created,
you can then share it with your team members. Now let’s move on to another feature of CloudWatch called
Amazon CloudWatch alarms. CloudWatch alarms allow
you to create thresholds for the metrics you’re monitoring, and these thresholds can
define normal boundaries for the values of the metrics. If a metric crosses a
boundary for a period of time, the alarm would be triggered,
and then you can take a couple of different automated actions
after an alarm is triggered. In our use case, I want to notify someone if the CPU utilization
spikes over 70% for a period of time, say five minutes. Let’s create an alarm to do that. Let’s navigate to the Alarm
section and click Create alarm. Now, what I want to do is create an alarm for CPU utilization, so
I will select metric, click on EC2, per instance metrics and scroll down to select CPU utilization. Now I will select the time
period we are monitoring to trigger the alarm, which
in this case is five minutes. You want to make sure you
pick a reasonable time period where you don’t wait too long to respond but you also don’t respond to every short-lived
uptick in CPU utilization. There is a balance to strike here and it will be highly dependent on your specific situation
and your desired outcome. Okay, so now we’ll scroll
down and type in 70 for the static threshold we are watching for, which represents the CPU utilization threshold. If it goes over 70% for
more than five minutes, it’s likely that there’s a problem, and now we will configure
the action that will be taken if the metric triggers the alarm. For CloudWatch alarms, there are three states an alarm can be in: either an ALARM, OK, or INSUFFICIENT_DATA. An alarm can trigger an
action when it transitions between these three states. In our case, I want to
have this send out an alert to an email address when it transitions from OK to an alarm. AWS has a service called Amazon
Simple Notification Service, or SNS, which allows you to create a topic and then send out messages
to subscribers of the topic. You can imagine a scenario
where we have systems admins or developers on call for our employee directory application and if something goes wrong, we want to send them an email
paging them, letting them know that something is going
wrong with the app. So we will select create a new SNS topic since we do not have one in place already. Name it CPU_Utilization_Topic, and then I will put in
an email address here to receive the alert. Notice as I scroll down here that there are other actions you can take for a CloudWatch alarm. We will talk more about some of these options in upcoming lessons. So then click Next and give
this a name and description. And finally, click Create alarm. It will take some time for the alarm to begin
collecting enough information to leave the insufficient data state and transition into the OK state. That is it for now. We will continue to use CloudWatch in upcoming lessons to
make our solution elastic through EC2 auto scaling.
Reading: Reading 4.2: Introduction to Amazon CloudWatch
Reading
How CloudWatch Works
The great thing about CloudWatch is that all you need to get started is an AWS account. It is a managed service, which enables you to focus on monitoring, without managing any underlying infrastructure.
The employee directory app is built with various AWS services working together as building blocks. It would be difficult to monitor all of these different services independently, so CloudWatch acts as one centralized place where metrics are gathered and analyzed. You already learned how EC2 instances post CPU utilization as a metric to CloudWatch. Different AWS resources post different metrics that you can monitor. You can view a list of services that send metrics to CloudWatch in the resources section of this unit.
Many AWS services send metrics automatically for free to CloudWatch at a rate of one data point per metric per 5-minute interval, without you needing to do anything to turn on that data collection. This by itself gives you visibility into your systems without you needing to spend any extra money to do so. This is known as basic monitoring. For many applications, basic monitoring does the job.
For applications running on EC2 instances, you can get more granularity by posting metrics every minute instead of every 5 minutes using a feature called detailed monitoring. Detailed monitoring has an extra fee associated with it. You can read about pricing on the CloudWatch Pricing Page linked in the resources section of this unit.
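If you decide the extra granularity is worth the cost, detailed monitoring can be switched on per instance. A minimal sketch, assuming a placeholder instance ID:

```python
import boto3

ec2 = boto3.client("ec2")

# Enable detailed (1-minute) monitoring for an existing instance.
ec2.monitor_instances(InstanceIds=["i-0123456789abcdef0"])  # placeholder ID

# Revert to basic (5-minute) monitoring when you no longer need it.
ec2.unmonitor_instances(InstanceIds=["i-0123456789abcdef0"])
```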
Break Down Metrics
Each metric in CloudWatch has a timestamp and is organized into containers called namespaces. Metrics in different namespaces are isolated from each other—you can think of them as belonging to different categories.
AWS services that send data to CloudWatch attach dimensions to each metric. A dimension is a name/value pair that is part of the metric’s identity. You can use dimensions to filter the results that CloudWatch returns. For example, you can get statistics for a specific EC2 instance by specifying the InstanceId dimension when you search.
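To see namespaces and dimensions in practice, you can list the metrics CloudWatch holds for one instance. A short sketch, where the instance ID is a placeholder:

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# List every metric in the AWS/EC2 namespace that carries this InstanceId dimension.
response = cloudwatch.list_metrics(
    Namespace="AWS/EC2",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],  # placeholder
)
for metric in response["Metrics"]:
    print(metric["MetricName"], metric["Dimensions"])
```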
Set Up Custom Metrics
Let’s say for your application you wanted to record the number of page views your website gets. How would you record this metric to CloudWatch? It’s an application-level metric, meaning that it’s not something the EC2 instance would post to CloudWatch by default. This is where custom metrics come in. Custom metrics allow you to publish your own metrics to CloudWatch.
If you want to gain more granular visibility, you can use high-resolution custom metrics, which enable you to collect custom metrics down to a 1-second resolution. This means you can send one data point per second per custom metric. Other examples of custom metrics are:
- Web page load times
- Request error rates
- Number of processes or threads on your instance
- Amount of work performed by your application
Note: You can get started with custom metrics by programmatically sending the metric to CloudWatch using the PutMetricData API.
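For example, a page-view counter could be published with PutMetricData. This is only a sketch; the namespace and metric name are illustrative, not part of the sample application:

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Publish one custom data point. StorageResolution=1 makes this a
# high-resolution metric (1-second granularity); use 60 (or omit it)
# for standard resolution.
cloudwatch.put_metric_data(
    Namespace="EmployeeDirectory",          # illustrative custom namespace
    MetricData=[{
        "MetricName": "PageViews",
        "Value": 1,
        "Unit": "Count",
        "StorageResolution": 1,
    }],
)
```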
Understand the CloudWatch Dashboards
Once you’ve provisioned your AWS resources and they are sending metrics to CloudWatch, you can then visualize and review that data using the CloudWatch console with dashboards. Dashboards are customizable home pages that you use for data visualization for one or more metrics through the use of widgets, such as a graph or text.
You can build many custom dashboards, each one focusing on a distinct view of your environment. You can even pull data from different Regions into a single dashboard in order to create a global view of your architecture.
CloudWatch aggregates statistics according to the period of time that you specify when creating your graph or requesting your metrics. You can also choose whether your metric widgets display live data. Live data is data published within the last minute that has not been fully aggregated.
You are not bound to using CloudWatch exclusively for all your visualization needs. You can use external or custom tools to ingest and analyze CloudWatch metrics using the GetMetricData API.
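If you feed an external dashboarding tool, the same CPU metric shown earlier can be exported in bulk with GetMetricData. A minimal sketch with placeholder values:

```python
import boto3
from datetime import datetime, timedelta, timezone

cloudwatch = boto3.client("cloudwatch")

response = cloudwatch.get_metric_data(
    MetricDataQueries=[{
        "Id": "cpu",
        "MetricStat": {
            "Metric": {
                "Namespace": "AWS/EC2",
                "MetricName": "CPUUtilization",
                "Dimensions": [{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
            },
            "Period": 300,
            "Stat": "Average",
        },
    }],
    StartTime=datetime.now(timezone.utc) - timedelta(hours=6),
    EndTime=datetime.now(timezone.utc),
)

# Timestamps and values come back as parallel lists, ready to hand
# to whatever visualization tool you use.
result = response["MetricDataResults"][0]
print(list(zip(result["Timestamps"], result["Values"])))
```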
As far as security goes, you can control who has access to view or manage your CloudWatch dashboards through AWS Identity and Access Management (IAM) policies that get associated with IAM users, IAM groups, or IAM roles.
Get to Know CloudWatch Logs
CloudWatch can also be the centralized place for logs to be stored and analyzed, using CloudWatch Logs. CloudWatch Logs can monitor, store, and access your log files from applications running on Amazon EC2 instances, AWS Lambda functions, and other sources.
CloudWatch Logs allows you to query and filter your log data. For example, let’s say you’re looking into an application logic error for your application, and you know that when this error occurs it will log the stack trace. Since you know it logs the error, you can query your logs in CloudWatch Logs to find the stack trace. You can also set up metric filters on logs, which turn log data into numerical CloudWatch metrics that you can graph and use on your dashboards.
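As a sketch of that kind of query, the following filters a log group for events containing a search term. The log group name and filter pattern are assumptions for illustration:

```python
import boto3
from datetime import datetime, timedelta, timezone

logs = boto3.client("logs")

# Find log events from the last hour whose message contains "Exception".
response = logs.filter_log_events(
    logGroupName="/employee-directory/application",   # hypothetical log group
    filterPattern="Exception",
    startTime=int((datetime.now(timezone.utc) - timedelta(hours=1)).timestamp() * 1000),
)
for event in response["events"]:
    print(event["logStreamName"], event["message"].rstrip())
```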
Some services are set up to send log data to CloudWatch Logs with minimal effort, like AWS Lambda. With AWS Lambda, all you need to do is give the Lambda function the correct IAM permissions to post logs to CloudWatch Logs. Other services require more configuration. For example, if you want to send your application logs from an EC2 instance into CloudWatch Logs, you need to first install and configure the CloudWatch Logs agent on the EC2 instance.
The CloudWatch Logs agent enables Amazon EC2 instances to automatically send log data to CloudWatch Logs. The agent includes the following components.
- A plug-in to the AWS Command Line Interface (CLI) that pushes log data to CloudWatch Logs.
- A script that initiates the process to push data to CloudWatch Logs.
- A cron job that ensures the daemon is always running.
After the agent is installed and configured, you can then view your application logs in CloudWatch Logs.
Learn the CloudWatch Logs Terminology
Log data sent to CloudWatch Logs can come from different sources, so it’s important you understand how they’re organized and the terminology used to describe your logs.
Log event: A log event is a record of activity recorded by the application or resource being monitored, and it has a timestamp and an event message.
Log stream: Log events are then grouped into log streams, which are sequences of log events that all belong to the same resource being monitored. For example, logs for an EC2 instance are grouped together into a log stream that you can then filter or query for insights.
Log groups: Log streams are then organized into log groups. A log group is composed of log streams that all share the same retention and permissions settings. For example, if you have multiple EC2 instances hosting your application and you are sending application log data to CloudWatch Logs, you can group the log streams from each instance into one log group. This helps keep your logs organized.
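To make the terminology concrete, here is a small sketch that creates a log group and a log stream and writes one log event to it. Names are illustrative only; in practice the CloudWatch Logs agent (or a logging library) does this for you:

```python
import time
import boto3

logs = boto3.client("logs")

group = "/employee-directory/application"   # hypothetical log group
stream = "i-0123456789abcdef0"              # one stream per monitored resource

logs.create_log_group(logGroupName=group)   # fails if the group already exists
logs.create_log_stream(logGroupName=group, logStreamName=stream)

# A log event is just a timestamp (in milliseconds) plus a message.
logs.put_log_events(
    logGroupName=group,
    logStreamName=stream,
    logEvents=[{"timestamp": int(time.time() * 1000), "message": "GET /info 200"}],
)
```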
Configure a CloudWatch Alarm
You can create CloudWatch alarms to automatically initiate actions based on sustained state changes of your metrics. You configure when alarms are triggered and the action that is performed.
You first need to decide what metric you want to set up an alarm for, then you define the threshold at which you want the alarm to trigger. Next, you define the time period over which the metric must cross the threshold for the alarm to be triggered.
For example, if you wanted to set up an alarm for an EC2 instance to trigger when the CPU utilization goes over a threshold of 80%, you also need to specify the time period the CPU utilization is over the threshold. You don’t want to trigger an alarm based on short temporary spikes in the CPU. You only want to trigger an alarm if the CPU is elevated for a sustained amount of time, for example if it is over 80% for 5 minutes or longer, which indicates a potential resource issue.
Keeping all that in mind, to set up an alarm you need to choose the metric, the threshold, and the time period. An alarm has three possible states.
- OK: The metric is within the defined threshold. Everything appears to be operating like normal.
- ALARM: The metric is outside of the defined threshold. This could be an operational issue.
- INSUFFICIENT_DATA: The alarm has just started, the metric is not available, or not enough data is available for the metric to determine the alarm state.
An alarm can be triggered when it transitions from one state to another. Once an alarm is triggered, it can initiate an action. Actions can be an Amazon EC2 action, an Auto Scaling action, or a notification sent to Amazon Simple Notification Service (SNS).
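Putting the pieces together, the 80%-for-5-minutes example could be expressed as follows. This is a minimal sketch; the instance ID, topic name, and email address are placeholders:

```python
import boto3

sns = boto3.client("sns")
cloudwatch = boto3.client("cloudwatch")

# Create (or look up) an SNS topic and subscribe an on-call address to it.
topic_arn = sns.create_topic(Name="CPU_Utilization_Topic")["TopicArn"]
sns.subscribe(TopicArn=topic_arn, Protocol="email", Endpoint="oncall@example.com")  # placeholder

# Alarm when average CPU utilization stays above 80% for one 5-minute period.
cloudwatch.put_metric_alarm(
    AlarmName="ec2-cpu-high",
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],  # placeholder
    Statistic="Average",
    Period=300,
    EvaluationPeriods=1,
    Threshold=80.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=[topic_arn],   # notify the topic when the alarm enters the ALARM state
)
```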
Use CloudWatch Alarms to Prevent and Troubleshoot Issues
CloudWatch Logs uses metric filters to turn the log data into metrics that you can graph or set an alarm on. For the employee directory application, let’s say you set up a metric filter for 500-error response codes.
Then, you define an alarm for that metric that will go into the ALARM state if 500-error responses go over a certain amount for a sustained time period. Let’s say that if there are more than five 500-error responses per hour, the alarm should enter the ALARM state. Next, you define an action that you want to take place when the alarm is triggered.
In this case, it makes sense to send an email or text alert to you so you can start troubleshooting the website, hopefully fixing it before it becomes a bigger issue. Once the alarm is set up, you feel comfortable knowing that if the error happens again, you’ll be notified promptly.
You can set up different alarms for different reasons to help you prevent or troubleshoot operational issues. In the scenario just described, the alarm triggered an SNS notification that went to a person who looked into the issue manually. Another option is to have alarms trigger actions that automatically remediate technical issues.
For example, you can set up an alarm to trigger an EC2 instance to reboot, or scale services up or down. You can even set up an alarm to trigger an SNS notification, which then triggers an AWS Lambda function. The Lambda function then calls any AWS API to manage your resources, and troubleshoot operational issues. By using AWS services together like this, you respond to events more quickly.
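Here is a sketch of the 500-error scenario above, assuming the application writes access logs to a hypothetical log group. The filter pattern, names, and topic ARN are assumptions, not the course's actual configuration:

```python
import boto3

logs = boto3.client("logs")
cloudwatch = boto3.client("cloudwatch")

# Turn matching log events into a numeric metric: each log line containing
# " 500 " increments EmployeeDirectory/500ErrorCount by 1.
logs.put_metric_filter(
    logGroupName="/employee-directory/application",   # hypothetical log group
    filterName="http-500-errors",
    filterPattern='" 500 "',
    metricTransformations=[{
        "metricName": "500ErrorCount",
        "metricNamespace": "EmployeeDirectory",
        "metricValue": "1",
    }],
)

# Alarm if more than five 500 errors are recorded within an hour.
cloudwatch.put_metric_alarm(
    AlarmName="http-500-errors-high",
    Namespace="EmployeeDirectory",
    MetricName="500ErrorCount",
    Statistic="Sum",
    Period=3600,
    EvaluationPeriods=1,
    Threshold=5,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:111122223333:CPU_Utilization_Topic"],  # placeholder ARN
)
```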
Resources:
- External Site: AWS: Getting Started with Amazon CloudWatch
- External Site: AWS: What Is Amazon CloudWatch Logs?
- External Site: AWS Services That Publish CloudWatch Metrics
- External Site: AWS: View Available Metrics
- External Site: AWS: Amazon CloudWatch Pricing
- External Site: AWS: Amazon Simple Notification Service
- External Site: AWS: EC2 Auto Scaling Actions
Optimization
Video: Optimizing Solutions on AWS
Current Infrastructure and Issues
- Single EC2 instance in one Availability Zone (AZ) hosts the application.
- DynamoDB and S3 have built-in high availability, but the application itself is a single point of failure.
Scaling: Why It Matters
- To handle increased user demand as the company grows, scaling is required.
- Two options:
- Vertical Scaling: Increasing the size of the instance (limited).
- Horizontal Scaling: Adding more instances (preferred for flexibility).
The Problem with Manual Scaling
- Launching and terminating instances to match demand is tedious and inefficient.
Solutions
- Redundancy for Availability: Add another instance in a different AZ to ensure the application remains online even if one AZ has problems.
- EC2 Auto Scaling: Automatically adds or removes EC2 instances based on defined conditions. This ensures capacity matches demand and maintains the health of the instance fleet.
- Load Balancer: Distributes requests across multiple instances, eliminating the need to track individual IP addresses and simplifying routing.
Key Takeaway: To achieve a highly available and scalable infrastructure, it’s necessary to move beyond single instances and manual management. Auto scaling and load balancing are essential tools for this.
- At this point, you
know how to set up alarms to notify you when your
infrastructure is having capacity, performance, or availability issues. But we need to go one step further. We don’t just want to
know about these issues. We want to either prevent them or respond to them automatically. In this section of the course you’ll learn how to do just that. Currently, our infrastructure
looks like this. We have one EC2 instance hosting our employee directory application in a single availability zone with DynamoDB as its database, and S3 for storage of static assets. If I wanted to evaluate the availability of our infrastructure, I have to look at each
individual component. I know that both DynamoDB and Amazon S3 are highly available by design, so that leaves one issue, this singular instance of our application. If that instance goes down employees have no way to
connect to our application. How do we solve this? Well, as you already know, to increase availability,
we need redundancy. We can achieve this by
adding one more server, but the location of that
server is important. If instance A goes down we don’t want instance B to
go down for the same reasons. So to avoid the unlikely event of, say, an AZ experiencing issues, we should ensure that instance
B is put in another AZ. So now we have two instances
that we’ve manually launched, but let’s say our company grows rapidly, and the employee directory application is constantly being accessed by thousands of employees around the world. To meet this demand we could scale our instances vertically, meaning we could increase the
size of the instances we have, or we could scale our
instances horizontally, meaning we could add more instances to create a fleet of instances. If we scale vertically
eventually we’ll reach the upper limit of
scalability for that instance. But if we scale horizontally, we don’t have those same limitations. I can add as many instances
to my fleet as I’d like, but if I need, say, 15 more
instances to meet demand, that means I’d have to manually
launch those 15 instances and manually shut them down, right? Well, you could do it that way, or you could automate this process with Amazon EC2 auto scaling. This service will allow you to
add and remove EC2 instances based off of conditions that you define. With this service, you can also maintain the health of your fleet of instances, but with more EC2 instances
comes a bigger issue. How do we access those servers? Before we were simply using
the instance public DNS name, or public IP address, but when you have multiple instances, you have multiple IPs to route to. Instead of maintaining
the logic to send requests to your various servers, you would use a load balancer
to distribute those requests across a set of resources for you. And since you connect through your load balancer to access the application, you no longer need to use the public IPs of your EC2 instances. Before the next video
I’ll spin up two instances hosting our employee
directory application. See you soon.
Reading 4.3: Optimizing Solutions on AWS
Reading
What Is Availability?
The availability of a system is typically expressed as a percentage of uptime in a given year or as a number of nines. Below, you can see a list of the percentages of availability based on the downtime per year, as well as its notation in nines.
| Availability (%) | Downtime (per year) |
| --- | --- |
| 90% (“one nine”) | 36.53 days |
| 99% (“two nines”) | 3.65 days |
| 99.9% (“three nines”) | 8.77 hours |
| 99.95% (“three and a half nines”) | 4.38 hours |
| 99.99% (“four nines”) | 52.60 minutes |
| 99.995% (“four and a half nines”) | 26.30 minutes |
| 99.999% (“five nines”) | 5.26 minutes |
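The downtime figures in the table follow directly from the availability percentage; a quick sketch to reproduce them (using a 365.25-day year):

```python
# Convert an availability percentage into downtime per year.
HOURS_PER_YEAR = 365.25 * 24

for availability in (0.90, 0.99, 0.999, 0.9995, 0.9999, 0.99995, 0.99999):
    downtime_hours = (1 - availability) * HOURS_PER_YEAR
    print(f"{availability:.3%}: {downtime_hours / 24:6.2f} days "
          f"= {downtime_hours:7.2f} hours = {downtime_hours * 60:8.2f} minutes")
```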
To increase availability, you need redundancy. This typically means more infrastructure: more data centers, more servers, more databases, and more replication of data. You can imagine that adding more of this infrastructure means a higher cost. Customers want the application to always be available, but you need to draw a line where adding redundancy is no longer viable in terms of revenue.
Improve Application Availability
In the current application, there is only one EC2 instance used to host the application, the photos are served from Amazon Simple Storage Service (S3) and the structured data is stored in Amazon DynamoDB. That single EC2 instance is a single point of failure for the application. Even if the database and S3 are highly available, customers have no way to connect if the single instance becomes unavailable. One way to solve this single point of failure issue is by adding one more server.
Use a Second Availability Zone
The physical location of that server is important. On top of having software issues at the operating system or application level, there can be a hardware issue. It could be in the physical server, the rack, the data center or even the Availability Zone hosting the virtual machine. An easy way to fix the physical location issue is by deploying a second EC2 instance in a different Availability Zone. That would also solve issues with the operating system and the application. However, having more than one instance brings new challenges.
Manage Replication, Redirection, and High Availability
Create a Process for Replication
The first challenge is that you need to create a process to replicate the configuration files, software patches, and application itself across instances. The best method is to automate where you can.
Address Customer Redirection
The second challenge is how to let the clients, the computers sending requests to your server, know about the different servers. There are different tools that can be used here. The most common is using a Domain Name System (DNS) where the client uses one record which points to the IP address of all available servers. However, the time it takes to update that list of IP addresses and for the clients to become aware of such a change, sometimes called propagation, is typically the reason why this method isn’t always used.
Another option is to use a load balancer which takes care of health checks and distributing the load across each server. Being between the client and the server, the load balancer avoids propagation time issues. We discuss load balancers later.
Understand the Types of High Availability
The last challenge to address when having more than one server is the type of availability you need—either an active-passive or an active-active system.
- Active-Passive: With an active-passive system, only one of the two instances is available at a time. One advantage of this method is that for stateful applications where data about the client’s session is stored on the server, there won’t be any issues as the customers are always sent to the same server where their session is stored.
- Active-Active: A disadvantage of active-passive systems, and where an active-active system shines, is scalability. By having both servers available, the second server can take some load for the application, thus allowing the entire system to handle more load. However, if the application is stateful, there would be an issue if the customer’s session isn’t available on both servers. Stateless applications work better for active-active systems.
Video: Amazon EC2 Auto Scaling
Problem: Limited Scalability
- The existing setup with two web servers can’t handle increasing traffic. Manual scaling is time-consuming.
Solution: EC2 Auto Scaling
- Auto Scaling automatically adds or removes EC2 instances based on demand, ensuring the application can handle traffic spikes.
Components
- Launch Template:
- Defines the configuration of instances to be launched (AMI, instance type, security group, user data).
- Ensures new instances are identical to existing web servers.
- Auto Scaling Group (ASG):
- Specifies where to launch instances (VPC, subnets).
- Connects to the load balancer for traffic distribution.
- Sets minimum (2), maximum (4), and desired (2) instance counts.
- Scaling Policy:
- Uses target scaling to adjust ASG capacity.
- Triggers scaling out (adding instances) when CPU utilization exceeds 60%.
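The three components above map to three API calls. Here is a minimal boto3 sketch of the same setup; the AMI ID, security group, subnet IDs, instance profile name, target group ARN, and user data are placeholders, not the course's actual values:

```python
import base64
import boto3

ec2 = boto3.client("ec2")
autoscaling = boto3.client("autoscaling")

user_data = "#!/bin/bash\n# download, unzip, and start the employee directory app here\n"

# 1. Launch template: WHAT to launch.
ec2.create_launch_template(
    LaunchTemplateName="app-launch-template",
    LaunchTemplateData={
        "ImageId": "ami-0123456789abcdef0",                        # placeholder AMI ID
        "InstanceType": "t2.micro",
        "SecurityGroupIds": ["sg-0123456789abcdef0"],              # placeholder security group
        "IamInstanceProfile": {"Name": "employee-web-app-role"},   # placeholder profile name
        "UserData": base64.b64encode(user_data.encode()).decode(),
    },
)

# 2. Auto Scaling group: WHERE, WHEN, and HOW MANY to launch.
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="app-asg",
    LaunchTemplate={"LaunchTemplateName": "app-launch-template", "Version": "$Latest"},
    MinSize=2,
    MaxSize=4,
    DesiredCapacity=2,
    VPCZoneIdentifier="subnet-0aaaaaaaaaaaaaaaa,subnet-0bbbbbbbbbbbbbbbb",  # placeholder private subnets
    TargetGroupARNs=["arn:aws:elasticloadbalancing:us-east-1:111122223333:"
                     "targetgroup/app-target-group/0123456789abcdef"],      # placeholder ARN
    HealthCheckType="ELB",          # use the load balancer's health checks
    HealthCheckGracePeriod=300,
)

# 3. Scaling policy: target tracking on average CPU utilization at 60%.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="app-asg",
    PolicyName="cpu-target-tracking",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {"PredefinedMetricType": "ASGAverageCPUUtilization"},
        "TargetValue": 60.0,
    },
)
```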
Stress Test & Results
- The “Stress CPU” button in the app simulates load.
- CloudWatch shows:
- CPU utilization alarm triggered.
- Auto Scaling launched two new instances.
- Load decreased as traffic spread to more instances.
Key Takeaways
- Auto Scaling makes the application dynamically adjust to demand.
- Correct termination is important: delete the Auto Scaling Group, not just the instances, to avoid unwanted replacements.
- As more people visit our application, the demand on our two web
servers is going to increase. At some point, our two instances aren’t going to be able to handle that demand, and we’re going to need
to add more EC2 instances. Instead of launching
these instances manually, we want to do it automatically
with EC2 Auto Scaling. Autoscaling is what allows
us to provision more capacity on demand depending on
different thresholds that we set and we can set those in CloudWatch. Okay, so we’re going to
draw a high level example of how this works and then Seth
is going to build it for us. So looking at our application, traffic coming in from the outside can come down to either EC2 instance. In this video, these EC2 instances will be part of an auto-scaling group. Each of the EC2 instances that we launch will be completely identical. Then we’re going to run
code to simulate stress on our employee directory application that will make the instances
think that they’re overloaded. When this happens, the instances
will report to CloudWatch and say that their CPUs are overloaded. CloudWatch will then
go into an alarm state and tell auto-scaling,
give me more EC2 instances. As each instance comes online, they will pass ALB Health checks. They’ll start to receive traffic and give us the horizontal
scalability that we need. Seth will be helping us
build out the scalability. Let’s have him join us. You got all that, right? – Yes. It’s time to make this app scale. Let’s make it happen. So to get this built out,
the first thing we need to do is create a launch template. This is going to define what to launch. So it’s all about setting
the configurations for the instances we want to launch. Morgan said we want our
instances to be identical and that’s what we’ll be configuring here. First thing we’re going
to do in the EC2 dashboard is find launch templates on the side panel and click create launch template. We’ll first provide a
name and description. I’ll call this app-launch-template and give it a description of a web server for the employee directory app. Then you’ll notice this handy check box that asks if we are going
to use this launch template with EC2 Auto Scaling. We are. So we’ll check it and scroll down. We’ll then choose the AMI, again, the launch template
is all about what we launch. So what we want to do is create
mirrors of the web servers hosting our app that we
already have running. That way whenever we scale out,
we just have more instances with the exact same configuration. When we launched the instance
hosting our app earlier, we used the Amazon Linux AMI and a t2.micro instance type. So we’ll select both of those options. Next we choose the security
group for our new instances. We’ll use the same security group you created earlier in the
course, the web security group. Then we scroll down and expand the advanced details selection. Here we’ll choose our instance role, the same role we used previously in the instance profile dropdown. Once we do that, we’ll scroll all the way
down and paste our user data. This is what grabs our source code and unzips it so that our
application runs on EC2. Now we’re done and we can click create. Now that we’ve configured
our launch template which again defines what we launch, we now have to define
an auto-scaling group which tells us where, when and
how many instances to launch. To create an autoscaling group, we’ll select autoscaling
groups on the side panel and then create an autoscaling group. Here, we’ll enter a name such as app-asg and then select the launch
template we just created app launch template and then click next. Then we’ll select our network. We’ll choose the same VPC we
created earlier in the course. App-vpc and select both the
private subnets we created, Private A and private
B and then click next. We then need to select attached
to an existing load balancer to receive traffic from the
load balancer we created earlier and then choose our target
group, app-target-group. Click enable ELB health checks so that the load balancer will check if your instances are
healthy and then click next. Now, we’ll choose the maximum, minimum, and desired capacity. The minimum we’ll say is two. This means that our auto-scaling group will always have at least two instances, one in each availability zone. The maximum we’ll say is four, which means that our fleet can
scale up to four instances. And the desired capacity, which is the number of instances
you want to be running, we’ll say is two. Next, we can configure
the auto-scaling policies. With scaling policies, you
define how to scale the capacity of your auto-scaling group in
response to changing demand. For example, you could
set up a scaling policy with CloudWatch that whenever
your instance CPU utilization or any other metric that you’d
like reaches a certain limit, you can deploy more EC2 instances to your auto-scaling group. So what we want to do is
use target scaling policies to adjust the capacity of this group. Earlier, Morgan created a CloudWatch alarm that resulted in the action
of sending out an email. Here we’re going to create
a target tracking policy, much like Morgan created the alarm, but this time it will result in the action of triggering an auto-scaling event. So we’ll name this CPU utilization and then we’ll say that we
want to add a new instance to our fleet whenever
the target value is 60%. We’ll also keep the instance warmup setting at 300 seconds, giving new instances time to warm up before scaling again. Then we’ll click next to
configure notifications when a scaling event happens. This is optional, so for now
we’re going to skip past it. All right, here we can review and click create autoscaling group. Now all that’s left is
to stress our application and make sure that it actually
scales up to meet the demand. To do that, I’ll open up a new tab and paste the endpoint for
our elastic load balancer. Here’s our application. I’m going to go to the info page by appending /info to the URL. You’ll notice here that we
built in a stress CPU feature. This is going to mimic the
load coming into our servers. In the real world, you would probably use
a load testing tool, but instead we built a stress CPU button as a quick and easy way to test this out. Then we can watch our servers scale with that auto-scaling policy. So as the CPU utilization goes up, our auto-scaling policy will be triggered and our server groups will grow in size. I’m going to select 10
minutes for our stress test. All right, some time has passed. We’ve stressed our application, and if we take a look at CloudWatch, we can see what happened. So let’s click on CloudWatch. If you look here, we can
see our alarm summary. We went over 60% CPU
utilization across instances. That was our threshold, so
we launched new instances. We launched two new instances
into our autoscaling group. Then you can see the
loads start to come down. Because we’ve launched more instances into our autoscaling group, there were more hosts to accept traffic for our elastic load balancer and the average CPU utilization
went down across servers. All right, let’s go look
at the EC2 instances that were launched into
the autoscaling group. I’m going to go to the EC2 dashboard. From here, I’m going to
scroll down to target groups and select the app target group. I’m going to select
targets, and we can now see that we have four healthy target instances in our auto-scaling group. So now we have an environment
with an auto-scaling group and our launch template. We’ve set up an alarm in CloudWatch that whenever our CPU
utilization goes over 60%, we’re going to launch more
instances into that group. All right, and this app
should be able to scale now using EC2 Auto Scaling. – Great. Thank you, Seth. All right. Now if we wait around long enough, what we would see is as
the CPU load dropped off then one by one they
would each get terminated because we no longer need them. Bringing our state all the way back down to the basic two we need, the minimum number for
this particular group. All of that done without
any human interaction. If for some reason you are
doing this in your AWS account, don’t forget to delete
the autoscaling group instead of the instances. Otherwise, guess what will happen? The autoscaling group will
spin up more instances to replace the ones that were deleted. So ensure you are deleting the
auto-scaling group as well.
Video: Route Traffic with Amazon Elastic Load Balancing
What is Elastic Load Balancing (ELB)?
- A service that distributes incoming application traffic across multiple servers (e.g., EC2 instances).
- Improves scalability and fault tolerance by ensuring no single server is overwhelmed.
- Highly available and automatically scales to handle traffic fluctuations.
Types of Load Balancers
- Application Load Balancer (ALB): Works with HTTP/HTTPS traffic (web applications).
- Network Load Balancer: Handles TCP, UDP, and TLS traffic for high-performance needs.
- Gateway Load Balancer: Routes traffic to third-party appliances.
Key Components of an ALB
- Listener:
- Checks for incoming requests on a specific port and protocol (e.g., port 80, HTTP).
- Target Group:
- Group of backend resources to route traffic to (EC2 instances, Lambda functions, etc.)
- Has health checks to ensure targets are operational.
- Rule:
- Defines how requests are routed to target groups.
- Can be based on paths in the URL (e.g., send /info traffic to a specific target group; see the sketch below).
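As a sketch of such a path-based rule (the listener and target group ARNs are placeholders):

```python
import boto3

elbv2 = boto3.client("elbv2")

# Forward requests whose path starts with /info to a second target group;
# everything else falls through to the listener's default action.
elbv2.create_rule(
    ListenerArn="arn:aws:elasticloadbalancing:us-east-1:111122223333:"
                "listener/app/app-elb/0123456789abcdef/0123456789abcdef",    # placeholder
    Priority=10,
    Conditions=[{"Field": "path-pattern", "Values": ["/info*"]}],
    Actions=[{"Type": "forward",
              "TargetGroupArn": "arn:aws:elasticloadbalancing:us-east-1:111122223333:"
                                "targetgroup/target-group-b/0123456789abcdef"}],  # placeholder
)
```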
Benefits of Using ELB
- Scalability: Handles traffic surges without manual intervention.
- Resilience: Reduces the impact of individual server failures.
- Flexibility: Customizable routing rules based on application needs.
How to Create an ALB (Example Steps)
- Go to the EC2 console and select “Load Balancers”.
- Create a new ALB (internet-facing, HTTP on port 80).
- Select VPC, availability zones, and public subnets.
- Choose the appropriate security group (allow traffic on port 80).
- Create a target group, select instances, and include them.
- After the load balancer is created, copy its DNS URL.
- Test it in a browser – the load balancer will distribute your requests across the target EC2 instances.
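The same steps can be expressed as boto3 calls rather than console clicks. This is a minimal sketch; all IDs below are placeholders for the VPC, subnets, security group, and instances created earlier in the course:

```python
import boto3

elbv2 = boto3.client("elbv2")

# Target group: the backend instances the ALB forwards requests to,
# plus a health check so only healthy targets receive traffic.
tg = elbv2.create_target_group(
    Name="app-target-group",
    Protocol="HTTP",
    Port=80,
    VpcId="vpc-0123456789abcdef0",        # placeholder app-vpc ID
    TargetType="instance",
    HealthCheckPath="/",
)
tg_arn = tg["TargetGroups"][0]["TargetGroupArn"]

elbv2.register_targets(
    TargetGroupArn=tg_arn,
    Targets=[{"Id": "i-0aaaaaaaaaaaaaaaa"}, {"Id": "i-0bbbbbbbbbbbbbbbb"}],  # placeholder instances
)

# Internet-facing ALB in the two public subnets, allowing traffic on port 80.
lb = elbv2.create_load_balancer(
    Name="app-elb",
    Scheme="internet-facing",
    Type="application",
    Subnets=["subnet-0ccccccccccccccc1", "subnet-0ccccccccccccccc2"],  # placeholder public subnets
    SecurityGroups=["sg-0123456789abcdef0"],                           # allows inbound port 80
)
lb_arn = lb["LoadBalancers"][0]["LoadBalancerArn"]
print("Test this URL in a browser:", lb["LoadBalancers"][0]["DNSName"])

# Listener: check for HTTP requests on port 80 and forward them to the target group.
elbv2.create_listener(
    LoadBalancerArn=lb_arn,
    Protocol="HTTP",
    Port=80,
    DefaultActions=[{"Type": "forward", "TargetGroupArn": tg_arn}],
)
```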
- Now that we have multiple EC2 instances hosting our application
in private subnets, it’s time to distribute our
request across our servers using the Elastic Load
Balancing or ELB service. Conceptually, how this would work is a typical request for the application would start from the browser of the client and is then sent to the load balancer. From there, the load balancer
determines which EC2 instance to send the request to. After it sends the request, the
return traffic would go back through the load balancer and
back to the client browser, meaning your load balancer is directly in the path of traffic. Now if we’re looking at the architecture, the ELB looks like one thing. It looks like a single point of failure. But ELB is actually, by
design, a highly available and automatically scalable
service, much like S3 is. What that means is that A,
the ELB is a regional service, so you don’t have to worry about maintaining nodes
in each availability zone or having to configure
high availability yourself. AWS maintains that for you. And B, the ELB is designed to
handle additional throughput and will automatically scale up to handle the traffic coming in, without you having to
configure that feature. Now there are several types of load balancers that you can choose. There’s the Application Load Balancer that load balances HTTP and HTTPS traffic. There’s the Network Load Balancer that routes TCP, UDP, and TLS traffic. And there’s the Gateway Load Balancer that is mainly used to
load balance requests to third-party virtual appliances. For our employee directory application, we’ll be load balancing web traffic so we’ll be using the Application
Load Balancer or the ALB. When you create an ALB, you’ll need to configure
three main components. The first component is a listener. The goal of the listener
is to check for requests. To define a listener, a port must be provided
as well as the protocol. For example, since we’re
routing web traffic and we set our application to use port 80, we’d want our load balancer to listen to port 80
using the HTTP protocol. Additionally, we could set
up a listener port for 443 using the HTTPS protocol. The second component is a target group. A target is the type of backend you want to direct traffic
to, such as an EC2 instance, AWS Lambda functions, or IP addresses. A target group is simply just a group of these backend resources. Each target group needs
to have a health check, which is how the load balancer can check that the target is healthy so that it can start accepting traffic. The ALB operates on the application layer, which is layer seven of the OSI model. This gives the load balancer
a lot of cool features, and one of those is the third
component, which is a rule. A rule defines how your requests
are routed to your targets. Each listener has a default rule and you can optionally
define additional rules. So if we had two target groups, A and B, we can set up an additional rule that says if traffic is coming to our /info page, we can deliver that
traffic to target group B. So because of the fact ALB
operates on layer seven, you can customize paths for your traffic. Okay, now it’s time to
put this in practice. Let’s create an Application
Load Balancer in the console. If you want to find the ELB service, you’ll need to go to the EC2 console, so we’ll type in EC2 to
the service search bar and then on the side panel,
we’ll select Load Balancers. Once we bring up our
load balancers dashboard that shows all of the load
balancers for this region, we’ll then click Create load balancer. Here’s where we choose
which type of ELB we want and you see the three main types. We’ll be using the ALB. From here, we’ll choose a
name and call it app-elb. The next setting asks if
we want an internet-facing or an internal-facing load balancer. An internet-facing load balancer
does what you might expect, which is route requests from
clients over the internet to your backend servers or targets. An internal load balancer,
on the other hand, will route requests from
clients with a private IP to targets with a private IP. For example, if you had
a three-tier application with a web, app, and database tier, you could use an
internal-facing load balancer to route traffic from your
web tier to your app tier. For this app, we’ll be
routing internet traffic, so you can leave it as internet-facing. After that, we choose
which availability zones we want to route traffic to. First, we’ll choose the VPC,
select both availability zones, and then choose the two public subnets. Now you choose your security
group for your load balancer. This is where you decide which
traffic you want to allow in. Here we’ll choose a security group that will allow traffic
on port 80 from anywhere. Then, we configure the
listeners and routing. Currently, the default setting is to allow HTTP traffic on port 80. If we want to allow or
limit to HTTPS traffic, we can click Add listener
and choose HTTPS, but for this demo, we’ll
stick with the default. Next, we’ll configure routing, and for this, we need to click
on Create a target group. This will open a new page where we can configure the target group. First, we will select the target type, which will be instances. Then we can scroll down and we’ll give it a name,
such as app-target-group, and leave all of the defaults selected. Then we’ll click Next to
choose which instances we want to live in the target group. For here, I’ll choose the
two instances I’ve created in private subnets and click
Include as pending below and click Create target group. Now coming back to the load
balancer creation page, if we refresh the dropdown
for target groups, we’ll see the target group
that was just created. We will select this and then scroll down, accepting the rest of the defaults and clicking on Create load balancer. After the load balancer has been created, we can then select it and find the DNS URL
in the description box. We can then pull that DNS URL, copy and paste it into a new tab, and from here you can see our app. If I go to the /info page of our app, you can see which availability
zone we’re currently in. If I refresh the page a few times, you should see eventually
that I’m being directed to both my EC2 instances in both AZs.
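The listener rule described in this video (sending /info traffic to target group B) can also be created programmatically. The following is a minimal sketch using the AWS SDK for Python (boto3); the listener and target group ARNs are placeholders rather than values from the demo, and the rule priority is an arbitrary choice.
Python
import boto3

elbv2 = boto3.client("elbv2", region_name="us-west-2")

# Placeholder ARNs -- substitute the ARNs of your own listener and target group.
listener_arn = "arn:aws:elasticloadbalancing:us-west-2:111122223333:listener/app/app-elb/EXAMPLE1/EXAMPLE2"
target_group_b_arn = "arn:aws:elasticloadbalancing:us-west-2:111122223333:targetgroup/target-group-b/EXAMPLE3"

# Forward requests whose path matches /info to target group B.
# All other requests continue to follow the listener's default rule.
elbv2.create_rule(
    ListenerArn=listener_arn,
    Priority=10,  # lower numbers are evaluated first
    Conditions=[{"Field": "path-pattern", "Values": ["/info"]}],
    Actions=[{"Type": "forward", "TargetGroupArn": target_group_b_arn}],
)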
Reading 4.4: Route Traffic with Amazon Elastic Load Balancing
Reading 4.5: Amazon EC2 Auto Scaling
Week 4 Exercise & Assessment
Video: Introduction to Lab 4
Objectives
- Make the employee directory application highly available using AWS tools.
Steps
- Review and Validate EC2 Instance: Examine the existing EC2 instance’s configuration for necessary settings and specifications.
- Create a Launch Template: Build a template based on the EC2 instance configuration, which will define how new instances are launched (a scripted sketch follows after this list).
- Create an Application Load Balancer (ALB): Set up the ALB to distribute traffic across multiple EC2 instances for greater availability.
- Set up Auto Scaling Group: Using the launch template, configure an auto-scaling group that enables the application to dynamically scale up or down based on demand.
- Testing:
- Utilize the application’s built-in “stress” feature to simulate increased user demand.
- Verify that the auto-scaling group responds by launching additional EC2 instances to handle the load.
Goal: Ensure your application can handle varying traffic loads, maintaining high availability and responsiveness for users.
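For readers who prefer to script the launch-template step rather than use the console, here is a minimal sketch using the AWS SDK for Python (boto3). All of the identifiers (AMI ID, key pair, security group, instance profile) are placeholders, and the user data script is abbreviated; use the lab's own values when following along.
Python
import base64
import boto3

ec2 = boto3.client("ec2", region_name="us-west-2")

# Abbreviated placeholder user data -- the lab supplies the full install script.
user_data = base64.b64encode(
    b"#!/bin/bash\n# install and start the employee directory application\n"
).decode()

ec2.create_launch_template(
    LaunchTemplateName="app-launch-template",
    VersionDescription="A web server for the employee directory application",
    LaunchTemplateData={
        "ImageId": "ami-0123456789abcdef0",            # placeholder AMI ID
        "InstanceType": "t2.micro",
        "KeyName": "app-key-pair",                     # placeholder key pair name
        "SecurityGroupIds": ["sg-0123456789abcdef0"],  # placeholder security group ID
        "IamInstanceProfile": {"Name": "employee-directory-instance-profile"},  # placeholder
        "UserData": user_data,                         # launch templates expect base64
    },
)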
- All right, one more lab, and this time we’re going to make the employee directory
application highly available. In this lab, you will first
review an Amazon EC2 instance and validate its configurations. Then using this information, you will create a launch template, which will be used for EC2 Auto Scaling and you will create an
Application Load Balancer. Next, you will set up
an Auto Scaling group using the launch template. This will enable your application to scale in and out with demand. Then you’ll need to test it all out. The application has a stress
feature built into it, which is used to simulate demand. You’ll use this stress feature and then validate that
scaling did actually occur. Alright, go ahead and get started.
Video: Demo - Making the Employee Directory Application Highly Available
Goal
- Handle increased traffic on the “Employee Directory” application through load balancing for even request distribution, and auto-scaling to adjust the number of running instances based on demand.
Steps
- Launch a New Instance: Create a new instance of the application, ensuring it has the same configuration and data access as the existing instance.
- Create a Load Balancer:
- Set up an Application Load Balancer (ALB) to distribute traffic across multiple instances.
- Configure the ALB with the appropriate security group to allow access.
- Create a Target Group:
- This defines the instances the ALB will forward requests to.
- Set up health checks to determine which instances are healthy and can receive requests (see the scripted sketch after this list).
- Register the new instance in the Target Group.
- Launch Template:
- Create a template that defines how new instances will be configured during auto-scaling. This includes instance type, security settings, and the user data script to install the application.
- Auto Scaling Group (ASG):
- Create an ASG with the launch template, ensuring it’s attached to the correct load balancer.
- Set desired, minimum, and maximum capacity (e.g., start with 2 instances, scale up to 4).
- Configure a scaling policy based on average CPU utilization (e.g., scale up when CPU exceeds 60%).
- Testing:
- Verify the application is accessible via the load balancer’s endpoint.
- Use the built-in stress test to increase CPU load and trigger scaling.
- Monitor auto-scaling events: instances being added and becoming healthy.
Key Concepts
- Load Balancing: Distributes traffic across multiple instances to prevent overload and improve responsiveness.
- Auto Scaling: Automatically adjusts the number of running instances based on defined criteria (like CPU utilization) for efficient resource usage and handling changing demand.
- Target Groups: Hold instances that the load balancer will route traffic to.
- Launch Templates: Predefined instructions for launching instances, ensuring consistency during scaling events.
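The target group and health check settings used in this demo map directly onto API parameters. Below is a minimal boto3 sketch using the same thresholds the walkthrough chooses (healthy threshold 2, unhealthy threshold 5, 30-second timeout, 40-second interval); the VPC ID is a placeholder.
Python
import boto3

elbv2 = boto3.client("elbv2", region_name="us-west-2")

# Placeholder VPC ID -- use the app-vpc from the walkthrough.
response = elbv2.create_target_group(
    Name="app-target-group",
    Protocol="HTTP",
    Port=80,
    VpcId="vpc-0123456789abcdef0",
    TargetType="instance",
    HealthCheckProtocol="HTTP",
    HealthCheckPath="/",
    HealthyThresholdCount=2,      # consecutive successes before a target is considered healthy
    UnhealthyThresholdCount=5,    # consecutive failures before a target is considered unhealthy
    HealthCheckTimeoutSeconds=30,
    HealthCheckIntervalSeconds=40,
)
print(response["TargetGroups"][0]["TargetGroupArn"])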
- [Instructor] Hey y’all, and welcome to our final exercise walkthrough where we handle the load balancing and auto scaling for our application. As you can see, I am already logged into the AWS Management Console and I’m using the Admin user that I created earlier in order to handle these tasks. So in order to set up our load balancing and Auto Scaling, the first thing I’m going to do is launch the application instance. So I’m going to go over to EC2, and then go over to my Instances and work with the last
instance that I created. In this case, it’s employee-directory-app-dynamodb. So I will select that and I will go ahead and use the shortcut that we’ve been using, where I click Actions,
Image and templates, and then Launch more like this. In this case, instead of appending dynamodb, I am going to append lb for load balancing for this instance, and then I’m just gonna scroll and make sure that everything is where I want it to be. So I want to make sure I continue using the same key pair, and then I also want to make sure that I enable the public IP, so that I can access and test this instance. From there, I’m going to scroll down to the Advanced details. Again, just to double check that my instance role is still selected and that my user data is where I left it. And from there, I’ll go ahead and click Launch Instance. And now that that instance is launching, I can go over to View my instances, and I will just give this some time to go ahead and be fully launched, so that we can make sure that the instance is up and running and that the application is ready to go. All right, now that it
has been a few minutes, I’ll go ahead and click refresh again. And it looks like two of two checks have passed for this application instance. So just to verify that the application is up and running, I’ll copy that IPv4 IP address and then paste that into a new tab. And it looks like our Employee Directory is up and running. And because it does have access to the S3 bucket and DynamoDB table that were created, the data that I had already added to the directory is still going to be visible even though this is a new instance. So now that I’ve verified that that works, I can go ahead and close that. And what I want to do next is to create the application load balancer. So to do that, I do need to be in the EC2 Console because the load balancers are accessed through the EC2 Console. So the way that I get to the load balancing is, I scroll down in this
side navigation pane, and then there’s this section near the bottom that says Load Balancing. What I want to do is click Load Balancers. And now that I’m in the Load Balancers page, I will go ahead and click Create load balancer. And the type of load balancer that I want here is an Application Load Balancer. So under that option, I will go ahead and click Create. For my load balancer name, I want to go ahead and put in app-elb, and then I do wanna make sure that this is an internet-facing load balancer, so that I can access its endpoint from the internet. After that, I’ll go ahead and scroll down. And under Network mapping, I need to make sure that I select my app-vpc because that is where my instances are launched, that’s where my instances for this application will be launched. And then for the
Availability Zone mappings, I want to select both us-west-2a and us-west-2b. Once those are selected, you can see that it pops up with the ability for me to specify which subnets I want to route within, and I will go ahead and keep that with the Public Subnets for those Availability Zones. So scrolling down, I get to Security groups. And what I want to do here is just make sure that proper access is set up for my load balancer. So to do that, instead of using this
default security group, I’m going to go ahead and click Create new security group, and that will take me to a page where I can do just that. I can create a new security group. So for this, I’m going to name it load-balancer-sg and then my Description is just… This is for HTTP access. The VPC for this needs to be my app-vpc, so just making sure that that’s the one that is selected. And then I want to work with my rules. So it currently has no inbound rules, so I’ll go ahead and add a rule, and then add HTTP access from anywhere. And from there, I will go ahead and create my security group. So now that that security
group has been created, I can close out that new tab and then refresh my security group list. And now I have the ability to select the security group that I just created, and that will be the security group that I use for this load balancer. Now that that’s been done, I’m going to scroll down to Listeners and routing, and this is gonna be where I determine how my load balancer is receiving requests and how it’s forwarding those requests on. So here I’m going to click Create target group, and that will open up a new tab. I want to keep Instances selected as the target type, and then I want to name my target group, app-target-group. After that, I want to make sure that it’s associated with the correct VPC. And then I want to scroll down to Health checks. And Health checks is going to be how the load balancer determines which instances are capable or which targets are capable of receiving requests. Here I want to expand Advanced health check settings, and I want to make my
Healthy threshold, two, and my Unhealthy threshold, five. I also want to make my Timeout 30 seconds, and I want to change the interval of checks to 40 seconds. So what this does is, the healthy threshold is how many consecutive health checks that are successful are going to determine if the instance or if the target is available to receive requests. An unhealthy threshold is how many consecutive failed health checks determine which targets are not available to receive requests. For the Timeout, this is just how long is it going to take for the health check to fail if no response is received. And then the Interval is just how often are the health checks going to be sent. With all of those selected, I’m going to go ahead and scroll down, and then hit Next. And now I am able to register targets to my target group that I am creating. For this, I am going to go ahead and select the instance that I had just launched to include as a target, and then I’m going to select Include as pending below. Now that that has been included, what’s going to happen is, I will create the target group and then the load balancer will start those health checks to make sure that that instance as a target is up and running. So I’ll go ahead and click Create target group, and that target group has been created. So since that target
group has been created, I will go ahead and close this tab. And back on my load
balancer creation page, I can go ahead and refresh my Listeners and routing dropdown list, and I can select app-target-group which I have just created in order to be utilized by the load balancer. Once I’ve selected that, I can scroll down and click Create load balancer. And now that the load
balancer is being created, I can click View load balancer. And as you can now see, I have a load balancer that has freshly been created and it is just currently
being provisioned, and that will just take a couple of minutes. So after a couple of minutes, we can see that the state of the load balancer has changed to Active. So what I want to do from here is just test to make sure that I can access my application from the load balancer. So to do that, I will select this specific load balancer. And what I want to do is copy the DNS name for this load balancer. With that copied, I will go ahead and open up a new tab, paste that load balancer
endpoint in there. And as we can see, the load balancer is directing traffic to the application instance with the data that I have added. Okay, now that I can see that that works, I’m going to go back to the EC2 Management Console. And the next thing that I want to do is get everything ready for scaling. So the first thing that I need to do in order to do that is to create a launch template. So in the left-hand menu here, under Instances, there’s a selection for Launch Templates. I’m going to go ahead and select that. And because I don’t currently have any launch templates created, I will go ahead and click Create launch template. My name for this will be app-launch-template and my description will just be A web server for the employee directory application. From there for Auto Scaling guidance, I want to select this because I am specifically going to be focusing on EC2 Auto Scaling, and then I want to scroll down and select the components that are necessary for me to launch my instances. What the launch template does is just, it tells Auto Scaling what it is that I want to launch. So what instances, how to configure those instances and all of the details that you would use if you were manually launching instances. So what I can do here is, since I do have instances already running, I can select Currently in use, and that will bring up what I am currently running. And because my account only has one instance currently running, it will pull the information for that instance. For my Instance type, I want to select a t2 micro, because that is the instance type that I want to scale, and it’s Free tier eligible, so that will keep those cost down for this little exercise. For my key, I’ll just keep selecting the same keys that I have been. And then for my Network settings, I want to make sure that I select the security group that I created specifically for this. And so that is going to be my web-security-group. After I’ve done that, I’m going to go ahead and scroll down to my Advanced details and expand that. And the two main things that I want to do here are, I want to make sure that my instance profile is utilizing the role that I’ve set up for access to the other resources, and then I want to make sure that I add my user data in order to install and configure the application that I’m using. So to do that, I will go ahead and paste the user data in here, but then I need, I just need to make sure that I change the necessary components so that I am working with the same resources. So I’ll change my bucket to the bucket that I’ve created for this application, and I will change my region to us-west-2 for the Oregon region. So once I’ve done that, I can create my launch template. And as we can see, my launch template has
been successfully created. And if I go over to View launch templates, I can see the newly
created launch template. From here, I need to create the actual Auto Scaling group. So to do that, I am going to scroll down in this left-hand navigation pane. And under Auto Scaling, I’m going to select Auto Scaling Groups. I don’t currently have
any Auto Scaling groups, so I will click Create Auto Scaling group. The name for this group will be app-asg. And because I’ve created
a launch template, I can go ahead and just select the launch template that was created
specifically for this group. Once that has been selected, I can go ahead and click Next. And this is where I get to choose various launch options. So in this case, I want to make sure that I am launching into my app-vpc, and then I want to launch these instances into my public subnets. So I will choose Public Subnet 1 and Public Subnet 2. With those selected, I can go ahead and click Next. And now I’m in my advanced options. The reason that this is important is because for one I’ve
created a load balancer and I need to make sure that this Auto Scaling group is utilizing that load balancer. So what I can do is attach to an existing load balancer, and then I can choose my load balancer from my load balancer target groups. From here, I can just drop this down and select app-target-group. And then for my Health checks, I will just have the load balancer handle the health checks, since those have already been set up. After that, I will click Next. And here is where I can decide how large and how small I want my group of instances or my fleet of instances to be. So what I’m going to do here is, I’m going to change my Desired capacity to two. I also want my Minimum capacity to be two, and then I want my Maximum
capacity to be four. This will mean that my Auto Scaling group will launch with two instances, and then it will never get smaller than two instances. So if one of those two instances is deemed unhealthy, it will launch an instance to replace that. If there is a load on the load balancer and a load on the instances, then the maximum that my fleet, the maximum size that my fleet will be is four instances. And so that’s what this group size is establishing. After that, I want to make sure that, for my Scaling policies, I choose a Target scaling policy. And for this Target scaling policy, I can change my target value for my average CPU utilization to be 60, and I’ll keep the 300 seconds of warmup as the default. And that’s just going to make sure that if I scale, the scaling is going to be based on my average CPU
utilization across my fleet, and if that hits or goes above 60%, then it will launch new instances, but it will give those
instances 300 seconds or five minutes to warm up and start passing their health checks before it does another scaling action. So now that that’s done, I can go ahead and click Next. And for notifications, I am not currently going to add notifications for my walkthrough, but here you could add a notification and make that notification an SNS topic and utilize your email address so that you can be notified anytime a scaling action is taken. But instead, I’m just going to go ahead and click Next, and then Next again, and just quickly review that everything is where I want it before creating my Auto Scaling group. So now that my Auto Scaling
group has been created, I need it to get everything to that desired capacity. And once it’s there, I can test and make sure that my application is still accessible, and then I can test scaling. So before I test my application, what I want to do is go back to the EC2 Console, and then under Load Balancing, near the bottom of the left-hand menu, I want to hit Target Groups. The reason I’m doing this is that, I want to make sure that before I start testing anything, that everything is running and is healthy. If I want to keep an eye on it, what I can do is click Targets. And then as you can see, the Auto Scaling group has launched two additional instances, one in each Availability Zone, and they are currently healthy. So that means that my application is up and running and I have multiple instances now that will be utilized whenever requests are made. And so, if one were to become unhealthy or go down for any reason, I have the other instances there to still answer those requests. So to test my application and the scaling of the application, what I’m going to do is go back to the application page that I have open, and there’s a stress test built into this application just to make it easier for this exercise. So what I’ll do is append /info to the end of my URL, and that will take me to the page that allows me to handle some tooling. One of the things I can do is I can refresh this page and it, as you’ll see, it will go between the various instances that I have running and the Availability Zones that those instances are running in. So refreshing this will show that it’s changing the instances and changing the Availability Zones that are being hit. But what I want to do to test this is utilize this Stress cpu, built-in test here. And I’m gonna go ahead and stress the CPU for 10 minutes. So this is going to allow me to stress my CPU to get it above that 60% threshold and launch new instances. And so while I’m waiting for that to happen, I’m going to go back over to the target group and view my targets. And then I’m just going to periodically refresh this to see when new instances are being added and when they’re healthy. So I’m just going to
give that some minutes as I wait for those to launch. So now that I’ve given it some time, I’m going to just see if some new instances have been launched. So I’m gonna go over to Instances, and as I can see, two more instances have been launched in response to the scaling action that is happening because of the CPU stress that we initiated. So these instances will be launched and then they will be added to the group of instances that are part of the entire fleet. I hope you enjoyed the course and that the exercises
were helpful to you. And I’ll see you in another course.
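To recap the scaling configuration from this walkthrough in code form, here is a minimal boto3 sketch that creates an Auto Scaling group from the launch template (desired 2, minimum 2, maximum 4, ELB health checks) and attaches a target tracking policy that keeps average CPU near 60% with a 300-second warmup. The subnet IDs and target group ARN are placeholders.
Python
import boto3

autoscaling = boto3.client("autoscaling", region_name="us-west-2")

autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="app-asg",
    LaunchTemplate={"LaunchTemplateName": "app-launch-template", "Version": "$Latest"},
    MinSize=2,
    MaxSize=4,
    DesiredCapacity=2,
    VPCZoneIdentifier="subnet-aaaa1111,subnet-bbbb2222",  # placeholder subnet IDs
    TargetGroupARNs=[
        "arn:aws:elasticloadbalancing:us-west-2:111122223333:targetgroup/app-target-group/EXAMPLE"
    ],
    HealthCheckType="ELB",          # let the load balancer's health checks drive replacement
    HealthCheckGracePeriod=300,
)

# Target tracking: add or remove instances to keep average fleet CPU near 60%.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="app-asg",
    PolicyName="cpu-60-target-tracking",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {"PredefinedMetricType": "ASGAverageCPUUtilization"},
        "TargetValue": 60.0,
    },
    EstimatedInstanceWarmup=300,
)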
Quiz: Week 4 Quiz
What are the three components of Amazon EC2 Auto Scaling?
Launch template, scaling policies, EC2 Auto Scaling group
Amazon EC2 Auto Scaling requires users to specify three main components: a configuration template for the Amazon Elastic Compute Cloud (Amazon EC2) instances (either a launch template or a launch configuration); an EC2 Auto Scaling group to list minimum, maximum, and desired capacity of instances; and scaling policies that scale an instance based on the occurrence of specified conditions or on a schedule. For more information, see Amazon EC2 Auto Scaling.
Gemini
The correct answer is:
- Launch template, scaling policies, EC2 Auto Scaling group
Here’s why each component is crucial for EC2 Auto Scaling:
- Launch Template: Defines the configuration of instances to be launched (instance type, AMI, security group, etc.). This ensures consistency when scaling up.
- Scaling Policies: Determine when and how scaling occurs (e.g., scale out when CPU usage exceeds a threshold, scale in when demand decreases).
- EC2 Auto Scaling Group: The core element that manages the pool of instances. It uses the launch template to launch instances and the scaling policies to add/remove them based on demand.
ChatGPT
The correct answer is:
Launch template, scaling policies, EC2 Auto Scaling group
Llama3
The correct answer is: Launch template, scaling policies, EC2 Auto Scaling group.
Explanation:
- Launch Template: defines the template for launching new instances, including the Amazon Machine Image (AMI), instance type, and other settings.
- Scaling Policies: define how the Auto Scaling group should scale in response to changes in workload or other conditions.
- EC2 Auto Scaling Group: is the group of instances that are managed by Auto Scaling, and is the core component of Auto Scaling.
These three components work together to enable Auto Scaling to automatically add or remove instances from the group based on the scaling policies.
Which of the following features are included in Elastic Load Balancing (ELB)?
A and C
ELB automatically distributes incoming traffic across multiple targets—such as Amazon Elastic Compute Cloud (Amazon EC2) instances, containers, and IP addresses—in one or more Availability Zones. ELB automatically scales its capacity in response to changes in incoming traffic. In addition, if users enable Auto Scaling with Elastic Load Balancing, instances that are launched by Auto Scaling are automatically registered with the load balancer. For more information, see Route Traffic with Amazon Elastic Load Balancing.
Gemini
The correct answer is:
- A and C
Here’s why:
- A. Automatic scaling: ELB can integrate with EC2 Auto Scaling. This allows the load balancer to handle increased traffic by automatically scaling out a group of EC2 instances.
- C. Integration with Amazon EC2 Auto Scaling: As mentioned above, this integration allows for seamless scaling of resources in response to demand.
Here’s why the other options are not fully accurate:
- B. Integration with Amazon Relational Database Service RDS: While ELB can distribute traffic to RDS instances, it’s not a primary feature of ELB itself.
ChatGPT
The correct answer is:
A and C
Llama3
The correct answer is: C) Integration with Amazon EC2 Auto Scaling.
Explanation:
Elastic Load Balancing (ELB) provides several features, including:
- Integration with Amazon EC2 Auto Scaling: ELB can work with Auto Scaling to automatically add or remove instances from the load balancer as the Auto Scaling group scales up or down.
- Distribution of traffic: ELB distributes incoming traffic across multiple instances to improve responsiveness, reliability, and scalability.
- Health checks: ELB performs health checks on instances to ensure they are functioning correctly and directs traffic only to healthy instances.
- SSL termination: ELB can handle SSL termination, offloading encryption and decryption tasks from instances.
Automatic scaling is a feature of Amazon EC2 Auto Scaling, not ELB. Integration with Amazon Relational Database Service (RDS) is not a feature of ELB.
True or False: When a user uses Elastic Load Balancing (ELB) with an Auto Scaling group, it is not necessary to manually register individual Amazon Elastic Compute Cloud (Amazon EC2) instances with the load balancer.
True
The load balancer automatically registers new instances. For more information, see Amazon EC2 Auto Scaling.
Gemini
The answer is True.
Here’s why:
- Seamless Integration: When you integrate an Auto Scaling group with an Elastic Load Balancer, the load balancer automatically detects and distributes traffic to healthy instances within the group.
- Dynamic Updates: As the Auto Scaling group launches or terminates instances, the load balancer’s list of targets updates automatically.
- No Manual Intervention: This eliminates the need to manually register or deregister individual EC2 instances, streamlining the process.
Let me know if you’d like a more detailed explanation of how this integration works!
ChatGPT
The statement is True. When using Elastic Load Balancing (ELB) with an Auto Scaling group, it is not necessary to manually register individual Amazon EC2 instances with the load balancer. Auto Scaling automatically registers and deregisters instances with the load balancer as they are launched or terminated.
Llama3
The correct answer is: True.
Explanation:
When you use Elastic Load Balancing (ELB) with an Auto Scaling group, you don’t need to manually register individual Amazon Elastic Compute Cloud (Amazon EC2) instances with the load balancer. Instead, ELB automatically registers and deregisters instances as they are launched or terminated by Auto Scaling. This is one of the benefits of using ELB with Auto Scaling, as it simplifies the process of managing instances and load balancing.
An application must choose target groups by using a rule that is based on the path of a URL. Which Elastic Load Balancing (ELB) type should be used for this use case?
Application Load Balancer
Application Load Balancer is a layer 7 load balancer that routes HTTP and HTTPs traffic, with support for rules. For more information, see Route Traffic with Amazon Elastic Load Balancing.
Gemini
The correct answer is:
- Application Load Balancer
Here’s why:
- Routing Flexibility: Application Load Balancers support content-based routing, allowing you to define rules based on URL paths. This is essential for directing traffic to different target groups based on the specific content being requested.
Why the other options are less suitable:
- Classic Load Balancer: Older generation with more basic routing capabilities, primarily based on IP address and port.
- Network Load Balancer: Operates at the network layer (TCP/UDP), not focused on HTTP-level routing like URL paths.
- Target Load Balancer: This is not a real ELB type.
ChatGPT
For the given use case, where an application must choose target groups based on the path of a URL, the appropriate Elastic Load Balancing (ELB) type to use would be the:
Application Load Balancer
LLama3
The correct answer is: B) Application Load Balancer.
Explanation:
Application Load Balancer (ALB) is the correct choice because it supports path-based routing, which allows you to route traffic to different target groups based on the URL path. This is a key feature of ALB that enables you to route traffic to different microservices or applications based on the URL path.
Classic Load Balancer (CLB) does not support path-based routing, and Network Load Balancer (NLB) is primarily used for load balancing TCP traffic and does not support path-based routing either. There is no such thing as a “Target Load Balancer” in Elastic Load Balancing (ELB).
What are the two ways that an application can be scaled?
Vertically and horizontally
An application can be scaled vertically by adding more power to an existing machine, or it can be scaled horizontally by adding more machines to a pool of resources. For more information, see Optimizing Solutions on AWS.
Gemini
The correct answer is:
- Vertically and horizontally
Here’s the difference between the two scaling approaches:
- Vertical Scaling (“Scaling Up”): Increasing the capacity of an individual machine by adding more powerful resources (CPU, RAM, storage).
- Horizontal Scaling (“Scaling Out”): Adding more machines to distribute the workload across a larger pool of resources.
The other options are not standard scaling terms:
- Diagonally and vertically: Diagonal scaling isn’t a common concept.
- Independently and vertically: Independent scaling suggests unrelated resources, but vertical scaling involves changing a single machine’s resources.
ChatGPT
The correct answer is:
Vertically and horizontally
Llama3
The correct answer is: A) Vertically and horizontally.
Explanation:
There are two ways to scale an application:
- Vertical scaling (also known as scaling up): This involves increasing the power or capacity of individual instances or nodes, such as upgrading to a larger instance type or adding more resources (e.g., CPU, memory, or storage) to an existing instance. This approach can improve performance, but it has limitations, such as increased costs and potential bottlenecks.
- Horizontal scaling (also known as scaling out): This involves adding more instances or nodes to handle increased traffic or demand. This approach can provide greater flexibility, scalability, and fault tolerance, as well as improved performance and availability.
The other options are not correct:
- Diagonal scaling is not a recognized scaling approach.
- Independent scaling is not a recognized scaling approach.
- Horizontal and diagonal scaling is not a recognized combination of scaling approaches.
Which elements in Amazon CloudWatch dashboards can be used to view and analyze metrics?
Widgets
Widgets are the elements that can be added to a dashboard. For more information, see the Introduction to Amazon CloudWatch video.
Gemini
The correct answer is:
- Widgets
Here’s why:
- Visualization Tools: Widgets are the primary elements in CloudWatch dashboards for displaying metrics. They come in various types:
- Line graphs: Show trends over time
- Number graphs: Single value for the current metric
- Text widgets: Annotations, instructions, etc.
Let’s clarify the other options:
- Metrics: The actual data points being measured (CPU usage, request counts, etc.). Widgets display these metrics visually.
- Icons: While CloudWatch may sometimes use icons, they aren’t the primary element for metric visualization.
- Components: A broad term, could refer to various aspects of a dashboard, but less specific than widgets.
ChatGPT
The correct answer is:
Widgets
Llama3
The correct answer is: A) Widgets.
Explanation:
In Amazon CloudWatch, widgets are the elements used to view and analyze metrics on a dashboard. Widgets are visual representations of metrics, such as graphs, tables, and numbers, that provide real-time data about your resources and applications. You can add multiple widgets to a dashboard to create a customized view of your metrics.
The other options are not correct:
- Metrics are the actual data points being measured, such as CPU utilization or request latency. While metrics are an essential part of CloudWatch, they are not the elements used to view and analyze them.
- Icons are graphical symbols used to represent various elements in CloudWatch, such as alarms or metrics. While icons are used in CloudWatch, they are not the primary way to view and analyze metrics.
- Components is a generic term that could refer to various parts of CloudWatch, but it is not a specific element used to view and analyze metrics.
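A dashboard and its widgets can also be created programmatically. Here is a minimal sketch, assuming a boto3 client and a single metric (line graph) widget for the fleet's average CPU; the Auto Scaling group name is reused from the earlier exercise and the layout values are arbitrary.
Python
import json

import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-west-2")

# One metric widget: a line graph of average CPU across the app-asg fleet.
dashboard_body = {
    "widgets": [
        {
            "type": "metric",
            "x": 0, "y": 0, "width": 12, "height": 6,
            "properties": {
                "title": "Fleet average CPU",
                "metrics": [["AWS/EC2", "CPUUtilization", "AutoScalingGroupName", "app-asg"]],
                "stat": "Average",
                "period": 300,
                "region": "us-west-2",
            },
        }
    ]
}

cloudwatch.put_dashboard(
    DashboardName="employee-directory",
    DashboardBody=json.dumps(dashboard_body),
)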
What are the possible states of a metric alarm in Amazon CloudWatch?
OK, ALARM, INSUFFICIENT_DATA
A metric alarm in CloudWatch has the following possible states. OK: The metric or expression is within the defined threshold. ALARM: The metric or expression is outside of the defined threshold. INSUFFICIENT_DATA: For this state, the alarm has just started, the metric is not available, or not enough data is available for the metric to determine the alarm state. For more information, see the Introduction to Amazon CloudWatch video.
Gemini
The correct answer is:
- OK, ALARM, INSUFFICIENT_DATA
Here’s what each state means:
- OK: The metric is within the defined threshold, indicating normal operation.
- ALARM: The metric has breached the defined threshold, triggering the alarm state.
- INSUFFICIENT_DATA: Not enough data points are available to determine whether the metric is in an OK or ALARM state.
ChatGPT
The possible states of a metric alarm in Amazon CloudWatch are:
- OK
- ALARM
- INSUFFICIENT_DATA
Llama3
The correct answer is: C) OK, ALARM, INSUFFICIENT_DATA.
Explanation:
In Amazon CloudWatch, a metric alarm can be in one of three possible states:
- OK: The alarm is not triggered, and the metric is within the defined threshold.
- ALARM: The alarm is triggered, and the metric has exceeded the defined threshold.
- INSUFFICIENT_DATA: There is not enough data available to determine the state of the alarm, usually because the metric is not reporting data or the alarm is still initializing.
The other options are not correct:
- Option A is close, but “NOT_AVAILABLE” is not a valid state for a metric alarm.
- Option B is incorrect because “ALERT” is not a valid state for a metric alarm.
- Option D is incorrect because “NOT_AVAILABLE” is not a valid state for a metric alarm, and the correct state is “INSUFFICIENT_DATA”.
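To see these states in practice, an alarm can be created on a metric and then inspected. Below is a minimal boto3 sketch that alarms when the fleet's average CPU stays above 60% for two consecutive 5-minute periods; the Auto Scaling group name is reused from the exercise, no notification actions are attached, and a newly created alarm typically starts in INSUFFICIENT_DATA until enough data points arrive.
Python
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-west-2")

cloudwatch.put_metric_alarm(
    AlarmName="app-asg-high-cpu",
    AlarmDescription="Average fleet CPU above 60% for two consecutive periods",
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "AutoScalingGroupName", "Value": "app-asg"}],
    Statistic="Average",
    Period=300,                 # evaluate 5-minute data points
    EvaluationPeriods=2,
    Threshold=60.0,
    ComparisonOperator="GreaterThanThreshold",
)

# The alarm's StateValue will be one of OK, ALARM, or INSUFFICIENT_DATA.
state = cloudwatch.describe_alarms(AlarmNames=["app-asg-high-cpu"])["MetricAlarms"][0]["StateValue"]
print(state)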
What kind of data can a company collect with VPC Flow Logs?
Data about network traffic that comes into and out of a virtual private cloud (VPC)
With VPC Flow Logs, a company can collect data about network traffic that comes into and out of their VPC. For more information, see Monitoring on AWS.
Gemini
The correct answer is:
- Data about network traffic that comes into and out of a virtual private cloud (VPC)
VPC Flow Logs specifically capture information about network traffic flows including:
- Source IP address
- Destination IP address
- Ports
- Protocols
- Number of packets
- Number of bytes
- Whether traffic was accepted or rejected
Let’s clarify why the other options aren’t the primary focus:
- Malicious activity and unauthorized behavior: While flow logs can provide some insights for security analysis, they’re not designed to be a comprehensive security monitoring tool.
- Configurations of AWS resources: This type of information is typically found in other services like AWS Config.
- Compliance-related information: Flow logs help demonstrate network access controls, which is one part of compliance, but not the whole picture.
ChatGPT
The correct answer is:
Data about network traffic that comes into and out of a virtual private cloud (VPC)
Llama3
The correct answer is: A) Data about network traffic that comes into and out of a virtual private cloud (VPC).
Explanation:
VPC Flow Logs is a feature in Amazon Web Services (AWS) that allows you to collect and analyze network traffic data from your Virtual Private Cloud (VPC). With VPC Flow Logs, you can collect data about the network traffic that comes into and out of your VPC, including:
- Source and destination IP addresses
- Source and destination ports
- Protocol (e.g., TCP, UDP, ICMP)
- Packet and byte counts
- Accept or reject actions
This data can be used for various purposes, such as:
- Monitoring network traffic patterns
- Identifying security threats
- Optimizing network performance
- Meeting compliance requirements
The other options are not correct:
- Option B is incorrect because while VPC Flow Logs can help with security monitoring, it is not primarily used to detect malicious activity or unauthorized behavior.
- Option C is incorrect because VPC Flow Logs is not used to collect data about AWS resource configurations.
- Option D is incorrect because while VPC Flow Logs can be used to meet compliance requirements, it is not primarily used to collect compliance-related information.
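Flow logs are enabled per VPC, subnet, or network interface. Here is a minimal boto3 sketch that publishes a VPC's flow logs to a CloudWatch Logs log group; the VPC ID, log group name, and IAM role ARN are placeholders, and the role must already allow the flow logs service to write to the log group.
Python
import boto3

ec2 = boto3.client("ec2", region_name="us-west-2")

ec2.create_flow_logs(
    ResourceIds=["vpc-0123456789abcdef0"],        # placeholder VPC ID
    ResourceType="VPC",
    TrafficType="ALL",                            # ACCEPT, REJECT, or ALL
    LogDestinationType="cloud-watch-logs",
    LogGroupName="employee-directory-vpc-flow-logs",                            # placeholder log group
    DeliverLogsPermissionArn="arn:aws:iam::111122223333:role/flow-logs-role",   # placeholder role
)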
What is a benefit of monitoring on AWS?
Monitoring recognizes security threats and events.
When users monitor resources, events, and systems over time, they create what is called a baseline. A baseline defines what activity is normal. By using a baseline, users can spot anomalies, such as unusual traffic spikes or unusual IP addresses that are accessing resources. When an anomaly occurs, an alert can be sent or an action can be taken to investigate the event. For more information, see Reading: Monitoring on AWS.
Gemini
Here’s the most accurate benefit of monitoring on AWS:
- Monitoring recognizes security threats and events.
Monitoring provides crucial insights into:
- Potential vulnerabilities: Identifying unusual activity or configuration changes that could signal a security issue.
- Active threats: Detecting intrusions or unauthorized access attempts.
- Compliance: Demonstrating adherence to security standards and regulations.
Let’s address why the other options aren’t as strong:
- Monitoring creates operational overhead. While there’s some setup and maintenance, effective monitoring leads to long-term efficiency gains by preventing problems.
- Monitoring decreases the performance and reliability of resources. Done correctly, monitoring should have minimal impact on system performance.
- Monitoring increases speed and agility. Monitoring helps identify bottlenecks and performance issues, allowing for faster fixes and better resource utilization.
Let me know if you’d like a deeper dive into how monitoring enhances security on AWS!
ChatGPT
The benefit of monitoring on AWS is:
Monitoring recognizes security threats and events.
Llama3
The correct answer is: B) Monitoring recognizes security threats and events.
Explanation:
Monitoring on AWS provides several benefits, including:
- Recognizing security threats and events: Monitoring helps detect and respond to security incidents, such as unauthorized access or data breaches.
- Improving performance and reliability: Monitoring enables you to identify performance bottlenecks and optimize resource utilization, leading to improved performance and reliability.
- Enhancing speed and agility: Monitoring provides real-time insights, enabling you to respond quickly to changes in your application or infrastructure, and make data-driven decisions.
The other options are not correct:
- Option A is incorrect because monitoring can actually help reduce operation overhead by identifying inefficiencies and optimizing resource utilization.
- Option C is incorrect because monitoring can actually help improve the performance and reliability of resources by identifying bottlenecks and optimizing resource utilization.
- Option D is correct, but it’s not the best answer. Monitoring does increase speed and agility, but it’s not the only benefit.
True or False: When a company redesigns an application by using a serverless service on AWS, they might not need to configure networking components, such as a virtual private cloud (VPC), subnets, and security groups.
True
By default, AWS Lambda runs functions in a secure VPC with access to AWS services and the internet. For more information, see Redesigning the Employee Directory Application.
Gemini
The answer is True. Here’s why:
- Abstraction with Serverless: Serverless services like AWS Lambda and AWS Fargate often handle networking infrastructure behind the scenes. This means you may not need to directly manage VPCs, subnets, etc.
- Focus on Code: This allows developers to primarily focus on their application logic, reducing the need to manage low-level networking details.
However, there are some nuances:
- Not Always Completely Network-Free: Some serverless services still might require basic VPC configuration for connectivity to other resources (like databases).
- Security Groups Remain Relevant: Even with serverless, security groups are often still needed to control access to and from your functions.
ChatGPT
The statement is True. When redesigning an application to use a serverless service on AWS, such as AWS Lambda, the company might not need to configure networking components like a virtual private cloud (VPC), subnets, and security groups, as the serverless service abstracts away much of the underlying infrastructure management.
Llama3
The correct answer is: A) True.
Explanation:
When a company redesigns an application using a serverless service on AWS, such as AWS Lambda, they might not need to configure networking components, such as a virtual private cloud (VPC), subnets, and security groups. This is because serverless services are designed to abstract away the underlying infrastructure, including networking components.
With serverless services, AWS manages the underlying infrastructure, including the networking components, allowing developers to focus on writing code and deploying applications without worrying about the underlying infrastructure.
However, it’s worth noting that while serverless services can simplify the development and deployment process, they may still require some configuration of networking components, such as API Gateway integrations or VPC endpoint configurations, depending on the specific use case and requirements.
Going Serverless
Video: Redesigning the Employee Directory
Original Architecture:
- Employee directory application hosted on EC2 instances (combining presentation and application logic).
- Application Load Balancer for traffic distribution.
- DynamoDB database.
- S3 for image storage.
Challenges of Original Architecture:
- Potential overload on EC2 instances due to handling both website display and backend logic.
- Maintenance overhead including patching, updates, and instance optimization.
Proposed Serverless Redesign:
- S3 for Static Website Hosting: Move HTML, CSS, JavaScript (presentation layer) to S3.
- JavaScript for Dynamic Content: Leverages JavaScript’s ability to make HTTP requests to load data from the backend.
- AWS Lambda for Application Logic: Replaces EC2 instances, running backend code in response to events (API calls); a minimal handler sketch appears after this summary.
- Amazon API Gateway: Front-end for the backend API, triggers Lambda functions based on requests.
- DynamoDB and S3 Retained: Database and image storage remain the same.
- IAM Roles: Manage secure access between components.
Benefits of Serverless Redesign:
- Scalability: Serverless components scale automatically based on demand.
- Reduced Operational Overhead: No more patching or server management.
- Potential Cost Optimization: Depends on usage patterns and traffic.
- Flexibility: Modular design allows for future changes with less disruption.
Additional Services:
- Amazon Route 53: DNS management.
- Amazon CloudFront: Caching for faster delivery of static content.
Overall: The serverless architecture offers improved scalability, easier maintenance, and the potential for cost savings by focusing on pay-per-use components (especially for variable traffic loads).
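As a concrete illustration of the application layer in this redesign, the "list employees" action could be a single Lambda function behind API Gateway that reads from DynamoDB. This is a hedged sketch, not the course's actual code; the table name is assumed to come from an environment variable, and the response shape matches an API Gateway proxy integration.
Python
import json
import os

import boto3

# Assumed configuration: the DynamoDB table name is supplied as an environment variable.
TABLE_NAME = os.environ.get("TABLE_NAME", "Employees")

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table(TABLE_NAME)


def lambda_handler(event, context):
    """Handle an API Gateway request to list all employees."""
    result = table.scan()  # fine for a small directory; paginate for larger tables
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(result.get("Items", []), default=str),
    }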
- Nice work getting through this course. Let’s take a look at how this architecture we built turned out. The employee directory application
is currently being hosted across multiple EC2 instances inside of a VPC in a private subnet. These EC2 instances are part of an EC2 Auto Scaling group. And traffic is being distributed across them using an
application load balancer. The database is being
hosted on Amazon DynamoDB. And the images are stored in S3. Beautiful. Looking at this from a
maintenance perspective, you would need to ensure that your auto-scaling
policies are working well with your expectations and it would likely take
some tweaking over time. You also would need to install
security patches and updates for EC2, as well as keep an
eye out for new instance sizes or types that might help you
further optimize your solution. Now, this is really great. But as with everything built on AWS, there are multiple ways you
can architect a solution and have success. It really depends on what
you are optimizing for and what you are trying to do that will determine how you
architect an application. That being said, what I want to do now, is present to you an architecture that could be a wonderful
serverless redesign of the employee directory application, taking full advantage
of cloud native services like AWS Lambda. I’m going to touch on some services we haven’t covered yet in this course to give you ideas of
alternative architectures. So this employee directory application is a great example of a
standard three-tier application, where you have the presentation layer, the application layer, and the data layer. The presentation layer
is the user interface. The application layer
is the business logic. And the data layer is the database. As things are right now, the Amazon EC2 instances are hosting both the presentation layer, as well as the application layer. This is true because the EC2 instances have a web server running that is serving the
content for the website, like the HTML, CSS, and JavaScript, which is the presentation layer. Then the same instances
are also handling requests for the backend logic for
viewing, adding, updating and deleting employees, which
is the application layer. What I want to do now is separate
those two pieces entirely, having the front-end of the
website hosted separately from the backend application logic. It’s important to separate
the presentation layer from the application layer so that the instances are not overloaded by handling different types
of requests at the same time. We’re going to move the presentation layer to be hosted out of Amazon S3. S3 supports static website hosting, and therefore this is a great place for us to host the HTML, CSS and
JavaScript for our website. When you’re hosting a
static website with S3, you may think, “Well, my website isn’t
static. It’s dynamic.” It’s pulling data from a database, so this isn’t a static website, and therefore S3 would not
work for this use case. This is where JavaScript comes in. JavaScript files have the
ability to make HTTP requests and load dynamic content, modifying the static
page to display results that come back from requests. So this should work well. The presentation layer is taken care of. Now, I want to tackle
the application layer. It used to be hosted on Amazon EC2. But let’s go ahead and
change this to AWS Lambda. This means that our employee
directory application code would only be run in response
to events being triggered by the front-end presentation layer. Now, you don’t want your
front-end talking directly to your backend code. So you would instead expose
your backend using an API. We would use a service Amazon
API Gateway to host this API. Each action you could take on an employee would have its own method on the API. This API hosted on API Gateway would act as a front door
to trigger the backend code, which we would host on AWS Lambda instead of EC2 as discussed. We could have one Lambda
function handle all of the requests for employee data, or we could have one Lambda
function for each action. We would keep DynamoDB for the
database or the data layer, and we would also keep S3 for
the employee photo storage. All of the access between
these services would be handled via role-based access using IAM roles. One nice thing about this, is notice how, because we built the
solution in a modular way, we were able to swap out how we were handling the
presentation and application layer while leaving the data
layer totally intact with no modifications. That is the type of flexibility
that can help you innovate and adapt quickly to changes. So now for completeness and clarity, let’s focus on the new architecture and fill it out a bit more. I will add some other AWS
services to this diagram that you can explore on your own. First, I’m going to add Amazon Route 53 for domain name management and Amazon CloudFront here as well, which will allow us to
cache those static assets like the HTML, CSS and JavaScript, closer to the end users by taking advantage of AWS Edge locations. If a user wants to visit the employee directory application website and view all of its
employees, here’s the flow. The user would first type in
the domain for the website, which would get sent to Amazon Route 53. Route 53 would send back to the client the address of the static
website being hosted on S3, and then the website would
be rendered in the browser. This website has JavaScript making the API calls to the backend to load the dynamic content. So the API call to load all
of the employees would be made and it would hit API Gateway first. API Gateway would validate the request and then trigger the
backend Lambda function. The Lambda function would
then send an API call to DynamoDB to query the
employees in the table and it would return that
data to API Gateway, which would then be
returned to the JavaScript, which would finally be
rendered on the page. All right, and that’s that. With this architecture we just laid out, we have optimized for
scalability, operational overhead, and depending on your usage, it could also be optimized for cost. The serverless aspects of this make the operations for support much less than compared with Amazon
EC2-based workloads. There is no patching or AMI management when you use serverless
solutions like AWS Lambda. Also, notice how it was not required that I create a VPC,
subnets, security groups, or network access control
list for the solution. The networking aspect of
this is managed for you. Though, you can integrate
serverless services with your VPC if you need
to, for compliance reasons. But it’s not required to get
a solution up and running. You have many options to choose from when designing your application. You can imagine a scenario where you redesign the same application to be hosted using AWS container services, and then this entire
diagram would change again. There are a lot of ways
you can build on AWS and that’s the beauty of it. You can swap certain pieces
of your solutions out as AWS services are released
or gain new features. And because everything
in AWS is an API call, you can automate the
process along the way. That’s it for this course. From me, Seph, Meowzy and Fluffy, thank you so much for learning with us. One more reminder to
please, please, please, remember to delete any
resources that you’ve created in your own AWS account for this class to avoid incurring any costs if you’ve been following along. Thanks again and see you next time.
Practice Quiz: Self-Graded Activity: Capstone Project
Reading
As part of this assignment, you will create a high-level architecture diagram that uses AWS service icons and arrows to depict an AWS solution for the given scenario. Create your diagram by using a tool like diagrams.net, or you can select a different tool from the AWS Architecture Icons page by scrolling to the Drawing and diagramming tools section.
Scenario: You have a web application that accepts requests from the internet. Clients can send requests to query for data. When a request comes in, the web application queries a MySQL database and returns the data to the client.
Instructions: Design a three-tier architecture that follows AWS best practices by using services such as Amazon Virtual Private Cloud (Amazon VPC), Amazon Elastic Compute Cloud (Amazon EC2), Amazon Relational Database Service (Amazon RDS) with high availability, and Elastic Load Balancing (ELB). Create an architecture diagram that lays out your design, including the networking layer, compute layer, database layer, and anything else that’s needed to accurately depict the architecture. Write a few paragraphs that explain why you chose the AWS services that you used and how they would support the solution for the given scenario. Your explanation must describe how traffic flows through the different AWS components—from the client to the backend database, and back to the client.
End of Course Assessment
Quiz: End of Course Assessment
What are the four main factors that a solutions architect should consider when they must choose a Region?
Latency, price, service availability, and compliance
A solutions architect should consider the following four aspects when deciding which AWS Region to use for hosting applications and workloads: latency, price, service availability, and compliance. For more information, see the AWS Global Infrastructure video in week 1
Gemini
The best answer is:
- Latency, price, service availability, and compliance
Here’s why the other options aren’t as accurate:
- Latency, high availability, taxes, and compliance: While taxes are important, they’re less fundamental than the core AWS services’ availability within a Region.
- Latency, taxes, speed, and compliance: “Speed” is too vague. Specific factors like latency, service performance, and network bandwidth are more relevant.
- Latency, security, high availability, and resiliency: Security is vital but applies across all regions. It’s not the primary reason for choosing one Region over another.
Why the four factors are essential:
- Latency: Impacts user experience. Users closer to the Region experience lower network delays.
- Price: Costs for services can vary between Regions.
- Service Availability: Not all services are available in every Region. Newer services or features often roll out gradually.
- Compliance: Data sovereignty or specific regulations may require data to reside within certain Regions.
ChatGPT
The correct answer is: Latency, price, service availability, and compliance.
Llama3
The correct answer is:
Latency, security, high availability, and compliance
When choosing a region for a solution, a solutions architect should consider the following factors:
- Latency: The time it takes for data to travel from the user to the server and back. This is important for applications that require real-time interactions, such as video conferencing or online gaming.
- Security: The level of security and encryption provided by the region. This includes factors such as firewall rules, access controls, and encryption protocols.
- High availability: The ability of the region to maintain uptime and availability of the application. This includes factors such as redundancy, failover, and disaster recovery.
- Compliance: The level of compliance with regulatory requirements, such as data sovereignty, GDPR, HIPAA, etc. This includes factors such as data storage, processing, and transmission.
The other options are not accurate. Taxes are not typically a consideration when choosing a region. Speed is not a direct consideration, although it may be related to latency.
Which statement BEST describes the relationship between Regions, Availability Zones and data centers?
Regions are clusters of Availability Zones. Availability Zones are clusters of data centers.
The AWS Cloud infrastructure is built around AWS Regions and Availability Zones. An AWS Region is a physical location in the world that has multiple Availability Zones. Availability Zones consist of one or more discrete data centers, each with redundant power, networking, and connectivity, housed in separate facilities. For more information, see the AWS Global Infrastructure video in week 1.
Gemini
The most accurate statement is:
- Regions are clusters of Availability Zones. Availability Zones are clusters of data centers.
Here’s a breakdown of the hierarchy:
- Region: A geographically distinct area with multiple, isolated locations (Availability Zones) designed for high availability.
- Availability Zone (AZ): A physically separate location within a Region, containing one or more data centers. Each AZ has independent power, cooling, and networking.
- Data Center: A physical facility housing computing equipment and infrastructure within an Availability Zone.
ChatGPT
The correct statement is: Regions are clusters of Availability Zones. Availability Zones are clusters of data centers.
Llama3
The correct answer is:
Regions are clusters of Availability Zones. Availability Zones are clusters of data centers.
Here’s a breakdown of the relationship between Regions, Availability Zones, and data centers:
- A Region is a geographic area where Amazon Web Services (AWS) provides a set of services, such as computing, storage, and databases.
- An Availability Zone (AZ) is a separate physical location within a Region, designed to provide high availability and redundancy. Each AZ is isolated from the others, but connected through high-speed networks.
- A data center is a physical facility that houses the infrastructure and equipment needed to provide AWS services. Each AZ typically has one or more data centers, which are connected to each other and to other AZs within the Region.
So, a Region is a collection of AZs, and an AZ is a collection of data centers.
Which of the following can be found in an AWS Identity and Access Management (IAM) policy?
A and B
Gemini
Here’s the correct answer and explanation:
- A and B (Effect and Action)
IAM policies have these core components:
- Effect: “Allow” or “Deny” – determines if a request is permitted or blocked.
- Action: The specific operations allowed or denied (examples: “s3:GetObject”, “ec2:StartInstances”)
Why not Object:
- IAM policies reference resources through the Resource element, usually an Amazon Resource Name (ARN) such as an S3 bucket or an EC2 instance. “Object” is not an element of an IAM policy, which is why that option is incorrect.
IAM Policy Example (JSON):
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::my-bucket"
        }
    ]
}
ChatGPT
The correct answer is: B and C. In an AWS Identity and Access Management (IAM) policy, you can find “Action” and “Object” (Resource), which define what actions are allowed or denied on what AWS resources.
Llama3
The correct answer is:
B and C
An AWS Identity and Access Management (IAM) policy can contain the following elements:
- Effect: The effect of the policy, which can be either “Allow” or “Deny”.
- Action: The specific action that is being allowed or denied, such as “s3:GetObject” or “ec2:DescribeInstances”.
- Object: The resource that is being acted upon, such as a specific S3 bucket or an EC2 instance.
The other options are not correct:
- Option A is incorrect because “Effect” is not a resource that can be found in an IAM policy.
- Option D is incorrect because “A and B” is not a valid combination of elements that can be found in an IAM policy.
A solutions architect is consulting for a company. When users in the company authenticate to a corporate network, they want to be able to use AWS without needing to sign in again. Which AWS identity should the solutions architect recommend for this use case?
IAM Role
An IAM role does not have any credentials (password or access keys) that are associated with it. Instead of being uniquely associated with one person, a role can be assumed by anyone who needs it. An IAM user can assume a role to temporarily take on different permissions for a specific task. A role can be also assigned to a federated user who signs in by using an external identity provider (IdP) instead of IAM. For more information, see the Role Based Access in AWS video.
Gemini
The best answer is IAM Role. Here’s why:
- IAM Roles provide temporary security credentials: Roles are ideal for granting access from external identity providers when users are already authenticated elsewhere. This avoids needing to manage separate IAM users in AWS.
- Reasons other options aren’t the best fit:
- AWS account root user: Has full administrative privileges – excessive and risky for general use.
- IAM user: Represents a long-term identity. Better suited when the identity originates within AWS itself.
- IAM Group: A collection of IAM users, it doesn’t address the cross-system authentication requirement.
Federated Identity with IAM Roles
Here’s the general setup the solution architect would likely suggest:
- Corporate Identity Provider (IdP): The existing system where company users authenticate.
- AWS IAM Role: Configured to trust the company’s IdP. When a user authenticates to their corporate network, they can assume this role.
- Temporary Credentials: The IAM Role provides temporary AWS credentials the user can use for interacting with AWS services.
ChatGPT
The AWS identity that the solutions architect should recommend for this use case is an IAM Role. By assuming an IAM role, users can temporarily receive permissions to carry out tasks in AWS services without needing to sign in separately. This can be particularly useful for federated access scenarios where users authenticate through corporate networks.
Llama3
The correct answer is:
IAM Role
When users in the company authenticate to the corporate network, they want to be able to use AWS without needing to sign in again. This is known as Single Sign-On (SSO). To achieve SSO, the solutions architect should recommend using an IAM Role.
Here’s why:
- An IAM Role is a set of permissions that can be assumed by a user or a service. When a user assumes an IAM Role, they are granted the permissions defined in the Role, without needing to provide credentials.
- When a user assumes an IAM Role, they can access AWS resources without needing to sign in again.
- IAM Roles can be used for SSO, allowing users to access AWS resources without needing to provide credentials.
The other options are not correct:
- Option A is incorrect because the AWS account root user is not suitable for SSO. The root user has full access to the AWS account, which is not secure.
- Option B is incorrect because an IAM user is a permanent identity that requires credentials to access AWS resources. It’s not suitable for SSO.
- Option D is incorrect because an IAM Group is a collection of users or roles, but it’s not a suitable solution for SSO.
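To make the federated-access flow concrete, here is a minimal boto3 sketch of assuming a role and using the temporary credentials it returns. The role ARN and session name are hypothetical placeholders, and a real corporate IdP integration would typically go through AssumeRoleWithSAML or IAM Identity Center rather than a plain AssumeRole call.

```python
# Minimal sketch of how temporary credentials flow once a role is assumed.
# The role ARN and session name below are hypothetical placeholders.
import boto3

sts = boto3.client("sts")

# In a real SAML federation, the IdP response would be exchanged with
# sts.assume_role_with_saml(...); assume_role() is shown here for brevity.
response = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/CorporateFederatedRole",
    RoleSessionName="employee-directory-session",
)

creds = response["Credentials"]  # temporary AccessKeyId / SecretAccessKey / SessionToken

# The temporary credentials can then be used to call AWS services
# without a separate AWS sign-in.
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
print(s3.list_buckets()["Buckets"])
```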
A company wants to allow resources in a public subnet to communicate with the internet. Which of the following must the company do to meet this requirement?
B and C
Unlike a modem at home, which can go down or go offline, an internet gateway is highly available and scalable. After the company creates an internet gateway, they then need to attach it to a virtual private cloud (VPC) and create a route table to route network traffic through the internet gateway. For more information, see the Introduction to Amazon VPC reading.
Gemini
The correct answer is B and C:
- Attach an internet gateway to their VPC
- Create a route in a route table to the internet gateway
Here’s why:
- Internet Gateway (IGW): An IGW is a highly available, managed component that allows communication between a VPC and the public internet. Attaching it is essential.
- Route Table: A route table defines how network traffic flows within your VPC. You need a route saying: “Traffic destined for the internet (0.0.0.0/0) should be sent to the Internet Gateway.”
Why other options are incorrect:
- Create a route to a private subnet: Private subnets, by definition, don’t have direct routes to the internet. This would isolate resources further.
ChatGPT
The correct answer is: B and C. To allow resources in a public subnet to communicate with the internet, the company must attach an internet gateway to their VPC (making it an internet-facing VPC) and create a route in a route table associated with the public subnet to the internet gateway.
Llama3
The correct answer is:
B and C
To allow resources in a public subnet to communicate with the internet, the company must:
- Attach an internet gateway to their VPC: An internet gateway is a virtual device that allows communication between the VPC and the internet. It’s a required component for public subnets to access the internet.
- Create a route in a route table to the internet gateway: A route table is used to determine where network traffic should be routed. In this case, the company needs to create a route that directs traffic from the public subnet to the internet gateway.
Here’s why the other options are incorrect:
- Option A is incorrect because creating a route to a private subnet is not necessary for allowing resources in a public subnet to communicate with the internet.
- Option D is incorrect because attaching an internet gateway to a VPC is not enough; the company also needs to create a route in the route table to the internet gateway.
By attaching an internet gateway to the VPC and creating a route to it, the company can allow resources in the public subnet to communicate with the internet.
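As a rough illustration of steps B and C, the following boto3 sketch creates an internet gateway, attaches it to a VPC, and adds the 0.0.0.0/0 route; the VPC and route table IDs are hypothetical placeholders.

```python
# Minimal sketch of the two required steps; IDs are hypothetical placeholders.
import boto3

ec2 = boto3.client("ec2")

# Step B: create an internet gateway and attach it to the VPC.
igw_id = ec2.create_internet_gateway()["InternetGateway"]["InternetGatewayId"]
ec2.attach_internet_gateway(InternetGatewayId=igw_id, VpcId="vpc-0123456789abcdef0")

# Step C: route all internet-bound traffic (0.0.0.0/0) through the gateway
# in the route table associated with the public subnet.
ec2.create_route(
    RouteTableId="rtb-0123456789abcdef0",
    DestinationCidrBlock="0.0.0.0/0",
    GatewayId=igw_id,
)
```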
What does an Amazon Elastic Compute Cloud (Amazon EC2) instance type indicate?
Instance family and instance size
Instance types are named based on instance generation, family, additional capabilities, and size. For more information, see the Introduction to Amazon EC2 video.
Gemini
The correct answer is:
- Instance family and instance size
Here’s what an Amazon EC2 instance type indicates:
- Instance Family: A category of instances with similar performance characteristics. For example:
- General Purpose (e.g., t2, m5)
- Compute-Optimized (e.g., c5)
- Memory-Optimized (e.g., r5)
- Instance Size: Determines the amount of resources (CPU cores, memory, storage capacity) allocated to the instance. For example:
- nano
- micro
- large
- xlarge
The other options are not accurate:
- Instance placement: Refers to Availability Zone placement or dedicated instances, not reflected in the instance type.
- Instance tenancy: Refers to dedicated instances/hosts or spot instances, a separate pricing model.
- Instance Amazon Machine Image (AMI): Chosen during instance launch, not tied to the type itself.
- Networking speed: Can be influenced by instance type (Enhanced Networking), but not the sole indicator.
ChatGPT
An Amazon EC2 instance type indicates the instance family and instance size.
Llama3
The correct answer is:
Instance family and instance size
An Amazon Elastic Compute Cloud (Amazon EC2) instance type indicates the combination of:
- Instance family: This refers to the type of instance, such as C5, M5, R5, etc.
- Instance size: This refers to the specific configuration of the instance, such as the number of CPU cores, memory, and storage.
The instance type determines the resources available to the instance, such as vCPUs, memory, and network performance. For example, a c5.xlarge provides 4 vCPUs and 8 GiB of memory.
The other options are not correct:
- Option B is incorrect because instance placement refers to the availability zone or region where the instance is located, which is not indicated by the instance type.
- Option C is incorrect because instance tenancy refers to the level of isolation between instances, which is not indicated by the instance type. Instance billing is also not indicated by the instance type.
- Option D is incorrect because the Amazon Machine Image (AMI) is a template that defines the software configuration of the instance, which is not indicated by the instance type. Networking speed is also not indicated by the instance type.
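To see how the family and size map to concrete resources, here is a small boto3 sketch that queries the instance-type metadata; the type name used is just an example.

```python
# Minimal sketch: the instance type name encodes family ("c5") and size
# ("xlarge"); describe_instance_types returns what that combination provides.
import boto3

ec2 = boto3.client("ec2")
info = ec2.describe_instance_types(InstanceTypes=["c5.xlarge"])["InstanceTypes"][0]

print(info["InstanceType"])              # c5.xlarge -> family c5, size xlarge
print(info["VCpuInfo"]["DefaultVCpus"])  # vCPUs for this size
print(info["MemoryInfo"]["SizeInMiB"])   # memory for this size
```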
What is a typical use case for Amazon Simple Storage Service (Amazon S3)?
Object storage for media hosting
Amazon S3 is an object storage service that is designed for large objects, such as media files. Because users can store unlimited objects, and the size of each individual object can be up to 5 TB, Amazon S3 is a good location to host video, photo, or music uploads. For more information, see the Object Storage with Amazon S3 video.
Gemini
Here’s the best answer and why others aren’t as suitable:
- Object storage for media hosting: This is a classic S3 use case. It’s great for storing files like images, videos, and documents that need to be directly accessed via URLs.
Let’s look at why the other choices aren’t ideal:
- Object storage for a boot drive: You can’t boot an EC2 instance directly from S3. Boot drives require block storage (like Amazon EBS).
- Block storage for an EC2 instance: Block storage provides the low-latency performance typically required for operating systems and databases. S3’s object storage model has different access patterns.
- File storage for multiple EC2 instances: While technically possible with solutions like Amazon EFS (Elastic File System), S3 isn’t optimized for concurrent file-level access from multiple instances. EFS is better suited to this use case.
ChatGPT
A typical use case for Amazon Simple Storage Service (Amazon S3) is object storage for media hosting.
Llama3
The correct answer is:
1 point
Object storage for media hosting
Amazon S3 is a popular use case for object storage, which is designed to store and serve large amounts of unstructured data, such as images, videos, and audio files. S3 is optimized for storing and serving large files, making it a great fit for media hosting.
Here’s why the other options are not correct:
- Option B is incorrect because S3 cannot be used as a boot drive. Boot volumes need block storage (such as Amazon EBS) that is optimized for fast boot times and low latency.
- Option C is incorrect because S3 is not block storage. EC2 instances use Amazon EBS for block-level volumes, whereas S3 stores and serves whole objects over HTTP.
- Option D is incorrect because S3 is not a shared file system for multiple EC2 instances; Amazon EFS is designed for that use case.
In summary, S3 is designed for object storage, which makes it a great fit for media hosting rather than for boot drives, block storage, or shared file storage.
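A minimal boto3 sketch of the media-hosting pattern follows: upload a media object and hand out a presigned URL so it can be served directly from S3. The bucket, key, and file names are hypothetical placeholders.

```python
# Minimal sketch of media hosting on S3; names are hypothetical placeholders.
import boto3

s3 = boto3.client("s3")

# Store a media object in S3 (individual objects can be up to 5 TB).
s3.upload_file(
    Filename="team-photo.jpg",
    Bucket="example-media-bucket",
    Key="photos/team-photo.jpg",
    ExtraArgs={"ContentType": "image/jpeg"},
)

# Hand out a time-limited URL so the media can be served directly from S3.
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "example-media-bucket", "Key": "photos/team-photo.jpg"},
    ExpiresIn=3600,
)
print(url)
```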
A solutions architect is working for a healthcare facility, and they are tasked with storing 7 years of patient information that is rarely accessed. The facility’s IT manager asks the solutions architect to consider one of the Amazon Simple Storage Service (Amazon S3) storage tiers to store the patient information. Which storage tier should the solutions architect suggest?
Amazon S3 Glacier Deep Archive
Amazon S3 Glacier Deep Archive is the lowest-cost storage class in Amazon S3. This storage class supports long-term retention and digital preservation for data that might be accessed once or twice in a year. It is designed for customers—particularly those in highly regulated industries, such as financial services, healthcare, and the public sector—that retain data sets for 7 to 10 years (or longer) to meet regulatory compliance requirements. For more information, see the Object storage with Amazon S3 reading.
Gemini
The best option for this scenario is Amazon S3 Glacier Deep Archive. Here’s why:
- Long-Term, Infrequent Access: Glacier Deep Archive is designed for data that needs to be retained for compliance (like healthcare records) but rarely needs to be retrieved.
- Lowest Cost: Glacier Deep Archive offers the lowest storage cost of all S3 storage tiers. This aligns well with cost-sensitive, long-term archival needs.
- Retrieval Flexibility: While retrieval takes longer than other tiers (hours), it’s still possible, allowing for access if the patient data is ever needed.
Why other options are less ideal:
- Amazon S3 Standard: Designed for frequently accessed data. Costlier for purely archival storage.
- Amazon S3 Standard-Infrequent Access: Lower storage cost than Standard for data accessed less often, but still priced for quick, occasional retrieval; it is more expensive than Deep Archive for multi-year archives.
- Amazon S3 Intelligent-Tiering: Automatic movement between tiers. Great for mixed usage patterns, but less cost-effective for data that’s almost exclusively archival.
ChatGPT
For storing patient information that is rarely accessed, the solutions architect should suggest using Amazon S3 Glacier Deep Archive. This storage class is designed for long-term archival of data that is accessed very infrequently.
Llama3
For storing patient information that is rarely accessed, the solutions architect should suggest using Amazon S3 Glacier Deep Archive. This storage class is designed for long-term archival of data that is accessed very infrequently.
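For illustration, a single put_object call with the DEEP_ARCHIVE storage class is enough to land data in this tier; the bucket, key, and file names below are hypothetical placeholders.

```python
# Minimal sketch: archive a record directly to the Deep Archive storage class.
# The bucket, key, and file names are hypothetical placeholders.
import boto3

s3 = boto3.client("s3")

with open("patient-records-2017.zip", "rb") as archive:
    s3.put_object(
        Bucket="example-patient-archive",
        Key="records/2017/patient-records-2017.zip",
        Body=archive,
        StorageClass="DEEP_ARCHIVE",  # lowest-cost tier; retrieval takes hours
    )
```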
Which task of running and operating the database are users responsible for when they use Amazon Relational Database Service (Amazon RDS)?
Optimizing the database
With Amazon RDS, users are no longer responsible for the underlying environment that the database runs on. Instead, users can focus on optimizing the database because Amazon RDS has components that AWS manages. For more information, see Explore Databases on AWS.
Gemini
The most accurate choice is:
- Optimizing the database
With Amazon RDS, users are still responsible for the database itself, including:
- Schema design
- Query optimization
- Performance tuning
- Indexing
Tasks Amazon RDS Handles:
- Provisioning and managing the underlying infrastructure: RDS takes care of hardware, setup, etc.
- Installing the relational database management system on the database instance: RDS pre-installs and handles the DB software.
- Installing patches to the operating system for the database instance: RDS automates OS patching and maintenance.
ChatGPT
When using Amazon Relational Database Service (Amazon RDS), users are responsible for optimizing the database. This includes tasks such as tuning queries, designing schemas, and optimizing performance parameters within the database itself.
Llama3
The correct answer is:
Provisioning and managing the underlying infrastructure
When using Amazon Relational Database Service (Amazon RDS), users are responsible for provisioning and managing the underlying infrastructure, which includes:
- Choosing the instance type and storage size
- Configuring the database instance
- Monitoring and troubleshooting the database instance
The other options are not correct:
- Option 1 is incorrect because optimizing the database is not a task that users are responsible for when using Amazon RDS. Amazon RDS provides a managed relational database service, which means that Amazon handles the underlying infrastructure and database management tasks, including optimization.
- Option 3 is incorrect because installing the relational database management system on the database instance is not a task that users are responsible for when using Amazon RDS. Amazon RDS provides a managed relational database service, which means that Amazon handles the installation and configuration of the database management system.
- Option 4 is incorrect because installing patches to the operating system for the database instance is not a task that users are responsible for when using Amazon RDS. Amazon RDS provides a managed relational database service, which means that Amazon handles the maintenance and patching of the underlying operating system.
By provisioning and managing the underlying infrastructure, users can focus on developing and deploying their applications, while Amazon RDS handles the underlying infrastructure and database management tasks.
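To show the kind of work that remains with the user under the official answer (optimizing the database), here is a minimal sketch of adding an index over a hypothetical employees table on an RDS MySQL endpoint; the endpoint, credentials, and schema are placeholders, and it assumes the pymysql package is installed.

```python
# Minimal sketch of user-owned optimization work on RDS: tuning queries and
# adding indexes. Endpoint, credentials, and table names are hypothetical.
import pymysql

conn = pymysql.connect(
    host="employee-db.abc123xyz.us-east-1.rds.amazonaws.com",
    user="admin",
    password="example-password",
    database="employees",
)

with conn.cursor() as cur:
    # Schema design and indexing remain the user's responsibility;
    # AWS manages the host, the DB engine install, and OS patching.
    cur.execute("CREATE INDEX idx_employees_last_name ON employees (last_name)")

conn.commit()
conn.close()
```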
True or false: A Multi-AZ deployment is beneficial when users want to increase the availability of their database.
True
Placing a workload across multiple Availability Zones increases the availability of resources. For example, say that an environmental hazard in an Availability Zone causes an Amazon Aurora database to stop working. In this case, a read-replica of the Aurora database instance that is in an unaffected Availability Zone will automatically be promoted to a primary database instance. For more information, see Amazon Relational Database Service.
Gemini
The answer is True.
Here’s why a Multi-AZ deployment increases database availability:
- Redundancy: A Multi-AZ deployment replicates your database across multiple Availability Zones (AZs). This means you have working copies of your database in physically separate locations.
- Failover: If one Availability Zone experiences issues, AWS RDS can automatically failover to the replica in another Availability Zone. This minimizes downtime and disruptions.
- High Availability: Since your database remains operational even if an AZ fails, Multi-AZ deployments significantly improve the overall availability of your database system.
ChatGPT
True. A Multi-AZ (Availability Zone) deployment in Amazon RDS helps increase the availability of a database by automatically replicating data to a standby instance in a different Availability Zone. If the primary database instance fails, the system can automatically failover to the standby instance, reducing downtime.
Llama3
The correct answer is:
True
A Multi-AZ deployment is beneficial when users want to increase the availability of their database. In a Multi-AZ deployment, Amazon RDS maintains a standby copy of the database in a different Availability Zone (AZ) from the primary instance. This provides several benefits, including:
- High availability: With a standby in a different AZ, the database is more resilient to failures in a single AZ.
- Automatic failover: If the primary instance fails, Amazon RDS automatically fails over to the standby in the other AZ.
- Improved durability: Data is replicated synchronously to the standby, so it is more resistant to loss if the primary fails.
This makes Multi-AZ deployments beneficial for users who want to increase the availability of their database.
The other option is incorrect because:
- False: A Multi-AZ deployment does increase availability. It is not a cost-saving measure; running a standby instance adds cost, and its purpose is resilience rather than savings.
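A minimal boto3 sketch of enabling Multi-AZ at creation time follows; the identifier, credentials, and sizing are hypothetical placeholders.

```python
# Minimal sketch of creating an RDS instance with Multi-AZ enabled;
# identifiers and credentials are hypothetical placeholders.
import boto3

rds = boto3.client("rds")

rds.create_db_instance(
    DBInstanceIdentifier="employee-directory-db",
    DBInstanceClass="db.t3.micro",
    Engine="mysql",
    MasterUsername="admin",
    MasterUserPassword="example-password",
    AllocatedStorage=20,
    MultiAZ=True,  # synchronous standby in another AZ with automatic failover
)
```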
What are the three components of Amazon EC2 Auto Scaling?
Launch template, scaling policies, EC2 Auto Scaling group
Amazon EC2 Auto Scaling requires users to specify three main components: a configuration template for the Amazon Elastic Compute Cloud (Amazon EC2) instances (either a launch template or a launch configuration); an EC2 Auto Scaling group to list minimum, maximum, and desired capacity of instances; and scaling policies that scale an instance based on the occurrence of specified conditions or on a schedule. For more information, see Amazon EC2 Auto Scaling.
Gemini
The correct answer is:
- Launch template, scaling policies, EC2 Auto Scaling group
Here’s why each component is crucial:
- Launch Template: A blueprint defining how to launch new EC2 instances (AMI, instance type, storage, security groups, etc.). This ensures consistency when scaling.
- Scaling Policies: Rules that determine when and how scaling happens. (e.g., scale up when CPU utilization exceeds 70%, scale down when traffic drops below a threshold).
- EC2 Auto Scaling Group: The logical grouping of EC2 instances managed by Auto Scaling. It’s where you link the launch template and scaling policies.
ChatGPT
The three components of Amazon EC2 Auto Scaling are:
Launch template (or launch configuration)
Scaling policies
EC2 Auto Scaling group
Llama3
The three components of Amazon EC2 Auto Scaling are:
Launch template (or launch configuration)
Scaling policies
EC2 Auto Scaling group
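The following boto3 sketch wires the three components together: a launch template, an EC2 Auto Scaling group, and a target-tracking scaling policy. The AMI ID, subnet IDs, and names are hypothetical placeholders.

```python
# Minimal sketch of the three Auto Scaling components; IDs are placeholders.
import boto3

ec2 = boto3.client("ec2")
autoscaling = boto3.client("autoscaling")

# 1. Launch template: how each new instance is configured.
ec2.create_launch_template(
    LaunchTemplateName="employee-directory-template",
    LaunchTemplateData={"ImageId": "ami-0123456789abcdef0", "InstanceType": "t3.micro"},
)

# 2. EC2 Auto Scaling group: minimum, maximum, and desired capacity.
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="employee-directory-asg",
    LaunchTemplate={"LaunchTemplateName": "employee-directory-template"},
    MinSize=2,
    MaxSize=6,
    DesiredCapacity=2,
    VPCZoneIdentifier="subnet-0123456789abcdef0,subnet-0fedcba9876543210",
)

# 3. Scaling policy: when to add or remove instances.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="employee-directory-asg",
    PolicyName="target-cpu-50",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {"PredefinedMetricType": "ASGAverageCPUUtilization"},
        "TargetValue": 50.0,
    },
)
```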
An application must choose target groups by using a rule that is based on the path of a URL. Which Elastic Load Balancing (ELB) type should be used for this use case?
Application Load Balancer
Application Load Balancer is a layer 7 load balancer that routes HTTP and HTTPs traffic, with support for rules. For more information, see Route Traffic with Amazon Elastic Load Balancing.
Gemini
The correct answer is Application Load Balancer.
Here’s why Application Load Balancers are the perfect fit for this scenario:
- HTTP/HTTPS Traffic: Application Load Balancers operate at Layer 7 of the OSI model, allowing them to understand and route based on HTTP/HTTPS data, including the path within a URL.
- Path-Based Routing: They support advanced routing rules where you can forward requests to different target groups based on specific paths in the URL. For example:
- /images/* -> Image Processing Target Group
- /videos/* -> Video Streaming Target Group
Why other options aren’t suitable:
- Classic Load Balancer: Older generation, operates at Layer 4 (TCP/UDP), lacking the ability to route based on URL paths.
- Network Load Balancer: Ultra-high performance, Layer 4 load balancing. Doesn’t understand HTTP/HTTPS details required for path-based routing.
- Gateway Load Balancer: Specialized for virtual appliances (firewalls, intrusion detection), not designed for general HTTP/HTTPS path-based routing.
ChatGPT
For choosing target groups based on the path of a URL, the suitable Elastic Load Balancing (ELB) type is the Application Load Balancer. It operates at the application layer (Layer 7) of the OSI model and can route traffic based on content such as the URL path.
Llama3
The correct answer is:
Application Load Balancer
The Application Load Balancer (ALB) is designed to work with HTTP and HTTPS traffic and can route traffic based on the URL path. It can also perform advanced routing and filtering based on the URL path, making it a good fit for this use case.
Here’s why the other options are not correct:
- Option A is incorrect because the Classic Load Balancer, although it can handle HTTP and HTTPS, does not support path-based routing rules.
- Option C is incorrect because the Network Load Balancer operates at layer 4 (TCP/UDP) and cannot inspect the URL path.
- Option D is incorrect because the Gateway Load Balancer is intended for deploying and scaling virtual network appliances, not for HTTP path-based routing.
The Application Load Balancer is the best choice for this use case because it can route traffic based on the URL path and is designed to work with HTTP and HTTPS traffic.
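As a rough illustration, the boto3 sketch below adds a path-based rule to an existing Application Load Balancer listener; the listener and target group ARNs are hypothetical placeholders.

```python
# Minimal sketch of a path-based listener rule on an Application Load Balancer;
# the listener and target group ARNs are hypothetical placeholders.
import boto3

elbv2 = boto3.client("elbv2")

# Send /images/* requests to a dedicated target group; everything else
# continues to the listener's default action.
elbv2.create_rule(
    ListenerArn="arn:aws:elasticloadbalancing:us-east-1:123456789012:listener/app/example/abc/def",
    Priority=10,
    Conditions=[{"Field": "path-pattern", "PathPatternConfig": {"Values": ["/images/*"]}}],
    Actions=[{
        "Type": "forward",
        "TargetGroupArn": "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/images/0123456789abcdef",
    }],
)
```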
Video: Introduction to Amazon CodeWhisperer
What is Amazon CodeWhisperer?
- AI-powered coding assistant integrated within your IDE.
- Generates code suggestions in real-time based on your comments and existing code.
- Supports both general-purpose coding tasks and optimization for AWS services.
How does it work?
- Natural Language Comments: Describe your desired code in plain English, and CodeWhisperer will offer corresponding code snippets.
- Code Generation: Can suggest single lines or whole functions to accelerate development.
- Code Completion: Provides smart suggestions as you type, streamlining common syntax.
Key Benefits
- Productivity Boost: Reduces time spent on boilerplate code and searching for syntax.
- Security Assistance: Helps you identify and fix potential vulnerabilities aligning with security best practices.
- Easier Learning: Simplifies learning new APIs, especially for AWS services.
How to Get Started:
- Install the AWS toolkit for your IDE.
- Create an AWS Builder ID (free).
Important Notes
- CodeWhisperer’s suggestions are a starting point – always review and customize the generated code.
- Use descriptive comments and intuitive naming conventions to maximize suggestion quality.
- [Morgan] Wouldn’t it be
nice if when you were coding, you had an always-online,
always-available companion helping you along the way? With generative AI, you can have a helpful virtual coding
buddy right in your IDE. I’m talking about the
service Amazon CodeWhisperer. CodeWhisperer is an AI coding companion that generates coding suggestions to you in real time as you type. These coding suggestions
may be single line or full function code and it can really help you accelerate how quickly you can build software. With CodeWhisperer,
you can write a comment in natural language that outlines
a specific task in English such as upload a file to Amazon S3 with server-side encryption. Then CodeWhisperer recommends
one or more code snippets that you can use to accomplish the task directly in your IDE. You can then optionally
accept that suggestion or cycle through other suggestions. CodeWhisperer is a general purpose tool meaning it can help you
with general coding tasks, but it is also optimized
for popular AWS services. Think about how many
times you’ve forgotten the syntax for a common task or maybe you don’t know the syntax for how to interact
with a specific AWS API. With CodeWhisperer, you don’t need to go search on the internet for an answer. Instead you can write a comment for what you are trying to do and CodeWhisperer will generate
some code for you to review. This can save you a lot of time when compared to reading
through documentation searching for an example. CodeWhisperer will give
you a code suggestion and you can get answers without
needing to leave your IDE. That being said, it’s not just
about simple code generation. You can also have it generate more complex code like full functions and with that, it can help
you solve problems quickly and efficiently through
its code suggestions. CodeWhisperer Individual is
available to use at no cost by creating and signing
in with an AWS Builder ID. The signup process only
takes a few minutes and does not require a credit card or an AWS account. CodeWhisperer also has a feature where it scans your code to detect hard-to-find vulnerabilities and gives you code
suggestions to remediate them. This can help you align to best practices for tackling security
vulnerabilities like those outlined by Open Worldwide Application
Security Project or OWASP or those that don’t meet
crypto library best practices or other similar security best practices. All right, enough talking
about this amazing tool. Let’s see it in action. I’m in my PyCharm IDE and I already have the
AWS toolkit installed which includes Amazon CodeWhisperer under the developer tools section. I am also logged in with my Builder ID, so I’m ready to start using CodeWhisperer. I have a blank Python file and since I already have
CodeWhisperer turned on, now I want to create a simple program that creates an S3 bucket
given a bucket name. First, I’m going to type
the import statement at the top of the file, importing boto3 which is
the AWS SDK for Python. Then I’m going to make a
comment to create the S3 client. You can see CodeWhisperer is popping up to make a suggestion
for creating the client. I can use the arrow keys to cycle through the different suggestions and I can press the Tab button
to accept the suggestion which I will do in this case. Then I’m going to delete this next comment that it generated. Then I want to write a comment describing the function
to create the bucket. I will make a comment saying
function that inputs a string and creates an S3 bucket
with error handling. And then I can hit enter and you can see CodeWhisperer
is making some suggestions. I can cycle through these
using the arrow keys, so I’ll hit the arrow key and we can see the different suggestions that it’s making to us. I am going to accept this
suggestion by hitting Tab. Now if I hit enter, we can see what else
CodeWhisperer suggests. I’m going to accept some more suggestions until this function looks completed. All right, so now we have a function that will create an S3 bucket
given the string as a name and it will catch an exception and print it so that the
program doesn’t throw a stack trace if something goes wrong. Note how I’m writing comments and letting CodeWhisperer
generate suggestions that way. You don’t need to use it this way. You can code like you usually do and it will suggest things along the way. For now, let’s continue
using the comment method. Let’s write a comment
to call the new function and let CodeWhisperer
generate that as well. CodeWhisperer put in a
placeholder name here that says bucket name, but the thing is about S3 is that bucket names have
to be globally unique. So I’m going to append this bucket name a bunch of numbers to make sure that this wasn’t taken by another account. Then I can hit enter and continue. Now I want to use CodeWhisperer to generate some sample JSON data and then I want to write that data into an object in our new S3 bucket. So I will create another comment and type generate JSON
data to upload to S3. Then I can hit enter and we can then start
to see the suggestions. I’m going to accept this suggestion here. This fills out some dummy data for this particular JSON sample. This looks fine for our
sample proof of concept. I’m gonna go ahead and hit enter again. Then I will make another comment to upload this object to S3 by saying upload data to S3 and then we can cycle
through the suggestions here, and I’m going to accept this one. And you can see that this line of code to upload the data to S3 has the bucket name
that we created earlier. This is because CodeWhisperer looks at all of the code in the file and uses that context to generate code. CodeWhisperer uses information
like code snippets, comments, cursor location, and contents from files open in the IDE as inputs to provide code suggestions. So that is how it can give you really specific suggestions like this because it takes in all of this context. Now we need to run the file. So before I do that, I’m going to add the closing
parentheses to this statement and then I want to modify the code that CodeWhisperer suggested to me. I’m going to save the
response into a variable here and then I want to add
some print statements. So I’m going to print the response and you can see CodeWhisperer
is helping me do that. And from here I’m going to save it and then we can right click on app.py and select Run App. We can see that we have a
status code of 200 right here which means that the bucket was created and the object was uploaded. You can do all sorts of things
with Amazon CodeWhisperer. You can generate single lines of code, full functions, generate
sample data and more. Now that we have explored
how to use comments to generate the suggestions, let’s move on to using
single line code completion meaning while I’m typing, it will make suggestions sort of like a really smart IntelliSense. To highlight this, I’m going to add a new function definition to this file. Back in the file, I’m going to scroll down to the bottom and then we can begin
typing our new function. I want this one to list all
of the buckets in the account, so I’m going to type def list_buckets and then we can see that CodeWhisperer is suggesting some code for
how to write out this function. I’m going to accept the first one and we can see that this
will call the S3 client and call the API list buckets and then print out the response. So you don’t always need to make a comment for CodeWhisperer to
generate suggestions for you. You can just code like normal and it will kick in and start working. Something to note about CodeWhisperer and generative AI in general is that it doesn’t magically know exactly what you’re trying to do meaning that CodeWhisperer
might not generate exactly what you want every time, but using it can still save tons of time as it can lay out a lot of
the basics that you need. Then you can go in and modify
it to your specific needs. You own the code that you write and are responsible for it including any code suggestions
provided by CodeWhisperer. So always review a code
suggestion before accepting it and edit it as needed. To get the most out of CodeWhisperer, you should write comments that are short and mapped to smaller discrete tasks so that no single function or code block is too long. CodeWhisperer is the most helpful when you use intuitive names
for various code elements such as function names. The more code that is available
as surrounding context, the better the suggestions will be. To round out this video, let’s review why you
should use CodeWhisperer. CodeWhisperer helps accelerate
software development by providing code suggestions that can reduce total development effort and let you have more time for ideation, complex problem solving, and
writing differentiated code. CodeWhisperer is a really great tool for reducing the amount of time you are spending writing boilerplate code, so that leaves more time for you to focus on more interesting tasks
in software development. CodeWhisperer can make general
purpose code suggestions and AWS API related coding suggestions, and CodeWhisperer can help you improve application security by helping detect and remediate security vulnerabilities. This was a short demo
on Amazon CodeWhisperer, but I hope you can see how it can help you be more productive as a developer. Install the AWS toolkit in your IDE and sign in with an AWS
Builder ID to get started.
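For reference, the code assembled in the demo looks roughly like the following sketch; the bucket name, sample data, and numeric suffix are placeholders, and CodeWhisperer’s actual suggestions will differ from run to run.

```python
# Rough reconstruction of the demo program; the bucket name and sample data
# are placeholders, and real CodeWhisperer output will vary.
import json
import boto3

# create the S3 client
s3 = boto3.client("s3")

# function that inputs a string and creates an S3 bucket with error handling
def create_bucket(bucket_name):
    try:
        s3.create_bucket(Bucket=bucket_name)
    except Exception as error:
        # print the error so the program does not throw a stack trace
        print(error)

# call the new function (bucket names must be globally unique)
bucket_name = "codewhisperer-demo-bucket-20240314"
create_bucket(bucket_name)

# generate JSON data to upload to S3
data = {"id": 1, "name": "example", "status": "active"}

# upload data to S3 and print the response (a 200 status means success)
response = s3.put_object(Bucket=bucket_name, Key="sample.json", Body=json.dumps(data))
print(response)

# list all of the buckets in the account
def list_buckets():
    print(s3.list_buckets())

list_buckets()
```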