
Week 4: Monitoring & Optimization

Welcome to Week 4, where you will learn about the benefits of monitoring on AWS and how to optimize your solutions. You will also learn about the function of Elastic Load Balancing (ELB) and how to differentiate between vertical and horizontal scaling.

Learning Objectives

  • Configure high availability for your application
  • Differentiate between vertical and horizontal scaling
  • Route traffic with Elastic Load Balancing
  • Describe the function of Elastic Load Balancing
  • Discover how to optimize solutions on AWS
  • Describe the function of Amazon CloudWatch on AWS
  • Define the benefits of monitoring on AWS

Monitoring on AWS


Video: Introduction to Week 4

Addressing Application Performance and Scalability

The next phase of the course will address two key issues to improve the employee directory application:

  1. Monitoring: Gaining visibility into application performance and resource usage is crucial. The course will introduce Amazon CloudWatch as a tool to provide these insights.
  2. Scalability: Since application demand can fluctuate, the course will cover:
    • Automation: Adding or removing resources automatically based on demand to optimize performance and cost.
    • Load Balancing: Distributing traffic across multiple resources to handle changes in demand.

Goal: The aim is to create a more robust and cost-effective application that can efficiently adapt to varying user needs.

  • Hello everyone. Good to see you're here and ready to learn. You're almost done with the course. Now that we've added a storage and database layer, the employee directory application is fully functional, but we still have two more issues. The first issue is that we have no insight into how our application is performing and how it utilizes resources. To fix this, we will start off the next lesson with a discussion of a monitoring tool called Amazon CloudWatch. After we understand what the demand for your application looks like, we'll address the second problem: scalability. You may find that your application demand isn't always constant, so in the last section of the course, you'll learn how to automate the process of adding resources as demand increases and reducing capacity as demand declines to reduce cost. You'll also learn how to deliver traffic and load balance across a changing number of resources. Let's get started.

Video: Monitoring on AWS

Why Monitoring Matters

  • Proactive Problem Solving: Don’t wait for users to report problems! With monitoring, you can identify and address performance issues before they significantly impact the user experience.
  • Understanding Root Causes: User reports only tell you that there’s a problem, not why. Monitoring provides insights into where the problem originates (database, network, EC2, etc.).

The Monitoring Process

  1. Collect Data: Gather metrics (e.g., CPU usage) and logs (e.g., network traffic) from various cloud services involved in your application.
  2. Establish Baseline: Analyze historical data to determine normal operating conditions.
  3. Set Alerts: Define thresholds that, when exceeded, trigger automatic notifications for investigation or remediation.
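
To make steps 1 and 2 concrete, here is a minimal boto3 sketch that pulls two weeks of CPU history to help establish a baseline; the instance ID and region are placeholders, not values from the course:

```python
import boto3
from datetime import datetime, timedelta

cloudwatch = boto3.client("cloudwatch", region_name="us-west-2")

# Fetch hourly average CPU utilization for one instance over 14 days.
resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],  # placeholder
    StartTime=datetime.utcnow() - timedelta(days=14),
    EndTime=datetime.utcnow(),
    Period=3600,                # one datapoint per hour
    Statistics=["Average"],
)

# Sort the datapoints chronologically to eyeball the normal operating range.
for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], round(point["Average"], 2))
```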

The Role of Amazon CloudWatch

  • Centralized Monitoring: Collects and displays data from various AWS services (RDS, DynamoDB, EC2, etc.) in a single place.
  • Automation: Enables automated responses based on incoming data.

Key Takeaway: In dynamic cloud environments, effective monitoring is essential for maintaining a smoothly running application, providing a positive user experience, and quickly pinpointing the source of issues.

  • When you have a solution built out of many different pieces, like how we build solutions on AWS, it's important to be able to see how different services are operating over time and react to operational events as they're happening. Let's consider the Employee Directory application. It's Monday morning and users are seeing latency on page loads. It's probably not good enough to wait until a user sees the slowdown, calls, or enters a ticket saying, "Hello, your application is running slow." If you receive a call or a ticket from your users, you can then react and troubleshoot the issue, but waiting for users to notice and report issues will generally lead to unhappy end users. Ideally, you'd be able to respond to operational issues before your end users notice. On top of that, the end user can only provide information about their experience; they cannot give insight into the inner workings of your solution. They can report the issue to you, but where is that issue coming from? Is it an EC2 problem? Is it a database problem? Is it a code change that has recently been deployed? Without monitoring, you have to do some digging to figure all of that out.

    So what do we need to do? The first step is to put proper monitoring in place. Monitoring is a verb; it's something we do. We collect metrics, we collect logs, we watch network traffic. The data needed to help you pinpoint problems comes from the different services and infrastructure hosting the application, and monitoring tools help you collect the data being generated by these systems. In a cloud environment that is ever-changing and ever-scaling, it's even more important to collect various types of information about the systems as they scale up and down and change over time. The different services that make up your solution generate data points that we call metrics, and metrics that are monitored over time are called statistics. A metric is a data point, like the current CPU utilization of an EC2 instance, while other data you monitor could come in different forms, like log files. For example, your network will generate data like flow logs so you can see how network traffic is coming into and out of your VPC. The servers will be generating metrics such as how much CPU is currently being used, or how much network traffic the instance is accepting at any given moment. Finally, one more example is your database layer, which will generate metrics such as the number of simultaneous connections to your database.

    So you need a way to collect all of this information. Once it's collected, it can be used to establish a baseline, and this baseline can be used to determine whether things are operating smoothly or not. If the information collected deviates too far from the baseline, you would then trigger automatic alerts to go out to someone or something to try to remediate the issue. A good monitoring solution gathers data in one centralized location so you can proactively monitor and operate your system to keep your end users happy, and it also allows automated tasks to be triggered based on the data coming in. This is where Amazon CloudWatch comes in. CloudWatch allows you to monitor your solutions all in one place. Data will be collected from your cloud-based infrastructure so you can see things like database metrics coming from RDS or DynamoDB, alongside EC2 metrics, as well as metrics coming from other services making up your AWS solution. Coming up next, we'll dive into some CloudWatch features and use cases.

Reading 4.1: Monitoring on AWS


Video: Introduction to Amazon CloudWatch

What is Amazon CloudWatch?

  • A powerful monitoring service within AWS that collects and visualizes metrics from various AWS resources.
  • Enables you to understand the health and performance of your applications and infrastructure.

Key Features Demonstrated

  1. Dashboards:
    • Customizable pages to monitor multiple resources in one view.
    • Example: Created a dashboard showing CPU utilization of the EC2 instance.
  2. Alarms:
    • Set thresholds for metrics and trigger automated actions when those thresholds are crossed.
    • Example: Created an alarm to send an email if CPU utilization exceeds 70% for over 5 minutes.

Other Important Notes

  • Automatic Metrics: Many AWS services send metrics to CloudWatch by default.
  • Custom Metrics: You can programmatically send additional application-specific metrics to CloudWatch.
  • Actions with Alarms: Besides email via SNS, CloudWatch alarms can trigger other actions (e.g., scaling resources), which will be explored later.

Key Takeaway: CloudWatch is a vital tool for keeping track of the health of your AWS infrastructure, alerting you to potential issues, and facilitating automated responses.
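
As a hedged illustration of the custom metrics note above, publishing an application-specific datapoint from code might look like this; the namespace, metric name, and dimension are invented for illustration:

```python
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-west-2")

# Publish one application-level datapoint. CloudWatch creates the
# custom metric on first write; no separate setup step is required.
cloudwatch.put_metric_data(
    Namespace="EmployeeDirectory",          # invented custom namespace
    MetricData=[{
        "MetricName": "PageLoadLatencyMs",  # invented metric name
        "Dimensions": [{"Name": "Page", "Value": "/info"}],
        "Value": 142.0,
        "Unit": "Milliseconds",
    }],
)
```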

  • In this video, I'm going to run through some of the features that Amazon CloudWatch has to offer. We've already deployed resources into our AWS account with the employee directory application, so to get a handle on what CloudWatch is and how it works, let's dive into the AWS console and see what we can see. The end goal of this demo is to set up a dashboard that shows us the CPU utilization of the EC2 instance over time, and to set an alert that will be sent out if the CPU utilization of the instance goes over 70% for a period of time. I'm already in the console and I will navigate to Amazon CloudWatch. A lot of AWS services begin reporting metrics to CloudWatch automatically when you create resources. Let's explore what types of metrics are available from the resources that we have already created in a previous lesson.

    To do that, let's first create a dashboard. A dashboard in CloudWatch is a customizable page in the console that you can use to monitor your different types of resources in a single view, including resources that are spread across different AWS Regions. I want to create a dashboard that shows me the CPU utilization over time for the EC2 instance hosting the employee directory application. So I'm going to click Dashboards in the left-hand navigation, click Create dashboard, and name this dashboard mydashboard. AWS resources automatically send metrics to CloudWatch; it's built into the service. I'm going to select which widget to add to the dashboard, and I will select a line graph here. Next, we choose the data source for this widget, and I will select Metric. Now we can browse through all of the available metrics. This is organized by service, so I will find EC2, select it, and then view per-instance metrics. From here, I can scroll through the different EC2-specific metrics being reported to CloudWatch, and I want to select CPU utilization. Click Save, and now we are brought back to the dashboard with one line graph for one specific instance's CPU utilization. This gives us visibility into one aspect of the health of our EC2 instance.

    You can explore in your own account what other metrics are available through CloudWatch by default, but you can also report custom metrics to CloudWatch programmatically. This is good to know because with EC2 and CloudWatch, you only get visibility into the health of the instance by default, which doesn't really give you a holistic picture of the health of your application. The application running on the instance might not be operating correctly even while the CPU utilization looks fine. So keep in mind that you may choose to use custom metrics to get a more detailed and accurate view of health in these dashboards. Once a dashboard is created, you can share it with your team members.

    Now let's move on to another feature of CloudWatch called Amazon CloudWatch alarms. CloudWatch alarms allow you to create thresholds for the metrics you're monitoring, and these thresholds can define the normal boundaries for the values of the metrics. If a metric crosses a boundary for a period of time, the alarm is triggered, and then you can take a couple of different automated actions. In our use case, I want to notify someone if the CPU utilization spikes over 70% for a period of time, say five minutes. Let's create an alarm to do that. Navigate to the Alarms section and click Create alarm. I want to create an alarm for CPU utilization, so I will select Metric, click EC2, then Per-instance metrics, and scroll down to select CPU utilization. Now I will select the time period we are monitoring to trigger the alarm, which in this case is five minutes. You want to pick a reasonable time period where you don't wait too long to respond, but you also don't respond to every short-lived uptick in CPU utilization. There is a balance to strike here, and it will be highly dependent on your specific situation and your desired outcome.

    Okay, so now we'll scroll down and type in 70 for the static threshold we are watching for, which represents the CPU utilization threshold. If it goes over 70% for more than five minutes, it's likely that there's a problem. Now we will configure the action that will be taken if the metric triggers the alarm. A CloudWatch alarm can be in one of three states: ALARM, OK, or INSUFFICIENT_DATA, and an alarm can trigger an action when it transitions between these states. In our case, I want to send an alert to an email address when it transitions from OK to ALARM. AWS has a service called Amazon Simple Notification Service, or SNS, which allows you to create a topic and then send out messages to subscribers of the topic. You can imagine a scenario where we have systems admins or developers on call for our employee directory application, and if something goes wrong, we want to send them an email paging them, letting them know that something is going wrong with the app. So we will select Create new SNS topic, since we do not have one in place already, name it CPU_Utilization_Topic, and then I will put in an email address here to receive the alert. Notice as I scroll down that there are other actions you can take for a CloudWatch alarm; we will talk more about some of these options in upcoming lessons. Then click Next, give this a name and description, and finally click Create alarm. It will take some time for the alarm to collect enough information to leave the INSUFFICIENT_DATA state and transition into the OK state. That is it for now. We will continue to use CloudWatch in upcoming lessons to make our solution elastic through EC2 Auto Scaling.
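
The console steps in this demo could also be scripted. A sketch of the same alarm in boto3, assuming a placeholder instance ID, region, and email address:

```python
import boto3

region = "us-west-2"                       # assumed region
sns = boto3.client("sns", region_name=region)
cloudwatch = boto3.client("cloudwatch", region_name=region)

# Create the topic and subscribe the on-call address (placeholder email).
topic_arn = sns.create_topic(Name="CPU_Utilization_Topic")["TopicArn"]
sns.subscribe(TopicArn=topic_arn, Protocol="email", Endpoint="oncall@example.com")

# Alarm when one instance's average CPU exceeds 70% over a 5-minute period.
cloudwatch.put_metric_alarm(
    AlarmName="app-cpu-utilization-high",
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],  # placeholder
    Statistic="Average",
    Period=300,                            # five-minute evaluation window
    EvaluationPeriods=1,
    Threshold=70.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=[topic_arn],              # fires on the OK -> ALARM transition
)
```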

Reading 4.2: Introduction to Amazon CloudWatch


Optimization


Video: Optimizing Solutions on AWS

Current Infrastructure and Issues

  • Single EC2 instance in one Availability Zone (AZ) hosts the application.
  • DynamoDB and S3 have built-in high availability, but the application itself is a single point of failure.

Scaling: Why It Matters

  • To handle increased user demand as the company grows, scaling is required.
  • Two options:
    • Vertical Scaling: Increasing the size of the instance (limited).
    • Horizontal Scaling: Adding more instances (preferred for flexibility).

The Problem with Manual Scaling

  • Launching and terminating instances to match demand is tedious and inefficient.

Solutions

  1. Redundancy for Availability: Add another instance in a different AZ to ensure the application remains online even if one AZ has problems.
  2. EC2 Auto Scaling: Automatically adds or removes EC2 instances based on defined conditions. This ensures capacity matches demand and maintains the health of the instance fleet.
  3. Load Balancer: Distributes requests across multiple instances, eliminating the need to track individual IP addresses and simplifying routing.

Key Takeaway: To achieve a highly available and scalable infrastructure, it’s necessary to move beyond single instances and manual management. Auto scaling and load balancing are essential tools for this.

  • At this point, you know how to set up alarms to notify you when your infrastructure is having capacity, performance, or availability issues. But we need to go one step further. We don't just want to know about these issues; we want to either prevent them or respond to them automatically. In this section of the course, you'll learn how to do just that. Currently, our infrastructure looks like this: we have one EC2 instance hosting our employee directory application in a single Availability Zone, with DynamoDB as its database and S3 for storage of static assets. If I wanted to evaluate the availability of our infrastructure, I would have to look at each individual component. I know that both DynamoDB and Amazon S3 are highly available by design, so that leaves one issue: the singular instance of our application. If that instance goes down, employees have no way to connect to our application.

    How do we solve this? Well, as you already know, to increase availability we need redundancy. We can achieve this by adding one more server, but the location of that server is important. If instance A goes down, we don't want instance B to go down for the same reasons, so to avoid the unlikely event of, say, an AZ experiencing issues, we should ensure that instance B is put in another AZ. Now we have two instances that we've manually launched, but let's say our company grows rapidly, and the employee directory application is constantly being accessed by thousands of employees around the world. To meet this demand, we could scale our instances vertically, meaning we increase the size of the instances we have, or we could scale our instances horizontally, meaning we add more instances to create a fleet. If we scale vertically, eventually we'll reach the upper limit of scalability for that instance. But if we scale horizontally, we don't have those same limitations. I can add as many instances to my fleet as I'd like, but if I need, say, 15 more instances to meet demand, that means I'd have to manually launch those 15 instances and manually shut them down, right? Well, you could do it that way, or you could automate this process with Amazon EC2 Auto Scaling. This service allows you to add and remove EC2 instances based on conditions that you define, and it can also maintain the health of your fleet of instances.

    But with more EC2 instances comes a bigger issue: how do we access those servers? Before, we were simply using the instance's public DNS name or public IP address, but when you have multiple instances, you have multiple IP addresses to route to. Instead of maintaining the logic to send requests to your various servers, you would use a load balancer to distribute those requests across a set of resources for you. And since you connect to the application through the load balancer, you no longer need to use the public IPs of your EC2 instances. Before the next video, I'll spin up two instances hosting our employee directory application. See you soon.

Reading 4.3: Optimizing Solutions on AWS


Video: Amazon EC2 Auto Scaling

Problem: Limited Scalability

  • The existing setup with two web servers can’t handle increasing traffic. Manual scaling is time-consuming.

Solution: EC2 Auto Scaling

  • Auto Scaling automatically adds or removes EC2 instances based on demand, ensuring the application can handle traffic spikes.

Components

  1. Launch Template:
    • Defines the configuration of instances to be launched (AMI, instance type, security group, user data).
    • Ensures new instances are identical to existing web servers.
  2. Auto Scaling Group (ASG):
    • Specifies where to launch instances (VPC, subnets).
    • Connects to the load balancer for traffic distribution.
    • Sets minimum (2), maximum (4), and desired (2) instance counts.
  3. Scaling Policy:
    • Uses target tracking scaling to adjust ASG capacity.
    • Triggers scaling out (adding instances) when CPU utilization exceeds 60%.
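
The three components above map closely onto API calls. A sketch of the equivalent setup in boto3, with the AMI, security group, subnets, target group ARN, and user data left as placeholders:

```python
import base64
import boto3

region = "us-west-2"
ec2 = boto3.client("ec2", region_name=region)
autoscaling = boto3.client("autoscaling", region_name=region)

# 1. Launch template: WHAT to launch (placeholder AMI and security group IDs).
user_data = "#!/bin/bash\n# fetch and start the app here (placeholder script)\n"
ec2.create_launch_template(
    LaunchTemplateName="app-launch-template",
    LaunchTemplateData={
        "ImageId": "ami-0123456789abcdef0",
        "InstanceType": "t2.micro",
        "SecurityGroupIds": ["sg-0123456789abcdef0"],
        "UserData": base64.b64encode(user_data.encode()).decode(),
    },
)

# 2. Auto Scaling group: WHERE and HOW MANY (placeholder subnets and ARN).
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="app-asg",
    LaunchTemplate={"LaunchTemplateName": "app-launch-template", "Version": "$Latest"},
    MinSize=2, MaxSize=4, DesiredCapacity=2,
    VPCZoneIdentifier="subnet-aaaa1111,subnet-bbbb2222",
    TargetGroupARNs=["arn:aws:elasticloadbalancing:us-west-2:111122223333:"
                     "targetgroup/app-target-group/abc123"],
    HealthCheckType="ELB",
    HealthCheckGracePeriod=300,   # warm-up before health evaluation
)

# 3. Scaling policy: target tracking on average fleet CPU at 60%.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="app-asg",
    PolicyName="cpu-target-60",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {"PredefinedMetricType": "ASGAverageCPUUtilization"},
        "TargetValue": 60.0,
    },
)
```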

Stress Test & Results

  • The “Stress CPU” button in the app simulates load.
  • CloudWatch shows:
    • CPU utilization alarm triggered.
    • Auto Scaling launched two new instances.
    • Load decreased as traffic spread to more instances.

Key Takeaways

  • Auto Scaling makes the application dynamically adjust to demand.
  • Correct termination is important: delete the Auto Scaling Group, not just the instances, to avoid unwanted replacements.

  • As more people visit our application, the demand on our two web servers is going to increase. At some point, our two instances aren't going to be able to handle that demand, and we're going to need to add more EC2 instances. Instead of launching these instances manually, we want to do it automatically with EC2 Auto Scaling. Auto Scaling is what allows us to provision more capacity on demand, depending on different thresholds that we set, and we can set those in CloudWatch. Okay, so we're going to draw a high-level example of how this works, and then Seth is going to build it for us. Looking at our application, traffic coming in from the outside can come down to either EC2 instance. In this video, these EC2 instances will be part of an Auto Scaling group, and each of the EC2 instances that we launch will be completely identical. Then we're going to run code to simulate stress on our employee directory application that will make the instances think they're overloaded. When this happens, the instances will report to CloudWatch and say that their CPUs are overloaded. CloudWatch will then go into an alarm state and tell Auto Scaling, "Give me more EC2 instances." As each instance comes online, it will pass the ALB health checks, start to receive traffic, and give us the horizontal scalability that we need. Seth will be helping us build out the scalability. Let's have him join us. You got all that, right?

    – Yes. It's time to make this app scale. Let's make it happen. To get this built out, the first thing we need to do is create a launch template. This is going to define what to launch, so it's all about setting the configurations for the instances we want to launch. Morgan said we want our instances to be identical, and that's what we'll be configuring here. The first thing we're going to do in the EC2 dashboard is find Launch Templates on the side panel and click Create launch template. We'll first provide a name and description: I'll call this app-launch-template and give it a description of a web server for the employee directory app. Then you'll notice this handy checkbox that asks if we are going to use this launch template with EC2 Auto Scaling. We are, so we'll check it and scroll down. We'll then choose the AMI; again, the launch template is all about what we launch. What we want to do is create mirrors of the web servers hosting our app that we already have running, so that whenever we scale out, we just get more instances with the exact same configuration. When we launched the instance hosting our app earlier, we used the Amazon Linux AMI and a t2.micro instance type, so we'll select both of those options. Next, we choose the security group for our new instances; we'll use the same security group you created earlier in the course, the web security group. Then we scroll down and expand the advanced details section. Here we'll choose our instance role, the same role we used previously, in the instance profile dropdown. Once we do that, we'll scroll all the way down and paste our user data; this is what grabs our source code and unzips it so that our application runs on EC2. Now we're done, and we can click Create.

    Now that we've configured our launch template, which again defines what we launch, we have to define an Auto Scaling group, which tells us where, when, and how many instances to launch. To create an Auto Scaling group, we'll select Auto Scaling Groups on the side panel and then Create Auto Scaling group. Here, we'll enter a name such as app-asg, select the launch template we just created, app-launch-template, and click Next. Then we'll select our network: we'll choose the same VPC we created earlier in the course, app-vpc, select both of the private subnets we created, Private A and Private B, and click Next. We then need to select Attach to an existing load balancer to receive traffic from the load balancer we created earlier, and choose our target group, app-target-group. Click Enable ELB health checks so the load balancer will check whether your instances are healthy, and then click Next. Now, we'll choose the minimum, maximum, and desired capacity. The minimum, we'll say, is two; this means that our Auto Scaling group will always have at least two instances, one in each Availability Zone. The maximum, we'll say, is four, which means that our fleet can scale up to four instances. And the desired capacity, which is the number of instances you want to be running, we'll say is two.

    Next, we can configure the Auto Scaling policies. With scaling policies, you define how to scale the capacity of your Auto Scaling group in response to changing demand. For example, you could set up a scaling policy with CloudWatch so that whenever your instance CPU utilization, or any other metric you'd like, reaches a certain limit, you deploy more EC2 instances to your Auto Scaling group. What we want to do is use a target tracking scaling policy to adjust the capacity of this group. Earlier, Morgan created a CloudWatch alarm that resulted in the action of sending out an email. Here we're going to create a target tracking policy, much like the alarm Morgan created, but this time it will result in the action of triggering an Auto Scaling event. We'll name this CPU utilization and set the target value to 60%, so that we add a new instance to our fleet when average utilization exceeds that mark. We'll also keep the setting that gives instances 300 seconds to warm up before scaling again. Then we'll click Next to configure notifications when a scaling event happens; this is optional, so for now we're going to skip past it. All right, here we can review and click Create Auto Scaling group.

    Now all that's left is to stress our application and make sure that it actually scales up to meet the demand. To do that, I'll open up a new tab and paste the endpoint for our Elastic Load Balancer. Here's our application. I'm going to go to the info page by appending /info to the URL. You'll notice that we built in a Stress CPU feature; this is going to mimic load coming into our servers. In the real world, you would probably use a load-testing tool, but instead we built a Stress CPU button as a quick and easy way to test this out. Then we can watch our servers scale with that Auto Scaling policy: as the CPU utilization goes up, our scaling policy will be triggered, and our server group will grow in size. I'm going to select 10 minutes for our stress test.

    All right, some time has passed. We've stressed our application, and if we take a look at CloudWatch, we can see what happened. If you look here, we can see our alarm summary: we went over 60% CPU utilization across instances. That was our threshold, so we launched two new instances into our Auto Scaling group. Then you can see the load start to come down; because we've launched more instances into our Auto Scaling group, there were more hosts to accept traffic from our Elastic Load Balancer, and the average CPU utilization went down across servers. Let's go look at the EC2 instances that were launched into the Auto Scaling group. I'm going to go to the EC2 dashboard, scroll down to Target Groups, and select the app target group. I'm going to select Targets, and we can now see that we have four healthy target instances in our Auto Scaling group. So now we have an environment with an Auto Scaling group and our launch template, and we've set up an alarm in CloudWatch so that whenever our CPU utilization goes over 60%, we launch more instances into that group. This app should be able to scale now using EC2 Auto Scaling.

    – Great. Thank you, Seth. Now, if we wait around long enough, what we would see is that as the CPU load drops off, the extra instances would each get terminated, one by one, because we no longer need them, bringing our state all the way back down to the basic two we need, the minimum number for this particular group. All of that happens without any human interaction. If for some reason you are doing this in your AWS account, don't forget to delete the Auto Scaling group instead of the instances. Otherwise, guess what will happen? The Auto Scaling group will spin up more instances to replace the ones that were deleted. So ensure you are deleting the Auto Scaling group as well.
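
To follow that cleanup advice from code rather than the console, one option is to delete the group itself; ForceDelete terminates the group's instances along with it (the group name is assumed from this lesson):

```python
import boto3

autoscaling = boto3.client("autoscaling", region_name="us-west-2")

# Deleting instances alone would just prompt the group to replace them;
# deleting the group with ForceDelete terminates its instances as well.
autoscaling.delete_auto_scaling_group(
    AutoScalingGroupName="app-asg",
    ForceDelete=True,
)
```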

Video: Route Traffic with Amazon Elastic Load Balancing

What is Elastic Load Balancing (ELB)?

  • A service that distributes incoming application traffic across multiple servers (e.g., EC2 instances).
  • Improves scalability and fault tolerance by ensuring no single server is overwhelmed.
  • Highly available and automatically scales to handle traffic fluctuations.

Types of Load Balancers

  • Application Load Balancer (ALB): Works with HTTP/HTTPS traffic (web applications).
  • Network Load Balancer: Handles TCP, UDP, and TLS traffic for high-performance needs.
  • Gateway Load Balancer: Routes traffic to third-party appliances.

Key Components of an ALB

  1. Listener:
    • Checks for incoming requests on a specific port and protocol (e.g., port 80, HTTP).
  2. Target Group:
    • Group of backend resources to route traffic to (EC2 instances, Lambda functions, etc.)
    • Has health checks to ensure targets are operational.
  3. Rule:
    • Defines how requests are routed to target groups.
    • Can be based on paths in the URL (e.g., send /info traffic to a specific target group).
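
As a sketch of the rule component, a path-based rule that sends /info traffic to a second target group might look like this in boto3; both ARNs are placeholders:

```python
import boto3

elbv2 = boto3.client("elbv2", region_name="us-west-2")

# Route requests whose path matches /info* to target group B;
# everything else falls through to the listener's default rule.
elbv2.create_rule(
    ListenerArn=("arn:aws:elasticloadbalancing:us-west-2:111122223333:"
                 "listener/app/app-elb/abc/def"),          # placeholder
    Priority=10,
    Conditions=[{"Field": "path-pattern", "Values": ["/info*"]}],
    Actions=[{
        "Type": "forward",
        "TargetGroupArn": ("arn:aws:elasticloadbalancing:us-west-2:111122223333:"
                           "targetgroup/target-group-b/123"),  # placeholder
    }],
)
```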

Benefits of Using ELB

  • Scalability: Handles traffic surges without manual intervention.
  • Resilience: Reduces the impact of individual server failures.
  • Flexibility: Customizable routing rules based on application needs.

How to Create an ALB (Example Steps)

  1. Go to the EC2 console and select “Load Balancers”.
  2. Create a new ALB (internet-facing, HTTP on port 80).
  3. Select VPC, availability zones, and public subnets.
  4. Choose the appropriate security group (allow traffic on port 80).
  5. Create a target group, select instances, and include them.
  6. After the load balancer is created, copy its DNS URL.
  7. Test it in a browser – the load balancer will distribute your requests across the target EC2 instances.
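
The same steps can be approximated in boto3. This is a sketch under assumed placeholder IDs (subnets, security group, VPC, instances), not the lab's exact procedure:

```python
import boto3

elbv2 = boto3.client("elbv2", region_name="us-west-2")

# Internet-facing ALB listening in two public subnets (placeholder IDs).
lb = elbv2.create_load_balancer(
    Name="app-elb",
    Scheme="internet-facing",
    Type="application",
    Subnets=["subnet-aaaa1111", "subnet-bbbb2222"],
    SecurityGroups=["sg-0123456789abcdef0"],   # allows port 80 from anywhere
)
lb_arn = lb["LoadBalancers"][0]["LoadBalancerArn"]
dns_name = lb["LoadBalancers"][0]["DNSName"]

# Target group for the backend instances (placeholder VPC ID).
tg = elbv2.create_target_group(
    Name="app-target-group",
    Protocol="HTTP", Port=80,
    VpcId="vpc-0123456789abcdef0",
    TargetType="instance",
)
tg_arn = tg["TargetGroups"][0]["TargetGroupArn"]

# Register the two instances (placeholder IDs).
elbv2.register_targets(TargetGroupArn=tg_arn,
                       Targets=[{"Id": "i-aaaa1111"}, {"Id": "i-bbbb2222"}])

# HTTP listener on port 80 forwarding to the target group by default.
elbv2.create_listener(
    LoadBalancerArn=lb_arn,
    Protocol="HTTP", Port=80,
    DefaultActions=[{"Type": "forward", "TargetGroupArn": tg_arn}],
)

print("Test in a browser:", dns_name)
```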

  • Now that we have multiple EC2 instances hosting our application in private subnets, it's time to distribute our requests across our servers using the Elastic Load Balancing, or ELB, service. Conceptually, how this works is that a typical request for the application starts from the browser of the client and is sent to the load balancer. From there, the load balancer determines which EC2 instance to send the request to. After it sends the request, the return traffic goes back through the load balancer and back to the client browser, meaning your load balancer is directly in the path of traffic. Now, if we're looking at the architecture, the ELB looks like one thing; it looks like a single point of failure. But ELB is actually, by design, a highly available and automatically scalable service, much like S3. What that means is that, A, the ELB is a regional service, so you don't have to worry about maintaining nodes in each Availability Zone or configuring high availability yourself; AWS maintains that for you. And B, the ELB is designed to handle additional throughput and will automatically scale up to handle the traffic coming in, without you having to configure that feature.

    There are several types of load balancers that you can choose. There's the Application Load Balancer, which load balances HTTP and HTTPS traffic; the Network Load Balancer, which routes TCP, UDP, and TLS traffic; and the Gateway Load Balancer, which is mainly used to load balance requests to third-party appliances. For our employee directory application, we'll be load balancing web traffic, so we'll be using the Application Load Balancer, or ALB. When you create an ALB, you'll need to configure three main components. The first component is a listener. The goal of the listener is to check for requests; to define a listener, a port must be provided as well as the protocol. For example, since we're routing web traffic and we set our application to use port 80, we'd want our load balancer to listen on port 80 using the HTTP protocol. Additionally, we could set up a listener on port 443 using the HTTPS protocol. The second component is a target group. A target is the type of backend you want to direct traffic to, such as EC2 instances, AWS Lambda functions, or IP addresses, and a target group is simply a group of these backend resources. Each target group needs a health check, which is how the load balancer verifies that a target is healthy so that it can start accepting traffic. The ALB operates on the application layer, which is layer seven of the OSI model. This gives the load balancer a lot of useful features, and one of those is the third component, a rule. A rule defines how your requests are routed to your targets. Each listener has a default rule, and you can optionally define additional rules. So if we had two target groups, A and B, we could set up an additional rule that delivers traffic coming to our /info page to target group B. Because the ALB operates on layer seven, you can customize paths for your traffic.

    Okay, now it's time to put this into practice. Let's create an Application Load Balancer in the console. To find the ELB service, you'll need to go to the EC2 console, so we'll type EC2 into the service search bar, and then on the side panel, we'll select Load Balancers. Once we bring up the load balancers dashboard, which shows all of the load balancers for this Region, we'll click Create load balancer. Here's where we choose which type of ELB we want, and you see the three main types. We'll be using the ALB. From here, we'll choose a name and call it app-elb. The next setting asks if we want an internet-facing or an internal load balancer. An internet-facing load balancer does what you might expect: it routes requests from clients over the internet to your backend servers or targets. An internal load balancer, on the other hand, routes requests from clients with a private IP to targets with a private IP. For example, if you had a three-tier application with a web, app, and database tier, you could use an internal load balancer to route traffic from your web tier to your app tier. For this app, we'll be routing internet traffic, so you can leave it as internet-facing. After that, we choose which Availability Zones we want to route traffic to: we'll choose the VPC, select both Availability Zones, and then choose the two public subnets. Now you choose the security group for your load balancer; this is where you decide which traffic you want to allow in. Here we'll choose a security group that allows traffic on port 80 from anywhere. Then we configure the listeners and routing. Currently, the default setting is to allow HTTP traffic on port 80. If we wanted to allow or limit to HTTPS traffic, we could click Add listener and choose HTTPS, but for this demo, we'll stick with the default.

    Next, we'll configure routing, and for this we need to click Create a target group. This opens a new page where we can configure the target group. First, we select the target type, which will be instances. Then we scroll down, give it a name such as app-target-group, and leave all of the defaults selected. Then we click Next to choose which instances we want to live in the target group. Here, I'll choose the two instances I've created in private subnets, click Include as pending below, and click Create target group. Coming back to the load balancer creation page, if we refresh the dropdown for target groups, we'll see the target group that was just created. We select it, scroll down, accept the rest of the defaults, and click Create load balancer. After the load balancer has been created, we can select it and find the DNS URL in the description box. We can then copy that DNS URL, paste it into a new tab, and from here you can see our app. If I go to the /info page of our app, you can see which Availability Zone we're currently in, and if I refresh the page a few times, you should eventually see that I'm being directed to both of my EC2 instances in both AZs.

Reading 4.4: Route Traffic with Amazon Elastic Load Balancing

Reading 4.5: Amazon EC2 Auto Scaling

Week 4 Exercise & Assessment


Video: Introduction to Lab 4

Objectives

  • Make the employee directory application highly available using AWS tools.

Steps

  1. Review and Validate EC2 Instance: Examine the existing EC2 instance’s configuration for necessary settings and specifications.
  2. Create a Launch Template: Build a template based on the EC2 instance configuration, which will define how new instances are launched.
  3. Create an Application Load Balancer (ALB): Set up the ALB to distribute traffic across multiple EC2 instances for greater availability.
  4. Set up Auto Scaling Group: Using the launch template, configure an Auto Scaling group that enables the application to dynamically scale up or down based on demand.
  5. Testing:
    • Utilize the application’s built-in “stress” feature to simulate increased user demand.
    • Verify that the auto-scaling group responds by launching additional EC2 instances to handle the load.

Goal: Ensure your application can handle varying traffic loads, maintaining high availability and responsiveness for users.

  • All right, one more lab, and this time we're going to make the employee directory application highly available. In this lab, you will first review an Amazon EC2 instance and validate its configuration. Then, using this information, you will create a launch template, which will be used for EC2 Auto Scaling, and you will create an Application Load Balancer. Next, you will set up an Auto Scaling group using the launch template. This will enable your application to scale in and out with demand. Then you'll need to test it all out. The application has a stress feature built into it, which is used to simulate demand. You'll use this stress feature and then validate that scaling did actually occur. All right, go ahead and get started.

Video: Demo: Making the Employee Directory Application Highly Available

Goal

  • Handle increased traffic on the “Employee Directory” application through load balancing for even request distribution, and auto-scaling to adjust the number of running instances based on demand.

Steps

  1. Launch a New Instance: Create a new instance of the application, ensuring it has the same configuration and data access as the existing instance.
  2. Create a Load Balancer:
    • Set up an Application Load Balancer (ALB) to distribute traffic across multiple instances.
    • Configure the ALB with the appropriate security group to allow access.
  3. Create a Target Group:
    • This defines the instances the ALB will forward requests to.
    • Set up health checks to determine which instances are healthy and can receive requests.
    • Register the new instance in the Target Group.
  4. Launch Template:
    • Create a template that defines how new instances will be configured during auto-scaling. This includes instance type, security settings, and the user data script to install the application.
  5. Auto Scaling Group (ASG):
    • Create an ASG with the launch template, ensuring it’s attached to the correct load balancer.
    • Set desired, minimum, and maximum capacity (e.g., start with 2 instances, scale up to 4).
    • Configure a scaling policy based on average CPU utilization (e.g., scale up when CPU exceeds 60%).
  6. Testing:
    • Verify the application is accessible via the load balancer’s endpoint.
    • Use the built-in stress test to increase CPU load and trigger scaling.
    • Monitor auto-scaling events: instances being added and becoming healthy.

Key Concepts

  • Load Balancing: Distributes traffic across multiple instances to prevent overload and improve responsiveness.
  • Auto Scaling: Automatically adjusts the number of running instances based on defined criteria (like CPU utilization) for efficient resource usage and handling changing demand.
  • Target Groups: Hold instances that the load balancer will route traffic to.
  • Launch Templates: Predefined instructions for launching instances, ensuring consistency during scaling events.
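
The walkthrough below dials in specific health check settings (healthy threshold 2, unhealthy threshold 5, 30-second timeout, 40-second interval); in boto3 those map onto target group parameters like this, with the VPC ID as a placeholder:

```python
import boto3

elbv2 = boto3.client("elbv2", region_name="us-west-2")

elbv2.create_target_group(
    Name="app-target-group",
    Protocol="HTTP", Port=80,
    VpcId="vpc-0123456789abcdef0",  # app-vpc (placeholder ID)
    TargetType="instance",
    HealthCheckProtocol="HTTP",
    HealthCheckPath="/",
    HealthyThresholdCount=2,        # 2 consecutive passes mark a target healthy
    UnhealthyThresholdCount=5,      # 5 consecutive failures mark it unhealthy
    HealthCheckTimeoutSeconds=30,   # fail a check if no response within 30s
    HealthCheckIntervalSeconds=40,  # probe each target every 40s
)
```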

  • [Instructor] Hey y’all, and welcome to our final exercise walkthrough where we handle the load balancing and auto scaling for our application. As you can see, I am already logged into the AWS Management Console and I’m using the Admin user that I created earlier in order to handle these tasks. So in order to set up our load balancing and Auto Scaling, the first thing I’m going to do is launch the application instance. So I’m going to go over to EC2, and then go over to my Instances and work with the last
    instance that I created. In this case, it’s employee-directory-app-dynamodb. So I will select that and I will go ahead and use the shortcut that we’ve been using, where I click Actions,
    Image and templates, and then Launch more like this. In this case, instead of appending dynamodb, I am going to append lb for load balancing for this instance, and then I’m just gonna scroll and make sure that everything is where I want it to be. So I want to make sure I continue using the same key pair, and then I also want to make sure that I enable the public IP, so that I can access and test this instance. From there, I’m going to scroll down to the Advanced details. Again, just to double check that my instance role is still selected and that my user data is where I left it. And from there, I’ll go ahead and click Launch Instance. And now that that instance is launching, I can go over to View my instances, and I will just give this some time to go ahead and be fully launched, so that we can make sure that the instance is up and running and that the application is ready to go. All right, now that it
    has been a few minutes, I’ll go ahead and click refresh again. And it looks like two of two checks have passed for this application instance. So just to verify that the application is up and running, I’ll copy that IPV for IP address and then paste that into a new tab. And it looks like our Employee Directory is up and running. And because it does have access to the S3 bucket and DynamoDB table that were created, the data that I had already added to the directory is still going to be visible even though this is a new instance. So now that I verify that that works, I can go ahead and close that. And what I want to do next is to create the application load balancer. So to do that, I do need to be in the EC2 Console because the load balancers are accessed through the EC2 Console. So the way that I get to the load balancing is, I scroll down in this
    side navigation pane, and then there’s this section near the bottom that says Load Balancing. What I want to do is click Load Balancers. And now that I’m in the Load Balancers page, I will go ahead and click Create load balancer. And the type of load balancer that I want here is an Application Load Balancer. So under that option, I will go ahead and click Create. For my load balancer name, I want to go ahead and put in app-elb, and then I do wanna make sure that this is an internet-facing load balancer, so that I can access its endpoint from the internet. After that, I’ll go ahead and scroll down. And under Network mapping, I need to make sure that I select my app-vpc because that is where my instances are launched, that’s where my instances for this application will be launched. And then for the
    Availability Zone mappings, I want to select both us-west-2a and us-west-2b. Once those are selected, you can see that it pops up with the ability for me to specify which subnets I want to route within, and I will go ahead and keep that with the Public Subnets for those Availability Zones. So scrolling down, I get to Security groups. And what I want to do here is just make sure that proper access is set up for my load balancer. So to do that, instead of using this
    default security group, I’m going to go ahead and click Create new security group, and that will take me to a page where I can do just that. I can create a new security group. So for this, I’m going to name it load-balancer-sg and then my Description is just… This is for HTTP access. The VPC for this needs to be my app-vpc, so just making sure that that’s the one that is selected. And then I want to work with my rules. So it currently has no inbound rules, so I’ll go ahead and add a rule, and then add HTTP access from anywhere. And from there, I will go ahead and create my security group. So now that that security
    group has been created, I can close out that new tab and then refresh my security group list. And now I have the ability to select the security group that I just created, and that will be the security group that I use for this load balancer. Now that that’s been done, I’m going to scroll down to Listeners and routing, and this is gonna be where I determine how my load balancer is receiving requests and how it’s forwarding those requests on. So here I’m going to click Create target group, and that will open up a new tab. I want to keep Instances selected as the target type, and then I want to name my target group, app-target-group. After that, I want to make sure that it’s associated with the correct VPC. And then I want to scroll down to Health checks. And Health checks is going to be how the load balancer determines which instances are capable or which targets are capable of receiving requests. Here I want to expand Advanced health check settings, and I want to make my
    Healthy threshold, two, and my Unhealthy threshold, five. I also want to make my Timeout 30 seconds, and I want to change the interval of checks to 40 seconds. So what this does is, the healthy threshold is how many consecutive health checks that are successful are going to determine if the instance or if the target is available to receive requests. An unhealthy threshold is how many consecutive failed health checks determine which targets are not available to receive requests. For the Timeout, this is just how long is it going to take for the health check to fail if no response is received. And then the Interval is just how often are the health checks going to be sent. With all of those selected, I’m going to go ahead and scroll down, and then hit Next. And now I am able to register targets to my target group that I am creating. For this, I am going to go ahead and select the instance that I had just launched to include as a target, and then I’m going to select Include as pending below. Now that that has been included, what’s going to happen is, I will create the target group and then the load balancer will start those health checks to make sure that that instance as a target is up and running. So I’ll go ahead and click Create target group, and that target group has been created. So since that target
    group has been created, I will go ahead and close this tab. And back on my load
    balancer creation page, I can go ahead and refresh my listeners in routing dropdown list, and I can select app-target-group which I have just created in order to be utilized by the load balancer. Once I’ve selected that, I can scroll down and click Create load balancer. And now that the load
    balancer is being created, I can click View load balancer. And as you can now see, I have a load balancer that has freshly been created and it is just currently
    being provisioned, and that will just take a couple of minutes. So after a couple of minutes, we can see that the state of the load balancer has changed to Active. So what I want to do from here is just test to make sure that I can access my application from the load balancer. So to do that, I will select this specific load balancer. And what I want to do is copy the DNS name for this load balancer. With that copied, I will go ahead and open up a new tab, paste that load balancer
    endpoint in there. And as we can see, the load balancer is directing traffic to the application instance with the data that I have added. Okay, now that I can see that that works, I’m going to go back to the EC2 Management Console. And the next thing that I want to do is get everything ready for scaling. So the first thing that I need to do in order to do that is to create a launch template. So in the left-hand menu here, under Instances, there’s a selection for Launch Templates. I’m going to go ahead and select that. And because I don’t currently have any launch templates created, I will go ahead and click Create launch template. My name for this will be app-launch-template and my description will just be A web server for the employee directory application. From there for Auto Scaling guidance, I want to select this because I am specifically going to be focusing on EC2 Auto Scaling, and then I want to scroll down and select the components that are necessary for me to launch my instances. What the launch template does is just, it tells Auto Scaling what it is that I want to launch. So what instances, how to configure those instances and all of the details that you would use if you were manually launching instances. So what I can do here is, since I do have instances already running, I can select Currently in use, and that will bring up what I am currently running. And because my account only has one instance currently running, it will pull the information for that instance. For my Instance type, I want to select a t2 micro, because that is the instance type that I want to scale, and it’s Free tier eligible, so that will keep those cost down for this little exercise. For my key, I’ll just keep selecting the same keys that I have been. And then for my Network settings, I want to make sure that I select the security group that I created specifically for this. And so that is going to be my web-security-group. After I’ve done that, I’m going to go ahead and scroll down to my Advanced details and expand that. And the two main things that I want to do here are, I want to make sure that my instance profile is utilizing the role that I’ve set up for access to the other resources, and then I want to make sure that I add my user data in order to install and configure the application that I’m using. So to do that, I will go ahead and paste the user data in here, but then I need, I just need to make sure that I change the necessary components so that I am working with the same resources. So I’ll change my bucket to the bucket that I’ve created for this application, and I will change my region to us-west-2 for the Oregon region. So once I’ve done that, I can create my launch template. And as we can see, my launch template has
    been successfully created. And if I go over to View launch templates, I can see the newly
    created launch template. From here, I need to create the actual Auto Scaling group. So to do that, I am going to scroll down in this left-hand navigation pane. And under Auto Scaling, I’m going to select Auto Scaling Groups. I don’t currently have
    any Auto Scaling groups, so I will click Create Auto Scaling group. The name for this group will be app-asg. And because I’ve created
    a launch template, I can go ahead and just select the launch template that was created
    specifically for this group. Once that has been selected, I can go ahead and click Next. And this is where I get to choose various launch options. So in this case, I want to make sure that I am launching into my app-vpc, and then I want to launch these instances into my public subnets. So I will choose Public Subnet 1 and Public Subnet 2. With those selected, I can go ahead and click Next. And now I’m in my advanced options. The reason that this is important is because for one I’ve
    created a load balancer and I need to make sure that this Auto Scaling group is utilizing that load balancer. So what I can do is attach to an existing load balancer, and then I can choose my load balancer from my load balancer target groups. From here, I can just drop this down and select app-target-group. And then for my Health checks, I will just have the load balancer handle the health checks, since those have already been set up. After that, I will click Next. And here is where I can decide how large and how small I want my group of instances or my fleet of instances to be. So what I’m going to do here is, I’m going to change my Desired capacity to two. I also want my Minimum capacity to be two, and then I want my Maximum
    capacity to be four. This will mean that my Auto Scaling group will launch with two instances, and then it will never get smaller than two instances. So if one of those two instances is deemed unhealthy, it will launch an instance to replace that. If there is a load on the load balancer and a load on the instances, then the maximum that my fleet, the maximum size that my fleet will be is four instances. And so that’s what this group size is establishing. After that, I want to make sure that, for my Scaling policies, I choose a Target scaling policy. And for this Target scaling policy, I can change my target value for my average CPU utilization to be 60, and I’ll keep the 300 seconds of warmup as the default. And that’s just going to make sure that if I scale, the scaling is going to be based on my average CPU
    utilization across my fleet, and if that hits or goes above 60%, then it will launch new instances, but it will give those
    instances 300 seconds or five minutes to warm up and start passing their health checks before it does another scaling action. So now that that’s done, I can go ahead and click Next. And for notifications, I am not currently going to add notifications for my walkthrough, but here you could add a notification and make that notification an SNS topic and utilize your email address so that you can be notified anytime a scaling action is taken. But instead, I’m just going to go ahead and click Next, and then Next again, and just quickly review that everything is where I want it before creating my Auto Scaling group. So now that my Auto Scaling
    group has been created, I need to let it get everything to that desired capacity. And once it’s there, I can test and make sure that my application is still accessible, and then I can test scaling. So before I test my application, what I want to do is go back to the EC2 Console, and then under Load Balancing, near the bottom of the left-hand menu, I want to hit Target Groups. The reason I’m doing this is that I want to make sure, before I start testing anything, that everything is running and healthy. If I want to keep an eye on it, what I can do is click Targets. And then as you can see, the Auto Scaling group has launched two additional instances, one in each Availability Zone, and they are currently healthy. So that means that my application is up and running and I have multiple instances now that will be utilized whenever requests are made. And so, if one were to become unhealthy or go down for any reason, I have the other instances there to still answer those requests. So to test my application and the scaling of the application, what I’m going to do is go back to the application page that I have open, and there’s a stress test built into this application just to make it easier for this exercise. So what I’ll do is append /info to the end of my URL, and that will take me to a page that gives me access to some tooling. One of the things I can do is refresh this page, and as you’ll see, it will rotate between the various instances that I have running and the Availability Zones that those instances are running in. So refreshing this will show that it’s changing the instances and changing the Availability Zones that are being hit. But what I want to do to test this is use the built-in Stress CPU test here. And I’m gonna go ahead and stress the CPU for 10 minutes. This is going to allow me to stress my CPU to get it above that 60% threshold and launch new instances. And so while I’m waiting for that to happen, I’m going to go back over to the target group and view my targets. And then I’m just going to periodically refresh this to see when new instances are being added and when they’re healthy. So I’m just going to give that a few minutes as I wait for those to launch. So now that I’ve given it some time, I’m going to see if some new instances have been launched. So I’m gonna go over to Instances, and as you can see, two more instances have been launched in response to the scaling action that is happening because of the CPU stress that we initiated. So these instances will be launched and then added to the rest of the fleet, where they will answer requests alongside the existing instances. I hope you enjoyed the course and that the exercises
    were helpful to you. And I’ll see you in another course.
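
For reference, here is a minimal boto3 sketch of the target tracking policy configured in this walkthrough. It is an illustration, not the console’s output: the Auto Scaling group name is a hypothetical placeholder, and the values mirror the demo (60% average CPU target, 300-second warmup).

    # Sketch: attach a target tracking scaling policy to an existing
    # Auto Scaling group. The group name below is a placeholder.
    import boto3

    autoscaling = boto3.client("autoscaling")

    autoscaling.put_scaling_policy(
        AutoScalingGroupName="employee-directory-asg",  # hypothetical name
        PolicyName="cpu-target-tracking",
        PolicyType="TargetTrackingScaling",
        EstimatedInstanceWarmup=300,  # seconds new instances get to warm up
        TargetTrackingConfiguration={
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization"
            },
            "TargetValue": 60.0,  # scale out when average CPU crosses 60%
        },
    )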

Quiz: Week 4 Quiz

What are the three components of Amazon EC2 Auto Scaling?

Which of the following features are included in Elastic Load Balancing (ELB)?

True or False: When a user uses Elastic Load Balancing (ELB) with an Auto Scaling group, it is not necessary to manually register individual Amazon Elastic Compute Cloud (Amazon EC2) instances with the load balancer.

An application must choose target groups by using a rule that is based on the path of a URL. Which Elastic Load Balancing (ELB) type should be used for this use case?

What are the two ways that an application can be scaled?

Which elements in Amazon CloudWatch dashboards can be used to view and analyze metrics?

What are the possible states of a metric alarm in Amazon CloudWatch?

What kind of data can a company collect with VPC Flow Logs?

What is a benefit of monitoring on AWS?

True or False: When a company redesigns an application by using a serverless service on AWS, they might not need to configure networking components, such as a virtual private cloud (VPC), subnets, and security groups.

Going Serverless


Video: Redesigning the Employee Directory

Original Architecture:

  • Employee directory application hosted on EC2 instances (combining presentation and application logic).
  • Application Load Balancer for traffic distribution.
  • DynamoDB database.
  • S3 for image storage.

Challenges of Original Architecture:

  • Potential overload on EC2 instances due to handling both website display and backend logic.
  • Maintenance overhead including patching, updates, and instance optimization.

Proposed Serverless Redesign:

  • S3 for Static Website Hosting: Move HTML, CSS, JavaScript (presentation layer) to S3.
  • JavaScript for Dynamic Content: Leverages JavaScript’s ability to make HTTP requests to load data from the backend.
  • AWS Lambda for Application Logic: Replaces EC2 instances, running backend code in response to events (API calls).
  • Amazon API Gateway: Front-end for the backend API, triggers Lambda functions based on requests.
  • DynamoDB and S3 Retained: Database and image storage remain the same.
  • IAM Roles: Manage secure access between components.

Benefits of Serverless Redesign:

  • Scalability: Serverless components scale automatically based on demand.
  • Reduced Operational Overhead: No more patching or server management.
  • Potential Cost Optimization: Depends on usage patterns and traffic.
  • Flexibility: Modular design allows for future changes with less disruption.

Additional Services:

  • Amazon Route 53: DNS management.
  • Amazon CloudFront: Caching for faster delivery of static content.

Overall: The serverless architecture offers improved scalability, easier maintenance, and the potential for cost savings by focusing on pay-per-use components (especially for variable traffic loads).
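
To make the Lambda portion of this design concrete, here is a minimal sketch of a handler that API Gateway could invoke to return all employees. It is an assumption-laden illustration: the table name Employees and the response shape are placeholders, not the course’s actual code.

    # Sketch: Lambda handler behind API Gateway that returns all employees
    # from DynamoDB. The table name "Employees" is a hypothetical placeholder.
    import json
    import boto3

    dynamodb = boto3.resource("dynamodb")
    table = dynamodb.Table("Employees")

    def lambda_handler(event, context):
        # A scan is fine for a small demo table; a real app would paginate.
        response = table.scan()
        return {
            "statusCode": 200,
            "headers": {"Content-Type": "application/json"},
            # default=str handles DynamoDB's Decimal number type
            "body": json.dumps(response.get("Items", []), default=str),
        }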

  • Nice work getting through this course. Let’s take a look at how this architecture we built turned out. The employee directory application
    is currently being hosted across multiple EC2 instances inside of a VPC in a private subnet. These EC2 instances are part of an EC2 Auto Scaling group. And traffic is being distributed across them using an
    application load balancer. The database is being
    hosted on Amazon DynamoDB. And the images are stored in S3. Beautiful. Looking at this from a
    maintenance perspective, you would need to ensure that your auto-scaling
    policies are working well with your expectations and it would likely take
    some tweaking over time. You also would need to install
    security patches and updates for EC2, as well as keep an eye out for new instance sizes or types that might help you
    further optimize your solution. Now, this is really great. But as with everything built on AWS, there are multiple ways you
    can architect a solution and have success. What you are optimizing for and what you are trying to do will determine how you architect an application. That being said, what I want to do now is present to you an architecture that could be a wonderful
    serverless redesign of the employee directory application, taking full advantage
    of cloud native services like AWS Lambda. I’m going to touch on some services we haven’t covered yet in this course to give you ideas of
    alternative architectures. So this employee directory application is a great example of a
    standard three-tier application, where you have the presentation layer, the application layer, and the data layer. The presentation layer
    is the user interface. The application layer
    is the business logic. And the data layer is the database. As things are right now, the Amazon EC2 instances are hosting both the presentation layer, as well as the application layer. This is true because the EC2 instances have a web server running that is serving the
    content for the website, like the HTML, CSS, and JavaScript, which is the presentation layer. Then the same instances
    are also handling requests for the backend logic for
    viewing, adding, updating and deleting employees, which
    is the application layer. What I want to do now is separate
    those two pieces entirely, having the front-end of the
    website hosted separately from the backend application logic. It’s important to separate
    the presentation layer from the application layer so that the instances are not overloaded by handling different types
    of requests at the same time. We’re going to move the presentation layer to be hosted out of Amazon S3. S3 supports static website hosting, and therefore this is a great place for us to host the HTML, CSS and
    JavaScript for our website. When you’re hosting a
    static website with S3, you may think, “Well, my website isn’t
    static. It’s dynamic.” It’s pulling data from a database, so this isn’t a static website, and therefore S3 would not
    work for this use case. This is where JavaScript comes in. JavaScript files have the
    ability to make HTTP requests and load dynamic content, modifying the static
    page to display results that come back from requests. So this should work well. The presentation layer is taken care of. Now, I want to tackle
    the application layer. It used to be hosted on Amazon EC2. But let’s go ahead and
    change this to AWS Lambda. This means that our employee
    directory application code would only be run in response
    to events being triggered by the front-end presentation layer. Now, you don’t want your
    front-end talking directly to your backend code. So you would instead expose
    your backend using an API. We would use a service called Amazon
    API Gateway to host this API. Each action you could take on an employee would have its own method on the API. This API hosted on API Gateway would act as a front door
    to trigger the backend code, which we would host on AWS Lambda instead of EC2 as discussed. We could have one Lambda
    function handle all of the requests for employee data, or we could have one Lambda
    function for each action. We would keep DynamoDB for the
    database or the data layer, and we would also keep S3 for
    the employee photo storage. All of the access between
    these services would be handled via role-based access using IAM roles. One nice thing about this, is notice how, because we built the
    solution in a modular way, we were able to swap out how we were handling the
    presentation and application layer while leaving the data
    layer totally intact with no modifications. That is the type of flexibility
    that can help you innovate and adapt quickly to changes. So now for completeness and clarity, let’s focus on the new architecture and fill it out a bit more. I will add some other AWS
    services to this diagram that you can explore on your own. First, I’m going to add Amazon Route 53 for domain name management and Amazon CloudFront here as well, which will allow us to cache those static assets like the HTML, CSS and JavaScript, closer to the end users by taking advantage of AWS Edge locations. If a user wants to visit the employee directory application website and view all of its
    employees, here’s the flow. The user would first type in
    the domain for the website, which would get sent to Amazon Route 53. Route 53 would send back to the client the address of the static
    website being hosted on S3, and then the website would
    be rendered in the browser. This website has JavaScript making the API calls to the backend to load the dynamic content. So the API call to load all
    of the employees would be made and it would hit API Gateway first. API Gateway would validate the request and then trigger the
    backend Lambda function. The Lambda function would
    then send an API call to DynamoDB to query the
    employees in the table and it would return that
    data to API Gateway, which would then be
    returned to the JavaScript, which would finally be
    rendered on the page. All right, and that’s that. With this architecture we just laid out, we have optimized for
    scalability and low operational overhead, and depending on your usage, it could also be optimized for cost. The serverless aspects of this make the support operations much lighter compared with Amazon EC2-based workloads. There is no patching or AMI management when you use serverless
    solutions like AWS Lambda. Also, notice how it was not required that I create a VPC,
    subnets, security groups, or network access control
    list for the solution. The networking aspect of
    this is managed for you. Though, you can integrate
    serverless services with your VPC if you need
    to, for compliance reasons. But it’s not required to get
    a solution up and running. You have many options to choose from when designing your application. You can imagine a scenario where you redesign the same application to be hosted using AWS container services, and then this entire
    diagram would change again. There are a lot of ways
    you can build on AWS and that’s the beauty of it. You can swap certain pieces
    of your solutions out as AWS services are released
    or gain new features. And because everything
    in AWS is an API call, you can automate the
    process along the way. That’s it for this course. From me, Seph, Meowzy and Fluffy, thank you so much for learning with us. One more reminder to
    please, please, please, remember to delete any
    resources that you’ve created in your own AWS account for this class to avoid incurring any costs if you’ve been following along. Thanks again and see you next time.
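
As a companion to the static website hosting discussion above, here is a hedged boto3 sketch of enabling website hosting on an S3 bucket. The bucket name is a placeholder, and a real deployment would also need a bucket policy that allows public reads (or a CloudFront distribution in front).

    # Sketch: enable static website hosting on an existing S3 bucket.
    # The bucket name is a hypothetical placeholder.
    import boto3

    s3 = boto3.client("s3")

    s3.put_bucket_website(
        Bucket="employee-directory-site",
        WebsiteConfiguration={
            "IndexDocument": {"Suffix": "index.html"},
            "ErrorDocument": {"Key": "error.html"},
        },
    )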

Practice Quiz: Self-Graded Activity: Capstone Project

Reading

End of Course Assessment


Quiz: End of Course Assessment

What are the four main factors that a solutions architect should consider when they must choose a Region?

Which statement BEST describes the relationship between Regions, Availability Zones and data centers?

Which of the following can be found in an AWS Identity and Access Management (IAM) policy?

A solutions architect is consulting for a company. When users in the company authenticate to a corporate network, they want to be able to use AWS without needing to sign in again. Which AWS identity should the solutions architect recommend for this use case?

A company wants to allow resources in a public subnet to communicate with the internet. Which of the following must the company do to meet this requirement?

What does an Amazon Elastic Compute Cloud (Amazon EC2) instance type indicate?

What is a typical use case for Amazon Simple Storage Service (Amazon S3)?

A solutions architect is working for a healthcare facility, and they are tasked with storing 7 years of patient information that is rarely accessed. The facility’s IT manager asks the solutions architect to consider one of the Amazon Simple Storage Service (Amazon S3) storage tiers to store the patient information. Which storage tier should the solutions architect suggest?

Which task of running and operating the database are users responsible for when they use Amazon Relational Database Service (Amazon RDS)?

True or false: A Multi-AZ deployment is beneficial when users want to increase the availability of their database.

What are the three components of Amazon EC2 Auto Scaling?

An application must choose target groups by using a rule that is based on the path of a URL. Which Elastic Load Balancing (ELB) type should be used for this use case?

Video: Introduction to Amazon CodeWhisperer

What is Amazon CodeWhisperer?

  • AI-powered coding assistant integrated within your IDE.
  • Generates code suggestions in real-time based on your comments and existing code.
  • Supports both general-purpose coding tasks and optimization for AWS services.

How does it work?

  1. Natural Language Comments: Describe your desired code in plain English, and CodeWhisperer will offer corresponding code snippets.
  2. Code Generation: Can suggest single lines or whole functions to accelerate development.
  3. Code Completion: Provides smart suggestions as you type, streamlining common syntax.

Key Benefits

  • Productivity Boost: Reduces time spent on boilerplate code and searching for syntax.
  • Security Assistance: Helps you identify and fix potential vulnerabilities, aligning with security best practices.
  • Easier Learning: Simplifies learning new APIs, especially for AWS services.

How to Get Started:

  1. Install the AWS toolkit for your IDE.
  2. Create an AWS Builder ID (free).

Important Notes

  • CodeWhisperer’s suggestions are a starting point – always review and customize the generated code.
  • Use descriptive comments and intuitive naming conventions to maximize suggestion quality.

  • [Morgan] Wouldn’t it be
    nice if when you were coding, you had an always-online,
    always-available companion helping you along the way? With generative AI, you can have a helpful virtual coding
    buddy right in your IDE. I’m talking about the
    service Amazon CodeWhisperer. CodeWhisperer is an AI coding companion that generates coding suggestions to you in real time as you type. These coding suggestions
    may be single line or full function code and it can really help you accelerate how quickly you can build software. With CodeWhisperer,
    you can write a comment in natural language that outlines
    a specific task in English such as upload a file to Amazon S3 with server-side encryption. Then CodeWhisperer recommends
    one or more code snippets that you can use to accomplish the task directly in your IDE. You can then optionally
    accept that suggestion or cycle through other suggestions. CodeWhisperer is a general-purpose tool, meaning it can help you
    with general coding tasks, but it is also optimized
    for popular AWS services. Think about how many
    times you’ve forgotten the syntax for a common task or maybe you don’t know the syntax for how to interact
    with a specific AWS API. With CodeWhisperer, you don’t need to go search on the internet for an answer. Instead you can write a comment for what you are trying to do and CodeWhisperer will generate
    some code for you to review. This can save you a lot of time when compared to reading
    through documentation searching for an example. CodeWhisperer will give
    you a code suggestion and you can get answers without
    needing to leave your IDE. That being said, it’s not just
    about simple code generation. You can also have it generate more complex code like full functions and with that, it can help
    you solve problems quickly and efficiently through
    its code suggestions. CodeWhisperer Individual is
    available to use at no cost by creating and signing
    in with an AWS Builder ID. The signup process only
    takes a few minutes and does not require a credit card or an AWS account. CodeWhisperer also has a feature where it scans your code to detect hard-to-find vulnerabilities and gives you code
    suggestions to remediate them. This can help you align to best practices for tackling security
    vulnerabilities like those outlined by Open Worldwide Application
    Security Project, or OWASP, or those that don’t meet
    crypto library best practices or other similar security best practices. All right, enough talking
    about this amazing tool. Let’s see it in action. I’m in my PyCharm IDE and I already have the
    AWS toolkit installed, which includes Amazon CodeWhisperer under the developer tools section. I am also logged in with my Builder ID, so I’m ready to start using CodeWhisperer. I have a blank Python file, and since I already have
    CodeWhisperer turned on, now I want to create a simple program that creates an S3 bucket
    given a bucket name. First, I’m going to type
    the import statement at the top of the file, importing boto3 which is
    the AWS SDK for Python. Then I’m going to make a
    comment to create the S3 client. You can see CodeWhisperer is popping up to make a suggestion
    for creating the client. I can use the arrow keys to cycle through the different suggestions and I can press the Tab button
    to accept the suggestion which I will do in this case. Then I’m going to delete this next comment that it generated. Then I want to write a comment describing the function
    to create the bucket. I will make a comment saying
    function that inputs a string and creates an S3 bucket
    with error handling. And then I can hit enter and you can see CodeWhisperer
    is making some suggestions. I can cycle through these
    using the arrow keys, so I’ll hit the arrow key and we can see the different suggestions that it’s making to us. I am going to accept this
    suggestion by hitting Tab. Now if I hit enter, we can see what else
    CodeWhisperer suggests. I’m going to accept some more suggestions until this function looks complete. All right, so now we have a function that will create an S3 bucket
    given the string as a name and it will catch an exception and print it so that the
    program doesn’t throw a stack trace if something goes wrong. Note how I’m writing comments and letting CodeWhisperer
    generate suggestions that way. You don’t need to use it this way. You can code like you usually do and it will suggest things along the way. For now, let’s continue
    using the comment method. Let’s write a comment
    to call the new function and let CodeWhisperer
    generate that as well. CodeWhisperer put in a
    placeholder name here that says bucket name, but the thing about S3 is that bucket names have to be globally unique. So I’m going to append a bunch of numbers to this bucket name to make sure that it wasn’t taken by another account. Then I can hit enter and continue. Now I want to use CodeWhisperer to generate some sample JSON data and then I want to write that data into an object in our new S3 bucket. So I will create another comment and type generate JSON
    data to upload to S3. Then I can hit enter and we can then start
    to see the suggestions. I’m going to accept this suggestion here. This fills out some dummy data for this particular JSON sample. This looks fine for our
    sample proof of concept. I’m gonna go ahead and hit enter again. Then I will make another comment to upload this object to S3 by saying upload data to S3 and then we can cycle
    through the suggestions here, and I’m going to accept this one. And you can see that this line of code to upload the data to S3 has the bucket name
    that we created earlier. This is because CodeWhisperer looks at all of the code in the file and uses that context to generate code. CodeWhisperer uses information
    like code snippets, comments, cursor location, and contents from files open in the IDE as inputs to provide code suggestions. So that is how it can give you really specific suggestions like this because it takes in all of this context. Now we need to run the file. So before I do that, I’m going to add the closing
    parentheses to this statement and then I want to modify the code that CodeWhisperer suggested to me. I’m going to save the
    response into a variable here and then I want to add
    some print statements. So I’m going to print the response and you can see CodeWhisperer
    is helping me do that. And from here I’m going to save it, and then we can right-click on app.py and select Run App. We can see that we have a
    status code of 200 right here which means that the bucket was created and the object was uploaded. You can do all sorts of things
    with Amazon CodeWhisperer. You can generate single lines of code, full functions, generate
    sample data and more. Now that we have explored
    how to use comments to generate the suggestions, let’s move on to using
    single-line code completion, meaning that while I’m typing, it will make suggestions, sort of like a really smart IntelliSense. To highlight this, I’m going to add a new function definition to this file. Back in the file, I’m going to scroll down to the bottom and then we can begin
    typing our new function. I want this one to list all
    of the buckets in the account, so I’m going to type def list_buckets and then we can see that CodeWhisperer is suggesting some code for
    how to write out this function. I’m going to accept the first one and we can see that this
    will call the S3 client and call the API list buckets and then print out the response. So you don’t always need to make a comment for CodeWhisperer to
    generate suggestions for you. You can just code like normal and it will kick in and start working. Something to note about CodeWhisperer and generative AI in general is that it doesn’t magically know exactly what you’re trying to do meaning that CodeWhisperer
    might not generate exactly what you want every time, but using it can still save tons of time as it can lay out a lot of
    the basics that you need. Then you can go in and modify
    it to your specific needs. You own the code that you write and are responsible for it including any code suggestions
    provided by CodeWhisperer. So always review a code
    suggestion before accepting it and edit it as needed. To get the most out of CodeWhisperer, you should write comments that are short and mapped to smaller discrete tasks so that no single function or code block is too long. CodeWhisperer is the most helpful when you use intuitive names
    for various code elements such as function names. The more code that is available
    as surrounding context, the better the suggestions will be. To round out this video, let’s review why you
    should use CodeWhisperer. CodeWhisperer helps accelerate
    software development by providing code suggestions that can reduce total development effort and let you have more time for ideation, complex problem solving, and
    writing differentiated code. CodeWhisperer is a really great tool for reducing the amount of time you are spending writing boilerplate code, so that leaves more time for you to focus on more interesting tasks
    in software development. CodeWhisperer can make general
    purpose code suggestions and AWS API related coding suggestions, and CodeWhisperer can help you improve application security by helping detect and remediate security vulnerabilities. This was a short demo
    on Amazon CodeWhisperer, but I hope you can see how it can help you be more productive as a developer. Install the AWS toolkit in your IDE and sign in with an AWS
    Builder ID to get started.
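
Because the generated code never appears in this text, here is a minimal sketch approximating the program built in the demo. Everything here is a reconstruction under stated assumptions: the bucket name, key, and sample data are placeholders, and CodeWhisperer’s actual suggestions will differ. (Note that outside us-east-1, create_bucket also needs a CreateBucketConfiguration with a LocationConstraint.)

    # Approximate reconstruction of the demo: create an S3 bucket with error
    # handling, upload sample JSON data, and list the account's buckets.
    import json
    import boto3

    # create the S3 client
    s3 = boto3.client("s3")

    # function that inputs a string and creates an S3 bucket with error handling
    def create_bucket(bucket_name):
        try:
            s3.create_bucket(Bucket=bucket_name)
        except Exception as e:
            # print instead of raising so the program doesn't throw a stack trace
            print(e)

    # call the new function; the appended numbers keep the name globally unique
    bucket_name = "codewhisperer-demo-bucket-20240101"  # hypothetical name
    create_bucket(bucket_name)

    # generate JSON data to upload to S3 (dummy sample data)
    data = {"id": 1, "name": "sample", "value": "demo"}

    # upload data to S3 and print the response
    response = s3.put_object(
        Bucket=bucket_name,
        Key="sample.json",
        Body=json.dumps(data),
    )
    print(response)

    # list all of the buckets in the account
    def list_buckets():
        response = s3.list_buckets()
        print(response)

    list_buckets()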