Amazon AWS Certified SysOps Administrator Associate – S3 Storage and Data Management – For SysOps Part 6
15. [SAA] S3 Analytics
So one more thing to add on storage class analysis: you can set up S3 Analytics to determine when to transition objects from Standard to Standard-IA, so figuring out after how many days it is best for you to transition the objects. It does not work for objects already in Standard-IA or Glacier, just for Standard to Standard-IA. And this report, once you enable it, is updated daily, and when you first enable it, it will take between 24 and 48 hours for the first report to appear.
So in case you want to build lifecycle rules or improve them, a very good first step is to enable S3 Analytics to figure out after how many days it will be wise for you to move your objects from Standard to Standard-IA. And that very simple thing is something that can come up in the exam. That's it, I hope you liked it, and I will see you in the next lecture.
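As a minimal sketch, here is how enabling that analysis might look with boto3. The bucket names, the configuration Id, and the report prefix are all hypothetical; the export destination must be a bucket ARN, and the actual API call needs AWS credentials.

```python
def analytics_configuration(export_bucket_arn: str) -> dict:
    """Build a storage class analysis config that exports a daily CSV report."""
    return {
        "Id": "entire-bucket-analysis",  # hypothetical configuration id
        "StorageClassAnalysis": {
            "DataExport": {
                "OutputSchemaVersion": "V_1",
                "Destination": {
                    "S3BucketDestination": {
                        "Format": "CSV",
                        "Bucket": export_bucket_arn,       # where reports land
                        "Prefix": "analytics-reports/",    # hypothetical prefix
                    }
                },
            }
        },
    }


def enable_analytics(bucket: str, cfg: dict) -> None:
    """Attach the analysis config to the bucket (needs AWS credentials to run)."""
    import boto3

    s3 = boto3.client("s3")
    s3.put_bucket_analytics_configuration(
        Bucket=bucket, Id=cfg["Id"], AnalyticsConfiguration=cfg
    )
```

After roughly 24 to 48 hours, the daily CSV reports start appearing under the chosen prefix, and from those you can pick the transition age for a lifecycle rule.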
16. [SAA/DVA] S3 Performance
We have to talk about the S3 baseline performance. By default, Amazon S3 automatically scales to a very, very high number of requests, and has very low latency, between 100 and 200 milliseconds to get the first byte out of S3. So this is quite fast. And in terms of how many requests per second you can get, you can get 3,500 PUT/COPY/POST/DELETE requests per second per prefix, and 5,500 GET/HEAD requests per second per prefix in your bucket.
So this is something you can find in the documentation, and I think it's not very clear, so I'll explain to you what "per second per prefix" means. But what that means, first of all, is that it's really, really high performance, and there's no limit to the number of prefixes in your bucket. So let's take an example of four objects named file, and let's analyze the prefix for each object.
The first one is in your bucket, at /folder1/sub1/file. In this case, the prefix is anything between the bucket and the file, so here it is /folder1/sub1/. That means that for this file, in this prefix, you can get 3,500 PUTs and 5,500 GETs per second. Now, if we have another object at /folder1/sub2/file, the prefix is again anything between bucket and file, so /folder1/sub2/, and we also get 3,500 PUTs and 5,500 GETs for that prefix, and so on. So if I have /1/file and /2/file, we have two more different prefixes. And so it's easy now to understand what a prefix is, and it's easy to understand the rule of 3,500 PUTs and 5,500 GETs per second per prefix in the bucket.
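To make the idea concrete, here is a tiny illustration (not an AWS API, just string handling) that extracts the prefix from each of the four example keys above: the prefix is everything up to and including the last slash.

```python
def prefix_of(key: str) -> str:
    """Return the prefix of an S3 object key: everything up to the last '/'."""
    idx = key.rfind("/")
    return key[: idx + 1] if idx != -1 else ""


# The four example objects from the lecture, all named "file".
keys = [
    "folder1/sub1/file",
    "folder1/sub2/file",
    "1/file",
    "2/file",
]
prefixes = [prefix_of(k) for k in keys]
# Four distinct prefixes, so each carries its own 3,500 PUT / 5,500 GET
# per-second budget: 4 * 5,500 = 22,000 GET/HEAD requests per second
# if reads are spread evenly across them.
```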
So that means that if you spread reads across all four prefixes above evenly, you can achieve 22,000 requests per second for GET and HEAD. So that's perfect. Next, let's talk about KMS as a limitation to S3 performance. If you have KMS encryption on your objects using SSE-KMS, then you may be impacted by the KMS limits. When you upload a file, S3 will call the GenerateDataKey KMS API on your behalf, and when you download a file from S3 using SSE-KMS, S3 itself will call the Decrypt KMS API. And so these two requests will count towards the KMS quota. So let's have an example: our users connect to an S3 bucket, and they want to upload or download a file using SSE-KMS encryption.
And so the S3 bucket will perform an API call, either GenerateDataKey or Decrypt, to a KMS key and get the result from it. By default, KMS has a quota of requests per second, and based on the region you're in, it could be 5,500, 10,000, or 30,000 requests per second. And you can request a quota increase using the Service Quotas console if you need to go over that limit. So what this means is that if you have more than 10,000 requests per second in a specific region that only supports 5,500 requests per second for KMS, then you will be throttled. So you need to ensure that KMS doesn't block your performance on S3. Now, these quotas are pretty big for normal usage, but still good to know if you have many, many files and high usage of your S3 bucket. Now, let's talk about S3 performance and how we can optimize it. The first one is multi-part upload. It is recommended to use multi-part upload for files that are over 100 MB, and it must be used for files that are over 5 GB.
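A quick sketch of the splitting step that multi-part upload is built on: given a file size and a part size, compute the byte offsets of each part. The 100 MiB part size below is just an illustrative choice.

```python
def split_into_parts(size_bytes: int, part_size: int) -> list:
    """Return (start, end_inclusive) byte offsets for each upload part."""
    parts = []
    start = 0
    while start < size_bytes:
        end = min(start + part_size, size_bytes) - 1
        parts.append((start, end))
        start = end + 1
    return parts


# A 1 GiB file with 100 MiB parts: ten full parts plus one 24 MiB remainder.
parts = split_into_parts(1024 * 1024 * 1024, 100 * 1024 * 1024)
```

In practice you rarely do this by hand: boto3's `upload_file` switches to multi-part automatically once the file crosses the `multipart_threshold` of a `boto3.s3.transfer.TransferConfig`, and uploads the parts in parallel for you.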
And what multi-part upload does is parallelize uploads, and that helps us speed up transfers to maximize bandwidth. A diagram always makes more sense: we have a big file and we want to upload that file into Amazon S3. We divide it into parts, so smaller chunks of that file, and each of these parts will be uploaded in parallel to Amazon S3. Once all the parts have been uploaded, Amazon S3 is smart enough to put them back together into the big file. Okay, very important. Now we have S3 Transfer Acceleration, which is for upload and download. It increases the transfer speed by transferring a file to an AWS edge location, which will then forward the data to the S3 bucket in the target region. And there are more edge locations than there are regions.
There are over 200 edge locations today, and the number is growing. Let me show you in a diagram what that means. And Transfer Acceleration is compatible with multi-part upload. So let's have a look: we have a file in the United States and we want to upload it to an S3 bucket in Australia. What this will do is upload that file through an edge location in the United States, which will be very, very quick, and that is the only part that uses the public internet. Then, from that edge location to the Amazon S3 bucket in Australia, the edge location will transfer it over the fast private AWS network. This is called Transfer Acceleration because we minimize the amount of public internet we go through and maximize the amount of private AWS network we go through. So Transfer Acceleration is a great way to speed up transfers.
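As a small hedged sketch, this is roughly how you would turn on Transfer Acceleration for a bucket with boto3 and derive the accelerated endpoint clients should use afterwards. The bucket name is hypothetical, and the actual API call needs AWS credentials.

```python
def accelerate_endpoint(bucket: str) -> str:
    """Hostname to use once Transfer Acceleration is enabled on the bucket."""
    return f"{bucket}.s3-accelerate.amazonaws.com"


def enable_acceleration(bucket: str) -> None:
    """Enable Transfer Acceleration (needs AWS credentials to run)."""
    import boto3

    s3 = boto3.client("s3")
    s3.put_bucket_accelerate_configuration(
        Bucket=bucket,
        AccelerateConfiguration={"Status": "Enabled"},
    )
```

Uploads sent to the accelerated endpoint enter AWS at the nearest edge location and ride the private network the rest of the way to the bucket's region.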
Okay, now how about getting files, reading a file in the most efficient way? We have something called S3 byte-range fetches, which parallelize GETs by requesting specific byte ranges of your files. It also gives you better resilience in case of failures: if you fail to get a specific byte range, you can retry a smaller byte range. So byte-range fetches can be used to speed up downloads this time. Let's say we have a file in S3 and it's really, really big. Maybe you want to request the first part, which is the first few bytes of the file, then the second part, and then the end part. So we request all these parts as specific byte-range fetches.
It's called a byte range because we only request a specific range of the file, and all these requests can be made in parallel. So the idea is that we can parallelize the GETs and speed up the downloads. The second use case is to retrieve only a partial amount of the file. For example, if you know that the first 50 bytes of the file in S3 are a header that gives you some information about the file, then you can just issue a byte-range request for those first 50 bytes, and you would get that information very quickly. All right, that's it for S3 performance. We've seen how to speed up uploads and downloads, we've seen the baseline performance, and we've seen the KMS limits. So make sure you know those going into the exam, and I will see you in the next lecture.
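The two use cases above can be sketched like this: a helper that covers a whole object in chunk-sized `Range` headers for parallel downloads, and a single small range for the header-only case. Bucket and key names are hypothetical, and `fetch_range` needs AWS credentials to actually run.

```python
def byte_ranges(size_bytes: int, chunk: int) -> list:
    """HTTP Range header values covering the whole object in chunk-sized pieces."""
    return [
        f"bytes={start}-{min(start + chunk, size_bytes) - 1}"
        for start in range(0, size_bytes, chunk)
    ]


def fetch_range(bucket: str, key: str, range_header: str) -> bytes:
    """Fetch one byte range of an object (needs AWS credentials to run)."""
    import boto3

    s3 = boto3.client("s3")
    resp = s3.get_object(Bucket=bucket, Key=key, Range=range_header)
    return resp["Body"].read()


# Second use case: read only the first 50 bytes (e.g. a file header).
header_range = "bytes=0-49"
```

In a real download you would hand each entry of `byte_ranges(...)` to a separate thread or task calling `fetch_range`, then concatenate the chunks in order.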
17. [SAA/DVA] S3 & Glacier Select
Quick theory lecture on S3 Select and Glacier Select. The idea is that we want to retrieve less data, so subsets of what we're requesting, using SQL, by performing server-side filtering. And the SQL queries are quite simple: they can only be used to filter by rows and columns, so they're very simple SQL statements, and you cannot do aggregations or anything like that. You will use less network and less CPU client-side, because you don't retrieve the full file: S3 performs the select, the filtering, for you and only returns what you need. So the idea is that before, you have Amazon S3 sending all the data into your application, and then you have to filter it application-side to find the rows you want and keep only the columns you want.
And after, you request the data from S3 using S3 Select, and it only gives you the data you need, the columns you want and the rows you want. And the results Amazon is advertising are up to 400% faster and up to 80% cheaper, because you have less network traffic going through and the filtering happens server-side. Okay, so similarly, let's do another diagram. We have the client asking to get a CSV file with S3 Select, to only get a few columns and a few rows. Amazon S3 will perform server-side filtering on that CSV file to find the right columns and rows we want, and send the filtered data back to our client.
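As a hedged sketch, this is roughly what such a request looks like through boto3's `select_object_content`. The bucket, key, and SQL expression are made-up examples; the call itself streams back an event sequence and needs AWS credentials.

```python
def select_request(bucket: str, key: str, expression: str) -> dict:
    """Build kwargs for s3.select_object_content on a CSV with a header row."""
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": expression,  # simple row/column filter, no aggregations
        "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}},
        "OutputSerialization": {"CSV": {}},
    }


# Hypothetical example: keep two columns and only the rows matching a filter.
req = select_request(
    "my-data-bucket",
    "sales/2023.csv",
    "SELECT s.city, s.amount FROM S3Object s WHERE s.country = 'AU'",
)
```

To run it you would call `boto3.client("s3").select_object_content(**req)` and read the `"Records"` events out of the returned `Payload` stream; only the filtered rows and columns ever cross the network.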
So obviously less network, less CPU, and faster. So this is great. To summarize from an exam perspective: anytime you see server-side filtering of data in S3 to get less back, think about S3 Select and Glacier Select, and yes, it works on Glacier as well. And then for more complex serverless querying on S3, you'll see in future lectures that we have something called Amazon Athena. All right, that's it. I will see you in the next lecture.
18. [SAA/DVA] S3 Event Notifications
Okay, so now let's talk about Amazon S3 event notifications. Some events happen in your S3 bucket: for example, a new object is created, an object is removed, an object has been restored, or an S3 replication is happening. And you want to be able to react to all these events. You can create rules, and for these rules you can also filter by object name. For example, if you want to react only to JPEG files, you would filter on *.jpg. So you can create event notification rules, and these rules allow you to trigger some sort of automation inside your AWS account. A very classic use case would be to generate thumbnails of images uploaded to Amazon S3. So what are the possible targets for S3 event notifications? Well, you have three.
You have SNS, the Simple Notification Service, to send notifications and emails; SQS, the Simple Queue Service, to add messages into a queue; and finally Lambda functions to run some custom code. Now, we'll see all these services, SNS, SQS, and Lambda, in detail in this course, but for now just remember that you have these three targets for Amazon S3 event notifications, and that will make a lot of sense by the end of this course. You can create as many S3 events as desired, and most of the time they will be delivered in seconds, but sometimes they can take a minute or longer. And there's a small caveat: if you want to make sure every single event notification is delivered, you need to enable versioning on your bucket. This is what those two very long lines of text from the documentation are saying. So let's go ahead in the console to see how we can set up a very simple S3 event notification.
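A minimal sketch of what such a rule could look like through boto3: one rule sends newly created .jpg objects to an SQS queue, another sends object removals to a Lambda function. The ARNs and bucket name are hypothetical, and the targets must already grant S3 permission to invoke them before the actual call succeeds.

```python
def jpeg_notification_config(queue_arn: str, function_arn: str) -> dict:
    """Notification config: new .jpg objects -> SQS, removed objects -> Lambda."""
    return {
        "QueueConfigurations": [
            {
                "QueueArn": queue_arn,
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "suffix", "Value": ".jpg"}]}
                },
            }
        ],
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": function_arn,
                "Events": ["s3:ObjectRemoved:*"],
            }
        ],
    }


def apply_notification(bucket: str, config: dict) -> None:
    """Attach the configuration to the bucket (needs AWS credentials to run)."""
    import boto3

    boto3.client("s3").put_bucket_notification_configuration(
        Bucket=bucket, NotificationConfiguration=config
    )
```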