AWS (Amazon Web Services) Cloud Data storage

Author: Mia Hatton

Amazon S3 is described by Amazon as “object storage built to store and retrieve any amount of data from anywhere”. Similar to Google Drive in that it can store many different data types, Amazon S3 makes it easy to meet the data needs of large and complex web applications - including Netflix, Airbnb, and Amazon itself.

Mia Hatton

Budding data scientist with an entrepreneurial and science communication background.


Amazon S3 is an Amazon Web Services (AWS) product. Read more about AWS here.

Definition of Amazon S3

From Amazon Web Services:

Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance. This means customers of all sizes and industries can use it to store and protect any amount of data for a range of use cases, such as websites, mobile applications, backup and restore, archive, enterprise applications, IoT devices, and big data analytics. Amazon S3 provides easy-to-use management features so you can organize your data and configure finely-tuned access controls to meet your specific business, organizational, and compliance requirements. Amazon S3 is designed for 99.999999999% (11 9’s) of durability, and stores data for millions of applications for companies all around the world.

Does your organisation need Amazon S3?


Amazon S3 has a non-hierarchical structure, allowing you to organise your data in ways that best suit your business needs. Objects are stored within ‘buckets’, where each object is identified by a unique key, and can be organised using key prefixes and flexible key-value tags. The S3 Inventory feature helps you keep track of your data: it can be configured to generate daily or weekly reports listing your objects and their metadata.
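To illustrate what a non-hierarchical structure means in practice, here is a minimal sketch in plain Python. It does not talk to Amazon S3 at all: the keys are hypothetical, and the function `group_by_prefix` is an illustrative helper, not part of any AWS library. It shows how a flat list of object keys can still be viewed as folder-like groups purely by splitting on a delimiter in the key.

```python
from collections import defaultdict


def group_by_prefix(keys, delimiter="/"):
    """Group flat object keys by their first delimiter-terminated segment.

    Keys that do not contain the delimiter are grouped under the empty
    prefix "", much like objects sitting at the top level of a bucket.
    """
    groups = defaultdict(list)
    for key in keys:
        head, sep, _rest = key.partition(delimiter)
        prefix = head + sep if sep else ""
        groups[prefix].append(key)
    return dict(groups)


# Hypothetical object keys. The bucket holds them in one flat namespace;
# the "folders" exist only as shared prefixes in the key strings.
example_keys = [
    "invoices/2023/jan.pdf",
    "invoices/2023/feb.pdf",
    "images/logo.png",
    "readme.txt",
]

if __name__ == "__main__":
    for prefix, members in group_by_prefix(example_keys).items():
        print(repr(prefix), members)
```

Because the grouping is derived from the key strings alone, you can change how data is organised by choosing a different key naming scheme, without moving anything between real directories.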

Amazon S3 offers several storage classes, differentiated mainly by how frequently you expect to access the data. The Standard storage class is intended for frequently accessed data, whereas the Glacier storage class is intended for data archiving and as a result has longer retrieval times. The Intelligent-Tiering class is designed to optimise storage costs by automatically moving data to the most cost-effective access tier as access patterns change. The storage class options make Amazon S3 valuable for a variety of purposes, from creating backups to hosting web applications.
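As a rough sketch of how storage classes can be put to work, the Python below builds a lifecycle rule that moves objects under a given key prefix to the Glacier storage class after a number of days. The dictionary follows the shape that the AWS SDK for Python (boto3) expects for `put_bucket_lifecycle_configuration`; the rule ID, prefix, bucket name, and the helper names `archive_rule` and `apply_rule` are assumptions made for illustration.

```python
def archive_rule(rule_id, prefix, days_until_glacier):
    """Build a lifecycle configuration that archives objects to Glacier.

    Only the dictionary is built here; nothing is sent to AWS.
    """
    if days_until_glacier < 0:
        raise ValueError("days_until_glacier must not be negative")
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Status": "Enabled",
                # Apply the rule only to keys starting with this prefix.
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    {"Days": days_until_glacier, "StorageClass": "GLACIER"}
                ],
            }
        ]
    }


def apply_rule(s3_client, bucket, config):
    """Send the configuration to S3 using a boto3 S3 client.

    `s3_client` is assumed to be created elsewhere with
    boto3.client("s3"), which needs boto3 and AWS credentials.
    """
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=config
    )


# Hypothetical example: archive everything under "logs/" after 90 days.
example_config = archive_rule("archive-old-logs", "logs/", 90)
```

Keeping the rule construction separate from the API call makes it possible to review or test the configuration before applying it to a real bucket.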

You may need Amazon S3 if:

  • you need a reliable and scalable back-up solution for a variety of data types
  • you need to retain data for long periods of time without accessing it
  • transferring your data to analytics solutions is costly and time-consuming, and you want a storage solution that can be queried directly
  • you are building a mobile or web application and need to store production data
  • you want to quickly and easily build a data lake solution on which to build AI products
  • you want to host a web application or website

Benefits

Benefits of Amazon S3 include:

  • It features industry-leading availability, durability and scalability
  • There is a range of cost-effective storage classes available to meet and scale with your needs
  • A number of tools are available to ensure security and compliance
  • It features flexible storage management and administration capabilities
  • The query-in-place services for analytics remove the need to transfer data for analysis
  • It integrates with a wide range of solutions for primary storage, backup and restore, archive, and disaster recovery

Technical considerations

Prerequisites and Integrations

To get started with Amazon S3, you need an AWS account. Read more about AWS here.

You can set up Amazon S3 in no time by following Amazon’s comprehensive Getting Started Guide. The guide walks you through the process of signing up for Amazon S3, creating your first bucket, adding objects to it, and manipulating the objects within the bucket.

There are several options for transferring your data into Amazon S3, including via the command line, using the API, or through the AWS Management Console. Your offline data can be transferred using AWS Direct Connect or AWS Storage Gateway, or it can be physically transported into Amazon S3 by an AWS Snowball device.

A variety of Software Development Kits (SDKs) are available to make it easy to integrate your application with S3, including SDKs for Java, Python, .NET, PHP, and Node.js.
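As an illustration of the SDK route, here is a minimal sketch using the Python SDK (boto3). It assumes boto3 is installed and AWS credentials are configured; the bucket name, file name, and the helper names `upload_report`, `list_keys`, and `main` are hypothetical. The helpers take the client as a parameter, so they can be exercised with a stand-in object before being pointed at a real account.

```python
def upload_report(s3_client, bucket, local_path, key):
    """Upload one local file to a bucket and return its S3 URI."""
    s3_client.upload_file(local_path, bucket, key)
    return f"s3://{bucket}/{key}"


def list_keys(s3_client, bucket, prefix=""):
    """Return the keys under a prefix.

    A single list_objects_v2 call returns at most 1,000 keys; larger
    listings need pagination, which this sketch leaves out.
    """
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix)
    return [item["Key"] for item in response.get("Contents", [])]


def main():
    # Imported here so the helpers above can be read and tested
    # without boto3 being installed.
    import boto3  # third-party package: pip install boto3

    client = boto3.client("s3")
    # "example-bucket" and "report.csv" are placeholders.
    print(upload_report(client, "example-bucket", "report.csv", "reports/report.csv"))
    print(list_keys(client, "example-bucket", prefix="reports/"))
```

Calling `main()` performs a real upload, so it should only be run against a bucket you own and with credentials that are allowed to write to it.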

Security and Compliance

Amazon S3 is “secure by default” - only resource owners have access to the resources they create. Access control mechanisms can be used to selectively grant access to additional users, while the Amazon S3 console makes it easy to review permissions and accessibility.

Data can be transferred into and out of Amazon S3 via SSL endpoints using the HTTPS protocol, and Server-Side Encryption can optionally be used to secure your data at rest.
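One way these two protections might be combined is sketched below in Python. The first function builds a bucket policy that denies any request not made over HTTPS, using the `aws:SecureTransport` condition key; the second uploads an object with S3-managed Server-Side Encryption requested explicitly. The bucket name and the helper names are assumptions for illustration, and the `s3_client` argument is assumed to be a boto3 S3 client created elsewhere.

```python
import json


def tls_only_policy(bucket):
    """Build a bucket policy that rejects requests made without HTTPS."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both the bucket itself and every object in it.
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


def apply_policy(s3_client, bucket):
    """Attach the policy; the API expects it as a JSON string."""
    s3_client.put_bucket_policy(
        Bucket=bucket, Policy=json.dumps(tls_only_policy(bucket))
    )


def put_encrypted(s3_client, bucket, key, body):
    """Upload bytes and ask S3 to encrypt them at rest with AES-256."""
    s3_client.put_object(
        Bucket=bucket, Key=key, Body=body, ServerSideEncryption="AES256"
    )


# Hypothetical bucket name, used only to show the resulting document.
example_policy = tls_only_policy("example-bucket")
```

Note that applying a bucket policy replaces any policy already attached to the bucket, so in practice the statement would usually be merged into the existing policy rather than applied on its own.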

An additional AWS service, Amazon Macie, can be employed to ensure data protection within Amazon S3. Amazon Macie is a security service that uses machine learning to automatically discover, classify, and protect sensitive data in AWS.

You can read more about AWS security here.

Amazon S3 has been assessed by third-party auditors to ensure its security and compliance against a range of international standards. AWS provide several resources and services to help you ensure that your configuration is compliant with industry standards.

Pricing

Amazon S3 is a pay-as-you-go service so you only pay for what you use, and there are no up-front set-up fees. Read more about AWS and its pricing structure here.

With Pay-As-You-Go (PAYG) pricing for Amazon S3, your monthly bill is calculated based on:

  • the size of the objects stored, their storage class, and the time for which they are stored
  • the number of requests made against the objects in storage
  • the number of retrievals
  • any storage management features that are enabled
  • the number of objects stored in the Intelligent-Tiering storage class, as these incur monthly monitoring and automation fees
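The way these components add up can be sketched as a small Python calculation. The rates below are made-up placeholders, not Amazon's prices, and the names `Rates`, `PLACEHOLDER_RATES`, and `estimate_monthly_cost` are illustrative. Fees for storage management features are left out for simplicity, and real pricing varies by region, storage class, and request type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Unit prices in an arbitrary currency (placeholder values only)."""
    storage_per_gb_month: float
    per_1000_requests: float
    retrieval_per_gb: float
    monitoring_per_1000_objects: float


def estimate_monthly_cost(rates, stored_gb, requests, retrieved_gb,
                          monitored_objects=0):
    """Sum the storage, request, retrieval, and monitoring components."""
    storage = stored_gb * rates.storage_per_gb_month
    request_cost = requests / 1000 * rates.per_1000_requests
    retrieval = retrieved_gb * rates.retrieval_per_gb
    monitoring = monitored_objects / 1000 * rates.monitoring_per_1000_objects
    return round(storage + request_cost + retrieval + monitoring, 2)


# Placeholder rates chosen for easy arithmetic; replace them with the
# published prices for your region and storage class.
PLACEHOLDER_RATES = Rates(
    storage_per_gb_month=0.02,
    per_1000_requests=0.005,
    retrieval_per_gb=0.01,
    monitoring_per_1000_objects=0.0025,
)

if __name__ == "__main__":
    total = estimate_monthly_cost(
        PLACEHOLDER_RATES,
        stored_gb=500,
        requests=200_000,
        retrieved_gb=50,
        monitored_objects=100_000,
    )
    print(f"Estimated monthly cost: {total}")
```

A sketch like this is useful mainly for seeing which component dominates the bill, for example whether request volume or stored volume matters more for your workload.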

You can calculate the monthly cost of using Amazon S3 here.

Alternatives to Amazon S3

Amazon S3 offers enormous scope for functionality, flexibility, and security, which necessarily brings considerable complexity. If you don’t need Amazon S3’s more advanced features, you may find it easier to store your files on a different platform, such as Dropbox or Google Drive, which are designed with team collaboration in mind.

Other cloud vendors offer similar products to Amazon S3, so if you already use Azure or Google Cloud services, you might find that choosing the same vendor for your object storage saves time and money. These products are: