CIRRUS S3 Storage¶
To support cloud-native application development, an S3-compatible storage system is available on site and can be used from CIRRUS. It is an object storage system and does not provide a POSIX-compliant file system.
The system provides highly available (HA) object storage that is replicated between ML and NWSC. If one site goes down, objects remain available. S3 systems are most often accessed programmatically, and at this time we do not offer a GUI for browsing buckets and objects.
Access¶
Open a Jira ticket or email cirrus-admin@ucar.edu to request credentials for the system.
Endpoint¶
The S3 endpoint is https://s3.k8s.ucar.edu:5443.
Access Methods¶
S3 was created by Amazon Web Services (AWS) and exposes objects over HTTP calls. It can therefore be accessed with curl, but doing so is tedious.
Bash¶
To access S3 from a shell such as bash, sh, or csh, it is best to use a widely available command-line utility such as the aws CLI provided by Amazon or an open source project such as s5cmd.
AWS CLI
The official documentation is the best way to familiarize yourself with this tool. You will use the s3 subcommand to work with objects; see the AWS CLI S3 Command Reference. Pass the --endpoint-url option with the endpoint value from above.
The aws command must first be configured with credentials; see the documentation for the aws configure subcommand.
Profiles and credentials can be set up manually via the ~/.aws/config and ~/.aws/credentials files.
# Example of ~/.aws/config file using cirrus as the default
[default]
endpoint_url = https://s3.k8s.ucar.edu:5443
# Example of ~/.aws/credentials file
# this is where the credentials from the cirrus team are used
[default]
aws_access_key_id=ASDF1234
aws_secret_access_key=JKL987
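The two files above can also be generated from a script. The following is a minimal sketch using only the Python standard library; the function name write_aws_profile and its parameters are hypothetical, and the key values are the placeholders from the example above. Note that it overwrites any existing config and credentials files in the target directory, so try it against a scratch directory first.

```python
import configparser
import os
from pathlib import Path

# Endpoint taken from this page.
ENDPOINT_URL = "https://s3.k8s.ucar.edu:5443"


def write_aws_profile(aws_dir, access_key, secret_key,
                      endpoint_url=ENDPOINT_URL, profile="default"):
    """Write config and credentials files for one profile into aws_dir.

    Hypothetical helper: existing files in aws_dir are overwritten.
    """
    aws_dir = Path(aws_dir)
    aws_dir.mkdir(parents=True, exist_ok=True)

    # In the config file, non-default profiles use the section "profile NAME";
    # the credentials file uses the bare profile name.
    config_section = "default" if profile == "default" else f"profile {profile}"

    # interpolation=None keeps any "%" characters in values literal.
    config = configparser.ConfigParser(interpolation=None)
    config[config_section] = {"endpoint_url": endpoint_url}
    config_path = aws_dir / "config"
    with open(config_path, "w") as handle:
        config.write(handle)

    credentials = configparser.ConfigParser(interpolation=None)
    credentials[profile] = {
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }
    credentials_path = aws_dir / "credentials"
    with open(credentials_path, "w") as handle:
        credentials.write(handle)

    # Restrict the credentials file to its owner.
    os.chmod(credentials_path, 0o600)
    return config_path, credentials_path
```

Calling write_aws_profile(Path.home() / ".aws", "ASDF1234", "JKL987") would produce files equivalent to the examples above, with the placeholder keys replaced by the ones issued by the cirrus team.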
When using either aws or s5cmd, pass the additional flag --endpoint-url=https://s3.k8s.ucar.edu:5443.
s5cmd
s5cmd is a community project for interacting with S3 resources. It uses the same configuration files shown above. It is generally much faster than the aws CLI or the s3cmd project, and it is our recommended method for accessing S3 objects.
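When either tool is driven from a script, the --endpoint-url flag has to be included on every call. The sketch below shows one way to build those command lines from Python; the helper names (s5cmd_command, aws_s3_command, run) and the bucket name my-bucket used in the note afterwards are hypothetical, and it assumes that the s5cmd and aws executables are on your PATH.

```python
import subprocess

# Endpoint taken from this page.
ENDPOINT_URL = "https://s3.k8s.ucar.edu:5443"


def s5cmd_command(*args, endpoint_url=ENDPOINT_URL):
    """Return an s5cmd argument list with the endpoint flag already set."""
    # --endpoint-url is a global s5cmd flag, so it goes before the subcommand.
    return ["s5cmd", "--endpoint-url", endpoint_url, *args]


def aws_s3_command(*args, endpoint_url=ENDPOINT_URL):
    """Return an 'aws s3' argument list with the endpoint flag already set."""
    return ["aws", "--endpoint-url", endpoint_url, "s3", *args]


def run(command):
    """Run a command list and return its standard output; raise on failure."""
    completed = subprocess.run(
        command, check=True, capture_output=True, text=True
    )
    return completed.stdout
```

For example, run(s5cmd_command("ls", "s3://my-bucket/")) would list a bucket, and run(aws_s3_command("cp", "data.nc", "s3://my-bucket/data.nc")) would upload a local file, assuming the bucket exists and your credentials allow it.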
Python¶
Many libraries can access S3 resources, but boto3, the official AWS SDK for Python, is often the best choice.
boto3
boto3 - Official docs
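As a starting point, here is a minimal sketch of pointing boto3 at the endpoint used on this page. The helper names make_s3_client and list_keys are hypothetical, and the sketch assumes boto3 is installed and that credentials are available through the ~/.aws files described earlier.

```python
# Endpoint taken from this page.
ENDPOINT_URL = "https://s3.k8s.ucar.edu:5443"


def make_s3_client(profile_name=None, endpoint_url=ENDPOINT_URL):
    """Create a boto3 S3 client that talks to the given endpoint.

    profile_name=None uses the default profile from the ~/.aws files.
    """
    import boto3  # imported lazily so this module loads without boto3

    session = boto3.Session(profile_name=profile_name)
    return session.client("s3", endpoint_url=endpoint_url)


def list_keys(client, bucket, prefix=""):
    """Yield every object key in bucket that starts with prefix."""
    # A paginator follows continuation tokens, so listings that span
    # more than one response page are handled transparently.
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for item in page.get("Contents", []):
            yield item["Key"]
```

With a client from make_s3_client(), calls such as client.list_buckets(), client.upload_file("local.nc", "my-bucket", "data/local.nc"), and client.download_file("my-bucket", "data/local.nc", "copy.nc") cover the most common operations; the bucket and file names here are placeholders.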
Examples
Installation in python environment