Choosing an asset sourcing and hosting option

Before you can upload tasks, you need to figure out where you want to source and host your assets. Typically, users upload their assets directly to Sama's S3 bucket, where they are hosted and pulled from whenever an annotation task needs them.

We support several approaches to sourcing and hosting assets, each with important trade-offs. Before deciding which solution is best for you, we recommend talking to our Solutions Engineering team so that we can better understand your existing and desired setups and implement an effective solution.

Here’s a summary of the current asset sourcing and hosting solutions that we can offer:

Option 1

Asset sourcing: You upload assets directly to Sama's S3 bucket
Asset hosting: Assets are hosted on Sama's S3 bucket
Pros:
  • Sama is better able to prevent service interruptions
  • Sama's delivery centers can load assets faster during annotation using their local asset servers
Cons:
  • Lack of control of your asset hosting

Option 2

Asset sourcing: You give Sama public or pre-signed URLs to the individual assets from your own cloud storage service
Asset hosting: Assets are automatically copied into Sama's S3 bucket and are hosted there, but only for the duration of the project
Pros:
  • You manage your own assets in your cloud storage service and don't have to upload them to Sama's S3 bucket
  • Sama's delivery centers can load assets faster during annotation using their local asset servers
Cons:
  • Public or pre-signed URLs may change or expire, disrupting your project, and connectivity may be an issue

Option 3

Asset sourcing: You give Sama public or pre-signed URLs to the individual assets from your own cloud storage service
Asset hosting: Assets are hosted in your cloud storage service (images only) (to set this up, see below)
Pros:
  • You manage your own assets in your cloud storage service and don't have to upload them to Sama's S3 bucket
Cons:
  • Public or pre-signed URLs may change or expire, disrupting your project, and connectivity may be an issue
  • Images larger than 2 MB are not supported because they may slow load times in the Sama delivery centers

Option 4

Asset sourcing: You set an S3 bucket policy that shares your bucket's contents with Sama's AWS account (to set this up, see below)
Asset hosting: Assets are automatically copied into Sama's S3 bucket and are hosted there, but only for the duration of the project
Pros:
  • You manage your own assets in your cloud storage service and don't have to upload them to Sama's S3 bucket
  • Sama's delivery centers can load assets faster during annotation using their local asset servers
Cons:
  • None listed

Option 5

Asset sourcing: You set an S3 bucket policy that shares your bucket's contents with Sama's AWS account (to set this up, see below)
Asset hosting: Assets are hosted in your cloud storage service (to set this up, see below)
Pros:
  • You manage your own assets in your cloud storage service and don't have to upload them to Sama's S3 bucket
Cons:
  • Images larger than 2 MB are not supported because they may slow load times in the Sama delivery centers
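
Because expiring pre-signed URLs are the main risk of the URL-based options, it can help to check when your URLs expire before handing them over. The following is a minimal sketch, assuming your URLs are AWS Signature Version 4 pre-signed URLs, which carry the signing time and lifetime in the X-Amz-Date and X-Amz-Expires query parameters. The function names and the example URL are hypothetical and are not part of the Sama platform.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Hypothetical example URL; the bucket, key, and signature are placeholders.
EXAMPLE_URL = (
    "https://example-bucket.s3.amazonaws.com/images/cat.jpg"
    "?X-Amz-Algorithm=AWS4-HMAC-SHA256"
    "&X-Amz-Date=20240101T000000Z"
    "&X-Amz-Expires=3600"
    "&X-Amz-SignedHeaders=host"
    "&X-Amz-Signature=abc123"
)


def presigned_url_expiry(url: str) -> Optional[datetime]:
    """Return the UTC expiry time of a SigV4 pre-signed URL.

    Returns None when the URL has no signing parameters, for example
    a plain public URL.
    """
    query = parse_qs(urlparse(url).query)
    signed_at = query.get("X-Amz-Date")
    lifetime = query.get("X-Amz-Expires")
    if not signed_at or not lifetime:
        return None
    # X-Amz-Date is an ISO 8601 basic-format timestamp in UTC.
    start = datetime.strptime(signed_at[0], "%Y%m%dT%H%M%SZ")
    start = start.replace(tzinfo=timezone.utc)
    # X-Amz-Expires is the lifetime of the URL in seconds.
    return start + timedelta(seconds=int(lifetime[0]))


def expires_before(url: str, deadline: datetime) -> bool:
    """Return True if the URL is pre-signed and expires before the deadline."""
    expiry = presigned_url_expiry(url)
    return expiry is not None and expiry < deadline


if __name__ == "__main__":
    project_end = datetime(2024, 3, 1, tzinfo=timezone.utc)
    print("Expires at:", presigned_url_expiry(EXAMPLE_URL))
    print("Expires before project end:", expires_before(EXAMPLE_URL, project_end))
```

A check like this only covers expiry. It cannot detect URLs that change or become unreachable for other reasons.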



Setting up the configuration for hosting your own assets

If you want to keep your assets hosted in your own cloud storage service (images only), make sure that its CORS configuration allows requests from the Sama platform:

  • The Origin will be https://app.sama.com
  • The Access-Control-Request-Method is GET

Here is a sample AWS S3 CORS bucket configuration that will enable the Sama platform to properly serve images:

[
    {
        "AllowedHeaders": [],
        "AllowedMethods": [
            "GET"
        ],
        "AllowedOrigins": [
            "https://app.sama.com"
        ],
        "ExposeHeaders": [],
        "MaxAgeSeconds": 3000
    }
]
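
If you prefer to apply this configuration programmatically rather than through the AWS console, one possible approach is sketched below. It assumes you use Python with the third-party boto3 library and have AWS credentials that are allowed to change the bucket's CORS settings. The helper names and the bucket name are hypothetical.

```python
SAMA_ORIGIN = "https://app.sama.com"


def build_cors_rules(origin: str = SAMA_ORIGIN) -> list:
    """Build CORS rules mirroring the sample configuration above."""
    return [
        {
            "AllowedHeaders": [],
            "AllowedMethods": ["GET"],
            "AllowedOrigins": [origin],
            "ExposeHeaders": [],
            "MaxAgeSeconds": 3000,
        }
    ]


def apply_cors(bucket_name: str) -> None:
    """Apply the CORS rules to a bucket (assumes boto3 and AWS credentials)."""
    import boto3  # third-party; imported here so the builder works without it

    # AllowedHeaders and ExposeHeaders are optional in the S3 API,
    # so empty lists are dropped before sending the request.
    rules = [
        {key: value for key, value in rule.items() if value != []}
        for rule in build_cors_rules()
    ]
    s3 = boto3.client("s3")
    s3.put_bucket_cors(
        Bucket=bucket_name,
        CORSConfiguration={"CORSRules": rules},
    )


if __name__ == "__main__":
    # "my-asset-bucket" is a placeholder; replace it with your bucket name.
    apply_cors("my-asset-bucket")
```

Note that put_bucket_cors replaces any CORS configuration already set on the bucket, so merge in your existing rules first if you have any.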


Sharing your S3 bucket's contents with Sama's AWS account

Sama can directly fetch assets from your AWS S3 bucket, without you needing to generate pre-signed or public URLs. You'll need to configure your S3 bucket policy as follows:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "sama-s3-getobjects",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::596059236576:user/hub-prod"
            },
            "Action": [
                "s3:GetObject",
                "s3:ListBucket",
                "s3:GetBucketLocation"
            ],
            "Resource": [
                "arn:aws:s3:::<BUCKET_NAME>",
                "arn:aws:s3:::<BUCKET_NAME>/*"
            ]
        }
    ]
}

Replace <BUCKET_NAME> with the name of the bucket Sama will need access to. This will give Sama read-only access to the entire contents of the bucket.

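If more granular access is needed, replace the arn:aws:s3:::<BUCKET_NAME>/* resource with a list of resources covering only the paths that Sama should be able to access, such as arn:aws:s3:::<BUCKET_NAME>/path/that/sama/needs/* and arn:aws:s3:::<BUCKET_NAME>/other/path/that/sama/needs/*. Keep the arn:aws:s3:::<BUCKET_NAME> resource in place, because the bucket-level actions (s3:ListBucket and s3:GetBucketLocation) apply to the bucket itself rather than to the objects in it.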
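
The policy above, including the more granular variant, could be generated with a small script instead of being edited by hand. The following is a minimal sketch, assuming Python and, for the optional apply step, the third-party boto3 library with credentials that are allowed to set bucket policies. The function names, bucket name, and paths are hypothetical.

```python
import json
from typing import Iterable, Optional

# Principal taken from the policy shown above.
SAMA_PRINCIPAL = "arn:aws:iam::596059236576:user/hub-prod"


def build_sama_policy(
    bucket_name: str, prefixes: Optional[Iterable[str]] = None
) -> dict:
    """Build a read-only bucket policy for Sama.

    With no prefixes, Sama gets access to the whole bucket. With prefixes,
    object access is limited to those paths.
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    if prefixes:
        object_arns = [f"{bucket_arn}/{p.strip('/')}/*" for p in prefixes]
    else:
        object_arns = [f"{bucket_arn}/*"]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "sama-s3-getobjects",
                "Effect": "Allow",
                "Principal": {"AWS": SAMA_PRINCIPAL},
                "Action": [
                    "s3:GetObject",
                    "s3:ListBucket",
                    "s3:GetBucketLocation",
                ],
                # The bucket ARN stays in the list for the bucket-level actions.
                "Resource": [bucket_arn, *object_arns],
            }
        ],
    }


def apply_policy(bucket_name: str, prefixes: Optional[Iterable[str]] = None) -> None:
    """Attach the policy to the bucket (assumes boto3 and AWS credentials)."""
    import boto3  # third-party; imported here so the builder works without it

    policy = build_sama_policy(bucket_name, prefixes)
    boto3.client("s3").put_bucket_policy(
        Bucket=bucket_name, Policy=json.dumps(policy)
    )


if __name__ == "__main__":
    # Placeholder bucket name and path; replace with your own values.
    print(json.dumps(build_sama_policy("my-asset-bucket", ["path/that/sama/needs"]), indent=4))
```

Keep in mind that put_bucket_policy overwrites the bucket's existing policy. If your bucket already has one, add the statement above to it rather than replacing the whole document.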