Amazon S3 Integration

Access your Similarweb Batch API reports via Amazon S3.

Access your Batch API reports directly from an Amazon S3 bucket maintained by Similarweb, so you can easily process your custom reports and automatically integrate them into your own database systems.

🚧

Compatible with Similarweb Batch API only

To get access, speak to a Similarweb representative.

Setup instructions

To access your reports securely via the S3 bucket, use the ‘Get-S3-Credentials’ endpoint to retrieve a personalized AWS access key.

Endpoint:

[GET] https://api.similarweb.com/v3/batch/s3-credentials

Response - 200 OK
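As a rough illustration, the call to this endpoint could look like the sketch below. It uses only the Python standard library. The `api-key` header name is an assumption, not something this page confirms, so check the Batch API authentication documentation before relying on it. The function names are hypothetical.

```python
import json
import urllib.request

CREDENTIALS_URL = "https://api.similarweb.com/v3/batch/s3-credentials"


def build_credentials_request(api_key: str) -> urllib.request.Request:
    """Build the GET request for the Get-S3-Credentials endpoint."""
    # Assumption: the API key travels in an "api-key" header. Confirm the
    # authentication scheme in the Batch API documentation.
    return urllib.request.Request(
        CREDENTIALS_URL,
        headers={"api-key": api_key, "Accept": "application/json"},
        method="GET",
    )


def fetch_credentials(api_key: str) -> dict:
    """Send the request and return the parsed JSON body as a dict."""
    request = build_credentials_request(api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `fetch_credentials(...)` once and storing the result in a secrets manager is a sensible pattern here, since the keys cannot be generated a second time.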

Example flow:

  1. Use the Get-S3-Credentials endpoint to generate AWS credentials authorized with READ access to all files under “s3_bucket” and “s3_prefix”.

Note: Save the AWS keys as soon as they are returned; you will not be able to generate them again.

  2. Request a report via the Request Report endpoint with "delivery_method": "bucket_access". Take note of the generated “report_id”.

  3. Once the report is ready, use the Request Report Status endpoint. Instead of the “download_url” link, use the “S3_path” link.

  4. Connect with the AWS credentials using any AWS client (e.g. awscli or boto3).

Your report will be available under the {s3_bucket}/{s3_prefix}/{report_id} folder.
For example, s3://web-bulk-api-reports-production-us-east-1/123414/070f4e79-5248-487e-a43e-62e51309610c

  5. List the files under the report path and use them as you wish.
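The path handling in the flow above can be sketched with a few small helpers. All function and parameter names are hypothetical. The field names under which the credentials endpoint returns the access key and secret are not shown on this page, so `make_s3_client` simply takes them as arguments. It also assumes long-lived keys with no session token.

```python
from typing import Tuple


def with_bucket_delivery(report_request: dict) -> dict:
    """Return a copy of a Request Report body with S3 delivery selected."""
    return {**report_request, "delivery_method": "bucket_access"}


def report_s3_uri(s3_bucket: str, s3_prefix: str, report_id: str) -> str:
    """Join the parts into s3://{s3_bucket}/{s3_prefix}/{report_id}."""
    parts = [s3_bucket.strip("/"), s3_prefix.strip("/"), report_id.strip("/")]
    return "s3://" + "/".join(parts)


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Split an s3:// URI into (bucket, key_prefix), the form boto3 expects."""
    if not uri.startswith("s3://"):
        raise ValueError(f"not an S3 URI: {uri!r}")
    bucket, _, key_prefix = uri[len("s3://"):].partition("/")
    return bucket, key_prefix


def make_s3_client(access_key_id: str, secret_access_key: str):
    """Create a boto3 S3 client from the keys obtained in step 1."""
    import boto3  # third-party dependency: pip install boto3

    return boto3.client(
        "s3",
        aws_access_key_id=access_key_id,
        aws_secret_access_key=secret_access_key,
    )
```

`split_s3_uri` is useful because boto3 calls take the bucket name and the key prefix as separate arguments rather than a single `s3://` URI.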

🚧

S3 credentials expire after 1 year

Once the credentials expire, revoke the connection using the Revoke S3 endpoint, then repeat the setup instructions above to generate new credentials.
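To avoid being surprised by the expiry, you could record when the credentials were issued and check the remaining lifetime before each run. The sketch below assumes "1 year" means 365 days from the moment the keys were generated. The exact expiry rule is not specified here, and the helper name is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: "1 year" is treated as 365 days from the issue time.
CREDENTIAL_LIFETIME = timedelta(days=365)


def days_until_expiry(issued_at: datetime, now: Optional[datetime] = None) -> int:
    """Return whole days left before the credentials expire (negative if past)."""
    current = now or datetime.now(timezone.utc)
    return (issued_at + CREDENTIAL_LIFETIME - current).days
```

A scheduled job could alert when the returned value drops below a threshold of your choosing, leaving time to revoke and renew.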

FAQs

What if I have multiple report files?

The Batch API splits the report into multiple files based on the primary key of the requested data, as detailed in the Batch API documentation.

For example, the primary key for the ‘desktop_visits’ metric is (‘site’, ‘country’), and for the ‘desktop_top_geo’ metric it is (‘site’).

Requesting both metrics in the same request results in two report folders:

  • s3://web-bulk-api-reports-production-us-east-1/123414/070f4e79-5248-487e-a43e-62e51309610c/keys=site-country

  • s3://web-bulk-api-reports-production-us-east-1/123414/070f4e79-5248-487e-a43e-62e51309610c/keys=site
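When a report spans several folders, it can help to group the listed object keys by their `keys=` folder before loading them. The helper below is a minimal sketch. The function name is hypothetical, and the file names in the usage example are invented for illustration, since actual file names are not shown here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_primary_key(object_keys: Iterable[str]) -> Dict[str, List[str]]:
    """Group S3 object keys by the value of their 'keys=' path segment."""
    groups = defaultdict(list)
    for key in object_keys:
        # Find the first path segment of the form "keys=<primary-key>".
        folder = next(
            (seg for seg in key.split("/") if seg.startswith("keys=")), None
        )
        if folder is not None:
            groups[folder[len("keys="):]].append(key)
    return dict(groups)
```

For instance, passing `["123414/<report_id>/keys=site/part-0.csv"]` (a made-up file name) would yield a dict with the single group `"site"`.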

Which files should be ignored?

Ignore any files whose names start with an underscore (‘_’).

Python Example:
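The following is a minimal sketch rather than an official sample. It lists the objects under a report path with boto3's `list_objects_v2` paginator, skips files whose names start with an underscore, and downloads the rest. The function names are hypothetical. `s3_client` is expected to be a boto3 S3 client created with your credentials, for example via `boto3.client("s3", aws_access_key_id=..., aws_secret_access_key=...)`.

```python
import os
from typing import List


def is_data_file(key: str) -> bool:
    """True for real report files; False for '_'-prefixed or folder keys."""
    name = key.rsplit("/", 1)[-1]
    return bool(name) and not name.startswith("_")


def list_report_files(s3_client, bucket: str, report_prefix: str) -> List[str]:
    """Return the keys of all data files under {report_prefix}/."""
    paginator = s3_client.get_paginator("list_objects_v2")
    prefix = report_prefix.rstrip("/") + "/"
    keys = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages with no matching objects have no "Contents" entry.
        for obj in page.get("Contents", []):
            if is_data_file(obj["Key"]):
                keys.append(obj["Key"])
    return keys


def download_report(s3_client, bucket: str, report_prefix: str,
                    dest_dir: str = ".") -> List[str]:
    """Download every data file, mirroring the folder layout under dest_dir."""
    base = report_prefix.rstrip("/") + "/"
    local_paths = []
    for key in list_report_files(s3_client, bucket, report_prefix):
        local_path = os.path.join(dest_dir, *key[len(base):].split("/"))
        os.makedirs(os.path.dirname(local_path), exist_ok=True)
        s3_client.download_file(bucket, key, local_path)
        local_paths.append(local_path)
    return local_paths
```

Here `bucket` corresponds to “s3_bucket” and `report_prefix` to `{s3_prefix}/{report_id}`. Mirroring the folder layout keeps files from different `keys=` folders apart on disk.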

📘

Need help?

If you are experiencing any issues with your AWS credentials or would like to speak to one of our dedicated technical API specialists, please reach out to your Account Manager.