Set up your Batch API

Step-by-step guide to setting up your Similarweb Batch API

Welcome to Similarweb's Batch API, giving you scalable access to the world's largest digital measure database!
Get Similarweb data for more than 1,000,000 domains, 5 years of history, and dozens of metrics in one API call!
This guide covers 2 quick steps for getting millions of data points from our API.

Get started checklist:

  • 🔑 Get an API key (tutorial), or if you already have Batch API access, generate an API key here
  • 📈 Choose the data and metrics you need based on your subscription at datahub.similarweb.com, or discover new datasets here
  • 📝 Create a report request with valid JSON
  • 🔗 Connect and integrate with your data lake (S3, Snowflake, Databricks)

Step-by-step guide:

  1. Make a POST request to the endpoint below, with the JSON either in the request body or attached as a file (multipart/form-data).
https://api.similarweb.com/batch/v4/request-report
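import requests

url = "https://api.similarweb.com/batch/v4/request-report"

headers = {
    "api-key": "{{your_api_key}}",  # replace with your API key
}

# Attach the JSON request file as multipart/form-data under the "request" field.
# The "with" block closes the file once the upload has finished.
with open("/Users/Batchexample.json", "rb") as request_file:
    files = [("request", ("Batchexample.json", request_file, "application/json"))]
    response = requests.post(url, headers=headers, files=files)

print(response.text)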

Example JSON:

{
  "delivery_information": {
    "response_format": "csv"
  },
  "report_query": {
    "tables": [
      {
        "vtable": "traffic_and_engagement",
        "granularity": "monthly",
        "filters": {
          "domains": [
            "similarweb.com",
            "api.similarweb.com"
          ],
          "countries": [
            "WW",
            "US"
          ],
          "include_subdomains": true
        },
        "metrics": [
          "all_traffic_visits",
          "desktop_new_visitors",
          "desktop_pages_per_visit",
          "desktop_returning_visitors"
        ],
        "start_date": "2023-02",
        "end_date": "2024-02"
      }
    ]
  }
}
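
The same request can also be built in Python and sent in the request body instead of being uploaded as a file. The following is a minimal sketch, not an official client: `build_report_request` and `submit_report` are hypothetical helper names, and passing the body through the `json=` argument of `requests.post` is an assumption, since this guide only shows the file upload.

```python
import json


def build_report_request(domains, countries, metrics, start_date, end_date,
                         granularity="monthly", response_format="csv",
                         vtable="traffic_and_engagement"):
    """Return a request body shaped like the example JSON in this guide."""
    return {
        "delivery_information": {"response_format": response_format},
        "report_query": {
            "tables": [
                {
                    "vtable": vtable,
                    "granularity": granularity,
                    "filters": {
                        "domains": list(domains),
                        "countries": list(countries),
                        "include_subdomains": True,
                    },
                    "metrics": list(metrics),
                    "start_date": start_date,
                    "end_date": end_date,
                }
            ]
        },
    }


def submit_report(body, api_key):
    """Send the body as JSON (assumption: the guide does not show this form)."""
    import requests  # imported here so building a body needs no third-party package

    url = "https://api.similarweb.com/batch/v4/request-report"
    return requests.post(url, headers={"api-key": api_key}, json=body)


body = build_report_request(
    domains=["similarweb.com", "api.similarweb.com"],
    countries=["WW", "US"],
    metrics=["all_traffic_visits", "desktop_new_visitors"],
    start_date="2023-02",
    end_date="2024-02",
)
print(json.dumps(body, indent=2))
```

Calling `submit_report(body, "your_api_key")` would then send the request. Building the dictionary in code makes it easier to generate long domain lists than editing a JSON file by hand.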

When requesting a report, your JSON must include the following parameters.

Mandatory Parameters:

| Parameter | Description | Acceptable values |
| --- | --- | --- |
| vtable | The dataset you want to choose metrics from. You can find the full list on datahub.similarweb.com or here. | traffic_and_engagement |
| domains | Characters in domain names can include letters, numbers, dashes, and hyphens. One request can include up to 1M domains. | amazon.com |
| countries | Standard 2-letter ISO country codes when calling all metrics (excluding desktop_top_geo). For worldwide, use "WW". This parameter is case-sensitive and must be entered in capital letters. When calling desktop_top_geo, you must remove any countries from your JSON file. | WW, US, GB. All country codes |
| metrics | List of metrics per dataset | all_traffic_visits |
| start_date, end_date | For daily granularity, format the start and end dates as YYYY-MM-DD. For monthly granularity, format them as YYYY-MM. | Daily: 2023-06-30; Monthly: 2023-06 |
| granularity | Time series granularity | monthly, weekly, daily |
| response_format | Output format of the API call | JSON, csv, parquet, orc |
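
Some of these rules can be checked locally before a request is sent. The sketch below is illustrative only: `check_table` is a hypothetical helper, it covers just the rules stated in this guide (required keys, upper-case country codes, date format per granularity), and it skips the date check for weekly granularity because the guide does not give a weekly format.

```python
import re

# Date formats stated in this guide; weekly is not documented here, so it is not checked.
DATE_PATTERNS = {
    "daily": re.compile(r"\d{4}-\d{2}-\d{2}"),  # YYYY-MM-DD
    "monthly": re.compile(r"\d{4}-\d{2}"),      # YYYY-MM
}


def check_table(table):
    """Return a list of problems found in one entry of report_query.tables."""
    problems = []
    for key in ("vtable", "granularity", "metrics", "start_date", "end_date"):
        if not table.get(key):
            problems.append(f"missing {key}")
    filters = table.get("filters", {})
    if not filters.get("domains"):
        problems.append("missing domains")
    for code in filters.get("countries", []):
        # Country codes are case-sensitive and must be in capital letters.
        if code != code.upper():
            problems.append(f"country code not upper case: {code}")
    pattern = DATE_PATTERNS.get(table.get("granularity"))
    if pattern:
        for key in ("start_date", "end_date"):
            value = table.get(key)
            if value and not pattern.fullmatch(value):
                problems.append(f"{key} does not match granularity: {value}")
    return problems


example_table = {
    "vtable": "traffic_and_engagement",
    "granularity": "monthly",
    "filters": {"domains": ["similarweb.com"], "countries": ["WW", "US"]},
    "metrics": ["all_traffic_visits"],
    "start_date": "2023-02",
    "end_date": "2024-02",
}
print(check_table(example_table))  # prints []
```

The check does not require `countries`, because the field must be removed when calling desktop_top_geo.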

Make sure to save the report ID you receive after your API request.

❗️

The request limit per user is 100 pending requests. If you receive a '429' error, you have exceeded the limit of allowed pending requests. Reduce the frequency of your requests to stay within the limits of your account.
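
One way to stay within the limit is to wait and retry when a 429 comes back. The sketch below is an assumption-laden illustration: `submit_with_backoff` is a hypothetical helper, and the retry count and delays are arbitrary choices, not values recommended by Similarweb.

```python
import time


def submit_with_backoff(send, max_attempts=5, base_delay=30.0, sleep=time.sleep):
    """Call send() until it returns a non-429 response or attempts run out.

    send: zero-argument callable returning an object with a status_code
    attribute, for example a lambda that wraps requests.post(...).
    """
    response = None
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 429:
            return response
        if attempt < max_attempts - 1:
            # Double the wait each time: 30s, 60s, 120s, ...
            sleep(base_delay * 2 ** attempt)
    return response  # still 429 after the final attempt
```

Passing the request as a callable keeps the retry logic separate from how the request is built, so the same helper works for both the file upload and a JSON body.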

Optional Parameters:

| Parameter | Description | Acceptable values |
| --- | --- | --- |
| delivery_method | The default value is "download_link". When the delivery method is set to "snowflake", the "response_format" field is not required. | download_link, bucket_access, snowflake |
| delivery_method_params | Use this when requesting reports to be delivered to aggregated Snowflake tables. Input "table_name": "your_table_name". See the set-up guide for more details. | table_name, integration_name, retention_days, overwrite_partitions |
| all_history | Boolean. When set to true, automatically overrides the dates with the minimum start date and maximum end date. Default is false. | true/false |
| latest | Boolean. When set to true, overrides the end date with the latest available date. If the start date is not specified, it is overridden with the same date. | true/false |
| window_size | String. When set, overrides the start date with a time relative to the end date. | Format {number}{y/m/d}, for example '12d', '3m', or '2y' |
| limit | Integer. Limits the number of results per entity selected. | Above 0; the default for most metrics is 100 |
| include_subdomains | Boolean. Default is true. | true/false |
| webhook_url | The delivery URL you'd like us to ping when the status of your report changes. | URL |
| sort | Allows you to sort by a specific metric. | A specific metric, for example "sort": "all_traffic_visits" |
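
The window_size format is easy to get wrong, so it can help to check the string before sending it. A minimal sketch, assuming only the {number}{y/m/d} format described above; `parse_window_size` is a hypothetical helper name.

```python
import re

# One or more digits followed by a single unit letter: y, m, or d.
WINDOW_SIZE = re.compile(r"(\d+)([ymd])")


def parse_window_size(value):
    """Split a window_size string such as '3m' into (3, 'm'); raise on bad input."""
    match = WINDOW_SIZE.fullmatch(value)
    if match is None:
        raise ValueError(f"window_size must look like '12d', '3m' or '2y': {value!r}")
    return int(match.group(1)), match.group(2)


print(parse_window_size("12d"))  # prints (12, 'd')
```
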
  2. After you have made your request and received your report ID, use the Request Report Status endpoint to check the status of the report.

Once the request is submitted, you will need to request the report status to find out when the report is complete.

GET Request Report Status

https://api.similarweb.com/v3/batch/request-status/{{generated_report_id}}
import requests

url = "https://api.similarweb.com/v3/batch/request-status/{{generated_report_id}}"

headers = {
    "api-key": "{{your_api_key}}",  # replace with your API key
}

# A GET request needs no body; the report ID is part of the URL.
response = requests.get(url, headers=headers)

print(response.text)

Example responses (a completed report, then a pending one):

{
    "data_points_count": 1779429,
    "download_url": "example_url.com",
    "status": "completed",
    "used_quota": 35589
}
{
    "status": "pending"
}
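
Because a report can stay pending for a while, the status call is usually repeated until the status changes. The sketch below assumes that any status other than "pending" means polling can stop (only "pending" and "completed" appear in this guide); `wait_for_report` and `fetch_status_with_requests` are hypothetical helpers, and the polling interval is an arbitrary choice.

```python
import time


def wait_for_report(fetch_status, poll_interval=60.0, max_polls=60, sleep=time.sleep):
    """Poll until the report leaves the 'pending' state, then return the payload.

    fetch_status: zero-argument callable returning the parsed JSON of a
    request-status call (a dict with at least a "status" key).
    """
    for poll in range(max_polls):
        payload = fetch_status()
        if payload.get("status") != "pending":
            return payload
        if poll < max_polls - 1:
            sleep(poll_interval)
    raise TimeoutError("report still pending after max_polls checks")


def fetch_status_with_requests(report_id, api_key):
    """Make one request-status call using the URL from this guide."""
    import requests  # imported here so the polling loop has no third-party dependency

    url = f"https://api.similarweb.com/v3/batch/request-status/{report_id}"
    return requests.get(url, headers={"api-key": api_key}).json()
```

A call such as `wait_for_report(lambda: fetch_status_with_requests(report_id, api_key))` would return the final payload, from which `download_url` can be read when the status is "completed".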

Data credits cost per request:

👍

The download link remains valid for 30 days. We recommend saving the links for a while in case you need our assistance troubleshooting an issue.

👍

Data credits are calculated for each report based on the number of results you actually receive:

Formula: Number of domains × Number of metrics × History × Cadence (daily/monthly) × Number of countries × Number of results
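
As a rough illustration of how the factors multiply, the sketch below applies the formula to the example request in this guide (2 domains, 4 metrics, 2 countries, monthly data from 2023-02 to 2024-02). It rests on assumptions: "history × cadence" is read as the number of time periods in the date range, the range is treated as inclusive, and one result per entity is used. The helper names are hypothetical, and the figure is an estimate, not the amount that will be charged.

```python
def months_between(start, end):
    """Number of monthly periods from start to end inclusive, both 'YYYY-MM'."""
    start_year, start_month = map(int, start.split("-"))
    end_year, end_month = map(int, end.split("-"))
    return (end_year - start_year) * 12 + (end_month - start_month) + 1


def estimate_results(domains, metrics, periods, countries, results_per_entity=1):
    """Multiply the factors named in the formula."""
    return domains * metrics * periods * countries * results_per_entity


periods = months_between("2023-02", "2024-02")
print(periods)  # prints 13
print(estimate_results(domains=2, metrics=4, periods=periods, countries=2))  # prints 208
```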

👍

To estimate the credits a report will cost, use the "request-validate" endpoint:

https://api.similarweb.com/v3/batch/request-validate
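
This guide gives only the URL for this endpoint, so the following is a sketch under stated assumptions: that the endpoint accepts a POST with the same JSON body and `api-key` header as the report request. `build_validate_request` is a hypothetical helper. It uses Python's standard `urllib.request` so the request can be built and inspected without being sent.

```python
import json
import urllib.request


def build_validate_request(body, api_key):
    """Build (but do not send) a request to the request-validate endpoint.

    Assumption: the endpoint takes a POST with the same JSON body as
    request-report; the guide itself lists only the URL.
    """
    return urllib.request.Request(
        "https://api.similarweb.com/v3/batch/request-validate",
        data=json.dumps(body).encode("utf-8"),
        headers={"api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_validate_request({"report_query": {"tables": []}}, "your_api_key")
print(request.get_method(), request.full_url)
```

Passing the built request to `urllib.request.urlopen` would send it. Check the response against your account before relying on it, since the request shape here is assumed.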

