Set up Microsoft Azure Blob storage

Connect your Microsoft Azure Blob storage container with Label Studio. For details about how Label Studio secures access to cloud storage, see Secure access to cloud storage.

Set up CORS for Azure blob storage

If you are planning to use proxy storage, you can skip this step.

If you are planning to use pre-signed URLs, you must configure CORS.

For more information, see Pre-signed URLs vs. Storage proxies.

  1. In the Azure portal, navigate to the page for the storage account.

  2. From the menu on the left, scroll down to Settings > Resource sharing (CORS).

  3. Under Blob service, add the following rule:

    • Allowed origins: https://app.humansignal.com (or the domain you are using)
    • Allowed methods: GET, HEAD, OPTIONS
    • Allowed headers: *
    • Exposed headers: *
    • Max age: 3600
  4. Click Save.
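
If you prefer to script this step, the rule above can also be applied from Python. The following is a minimal sketch, assuming the `azure-storage-blob` package is installed and that you authenticate with the storage account access key. The helper names `build_cors_rule` and `apply_cors_rule` are hypothetical and are not part of Label Studio. Be aware that this call sets the full list of CORS rules for the blob service, so include any existing rules you want to keep.

```python
def build_cors_rule(origin="https://app.humansignal.com"):
    """Return the CORS settings from the steps above as keyword arguments."""
    return {
        "allowed_origins": [origin],
        "allowed_methods": ["GET", "HEAD", "OPTIONS"],
        "allowed_headers": ["*"],
        "exposed_headers": ["*"],
        "max_age_in_seconds": 3600,
    }


def apply_cors_rule(account_name, account_key, rule):
    """Apply the rule to the blob service of the given storage account."""
    # Imported lazily so build_cors_rule works without the SDK installed.
    from azure.storage.blob import BlobServiceClient, CorsRule

    client = BlobServiceClient(
        account_url=f"https://{account_name}.blob.core.windows.net",
        credential=account_key,
    )
    # Passing only `cors` leaves the other service properties untouched.
    client.set_service_properties(cors=[CorsRule(**rule)])
```

To use your own domain, pass it to `build_cors_rule`, for example `build_cors_rule("https://labelstudio.example.com")`, where the domain is a placeholder.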

Azure blob storage

Before you begin, review the information in Cloud storage for projects and Secure access to cloud storage.

You will also need the following information, all of which can be found on the resource page for your storage account in the Azure portal:

  • The name of the container you are using
  • The name of your storage account
  • The access key associated with your storage account

Screenshot

Tip

If you are working in an on-prem deployment, you can set the AZURE_BLOB_ACCOUNT_NAME and AZURE_BLOB_ACCOUNT_KEY environment variables instead of entering the values manually in the UI.
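
Before starting Label Studio on-prem, it can help to confirm that both variables are actually visible to the process. The following is a minimal sketch using only the Python standard library; `missing_azure_settings` is a hypothetical helper name, not a Label Studio function.

```python
import os

# The two variable names come from the tip above.
REQUIRED_VARS = ("AZURE_BLOB_ACCOUNT_NAME", "AZURE_BLOB_ACCOUNT_KEY")


def missing_azure_settings(environ=None):
    """Return the names of required variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_azure_settings()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("Both Azure Blob variables are set.")
```

Run the check in the same shell or container environment that launches Label Studio, because variables set elsewhere are not inherited.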

Create a source storage connection

From Label Studio, open your project and select Settings > Cloud Storage > Add Source Storage.

Select Azure Blob Storage and click Next.

Configure Connection

Complete the following fields and then click Test connection:

  • Storage Title: Enter a name to identify the storage connection.
  • Container Name: Enter the name of your Azure storage container. This can be found in the Azure portal on your storage account resource page under Data storage > Containers. (See the screenshot above.)
  • Account Name: Enter the name of your Azure storage account. (See the screenshot above.)
  • Account Key: Enter the access key for your Azure storage account. This can be found in the Azure portal on your storage account resource page under Security + networking > Access keys. (See the screenshot above.)
  • Use pre-signed URLs (On) / Proxy through the platform (Off): This determines how data from your container is loaded:
    • Use pre-signed URLs: Label Studio generates time-limited HTTPS links directly to the objects in your container and redirects the browser there (HTTP 303), so annotators' browsers download media straight from cloud storage. This is usually faster and scales better, but requires correct CORS and presign permissions on the container. It also means traffic flows from the browser to storage, not through Label Studio.
    • Proxy through the platform: The backend downloads the file from cloud storage and streams it to the browser, so all media traffic passes through the Label Studio server. This keeps data inside the Label Studio network boundary, enforces task-level access checks on every request, and avoids CORS and presign setup, but uses more Label Studio worker resources and can be slightly slower.
    For more information, see Pre-signed URLs vs. Storage proxies.
  • Expire pre-signed URLs (minutes): Control how long pre-signed URLs remain valid.
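
To illustrate what a time-limited link looks like on Azure, where the equivalent of a pre-signed URL is a shared access signature (SAS), here is a minimal sketch. It assumes the `azure-storage-blob` package and access-key authentication, and it is not a description of how Label Studio generates its links internally. The helper names `sas_expiry` and `read_only_blob_url` are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import quote


def sas_expiry(ttl_minutes, now=None):
    """Return the UTC time at which a link created now stops being valid."""
    now = now or datetime.now(timezone.utc)
    return now + timedelta(minutes=ttl_minutes)


def read_only_blob_url(account_name, account_key, container, blob_name, ttl_minutes):
    """Build a read-only URL for one blob that expires after ttl_minutes."""
    # Imported lazily so sas_expiry works without the SDK installed.
    from azure.storage.blob import BlobSasPermissions, generate_blob_sas

    token = generate_blob_sas(
        account_name=account_name,
        container_name=container,
        blob_name=blob_name,
        account_key=account_key,
        permission=BlobSasPermissions(read=True),  # read access only
        expiry=sas_expiry(ttl_minutes),
    )
    base = f"https://{account_name}.blob.core.windows.net"
    return f"{base}/{container}/{quote(blob_name)}?{token}"
```

A shorter expiry limits how long a leaked link is usable, while a longer one reduces the chance that a link expires while an annotator still has the task open.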

Import Settings & Preview

Complete the following fields and then click Load preview to ensure you are syncing the correct data:

  • Bucket Prefix: Optionally, enter the directory name within your container that you would like to use. For example, data-set-1 or data-set-1/subfolder-2.
  • Import Method: Select whether you want to create a task for each file in your container or use a JSON/JSONL/Parquet file to define the data for each task.
  • File Name Filter: Specify a regular expression to filter container objects. Use .* to collect all objects.
  • Scan all sub-folders: Enable this option to perform a recursive scan across subfolders within your container.
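
The way the prefix, file name filter, and sub-folder options combine can be sketched as follows. This is an illustration only, using the Python standard library. It assumes the regular expression is matched from the start of the full blob name, which may differ from what Label Studio does, so use Load preview to confirm the actual result. `select_blobs` is a hypothetical helper name.

```python
import re


def select_blobs(blob_names, prefix="", pattern=".*", recursive=False):
    """Return the blob names that the import settings would keep."""
    regex = re.compile(pattern)
    base = prefix.rstrip("/") + "/" if prefix else ""
    selected = []
    for name in blob_names:
        if not name.startswith(base):
            continue  # outside the chosen prefix
        relative = name[len(base):]
        if not recursive and "/" in relative:
            continue  # inside a sub-folder, skipped unless scanning recursively
        if regex.match(name):
            selected.append(name)
    return selected


names = [
    "data-set-1/a.jpg",
    "data-set-1/b.json",
    "data-set-1/subfolder-2/c.jpg",
    "other/d.jpg",
]
print(select_blobs(names, prefix="data-set-1", pattern=r".*\.jpg$"))
# ['data-set-1/a.jpg']
print(select_blobs(names, prefix="data-set-1", pattern=r".*\.jpg$", recursive=True))
# ['data-set-1/a.jpg', 'data-set-1/subfolder-2/c.jpg']
```

Escaping the dot and anchoring with `$` keeps a pattern such as `.*\.jpg$` from also matching names like `photo.jpg.bak`.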

Review & Confirm

If everything looks correct, click Save & Sync to sync immediately, or click Save to save your settings and sync later.

Tip

You can also use the API to sync import storage.
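
As a rough illustration, a sync request could be built like this with the Python standard library. The endpoint path `/api/storages/azure/{id}/sync` and the `Token` authorization scheme are assumptions here and may differ for your Label Studio version, so check them against the API documentation. `build_sync_request` is a hypothetical helper name.

```python
import urllib.request


def build_sync_request(base_url, token, storage_id):
    """Build (but do not send) a POST request that triggers an import sync."""
    # Assumed endpoint path; verify it in the API documentation.
    url = f"{base_url.rstrip('/')}/api/storages/azure/{storage_id}/sync"
    return urllib.request.Request(
        url,
        data=b"",
        method="POST",
        headers={"Authorization": f"Token {token}"},
    )
```

To send the request, pass the returned object to `urllib.request.urlopen`. The storage ID is the identifier of the source storage connection you created.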

Create a target storage connection

From Label Studio, open your project and select Settings > Cloud Storage > Add Target Storage.

Select Azure Blob Storage and click Next.

Complete the following fields:

  • Storage Title: Enter a name to identify the storage connection.
  • Container Name: Enter the name of your Azure storage container. This can be found in the Azure portal on your storage account resource page under Data storage > Containers. (See the screenshot above.)
  • Container Prefix: Optionally, enter the directory name within your container that you would like to use. For example, data-set-1 or data-set-1/subfolder-2.
  • Account Name: Enter the name of your Azure storage account. (See the screenshot above.)
  • Account Key: Enter the access key for your Azure storage account. This can be found in the Azure portal on your storage account resource page under Security + networking > Access keys. (See the screenshot above.)
  • Can delete objects from storage: Enable this option if you want to delete annotations stored in the container when they are deleted in Label Studio.

After adding the storage, click Sync.

Tip

You can also use the API to sync export storage.

Azure blob storage with Service Principal

In Label Studio Enterprise, you can use Azure Service Principal authentication to securely connect Label Studio to Azure Blob Storage without using storage account access keys.

For more information, see Azure Blob Storage with Service Principal in our Enterprise documentation.

Add storage with the Label Studio API

You can also use the API to programmatically create connections. See our API documentation.
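
The following sketch shows how a request to create an Azure source storage connection might be assembled with the Python standard library. The endpoint path `/api/storages/azure`, the payload field names, and the `Token` authorization scheme are assumptions and should be verified against the API documentation for your Label Studio version. `build_create_storage_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_create_storage_request(base_url, token, project_id, title, container,
                                 account_name, account_key, prefix="",
                                 use_presigned_urls=True):
    """Build (but do not send) a POST request that creates a source storage."""
    # Field names below are assumptions; confirm them in the API documentation.
    payload = {
        "project": project_id,
        "title": title,
        "container": container,
        "prefix": prefix,
        "account_name": account_name,
        "account_key": account_key,
        "presign": use_presigned_urls,
    }
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/storages/azure",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
    )
```

Pass the returned object to `urllib.request.urlopen` to send it. Avoid hard-coding the account key in scripts; reading it from an environment variable or a secret store is safer.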