# Using SAM2 with Label Studio for Video Annotation
This guide describes the simplest way to start using SegmentAnything 2 with Label Studio.
This repository is specifically for working with object tracking in videos. For working with images, see the `segment_anything_2_image` repository.
## Before you begin
Before you begin, you must install the [Label Studio ML backend](https://github.com/HumanSignal/label-studio-ml-backend). This tutorial uses the `segment_anything_2_video` example.
## Running from source
1. To run the ML backend without Docker, clone the repository and install all dependencies using pip:
   ```bash
   git clone https://github.com/HumanSignal/label-studio-ml-backend.git
   cd label-studio-ml-backend
   pip install -e .
   cd label_studio_ml/examples/segment_anything_2_video
   pip install -r requirements.txt
   ```
2. Download the `segment-anything-2` repo into the root directory. Install the SegmentAnything model and download the checkpoints by following the official Meta documentation. Make sure that you complete the steps for downloading the checkpoint files!
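   A minimal sketch of those steps, assuming Meta's repository still ships the `download_ckpts.sh` helper in its `checkpoints/` directory (verify against their README, as the layout may change):

   ```bash
   # Clone Meta's SAM2 repository into the example's root directory.
   git clone https://github.com/facebookresearch/segment-anything-2.git
   cd segment-anything-2
   pip install -e .

   # Download the model checkpoints using the helper script from Meta's repo.
   cd checkpoints
   ./download_ckpts.sh
   cd ../..
   ```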
3. Export the following environment variables (fill them in with your credentials!):

   - `LABEL_STUDIO_URL`: the `http://` or `https://` link to your Label Studio instance (include the prefix!)
   - `LABEL_STUDIO_API_KEY`: your Label Studio API key, available in your profile.
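   For example, a sketch with placeholder values (substitute your own instance URL and key):

   ```bash
   # Placeholder values -- replace with your own Label Studio URL and API key.
   export LABEL_STUDIO_URL=http://localhost:8080
   export LABEL_STUDIO_API_KEY=your-api-key-here
   ```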
4. Then you can start the ML backend on the default port `9090`:
   ```bash
   cd ../
   label-studio-ml start ./segment_anything_2_video
   ```
   Note that if you're running on a cloud server, you'll need to run on an exposed port. To change the port, add `-p <port number>` to the end of the start command above, as shown in the sketch below.
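   For example, to serve on port `9091` instead (the port number here is an arbitrary choice):

   ```bash
   label-studio-ml start ./segment_anything_2_video -p 9091
   ```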
5. Connect the running ML backend server to Label Studio: go to your project **Settings -> Machine Learning -> Add Model** and specify `http://localhost:9090` as the URL. Read more in the official Label Studio documentation.

   Again, if you're running in the cloud, you'll need to replace `localhost` with the external IP address of your container, along with the exposed port.
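   If you prefer to script this step, Label Studio also exposes a REST endpoint for registering ML backends; a sketch, assuming a project with ID `1` and the environment variables exported above:

   ```bash
   # Register the ML backend via Label Studio's /api/ml endpoint.
   # The project ID (1) is a placeholder -- substitute your own project's ID.
   curl -X POST "$LABEL_STUDIO_URL/api/ml" \
     -H "Authorization: Token $LABEL_STUDIO_API_KEY" \
     -H "Content-Type: application/json" \
     -d '{"project": 1, "url": "http://localhost:9090"}'
   ```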
## Labeling Config
For your project, you can use any labeling config with video properties. Here’s a basic one to get you started!
```xml
<View>
  <Labels name="videoLabels" toName="video" allowEmpty="true">
    <Label value="Player" background="#11A39E"/>
    <Label value="Ball" background="#D4380D"/>
  </Labels>

  <!-- Please specify FPS carefully, it will be used for all project videos -->
  <Video name="video" value="$video" framerate="25.0"/>
  <VideoRectangle name="box" toName="video" smart="true"/>
</View>
```
## Known limitations
- As of 8/11/2024, SAM2 only runs on GPU servers.
- Currently, we only support tracking one object per video, although SAM2 itself can track multiple.
- Currently, we do not support video segmentation (mask predictions); the backend returns bounding box tracks via the `VideoRectangle` tag.
- There is no Docker support.
If you want to contribute to this repository to help with some of these limitations, you can submit a PR.
## Customization
The ML backend can be customized by adding your own models and logic inside the `./segment_anything_2_video` directory.