Grounding DINO backend integration
This integration will allow you to:
- Use text prompts for zero-shot detection of objects in images.
- Detect any object you can describe in a prompt and get state-of-the-art results without any model fine-tuning.
See here for more details about the pre-trained Grounding DINO model.
Before you begin
Before you begin, you must install the Label Studio ML backend.
This tutorial uses the grounding_dino example.
Quickstart
- Make sure Docker is installed.
- Edit docker-compose.yml to include the following:
  - LABEL_STUDIO_HOST sets the endpoint of the Label Studio host. Must begin with http://
  - LABEL_STUDIO_ACCESS_TOKEN sets the API access token for the Label Studio host. This can be found by logging into Label Studio and going to the Account & Settings page.

  Example:
  LABEL_STUDIO_HOST=http://123.456.7.8:8080
  LABEL_STUDIO_ACCESS_TOKEN=your-api-key
- Run docker compose up
- Check the IP of your backend using docker ps. You will use this URL when connecting the backend to a Label Studio project. Usually this is http://localhost:9090.
- Create a project and edit the labeling config (an example is provided below). When editing the labeling config, make sure to add all rectangle labels under the RectangleLabels tag, and all corresponding brush labels under the BrushLabels tag.
<View>
  <Style>
    .lsf-main-content.lsf-requesting .prompt::before { content: ' loading...'; color: #808080; }
  </Style>
  <View className="prompt">
    <Header value="Enter a prompt to detect objects in the image:"/>
    <TextArea name="prompt" toName="image" editable="true" rows="2" maxSubmissions="1" showSubmitButton="true"/>
  </View>
  <Image name="image" value="$image"/>
  <RectangleLabels name="label" toName="image">
    <Label value="cats" background="yellow"/>
    <Label value="house" background="blue"/>
  </RectangleLabels>
</View>
- From the Model page in the project settings, connect the model.
- Go to an image task in your project. Enable Auto-annotation (found at the bottom of the labeling interface). Then enter your prompt in the prompt box and press Add. After this, you should receive your predictions. See the video above for a demo.
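Before connecting the model in the project settings, it can help to sanity-check the two docker-compose.yml values and the backend URL reported by docker ps. The following is a minimal sketch, not part of the grounding_dino example: the helper names validate_settings and backend_is_up are hypothetical, accepting https:// in addition to http:// is an assumption, and the /health route is assumed to be served by the ML backend (adjust it if your version differs).

```python
import json
import urllib.request
from typing import List
from urllib.error import URLError


def validate_settings(host: str, token: str) -> List[str]:
    """Return a list of problems with the two settings (empty if none)."""
    problems = []
    # The tutorial requires http://; accepting https:// too is an assumption.
    if not host.startswith(("http://", "https://")):
        problems.append("LABEL_STUDIO_HOST must begin with http://")
    # "your-api-key" is the placeholder from the tutorial's example.
    if not token or token == "your-api-key":
        problems.append("LABEL_STUDIO_ACCESS_TOKEN is missing or still a placeholder")
    return problems


def backend_is_up(url: str = "http://localhost:9090", timeout: float = 5.0) -> bool:
    """Return True if the backend answers on /health (assumed route) with status UP."""
    try:
        with urllib.request.urlopen(f"{url.rstrip('/')}/health", timeout=timeout) as resp:
            return json.load(resp).get("status") == "UP"
    except (URLError, OSError, ValueError, AttributeError):
        # Connection refused, timeout, non-JSON body, or unexpected JSON shape.
        return False
```

For example, validate_settings("123.456.7.8:8080", "your-api-key") reports both problems, while a host that begins with http:// and a real token yields an empty list. backend_is_up() can be called once docker compose up has finished starting the container.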
Using GPU
For the best user experience, it is recommended to use a GPU. To do this, update the docker-compose.yml file to include the following lines:
environment:
  - NVIDIA_VISIBLE_DEVICES=all
deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          count: 1
          capabilities: [gpu]
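To confirm that the container actually sees the GPU after this change, a quick check can be run inside it. This is a minimal sketch, assuming PyTorch is installed in the backend image (the text above does not state this); pick_device is a hypothetical helper name, not part of the example.

```python
import os


def pick_device() -> str:
    """Return "cuda" if PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # assumption: PyTorch is available in the backend image
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    # Show the variable set in docker-compose.yml alongside the detected device.
    print("NVIDIA_VISIBLE_DEVICES =", os.environ.get("NVIDIA_VISIBLE_DEVICES"))
    print("device:", pick_device())
```

If this reports cpu even with the lines above in place, the host may be missing the NVIDIA driver or container runtime.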
Using GroundingSAM
If you are looking for GroundingDINO integration with SAM, check this example.