Commit a669eb4

Merge branch 'develop' into 'fb-BROS-5/iva-fixes'
Workflow run: https://github.com/HumanSignal/label-studio/actions/runs/15111983801
2 parents bac55de + aaf97d2 commit a669eb4

61 files changed: +1533 -146 lines changed

.github/workflows/apply-linters.yml

Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
+name: Apply linters
+
+on:
+  workflow_dispatch:
+    inputs:
+      branch_name:
+        description: 'Branch name to run the workflow on'
+        required: true
+        type: string
+
+env:
+  NODE: "18"
+  FRONTEND_MONOREPO_DIR: "web"
+
+jobs:
+  lint:
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+        with:
+          token: ${{ secrets.GIT_PAT }}
+          ref: ${{ inputs.branch_name }}
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.12'
+
+      - name: Install pre-commit
+        run: |
+          python -m pip install --upgrade pip
+          pip install pre-commit
+
+      - name: Setup frontend environment
+        uses: ./.github/actions/setup-frontend-environment
+        with:
+          node-version: "${{ env.NODE }}"
+          directory: "${{ env.FRONTEND_MONOREPO_DIR }}"
+
+      - name: Run formatters
+        run: make fmt-all || true
+
+      - name: Ensure no lint remains
+        run: make fmt-all
+
+      - name: Commit changes
+        run: |
+          git config --global user.name "robot-ci-heartex"
+          git config --global user.email "robot-ci-heartex@users.noreply.github.com"
+          git add .
+          git commit -m "Apply pre-commit linters" || echo "No changes to commit"
+          git push

.github/workflows/docker-build-ontop.yml

Lines changed: 1 addition & 1 deletion
@@ -110,7 +110,7 @@ jobs:
             ${{ steps.calculate-docker-tags.outputs.docker-tags }}
 
       - name: Push Docker image
-        uses: docker/build-push-action@v6.16.0
+        uses: docker/build-push-action@v6.17.0
         id: docker_build_and_push
         with:
           context: .

.github/workflows/docker-build.yml

Lines changed: 1 addition & 1 deletion
@@ -130,7 +130,7 @@ jobs:
             type=raw,value=${{ steps.version.outputs.build_version }}
 
       - name: Push Docker image
-        uses: docker/build-push-action@v6.16.0
+        uses: docker/build-push-action@v6.17.0
         id: docker_build_and_push
         with:
           context: .

.github/workflows/docker-release-promote.yml

Lines changed: 1 addition & 1 deletion
@@ -195,7 +195,7 @@ jobs:
             ${{ steps.generate-tags.outputs.ubuntu-tags }}
 
       - name: Build and Push Release Ubuntu Docker image
-        uses: docker/build-push-action@v6.16.0
+        uses: docker/build-push-action@v6.17.0
         id: docker_build
         with:
           context: ${{ steps.release_dockerfile.outputs.release_dir }}

.github/workflows/release-cut-off-release-branch.yml

Lines changed: 7 additions & 5 deletions
@@ -14,7 +14,7 @@ on:
         type: string
 
 env:
-  RELEASE_BRANCH_PREFIX: "ls-release"
+  RELEASE_BRANCH_PREFIX: "ls-release/"
 
 jobs:
   draft-new-release:
@@ -45,22 +45,24 @@ jobs:
       - name: Calculate branch name and version
         id: calculate_branch_name_and_version
         shell: bash
+        env:
+          VERSION: "${{ inputs.version }}"
         run: |
           set -xeuo pipefail
 
           regexp='^[v]?([0-9]+)\.([0-9]+)\.0$';
 
-          if [[ "${{ inputs.version }}" =~ $regexp ]]; then
+          if [[ "${VERSION}" =~ $regexp ]]; then
             first="${BASH_REMATCH[1]}"
             second="${BASH_REMATCH[2]}"
-            third="${BASH_REMATCH[3]}"
+            third="0"
           else
-            echo "${{ inputs.version }} does not mach the regexp ${regexp}"
+            echo "::error::${VERSION} does not mach the regexp ${regexp}"
             exit 1
           fi
 
           release_version="${first}.${second}.${third}"
-          release_branch="${{ env.RELEASE_BRANCH_PREFIX }}/${first}.${second}.${third}"
+          release_branch="${{ env.RELEASE_BRANCH_PREFIX }}${first}.${second}.${third}"
           next_develop_version="${first}.$(($second + 1)).0.dev0"
 
           echo "release_branch=${release_branch}" >> "${GITHUB_OUTPUT}"
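
Review note: the version parsing in this step can be checked outside CI. The following is a minimal Python sketch of the same logic, not the workflow's own code. The function name `plan_release` and its `prefix` parameter are hypothetical; the pattern, the fixed patch value, and the three computed values are taken from the hunk above.

```python
import re

# Same pattern as the workflow's `regexp`: optional "v", major, minor, literal ".0" patch.
VERSION_RE = re.compile(r"^[v]?([0-9]+)\.([0-9]+)\.0$")


def plan_release(version: str, prefix: str = "ls-release/") -> dict:
    """Hypothetical helper mirroring the values computed by the shell step."""
    match = VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"{version} does not match the regexp {VERSION_RE.pattern}")
    first, second = match.group(1), match.group(2)
    # The pattern has only two capture groups, so the patch component is fixed at "0".
    third = "0"
    return {
        "release_version": f"{first}.{second}.{third}",
        # The prefix now carries its own trailing slash, so no separator is added here.
        "release_branch": f"{prefix}{first}.{second}.{third}",
        "next_develop_version": f"{first}.{int(second) + 1}.0.dev0",
    }
```

Because the pattern defines only two capture groups, the old `BASH_REMATCH[3]` reference had nothing to read, which is presumably why the commit hardcodes `third="0"`.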

.github/workflows/tests.yml

Lines changed: 1 addition & 1 deletion
@@ -140,7 +140,7 @@ jobs:
 
       - name: Upload coverage to Codecov
         if: ${{ github.event.pull_request.head.repo.fork == false && github.event.pull_request.user.login != 'dependabot[bot]' }}
-        uses: codecov/codecov-action@v5.4.2
+        uses: codecov/codecov-action@v5.4.3
         with:
           name: codecov-python-${{ matrix.python-version }}
           flags: pytests

README.md

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@ Have a custom dataset? You can customize Label Studio to fit your needs. Read an
 
 ## Try out Label Studio
 
-Install Label Studio locally, or deploy it in a cloud instance. [Or, sign up for a free trial of our Enterprise edition.](https://humansignal.com/free-trial).
+Install Label Studio locally or deploy it in a cloud instance. [Or sign up for a free trial of our Starter Cloud edition!](https://humansignal.com/platform/starter-cloud/) You can learn more about what each edition offers [here](https://labelstud.io/guide/label_studio_compare).
 
 - [Install locally with Docker](#install-locally-with-docker)
 - [Run with Docker Compose (Label Studio + Nginx + PostgreSQL)](#run-with-docker-compose)

docs/source/guide/install_enterprise_docker.md

Lines changed: 3 additions & 0 deletions
@@ -19,6 +19,9 @@ See [Secure Label Studio](security.html) for more details about security and har
 
 To install Label Studio Community Edition, see [Install Label Studio](https://labelstud.io/guide/install). This page is specific to the Enterprise version of Label Studio.
 
+!!! note
+    On-prem deployments of Label Studio Enterprise are not supported for Academic licenses.
+
 {% insertmd includes/deploy.md %}
 
 ## Install Label Studio Enterprise using Docker

docs/source/guide/install_enterprise_k8s.md

Lines changed: 3 additions & 0 deletions
@@ -23,6 +23,9 @@ Your Kubernetes cluster can be self-hosted or installed somewhere such as Amazon
 
 </div>
 
+!!! note
+    On-prem deployments of Label Studio Enterprise are not supported for Academic licenses.
+
 This high-level architecture diagram that outlines the main components of a Label Studio Enterprise deployment.
 
 <img src="/images/LSE_k8s_scheme.png"/>

docs/source/tags/pdf.md

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
+---
+title: PDF
+type: tags
+order: 302
+meta_title: PDF Tag for loading PDF documents
+meta_description: Label Studio PDF Tag for loading PDF documents for machine learning and data science projects.
+---
+
+The `Pdf` tag displays a PDF document for labeling. Use for performing document-level annotations, transcription, and summarization.
+
+Use with the following data types: PDF.
+
+### Parameters
+
+| Param | Type | Default | Description |
+| --- | --- | --- | --- |
+| name | <code>string</code> | | Name of the element |
+| value | <code>string</code> | | Value of the element - field name to retrieve the PDF URL from |
+
+### Supported Control tags
+Document-level annotations are supported with Pdf tag, for example:
+
+- Document classification with [Choices](/tags/choices.html)
+- Document rating with [Rating](/tags/rating.html)
+- Transcription and summarization with [TextArea](/tags/textarea.html)
+
+### Example
+
+Labeling configuration to label PDF documents:
+
+```html
+<View>
+  <Pdf name="pdf" value="$pdf" />
+  <Choices name="choices" toName="pdf">
+    <Choice value="Legal" />
+    <Choice value="Financial" />
+    <Choice value="Technical" />
+  </Choices>
+</View>
+```
+
+**Example Input data:**
+
+```json
+{
+  "pdf": "https://app.humansignal.com/static/samples/sample.pdf"
+}
+```
+
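
Review note: the new page ties the `value="$pdf"` attribute to a `pdf` field in the input data. As a rough illustration of that link, here is a small Python sketch that builds import tasks for a list of PDF URLs. `build_pdf_tasks` is a hypothetical helper, and wrapping each record in a `"data"` key is an assumption modeled on the `config.yml` sample later in this commit.

```python
import json


def build_pdf_tasks(urls, field="pdf"):
    """Hypothetical helper: one task per URL, keyed by the field named in value="$pdf"."""
    return [{"data": {field: url}} for url in urls]


tasks = build_pdf_tasks(["https://app.humansignal.com/static/samples/sample.pdf"])
# Serialize for upload; the field name must match the variable used in the labeling config.
payload = json.dumps(tasks, indent=2)
```

If the labeling config used a different variable, for example `value="$document"`, the `field` argument would have to change with it.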

docs/source/templates/pdf_classification.md

Lines changed: 5 additions & 5 deletions
@@ -26,11 +26,11 @@ If you want to perform PDF classification, use this template. This template prom
     <Choice value="Important article"/>
     <Choice value="Yellow press"/>
   </Choices>
-  <HyperText name="pdf" value="$pdf" inline="true"/>
+  <Pdf name="pdf" value="$pdf"/>
 </View>
 
 <!-- {
-  "pdf": "<embed src='https://app.heartex.ai/static/samples/sample.pdf' width='100%' height='600px'/>"
+  "pdf": "/static/samples/sample.pdf"
 } -->
 ```
 
@@ -56,9 +56,9 @@ Use the [Choices](/tags/choices.html) control tag to present classification opti
 </Choices>
 ```
 
-Use the [HyperText](/tags/hypertext.html) tag to render an inline version of the PDF data:
+Use the [Pdf](/tags/pdf.html) tag to render an inline version of the PDF data:
 ```xml
-<HyperText name="pdf" value="$pdf" inline="true"/>
+<Pdf name="pdf" value="$pdf"/>
 ```
 
 ### Input data
@@ -74,4 +74,4 @@ Label Studio does not support labeling PDF-formatted files directly. You should
 ## Related tags
 - [Rating](/tags/rating.html)
 - [Choices](/tags/choices.html)
-- [HyperText](/tags/hypertext.html)
+- [Pdf](/tags/pdf.html)

label_studio/annotation_templates/structured-data-parsing/pdf-classification/config.xml

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@
     <Choice value="Important article"/>
     <Choice value="Yellow press"/>
   </Choices>
-  <HyperText name="pdf" value="$pdf" inline="true"/>
+  <Pdf name="pdf" value="$pdf"/>
 </View>
 

label_studio/annotation_templates/structured-data-parsing/pdf-classification/config.yml

Lines changed: 2 additions & 2 deletions
@@ -12,13 +12,13 @@ config: |
       <Choice value="Important article"/>
       <Choice value="Yellow press"/>
     </Choices>
-    <HyperText name="pdf" value="$pdf" inline="true"/>
+    <Pdf name="pdf" value="$pdf"/>
   </View>
 
 
   <!-- {
     "data": {
-      "pdf": "<embed src='/static/samples/sample.pdf' width='100%' height='600px'/>"
+      "pdf": "/static/samples/sample.pdf"
     }
   } -->

label_studio/core/settings/base.py

Lines changed: 2 additions & 0 deletions
@@ -494,6 +494,7 @@
         '.mp4',
         '.webm',
         '.webp',
+        '.pdf',
     ]
 )
 
@@ -607,6 +608,7 @@
 FEATURE_FLAGS_GET_USER_REPR = 'core.feature_flags.utils.get_user_repr'
 
 # Test factories
+ORGANIZATION_FACTORY = 'organizations.tests.factories.OrganizationFactory'
 PROJECT_FACTORY = 'projects.tests.factories.ProjectFactory'
 USER_FACTORY = 'users.tests.factories.UserFactory'

label_studio/data_manager/actions/cache_labels.py

Lines changed: 29 additions & 10 deletions
@@ -5,6 +5,7 @@
 
 from core.permissions import AllPermissions
 from core.redis import start_job_async_or_sync
+from label_studio_sdk.label_interface import LabelInterface
 from tasks.models import Annotation, Prediction, Task
 
 logger = logging.getLogger(__name__)
@@ -18,6 +19,8 @@ def cache_labels_job(project, queryset, **kwargs):
     source_class = Annotation if source == 'annotations' else Prediction
     control_tag = request_data.get('custom_control_tag') or request_data.get('control_tag')
     with_counters = request_data.get('with_counters', 'Yes').lower() == 'yes'
+    label_interface = LabelInterface(project.label_config)
+    label_interface_tags = {tag.name: tag for tag in label_interface.find_tags('control')}
 
     if source == 'annotations':
         column_name = 'cache'
@@ -38,7 +41,7 @@ def cache_labels_job(project, queryset, **kwargs):
         task_labels = []
         annotations = source_class.objects.filter(task=task).only('result')
         for annotation in annotations:
-            labels = extract_labels(annotation, control_tag)
+            labels = extract_labels(annotation, control_tag, label_interface_tags)
             task_labels.extend(labels)
 
         # cache labels in separate data column
@@ -57,20 +60,36 @@ def cache_labels_job(project, queryset, **kwargs):
     return {'response_code': 200, 'detail': f'Updated {len(tasks)} tasks'}
 
 
-def extract_labels(annotation, control_tag):
+def extract_labels(annotation, control_tag, label_interface_tags=None):
     labels = []
     for region in annotation.result:
         # find regions with specific control tag name or just all regions if control tag is None
         if (control_tag is None or region['from_name'] == control_tag) and 'value' in region:
-            # scan value for a field with list of strings,
-            # as bonus it will work with textareas too
+            # scan value for a field with list of strings (eg choices, textareas)
+            # or taxonomy (list of string-lists)
             for key in region['value']:
-                if (
-                    isinstance(region['value'][key], list)
-                    and region['value'][key]
-                    and isinstance(region['value'][key][0], str)
-                ):
-                    labels.extend(region['value'][key])
+                if region['value'][key] and isinstance(region['value'][key], list):
+
+                    if key == 'taxonomy':
+                        showFullPath = 'true'
+                        pathSeparator = '/'
+                        if label_interface_tags is not None and region['from_name'] in label_interface_tags:
+                            # if from_name is not a custom_control tag, then we can try to fetch taxonomy formatting params
+                            label_interface_tag = label_interface_tags[region['from_name']]
+                            showFullPath = label_interface_tag.attr.get('showFullPath', 'false')
+                            pathSeparator = label_interface_tag.attr.get('pathSeparator', '/')
+
+                        if showFullPath == 'false':
+                            for elems in region['value'][key]:
+                                labels.append(elems[-1])  # just the leaf node of a taxonomy selection
+                        else:
+                            for elems in region['value'][key]:
+                                labels.append(pathSeparator.join(elems))  # the full delimited taxonomy path
+
+                    # other control tag types like Choices & TextAreas
+                    elif isinstance(region['value'][key][0], str):
+                        labels.extend(region['value'][key])
+
                     break
     return labels
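
Review note: the taxonomy branch is easier to reason about in isolation. Below is a minimal, self-contained Python sketch of the same flattening logic, working on plain dicts instead of `Annotation` and `LabelInterface` objects. The names `taxonomy_labels`, `extract_labels_sketch`, and `tag_attrs` are hypothetical; `tag_attrs` stands in for the `attr` mapping of each control tag.

```python
def taxonomy_labels(paths, show_full_path="true", path_separator="/"):
    """Flatten taxonomy selections (a list of string lists) into label strings."""
    if show_full_path == "false":
        return [path[-1] for path in paths]  # leaf node only
    return [path_separator.join(path) for path in paths]  # full delimited path


def extract_labels_sketch(result, control_tag=None, tag_attrs=None):
    """Plain-dict adaptation of the hunk; tag_attrs maps tag name -> attribute dict."""
    labels = []
    for region in result:
        if (control_tag is None or region["from_name"] == control_tag) and "value" in region:
            for key, value in region["value"].items():
                if value and isinstance(value, list):
                    if key == "taxonomy":
                        # Defaults used when the tag is unknown (e.g. a custom control tag).
                        show_full_path, separator = "true", "/"
                        if tag_attrs is not None and region["from_name"] in tag_attrs:
                            attrs = tag_attrs[region["from_name"]]
                            show_full_path = attrs.get("showFullPath", "false")
                            separator = attrs.get("pathSeparator", "/")
                        labels.extend(taxonomy_labels(value, show_full_path, separator))
                    elif isinstance(value[0], str):
                        # Choices, TextArea, and other flat string lists.
                        labels.extend(value)
                    break
    return labels


result = [
    {"from_name": "tax", "value": {"taxonomy": [["Animals", "Cat"], ["Plants"]]}},
    {"from_name": "cls", "value": {"choices": ["Legal"]}},
]
```

One detail worth noting when reviewing: the fallback differs by case. An unknown tag yields full paths (`"Animals/Cat"`), while a known tag without a `showFullPath` attribute yields leaf nodes only (`"Cat"`).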

label_studio/data_manager/actions/remove_duplicates.py

Lines changed: 2 additions & 0 deletions
@@ -172,6 +172,8 @@ def restore_storage_links_for_duplicated_tasks(duplicates) -> None:
             link = storage_link_class(
                 task_id=task['id'],
                 key=link_instance.key,
+                row_index=link_instance.row_index,
+                row_group=link_instance.row_group,
                 storage=link_instance.storage,
             )
             link.save()

label_studio/feature_flags.json

Lines changed: 27 additions & 0 deletions
@@ -3120,6 +3120,33 @@
     "version": 2,
     "deleted": false
   },
+  "fflag_feat_root_11_support_jsonl_cloud_storage": {
+    "key": "fflag_feat_root_11_support_jsonl_cloud_storage",
+    "on": false,
+    "prerequisites": [],
+    "targets": [],
+    "contextTargets": [],
+    "rules": [],
+    "fallthrough": {
+      "variation": 0
+    },
+    "offVariation": 1,
+    "variations": [
+      true,
+      false
+    ],
+    "clientSideAvailability": {
+      "usingMobileKey": false,
+      "usingEnvironmentId": false
+    },
+    "clientSide": false,
+    "salt": "85e018dcd2e64c689a61ee7ed3c5edb2",
+    "trackEvents": false,
+    "trackEventsFallthrough": false,
+    "debugEventsUntilDate": null,
+    "version": 2,
+    "deleted": false
+  },
   "fflag_feature_all_optic_1421_cold_start_v2": {
     "key": "fflag_feature_all_optic_1421_cold_start_v2",
     "on": false,
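
Review note: to read what the new flag resolves to by default, the sketch below evaluates an entry of this shape in Python. It assumes LaunchDarkly-style semantics (serve `offVariation` when `on` is false, otherwise the `fallthrough` variation) and deliberately ignores targets, rules, and prerequisites, which are all empty here. `evaluate_flag` is a hypothetical helper, not part of Label Studio.

```python
def evaluate_flag(flag: dict) -> bool:
    """Simplified evaluation; assumes no targets, rules, or prerequisites apply."""
    if flag["on"]:
        index = flag["fallthrough"]["variation"]
    else:
        index = flag["offVariation"]
    return flag["variations"][index]


# Only the fields the sketch reads, copied from the entry added in this hunk.
jsonl_flag = {
    "on": False,
    "fallthrough": {"variation": 0},
    "offVariation": 1,
    "variations": [True, False],
}
```

Under these assumptions the flag ships disabled: `on` is false, so index 1 of `variations` is served, which is `false`.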
