
Commit 74bc54d

docs: Agreement v2 docs (#9607)
1 parent 31b8c06 commit 74bc54d

36 files changed

Lines changed: 1337 additions & 429 deletions

docs/source/guide/agreement_metrics.md

Lines changed: 780 additions & 0 deletions
Large diffs are not rendered by default.

docs/source/guide/custom_metric.md

Lines changed: 4 additions & 3 deletions
@@ -1,17 +1,18 @@
 ---
 title: Add a custom agreement metric to Label Studio
-short: Custom agreement metric
+short: Custom metrics
 tier: enterprise
 type: guide
 order: 0
 order_enterprise: 310
 meta_title: Add a Custom Agreement Metric for Labeling
 meta_description: Label Studio Enterprise documentation about how to add a custom agreement metric to use for assessing annotator agreement or the quality of your annotation and prediction results for data labeling and machine learning projects.
 section: "Review & Measure Quality"
-
+parent: "stats"
+parent_enterprise: "stats"
 ---

-Write a custom agreement metric to assess the quality of the predictions and annotations in your Label Studio Enterprise project. Label Studio Enterprise contains a variety of [agreement metrics for your project](stats.html) but if you want to evaluate annotations using a custom metric or a standard metric not available in Label Studio, you can write your own.
+Write a custom agreement metric to assess the quality of the predictions and annotations in your Label Studio Enterprise project. Label Studio Enterprise contains a variety of [agreement metrics for your project](agreement_metrics) but if you want to evaluate annotations using a custom metric or a standard metric not available in Label Studio, you can write your own.

 !!! note
     This functionality is available out-of-the-box for Label Studio Enterprise Cloud users.
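
A custom metric is essentially a small Python function that scores how well two results for the same task match, on a 0 to 1 scale. As a minimal sketch of the idea only — the function name and signature here are assumptions for illustration, not the actual Label Studio Enterprise custom-metric interface — an exact-match comparison for a `Choices` control could look like this:

```python
# Illustrative sketch of a pairwise agreement function for a Choices control.
# The function name and signature are hypothetical; see the custom metric guide
# for the real interface. The payload shape follows Label Studio's annotation
# JSON ("result" items with type "choices").

def choices_agreement(annotation_1: dict, annotation_2: dict) -> float:
    """Return 1.0 if both annotations selected the same set of choices, else 0.0."""

    def selected_choices(annotation: dict) -> frozenset:
        # Gather every choice selected across the annotation's result items.
        choices = set()
        for item in annotation.get("result", []):
            if item.get("type") == "choices":
                choices.update(item["value"]["choices"])
        return frozenset(choices)

    return 1.0 if selected_choices(annotation_1) == selected_choices(annotation_2) else 0.0


a1 = {"result": [{"type": "choices", "value": {"choices": ["Cat"]}}]}
a2 = {"result": [{"type": "choices", "value": {"choices": ["Dog"]}}]}
print(choices_agreement(a1, a2))  # 0.0
```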

docs/source/guide/dashboard_members.md

Lines changed: 3 additions & 0 deletions
@@ -63,6 +63,9 @@ The Annotator Agreement Matrix helps you see how consistently different members
 - **Hover over any cell** to view more information including the number of tasks where both members made an annotation. If a member made more than one annotation in a task, the additional annotation(s) are also considered.
 - **Use the label dropdown** to filter and explore agreement when at least one annotation contains the specified label.

+!!! note
+    Agreement in the Members Dashboard reflects the [Pairwise agreement](stats#Pairwise) between annotators, regardless of what methodology you have selected for the project.
+
 ## Agreement Distribution

 The Agreement Distribution visualizes how agreement scores vary across tasks in your project. The bar chart displays the number of tasks at each agreement score range.

docs/source/guide/label_studio_compare.md

Lines changed: 1 addition & 1 deletion
@@ -303,7 +303,7 @@ Label Studio is available to everyone as open source software (Label Studio Comm
 <tr>
 <td><b>Agreement metrics</b><br/><a href="https://docs.humansignal.com/guide/stats.html">Define how annotator consensus is calculated using pre-defined agreement metrics.</a></td>
 <td style="text-align:center">❌</td>
-<td style="text-align:center"></td>
+<td style="text-align:center">Limited</td>
 <td style="text-align:center">✅</td>
 </tr>
 <tr>

docs/source/guide/manage_data.md

Lines changed: 26 additions & 108 deletions
@@ -29,7 +29,7 @@ For information on setting up a project, see [Create and configure projects](set

 </div>

-In Label Studio Community Edition, the data manager is the default view for your data. In Label Studio Enterprise, click **Data Manager** to open and view the data manager page. Every row in the data manager represents a labeling task in your dataset.
+Every row in the data manager represents a labeling task in your dataset.

 <div class="enterprise-only">

@@ -142,136 +142,54 @@ If you want to make changes to the labeling interface or perform a different typ

 <div class="enterprise-only">

-## Agreement and Agreement (Selected) columns
+## Agreement columns

-These two columns allow you to see agreement scores at a task level.
+The agreement columns in the Data Manager reflect consensus between annotators for a task. For more information on agreement and how it is calculated, see [Task agreement](stats).

-### Agreement
+You will see the following agreement columns in the Data Manager:

-The **Agreement** column displays the average agreement score between all annotators for a particular task.
+* **Agreement** - This is the overall agreement for the task.

-Each annotation pair's agreement score will be calculated as new annotations are submitted. For example, if there are three annotations for a task, there will be three unique annotation pairs, and the agreement column will show the average agreement score of those three pairs.
+This is calculated as the mean agreement score between all control tags for a particular task. See [Overall agreement](stats#Overall-agreement).
+* **[Control tag] agreement** - Each control tag has its own agreement score.

-Here is an example with a simple label config. Let's assume we are using ["Exact matching choices" agreement calculation](stats#Exact-matching-choices-example)
-```xml
-<View>
-<Image name="image_object" value="$image_url"/>
-<Choices name="image_classes" toName="image_object">
-<Choice value="Cat"/>
-<Choice value="Dog"/>
-</Choices>
-</View>
-```
-Annotation 1: `Cat`
-Annotation 2: `Dog`
-Annotation 3: `Cat`
+How control tag agreement is calculated depends on how your project is set up. See [Per-control-tag agreement](stats#Per-control-tag-agreement).

-The three unique pairs are
-1. Annotation 1 <> Annotation 2 - agreement score is `0`
-2. Annotation 1 <> Annotation 3 - agreement score is `1`
-3. Annotation 2 <> Annotation 3 - agreement score is `0`
+![Screenshot](/images/review/agreement-dm.png)

-The agreement column for this task would show the average of all annotation pair's agreement score:
-`33%`
+### Annotators and models

-### Agreement (Selected)
+Click any agreement column to select specific annotators and models that you want to use for agreement calculation.

-The **Agreement (Selected)** column builds on top of the agreement column, allowing you to get agreement scores between annotators, ground truth, and model versions.
+![Screenshot](/images/review/agreement-dm-modal.png)

-The column header is a dropdown where you can make your selection of which pairs you want to include in the calculation.
+By default, all annotators (and not models) are selected for agreement calculation.

-<img src="/images/project/agreement-selected.png" class="gif-border" style="max-width:679px">
+However, you can customize this to select a subset of annotators, models, or models and annotators to compare.

-Under **Choose What To Calculate** there are two options, which can be used for different use cases.
+For example, if you have 10 annotators and you select 3, the overall agreement score and the control tag agreement scores will be recalculated to reflect only your selections.

-#### Agreement Pairs
-
-This allows you to select specific annotators and/or models to compare.
-
-
-You must select at least two items to compare. This can be used in a variety of ways.
-
-**Subset of annotators**
-
-You can select a subset of annotators to compare. This is different and more precise than the **Agreement** column which automatically includes all annotators in the score.
-
-This will then average all annotator vs annotator scores for only the selected annotators.
-
-<img src="/images/project/agreement-selected-annotators.png" class="gif-border" style="max-width:679px">
-
-**Subset of models**
-
-You can also select multiple models to see model consensus in your project. This will average all model vs model scores for the selected models.
-
-<img src="/images/project/agreement-selected-models.png" class="gif-border" style="max-width:679px">
-
-**Subset of models and annotators**
-
-Other combinations are also possible such as selecting one annotator and multiple models, multiple annotators and multiple models, etc.
-
-* If multiple annotators are selected, all annotator vs annotator scores will be included in the average.
-* If multiple models are selected, all model vs model scores will be included in the average.
-* If one or more annotators are selected along with one or more models, all annotator vs model scores will be included in the average.
-
-#### Ground Truth Match
-
-If your project contains ground truth annotations, this allows you to compare either a single annotator or a single model to ground truth annotations.
-
-<img src="/images/project/agreement-selected-gt.png" class="gif-border" style="max-width:679px">
-
-
-#### Limitations
-
-We currently only support calculating the **Agreement (Selected)** columen for tasks with 20 or less annotations. If you have a task with more than this threshold, you will see an info icon with a tooltip.
-
-<img src="/images/project/agreement-selected-threshold.png" class="gif-border" style="max-width:679px">
-
-
-#### Example Score Calculations
-
-Example using the same simple label config as above:
+!!! note
+    You must select at least two items to compare.

-```xml
-<View>
-<Image name="image_object" value="$image_url"/>
-<Choices name="image_classes" toName="image_object">
-<Choice value="Cat"/>
-<Choice value="Dog"/>
-</Choices>
-</View>
-```
+Your selections will apply to all agreement columns in the Data Manager. You cannot select different annotators and models for different agreement columns.

-Lets say for one task we have the following:
-1. Annotation 1 from annotator 1 - `Cat` (marked as ground truth)
-2. Annotation 2 from annotator 2 - `Dog`
-3. Prediction 1 from model version 1 - `Dog`
-4. Prediction 2 from model version 2 - `Cat`

-Here is how the score would be calculated for various selections in the dropdown
+### Ground truth match

-#### `Agreement Pairs` with `All Annotators` selected
-This will match the behavior of the **Agreement** column - all annotation pair's scores will be averaged:
+If your project contains ground truth annotations, you can use this option to compare either a single annotator or a single model to ground truth annotations.

-1. Annotation 1 <> Annotation 2: Agreement score is `0`
+Label Studio will apply whatever agreement metrics and methodology you have configured for your project, but will limit the calculation to the selected annotator or model and the annotations marked as ground truth.

-Score displayed in column for this task: `0%`
+<img src="/images/review/agreement-dm-gt.png" class="gif-border" style="max-width:679px">

-#### `Agreement Pairs` with `All Annotators` and `All Model Versions` selected
-This will average all annotation pair's scores, as well as all annotation <> model version pair's scores
-1. Annotation 1 <> Annotation 2 - agreement score is `0`
-4. Annotation 1 <> Prediction 1 - agreement score is `0`
-5. Annotation 1 <> Prediction 2 - agreement score is `1`
-6. Annotation 2 <> Prediction 1 - agreement score is `1`
-7. Annotation 2 <> Prediction 2 - agreement score is `0`
+### Agreement popover

-Score displayed in column for this task: `40%`
+Click any agreement column to see a popover that has information about the metric and methodology used.

-#### `Ground Truth Match` with `model version 2` selected
-This will compare all ground truth annotations with all predictions from `model version 2`.
+If you are using **Pairwise** methodology, you will see a breakdown of agreement scores for the selected annotators and models.

-In this example, Annotation 1 is marked as ground truth and Prediction 2 is from `model version 2`:
+<img src="/images/review/agreement-dm-popover.png" class="gif-border" style="max-width:600px">

-1. Annotation 1 <> Prediction 2 - agreement score is `1`

-Score displayed in column for this task: `100%`
 </div>
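
For reference, the pairwise averaging walked through in the removed worked example (three annotations, three unique pairs, an average of 33%) can be sketched in a few lines of Python. This is purely illustrative and not Label Studio source code:

```python
from itertools import combinations

def exact_match(a: str, b: str) -> float:
    """Exact-matching-choices agreement for a single pair of results."""
    return 1.0 if a == b else 0.0

def pairwise_agreement(annotations: list[str]) -> float:
    """Average exact-match score over all unique annotation pairs."""
    pairs = list(combinations(annotations, 2))
    return sum(exact_match(a, b) for a, b in pairs) / len(pairs)

# The example's three annotations: Cat, Dog, Cat -> pair scores 0, 1, 0.
print(round(pairwise_agreement(["Cat", "Dog", "Cat"]) * 100))  # 33
```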

docs/source/guide/project_settings_lse.md

Lines changed: 78 additions & 17 deletions
@@ -783,15 +783,18 @@ For more information about pausing annotators, including how to manually pause s

 </dd>

-<dt id="task-agreement">Agreement</dt>
+<dt id="task-agreement">Agreement <span class="badge"></span></dt>

 <dd>

 When multiple annotators are labeling a task, the task agreement reflects how much agreement there is between annotators.

 For example, if 10 annotators review a task and only 2 select the same choice, then that task would have a low agreement score.

-You can customize how task agreement is calculated and how it should affect the project workflow. For more information, see [Task agreement and how it is calculated](stats).
+You can customize how task agreement is calculated and how it should affect the project workflow. For more information, see [Task agreement](stats).
+
+!!! error Enterprise
+    Label Studio Starter Cloud only supports the **Pairwise** methodology. Each control tag uses the [default built-in metric](agreement_metrics#Default-metric-reference) for agreement calculation.

 <table>
 <thead>
@@ -803,20 +806,90 @@ You can customize how task agreement is calculated and how it should affect the
 <tr>
 <td>

-**Agreement metric**
+**Methodology**
+
+</td>
+<td>
+
+Methodology to use for calculating task agreement.
+
+* **Consensus**: Consensus measures *"What percentage of annotators chose the most common answer?"*
+* **Pairwise**: Pairwise measures *"What is the average agreement score across all pairs of annotators?"*
+
+For more information, see [Task agreement - methodology](stats#Methodology).
+
+</td>
+</tr>
+<tr>
+<td>
+
+**Built-in Metrics vs Custom**
+
+</td>
+<td>
+
+Select whether you want to use the built-in metrics or custom metrics for agreement.
+
+For more information, see [Built-in agreement metrics reference](agreement_metrics) and [Custom agreement metrics](custom_metric).
+
+</td>
+</tr>
+<tr>
+<td>
+
+**Overall Agreement**
+
 </td>
 <td>

-Select the [metric](stats#Available-agreement-metrics) that should determine task agreement.
+Configure how overall agreement is calculated by setting the weight for each control tag.
+
+For more information, see [Configure weight for the overall agreement](stats#Configure-weight-for-the-overall-agreement).
+

 </td>
 </tr>
 <tr>
 <td>

+**Agreement Columns**
+
+</td>
+<td>
+
+Configure how agreement is calculated for each control tag.
+
+For more information, see [Configure agreement for each control tag](stats#Configure-agreement-for-each-control-tag).
+
+</td>
+</tr>
+</table>
+
+</dd>
+
+<dt id="low-agreement">Low Agreement Resolution <span class="badge"></span></dt>
+
+<dd>
+
+!!! note
+    Low agreement resolution settings are only available when the project is configured to [automatically assign tasks](#distribute-tasks). If you are using Manual distribution, this section will not appear in your project settings.
+
+    If you switch a project from Automatic to Manual distribution, low agreement resolution is automatically disabled.
+
+Resolve tasks with low agreement scores by automatically assigning additional annotators to the task.
+
+<table>
+<thead>
+<tr>
+<th>Field</th>
+<th>Description</th>
+</tr>
+</thead>
+<tr>
+<td>
+
 **Assign additional annotator**

-<span class="badge"></span>
 </td>
 <td>
 Enable this option to automatically assign an additional annotator to any tasks that have a low agreement score.
@@ -832,7 +905,6 @@ Note that to see this setting, the project must be set up with [automatic task a

 **Agreement threshold**

-<span class="badge"></span>
 </td>
 <td>

@@ -845,7 +917,6 @@ Enter the agreement score that a task must meet before it can be considered comp

 **Maximum additional annotators**

-<span class="badge"></span>
 </td>
 <td>

@@ -860,16 +931,6 @@ Annotators are assigned one at a time until the agreement threshold is achieved.
 !!! note
     When configuring **Maximum additional annotators**, be mindful of the number of annotators available in your project. If you have fewer annotators available than the sum of [**Annotations per task**](#overlap) + **Maximum additional annotators**, you might encounter a scenario in which a task with a low agreement score cannot be marked complete.

-</dd>
-
-<dt>Custom weights</dt>
-
-<dd>
-
-Set custom weights for tags and labels to change the agreement calculation. The options you are given are automatically generated from your labeling interface setup.
-
-Weights set to zero are ignored from calculation.
-
 </dd>
 </dl>
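
To make the difference between the two methodologies added above concrete, here is a rough numeric sketch for a single Choices-style task with four annotators. It is not Label Studio's implementation, just the two questions the docs pose translated into code:

```python
from collections import Counter
from itertools import combinations

def consensus(choices: list[str]) -> float:
    """Share of annotators who chose the most common answer."""
    most_common_count = Counter(choices).most_common(1)[0][1]
    return most_common_count / len(choices)

def pairwise(choices: list[str]) -> float:
    """Mean exact-match score over all pairs of annotators."""
    pairs = list(combinations(choices, 2))
    return sum(1.0 if a == b else 0.0 for a, b in pairs) / len(pairs)

answers = ["Cat", "Cat", "Cat", "Dog"]
print(f"Consensus: {consensus(answers):.0%}")  # 75%
print(f"Pairwise:  {pairwise(answers):.0%}")   # 50%
```

With three of four annotators agreeing, Consensus reports 75%, while Pairwise reports 50% because only three of the six annotator pairs match.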

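The new **Overall Agreement** setting describes overall task agreement as a weighted combination of per-control-tag scores, and the removed "Custom weights" text notes that weights set to zero are ignored. A hypothetical sketch of that weighting (tag names, scores, and weights below are made up for illustration):

```python
def overall_agreement(tag_scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted mean of per-control-tag agreement scores; zero-weight tags are ignored."""
    active = {tag: weights.get(tag, 1.0) for tag in tag_scores}
    total_weight = sum(w for w in active.values() if w > 0)
    return sum(tag_scores[tag] * w for tag, w in active.items() if w > 0) / total_weight

scores = {"sentiment": 0.9, "topic": 0.5}
weights = {"sentiment": 2.0, "topic": 1.0}
weights_zeroed = {"sentiment": 1.0, "topic": 0.0}

print(f"Weighted overall:      {overall_agreement(scores, weights):.0%}")         # 77%
print(f"Topic weight set to 0: {overall_agreement(scores, weights_zeroed):.0%}")  # 90%
```
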
docs/source/guide/quality.md

Lines changed: 0 additions & 5 deletions
@@ -183,11 +183,6 @@ Review a table to see the following for each annotator:
 - The agreement of their annotations with the ground truth annotations, if there are any.
 - The agreement of their annotations with predicted annotations, if there are any.

-See the following video for an overview of annotator agreement metrics:
-
-<iframe class="video-border" width="560" height="315" src="https://www.youtube.com/embed/Lo_PVE9Pyw4?si=z1vtyI_xIo8aR8fY" width="100%" height="400vh" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
-
-
 ### Review annotator agreement matrix

 You can also review the overall annotator agreement on a more individual basis with the annotator agreement matrix.
