Categorization Export JSON Format

Updated at January 23rd, 2026

1. Overview

This document describes the JSON export format for categorization projects in the annotation store. The export transforms the internal flat annotation structure into a hierarchical JSON document optimized for customer delivery and downstream processing.

Key features include:

  • Hierarchical structure: context (reference data) and annotations
  • Multi-item support: Multiple categorization items per task
  • Flexible keying: Annotations keyed by class name or name attribute (for dynamic or dynamic-like taxonomies)
  • Delivery controls: Visibility rules for sensitive attributes
  • Asset handling: Structured support for text, images, and videos

2. How to Read the JSON Export (Hierarchy & IDs)

This section provides a high-level overview of how the JSON export is structured and how different levels relate to each other.

Hierarchy and Levels

The JSON export is hierarchical. Relationships between objects are defined by JSON nesting, not by referencing IDs. Each task contains one or more items, and each item contains context data and annotations.

Task
 └─ items[]
     ├─ context
     │   └─ fields (text, image, video)
     └─ annotations
         └─ annotation objects
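
As a minimal sketch of reading this hierarchy, the Python snippet below follows the nesting from task to items to context and annotations. The sample data is hypothetical and only shaped like the structure described in this document; the function name walk_items is an illustrative choice, not part of any API.

```python
import json

# Hypothetical sample, shaped like the task structure described in this document.
SAMPLE_TASK = json.loads("""
{
  "id": "task-123",
  "items": [
    {
      "__id": 1,
      "context": {"__id": 2, "SKU": "ABC123"},
      "annotations": {"__id": 6, "height": {"__id": 7, "mapping": "Height"}}
    }
  ]
}
""")


def walk_items(task):
    """Yield (item_index, context, annotations) by following the JSON nesting."""
    for index, item in enumerate(task.get("items", [])):
        yield index, item.get("context", {}), item.get("annotations", {})


for index, context, annotations in walk_items(SAMPLE_TASK):
    print(index, context["SKU"], annotations["height"]["mapping"])
```

The item's position in items[] is used as the handle here, rather than any __id value.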

Understanding __id Fields

The __id field appears throughout the export (items, context, annotations, and individual annotation entries). These values are internal object identifiers used for traceability and feedback workflows.

  • __id values do not carry business meaning
  • __id values are not stable across exports
  • Clients should not use __id values for joins or downstream logic

Clients should treat __id as opaque metadata and rely on JSON structure and field names instead.
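
One way to act on this, sketched in Python under the assumption that the export has already been parsed into dictionaries and lists, is to drop every __id key before comparing or storing data. The helper name strip_ids and the two sample objects are hypothetical.

```python
def strip_ids(node):
    """Return a copy of a parsed export with every "__id" key removed."""
    if isinstance(node, dict):
        return {key: strip_ids(value) for key, value in node.items() if key != "__id"}
    if isinstance(node, list):
        return [strip_ids(value) for value in node]
    return node


# Two hypothetical exports of the same item that differ only in their __id values.
first = {"__id": 1, "annotations": {"__id": 6, "content_safe": {"__id": 7, "confidence": "high"}}}
second = {"__id": 40, "annotations": {"__id": 41, "content_safe": {"__id": 42, "confidence": "high"}}}

print(strip_ids(first) == strip_ids(second))
```

The function builds new objects instead of mutating its input, so the original export stays available for feedback workflows that still need the identifiers.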

Keys vs IDs in Annotations

Within the annotations object, the JSON property names (for example, "height" or "content_safe") represent the semantic meaning of the annotation. These keys should be used by clients to interpret the data.

The accompanying __id fields exist only for internal tracking and should be ignored for data interpretation.
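
The following Python sketch illustrates reading annotations by their property names. It flattens an annotations object into rows of (annotation key, attribute, value). It assumes that every key beginning with "__" is internal metadata, which goes slightly beyond what this section states about __id alone; annotation_rows is a hypothetical helper name.

```python
def annotation_rows(annotations):
    """Flatten an annotations object into (annotation_key, attribute, value) rows.

    Assumption: any key starting with "__" is internal metadata and is skipped.
    """
    rows = []
    for key, entry in annotations.items():
        if key.startswith("__") or not isinstance(entry, dict):
            continue
        for attribute, value in entry.items():
            if not attribute.startswith("__"):
                rows.append((key, attribute, value))
    return rows


sample = {
    "__id": 15,
    "__order": ["content_safe"],
    "content_safe": {"__id": 16, "confidence": "high"},
}
print(annotation_rows(sample))
```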

Groups and Taxonomy Structure

Although taxonomies may include groups, classes, and nested relationships internally, the export JSON does not preserve group or parent-child taxonomy nesting. Only final, exportable annotation values are included in the output.

This ensures the export remains simple and focused on usable annotation results.
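
Because groups do not appear in the export, a client that wants to match export keys against taxonomy classes may need to flatten the taxonomy first. The Python sketch below does this under the assumption that groups list their members under "children" and that classes carry a "name", as in the Example 2 taxonomy; iter_classes and the sample objects are illustrative only.

```python
def iter_classes(objects):
    """Yield class names from taxonomy objects, descending into groups.

    Assumption: groups carry their members under "children", as in Example 2.
    """
    for obj in objects:
        if obj.get("__type") == "group":
            yield from iter_classes(obj.get("children", []))
        elif obj.get("__type") == "class":
            yield obj["name"]


# Hypothetical taxonomy fragment mixing a group and a top-level class.
taxonomy_objects = [
    {
        "__type": "group",
        "label": "Content Classification",
        "children": [{"__type": "class", "name": "content_safe", "attributes": []}],
    },
    {"__type": "class", "name": "text_attribute", "attributes": []},
]
print(list(iter_classes(taxonomy_objects)))
```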

3. JSON Structure Overview

Top-Level Task Structure

{
  "id": "1234567rtyy9876532",
  "primary_keys": ["key1", "key2"],
  "metadata": {
    "custom_field": "value"
  },
  "items": [
    {
      "__id": 1,
      "context": { ... },
      "annotations": { ... }
    }
  ]
}

| Field | Type | Description |
|---|---|---|
| id | string | Unique task identifier |
| primary_keys | string[] | Optional customer-provided keys |
| metadata | object | Optional task-level metadata |
| items | array | Categorization items (can be multiple per task) |
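
The field types above can be checked with a small validator before any further processing. This Python sketch is one possible approach, not an official schema; validate_task is a hypothetical name, and treating a missing primary_keys or metadata field as valid is an assumption based on both being described as optional.

```python
def validate_task(task):
    """Return a list of problems found in the top-level task fields."""
    problems = []
    if not isinstance(task.get("id"), str):
        problems.append("id must be a string")
    keys = task.get("primary_keys", [])
    if not isinstance(keys, list) or not all(isinstance(k, str) for k in keys):
        problems.append("primary_keys must be a list of strings")
    if not isinstance(task.get("metadata", {}), dict):
        problems.append("metadata must be an object")
    if not isinstance(task.get("items"), list):
        problems.append("items must be an array")
    return problems


good = {"id": "task-123", "primary_keys": ["SKU-123"], "metadata": {}, "items": []}
bad = {"id": 5, "items": {}}
print(validate_task(good))
print(validate_task(bad))
```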

Item Structure

Each item contains:

  • __id: Internal object ID for feedback linking
  • context: Reference data presented to annotators
  • annotations: Output data from annotators

3. Context Object

The context object contains the reference data shown to annotators. It's structured as a map of named fields.

Structure

"context": {
  "__id": 2,
  "__order": ["field_name", "another_field"],
  "field_name": "value or object",
  "another_field": "value"
}

The __order field is an array of strings listing the keys in the order they appear in the annotation UI.
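
A reader that wants to present context fields in UI order could use __order when it is present. The Python sketch below falls back to the JSON key order when __order is missing, which is an assumption made for illustration (some examples in this document omit __order); ordered_context_fields is a hypothetical helper.

```python
def ordered_context_fields(context):
    """Return (name, value) pairs for context fields in annotation-UI order."""
    order = context.get("__order")
    if order is None:
        # Fall back to the JSON key order, skipping "__"-prefixed metadata.
        order = [key for key in context if not key.startswith("__")]
    return [(name, context[name]) for name in order if name in context]


with_order = {
    "__id": 11,
    "__order": ["content_id", "text_content"],
    "text_content": {"type": "text", "format": "plaintext", "text": "Hello"},
    "content_id": "post-789",
}
without_order = {"__id": 2, "SKU": "ABC123", "product_name": "Smartphone Case"}

print([name for name, _ in ordered_context_fields(with_order)])
print([name for name, _ in ordered_context_fields(without_order)])
```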

Context Field Types

Plain Text

{
  "SKU": "1234567890",
  "description": "Product description text"
}

Formatted Text

"product_details": {
  "type": "text",
  "format": "markdown",
  "text": "This is a **bold** product with *italic* features"
}

Images

"main_image": {
  "type": "image",
  "assetId": "7f3a9c2e-4b81-9d6a-b7e2-5c91e8a4d0f3",
  "url": "https://assets.sama.com/api/v1/assets/123e4567...",
  "metadata": {
    "width": 1920,
    "height": 1080,
    "originalUrl": "https://example.com/product.jpg"
  }
}

Videos

"demo_video": {
  "type": "video",
  "assetId": "7f3a9c2e-4b81-9d6a-b7e2-5c91e8a4d0f3",
  "url": "https://assets.sama.com/api/v1/assets/123e4567...",
  "metadata": {
    "duration": 120.5,
    "width": 1920,
    "height": 1080,
    "framesPerSecond": 30,
    "originalUrl": "https://example.com/demo.mp4"
  }
}
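
The four field types above can be told apart by checking whether a value is a plain string or an object with a "type" property. The Python sketch below shows one way to do that; describe_field is a hypothetical name, and returning "unknown" for anything else is an illustrative choice rather than documented behavior.

```python
def describe_field(value):
    """Classify a context field value as one of the types listed above."""
    if isinstance(value, str):
        return "plain text"
    if isinstance(value, dict):
        kind = value.get("type")
        if kind == "text":
            return "formatted text (" + value.get("format", "unknown") + ")"
        if kind in ("image", "video"):
            return kind
    return "unknown"


# Hypothetical context fields modeled on the samples in this section.
fields = {
    "SKU": "1234567890",
    "product_details": {"type": "text", "format": "markdown", "text": "A **bold** product"},
    "main_image": {"type": "image", "assetId": "asset-456"},
    "demo_video": {"type": "video", "assetId": "asset-101"},
}
for name, value in fields.items():
    print(name, "->", describe_field(value))
```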

4. Annotations Object

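The annotations object contains the output from annotators. How its entries are keyed depends on the project taxonomy configuration (shouldDisplayAttributesInSameGroup). In the examples in section 5, the taxonomy with shouldDisplayAttributesInSameGroup: true produces entries keyed by name attribute, and the taxonomy with shouldDisplayAttributesInSameGroup: false produces entries keyed by class name.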

Keying Strategies

By Class Name (Non-Dynamic Projects)

"annotations": {
  "__id": 6,
  "product_category": {
    "__id": 7,
    "category": "Electronics",
    "subcategory": "Smartphones"
  },
  "quality_assessment": {
    "__id": 8,
    "condition": "New",
    "defects": []
  }
}

By Name Attribute (Dynamic-like Projects)

"annotations": {
  "__id": 6,
  "product_height": {
    "__id": 7,
    "__type": "dimension_attribute",
    "value": "15",
    "unit": "cm"
  },
  "brand_name": {
    "__id": 8,
    "__type": "text_attribute",
    "value": "Samsung"
  }
}
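
In the two samples above, entries keyed by name attribute carry a __type property while entries keyed by class name do not. Assuming that pattern holds (it is an observation from these samples, not a documented guarantee), the Python sketch below groups annotation keys by __type so both shapes can be handled by the same code; keys_by_type is a hypothetical helper.

```python
from collections import defaultdict


def keys_by_type(annotations):
    """Group annotation keys by their "__type" value (None when absent)."""
    grouped = defaultdict(list)
    for key, entry in annotations.items():
        if key.startswith("__") or not isinstance(entry, dict):
            continue
        grouped[entry.get("__type")].append(key)
    return dict(grouped)


by_name = {
    "__id": 6,
    "product_height": {"__id": 7, "__type": "dimension_attribute", "value": "15", "unit": "cm"},
    "brand_name": {"__id": 8, "__type": "text_attribute", "value": "Samsung"},
}
by_class = {
    "__id": 6,
    "product_category": {"__id": 7, "category": "Electronics", "subcategory": "Smartphones"},
    "quality_assessment": {"__id": 8, "condition": "New", "defects": []},
}
print(keys_by_type(by_name))
print(keys_by_type(by_class))
```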

Annotation Fields

| Field | Type | When Present | Description |
|---|---|---|---|
| __id | number | Always | Object ID for feedback linking |
| __order | string[] | Internal use only | Order of annotation keys |

Delivery Visibility

Attributes marked with deliveryVisibility: "private" in the taxonomy are excluded from the export unless explicitly overridden (for example, when an internal user is using the API to fetch this data).
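
A client holding both the taxonomy and an export can sanity-check this rule. The Python sketch below collects the attribute names marked private on a taxonomy class and reports any that still appear in an exported annotation entry. It is not the exporter's implementation; the helper names are hypothetical, and the sample class is a trimmed copy of the one in Example 1.

```python
def private_attribute_names(taxonomy_class):
    """Return the names of attributes marked deliveryVisibility: "private"."""
    return {
        attribute["name"]
        for attribute in taxonomy_class.get("attributes", [])
        if attribute.get("deliveryVisibility") == "private"
    }


def leaked_private_attributes(entry, taxonomy_class):
    """List private attribute names that are present in an exported entry."""
    return sorted(private_attribute_names(taxonomy_class) & set(entry))


# Trimmed copy of the class from Example 1.
text_attr_class = {
    "name": "text_attribute",
    "attributes": [
        {"name": "name", "deliveryVisibility": "private"},
        {"name": "description", "deliveryVisibility": "private"},
        {"name": "no_direct_mapping", "deliveryVisibility": "public"},
        {"name": "mapping_options", "deliveryVisibility": "private"},
        {"name": "mapping", "deliveryVisibility": "public"},
    ],
}
entry = {"__id": 7, "__type": "text_attribute", "no_direct_mapping": "false", "mapping": "Height"}
print(leaked_private_attributes(entry, text_attr_class))
```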

5. Complete Examples with Taxonomies

All internal __id fields in the taxonomy, as well as the asset IDs, are UUIDs in our system. Here they are shown as short readable strings so that relationships are easier to see.

Example 1: Dynamic-like Product Categorization

Taxonomy Definition

{
  "__id": "dynamic-taxonomy-v1",
  "__type": "taxonomy",
  "type": "categorization",
  "shouldDisplayAttributesInSameGroup": true,
  "errorCodes": [],
  "scene": [],
  "objects": [
    {
      "__id": "text-attr",
      "__type": "class",
      "name": "text_attribute",
      "label": "Text Attribute",
      "attributes": [
        {
          "__id": "attr-name",
          "__type": "text_attribute",
          "name": "name",
          "role": "name",
          "textType": "short-text",
          "deliveryVisibility": "private"
        },
        {
          "__id": "attr-desc",
          "__type": "text_attribute",
          "name": "description",
          "role": "description",
          "textType": "long-text",
          "deliveryVisibility": "private"
        },
        {
          "__id": "attr-no-mapping",
          "__type": "boolean_attribute",
          "name": "no_direct_mapping",
          "deliveryVisibility": "public",
          "options": {
            "true": { "label": "Yes", "value": "true", "isDefault": false },
            "false": { "label": "No", "value": "false", "isDefault": true }
          }
        },
        {
          "__id": "attr-mapping-opts",
          "__type": "dynamic_list_attribute",
          "name": "mapping_options",
          "deliveryVisibility": "private"
        },
        {
          "__id": "attr-mapping",
          "__type": "single_selection_from_dynamic_list_attribute",
          "name": "mapping",
          "listAttributeId": "attr-mapping-opts",
          "deliveryVisibility": "public"
        }
      ]
    }
  ]
}

Exported JSON (with deliveryVisibility respected)

Note that all attributes that are marked as deliveryVisibility: "private" in the taxonomy are excluded from the export.

{
  "id": "task-123",
  "primary_keys": ["SKU-123"],
  "metadata": {},
  "items": [
    {
      "__id": 1,
      "context": {
        "__id": 2,
        "SKU": "ABC123",
        "product_name": "Smartphone Case",
        "product_image": {
          "type": "image",
          "assetId": "asset-456",
          "url": "https://assets.sama.com/api/v1/assets/asset-456",
          "metadata": {
            "originalUrl": "https://store.example.com/case.jpg"
          }
        }
      },
      "annotations": {
        "__id": 6,
        "height": {
          "__id": 7,
          "__type": "text_attribute",
          "no_direct_mapping": "false",
          "mapping": "Height"
        },
        "size": {
          "__id": 8,
          "__type": "list_attribute",
          "no_direct_mapping": "true",
          "extracted_value": "M"
        }
      }
    }
  ]
}

Example 2: Content Moderation with Class Groups

Taxonomy Definition

{
  "__id": "moderation-taxonomy-v1",
  "__type": "taxonomy",
  "type": "categorization",
  "shouldDisplayAttributesInSameGroup": false,
  "errorCodes": [],
  "scene": [],
  "objects": [
    {
      "__id": "content-group",
      "__type": "group",
      "label": "Content Classification",
      "children": [
        {
          "__id": "safe-class",
          "__type": "class",
          "name": "content_safe",
          "label": "Safe Content",
          "attributes": [
            {
              "__id": "conf-attr",
              "__type": "single_choice_attribute",
              "name": "confidence",
              "deliveryVisibility": "public",
              "options": [
                { "label": "High", "value": "high", "isDefault": true },
                { "label": "Medium", "value": "medium", "isDefault": false },
                { "label": "Low", "value": "low", "isDefault": false }
              ]
            }
          ]
        }
      ]
    }
  ]
}

Exported JSON (internal export to show __mutable)

Note that a non-internal user will not see the __mutable: true field.

{
  "id": "task-456",
  "primary_keys": [],
  "metadata": {
    "source": "user_upload",
    "timestamp": "2025-10-16T10:30:00Z"
  },
  "items": [
    {
      "__id": 10,
      "context": {
        "__id": 11,
        "__order": ["content_id", "text_content", "image_content"],
        "content_id": "post-789",
        "text_content": {
          "type": "text",
          "format": "plaintext",
          "text": "Check out this amazing sunset photo I took!"
        },
        "image_content": {
          "type": "image",
          "assetId": "asset-789",
          "url": "https://assets.sama.com/api/v1/assets/asset-789",
          "metadata": {
            "originalUrl": "https://social.example.com/sunset.jpg"
          }
        }
      },
      "annotations": {
        "__id": 15,
        "__order": ["content_safe"],
        "content_safe": {
          "__id": 16,
          "__mutable": true,
          "confidence": "high"
        }
      }
    }
  ]
}

Example 3: Multi-modal Categorization

Exported JSON

{
  "id": "task-789",
  "primary_keys": ["content-abc-123"],
  "metadata": {
    "campaign": "holiday-2025"
  },
  "items": [
    {
      "__id": 20,
      "context": {
        "__id": 21,
        "title": "Holiday Campaign Ad",
        "description": {
          "type": "text",
          "format": "markdown",
          "text": "Review this **holiday campaign** ad for quality and compliance.\n\n* Check video quality\n* Verify audio clarity\n* Assess overall message"
        },
        "thumbnail": {
          "type": "image",
          "assetId": "asset-100",
          "url": "https://assets.sama.com/api/v1/assets/asset-100",
          "metadata": {
            "originalUrl": "https://cdn.example.com/thumb.jpg"
          }
        },
        "main_video": {
          "type": "video",
          "assetId": "asset-101",
          "url": "https://assets.sama.com/api/v1/assets/asset-101",
          "metadata": {
            "originalUrl": "https://cdn.example.com/holiday-ad.mp4"
          }
        }
      },
      "annotations": {
        "__id": 26,
        "media_analysis": {
          "__id": 27,
          "primary_media_type": "video",
          "quality_score": "4",
          "video_transcript": "Welcome to our holiday sale! Get 50% off all items this weekend only. Shop now at example.com"
        }
      }
    }
  ]
}
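
For multi-modal tasks like the one above, downstream code often needs a list of the referenced assets. The Python sketch below gathers every image and video field from the context of each item. collect_assets is a hypothetical helper, and the sample is a trimmed version of Example 3.

```python
def collect_assets(task):
    """Return (field_name, type, assetId, url) for each image or video in context."""
    assets = []
    for item in task.get("items", []):
        for name, value in item.get("context", {}).items():
            if isinstance(value, dict) and value.get("type") in ("image", "video"):
                assets.append((name, value["type"], value.get("assetId"), value.get("url")))
    return assets


# Trimmed version of the Example 3 export.
task = {
    "id": "task-789",
    "items": [
        {
            "__id": 20,
            "context": {
                "__id": 21,
                "title": "Holiday Campaign Ad",
                "thumbnail": {
                    "type": "image",
                    "assetId": "asset-100",
                    "url": "https://assets.sama.com/api/v1/assets/asset-100",
                },
                "main_video": {
                    "type": "video",
                    "assetId": "asset-101",
                    "url": "https://assets.sama.com/api/v1/assets/asset-101",
                },
            },
            "annotations": {"__id": 26},
        }
    ],
}
for asset in collect_assets(task):
    print(asset)
```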