Categorization Export JSON Format
Updated on January 23, 2026
1. Overview
This document describes the JSON export format for categorization projects in the annotation store. The export format transforms the internal flat annotation structure into a hierarchical JSON format optimized for customer delivery and downstream processing.
Key features include:
- Hierarchical structure: context (reference data) and annotations
- Multi-item support: Multiple categorization items per task
- Flexible keying: Annotations keyed by class name or name attribute (for dynamic or dynamic-like taxonomies)
- Delivery controls: Visibility rules for sensitive attributes
- Asset handling: Structured support for text, images, and videos
2. How to Read the JSON Export (Hierarchy & IDs)
This section provides a high-level overview of how the JSON export is structured and how different levels relate to each other.
Hierarchy and Levels
The JSON export is hierarchical. Relationships between objects are defined by JSON nesting, not by referencing IDs. Each task contains one or more items, and each item contains context data and annotations.
Task
└─ items[]
   ├─ context
   │  └─ fields (text, image, video)
   └─ annotations
      └─ annotation objects
Understanding __id Fields
The __id field appears throughout the export (items, context, annotations, and individual annotation entries). These values are internal object identifiers used for traceability and feedback workflows.
- __id values do not carry business meaning
- __id values are not stable across exports
- Clients should not use __id values for joins or downstream logic
Clients should treat __id as opaque metadata and rely on JSON structure and field names instead.
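Because __id values change between exports, comparing two exports of the same task is easier once those fields are removed. The following is a minimal sketch, not part of the export tooling; the helper name strip_ids is hypothetical.

```python
from typing import Any


def strip_ids(node: Any) -> Any:
    """Return a copy of `node` with every "__id" key removed at any depth."""
    if isinstance(node, dict):
        return {k: strip_ids(v) for k, v in node.items() if k != "__id"}
    if isinstance(node, list):
        return [strip_ids(v) for v in node]
    return node


# Illustrative item, shaped like the items described in this document.
item = {
    "__id": 1,
    "context": {"__id": 2, "SKU": "ABC123"},
    "annotations": {"__id": 6, "height": {"__id": 7, "mapping": "Height"}},
}
cleaned = strip_ids(item)
print(cleaned)
```

Only __id is dropped here; other double-underscore fields such as __type and __order are kept because they may still be useful to a reader of the export.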
Keys vs IDs in Annotations
Within the annotations object, the JSON property names (for example, "height" or "content_safe") represent the semantic meaning of the annotation. These keys should be used by clients to interpret the data.
The accompanying __id fields exist only for internal tracking and should be ignored for data interpretation.
Groups and Taxonomy Structure
Although taxonomies may include groups, classes, and nested relationships internally, the export JSON does not preserve group or parent-child taxonomy nesting. Only final, exportable annotation values are included in the output.
This ensures the export remains simple and focused on usable annotation results.
3. JSON Structure Overview
Top-Level Task Structure
{
  "id": "1234567rtyy9876532",
  "primary_keys": ["key1", "key2"],
  "metadata": {
    "custom_field": "value"
  },
  "items": [
    {
      "__id": 1,
      "context": { ... },
      "annotations": { ... }
    }
  ]
}
| Field | Type | Description |
| id | string | Unique task identifier |
| primary_keys | string[] | Optional customer-provided keys |
| metadata | object | Optional task-level metadata |
| items | array | Categorization items (can be multiple per task) |
Item Structure
Each item contains:
- __id: Internal object ID for feedback linking
- context: Reference data presented to annotators
- annotations: Output data from annotators
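A task can be read with any JSON parser and walked item by item. The sketch below assumes one task per JSON document, as in the structure above; iter_items is a hypothetical helper name, and the sample values are illustrative only.

```python
import json
from typing import Any, Iterator


def iter_items(task: dict[str, Any]) -> Iterator[tuple[str, dict, dict]]:
    """Yield (task_id, context, annotations) for every item in a task."""
    for item in task.get("items", []):
        yield task["id"], item.get("context", {}), item.get("annotations", {})


raw = """
{
  "id": "task-123",
  "primary_keys": ["SKU-123"],
  "metadata": {},
  "items": [
    {"__id": 1, "context": {"__id": 2, "SKU": "ABC123"}, "annotations": {"__id": 6}},
    {"__id": 3, "context": {"__id": 4, "SKU": "DEF456"}, "annotations": {"__id": 9}}
  ]
}
"""
task = json.loads(raw)

# primary_keys and metadata are optional, so read them defensively.
primary_keys = task.get("primary_keys", [])
rows = [(task_id, ctx["SKU"]) for task_id, ctx, _ in iter_items(task)]
print(primary_keys, rows)
```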
3. Context Object
The context object contains the reference data shown to annotators. It's structured as a map of named fields.
Structure
"context": {
  "__id": 2,
  "__order": ["field_name", "another_field"],
  "field_name": "value or object",
  "another_field": "value"
}
The __order field is an array of strings listing the keys in the order they appear in the annotation UI.
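To display context fields in UI order, a consumer can follow __order when it is present. This is a minimal sketch: falling back to the JSON key order when __order is absent is an assumption, not documented behavior, and ordered_context_fields is a hypothetical helper name.

```python
def ordered_context_fields(context: dict) -> list[tuple[str, object]]:
    """Return (name, value) pairs in UI order, skipping "__"-prefixed metadata."""
    names = [k for k in context if not k.startswith("__")]
    order = context.get("__order")
    if order:
        # Listed keys come first; any unlisted keys keep their JSON order after them.
        listed = [k for k in order if k in context]
        names = listed + [k for k in names if k not in listed]
    return [(k, context[k]) for k in names]


context = {
    "__id": 2,
    "__order": ["description", "SKU"],
    "SKU": "1234567890",
    "description": "Product description text",
}
fields = ordered_context_fields(context)
print(fields)
```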
Context Field Types
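The field types in this section take two shapes: a bare string for plain text, or an object whose type is "text", "image", or "video". A minimal sketch for telling them apart follows; field_kind and the "plain_text" and "unknown" labels are hypothetical names chosen for illustration.

```python
def field_kind(value: object) -> str:
    """Classify a context field value by its shape."""
    if isinstance(value, str):
        return "plain_text"
    if isinstance(value, dict) and value.get("type") in {"text", "image", "video"}:
        return value["type"]
    # Anything else is not covered by this document; report it rather than guess.
    return "unknown"


kinds = {
    "SKU": field_kind("1234567890"),
    "product_details": field_kind({"type": "text", "format": "markdown", "text": "**bold**"}),
    "main_image": field_kind({"type": "image", "metadata": {"width": 1920, "height": 1080}}),
    "demo_video": field_kind({"type": "video", "metadata": {"duration": 120.5}}),
}
print(kinds)
```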
Plain Text
{
  "SKU": "1234567890",
  "description": "Product description text"
}
Formatted Text
"product_details": {
  "type": "text",
  "format": "markdown",
  "text": "This is a **bold** product with *italic* features"
}
Images
"main_image": {
  "type": "image",
  "assetId": "7f3a9c2e-4b81-9d6a-b7e2-5c91e8a4d0f3",
  "url": "https://assets.sama.com/api/v1/assets/123e4567...",
  "metadata": {
    "width": 1920,
    "height": 1080,
    "originalUrl": "https://example.com/product.jpg"
  }
}
Videos
"demo_video": {
  "type": "video",
  "assetId": "7f3a9c2e-4b81-9d6a-b7e2-5c91e8a4d0f3",
  "url": "https://assets.sama.com/api/v1/assets/123e4567...",
  "metadata": {
    "duration": 120.5,
    "width": 1920,
    "height": 1080,
    "framesPerSecond": 30,
    "originalUrl": "https://example.com/demo.mp4"
  }
}
4. Annotations Object
The annotations object contains the output from annotators. The structure depends on the project taxonomy configuration (shouldDisplayAttributesInSameGroup).
Keying Strategies
By Class Name (Non-Dynamic Projects)
"annotations": {
  "__id": 6,
  "product_category": {
    "__id": 7,
    "category": "Electronics",
    "subcategory": "Smartphones"
  },
  "quality_assessment": {
    "__id": 8,
    "condition": "New",
    "defects": []
  }
}
By Name Attribute (Dynamic-like Projects)
"annotations": {
  "__id": 6,
  "product_height": {
    "__id": 7,
    "__type": "dimension_attribute",
    "value": "15",
    "unit": "cm"
  },
  "brand_name": {
    "__id": 8,
    "__type": "text_attribute",
    "value": "Samsung"
  }
}
Annotation Fields
| Field | Type | When Present | Description |
| __id | number | Always | Object ID for feedback linking |
| __order | string[] | Internal use only | Order of annotation keys |
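Both keying strategies can be read the same way: every property of annotations that does not start with a double underscore is an annotation entry. The sketch below flattens entries into rows; annotation_rows and the row layout are hypothetical, and treating every "__"-prefixed key as metadata is an assumption based on the fields shown in this document.

```python
def annotation_rows(annotations: dict) -> list[dict]:
    """Flatten an annotations object into one row per annotation entry."""
    rows = []
    for key, entry in annotations.items():
        if key.startswith("__") or not isinstance(entry, dict):
            continue  # skip __id, __order, and anything that is not an entry
        values = {k: v for k, v in entry.items() if not k.startswith("__")}
        # __type appears in the dynamic-like example; it is None when absent.
        rows.append({"key": key, "type": entry.get("__type"), "values": values})
    return rows


annotations = {
    "__id": 6,
    "product_height": {"__id": 7, "__type": "dimension_attribute", "value": "15", "unit": "cm"},
    "brand_name": {"__id": 8, "__type": "text_attribute", "value": "Samsung"},
}
rows = annotation_rows(annotations)
for row in rows:
    print(row)
```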
Delivery Visibility
Attributes marked with deliveryVisibility: "private" in the taxonomy are excluded from the export unless explicitly overridden (for example, when an internal user is using the API to fetch this data).
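A client that also has the taxonomy can list which attribute names it should not expect in a customer export. This is a minimal sketch assuming the taxonomy shape used in the examples of this document (objects, optional group children, and attributes); private_attribute_names is a hypothetical helper, and treating attributes without a deliveryVisibility value as not private is an assumption.

```python
def private_attribute_names(taxonomy: dict) -> set[str]:
    """Collect names of attributes marked deliveryVisibility: "private"."""
    private: set[str] = set()
    stack = list(taxonomy.get("objects", []))
    while stack:
        node = stack.pop()
        stack.extend(node.get("children", []))  # groups nest classes under "children"
        for attr in node.get("attributes", []):
            if attr.get("deliveryVisibility") == "private":
                private.add(attr["name"])
    return private


taxonomy = {
    "objects": [
        {
            "__type": "group",
            "children": [
                {
                    "__type": "class",
                    "name": "text_attribute",
                    "attributes": [
                        {"name": "name", "deliveryVisibility": "private"},
                        {"name": "mapping", "deliveryVisibility": "public"},
                    ],
                }
            ],
        }
    ]
}
hidden = private_attribute_names(taxonomy)

# Sanity check on an exported entry: no private attribute should appear as a key.
entry = {"__id": 7, "__type": "text_attribute", "mapping": "Height"}
leaked = hidden & set(entry)
print(hidden, leaked)
```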
5. Complete Examples with Taxonomies
In our system, all internal __id fields in the taxonomy, as well as the asset IDs, are UUIDs. In the examples below they are shown as readable strings so that relationships are easier to see.
Example 1: Dynamic-like Product Categorization
Taxonomy Definition
{
  "__id": "dynamic-taxonomy-v1",
  "__type": "taxonomy",
  "type": "categorization",
  "shouldDisplayAttributesInSameGroup": true,
  "errorCodes": [],
  "scene": [],
  "objects": [
    {
      "__id": "text-attr",
      "__type": "class",
      "name": "text_attribute",
      "label": "Text Attribute",
      "attributes": [
        {
          "__id": "attr-name",
          "__type": "text_attribute",
          "name": "name",
          "role": "name",
          "textType": "short-text",
          "deliveryVisibility": "private"
        },
        {
          "__id": "attr-desc",
          "__type": "text_attribute",
          "name": "description",
          "role": "description",
          "textType": "long-text",
          "deliveryVisibility": "private"
        },
        {
          "__id": "attr-no-mapping",
          "__type": "boolean_attribute",
          "name": "no_direct_mapping",
          "deliveryVisibility": "public",
          "options": {
            "true": { "label": "Yes", "value": "true", "isDefault": false },
            "false": { "label": "No", "value": "false", "isDefault": true }
          }
        },
        {
          "__id": "attr-mapping-opts",
          "__type": "dynamic_list_attribute",
          "name": "mapping_options",
          "deliveryVisibility": "private"
        },
        {
          "__id": "attr-mapping",
          "__type": "single_selection_from_dynamic_list_attribute",
          "name": "mapping",
          "listAttributeId": "attr-mapping-opts",
          "deliveryVisibility": "public"
        }
      ]
    }
  ]
}
Exported JSON (with deliveryVisibility respected)
Note that all attributes that are marked as deliveryVisibility: "private" in the taxonomy are excluded from the export.
{
  "id": "task-123",
  "primary_keys": ["SKU-123"],
  "metadata": {},
  "items": [
    {
      "__id": 1,
      "context": {
        "__id": 2,
        "SKU": "ABC123",
        "product_name": "Smartphone Case",
        "product_image": {
          "type": "image",
          "assetId": "asset-456",
          "url": "https://assets.sama.com/api/v1/assets/asset-456",
          "metadata": {
            "originalUrl": "https://store.example.com/case.jpg"
          }
        }
      },
      "annotations": {
        "__id": 6,
        "height": {
          "__id": 7,
          "__type": "text_attribute",
          "no_direct_mapping": "false",
          "mapping": "Height"
        },
        "size": {
          "__id": 8,
          "__type": "list_attribute",
          "no_direct_mapping": "true",
          "extracted_value": "M"
        }
      }
    }
  ]
}
Example 2: Content Moderation with Class Groups
Taxonomy Definition
{
  "__id": "moderation-taxonomy-v1",
  "__type": "taxonomy",
  "type": "categorization",
  "shouldDisplayAttributesInSameGroup": false,
  "errorCodes": [],
  "scene": [],
  "objects": [
    {
      "__id": "content-group",
      "__type": "group",
      "label": "Content Classification",
      "children": [
        {
          "__id": "safe-class",
          "__type": "class",
          "name": "content_safe",
          "label": "Safe Content",
          "attributes": [
            {
              "__id": "conf-attr",
              "__type": "single_choice_attribute",
              "name": "confidence",
              "deliveryVisibility": "public",
              "options": [
                { "label": "High", "value": "high", "isDefault": true },
                { "label": "Medium", "value": "medium", "isDefault": false },
                { "label": "Low", "value": "low", "isDefault": false }
              ]
            }
          ]
        }
      ]
    }
  ]
}
Exported JSON (internal export to show __mutable)
Note that a non-internal user will not see the __mutable: true field.
{
  "id": "task-456",
  "primary_keys": [],
  "metadata": {
    "source": "user_upload",
    "timestamp": "2025-10-16T10:30:00Z"
  },
  "items": [
    {
      "__id": 10,
      "context": {
        "__id": 11,
        "__order": ["content_id", "text_content", "image_content"],
        "content_id": "post-789",
        "text_content": {
          "type": "text",
          "format": "plaintext",
          "text": "Check out this amazing sunset photo I took!"
        },
        "image_content": {
          "type": "image",
          "assetId": "asset-789",
          "url": "https://assets.sama.com/api/v1/assets/asset-789",
          "metadata": {
            "originalUrl": "https://social.example.com/sunset.jpg"
          }
        }
      },
      "annotations": {
        "__id": 15,
        "__order": ["content_safe"],
        "content_safe": {
          "__id": 16,
          "__mutable": true,
          "confidence": "high"
        }
      }
    }
  ]
}
Example 3: Multi-modal Categorization
Exported JSON
{
  "id": "task-789",
  "primary_keys": ["content-abc-123"],
  "metadata": {
    "campaign": "holiday-2025"
  },
  "items": [
    {
      "__id": 20,
      "context": {
        "__id": 21,
        "title": "Holiday Campaign Ad",
        "description": {
          "type": "text",
          "format": "markdown",
          "text": "Review this **holiday campaign** ad for quality and compliance.\n\n* Check video quality\n* Verify audio clarity\n* Assess overall message"
        },
        "thumbnail": {
          "type": "image",
          "assetId": "asset-100",
          "url": "https://assets.sama.com/api/v1/assets/asset-100",
          "metadata": {
            "originalUrl": "https://cdn.example.com/thumb.jpg"
          }
        },
        "main_video": {
          "type": "video",
          "assetId": "asset-101",
          "url": "https://assets.sama.com/api/v1/assets/asset-101",
          "metadata": {
            "originalUrl": "https://cdn.example.com/holiday-ad.mp4"
          }
        }
      },
      "annotations": {
        "__id": 26,
        "media_analysis": {
          "__id": 27,
          "primary_media_type": "video",
          "quality_score": "4",
          "video_transcript": "Welcome to our holiday sale! Get 50% off all items this weekend only. Shop now at example.com"
        }
      }
    }
  ]
}
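For multi-modal tasks like this one, a downstream pipeline often needs a flat list of the media referenced in the context. The following is a minimal sketch under the structure shown above; collect_assets and the output row layout are hypothetical, and no download or authentication step is implied.

```python
def collect_assets(task: dict) -> list[dict]:
    """List every image and video referenced in the context of a task's items."""
    assets = []
    for item in task.get("items", []):
        for name, value in item.get("context", {}).items():
            # Asset fields are objects whose "type" is "image" or "video".
            if isinstance(value, dict) and value.get("type") in ("image", "video"):
                assets.append({
                    "task_id": task["id"],
                    "field": name,
                    "type": value["type"],
                    "asset_id": value.get("assetId"),
                    "url": value.get("url"),
                })
    return assets


task = {
    "id": "task-789",
    "items": [
        {
            "__id": 20,
            "context": {
                "__id": 21,
                "title": "Holiday Campaign Ad",
                "thumbnail": {"type": "image", "assetId": "asset-100",
                              "url": "https://assets.sama.com/api/v1/assets/asset-100"},
                "main_video": {"type": "video", "assetId": "asset-101",
                               "url": "https://assets.sama.com/api/v1/assets/asset-101"},
            },
            "annotations": {"__id": 26},
        }
    ],
}
assets = collect_assets(task)
for asset in assets:
    print(asset["field"], asset["type"], asset["url"])
```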