The input metadata provides key details about the image, including its size, height, and width. This metadata is available only when the 'Include Image Metadata' option is enabled.
📘 Note
You can enable this feature by navigating to 'Project > Settings > Outputs > Pre-processing Settings'.
Ensuring this setting is on will allow you to access the full range of image information.
{
"data":{
"image Original url":"https:///IMG_2299.JPG",
"image Original file name":"abc123.jpg",
"image Size":"1452322",
"image Height":"2448",
"image Width":"3264"
}
}
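As a minimal sketch of how a consumer might read this metadata, the Python snippet below parses the example above and converts the numeric fields, which arrive as strings, to integers. The function name `read_image_metadata` and the returned key names are illustrative assumptions, not part of the platform.

```python
import json

# Sample record copied from the example above; all values arrive as strings.
raw = """
{
  "data": {
    "image Original url": "https:///IMG_2299.JPG",
    "image Original file name": "abc123.jpg",
    "image Size": "1452322",
    "image Height": "2448",
    "image Width": "3264"
  }
}
"""

def read_image_metadata(record):
    """Return numeric image metadata, or an empty dict if it is absent."""
    data = record.get("data", {})
    keys = {"size": "image Size", "height": "image Height", "width": "image Width"}
    if not all(key in data for key in keys.values()):
        return {}  # the metadata option was probably not enabled for this export
    return {name: int(data[key]) for name, key in keys.items()}

metadata = read_image_metadata(json.loads(raw))
print(metadata)
```

Returning an empty dict when the keys are missing lets calling code handle exports made without the metadata option.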
Input
The 'data' element lists the inputs defined in the project settings. In this example, the defined inputs are 'name' and 'url', so the 'data' element references these inputs exactly as they are configured in the project.
Scene outputs reflect the entire workspace. In this instance, elements like 'image', 'date', and 'comments', prefixed with 'output_', are examples of scene outputs. These outputs also encompass the overall scene context, such as indicating whether it's day or night, or whether the weather is sunny or rainy.
{
"output_image":{
"layers":{
"vector_tagging":[
{
"shapes":[
]
}
]
}
},
"output_date":"2025-05-07",
"output_comments":"This is a test comment"
}
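One way to pick out the scene outputs is to filter on the 'output_' prefix. The sketch below does this for the example above. The helper name `scene_outputs` is hypothetical, and the sketch assumes every scene output key carries that prefix.

```python
import json

# The scene-output example from this section.
raw = """
{
  "output_image": {"layers": {"vector_tagging": [{"shapes": []}]}},
  "output_date": "2025-05-07",
  "output_comments": "This is a test comment"
}
"""

def scene_outputs(record, prefix="output_"):
    """Map each prefixed key to its value, with the prefix stripped."""
    return {
        key[len(prefix):]: value
        for key, value in record.items()
        if key.startswith(prefix)
    }

outputs = scene_outputs(json.loads(raw))
print(sorted(outputs))
```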
Layers
The output is designed with a primary layer that acts as a wrapper, containing all the annotation details pertinent to a particular task. Nested within this primary layer is a specialized sub-layer named 'vector_tagging'. This layer functions as a dedicated storage for diverse geometric shapes employed in vector annotations, including lines, polygons, and points.
Every shape drawn on an image is represented as a 'shape output'. These shape outputs are also known as 'nested outputs' due to their structural positioning. They are either 'nested' within a workspace output or manifest as a nested object within the JSON output file.
The workspace is the canvas where annotations are made, and the shapes drawn on it are the workspace's outputs. The workspace can be configured as either an 'image' or a 'video', accommodating different types of visual data.
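The nesting described here (workspace output, then 'layers', then 'vector_tagging', then 'shapes') can be walked with a small generator. This is a sketch only: `iter_shapes` is a hypothetical helper, and the shape entries in the sample record are placeholders rather than real export data.

```python
def iter_shapes(record, workspace_key="output_image"):
    """Yield every shape stored under the workspace's 'vector_tagging' layer."""
    layers = record.get(workspace_key, {}).get("layers", {})
    for group in layers.get("vector_tagging", []):
        yield from group.get("shapes", [])

# Hypothetical record: the shape entries are placeholders for illustration.
record = {
    "output_image": {
        "layers": {"vector_tagging": [{"shapes": [{"index": 1}, {"index": 2}]}]}
    }
}

shapes = list(iter_shapes(record))
print(len(shapes))
```

Using `.get` with defaults means a record without a workspace output simply yields no shapes instead of raising.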
Multi-level menus consist of a hierarchy of nested menus. In our JSON format representation, this hierarchical structure is denoted using the '|' character. For instance, in the 'object' output provided as an example, 'color' represents the first level of the hierarchy, while 'multiple colors' signifies the second level. This notation effectively communicates the layered organization of the menu options.
{
"object":"color|multiple colors"
}
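To recover the individual menu levels, the value can be split on the '|' character. A minimal sketch, in which `menu_levels` is an illustrative name:

```python
def menu_levels(value):
    """Split a multi-level menu value into its hierarchy levels."""
    return [level.strip() for level in value.split("|")]

record = {"object": "color|multiple colors"}
levels = menu_levels(record["object"])
print(levels)
```

The first list element is the top level of the hierarchy and each following element is one level deeper.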
Dropdown
The syntax for a dropdown output is structured simply as 'key': 'value'. For example, a dropdown can be found within the 'category' tag in this scenario.
{
"category":"truck"
}
Radio Button
The syntax used for a radio button output adopts a straightforward 'key': 'value' format. In the given example, a radio button is located under the 'position' tag.
{
"position":"forward"
}
Checkbox
The syntax for a checkbox output is structured as 'key': { 'option1': '0', 'option2': '1' }, where '1' marks a selected option and '0' an unselected one. Examples of checkboxes can be found under the 'road side' tag.
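A short sketch of reading a checkbox output: it keeps only the options whose flag is '1'. The option names and flag values in the sample are hypothetical, and `selected_options` is an illustrative helper.

```python
def selected_options(checkbox):
    """Return the names of the options flagged with '1'."""
    return [option for option, flag in checkbox.items() if str(flag) == "1"]

# Hypothetical 'road side' value, following the 'key': {'option': '0' or '1'} syntax.
record = {"road side": {"left": "0", "none": "0", "right": "1"}}
chosen = selected_options(record["road side"])
print(chosen)
```

Comparing via `str(flag)` tolerates flags exported as either the string '1' or the number 1.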
The date output uses the year-month-day format (YYYY-MM-DD), for example '2025-05-07'.
{
"output_date":"2025-05-07"
}
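Assuming the value is always written as YYYY-MM-DD, as in the example, it can be parsed with Python's standard library. This is a sketch, not a statement about how the platform validates dates.

```python
from datetime import date

record = {"output_date": "2025-05-07"}

# fromisoformat raises ValueError if the string is not a valid YYYY-MM-DD date.
parsed = date.fromisoformat(record["output_date"])
print(parsed.year, parsed.month, parsed.day)
```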
Text area
The syntax for a text area output uses a straightforward 'key': 'value' format. In this structure, the value can represent a paragraph of text.
{
"output_comments":"This is a test comment"
}
Shapes
Slider-rectangle
The "slide rectangle" is a hybrid geometric shape that merges 2D and 3D aspects.
It follows the structure:
[(x1,y1), (x3,y1), (x1,y2), (x3,y2), (x2,y1)]
The fifth point is denoted by the coordinates (x2,y1). This point, unlike the others, acts as a vertex on a line, introducing a unique three-dimensional characteristic to the rectangle.
📘 Note
The additional vertex is always the fifth element in the list; in this example, it corresponds to [2319, 1650].
This additional vertex suggests that the slide rectangle extends beyond the flat plane, pointing towards or away from the viewer, thus adding depth to the shape. Essentially, the line represented by this fifth point is a key to understanding the slide rectangle’s placement and orientation in a three-dimensional space. It implies that the rectangle is not just a flat shape but a face or a part of a larger 3D object.
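The five-point structure can be separated into the four rectangle corners and the additional vertex. In the sketch below, only the fifth point [2319, 1650] comes from the note above; the four corner coordinates and the helper name `split_slide_rectangle` are hypothetical.

```python
def split_slide_rectangle(points):
    """Separate the four rectangle corners from the additional fifth vertex."""
    if len(points) != 5:
        raise ValueError("a slide rectangle is expected to have five points")
    corners, extra = points[:4], points[4]
    y1 = corners[0][1]
    return {
        "corners": corners,
        "extra_vertex": extra,
        # True when the fifth point shares y1 with the first corner, as in (x2, y1).
        "extra_on_y1": extra[1] == y1,
    }

# Hypothetical corner coordinates; only the fifth point is taken from the note.
points = [[2100, 1650], [2500, 1650], [2100, 1900], [2500, 1900], [2319, 1650]]
parts = split_slide_rectangle(points)
print(parts["extra_vertex"], parts["extra_on_y1"])
```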
An arrow is defined using two sets of coordinates, which represent its start and end points:
- The starting point of the arrow is at the coordinates (926, 1147).
- The ending point of the arrow is at the coordinates (2929, 1071).
This simple structure indicates direction or movement from one point to another.
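The direction encoded by the two arrow points can be computed as a vector and an angle. This sketch uses the coordinates quoted above and assumes pixel coordinates in which y grows downward, which the text does not state explicitly.

```python
import math

start, end = (926, 1147), (2929, 1071)

# Direction vector from the start point to the end point.
dx, dy = end[0] - start[0], end[1] - start[1]

length = math.hypot(dx, dy)
# Angle relative to the positive x-axis; with y growing downward (an assumption),
# a negative angle means the arrow points slightly upward on screen.
angle_deg = math.degrees(math.atan2(dy, dx))

print(dx, dy, round(length, 1), round(angle_deg, 1))
```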
A "cuboid" is defined using two sets of points: "points" and "key points".
- Points: The 'points' list consists of several pairs of coordinates. Each pair represents a corner of the cuboid. From the given data, these points are likely to be the corners of the front and back faces of the cuboid. For example, the first pair of coordinates (1465, 1826) is a corner of the front face, and so on for the rest of the points.
- Key Points: These denote the horizon, depth, and height of the cuboid. However, "key points" are not explicitly listed under a separate category. They might be inferred from the 'points' data, indicating the dimensions and orientation of the cuboid in space.
- Dimensions and Orientation: The coordinates in 'points' help define the spatial dimensions and orientation of the cuboid. By analyzing these points, one can determine the length, width, and height of the cuboid and how it is positioned in the given space.
A point represents an exact location in space. It has no dimensions (no length, width, or height) and is defined by a set of coordinates that specify its position, such as on a plane. In this JSON, the point is defined by the coordinates (1153, 1600). This pair of numbers gives the position of the point in a two-dimensional space, likely a graphical or digital plane, where 1153 is the position along the horizontal axis (x-coordinate) and 1600 is the position along the vertical axis (y-coordinate).
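A simple sanity check on a point is whether it falls inside the image. The sketch below assumes the point is expressed in pixels with the origin at the top-left corner, and that it belongs to the 3264 x 2448 image from the metadata example; neither assumption is confirmed by the text.

```python
def point_in_image(point, width, height):
    """Return True when the point lies within the image bounds."""
    x, y = point
    return 0 <= x < width and 0 <= y < height

point = (1153, 1600)
# Width and height taken from the image metadata example (assumed to match).
inside = point_in_image(point, width=3264, height=2448)
print(inside)
```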
In geometry, a line is a straight one-dimensional figure that extends infinitely in both directions and has no thickness; the segment between two points is the shortest path between them. In this JSON, the line is defined by two points, (2371, 2093) and (1989, 2443), which are the endpoints of a line segment in a two-dimensional space. The first point (2371, 2093) represents one end of the segment, and the second point (1989, 2443) represents the other end.
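Given the two endpoints, the segment's length and midpoint follow directly. A minimal sketch using the coordinates quoted above, with results in the same units as the coordinates:

```python
import math

p1, p2 = (2371, 2093), (1989, 2443)

# Euclidean distance between the two endpoints.
length = math.hypot(p2[0] - p1[0], p2[1] - p1[1])

# Midpoint of the segment.
midpoint = ((p1[0] + p2[0]) / 2, (p1[1] + p2[1]) / 2)

print(round(length, 1), midpoint)
```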
A rectangle is a four-sided shape where each side is at a right angle to the adjacent sides. It is defined by four coordinates, arranged as [[x1,y1],[x1,y2],[x2,y1],[x2,y2]], where (x1,y1) and (x2,y2) are diagonally opposite corners. This structure ensures that opposite sides of the rectangle are parallel and equal in length, and that all angles are right angles. In this JSON, the rectangle is represented by the coordinates (2869, 7), (3263, 7), (2869, 674), and (3263, 674), which correspond to its four corners.
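The width and height of the rectangle can be derived from its corners. The sketch below takes the minimum and maximum of each axis, so it does not depend on the order in which the four corners are listed. `rectangle_bounds` is an illustrative helper name.

```python
def rectangle_bounds(corners):
    """Return (x_min, y_min, x_max, y_max) for a list of corner points."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return min(xs), min(ys), max(xs), max(ys)

corners = [(2869, 7), (3263, 7), (2869, 674), (3263, 674)]
x_min, y_min, x_max, y_max = rectangle_bounds(corners)
width, height = x_max - x_min, y_max - y_min

print(width, height)
```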
A keypoint is a notable point within an image. By connecting multiple keypoints, one can construct a skeletal framework that helps establish relationships between different parts of the object. In this JSON example, the keypoints represent:
- (2705, 1746): Starting point of the shape.
- (2699, 1801): Close to the first point; it corresponds to an adjacent joint or key feature.
- (2606, 1921): This point indicates an extremity.
- (2784, 1920): At nearly the same height as the third point but on the opposite side; it indicates the other extremity, adding symmetry and balance to the structure.
- (2694, 1952): Positioned centrally among the other points, representing a central joint of the shape.
- (2671, 2104) and (2729, 2107): These points are lower, indicating the lowest extremities of the shape.
A polygon is a two-dimensional shape defined by a series of points connected by straight lines. The number of points can vary, reflecting the complexity of the polygon. In the JSON example, the polygon is quite complex, as indicated by the large number of points listed. The polygon is formed by connecting 23 points in the order they are listed. Each point is represented by a pair of coordinates, likely representing positions in a two-dimensional plane.
The points in the JSON example are: (1982, 1788) (1976, 1789) (1977, 1801) ... (1981, 1788)
The structure of this polygon is created by drawing straight lines between each consecutive pair of points, starting from the first point and ending at the last point, with the final point (1981, 1788) likely connecting back to the first point (1982, 1788) to close the shape.
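Because the polygon is an ordered list of points whose last edge connects back to the first point, its area can be computed with the shoelace formula. The sketch below uses a small hypothetical polygon, since the full 23-point list is not reproduced here.

```python
def polygon_area(points):
    """Area of a simple (non-self-intersecting) polygon via the shoelace formula."""
    total = 0
    for i, (x_a, y_a) in enumerate(points):
        # The modulo wraps the last point back to the first, closing the shape.
        x_b, y_b = points[(i + 1) % len(points)]
        total += x_a * y_b - x_b * y_a
    return abs(total) / 2

# Hypothetical 4 x 3 rectangle listed in drawing order.
sample = [(0, 0), (4, 0), (4, 3), (0, 3)]
area = polygon_area(sample)
print(area)
```

Taking the absolute value makes the result independent of whether the points are listed clockwise or counterclockwise.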
The "index" refers to a unique identifier assigned to each shape, used to distinguish it from others in a sequence. It is a sequential number, meaning each shape is given a consecutive number based on its order in the sequence.
{
"index":1
}
Tags
Tags are descriptive attributes assigned to a shape to provide additional information or classification. Depending on how the project is set up, tags can take different formats:
1. Multi-Level Menu ("Object"): This tag uses a hierarchical format, like "color|multiple colors," indicating categories and subcategories.
2. Dropdown ("Category"): A dropdown format, such as "bridge," offers a selection from predefined options.
3. Radio Button ("Position"): This format, like "forward," allows for choosing one option from a set.
4. Checkbox-Style ("Road Side"): A nested structure with binary choices (0 or 1) to indicate features like "left," "none," or "right" roadside.
These varied tag formats provide detailed and specific information about the object, facilitating nuanced classification and understanding.