This covers the scenario where a 3D cuboid drawn in the point cloud must be linked with its corresponding bounding box drawn on the image.
Calibration plays an important role here: given the sensor extrinsics and the camera intrinsic parameters, the cuboid can be projected onto the image.
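Such a projection can be sketched as follows. This is a minimal pinhole-model sketch that ignores the distortion coefficients and assumes sensor_pose gives the camera's pose in the lidar frame; quat_to_rot and project_point are illustrative helpers, not part of the API:

```python
import numpy as np

def quat_to_rot(w, x, y, z):
    """Convert a unit quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

def project_point(point_lidar, sensor_pose, fx, fy, cx, cy):
    """Project a 3D point in the lidar frame onto the camera image plane.

    Assumes sensor_pose is the camera's pose expressed in the lidar frame,
    so the inverse rigid transform maps lidar points into camera coordinates.
    """
    R = quat_to_rot(**sensor_pose["heading"])
    t = np.array([sensor_pose["position"][k] for k in ("x", "y", "z")])
    p_cam = R.T @ (np.asarray(point_lidar, dtype=float) - t)  # lidar -> camera
    u = fx * p_cam[0] / p_cam[2] + cx  # pinhole projection, no distortion
    v = fy * p_cam[1] / p_cam[2] + cy
    return u, v
```

Projecting all eight corners of a cuboid this way and taking the min/max of the resulting pixel coordinates gives the linked 2D bounding box.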

{
  "reference_id": "001",
  "data": {
    "sensor_data": {
      "frames": [
        {
          "sensors": [
            {
              "sensor_id": "lidar",
              "data_url": "",
              "sensor_pose": {
                "position": {
                  "x": 0,
                  "y": 0,
                  "z": 0
                },
                "heading": {
                  "w": 1,
                  "x": 0,
                  "y": 0,
                  "z": 0
                }
              }
            },
            {
              "sensor_id": "18158562",
              "data_url": "",
              "sensor_pose": {
                "position": {
                  "x": -0.8141599005737696,
                  "y": 1.6495307329711615,
                  "z": -1.5230365881437538
                },
                "heading": {
                  "w": 0.6867388282287469,
                  "x": 0.667745267519749,
                  "y": -0.21162707775631337,
                  "z": 0.19421642430111224
                }
              }
            }
          ],
          "ego_pose": {},
          "frame_id": "0001"
        },
        {
          "sensors": [
            {
              "sensor_id": "lidar",
              "data_url": "",
              "sensor_pose": {
                "position": {
                  "x": 0,
                  "y": 0,
                  "z": 0
                },
                "heading": {
                  "w": 1,
                  "x": 0,
                  "y": 0,
                  "z": 0
                }
              }
            },
            {
              "sensor_id": "18158562",
              "data_url": "",
              "sensor_pose": {
                "position": {
                  "x": -0.8141599005737696,
                  "y": 1.6495307329711615,
                  "z": -1.5230365881437538
                },
                "heading": {
                  "w": 0.6867388282287469,
                  "x": 0.667745267519749,
                  "y": -0.21162707775631337,
                  "z": 0.19421642430111224
                }
              }
            }
          ],
          "ego_pose": {},
          "frame_id": "0002"
        }
      ],
      "sensor_meta": [
        {
          "id": "lidar",
          "name": "lidar",
          "state": "editable",
          "modality": "lidar",
          "primary_view": true
        },
        {
          "id": "18158562",
          "name": "18158562",
          "state": "editable",
          "modality": "camera",
          "primary_view": false,
          "intrinsics": {
            "cx": 0,
            "cY": 0,
            "fx": 0,
            "fy": 0,
            "k1": 0,
            "k2": 0,
            "k3": 0,
            "k4": 0,
            "p1": 0,
            "p2": 0,
            "skew": 0,
            "scale_factor": 1
          }
        }
      ]
    }
  },
  "tag": "track_3d_bounding_boxes"
}
{
  "data": {
    "job_id": "3f3e8675-ca69-46d7-aa34-96f90fcbb732",
    "reference_id": "001",
    "tag": "Sample-task"
  },
  "success": true
}
import requests
import json


"""
Details for creating JOBS,
project_id ->> ID of project in which job needed to be created
x_api_key ->> secret api key to create JOBS
tag ->> You can ask this from playment side
batch_id ->> The batch in which JOB needed to be created
"""
project_id = ''
x_api_key = ''
tag = ''
batch_id = ''

def Upload_jobs(DATA):
    base_url = f"https://api.playment.io/v0/projects/{project_id}/jobs"
    response = requests.post(base_url, headers={'x-api-key': x_api_key}, json=DATA)
    print(response.json())
    if response.status_code >= 500:
        raise Exception("Something went wrong at Playment's end")
    if 400 <= response.status_code < 500:
        raise Exception(f"Bad request: {response.status_code}")
    return response.json()

def create_batch(batch_name, batch_description):
    headers = {'x-api-key': x_api_key}
    url = 'https://api.playment.io/v1/project/{}/batch'.format(project_id)
    data = {"project_id": project_id, "label": batch_name,
            "name": batch_name, "description": batch_description}
    response = requests.post(url=url, headers=headers, json=data)
    print(response.json())
    if response.status_code >= 500:
        raise Exception("Something went wrong at Playment's end")
    if 400 <= response.status_code < 500:
        raise Exception(f"Bad request: {response.status_code}")
    return response.json()['data']['batch_id']

"""
Defining a sensor: this object contains the sensor's attributes.
:param _id: The sensor's id.
:param name: Name of the sensor.
:param primary_view: Only one sensor can have primary_view set to true.
:param state (optional): If you do not want this sensor to be annotated, set state to non_editable. Default is editable.
:param modality: The type of sensor.
:param intrinsics: For a sensor with camera modality, the sensor intrinsics are required.
                This field should ideally become part of the sensor configuration, and not be sent as part of each Job.
                "cx": principal point x value; default 0
                "cy": principal point y value; default 0
                "fx": focal length in x axis; default 0
                "fy": focal length in y axis; default 0
                "k1": 1st radial distortion coefficient; default 0
                "k2": 2nd radial distortion coefficient; default 0
                "k3": 3rd radial distortion coefficient; default 0
                "k4": 4th radial distortion coefficient; default 0
                "p1": 1st tangential distortion coefficient; default 0
                "p2": 2nd tangential distortion coefficient; default 0
                "skew": camera skew coefficient; default 0
                "scale_factor": The factor by which the image has been downscaled (=2 if original image is twice as
                                large as the downscaled image)
"""

# Name the sensors; you can define these similarly for multiple cameras
lidar_sensor_id = 'lidar'

#Preparing Lidar Sensor
lidar_sensor = {"id": lidar_sensor_id, "name": "lidar", "primary_view": True, "modality": "lidar","state": "editable"}

#Preparing Camera Sensor for camera_1
camera_1_intrinsics = {
    "cx": 0, "cy": 0, "fx": 0, "fy": 0,
    "k1": 0, "k2": 0, "k3": 0, "k4": 0, "p1": 0, "p2": 0, "skew": 0, "scale_factor": 1}

camera_1 = {"id": "camera_1", "name": "camera_1", "primary_view": False, "modality": "camera",
            "intrinsics": camera_1_intrinsics,"state": "editable"}

#Preparing Camera Sensor for camera_2
camera_2_intrinsics = {
    "cx": 0, "cy": 0, "fx": 0, "fy": 0,
    "k1": 0, "k2": 0, "k3": 0, "k4": 0, "p1": 0, "p2": 0, "skew": 0, "scale_factor": 1}

camera_2 = {"id": "camera_2", "name": "camera_2", "primary_view": False, "modality": "camera",
            "intrinsics": camera_2_intrinsics,"state": "editable"}

# SENSOR META: information about the sensors; it is constant across all jobs using the same sensor setup
sensor_meta = [lidar_sensor, camera_1, camera_2]

#Collect frames for every sensor.
lidar_frames = [
    "https://example.com/pcd_url_1",
    "https://example.com/pcd_url_2"
]

camera_1_frames = [
    "https://example.com/image_url_1",
    "https://example.com/image_url_2"
]

camera_2_frames = [
    "https://example.com/image_url_3",
    "https://example.com/image_url_4"
]

#Preparing job creation payload

sensor_data = {"frames": [], "sensor_meta": sensor_meta}

# reference_id uniquely identifies this job
reference_id = "001"

for i in range(len(lidar_frames)):
    # Collect ego_pose if the data is in the world frame of reference
    ego_pose = {
        "heading": {"w": 1, "x": 0, "y": 0, "z": 0},
        "position": {"x": 0, "y": 0, "z": 0}
    }
    frame_obj = {"ego_pose": ego_pose, "frame_id": str(i), "sensors": []}

    # Replace the identity headings and zero positions below with the
    # actual calibration values (quaternion + translation) for each sensor
    lidar_heading = {"w": 1, "x": 0, "y": 0, "z": 0}
    lidar_position = {"x": 0, "y": 0, "z": 0}

    lidar_sensor = {"data_url": lidar_frames[i], "sensor_id": lidar_sensor_id,
                    "sensor_pose": {"heading": lidar_heading, "position": lidar_position}}

    camera_1_heading = {"w": 1, "x": 0, "y": 0, "z": 0}
    camera_1_position = {"x": 0, "y": 0, "z": 0}

    camera_1_sensor = {"data_url": camera_1_frames[i], "sensor_id": 'camera_1',
                       "sensor_pose": {"heading": camera_1_heading, "position": camera_1_position}}

    camera_2_heading = {"w": 1, "x": 0, "y": 0, "z": 0}
    camera_2_position = {"x": 0, "y": 0, "z": 0}

    camera_2_sensor = {"data_url": camera_2_frames[i], "sensor_id": 'camera_2',
                       "sensor_pose": {"heading": camera_2_heading, "position": camera_2_position}}

    frame_obj['sensors'].append(lidar_sensor)
    frame_obj['sensors'].append(camera_1_sensor)
    frame_obj['sensors'].append(camera_2_sensor)

    sensor_data['frames'].append(frame_obj)

job_payload = {"data": {"sensor_data": sensor_data}, "reference_id": reference_id}


data = {"data": job_payload['data'], "reference_id": job_payload['reference_id'], 'tag': tag, "batch_id": batch_id}
def to_dict(obj):
    return json.loads(
        json.dumps(obj, default=lambda o: getattr(o, '__dict__', str(o)))
    )
print(json.dumps(to_dict(data)))
response = Upload_jobs(DATA=data)
# .PCD v0.7 - Point Cloud Data file format 
VERSION 0.7
FIELDS x y z
SIZE 4 4 4
TYPE F F F
COUNT 1 1 1
WIDTH 47286
HEIGHT 1
VIEWPOINT 0 0 0 1 0 0 0
POINTS 47286
DATA ascii
5075.773 3756.887 107.923
5076.011 3756.876 107.865
5076.116 3756.826 107.844
5076.860 3756.975 107.648
5077.045 3756.954 107.605
5077.237 3756.937 107.559
5077.441 3756.924 107.511
5077.599 3756.902 107.474
5077.780 3756.885 107.432
5077.955 3756.862 107.391
...
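A point cloud in this ASCII layout can be read with a few lines of Python. This is a minimal sketch for files with FIELDS x y z and DATA ascii only, not a general PCD parser:

```python
def read_ascii_pcd(path):
    """Parse a simple ASCII .pcd file whose FIELDS are x y z."""
    header = {}
    points = []
    with open(path) as f:
        in_data = False
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip comment lines such as the "# .PCD v0.7" banner
            if in_data:
                points.append(tuple(float(v) for v in line.split()))
            elif line.startswith("DATA"):
                in_data = True  # everything after the DATA line is point data
            else:
                key, _, value = line.partition(" ")
                header[key] = value
    return header, points
```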
{
    "data": {
        "project_id": "",
        "reference_id": "001",
        "job_id": "fde54589-ebty-48lp-677a-03a0428ca836",
        "batch_id": "b99d241a-bb80-ghyi-po90-c37d4fead593",
        "status": "completed",
        "tag": "sample_project",
        "priority_weight": 5,
        "result": "https://playment-data-uploads.s3.ap-south-1.amazonaws.com/sample-result.json"
    },
    "success": true
}
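Once the status is completed, the result field in the response holds a URL from which the annotation JSON can be downloaded. A stdlib-only sketch (fetch_result is an illustrative helper, not part of the API):

```python
import json
from urllib import request

def fetch_result(result_url):
    """Download and parse the result JSON for a completed job."""
    with request.urlopen(result_url) as resp:
        return json.loads(resp.read().decode("utf-8"))
```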
{
          "data": {
            "annotations": [
              {
                "label": "car",
                "state": "editable",
                "coordinates": [
                  {
                    "x": 0,
                    "y": 0
                  },
                  {
                    "x": 0,
                    "y": 0
                  },
                  {
                    "x": 0,
                    "y": 0
                  },
                  {
                    "x": 0,
                    "y": 0
                  }
                ],
                "showLabel": false,
                "_id": "d7c444e2-f041-46cc-bda0-25ae35801dac",
                "attributes": {},
                "sensor_id": "front_main",
                "frame_id": "0001",
                "track_id": "d7c444e2-f041-46cc-bda0-25ae35801dac",
                "type": "rectangle",
                "source": "images",
                "source_type": "video2d"
              },
              {
                "dimensions": {
                  "length": 3.2802,
                  "width": 1.352,
                  "height": 1.7
                },
                "center": {
                  "x": 9.8138,
                  "y": 0.3344,
                  "z": 0.9557
                },
                "rotation": {
                  "x": 0,
                  "y": 0,
                  "z": -1.6222,
                  "order": "XYZ"
                },
                "_id": "d7c444e2-f041-46cc-bda0-25ae35801dac",
                "label": "car",
                "color": "rgb(151,22,15)",
                "attributes": {},
                "keyFrame": true,
                "sensor_id": "lidar",
                "frame_id": "0001",
                "track_id": "d7c444e2-f041-46cc-bda0-25ae35801dac",
                "type": "cuboid3d",
                "source": "pointcloud",
                "source_type": "video3d"
              }
            ],
            "frames": [
              {
                "_id": "1574398328680-car_003_051_000000"
              }
            ],
            "tracks": [
              {
                "_id": "d7c444e2-f041-46cc-bda0-25ae35801dac",
                "color": "rgb(151,22,15)",
                "label": "car"
              }
            ],
            "sensors": [
              {
                "sensor_id": "lidar",
                "sensor_type": "LIDAR"
              },
              {
                "sensor_id": "front_main",
                "sensor_type": "CAMERA"
              }
            ]
          }
}

Result JSON Structure

Annotations are linked via a common id at data.annotations._id.
All annotations that share the same data.annotations._id are linked to each other.

Response Key - Description

data : object having the list of annotations, frames, tracks and sensors

data.annotations : List of all the annotations, each having -
    type : String - the type of annotation, either rectangle or cuboid3d
    dimensions : object - length, width and height of the cuboid (only if type is cuboid3d)
    center : object - x, y and z coordinates of the cuboid's center (only if type is cuboid3d)
    rotation : object - rotation of the cuboid as Euler angles x, y, z plus the rotation order (only if type is cuboid3d)
    _id : String - unique id of the annotation
    label : String - label of the annotation
    color : String - color of the annotation as rgb(r,g,b)
    attributes : object - attribute dictionary for the given annotation
    sensor_id : String - id of the sensor
    frame_id : String - id of the frame in which the annotation was made
    track_id : String - id of the track; all annotations of the same object across frames share this id

data.frames : List of the frames, each having -
    _id : String - id of the frame

data.tracks : List of all the tracks in this job, each having -
    _id : String - uuid of the track
    label : String - label of the track
    color : String - color of the track as rgb(r,g,b)

data.sensors : List of all the sensors in this job, each having -
    sensor_id : String - id of the sensor
    sensor_type : String - type of the sensor (LIDAR or CAMERA)
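Putting the linkage to use: the 2D rectangles and 3D cuboids that describe the same physical object can be grouped on their shared _id. A sketch over the result structure above (link_annotations is an illustrative helper):

```python
from collections import defaultdict

def link_annotations(result):
    """Group annotations from a result JSON by their shared _id.

    Each group pairs the 2D rectangle(s) from the camera with the 3D
    cuboid(s) from the lidar that describe the same physical object.
    """
    groups = defaultdict(list)
    for ann in result["data"]["annotations"]:
        groups[ann["_id"]].append(ann)
    return dict(groups)
```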