How to use my custom semantic segmentation model?

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) : RTX 3060
• DeepStream Version : 7.0
• TensorRT Version :
• NVIDIA GPU Driver Version (valid for GPU only)
• Issue Type( questions, new requirements, bugs) : question
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)

How can I use a custom semantic segmentation model in DeepStream 7.0?

  1. Semantic segmentation model input and output
  • 0 INPUT kFLOAT input.1 3x512x512
  • 1 OUTPUT kFLOAT 1145 1x512x512
  2. custom_config_infer.txt
[property]
gpu-id=0
gie-unique-id=1
interval=0

batch-size=3
net-scale-factor=0.003921569
model-color-format=0
infer-dims=3;512;512

onnx-file=../../../../tritonserver/models/customnet/1/segmentation-efficientnet-b3.onnx
model-engine-file=../../../../tritonserver/models/customnet/1/segmentation-efficientnet-b3.trt

process-mode=1
network-mode=2 # 0: FP32, 1: INT8, 2: FP16
network-type=2 # 0: Detector, 1: Classifier, 2: Segmentation, 3: Instance Segmentation

threshold=0.1
num-detected-classes=2
cluster-mode=4

parse-bbox-func-name=NvDsInferParseCustomDetection
parse-bbox-instance-mask-func-name=NvDsInferParseCustomSegmentation
custom-lib-path=../../gst-plugins/gst-nvinferserver/nvdsinfer_custom_impl_obstacle/obstacle_detection

scaling-filter=1
scaling-compute-hw=1
symmetric-padding=0
maintain-aspect-ratio=1
  3. I found the custom parser function prototypes in the official documentation.
  • 3.1. Custom bounding box parsing function
extern "C" bool NvDsInferParseCustomDetection(
	std::vector<NvDsInferLayerInfo> const& outputLayersInfo,
	NvDsInferNetworkInfo const& networkInfo,
	NvDsInferParseDetectionParams const& detectionParams,
	std::vector<NvDsInferParseObjectInfo>& objectList);
  • 3.2. Custom bounding box and instance mask parsing function
bool NvDsInferParseCustomInstanceMask(
	std::vector<NvDsInferLayerInfo> const& outputLayersInfo,
	NvDsInferNetworkInfo const& networkInfo,
	NvDsInferParseDetectionParams const& detectionParams,
	std::vector<NvDsInferParseObjectInfo>& objectList);
  • 3.3. Custom semantic segmentation output parsing function
extern "C"
bool NvDsInferParseCustomSegmentation(
    std::vector<NvDsInferLayerInfo> const& outputLayersInfo,
    NvDsInferNetworkInfo const& networkInfo, float segmentationThreshold,
    unsigned int numClasses, int* classificationMap,
    float*& classProbabilityMap);
  4. How can I register NvDsInferParseCustomSegmentation in my config_infer.txt? The official nvinfer documentation does not cover semantic segmentation.

Please refer to this segmentation sample. Here is the configuration of nvinfer. Please set custom-lib-path and parse-bbox-instance-mask-func-name for the custom postprocessing.
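For reference, the instance-segmentation-related keys in the nvinfer [property] group look roughly like this; the function name below is the prototype from the documentation, and the library path is a placeholder for your own build, not a value from this thread:

```ini
[property]
# 3 = instance segmentation: masks are attached to object meta and drawn by nvosd
network-type=3
output-instance-mask=1
# exported C symbol of your parser, and the shared library that contains it
parse-bbox-instance-mask-func-name=NvDsInferParseCustomInstanceMask
custom-lib-path=/path/to/libnvds_custom_parser.so
segmentation-threshold=0.3
```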

Thank you for the reply!

I revised the config and custom parser by referring to your advice.

infer_config.txt

[property]
gpu-id=0
gie-unique-id=1
interval=0

batch-size=3
net-scale-factor=0.003921569
model-color-format=0
infer-dims=3;512;512

onnx-file=../../../../tritonserver/models/raildet/1/segmentation-efficientnet-b3.onnx
model-engine-file=../../../../tritonserver/models/raildet/1/segmentation-efficientnet-b3.trt

process-mode=1
network-mode=2 # 0: FP32, 1: INT8, 2: FP16
network-type=3 # 0: Detector, 1: Classifier, 2: Segmentation, 3: Instance Segmentation

num-detected-classes=1
cluster-mode=4

parse-bbox-instance-mask-func-name=CUSTOM_SEGMENTATION
custom-lib-path=../../gst-plugins/gst-nvinferserver/nvdsinfer_custom_impl_obstacle/obstacle_detection

segmentation-threshold=0.3

output-instance-mask=1
output-blob-names=1145
scaling-filter=1
scaling-compute-hw=1
symmetric-padding=0
maintain-aspect-ratio=1

custom_parser.cpp

#include <iostream>
#include <string>
#include <vector>

#include "nvdsinfer_custom_impl.h"

// Scan the mask buffer and compute the tight bounding box of all mask pixels.
void getMaskDimension(float* buf, int w, int h, int& left, int& top, int& width, int& height)
{
    int right = 0, bottom = 0;
    left = w, top = h;
    bool has_mask = false;

    for(int y = 0; y < h; y++) {
        for(int x = 0; x < w; x++) {
            float val = buf[y * w + x];
            if(val >= 0.0f) {
                has_mask = true;
                if (x < left) left = x;
                if (x > right) right = x;
                if (y < top) top = y;
                if (y > bottom) bottom = y;
            }
        }
    }

    if (!has_mask) {
        left = top = width = height = 0;
        return;
    }

    width = right - left + 1;
    height = bottom - top + 1;
}

// Crop the bounding-box region out of the full mask buffer and binarize it.
void copy_mask(float* dst, float* src, int w, int h,
               int mask_left, int mask_top, int mask_width, int mask_height) {
    for (int y = 0; y < mask_height; y++) {
        for (int x = 0; x < mask_width; x++) {
            int src_x = mask_left + x;
            int src_y = mask_top + y;
            float val = src[src_y * w + src_x];

            dst[y * mask_width + x] = (val > 0.0f) ? 1.0f : 0.0f;
        }
    }
}


extern "C" bool CUSTOM_SEGMENTATION(
    const std::vector<NvDsInferLayerInfo> &outputLayersInfo,
    const NvDsInferNetworkInfo  &networkInfo,
    const NvDsInferParseDetectionParams &detectionParams,
    std::vector<NvDsInferInstanceMaskInfo> &objectList) {

    const NvDsInferLayerInfo* mask_layer = nullptr;

    for (const auto& layer : outputLayersInfo) {
        if (layer.layerName && std::string(layer.layerName) == "1145") {
            mask_layer = &layer;
            break;
        }
    }

    if (!mask_layer) {
        std::cerr << "ERROR: Output layer '1145' not found.\n";
        return false;
    }

    int channels = mask_layer->inferDims.d[0];
    int height = mask_layer->inferDims.d[1];
    int width = mask_layer->inferDims.d[2];

    if (channels != 1) {
        std::cerr << "ERROR: Expected output shape [1 x H x W], got [" 
                  << channels << " x " << height << " x " << width << "]\n";
        return false;
    }

    float* mask_data = static_cast<float*>(mask_layer->buffer);

    int left, top, mwidth, mheight;
    getMaskDimension(mask_data, width, height, left, top, mwidth, mheight);

    if (mwidth <= 0 || mheight <= 0) return true;

    float* new_mask = new float[mwidth * mheight];
    copy_mask(new_mask, mask_data, width, height, left, top, mwidth, mheight);

    NvDsInferInstanceMaskInfo obj;
    obj.left = left;
    obj.top = top;
    obj.width = mwidth;
    obj.height = mheight;
    obj.classId = 0;
    obj.detectionConfidence = 1.0f;
    obj.mask = new_mask;
    obj.mask_size = sizeof(float) * mwidth * mheight;
    obj.mask_width = mwidth;
    obj.mask_height = mheight;

    objectList.push_back(obj);

    return true;
}

nvosd config.yml

osd:
  enable: 1
  gpu-id: 0
  border-width: 1
  display-text: 0
  text-size: 15
  text-color: 1;1;1;1
  text-bg-color: 0.3;0.3;0.3;1
  font: Serif
  show-clock: 0
  clock-x-offset: 800
  clock-y-offset: 820
  clock-text-size: 12
  clock-color: 1;0;0;0
  display-mask: 1
  nvbuf-memory-type: 0

After the above modifications, the custom parser function runs and mask values are produced. However, the mask is not drawn in the nvosd output. What is the problem?

  1. Which sample are you testing or referring to? What is the complete media pipeline?
  2. nvosd is responsible for drawing the instance segmentation mask. Please refer to the sample in my last comment.

  1. I’m using deepstream-parallel-inference-app.
  2. How can I use segvisual in my case?

deepstream-parallel-inference-app supports doing inference in parallel and merging the metadata. If you are testing one instance segmentation model, please use the tao_segmentation sample above instead. If not, which models are you using, respectively? What is the media pipeline? Do you need to merge the metadata from different models? Thanks!

I use two types of primary detectors

  1. Object Detector (YOLOv7)
  2. Railway Detector (TepNet)

This is my media pipeline

Thank you for the reply!

The output of TepNet is a binary mask filled with probabilities. For the pixels that exceed the threshold, I want to draw the railway on the OSD.

source3_config.yml

osd:
  enable: 1
  gpu-id: 0
  border-width: 1
  display-text: 1
  text-size: 15
  text-color: 1;1;1;1
  text-bg-color: 0.3;0.3;0.3;1
  font: Serif
  show-clock: 0
  clock-x-offset: 800
  clock-y-offset: 820
  clock-text-size: 12
  clock-color: 1;0;0;0
  display-mask: 1
  nvbuf-memory-type: 0

custom_parser.cpp

extern "C" bool TEPNET_SEGMENTATION (
    const std::vector<NvDsInferLayerInfo> &outputLayersInfo,
    const NvDsInferNetworkInfo  &networkInfo,
    const NvDsInferParseDetectionParams &detectionParams,
    std::vector<NvDsInferInstanceMaskInfo> &objectList) {

    const NvDsInferLayerInfo* mask_layer = nullptr;

    for (const auto& layer : outputLayersInfo) {
        if (layer.layerName && std::string(layer.layerName) == "1145") {
            mask_layer = &layer;
            break;
        }
    }

    if (!mask_layer) {
        std::cerr << "ERROR: Output layer '1145' not found.\n";
        return false;
    }

    int channels = mask_layer->inferDims.d[0];
    int height = mask_layer->inferDims.d[1];
    int width = mask_layer->inferDims.d[2];

    if (channels != 1) {
        std::cerr << "ERROR: Expected output shape [1 x H x W], got [" 
                  << channels << " x " << height << " x " << width << "]\n";
        return false;
    }

    float* mask_data = static_cast<float*>(mask_layer->buffer);

    int left, top, mwidth, mheight;
    getMaskDimension(mask_data, width, height, left, top, mwidth, mheight);

    if (mwidth <= 0 || mheight <= 0) return true;

    float* new_mask = new float[mwidth * mheight];
    copy_mask(new_mask, mask_data, width, height, left, top, mwidth, mheight);

    NvDsInferInstanceMaskInfo obj;
    obj.left = left;
    obj.top = top;
    obj.width = mwidth;
    obj.height = mheight;
    obj.classId = 1;
    obj.detectionConfidence = 1.0f;
    obj.mask = new_mask;
    obj.mask_size = sizeof(float) * mwidth * mheight;
    obj.mask_width = mwidth;
    obj.mask_height = mheight;

    objectList.push_back(obj);

    return true;
}

config_infer.txt

[property]
gpu-id=0
gie-unique-id=1
interval=0

batch-size=3
net-scale-factor=0.003921569
model-color-format=0
infer-dims=3;512;512

onnx-file=../../../../tritonserver/models/tepnet/1/segmentation-efficientnet-b3.onnx
model-engine-file=../../../../tritonserver/models/tepnet/1/segmentation-efficientnet-b3.trt

process-mode=1 # 1: Primary, 2: Secondary
network-mode=2 # 0: FP32, 1: INT8, 2: FP16
network-type=3 # 0: Detector, 1: Classifier, 2: Segmentation, 3: Instance Segmentation

num-detected-classes=2
cluster-mode=4

parse-bbox-instance-mask-func-name=TEPNET_SEGMENTATION
custom-lib-path=../../gst-plugins/gst-nvinferserver/nvdsinfer_custom_impl_obstacle/obstacle_detection

segmentation-threshold=0.3

output-instance-mask=1
symmetric-padding=0
maintain-aspect-ratio=1

Could you share the configuration of deepstream_parallel_inference_app, like source4_1080p_dec_parallel_infer.yml?

I fixed the problem: the input image was being resized to 512x288 because of the maintain-aspect-ratio parameter set in infer_config.

When the mask output is mapped within the 288-pixel range by adjusting the parameters, it is visualized successfully.

Thank you for your help :)

Glad to know you fixed it. Is this still a DeepStream issue to support? Thanks!

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.