Hi,
I have this model (for image classification) generated by DeepStream (mymodel.onnx_b1_gpu0_fp32.engine), and I want to test it with images outside of DeepStream using TensorRT. How can I test it with TensorRT?
(TensorRT version: 8.6.1)
import pycuda.driver as cuda
import pycuda.autoinit
import tensorrt as trt
import numpy as np
from PIL import Image
import requests
from io import BytesIO
logger = trt.Logger(trt.Logger.WARNING)

with open("torch_yolov11.onnx_b1_gpu0_fp32.engine", "rb") as f:
    engine_data = f.read()

runtime = trt.Runtime(logger)
engine = runtime.deserialize_cuda_engine(engine_data)
context = engine.create_execution_context()

image_url = "https://ultralytics.com/images/bus.jpg"
response = requests.get(image_url)
image = Image.open(BytesIO(response.content)).convert("RGB")
image = image.resize((224, 224))
image = np.array(image).astype(np.float32)
image = image.transpose((2, 0, 1))  # HWC -> CHW
image = np.expand_dims(image, axis=0)  # add batch dimension

input_data = cuda.mem_alloc(image.nbytes)
output_data = cuda.mem_alloc(int(engine.max_batch_size * np.prod(engine.get_binding_shape(1)) * np.float32().itemsize))

image = np.ascontiguousarray(image)
cuda.memcpy_htod(input_data, image)
context.execute_v2([int(input_data), int(output_data)])

output = np.empty(shape=engine.get_binding_shape(1), dtype=np.float32)
cuda.memcpy_dtoh(output, output_data)
print("Inference result:", output)
I used this code, but when I print the output, it displays:
[01/20/2025-16:53:44] [TRT] [E] 2: [softMaxV2Runner.cpp::execute::213] Error Code 2: Internal Error (Assertion y != nullptr failed. )
Inference result: [[-0.74447846 0.75425875]]
/tmp/ipykernel_46848/2110792327.py:5: DeprecationWarning: Use get_tensor_shape instead.
output = np.empty(shape=engine.get_binding_shape(1), dtype=np.float32)
Why does the result not display as probabilities (SoftMax output)?
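For reference: that output looks like raw logits, which is consistent with the in-engine SoftMax failing (the error above). If needed, probabilities can be computed on the host instead; a minimal NumPy sketch using the values printed above:

import numpy as np

def softmax(logits):
    # Shift by the row max for numerical stability before exponentiating.
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)

logits = np.array([[-0.74447846, 0.75425875]], dtype=np.float32)
print(softmax(logits))  # rows sum to 1.0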
Hi, I get this error when I try to process more than one image. The first execution runs correctly, but when I run the next execution, this error is displayed. How can I resolve this problem?
------------------------------------------------error--------------------------------------------------------------------------------
LogicError: cuMemcpyDtoH failed: an illegal memory access was encountered
---------------------------------------------code---------------------------------------------------------------------------------
input_data = cuda.mem_alloc(image.nbytes)
output_shape = engine.get_binding_shape(1)
output_size = int(np.prod(output_shape) * np.float32().itemsize)
output_data = cuda.mem_alloc(output_size)
# Ensure the input image is contiguous in memory
image = np.ascontiguousarray(image)
# Transfer input data to GPU
cuda.memcpy_htod(input_data, image)
# Run inference
context.execute_v2([int(input_data), int(output_data)])
# Allocate output buffer
output = np.empty(output_shape, dtype=np.float32)
# Retrieve output from GPU
cuda.memcpy_dtoh(output, output_data)
cuda.Context.synchronize()
# Free the device buffers after inference. Note: PyCUDA releases a
# DeviceAllocation once its last reference is dropped and the object
# is garbage-collected, so del does eventually free the memory.
del input_data
del output_data
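A common cause of an illegal access on the second run is repeating this per-image allocate/free cycle while reusing the same execution context. A minimal sketch, assuming `engine` and `context` from the earlier post, plus a hypothetical `preprocess()` helper and `image_paths` list, that allocates the buffers once and reuses them for every image:

input_shape = engine.get_binding_shape(0)
output_shape = engine.get_binding_shape(1)
input_buf = cuda.mem_alloc(int(np.prod(input_shape) * np.float32().itemsize))
output_buf = cuda.mem_alloc(int(np.prod(output_shape) * np.float32().itemsize))
host_output = np.empty(output_shape, dtype=np.float32)

for path in image_paths:
    # preprocess() is a hypothetical helper returning a contiguous
    # float32 array whose shape matches input_shape.
    host_input = np.ascontiguousarray(preprocess(path))
    cuda.memcpy_htod(input_buf, host_input)
    context.execute_v2([int(input_buf), int(output_buf)])
    cuda.memcpy_dtoh(host_output, output_buf)
    print(path, host_output)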
Any response!?
Hi @amine.ghamgui.anavid, thanks for your patience - I've routed this to the right team and will get back to you with a response. I'm now moving your post to the DeepStream forum so you can get help from the right team!
The engine file generated by DeepStream nvinfer is a standard TensorRT engine file. There are samples showing how to run inference with the TensorRT Python API: TensorRT/samples/python at release/10.7 · NVIDIA/TensorRT
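Along the lines of those samples, here is a minimal sketch using the name-based tensor API (available since TensorRT 8.5, and what the get_binding_shape deprecation warning above points to). The engine filename is the one from this thread; `preprocessed` is a hypothetical (1, 3, H, W) float32 array matching the engine's input:

import numpy as np
import pycuda.autoinit
import pycuda.driver as cuda
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
with open("torch_yolov11.onnx_b1_gpu0_fp32.engine", "rb") as f:
    engine = trt.Runtime(logger).deserialize_cuda_engine(f.read())
context = engine.create_execution_context()
stream = cuda.Stream()

# Allocate one device buffer per I/O tensor and bind it by name.
buffers, host_out = {}, {}
for i in range(engine.num_io_tensors):
    name = engine.get_tensor_name(i)
    shape = engine.get_tensor_shape(name)  # replaces get_binding_shape
    dtype = trt.nptype(engine.get_tensor_dtype(name))
    buffers[name] = cuda.mem_alloc(int(np.prod(shape) * np.dtype(dtype).itemsize))
    context.set_tensor_address(name, int(buffers[name]))
    if engine.get_tensor_mode(name) == trt.TensorIOMode.OUTPUT:
        host_out[name] = np.empty(shape, dtype=dtype)

input_name = engine.get_tensor_name(0)
cuda.memcpy_htod_async(buffers[input_name], np.ascontiguousarray(preprocessed), stream)
context.execute_async_v3(stream.handle)
for name, host in host_out.items():
    cuda.memcpy_dtoh_async(host, buffers[name], stream)
stream.synchronize()
print(host_out)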