-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
changes to support CUDA build w/o GPU in runtime #13674
base: development
Are you sure you want to change the base?
changes to support CUDA build w/o GPU in runtime #13674
Conversation
src/proc/depth-formats-converter.cpp
Outdated
#endif | ||
|
||
namespace librealsense | ||
{ | ||
void unpack_z16_y8_from_sr300_inzi( uint8_t * const dest[], const uint8_t * source, int width, int height, int actual_size) | ||
void unpack_z16_y8_from_sr300_inzi(uint8_t* const dest[], const uint8_t* source, int width, int height, int actual_size) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we try to avoid unneeded code changes, I saw it in several files.
Please go over formatting changes line and revert to only see the new actual changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, will do. these changes are following invoking the coding formatting I got from you. but will revert the changes which are not part of the fix. NP
in += count; | ||
} | ||
#endif | ||
#ifndef RS2_USE_CUDA | ||
for (int i = 0; i < count; ++i) *out_ir++ = *in++ >> 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we want this code if no Cuda is found?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, we want to run this code if CUDA is selected AND nvidia GPU available - fixed the code.
in += count; | ||
} | ||
#endif | ||
#ifndef RS2_USE_CUDA | ||
for (int i = 0; i < count; ++i) *out_ir++ = *in++ << 6; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, we want to run this code if CUDA is selected AND nvidia GPU available - fixed the code.
src/proc/pointcloud.cpp
Outdated
@@ -23,6 +23,7 @@ | |||
#include "proc/sse/sse-pointcloud.h" | |||
#endif | |||
#include "proc/neon/neon-pointcloud.h" | |||
#include "rsutils/rsutilgpu.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we change the file path to
#include <rsutils/accelerators/gpu.h>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
rscuda::split_frame_y16_y16_from_y12i_cuda(dest, count, reinterpret_cast<const rscuda::y12i_pixel_mipi*>(source)); | ||
} | ||
#endif | ||
#ifndef RS2_USE_CUDA | ||
split_frame(dest, count, reinterpret_cast<const y12i_pixel_mipi*>(source), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want this when CUDa is not available right?
This is pure MIPI (Jetson code) so we need to make sure we run on Jetson with D457 too.
We also need specifically to check Y16 depth format is working
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
made a check on vtgjetson1, the Y16 is working fine.
src/proc/y12i-to-y16y16-mipi.cpp
Outdated
@@ -27,12 +32,12 @@ namespace librealsense | |||
y12i_to_y16y16_mipi::y12i_to_y16y16_mipi(int left_idx, int right_idx) | |||
: y12i_to_y16y16_mipi("Y12I to Y16L Y16R Transform", left_idx, right_idx) {} | |||
|
|||
y12i_to_y16y16_mipi::y12i_to_y16y16_mipi(const char * name, int left_idx, int right_idx) | |||
y12i_to_y16y16_mipi::y12i_to_y16y16_mipi(const char* name, int left_idx, int right_idx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please revert changes at lines 35,37,40...
rscuda::split_frame_y16_y16_from_y12i_cuda(dest, count, reinterpret_cast<const rscuda::y12i_pixel*>(source)); | ||
} | ||
#endif | ||
#ifndef RS2_USE_CUDA | ||
split_frame(dest, count, reinterpret_cast<const y12i_pixel*>(source), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need this right?
You can try to check if Y16 is working in USB and see the result
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed the code. can you explain about the USB? how to check it? thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check on a linux machine with / without cuda that Y16 format works
src/proc/y12i-to-y16y16.cpp
Outdated
#endif | ||
} | ||
|
||
y12i_to_y16y16::y12i_to_y16y16(int left_idx, int right_idx) | ||
: y12i_to_y16y16("Y12I to Y16L Y16R Transform", left_idx, right_idx) {} | ||
|
||
y12i_to_y16y16::y12i_to_y16y16(const char * name, int left_idx, int right_idx) | ||
y12i_to_y16y16::y12i_to_y16y16(const char* name, int left_idx, int right_idx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revert please..also line 31,34.. please go over all the PR
rscuda::split_frame_y8_y8_from_y8i_cuda(dest, count, reinterpret_cast<const y8i_pixel*>(source)); | ||
} | ||
#endif | ||
#ifndef RS2_USE_CUDA | ||
split_frame(dest, count, reinterpret_cast<const y8i_pixel*>(source), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same
Check Y8 IR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed code. Y8 is working on vtgjetson1 (still having issue with vtgjetson4 - CUDA driver version is insufficient for CUDA runtime version_ - WIP
@@ -0,0 +1,18 @@ | |||
// License: Apache 2.0. See LICENSE file in root directory. | |||
// Copyright(c) 2023 Intel Corporation. All Rights Reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2025
#pragma once | ||
|
||
#ifdef __cplusplus | ||
extern "C" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need the extern "c"
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not a must, removed.
@@ -0,0 +1,43 @@ | |||
// License: Apache 2.0. See LICENSE file in root directory. | |||
// Copyright(c) 2023 Intel Corporation. All Rights Reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2025
{ | ||
cudaGetDeviceCount(&gpuDeviceCount); | ||
retVal = gpuDeviceCount > 0; | ||
if (retVal == false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to get this log only once and not on every check right?
I would call this function at initialization process and log it and here only return true/false
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
code modified. we will log only once (when gpuDeviceCout == -1). once it is initialized to 0 (as no GPU or due to an error) or >0, we won't log anymore.
No description provided.