VP FPGA Lenet Error ((DLA_TEST) Error 0x00000002 and NvDlaSubmit: Error IOCTL failed (No such process)) #58
Comments
These are the steps I am taking:
I first built an instance with the nvdla_vp_fpga_ami_ubuntu AMI. I chose the f1.2xlarge instance type.
https://github.com/aws/aws-fpga.git
ubuntu@ip-172-31-78-73:~/aws-fpga/sdk/linux_kernel_drivers/edma$ lsmod | grep edma
cd
ubuntu@ip-172-31-78-73:~$ sudo fpga-clear-local-image -S 0
ubuntu@ip-172-31-78-73:~$ sudo fpga-load-local-image -S 0 -I agfi-09c2a21805a8b9257
ubuntu@ip-172-31-78-73:~$ sudo fpga-clear-local-image -S 0
ubuntu@ip-172-31-78-73:~$ sudo fpga-describe-local-image -S 0 -H
ubuntu@ip-172-31-78-73:~$ sudo fpga-load-local-image -S 0 -I agfi-05d68b424ef03f66e
ubuntu@ip-172-31-78-73:~$ sudo fpga-describe-local-image -S 0 -R -H
sudo rmmod edma-drv
sudo insmod $SDK_DIR/linux_kernel_drivers/edma/edma-drv.ko
cd nvdla/vp
ubuntu@ip-172-31-78-73:~/nvdla/vp$ sudo ./aarch64_toplevel -c aarch64_nvdla.lua --fpga
Error to open/read the config file: aarch64_nvdla.lua
ubuntu@ip-172-31-78-73:~/nvdla/vp$ sudo ./aarch64_toplevel -c conf/aarch64_nvdla.lua --fpga
No sc_log specified, will use the default setting
In aarch64_nvdla.lua I changed two paths:
-kernel images/linux-4.13.3/Image ---> -kernel /home/ubuntu/nvdla/sw/prebuilt/linux/Image
ubuntu@ip-172-31-78-73:~/nvdla/vp$ sudo ./aarch64_toplevel -c conf/aarch64_nvdla.lua --fpga
This command enters the VP, and I log in with root and nvdla.
mount -t 9p -o trans=virtio r /mnt
cd ../mnt
There is no sw/prebuilt/linux/ folder there, so I copied the sw/prebuilt/linux folder to /home/ubuntu/nvdla/vp/Myspace, which I can access through the mnt folder.
cd /Myspace/linux
insmod drm.ko
chmod a+x nvdla_runtime
./nvdla_runtime --loadable fast-math.nvdla --rawdump --image ../digits/three_i
creating new runtime context...
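The "Error to open/read the config file" message in the steps above came from a path that did not resolve from the current directory, and the -kernel path inside the config can fail the same way. A minimal Python sketch for checking such paths before launching the VP is shown here. It assumes the config passes the kernel image as `-kernel <path>` inside a string, as in the line quoted above; the function name `check_kernel_paths` is hypothetical and not part of NVDLA.

```python
import os
import re

# Assumption: the kernel image appears as "-kernel <path>" in the config text.
KERNEL_RE = re.compile(r"-kernel\s+(\S+)")


def check_kernel_paths(config_text, base_dir):
    """Return a list of (path, exists) for every -kernel argument found.

    Relative paths are resolved against base_dir, which should be the
    directory the VP is launched from.
    """
    results = []
    for raw in KERNEL_RE.findall(config_text):
        # Drop quoting or trailing commas left over from the surrounding string.
        path = raw.strip("\"',")
        full = path if os.path.isabs(path) else os.path.join(base_dir, path)
        results.append((path, os.path.isfile(full)))
    return results


if __name__ == "__main__":
    # Example: check the config relative to the current directory.
    with open("conf/aarch64_nvdla.lua") as f:
        for path, exists in check_kernel_paths(f.read(), os.getcwd()):
            print(path, "OK" if exists else "MISSING")
```

Running it from ~/nvdla/vp would report whether each -kernel path exists as seen from that directory.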
I also ran this sanity test, and it passes:
insmod opendla_small.ko
[ 51.653664] opendla: loading out-of-tree module taints kernel.
./nvdla_runtime --loadable ../regression/flatbufs/kmd/CDP/CDP_L0_0_small_fbuf
creating new runtime context...
@Hassan313 Did you finally fix it?
@minils Hi, I am not sure. Because of the errors I was getting, I did not continue with FPGAs.
I used the nv_small image (agfi-05d68b424ef03f66e) and tried to run inference on a digit image with the Lenet network, which I have been doing successfully on a VP with QEMU. However, when I follow the sub-steps of step 2 (2.1 to 2.4) from http://nvdla.org/vp_fpga.html, I get these errors:
./nvdla_runtime --loadable fast-math.nvdla --rawdump --image ../digits/three_invert.pgm
creating new runtime context...
Emulator starting
ppgminfo 1 28 28
pgm2dimg 1 28 28 1 32 896 896
(DLA_TEST) Error 0x00000002: Unexpected surface format 37, defaulting to D_F16_CxHWx_x16_F (in TestUtils.cpp, function Tensor2DIMG(), line 85)
submitting tasks...
[ 423.492297] Enter:dla_read_network_config
[ 423.493706] Exit:dla_read_network_config status=0
[ 423.494029] Enter: dla_initiate_processors
[ 423.494858] Enter: dla_submit_operation
[ 423.495108] Prepare Convolution operation index 0 ROI 0 dep_count 1
[ 423.495460] Enter: dla_prepare_operation
[ 423.495866] processor:Convolution group:0, rdma_group:0 available
[ 423.496270] Enter: dla_read_config
[ 423.506511] Exit: dla_read_config
[ 423.506825] Exit: dla_prepare_operation status=0
[ 423.507146] Enter: dla_program_operation
[ 423.507406] Program Convolution operation index 0 ROI 0 Group[0]
[ 423.512985] no desc get due to index==-1
[ 423.513954] no desc get due to index==-1
[ 423.514194] no desc get due to index==-1
[ 423.514428] no desc get due to index==-1
[ 423.514675] no desc get due to index==-1
[ 423.514946] Enter: dla_op_programmed
[ 423.515273] Update dependency operation index 1 ROI 0 DEP_COUNT=2
[ 423.515624] Update dependency operation index 64 ROI 0 DEP_COUNT=1
[ 423.515977] enable SDP in dla_update_dependency as depdency are resolved
[ 423.516358] Enter: dla_enable_operation
[ 423.518226] exit dla_enable_operation without actual enable due to processor hasn't been programmed
[ 423.518768] Exit: dla_enable_operation status=0
[ 423.519089] Exit: dla_op_programmed
[ 423.519314] Exit: dla_program_operation status=0
[ 423.519607] Exit: dla_submit_operation
[ 423.519952] Enter: dla_dequeue_operation
[ 423.520233] Dequeue op from Convolution processor, index=1 ROI=0
[ 423.522044] Enter: dla_submit_operation
[ 423.522321] Prepare Convolution operation index 1 ROI 0 dep_count 1
[ 423.522660] Enter: dla_prepare_operation
[ 423.522990] processor:Convolution group:1, rdma_group:0 available
[ 423.523338] Enter: dla_read_config
[ 423.532283] Exit: dla_read_config
[ 423.532652] Exit: dla_prepare_operation status=0
[ 423.533081] Enter: dla_program_operation
[ 423.533331] Program Convolution operation index 1 ROI 0 Group[1]
[ 423.536338] no desc get due to index==-1
[ 423.538459] no desc get due to index==-1
[ 423.538736] no desc get due to index==-1
[ 423.538971] no desc get due to index==-1
[ 423.539202] no desc get due to index==-1
[ 423.539432] Enter: dla_op_programmed
[ 423.539654] Update dependency operation index 2 ROI 0 DEP_COUNT=2
[ 423.539988] Update dependency operation index 65 ROI 0 DEP_COUNT=2
[ 423.540325] Exit: dla_op_programmed
[ 423.541982] Exit: dla_program_operation status=0
[ 423.542286] Exit: dla_submit_operation
[ 423.542544] Exit: dla_dequeue_operation
[ 423.542821] Enter: dla_submit_operation
[ 423.543060] Prepare SDP operation index 64 ROI 0 dep_count 0
[ 423.543372] Enter: dla_prepare_operation
[ 423.543728] processor:SDP group:0, rdma_group:0 available
[ 423.544030] Enter: dla_read_config
[ 423.553398] Exit: dla_read_config
[ 423.553706] Exit: dla_prepare_operation status=0
[ 423.554005] Enter: dla_program_operation
[ 423.554246] Program SDP operation index 64 ROI 0 Group[0]
[ 423.557711] no desc get due to index==-1
[ 423.557987] no desc get due to index==-1
[ 423.558675] no desc get due to index==-1
[ 423.558912] no desc get due to index==-1
[ 423.559169] Enter: dla_op_programmed
[ 423.559398] Update dependency operation index 65 ROI 0 DEP_COUNT=1
[ 423.559733] enable SDP in dla_update_dependency as depdency are resolved
[ 423.560085] Enter: dla_enable_operation
[ 423.560326] exit dla_enable_operation without actual enable due to processor hasn't been programmed
[ 423.562371] Exit: dla_enable_operation status=0
[ 423.562672] Exit: dla_op_programmed
[ 423.562901] Exit: dla_program_operation status=0
[ 423.563179] Enter: dla_enable_operation
[ 423.563456] Enable SDP operation index 64 ROI 0
[ 423.563970] Enter: dla_op_enabled
[ 423.564198] Update dependency operation index 0 ROI 0 DEP_COUNT=1
[ 423.565899] enable Convolution in dla_update_dependency as depdency are resolved
[ 423.566336] Enter: dla_enable_operation
[ 423.566571] Enable Convolution operation index 0 ROI 0
[ 423.567378] Enter: dla_op_enabled
[ 423.567602] Exit: dla_op_enabled
[ 423.567813] Exit: dla_enable_operation status=0
[ 423.568073] Exit: dla_op_enabled
[ 423.568275] Exit: dla_enable_operation status=0
[ 423.570124] Exit: dla_submit_operation
[ 423.570390] Enter: dla_dequeue_operation
[ 423.570632] Dequeue op from SDP processor, index=65 ROI=0
[ 423.570943] Enter: dla_submit_operation
[ 423.571176] Prepare SDP operation index 65 ROI 0 dep_count 0
[ 423.571480] Enter: dla_prepare_operation
[ 423.571845] processor:SDP group:1, rdma_group:1 available
[ 423.572155] Enter: dla_read_config
[ 423.581675] Exit: dla_read_config
[ 423.581930] Exit: dla_prepare_operation status=0
[ 423.582195] Enter: dla_program_operation
[ 423.582432] Program SDP operation index 65 ROI 0 Group[1]
[ 423.584106] no desc get due to index==-1
[ 423.584348] no desc get due to index==-1
[ 423.586586] no desc get due to index==-1
[ 423.586868] no desc get due to index==-1
[ 423.587105] Enter: dla_op_programmed
[ 423.587324] Update dependency operation index 66 ROI 0 DEP_COUNT=2
[ 423.587660] Exit: dla_op_programmed
[ 423.587874] Exit: dla_program_operation status=0
[ 423.588134] Enter: dla_enable_operation
[ 423.588410] Enable SDP operation index 65 ROI 0
[ 423.590315] Enter: dla_op_enabled
[ 423.590561] Update dependency operation index 1 ROI 0 DEP_COUNT=1
[ 423.590902] enable Convolution in dla_update_dependency as depdency are resolved
[ 423.591368] Enter: dla_enable_operation
[ 423.591597] Enable Convolution operation index 1 ROI 0
[ 423.592303] Enter: dla_op_enabled
[ 423.593782] Exit: dla_op_enabled
[ 423.594027] Exit: dla_enable_operation status=0
[ 423.594288] Exit: dla_op_enabled
[ 423.594490] Exit: dla_enable_operation status=0
[ 423.594759] Exit: dla_submit_operation
[ 423.594986] Exit: dla_dequeue_operation
[ 423.595212] Enter: dla_submit_operation
[ 423.595443] Prepare PDP operation index 128 ROI 0 dep_count 1
[ 423.595755] Enter: dla_prepare_operation
[ 423.596089] processor:PDP group:0, rdma_group:0 available
[ 423.596386] Enter: dla_read_config
[ 423.606259] Exit: dla_read_config
[ 423.606543] Exit: dla_prepare_operation status=0
[ 423.606840] Enter: dla_program_operation
[ 423.607080] Program PDP operation index 128 ROI 0 Group[0]
[ 423.607405] group id 0 rdma id 0
[ 423.607928] Invalid SrcInput Cude[W: 13824, H: 0, C: 24]
[ 423.607985] Exit: dla_program_operation status=-3
[ 423.610009] Exit: dla_submit_operation
[ 423.610296] Failed to submit PDP op from index 128
[ 423.610589] Exit: dla_initiate_processors status=-3
[ 423.610912] Task execution failed
NvDlaSubmit: Error IOCTL failed (No such process)
(DLA_RUNTIME) Error 0x0003000f: (propagating from Runtime.cpp, function submitInternal(), line 610)
(DLA_TEST) Error 0x00000004: runtime->submit() failed (in RuntimeTest.cpp, function runTest(), line 295)
(DLA_TEST) Error 0x00000004: (propagating from RuntimeTest.cpp, function run(), line 320)
(DLA_TEST) Error 0x00000004: (propagating from main.cpp, function launchTest(), line 87)
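For a log of this length it can help to locate the first non-zero exit status mechanically. The Python sketch below scans the kernel messages for "Exit: ... status=N" lines and reports the first failure together with the last operation that was being prepared, programmed or enabled. The function name `first_failure` is hypothetical. The errno lookup rests on an assumption: that the driver's status -3 reaches user space as an errno value. On Linux errno 3 is ESRCH ("No such process"), which would match the NvDlaSubmit message, but the driver may simply be using an internal error code that happens to coincide.

```python
import errno
import os
import re

# Matches e.g. "[  423.607985] Exit: dla_program_operation status=-3"
EXIT_RE = re.compile(r"Exit:\s*(\w+)\s+status=(-?\d+)")
# Matches e.g. "Program PDP operation index 128 ROI 0 Group[0]"
OP_RE = re.compile(r"(Prepare|Program|Enable) (\w+) operation index (\d+)")


def first_failure(log_text):
    """Return details of the first non-zero exit status, or None if all are 0."""
    last_op = None
    for line in log_text.splitlines():
        op = OP_RE.search(line)
        if op:
            last_op = (op.group(1), op.group(2), int(op.group(3)))
        m = EXIT_RE.search(line)
        if m and int(m.group(2)) != 0:
            status = int(m.group(2))
            return {
                "function": m.group(1),
                "status": status,
                "last_op": last_op,
                # Assumption: a negative status is a negated errno value.
                "errno_name": errno.errorcode.get(-status),
                "errno_text": os.strerror(-status) if status < 0 else None,
            }
    return None


# A few lines copied from the log above.
SAMPLE = """\
[ 423.506825] Exit: dla_prepare_operation status=0
[ 423.607080] Program PDP operation index 128 ROI 0 Group[0]
[ 423.607928] Invalid SrcInput Cude[W: 13824, H: 0, C: 24]
[ 423.607985] Exit: dla_program_operation status=-3
[ 423.610589] Exit: dla_initiate_processors status=-3
"""

if __name__ == "__main__":
    print(first_failure(SAMPLE))
```

On the sample lines, the first failure is dla_program_operation while programming PDP operation index 128, directly after the "Invalid SrcInput" message.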
Can you kindly help me with this issue?
Thank you very much.