Compiler & runtime crash in docker VP #45

Open
BigFatFlo opened this issue Jul 19, 2019 · 9 comments

@BigFatFlo

Hi,

I'm using the docker image provided on DockerHub to run the virtual platform.
When I try to use nvdla_compiler to generate a loadable from my LeNet model, it crashes with this message:

./nvdla_compiler --prototxt lenet.prototxt --caffemodel lenet_iter_10000.caffemodel --configtarget nv_small --cprecision int8 --profile basic --calibtable calib_table.json
./nvdla_compiler: line 2: syntax error: unexpected redirection
# ./nvdla_compiler: line 1: ELF�����: not found

If I compile my model on my machine using the nvdla_compiler from nvdla/sw/prebuilt/linux, it compiles into a loadable with no problem.
Unfortunately, when I try to use that loadable with nvdla_runtime in the docker VP, the runtime starts running but then crashes and exits the QEMU instance:

# ./nvdla_runtime --loadable nvdla/lenet_model_nosoftmax_nocalib_basic.nvdla --image nvdla/digits/four_inv.pgm
creating new runtime context...
[  460.522980] random: crng init done
Emulator starting
pgm2dimg 1 28 28 1 224 6272 401408
submitting tasks...
[  465.391729] Enter:dla_read_network_config
[  465.392504] Exit:dla_read_network_config status=0
[  465.392763] Enter: dla_initiate_processors
[  465.393181] Enter: dla_submit_operation
[  465.393407] Prepare Convolution operation index 0 ROI 0 dep_count 1
[  465.393722] Enter: dla_prepare_operation
[  465.394232] processor:Convolution group:0, rdma_group:0 available
[  465.394657] Enter: dla_read_config
[  465.396656] Exit: dla_read_config
[  465.396903] Exit: dla_prepare_operation status=0
[  465.397186] Enter: dla_program_operation
[  465.397424] Program Convolution operation index 0 ROI 0 Group[0]
root@545d7e33f93c:/usr/local/nvdla# 

Any suggestions on getting the compiler and the runtime to work in docker?

Thank you

@fisherxue

Run compiler directly in the Docker image. No need to run aarch64_toplevel -c aarch64_nvdla.lua first.
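
The "syntax error: unexpected redirection" / "ELF�����: not found" pair is what a shell prints when it falls back to interpreting a binary built for another architecture as a script, which is why running the compiler in the wrong environment fails this way. A rough sketch of the intended flow, reusing the flags from the original report (the location of nvdla_compiler inside the image is an assumption, adjust as needed):

    # in the container's own (x86-64) shell, without booting the VP first
    file ./nvdla_compiler          # an x86-64 ELF runs here; an aarch64 ELF
                                   # only runs inside the emulated guest
    ./nvdla_compiler --prototxt lenet.prototxt \
                     --caffemodel lenet_iter_10000.caffemodel \
                     --configtarget nv_small --cprecision int8 \
                     --profile basic --calibtable calib_table.json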

@shazib-summar

Hey, @BigFatFlo. Did you find a solution? I am facing the same issue as you described in the second terminal log. Do you have any updates? Thanks a lot.

@shazib-summar

Run compiler directly in the Docker image. No need to run aarch64_toplevel -c aarch64_nvdla.lua first.

@fisherxue at the time of writing, nvdla_runtime is available only for aarch64, so you have to run the aarch64_toplevel -c aarch64_nvdla.lua command to emulate a Cortex-A57 processor. Otherwise you won't be able to run nvdla_runtime.
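
For reference, the guest-side sequence then looks roughly like this (a sketch assembled from steps mentioned elsewhere in this thread; the 9p mount tag, the module name and the file names are assumptions and depend on your sw version and paths):

    # container shell: boot the virtual platform
    aarch64_toplevel -c aarch64_nvdla.lua

    # inside the emulated aarch64 guest, once Linux has booted
    mount -t 9p -o trans=virtio r /mnt       # share the host directory into the guest
    cd /mnt
    insmod drm.ko
    insmod opendla.ko                        # or opendla_1.ko / opendla_2.ko on newer sw trees
    ./nvdla_runtime --loadable lenet_model_nosoftmax_nocalib_basic.nvdla --image four_inv.pgm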

@BigFatFlo
Author

@killerzula if I remember correctly, I "fixed" it by writing a new Dockerfile myself to build a docker image on which to run the virtual platform.
One trick is to make sure you have compatible versions of the compiler, runtime and platform, all built for the same config (nv_full, nv_small or nv_large), otherwise the runtime will crash.
However, I think @fisherxue was right: you can just run the compiler on your host system rather than inside the docker container.
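
Concretely, the same config name has to appear at every stage. Just as an illustration (the flag is taken from the original command, the build-arg from the VP Dockerfile shared later in this thread, and the tag names are arbitrary):

    # loadable compiled for nv_small
    ./nvdla_compiler ... --configtarget nv_small ...
    # virtual platform image built for the same config
    docker build --build-arg nvdla_version=nv_small -t nvdla_vp:nv_small .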

@shaumik1

Currently I face the same runtime issue. I am using the latest Dockerfile (https://hub.docker.com/r/nvdla/vp/) and the latest prebuilt binaries (https://github.com/nvdla/sw/tree/master/prebuilt/arm64-linux), so the platform and runtime versions should be compatible, and the fbuf is also a pre-compiled file.

   # Start by mounting the directory to /mnt, followed by insmod drm.ko and insmod opendla_2.ko
   # ./nvdla_runtime --loadable ../../regression/flatbufs/kmd/NN/NN_L0_0_fbuf --image  ../../regression/images/digits/seven.pgm --rawdump
   creating new runtime context...
   Emulator starting
   pgm2dimg 1 28 28 1 896 25088 25088
   submitting tasks...
   Work Found!
   Work Done
   [  371.318462] Enter:dla_read_network_config
   [  371.319911] Exit:dla_read_network_config status=0
   [  371.320349] Enter: dla_initiate_processors
   [  371.321021] Enter: dla_submit_operation
   [  371.321376] Prepare Convolution operation index 0 ROI 0 dep_count 1
   [  371.321932] Enter: dla_prepare_operation
   [  371.324673] processor:Convolution group:0, rdma_group:0 available
   [  371.325445] Enter: dla_read_config
   [  371.326404] Exit: dla_read_config
   [  371.329340] Exit: dla_prepare_operation status=0
   [  371.329909] Enter: dla_program_operation
   [  371.330336] Program Convolution operation index 0 ROI 0 Group[0]
   root@ab18aa4f023d:/usr/local/nvdla#

@BigFatFlo what modification in the Dockerfile helped you 'fix' the issue? Could you provide some more details?
@HaiqingSun @jarodw0723 any suggestions or hints to fix this?

@BigFatFlo
Author

@shaumik1 I haven't touched NVDLA in a while, but here's the Dockerfile I used for the virtual platform.

FROM nvdla_tools:1.0.0

WORKDIR /nvdla

ARG nvdla_version=nv_full

COPY nvdla_hw /nvdla/hw/
COPY vp /nvdla/vp/

WORKDIR /nvdla/hw

RUN git checkout $nvdla_version && \
    echo "PROJECTS := $nvdla_version" > tree.make && \
    echo "COVERAGE := 0" >> tree.make && \
    echo "USE_DESIGNWARE := 0" >> tree.make && \
    echo "CPP := /usr/bin/cpp-4.9" >> tree.make && \
    echo "GCC := /usr/bin/g++-4.9" >> tree.make && \
    echo "PERL := /usr/bin/perl" >> tree.make && \
    echo "JAVA := /usr/bin/java" >> tree.make && \
    echo "SYSTEMC := /usr/local/systemc-2.3.0/" >> tree.make && \
    echo "PYTHON := /usr/bin/python3" >> tree.make && \
    echo "VERILATOR := verilator" >> tree.make && \
    echo "CLANG := clang" >> tree.make && \
    tools/bin/tmake -build cmod_top

WORKDIR /nvdla/vp
RUN cmake -DCMAKE_INSTALL_PREFIX=build \
          -DSYSTEMC_PREFIX=/usr/local/systemc-2.3.0/ \
          -DNVDLA_HW_PREFIX=/nvdla/hw \
          -DNVDLA_HW_PROJECT=$nvdla_version . && \
          make && \
          make install

WORKDIR /nvdla/vp
RUN mkdir -p images/linux-4.13.3

COPY sw /nvdla/sw/

WORKDIR /nvdla/sw
RUN git checkout $nvdla_version && \
    cp ./prebuilt/linux/Image /nvdla/vp/images/linux-4.13.3/. && \
    cp ./prebuilt/linux/rootfs.ext4 /nvdla/vp/images/linux-4.13.3/. && \
    cp -r prebuilt /nvdla/vp/.

ENV SC_SIGNAL_WRITE_CHECK DISABLE

WORKDIR /nvdla/vp
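
Assuming a build context that contains local checkouts of nvdla/hw (as nvdla_hw), nvdla/vp (as vp) and nvdla/sw (as sw), matching the COPY lines above, the image would be built along these lines (the tag and build-arg value are just examples):

    docker build --build-arg nvdla_version=nv_full -t nvdla_vp:nv_full .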

The nvdla_tools image is just a base image with all the required prerequisites, built using this Dockerfile:

FROM ubuntu:14.04

RUN sudo apt-get update && \
    sudo apt-get install -y software-properties-common && \
    sudo add-apt-repository -y ppa:ubuntu-toolchain-r/test && \
    sudo apt-get update && \
    sudo apt-get install -y cmake libboost-dev python-dev libglib2.0-dev \
                            libpixman-1-dev liblua5.2-dev swig libcap-dev \
                            libattr1-dev && \
    sudo apt-get install -y gcc-4.9 && \
    sudo apt-get install -y g++-4.9 && \
    sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.9 60 && \
    sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-4.9 60 && \
    sudo apt install -y git && \
    sudo add-apt-repository -y ppa:openjdk-r/ppa && \
    sudo apt update && \
    sudo apt install -y openjdk-8-jdk && \
    sudo apt install -y wget

WORKDIR /tmp
RUN wget -O systemc-2.3.0a.tar.gz \
    http://www.accellera.org/images/downloads/standards/systemc/systemc-2.3.0a.tar.gz && \
    tar xf systemc-2.3.0a.tar.gz && \
    rm -f systemc-2.3.0a.tar.gz

WORKDIR /tmp/systemc-2.3.0a
RUN sudo mkdir -p /usr/local/systemc-2.3.0 && \
    mkdir objdir && \
    cd objdir/ && \
    ../configure --prefix=/usr/local/systemc-2.3.0 && \
    make && \
    sudo make install

WORKDIR /tmp
RUN wget -O IO-Tee-0.65.tar.gz \
    http://search.cpan.org/CPAN/authors/id/N/NE/NEILB/IO-Tee-0.65.tar.gz && \
    tar xf IO-Tee-0.65.tar.gz && \
    rm -f IO-Tee-0.65.tar.gz

WORKDIR /tmp/IO-Tee-0.65
RUN perl Makefile.PL && \
    make && \
    sudo make install

WORKDIR /tmp
RUN wget -O YAML-1.24.tar.gz \
    http://search.cpan.org/CPAN/authors/id/T/TI/TINITA/YAML-1.24.tar.gz && \
    tar xf YAML-1.24.tar.gz && \
    rm -f YAML-1.24.tar.gz

WORKDIR /tmp/YAML-1.24
RUN perl Makefile.PL && \
    make && \
    sudo make install

WORKDIR /tmp
RUN wget -O Capture-Tiny-0.48.tar.gz \
    http://search.cpan.org/CPAN/authors/id/D/DA/DAGOLDEN/Capture-Tiny-0.48.tar.gz && \
    tar xf Capture-Tiny-0.48.tar.gz && \
    rm -f Capture-Tiny-0.48.tar.gz

WORKDIR /tmp/Capture-Tiny-0.48
RUN perl Makefile.PL && \
    make && \
    sudo make install

WORKDIR /tmp
RUN wget -O XML-Simple-2.25.tar.gz \
    http://search.cpan.org/CPAN/authors/id/G/GR/GRANTM/XML-Simple-2.25.tar.gz && \
    tar xf XML-Simple-2.25.tar.gz && \
    rm -f XML-Simple-2.25.tar.gz

WORKDIR /tmp/XML-Simple-2.25
RUN perl Makefile.PL && \
    make && \
    sudo make install

WORKDIR /tmp
RUN wget -O XML-Parser-2.44.tar.gz \
    http://search.cpan.org/CPAN/authors/id/T/TO/TODDR/XML-Parser-2.44.tar.gz && \
    tar xf XML-Parser-2.44.tar.gz && \
    rm -f XML-Parser-2.44.tar.gz

WORKDIR /tmp/XML-Parser-2.44
RUN perl Makefile.PL && \
    make && \
    sudo make install

WORKDIR /nvdla

RUN rm -rf /tmp/*

COPY ./Dockerfile /.
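
This prerequisites image has to be built first, with the tag that the VP Dockerfile's FROM line expects:

    docker build -t nvdla_tools:1.0.0 .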

Hope it helps.

@shaumik1

@BigFatFlo thanks a lot for the details! Much appreciated!
For me the issue was resolved by pulling an older commit of nvdla/sw (here), inserting the module with insmod opendla.ko, and using the prebuilt ./nvdla_runtime.

That commit still has the good old opendla.ko in the prebuilt/linux/ directory. The later versions have opendla_1.ko and opendla_2.ko, which seem to cause this issue. (Probably the int8 support messes something up!)
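
If anyone else needs to find such a commit, one generic way (plain git, not something from the NVDLA docs) is to ask which commits last touched the old module path:

    git clone https://github.com/nvdla/sw.git && cd sw
    git log --oneline -- prebuilt/linux/opendla.ko   # commits that added/removed the old module
    git checkout <commit>                            # <commit> = one of the hashes printed above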

@jinyl777

jinyl777 commented Jun 18, 2020

I cannot find opendla.ko, and when I run ./nvdla_compiler I get:

./nvdla_compiler: line 2: syntax error: unexpected redirection
./nvdla_compiler: line 1: ELF�����: not found

Can you help me?

@singhae

singhae commented Apr 6, 2023

opendla.ko -> opendla_1.ko
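
In other words, with the current prebuilt modules the guest-side load step becomes something like this (pick whichever module matches the config your platform was built for):

    insmod drm.ko
    insmod opendla_1.ko    # or opendla_2.ko, depending on the config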
