Commit

added report
0BAB1 committed May 9, 2024
1 parent 5f8979c commit c01fb18
Showing 15 changed files with 24,053 additions and 3 deletions.
10 changes: 7 additions & 3 deletions README.md
@@ -24,12 +24,16 @@ into a tightly coupled co-processor using SRAM. This SRAM should be as small a sp
- You HAVE to specify the max kernel size in ./core/cvxif_example/cvxif_example_coprocessor.sv under the "Nb_of_regs" parameter.

Then, to take full advantage of this design based on a home-made TPU (Tensor Processing Unit), you need to tell
the compiler a few specific things using assembly
the compiler a few specific things using inline assembly:

- First : load data in CVXIF using ... and ...
- Then : Lunch ...
- First: load data into CV-X-IF using the LBC and LBCU instructions
- Then: launch the MAC instruction (this reads the tensor operation result from CV-X-IF and clears all of its registers)
- Finally: to push performance further, you can add load checks to avoid re-loading a kernel that is already loaded in memory
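
The three steps above can be sketched in plain C as a software model. This is only an illustration under stated assumptions, not the code from NetworkPropagate.c: `model_lbc`, `model_lbcu` and `model_mac` are hypothetical stand-ins for the inline-assembly LBC, LBCU and MAC instructions, the capacity `NB_OF_REGS` is an arbitrary value, and the split between "LBC loads signed kernel weights" and "LBCU loads unsigned input data" is a guess. The model also assumes that MAC clears the data side while the kernel stays resident, which is what would make a load check useful.

```c
#include <stddef.h>
#include <stdint.h>

#define NB_OF_REGS 32  /* hypothetical capacity, cf. the "Nb_of_regs" parameter */

/* Software model of the coprocessor state (assumption, not the RTL). */
static int8_t  kernel_mem[NB_OF_REGS];    /* kernel weights, kept across MACs */
static uint8_t data_regs[NB_OF_REGS];     /* input data, cleared by MAC       */
static size_t  kernel_len = 0, data_len = 0;
static const int8_t *loaded_kernel = NULL; /* tag used by the load check      */
static unsigned kernel_loads = 0;          /* number of real kernel loads     */

/* Stand-in for LBC: push one signed byte (assumed to be a weight). */
static void model_lbc(int8_t w) {
    if (kernel_len < NB_OF_REGS) kernel_mem[kernel_len++] = w;
}

/* Stand-in for LBCU: push one unsigned byte (assumed to be an input). */
static void model_lbcu(uint8_t x) {
    if (data_len < NB_OF_REGS) data_regs[data_len++] = x;
}

/* Stand-in for MAC: return the dot product, then clear the data side. */
static int32_t model_mac(void) {
    int32_t acc = 0;
    size_t n = kernel_len < data_len ? kernel_len : data_len;
    for (size_t i = 0; i < n; i++)
        acc += (int32_t)kernel_mem[i] * (int32_t)data_regs[i];
    data_len = 0;
    return acc;
}

/* Load check: skip the LBC sequence when this kernel is already loaded. */
static void load_kernel_checked(const int8_t *k, size_t n) {
    if (k == loaded_kernel && n == kernel_len) return;
    kernel_len = 0;
    for (size_t i = 0; i < n; i++) model_lbc(k[i]);
    loaded_kernel = k;
    kernel_loads++;
}

/* One full step: (checked) kernel load, data load, then MAC. */
static int32_t mac_step(const int8_t *k, const uint8_t *x, size_t n) {
    load_kernel_checked(k, n);
    for (size_t i = 0; i < n; i++) model_lbcu(x[i]);
    return model_mac();
}
```

Calling `mac_step` twice with the same kernel pointer and different inputs performs the kernel load only once; in this model the second call goes straight to the data load and the MAC. On real hardware each `model_*` function would be replaced by the corresponding inline-assembly instruction.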

In this example, we modified the MNIST program (./sw/app/mnist/NetworkPropagate.c).

More information in ./REPORT (see the PDF article).

### Building the software binaries

RISC-V binaries are built using GCC and binutils. Make sure to use the right toolchain by building a docker container using
Binary file added REPORT/CVA6_With_custom_mem-1.pdf
Binary file not shown.
19 changes: 19 additions & 0 deletions REPORT/README.md
@@ -0,0 +1,19 @@
# REPORT

TEAM : Supaero
Member : Hugo BABIN-RIBY
Coach : Arnaud DION

# A word on the report article

I made 3 designs. This repo only contains the fastest one, but the report article covers all 3 of them because I thought it was worthwhile to make comparisons and to show some examples of what can be done on CVA6 and CV-X-IF.
NB: the video will not talk about the first two designs and will focus only on the last one (i.e. the one in this repo).

# This folder contains:

- 6-page (more like 8...) report written as a scientific paper
- Simulation log (uart)
- P&R report with the maximum frequency (corev_apu/fpga/report_cva6_fpga_impl/cva6_fpga.timing.rpt)
- Report of resources (corev_apu/fpga/report_cva6_fpga_impl/cva6_fpga.utilization.rpt)
- Log of the execution on the FPGA board (copy or screenshot of the hyperterminal output)
- Synth reports
Binary file added REPORT/fpga_exec_log.png
126 changes: 126 additions & 0 deletions REPORT/reports_cva6_fpga_impl/cva6_fpga.check_timing.rpt
@@ -0,0 +1,126 @@
Copyright 1986-2022 Xilinx, Inc. All Rights Reserved. Copyright 2022-2023 Advanced Micro Devices, Inc. All Rights Reserved.
---------------------------------------------------------------------------------------------------------------------------------------------
| Tool Version : Vivado v.2023.2 (lin64) Build 4029153 Fri Oct 13 20:13:54 MDT 2023
| Date : Mon Apr 29 18:29:28 2024
| Host : rootmin-Nitro-AN515-57 running 64-bit Ubuntu 22.04.4 LTS
| Command : check_timing -file reports_cva6_fpga_impl/cva6_fpga.check_timing.rpt
| Design : cva6_zybo_z7_20
| Device : 7z020-clg400
| Speed File : -1 PRODUCTION 1.12 2019-11-22
| Design State : Routed
---------------------------------------------------------------------------------------------------------------------------------------------


check_timing report

Table of Contents
-----------------
1. checking no_clock (480)
2. checking constant_clock (0)
3. checking pulse_width_clock (0)
4. checking unconstrained_internal_endpoints (32)
5. checking no_input_delay (2)
6. checking no_output_delay (1)
7. checking multiple_clock (0)
8. checking generated_clocks (0)
9. checking loops (0)
10. checking partial_input_delay (3)
11. checking partial_output_delay (0)
12. checking latch_loops (0)

1. checking no_clock (480)
--------------------------
There are 32 register/latch pins with no clock driven by root clock pin: i_ariane/i_cva6/ex_stage_i/lsu_i/i_load_unit/ldbuf_q_reg[0][operation][0]/Q (HIGH)

There are 32 register/latch pins with no clock driven by root clock pin: i_ariane/i_cva6/ex_stage_i/lsu_i/i_load_unit/ldbuf_q_reg[0][operation][1]/Q (HIGH)

There are 32 register/latch pins with no clock driven by root clock pin: i_ariane/i_cva6/ex_stage_i/lsu_i/i_load_unit/ldbuf_q_reg[0][operation][2]/Q (HIGH)

There are 32 register/latch pins with no clock driven by root clock pin: i_ariane/i_cva6/ex_stage_i/lsu_i/i_load_unit/ldbuf_q_reg[0][operation][3]/Q (HIGH)

There are 32 register/latch pins with no clock driven by root clock pin: i_ariane/i_cva6/ex_stage_i/lsu_i/i_load_unit/ldbuf_q_reg[0][operation][4]/Q (HIGH)

There are 32 register/latch pins with no clock driven by root clock pin: i_ariane/i_cva6/ex_stage_i/lsu_i/i_load_unit/ldbuf_q_reg[0][operation][5]/Q (HIGH)

There are 32 register/latch pins with no clock driven by root clock pin: i_ariane/i_cva6/ex_stage_i/lsu_i/i_load_unit/ldbuf_q_reg[0][operation][6]/Q (HIGH)

There are 32 register/latch pins with no clock driven by root clock pin: i_ariane/i_cva6/ex_stage_i/lsu_i/i_load_unit/ldbuf_q_reg[1][operation][0]/Q (HIGH)

There are 32 register/latch pins with no clock driven by root clock pin: i_ariane/i_cva6/ex_stage_i/lsu_i/i_load_unit/ldbuf_q_reg[1][operation][1]/Q (HIGH)

There are 32 register/latch pins with no clock driven by root clock pin: i_ariane/i_cva6/ex_stage_i/lsu_i/i_load_unit/ldbuf_q_reg[1][operation][2]/Q (HIGH)

There are 32 register/latch pins with no clock driven by root clock pin: i_ariane/i_cva6/ex_stage_i/lsu_i/i_load_unit/ldbuf_q_reg[1][operation][3]/Q (HIGH)

There are 32 register/latch pins with no clock driven by root clock pin: i_ariane/i_cva6/ex_stage_i/lsu_i/i_load_unit/ldbuf_q_reg[1][operation][4]/Q (HIGH)

There are 32 register/latch pins with no clock driven by root clock pin: i_ariane/i_cva6/ex_stage_i/lsu_i/i_load_unit/ldbuf_q_reg[1][operation][5]/Q (HIGH)

There are 32 register/latch pins with no clock driven by root clock pin: i_ariane/i_cva6/ex_stage_i/lsu_i/i_load_unit/ldbuf_q_reg[1][operation][6]/Q (HIGH)

There are 32 register/latch pins with no clock driven by root clock pin: i_ariane/i_cva6/gen_cache_wt.i_cache_subsystem/i_wt_dcache/gen_rd_ports[1].i_wt_dcache_ctrl/id_q_reg[0]/Q (HIGH)


2. checking constant_clock (0)
------------------------------
There are 0 register/latch pins with constant_clock.


3. checking pulse_width_clock (0)
---------------------------------
There are 0 register/latch pins which need pulse_width check


4. checking unconstrained_internal_endpoints (32)
-------------------------------------------------
There are 32 pins that are not constrained for maximum delay. (HIGH)

There are 0 pins that are not constrained for maximum delay due to constant clock.


5. checking no_input_delay (2)
------------------------------
There are 2 input ports with no input delay specified. (HIGH)

There are 0 input ports with no input delay but user has a false path constraint.


6. checking no_output_delay (1)
-------------------------------
There is 1 port with no output delay specified. (HIGH)

There are 0 ports with no output delay but user has a false path constraint

There are 0 ports with no output delay but with a timing clock defined on it or propagating through it


7. checking multiple_clock (0)
------------------------------
There are 0 register/latch pins with multiple clocks.


8. checking generated_clocks (0)
--------------------------------
There are 0 generated clocks that are not connected to a clock source.


9. checking loops (0)
---------------------
There are 0 combinational loops in the design.


10. checking partial_input_delay (3)
------------------------------------
There are 3 input ports with partial input delay specified. (HIGH)


11. checking partial_output_delay (0)
-------------------------------------
There are 0 ports with partial output delay specified.


12. checking latch_loops (0)
----------------------------
There are 0 combinational latch loops in the design through latch input

