The bug where nvdla dump_mem always dumps all zeros is caused by calling axi->read() directly here and using its return value as the byte written to the dump file. The fix submits the gem5 memory read request as if NVDLA itself had issued it during execution, then ticks additional cycles until the read response has arrived.
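To illustrate the difference between the two approaches, here is a minimal, self-contained sketch. It is not the code in this PR: ToyMemory, dumpImmediately, and dumpWithTicks are hypothetical names, and the fixed three-tick latency is an assumption chosen only to show that a response is not available in the same call that submits the request.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <map>
#include <vector>

// Hypothetical stand-in for the gem5 memory side: a read request is answered
// only after a fixed number of ticks, never in the call that submits it.
struct ToyMemory {
    struct Pending { uint64_t addr; int ready_in; };

    std::map<uint64_t, uint8_t> data;  // backing store
    std::deque<Pending> pending;       // in-flight read requests
    int latency = 3;                   // assumed ticks until a response arrives

    void submitRead(uint64_t addr) { pending.push_back({addr, latency}); }

    // Advance one tick. Returns true and fills addr/byte when the oldest
    // request has a response ready (at most one response per tick).
    bool tick(uint64_t &addr, uint8_t &byte) {
        for (auto &p : pending) --p.ready_in;
        if (pending.empty() || pending.front().ready_in > 0) return false;
        addr = pending.front().addr;
        auto it = data.find(addr);
        byte = (it != data.end()) ? it->second : 0;
        pending.pop_front();
        return true;
    }
};

// Broken pattern: use whatever is available at submission time as the data.
std::vector<uint8_t> dumpImmediately(ToyMemory &mem, uint64_t base, std::size_t len) {
    std::vector<uint8_t> out;
    for (std::size_t i = 0; i < len; ++i) {
        mem.submitRead(base + i);
        out.push_back(0);  // no response has arrived yet, so only a placeholder exists
    }
    return out;
}

// Fixed pattern: submit the requests, then keep ticking until every response
// has arrived (or a safety limit on the number of ticks is reached).
std::vector<uint8_t> dumpWithTicks(ToyMemory &mem, uint64_t base, std::size_t len,
                                   int max_ticks = 1000) {
    std::vector<uint8_t> out(len, 0);
    for (std::size_t i = 0; i < len; ++i) mem.submitRead(base + i);

    std::size_t received = 0;
    for (int t = 0; t < max_ticks && received < len; ++t) {
        uint64_t addr = 0;
        uint8_t byte = 0;
        if (mem.tick(addr, byte)) {
            out[addr - base] = byte;  // place the byte at its offset in the dump
            ++received;
        }
    }
    return out;
}
```

In this toy model, dumpImmediately yields only zeros regardless of memory contents, while dumpWithTicks returns the stored bytes once enough ticks have elapsed.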
Another bug lies in rtlNVDLA::runIterationNVDLA() (here), where wr->clearOutput() was originally called after trace->axievent() and before processOutput(output), so the read and write requests submitted by dump_mem and load_mem were discarded. I moved wr->clearOutput() to the beginning of the function so that these requests are kept. If this causes other errors, please let me know.
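The ordering issue can be reproduced in a small toy model. This is only a sketch under assumptions: ToyIteration, traceEvent, and ClearAt are hypothetical, and the method bodies stand in for the real steps named above rather than reproducing them.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Where the output buffer is cleared within one iteration.
enum class ClearAt { Start, BeforeProcess };

// Hypothetical model of one iteration: requests are collected into `output`
// and then handed on by processOutput().
struct ToyIteration {
    std::vector<uint64_t> output;     // requests collected during this iteration
    std::vector<uint64_t> forwarded;  // requests handed on to the memory system

    void clearOutput() { output.clear(); }

    // Stand-in for the trace step: a dump_mem-like command queues a request.
    void traceEvent(uint64_t addr) { output.push_back(addr); }

    void processOutput() {
        forwarded.insert(forwarded.end(), output.begin(), output.end());
    }

    void run(ClearAt where, uint64_t addr) {
        if (where == ClearAt::Start) clearOutput();          // order after the change
        traceEvent(addr);
        if (where == ClearAt::BeforeProcess) clearOutput();  // original order: wipes the request
        processOutput();
    }
};
```

With ClearAt::BeforeProcess the queued request never reaches forwarded; with ClearAt::Start it does, because the buffer is emptied before anything is queued for the current iteration.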
After these changes, sanity3, conv_8x8_fc_int16, and sdp_relu_int16 pass with nv_full in my local tests. googlenet_conv2_3x3_int16 reports a CSB read mismatch, but its dump file matches the golden answer. This should be acceptable, because the verilated nv_full shows the same result when verified together with nvdla's nvdla.cpp.