path: root/src/rtl/modexpa7_systolic_multiplier.v
Age | Commit message | Author
2018-12-19 | Use primitives from core/lib (HEAD, master) | Pavel V. Shatov (Meister)
2017-08-11 | Minor cleanup. | Pavel V. Shatov (Meister)
2017-08-11 | Work in progress. | Pavel V. Shatov (Meister)
2017-08-06 | * Moved systolic processing element array into a separate module. | Pavel V. Shatov (Meister)
    * Finished top-level wrapper module.
2017-07-27 | Work in progress. | Pavel V. Shatov (Meister)
2017-07-25 | Work in progress. | Pavel V. Shatov (Meister)
2017-07-25 | Wide operand loader needs simplification... | Pavel V. Shatov (Meister)
2017-07-25 | Trying to fix the bug during calculation of SN in systolic multiplier. | Pavel V. Shatov (Meister)
2017-07-23 | Converted pe_t array into a FIFO too. No more nasty messages during synthesis. Still needs a tiny bit of cleanup. | Pavel V. Shatov (Meister)
2017-07-20 | Converted pe_c_out_mem two-dimensional array into a FIFO. | Pavel V. Shatov (Meister)
2017-07-19 | Fixed bug in systolic multiplier (swapped indices); it only worked because the testbench set both NUM_SYSTOLIC_CYCLES = 4 and SYSTOLIC_ARRAY_LENGTH = 4. Now should work with any array power, not only 2. | Pavel V. Shatov (Meister)
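The reason a swapped-index bug can hide when both dimensions are equal can be sketched with a small Python model. This is only an illustration, not the actual RTL: the flattened addressing scheme and the function name `layout_is_valid` are assumptions made for the example. With a square layout the swapped address is merely a transposed but still valid mapping; as soon as the two dimensions differ, addresses either collide or fall outside the memory.

```python
def layout_is_valid(num_cycles, array_length, swapped):
    """Check that every (cycle, pe) pair maps to its own in-range address.

    Hypothetical model: a 2-D memory of num_cycles rows by array_length
    columns, flattened row by row. `swapped=True` exchanges the two
    indices while keeping the row stride, mimicking an index mix-up.
    """
    size = num_cycles * array_length
    addrs = []
    for cycle in range(num_cycles):
        for pe in range(array_length):
            if swapped:
                addrs.append(pe * array_length + cycle)   # indices exchanged
            else:
                addrs.append(cycle * array_length + pe)   # intended layout
    # Valid only if the addresses are exactly 0 .. size-1, each used once.
    return sorted(addrs) == list(range(size))


if __name__ == "__main__":
    for cycles, length in [(4, 4), (2, 4), (8, 4)]:
        print(cycles, length,
              layout_is_valid(cycles, length, swapped=False),
              layout_is_valid(cycles, length, swapped=True))
```

A testbench that only exercises the square case cannot tell the two layouts apart, which is why varying the two parameters independently is a useful check.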
2017-07-13 | Systolic multiplier simplified a bit: | Pavel V. Shatov (Meister)
    * passes testbench tests again
    * this time synthesizes fine (without major issues)

    List of things that need polishing in the future:
    * Parallelized operand loader can be reduced by a factor of 3 to only store one operand at a time: it currently stores B, N_COEFF and N. After B is consumed, it can be overwritten with AB, and N_COEFF can be loaded sequentially the same way A is loaded. After that, the loader can be filled with Q while N is loaded sequentially.
    * Turns out the QN block memory is not needed at all. After we obtain the next word of QN, we immediately calculate SN. After that QN can be discarded, no need to store it.
    * Currently there are two wide memories, T and PE_C_OUT. XST throws weird warnings about multi-port RAM before finally deciding to implement them using flip-flops. Those memories should be turned into FIFOs to simplify the design and not confuse XST.
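The observation that QN does not need to be stored can be illustrated with a behavioral Python sketch of a Montgomery product built from three multiplications (AB, Q, QN). Everything here is an assumption made for illustration rather than a description of the RTL: the function names, the default word size, reading N_COEFF as -N^(-1) mod 2^k, and reading SN as S - N. The point is only that each QN word is added into S as soon as it appears and is then dropped.

```python
def words_of(value, word_bits, count):
    """Yield `count` little-endian words of `value`, one at a time."""
    mask = (1 << word_bits) - 1
    for i in range(count):
        yield (value >> (i * word_bits)) & mask


def montgomery_product_streaming(a, b, n, word_bits=32, num_words=4):
    """Return a*b*2^(-k) mod n for odd n, with k = word_bits*num_words.

    Assumes a, b < n < 2^k. Requires Python 3.8+ for pow(n, -1, m).
    """
    k = word_bits * num_words
    r_mask = (1 << k) - 1
    word_mask = (1 << word_bits) - 1

    n_coeff = (-pow(n, -1, 1 << k)) & r_mask   # assumed meaning of N_COEFF
    ab = a * b                                  # first multiplication
    q = ((ab & r_mask) * n_coeff) & r_mask      # second multiplication, low half

    # Third multiplication: the generator stands in for the multiplier
    # output, handing over one QN word per step.
    ab_words = words_of(ab, word_bits, 2 * num_words)
    qn_words = words_of(q * n, word_bits, 2 * num_words)

    s = 0
    carry = 0
    for i, (ab_w, qn_w) in enumerate(zip(ab_words, qn_words)):
        total = ab_w + qn_w + carry             # qn_w is used once, then dropped
        carry = total >> word_bits
        s |= (total & word_mask) << (i * word_bits)
    s |= carry << (2 * k)

    s_hi = s >> k          # the low k bits of S are zero by construction
    sn = s_hi - n          # assumed meaning of SN
    return sn if sn >= 0 else s_hi


if __name__ == "__main__":
    n = 0xF123456789ABCDEF0123456789ABCDEF
    a = 0x0123456789ABCDEF0123456789ABCDEF
    b = 0x0FEDCBA9876543210FEDCBA987654321
    print(hex(montgomery_product_streaming(a, b, n)))
```

In this model the only state carried between QN words is the running carry, which is consistent with the note that a dedicated QN memory is unnecessary.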
2017-07-10 | * made separate file for low-level settings | Pavel V. Shatov (Meister)
    * turned crazy triple multiplier array into one array with input mux
2017-07-04 | Fixed generic/vendor low-level primitives switch. | Pavel V. Shatov (Meister)
2017-07-01 | Started porting generic multiplier to Xilinx primitives. | Pavel V. Shatov (Meister)
2017-06-27 | Added systolic modular multiplier w/ testbench. | Pavel V. Shatov (Meister)
    * works in simulator
    * may have to change how internal operand buffer is pre-loaded (shift register instead of wide mux?)
    * code needs some cleanup
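The shift-register-versus-wide-mux question raised in the last entry can be sketched in Python. This is a rough behavioral comparison under assumed names (`read_with_wide_mux`, `read_with_shift_register`), not the module's actual loader: both approaches deliver the operand words in the same order, but the first needs random access to every word while the second only ever reads slot 0.

```python
def read_with_wide_mux(words):
    """Select word i directly from the full buffer on cycle i."""
    return [words[i] for i in range(len(words))]


def read_with_shift_register(words):
    """Always read slot 0, then rotate the buffer by one word per cycle."""
    buf = list(words)
    out = []
    for _ in range(len(buf)):
        out.append(buf[0])          # fixed read position, no selector needed
        buf = buf[1:] + buf[:1]     # rotate left by one word
    return out


if __name__ == "__main__":
    operand = [0x11, 0x22, 0x33, 0x44]
    print(read_with_wide_mux(operand))
    print(read_with_shift_register(operand))
```

In hardware terms the trade-off would be a wide selector against a chain of registers with one fixed tap; which is cheaper depends on the target device.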