Age | Commit message (Collapse) | Author | |
---|---|---|---|
2017-08-11 | Minor cleanup, removed unused flag register 'shreg_now_latency'. | Pavel V. Shatov (Meister) | |
2017-08-11 | CRT mode seems to work. Finally. | Pavel V. Shatov (Meister) | |
Strangely enough non-CRT mode continues to work fine(!). One does not simply add a feature without breaking something else. Very suspicious... | |||
2017-08-11 | Work in progress. | Pavel V. Shatov (Meister) | |
2017-08-09 | Added 'modexpa7_' prefix to all the low-level modules in /src/rtl/pe/ to prevent clashes with low-level modules in ECDSA multipliers. | Pavel V. Shatov (Meister) | |
We should consolidate all the low-level stuff across all the math cores in the future. | |||
2017-08-07 | * Added readme file (v0.20) | Pavel V. Shatov (Meister) | |
* Enabled vendor-specific primitive usage for compilation | |||
2017-08-06 | * Moved systolic processing element array into a separate module. | Pavel V. Shatov (Meister) | |
* Finished top-level wrapper module. | |||
2017-07-27 | Work in progress. | Pavel V. Shatov (Meister) | |
2017-07-25 | Work in progress. | Pavel V. Shatov (Meister) | |
2017-07-25 | Wide operand loader needs simplification... | Pavel V. Shatov (Meister) | |
2017-07-25 | Trying to fix the bug during calculation of SN in systolic multiplier. | Pavel V. Shatov (Meister) | |
2017-07-24 | Started adding top-level wrapper. | Pavel V. Shatov (Meister) | |
2017-07-23 | Wrote top-level module. 4096-bit core with 16-tap systolic array synthesizes just fine: 10% slices, 8% block memory, 33% DSPs. | Pavel V. Shatov (Meister) | |
2017-07-23 | Converted pe_t array into a FIFO too. No more nasty messages during synthesis. Still needs a tiny bit of cleanup. | Pavel V. Shatov (Meister) | |
2017-07-20 | Force inference of distributed memory for the simple FIFO used to store carries. | Pavel V. Shatov (Meister) | |
2017-07-20 | Converted pe_c_out_mem two-dimensional array into a FIFO. | Pavel V. Shatov (Meister) | |
2017-07-19 | Fixed bug in systolic multiplier (swapped indices). | Pavel V. Shatov (Meister) | |
It only worked because the testbench set both NUM_SYSTOLIC_CYCLES = 4 and SYSTOLIC_ARRAY_LENGTH = 4. Now it should work with any array power, not only 2. | |||
2017-07-19 | Added pre-multiplication step. | Pavel V. Shatov (Meister) | |
Added 512-bit testbench. | |||
2017-07-19 | Finished modular exponentiation module: | Pavel V. Shatov (Meister) | |
* works in simulator | |||
* passes synthesis without major issues | |||
Started adding pre-multiplication logic... | |||
2017-07-18 | Started adding exponentiator module w/ testbench. | Pavel V. Shatov (Meister) | |
2017-07-13 | Systolic multiplier simplified a bit: | Pavel V. Shatov (Meister) | |
* passes testbench tests again | |||
* this time synthesizes fine (without major issues) | |||
List of things that need polishing in the future: | |||
* Parallelized operand loader can be reduced by a factor of 3 to only store one operand at a time: it currently stores B, N_COEFF and N. After B is consumed, it can be overwritten with AB, and N_COEFF can be loaded sequentially the same way A is loaded. After that, the loader can be filled with Q while N is loaded sequentially. | |||
* Turns out QN block memory is not needed at all. After we obtain the next word of QN, we immediately calculate SN. After that QN can be discarded, no need to store it. | |||
* Currently there are two wide memories, T and PE_C_OUT. XST throws weird warnings about multi-port RAM before finally deciding to implement them using flip-flops. Those memories should be turned into FIFOs to simplify the design and not confuse XST. | |||
2017-07-10 | * made separate file for low-level settings | Pavel V. Shatov (Meister) | |
* turned crazy triple multiplier array into one array with input mux | |||
2017-07-04 | Fixed generic/vendor low-level primitives switch. | Pavel V. Shatov (Meister) | |
2017-07-04 | Fixing generic/vendor primitive switching... | Pavel V. Shatov (Meister) | |
2017-07-01 | Started porting generic multiplier to Xilinx primitives. | Pavel V. Shatov (Meister) | |
2017-07-01 | Added generic/vendor-specific primitive selector for simulation. | Pavel V. Shatov (Meister) | |
2017-07-01 | Cleaned up Verilog sources | Pavel V. Shatov (Meister) | |
2017-07-01 | Added 512-bit test vector | Pavel V. Shatov (Meister) | |
Cleaned up Verilog a bit | |||
2017-07-01 | Finished modulus-dependent coefficient calculation module: | Pavel V. Shatov (Meister) | |
* fixed bug with latency compensation | |||
* cleaned up Verilog source | |||
* added 512-bit testbench | |||
* works in simulator | |||
* synthesizes without warnings | |||
Changes: | |||
* made latency of generic processing element configurable | |||
2017-06-27 | Added Montgomery modulus-dependent coefficient calculation block | Pavel V. Shatov (Meister) | |
* work in progress | |||
2017-06-27 | Added Montgomery factor calculation block | Pavel V. Shatov (Meister) | |
* works in simulator | |||
* passes synthesis w/o warnings | |||
* code needs minor cleanup | |||
2017-06-27 | Added systolic modular multiplier w/ testbench. | Pavel V. Shatov (Meister) | |
* works in simulator | |||
* may have to change how internal operand buffer is pre-loaded (shift register instead of wide mux?) | |||
* code needs some cleanup | |||
2017-06-27 | Added generic processing elements. | Pavel V. Shatov (Meister) | |
2017-06-27 | Start conversion to systolic architecture. | Pavel V. Shatov (Meister) | |
2016-06-14 | Initial commit | Paul Selkirk | |
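
The log refers to a Montgomery factor, a modulus-dependent coefficient (N_COEFF), a word-serial multiplier, and the observation that QN does not need to be stored. As a rough software illustration only, the Python sketch below shows a textbook word-serial Montgomery multiplication and a square-and-multiply exponentiation built on it. It is not taken from the Verilog sources: the function names, the 32-bit default word size, and the assumption that N_COEFF means -N^-1 mod 2^w and that the Montgomery factor means R^2 mod N are all hypothetical choices made for this example. It needs Python 3.8+ for `pow(n, -1, m)`.

```python
def montgomery_setup(n, word_bits, num_words):
    """Assumed meanings: n_coeff = -n^-1 mod 2^w, factor = R^2 mod n (n must be odd)."""
    word = 1 << word_bits
    r = 1 << (word_bits * num_words)
    n_coeff = (-pow(n, -1, word)) % word
    factor = (r * r) % n
    return n_coeff, factor


def montgomery_multiply(a, b, n, n_coeff, word_bits, num_words):
    """Return a * b * R^-1 mod n for a, b < n, consuming one word of a per step."""
    mask = (1 << word_bits) - 1
    s = 0
    for i in range(num_words):
        a_i = (a >> (i * word_bits)) & mask
        s += a_i * b
        # q is chosen so that the low word of s + q*n is zero.
        q = ((s & mask) * n_coeff) & mask
        # q*n is added and the sum shifted right away; the product is never kept.
        s = (s + q * n) >> word_bits
    return s - n if s >= n else s


def modexp(base, exponent, n, word_bits=32):
    """Left-to-right square-and-multiply in the Montgomery domain (odd n > 1)."""
    num_words = -(-n.bit_length() // word_bits)  # ceiling division
    n_coeff, factor = montgomery_setup(n, word_bits, num_words)

    def mul(x, y):
        return montgomery_multiply(x, y, n, n_coeff, word_bits, num_words)

    x = mul(base % n, factor)  # base * R mod n
    y = mul(1, factor)         # R mod n, i.e. 1 in the Montgomery domain
    for bit in bin(exponent)[2:]:
        y = mul(y, y)
        if bit == "1":
            y = mul(y, x)
    return mul(y, 1)           # leave the Montgomery domain
```

A quick way to exercise the sketch is to compare `modexp(b, e, n)` against Python's built-in `pow(b, e, n)` for an odd modulus `n`. The hardware core distributes the inner multiplications across a systolic array, which this sequential model does not attempt to reproduce.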