user/shatov/modexpng - "Next-generation" modular exponentiation using the specialized DSP slices present in the Artix-7 FPGA

Age	Commit message	Author
2020-01-20	Updated uOP engine to match the changes made to the general worker module	Pavel V. Shatov (Meister)
	(modular subtraction was split into three micro-operations instead of one).
2019-11-19	Removed the latch accidentally created while pipelining the uOP engine module.	Pavel V. Shatov (Meister)
	The FSM previously had four states encoded using two bits, so the next-state logic didn't have a default case, since all the possible states were used. Addition of the fifth state required one more state bit, so the FSM now has five states out of eight possible and a default case is thus necessary.
2019-11-16	The uOP engine didn't compile at 180 MHz. The pipeline had two stages: FETCH	Pavel V. Shatov (Meister)
	and DECODE. Apparently one clock cycle is not enough to entirely decode an instruction, so decoding now takes two clock cycles (DECODE_1 and DECODE_2). This seems to solve the problem. If we run into more timing violations here, we can add an extra DECODE_3 cycle and register the currently combinatorial uop_opcode_* flags at DECODE_2. This fix increases the core's latency by 59/32 clock cycles (CRT/non-CRT mode) plus two extra clock cycles for each bit of the exponent.
2019-10-23	Added missing copyright headers.	Pavel V. Shatov (Meister)

2019-10-23	Added simulation-only code to measure multiplier load.	Pavel V. Shatov (Meister)

2019-10-21	Further work:	Pavel V. Shatov (Meister)
	- added core wrapper
	- fixed module resets across entire core (all the resets are now consistently active-low)
	- continued refactoring
2019-10-21	Added support for non-CRT mode. Further refactoring.	Pavel V. Shatov (Meister)

2019-10-21	The entire CRT signature algorithm works now.	Pavel V. Shatov (Meister)
	Moved micro-operations handler into a separate module file; this way we don't have any synthesized stuff in the top-level module, just instantiations. This is more consistent from the design partitioning point of view. Btw, Xilinx claims their tools work better that way too, but who knows... Added optional simulation-only code to assist debugging. Uncomment the ENABLE_DEBUG `define in 'rtl/modexpng_parameters.vh' to use, but don't ever try to synthesize the core with debugging enabled.
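
The latch described in the 2019-11-19 entry can be illustrated with a minimal sketch. This is not the actual uOP engine code: the module name, the state names, the ena/done ports and the asynchronous reset style are all assumptions made for the example. It only shows why a five-state FSM encoded in three bits needs a default branch in its combinational next-state logic.

```verilog
// Hypothetical sketch, NOT taken from the modexpng sources.
module fsm_default_sketch
(
    input  wire clk,
    input  wire rst_n,  // active-low reset (reset style is an assumption)
    input  wire ena,    // hypothetical start strobe
    input  wire done,   // hypothetical "operation finished" flag
    output wire rdy
);

    // Five states need three bits, so three of the eight encodings are unused.
    localparam [2:0] ST_IDLE     = 3'd0;
    localparam [2:0] ST_FETCH    = 3'd1;
    localparam [2:0] ST_DECODE_1 = 3'd2;
    localparam [2:0] ST_DECODE_2 = 3'd3;
    localparam [2:0] ST_WAIT     = 3'd4;

    reg [2:0] fsm_state;
    reg [2:0] fsm_state_next;

    // State register.
    always @(posedge clk or negedge rst_n)
        if (!rst_n) fsm_state <= ST_IDLE;
        else        fsm_state <= fsm_state_next;

    // Combinational next-state logic. Without the default branch,
    // fsm_state_next is not assigned for encodings 5..7, and synthesis
    // infers a latch to hold its previous value.
    always @* begin
        case (fsm_state)
            ST_IDLE:     fsm_state_next = ena  ? ST_FETCH : ST_IDLE;
            ST_FETCH:    fsm_state_next = ST_DECODE_1;
            ST_DECODE_1: fsm_state_next = ST_DECODE_2;
            ST_DECODE_2: fsm_state_next = ST_WAIT;
            ST_WAIT:     fsm_state_next = done ? ST_IDLE : ST_WAIT;
            default:     fsm_state_next = ST_IDLE;  // covers unused encodings
        endcase
    end

    assign rdy = (fsm_state == ST_IDLE);

endmodule
```

With four states in two bits every encoding is listed in the case statement, so the default branch is redundant. Once a third state bit is added, the default branch (or an unconditional assignment to fsm_state_next before the case statement) is what keeps the logic purely combinational.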