Age | Commit message | Author |
|
|
|
faster than the bus clock now. It can be the same, or, say, four times faster.
|
|
Moved the micro-operations handler into a separate module file; this way the
top-level module contains no synthesized logic, just instantiations. This is
more consistent from a design partitioning point of view. Btw, Xilinx claims
their tools work better that way too, but who knows...
Added optional simulation-only code to assist debugging. Uncomment the
ENABLE_DEBUG `define in 'rtl/modexpng_parameters.vh' to use it, but never
try to synthesize the core with debugging enabled.
|
|
step of the Garner's formula algorithm. Note that the addition is "uneven" in
the sense that the first operand is full-size (as wide as the modulus), while
the second one is only half that size. The adder internally banks the second
input port during the second half of the addition.
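
To illustrate what an "uneven" addition looks like, here is a minimal Python
sketch of a word-serial adder with a full-size first operand and a half-size
second operand. It is not taken from the core: the function name, the 16-bit
word size and the choice to feed zero into the second input during the upper
half are all assumptions made for illustration.

```python
def uneven_add(a_words, b_words, word_bits=16):
    """Add a full-size operand and a half-size operand, word by word.

    a_words: little-endian list of words, full size (hypothetical layout).
    b_words: little-endian list of words, half as many as a_words.
    Returns (sum_words, carry_out).
    """
    n = len(a_words)
    assert len(b_words) * 2 == n, "second operand must be half the size"
    mask = (1 << word_bits) - 1
    carry = 0
    out = []
    for i in range(n):
        # Assumption: during the second half of the addition the second
        # input contributes nothing, so only the carry propagates.
        b = b_words[i] if i < len(b_words) else 0
        s = a_words[i] + b + carry
        out.append(s & mask)
        carry = s >> word_bits
    return out, carry
```

In the textbook form of Garner's recombination, m = m_q + q * h, the product
q * h is as wide as the modulus while m_q is only half as wide, which is one
way such an addition can arise.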
|
|
regular (not modular) multiplication. We're doing this by telling the modular
multiplier to stop after the "square" step, which computes A*B. The problem is
that the multiplier stores the lower part of the product in the internal bank L
and the upper part in the internal bank H, but we need to be able to do
operations on the product as a whole. MERGE_LH combines the two halves of the
product into one bank.
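
The idea of splitting a product into a lower and an upper half and then
merging them back can be sketched in Python as follows. This only models the
arithmetic, not the block memory layout; `split_product`, `merge_lh` and the
`half_bits` parameter are hypothetical names used for illustration.

```python
def split_product(a, b, half_bits):
    """Multiply a by b and return (low, high) halves of the product.

    low holds the lower half_bits bits (think "bank L"),
    high holds everything above them (think "bank H").
    """
    product = a * b
    low = product & ((1 << half_bits) - 1)
    high = product >> half_bits
    return low, high


def merge_lh(low, high, half_bits):
    """Combine the two halves back into one full-width value."""
    return (high << half_bits) | low
```

For example, with 16-bit operands 0xFFFF and 0xFFFF the halves are 0x0001 and
0xFFFE, and merging them gives back the full product 0xFFFE0001.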
|
|
Added modular subtraction micro-operation
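
For reference, modular subtraction of two already-reduced values can be
sketched as a plain subtraction followed by a conditional correction. This is
the generic textbook approach, not necessarily how the micro-operation is
implemented; `mod_sub` is a hypothetical name.

```python
def mod_sub(a, b, n):
    """Return (a - b) mod n, assuming 0 <= a < n and 0 <= b < n."""
    d = a - b
    if d < 0:
        # The subtraction borrowed, so add the modulus back once.
        d += n
    return d
```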
|
|
|
|
is basically a block memory data mover, but it can also do some supporting
operations required for the Garner's formula part of the exponentiation.
|
|
the B input of the modular multiplier to 1; this is necessary to bring numbers
out of the Montgomery domain).
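
Why multiplying by 1 leaves the Montgomery domain can be seen from a small
Python sketch of textbook Montgomery multiplication. The code below is an
illustration only (it needs Python 3.8+ for the modular inverse via `pow`);
the function names and the tiny modulus in the usage note are assumptions,
not values from the core.

```python
def mont_mul(a, b, n, r_bits):
    """Return a * b * R^-1 mod n, where R = 2**r_bits.

    Assumes n is odd, n < R and 0 <= a, b < n.
    """
    r = 1 << r_bits
    n_prime = (-pow(n, -1, r)) % r      # n * n_prime == -1 (mod R)
    t = a * b
    m = (t * n_prime) % r               # makes t + m*n divisible by R
    u = (t + m * n) >> r_bits           # exact division by R
    return u - n if u >= n else u       # single final subtraction


def to_mont(x, n, r_bits):
    """Bring x into the Montgomery domain: x * R mod n."""
    return (x << r_bits) % n


def from_mont(x_mont, n, r_bits):
    """Leave the Montgomery domain by multiplying by 1."""
    return mont_mul(x_mont, 1, n, r_bits)
```

With n = 97 and R = 2**8, `to_mont(42, 97, 8)` is 82, and
`from_mont(82, 97, 8)` gives 42 back, since (x * R) * 1 * R^-1 = x mod n.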
|
|
there's only one instance of input/output values, while the storage manager has
dual storage space for the P and Q multipliers).
Started working on the microcoded layer; added the input operation and modular
multiplication.
|