<p><strong>Here's what looks to be a working version of your model and its testbench</strong></p>
<p>Added (and updated)</p>
<p>If you were to make the matrix multiply take real time (clocks), you'd see DONE delayed by the number of clocks it took to do the matrix multiply. I arbitrarily picked two clocks just to show the benefit of the added register files.</p>
<p>I'll comment on the interesting parts of the code.</p>
<pre><code>LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

ENTITY lab4b_tb IS
END lab4b_tb;

ARCHITECTURE behavior OF lab4b_tb IS
    signal Clk:   std_logic := '0';  -- no reset
    signal Start: std_logic := '0';  -- no reset
    signal Din:   INTEGER := 0;      -- no reset
    signal Done : std_logic;
    signal Dout : INTEGER;
    constant Clk_period : time := 10 ns;
BEGIN

    uut: entity work.DCT_beh  -- DCT_beh
        PORT MAP (
            Clk =&gt; Clk,
            Start =&gt; Start,
            Din =&gt; Din,
            Done =&gt; Done,
            Dout =&gt; Dout
        );

    CLOCK: process
    begin
        Clk &lt;= '0';
        wait for Clk_period/2;
        Clk &lt;= '1';
        wait for Clk_period/2;
    end process;

    STIMULUS: process
        variable i, j : INTEGER;
        variable cnt : INTEGER;
    begin
        wait until clk = '1' and clk'event;  -- sync Start to clk
    FIRST_BLOCK_IN:
        Start &lt;= '1','0' after 11 ns;  --issued same time as datum 0
        for i in 0 to 63 loop
            if (i &lt; 24) then
                din &lt;= 255;
            elsif (i &gt; 40) then
                din &lt;= 255;
            else
                din &lt;= 0;
            end if;
            wait until clk = '1' and clk'event;
        end loop;
    SECOND_BLOCK_N:
        Start &lt;= '1','0' after 11 ns;  -- with first datum
        for cnt in 0 to 63 loop
            din &lt;= cnt;
            wait until clk = '1' and clk'event;
        end loop;
        din &lt;= 0;  -- to show the last input datum clearly
        wait;
    end process;

END ARCHITECTURE;
</code></pre>
<p>The two input blocks are your new block value and your original block value, which provided an index for the first output block.
The second block also shows the same answers as originally, validating the DONE handshaking.</p>
<p>Note that Start is concurrent with the first datum of each block.</p>
<p>I also adjusted the input stimulus to start on a clock boundary, so that the first Start does not show up on a falling clock edge.</p>
<p>Where there are asynchronously generated pulses, I extended them by a nanosecond to ensure they'd be seen on a clock edge, because they weren't generated on a clock edge.</p>
<pre><code>LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

entity DCT_beh is
    port (
        Clk :   in  std_logic;
        Start : in  std_logic;
        Din :   in  INTEGER;
        Done :  out std_logic;
        Dout :  out INTEGER
    );
end DCT_beh;

architecture behavioral of DCT_beh is
    type RF is array ( 0 to 7, 0 to 7 ) of INTEGER;
    signal OutBlock:           RF;
    signal InBlock:            RF;
    signal internal_Done:      std_logic := '0';  -- no reset
    signal Input_Ready:        std_logic := '0';  -- no reset
    signal done_detected:      std_logic := '0';  -- no reset
    signal input_rdy_detected: std_logic := '0';  -- no reset
    signal last_out:           std_logic := '0';  -- no reset
begin

INPUT_DATA:
    process
    begin
        wait until Start = '1';
        --Read Input Data
        for i in 0 to 7 loop
            for j in 0 to 7 loop
                wait until Clk = '1' and clk'event;
                InBlock(i,j) &lt;= Din;
                if i=7 and j=7 then
                    Input_Ready &lt;= '1', '0' after 11 ns;
                end if;
            end loop;
        end loop;
    end process;

WAIT_FOR_InBlock:
    process
    begin
        wait until clk = '1' and clk'event;
        input_rdy_detected &lt;= Input_Ready;
        --InBlock valid after the following rising edge of clk
    end process;

TRANSFORM:
    process
        variable InpBlock : RF;
        constant COSBlock : RF := (
            ( 125,  122,  115,  103,  88,   69,   47,   24 ),
            ( 125,  103,   47,  -24, -88, -122, -115,  -69 ),
            ( 125,   69,  -47, -122, -88,   24,  115,  103 ),
            ( 125,   24, -115,  -69,  88,  103,  -47, -122 ),
            ( 125,  -24, -115,   69,  88, -103,  -47,  122 ),
            ( 125,  -69,  -47,  122, -88,  -24,  115, -103 ),
            ( 125, -103,   47,   24, -88,  122, -115,   69 ),
            ( 125, -122,  115, -103,  88,  -69,   47,  -24 )
        );
        variable TempBlock : RF;
        variable A, B, P, Sum : INTEGER;
    begin
        if input_rdy_detected = '0' then
            wait until input_rdy_detected = '1';
        end if;
        InpBlock := InBlock;  -- Broadside dump or swap
        --TempBlock = COSBLOCK * InBlock
        -- arbitrarily make matrix multiply 2 clocks long
        wait until clk = '1' and clk'event;  -- 1st xfm clock
        for i in 0 to 7 loop
            for j in 0 to 7 loop
                Sum := 0;
                for k in 0 to 7 loop
                    A := COSBlock( i, k );
                    B := InpBlock( k, j );
                    P := A * B;
                    Sum := Sum + P;
                    if( k = 7 ) then
                        TempBlock( i, j ) := Sum;
                    end if;
                end loop;
            end loop;
        end loop;
        -- Done issued in clk cycle of last TempBlock( i, j ) := Sum;
        internal_Done &lt;= '1', '0' after 11 ns;
        wait until clk = '1' and clk'event;  -- 2nd xfrm clk
        -- OutBlock available after last TempBlock value stored
        OutBlock &lt;= TempBlock;  -- Broadside dump or swap
    end process;

Done_BUFFER:
    Done &lt;= internal_Done;

WAIT_FOR_OutBlock:
    process
    begin
        wait until clk = '1' and clk'event;
        done_detected &lt;= internal_Done;
        -- Done can come either before the first output_data transfer
        -- or during the last output data transfer
        -- this gives us the clock delay to finish the last xfm transfer to
        -- TempBlock( i, j)
        -- Technically part of the output process but was too cumbersome to write
    end process;

OUTPUT_DATA:
    process
    begin
        -- OutBlock is valid after clock edge when Done is true
        for i in 0 to 7 loop
            for j in 0 to 7 loop
                if i = 0 and j = 0 then
                    if done_detected = '0' then
                        wait until done_detected = '1';
                    end if;
                end if;
                Dout &lt;= OutBlock(i,j);
                wait until clk = '1' and clk'event;
            end loop;
        end loop;
    end process;

end behavioral;
</code></pre>
<p>The type definition for RF has been moved to the architecture declarative part to allow inter-process communication through signals. The input loop, matrix multiply and output loop are in their own processes.
I also added processes for the inter-process handshaking (Input_Ready and internal_Done (Done)), along with the signals input_rdy_detected and done_detected.</p>
<p>If a process can take 64 clocks, the signals showing the last datum processed (Input_Ready and potentially Done) are asserted during the last data transaction of the downstream process. It would be very messy to code otherwise, and you'd still need the flip flops.</p>
<p>There's an added RF between the input process and the multiply process to allow concurrent operation when the matrix multiply takes real time (it takes 2 clocks in this example; I didn't want to stretch out the waveforms too far).</p>
<p>Some of the handshaking delays appear to have been related to coding style, and were cured with the input_rdy_detected and done_detected flip flops.</p>
<p>The first waveform diagram shows the first output data following the two clocks the transform process now takes, shown between the A and B markers.</p>
<p><img src="https://i.stack.imgur.com/ctIOt.jpg" alt="Two Clock Matrix Multiply"></p>
<p>You can see that the first output datum immediately following Done is 78540 and not the 110415 shown in your waveform screen capture. One of us shows the wrong value. This version of DCT_beh strictly enforces transfers of RF values only after the last datum is loaded.</p>
<p>I did get the 110415 value before cleaning up the handshaking between the input process and the multiply process. It'd be a lot of work to trace it through TempBlock or OutBlock.</p>
<p>Now for the good news. The second input block is taken from your original stimulus, and the input values make a great index for the output transfers. Those output data values all appear correct.</p>
<p><img src="https://i.stack.imgur.com/jBP0M.jpg" alt="2nd Block Done and start of 2nd block output"></p>
<p>The signals input_rdy_detected and done_detected happen to show the first transaction in their respective downstream processes. I added a trailing din signal assignment to 0 to avoid confusion at the end of the second input block.</p>
<p>Here's a screen capture approximating yours. I can't do a selected zoom, so I used successive approximation instead.</p>
<p><img src="https://i.stack.imgur.com/gcQiw.jpg" alt="enter image description here"></p>
<p>You only need to run the simulation out to 1955 ns to capture the last datum of the 2nd block being output.</p>
<p>This was done using Tristan Gingold's ghdl and Tony Bybell's gtkwave on a Mac running OS X 10.8.4.</p>
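<p>The 78540 versus 110415 question for the first output datum can be cross-checked by evaluating the matrix product directly, outside the clocked model. The following is a minimal simulation-only sketch, not part of the design above. The entity name dct_ref_check and the helper functions first_block and dot are hypothetical, and it assumes input datum n is stored in InBlock(n / 8, n mod 8), as the INPUT_DATA loops suggest. The second report sets InBlock(0,0) to 0 purely as a probe; it does not model the handshaking.</p>
<pre><code>-- Hypothetical reference check: no clocks, no handshaking, just arithmetic.
entity dct_ref_check is
end entity;

architecture sim of dct_ref_check is
    type RF is array ( 0 to 7, 0 to 7 ) of INTEGER;

    -- Same coefficients as COSBlock in DCT_beh
    constant COSBlock : RF := (
        ( 125,  122,  115,  103,  88,   69,   47,   24 ),
        ( 125,  103,   47,  -24, -88, -122, -115,  -69 ),
        ( 125,   69,  -47, -122, -88,   24,  115,  103 ),
        ( 125,   24, -115,  -69,  88,  103,  -47, -122 ),
        ( 125,  -24, -115,   69,  88, -103,  -47,  122 ),
        ( 125,  -69,  -47,  122, -88,  -24,  115, -103 ),
        ( 125, -103,   47,   24, -88,  122, -115,   69 ),
        ( 125, -122,  115, -103,  88,  -69,   47,  -24 )
    );

    -- First stimulus block from the testbench, assuming row-major loading
    function first_block return RF is
        variable blk : RF;
        variable n   : INTEGER;
    begin
        for i in 0 to 7 loop
            for j in 0 to 7 loop
                n := 8 * i + j;
                if n &lt; 24 or n &gt; 40 then
                    blk(i, j) := 255;
                else
                    blk(i, j) := 0;
                end if;
            end loop;
        end loop;
        return blk;
    end function;

    -- One element of COSBlock * inp, same loop order as TRANSFORM
    function dot ( inp : RF; i, j : INTEGER ) return INTEGER is
        variable sum : INTEGER := 0;
    begin
        for k in 0 to 7 loop
            sum := sum + COSBlock(i, k) * inp(k, j);
        end loop;
        return sum;
    end function;
begin

    CHECK: process
        variable blk : RF := first_block;
    begin
        report "TempBlock(0,0), block as driven: "
            &amp; INTEGER'image(dot(blk, 0, 0));
        blk(0, 0) := 0;  -- probe only: pretend the first datum was captured as 0
        report "TempBlock(0,0), InBlock(0,0) = 0: "
            &amp; INTEGER'image(dot(blk, 0, 0));
        wait;
    end process;

end architecture;
</code></pre>
<p>Under those assumptions the two reports give 110415 and 78540 respectively, since 255 * (125 + 122 + 115 + 47 + 24) = 110415 and dropping the 125 term leaves 255 * 308 = 78540. This only says what the arithmetic yields for each InBlock(0,0) value; it does not show which simulation captured the input correctly.</p>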