- 12 Apr 2007: [Of interest to people working on Bonus.] Regarding
the performance calculation, if you are setting your flipflops to slower
than your circuit depth, then it is the flip flop setting that is the
`effective' circuit depth from the point of view of calculating time
to perform the benchmark as part of the space times time performance
metric, i.e., if your circuit depth is 95 and you set your flipflops to
100 to catch the Ram cycle better, then you should use 100 as your
circuit depth in your performance calculation (if you want to use 95,
then you should in turn use the number of clock cycles it takes to
perform the task when the flipflops are set to 95). Incidently,
the best working performance value I have seen so far is 6,229,595,000
(number of clocks til halt * effective circuit depth * number of relays).
- 3 Apr 2007: [Of interest to people working on Bonus.]
CircuitSimulator.BkWi06a4.tar.gz is now available. It adds the feature
BACKTRACE_ALONG_DEPTH to the simulator, as discussed in posting 31 Mar 2007.
The anticipated usage is that one does a DUMP_ALL which tells you the depth
of all the wires in your design. You find the wire with the largest depth,
say t_5939 at depth 411.
Then to see the wires that formed a longest path to that wire
you would insert the directive
BACKTRACE_ALONG_DEPTH t_5939
after the first INPUT_DATA and it would dump the 412 wires of depth 411 thru 0
that form a path to t_5939 (it would dump more than 412 wires if JOINS were
used, since JOINs don't increase path length).
- 31 Mar 2007:
In ~webber/CS350 on gaul, there is now a new version of the
simulator called CircuitSimulator.BkWi06a3.tar.gz . It improves a couple
of cases of error checking in the simulator (so a few messages come earlier
and are clearer about what is likely causing the problem). It also implements
a new feature BACKTRACE. This directive can be given at any point after the
OUTPUTS statement (i.e., unlike DUMP_ALL, BACKTRACE doesn't require INPUT_DATA
to succeed). The idea is that you have a wire named t_5893 that is causing
a problem and you would like to see what wires it depends on. Then you
put into your CDN file the command:
BACKTRACE t_5893
and a listing is generated that is a preorder traversal of the tree of wires
that feed ultimately into t_5893. Each wire is labelled with its distance
from t_5893. If t_5893 loops back to itself, an error message is printed.
If there is more than one back path from t_5893 to the same node, then that
node is only traced once (otherwise, the tree could grow exponentially).
Although BACKTRACE may prove useful for debugging, it has limited usefulness
for optimization, because it only processes the first path back to a node, so
if the second path is longer, we never see that. Thus the distance given from
t_5893 is neither the longest (which would be useful for optimization) nor
the shortest, but instead merely the distance along the first path traversed.
To provide a feature to help optimization, I plan on adding
BACKTRACE_ALONG_DEPTH which will show you a longest path from a given wire to
an unspecified source. In order for
this command to work, an INPUT_DATA will have to be
done, since it is INPUT_DATA that sets the depth values on the wires.
This should be a minor modification of BACKTRACE and so hopefully will
appear in a new release of the simulator early this coming week.
Note that the changes that have been made to create version a3 have recieved
relatively little testing, so I have left version a2 available as well.
In terms of actual circuit performance, in the event of a disagreement between
a2 and a3, unless I announce otherwise, a2 will be viewed as definitive.
The changes made in a3 are not error fixes in the sense that I know of no
examples of a properly formatted CDN file that the simulator does not process
properly, but are instead changes to help figure out what went wrong when
the CDN file was not properly constructed and so any change in behavior on
a properly formatted CDN file is unintentional and probably an error on my
part (please let me know soonest if you come across such a situation).
- 30 Mar 2007: Homework 3 has been updated
to take into account that the S, N, and SmallConstant parameters for the
benchmark run are now available. As noted in the original description,
the actual random number seed I use when I do the benchmarking will be
different than the one I have given you in the Gawk scripts. I have also
included some further comments on testing/ debugging. There is a new
H3d.tar.gz file, reflecting the new gawk scripts h3.gawk, h3.gawk.ORIG,
and h3.gawk.Harder. None of the other files were changed from the original
distribution. These files can also be found in the H3 subdirectory on gaul.
Including all the instructions and local variables, my solution took 116 words
of memory. It came to 109 lines of assembly code (without variables), of which
17 were label directives, leaving us with 79 actual assembly opcodes (of which 13 were long format, 2 word instruction opcodes).
- 29 Mar 2007: Error in description of
Homework 3
has been fixed (and the H3 directory updated and new H3c.tar.gz file created)
The error was
print (randint(BigConstant) - SmallConstant);
should have been
print (randint(BigConstant) + BigOffset);
- 28 Mar 2007: The description of
Homework 3 that is currently scattered below
has now been put together in a single writeup (with some elaboration).
- 24 Mar 2007: More on Homework 3: H3a.tar.gz has now been
replaced by H3b.tar.gz . Awk script h3.gawk was created specifically
for this assignment to generate the assembly data statements for test
data for this assignment (using Gawk's random number generator). The
script h3.gawk has 5 variables that can set for testing purposes
(S, N, SmallConstant, BigConstant, and BigOffset). Also, for testing
purposes, the random number seed of 398929 may be changed. The settings
that will be used for the Benchmark have not been determined, but your
program should work for any valid setting of these parameters (see comments
in script for more details). I have also written a Java program H3.java
that reads data in the same format as the Computer01 simulator, computes what
the same changes to the test data as your assembly program should, and
dumps what the contents of the upper half of memory should be after your
program is done (the lower half can have anything you want in them).
Thus, to run the who thing, one would type:
gawk -f h3.gawk | gawk -f asm01.gawk | java H3 | more
To run your h3.asm file on the same test data to see if it does the same
thing, you would type something like:
gawk -f h3.gawk > h3.test
cat h3.asm h3.test | gawk -f asm01.gawk | java Computer01 | more
- 24 Mar 2007: Bonus Assignment: Worth between 0% and 4% of the
course mark. The task specification is the same as that for Homework 2.
The marking scheme will be at most 1 for solutions that don't pass all my
test data, between 2 and 4 for solutions that pass all my test data.
The due time for the solution is noon of the Thursday immediately following
the Monday of the Final Exam. The handin procedure is the same as for
Homework 2, except that the solution file should be called bonus.cdn and
the solution report should be called bonus.txt (b1.jpg, ... , for any figures).
Expect new releases of the simulator in the coming week as I try to add on
some features to make debugging and optimizing this kind of assignment easier.
This bonus was created in appreciation for the feedback I got on the simulator
this semester.
- 24 Mar 2007: Regarding Homework 2: Two circuits were handed in
with reports claiming they didn't work. A third was handed in with a report
claiming that it did work. So, some testing still needs to be done before
marks can be set on that assignment, but the late penalties and other criteria
that are spelled out in the course outline will be implemented as stated there.
By way of compensation, see Bonus Assignment above.
- 24 Mar 2007: Homework 3 update: As there is no error reporting
mechanism in the spec, you are guarenteed that the value of S will be a value
somewhere from 1 to 12 and the value of N will be a value somewhere from 1
to ((2048 - 2)/(S+1)).
- 23 Mar 2007: Homework 3 status: I hope to get a spec put together
and posted this weekend, but for now here is a brief description of what is up:
Homework 3 will involve
programming in a small assembly language to solve a problem while taking
maximum advantage of the registers and cache available to speed up memory
performance. The simulator for this machine can be found on gaul in
the file ~webber/CS350/H3a.tar.gz . The Java files together form the
simulator, which can be invoked by
java Computer01 < h3.ram > h3.result
The file h3.ram corresponds to a memory image of RAM for the machine
holding both your program and the test data. The h3.result file will
include the runtime (measured as number of clock cycles to execute) and
the contents of RAM after the program has completed. The total RAM size
is 4096. Your program and initial data structures have to stay below
address 2048. I will put the test data for your program to run on in the
addresses from 2048 to 4095. If you write your program in the assembly
language defined in the text, then it can be converted to RAM format by:
gawk -f asm01.gawk < h3.asm > h3.ram
You can find more discussion about how this all works in the Homework 3
writeup from last semester.
Last semester the task was sorting. This semester the problem will be
different. The general format for the problem will be:
2048: number of tasks N
2049: size of task S
2050: first entry in first task
The underlying task is to take a list of S numbers and answer the question
of whether or not the numbers can be separated into two groups that each
sum to the same value. For example, if S was 5 and the numbers were 1, 2, 2
2, 5, then we observe that 1 + 5 = 2 + 2 + 2, so these 5 numbers can be split
into two groups that sum to the same amount. In the case were such a sum
is possible, you store a 1 in the word immediately after the number sequence
and where it is not, you store a 0 immediately after. The initial layout
of memory will consume 2 + N*(S+1) words of memory. The place where you report
your answer will initially be set to -1. Thus, if the above task was the only
task being required, then you would see starting at location 2048 the values:
1, 5, 1, 2, 2, 2, 5, -1
If I wanted you to do two such problems, then memory starting at location 2048
would look like:
2, 5, 1, 2, 2, 2, 5, -1, 1, 2, 3, 2, 5, -1
A quick back of the envelope big-O analysis of this problem would show that
it is O(N * (2 ** S)).
To turn the above into a homework spec, I still have to do three things:
1) implement the above algorithm in Java for generating test data; 2) set a
particular configuration of test data that will be used as a benchmark; and
3) implement the above algorithm so that I can set an upperbound on how
slow your solution can be on the benchmark and still be acceptable.
- 21 Mar 2007: Another error was found in my Sha1.java. Where I compute
the 32-bit signature from the 16-bit parts, I didn't take into account that
casting a short to an int would sign extend the short. So, Sha1.java
has been changed again (as of 1:30pm today). The old version is still
in Sha1.java.Monday. If your circuit matches either of these implementations,
then it will be viewed as working.
- 20 Mar 2007: Error found in my implementation of the shift operator
in Sha1.java. It was only doing an 8-bit circular shift. It has now
been fixed to do a 16-bit circular shift. Please copy/download the
new version. It can be identified by the use of the constant 0x8000
rather than 0x80 in the method leftRotate.
- 19 Mar 2007: In response to the question of how do you know how many
clock cycles your circuit will take, consider the following response:
- If you look at the SumMemory example (p276), you will see that it took
that code 899 clock cycles to reach the state that raises halt. It
ran its own flip flop clock at 1000 and its ram clock at 3786. If you
look at the calculation of the variable `enoughCycles' in that
example, you will see I used the formula 7 + 128 * 7. Note that 128
is the number of entries in Ram that the controller would have to
process. If you look at the trace data (page 277), you will see that
I spend 6 cycles in state sendAddress then one cycle in state gotData
and then back for 6 cycles in sendAddress until I have processed all
of memory. With a flipflop cycle of 1000 and a ram cycle of 3786, we
first note that the first cycle in sendAddress sets a cs that isn't
read by ram til the beginning of the next cycle. Then we spend four
cycles where at the beginning of the cycle, the ram still isn't done.
Then, on the 6th cycle, we take the done signal from the ram and use
it to transfer to the gotData state, where we spend on cycle
transferring back into sendAddress.
- 3 Mar 2007: The Homework 2 specification
has been updated again. A for loop error has been fixed by a change from
the comparision of i less than or equal to 10 to comparing i less than 10
(thus preventing running off the end of the w array). Also the || used
in creating signature has been fixed by being changed to |. Also, a full
java implementation of the method is now available in
/gaul/u0/usr/faculty/webber/CS350 under the name Sha1.java . Some discussion
of this method has also been inserted into the Homework spec.
- 2 Mar 2007: The Homework 2 specification
has been updated. In the original pseudocode, I didn't show how the variables
a, b, c, d, and e were initialized. The update shows their initialization
from the h values inside the while loop in the way they are handled by SHA-1.
- 21 Feb 2007: The Homework 2 specification
is now available. While the pseudo-code for the secure hash function that
the assignment will use is include in the spec, the Java implementation
and testing scripts have not yet been prepared.
- 20 Feb 2007: As mentioned in class today, homework 2 will be based
on the idea of computing a secure hash for a portion of the contents in
memory. I will give you Java code for the algorithm that are you to
implement (so that you can figure out what output your circuit should be
generating in specific cases). The algorithm I give you will be loosely
based on SHA-1 (pseudocode for SHA-1 can be found at Wikipedia entry on SHA hash functions).
- 12 Feb 2007: Office hours were changed (Thursday cancelled, new hour
on Friday) as per class discussion. List at top of this page updated.
- 8 Feb 2007: As mentioned in class today, the due date for Homework
1 has been moved from the end of 12 Feb 2007 to the end of 19 Feb 2007.
See revised course outline.
- 2 Feb 2007: The problem noticed in yesterday's class where r * r
was referred to when x * x was meant has been fixed in the
Homework 1 specification. Also, as mentioned
in class yesterday, I have office hours 3:30 to 4:30 today (Friday).
- 30 Jan 2007: The Homework 1
specification is now available. Please take a look at it at your
earliest convenience so that any questions about it can be cleared
up as soon as possible.
- 13 Jan 2007: Office hours announced.
- 9 Jan 2007: The simulator software discussed in today's class
can be found on gaul in the file CircuitSimulator.BkWi06a2.tar in
the directory /gaul/u0/usr/faculty/webber/CS350 . The main for the
simulator is in the file Circuit.java . The circuit-building
libraries and some example
usages are in the subdirectory test_data (since this was test data used
to test the simulator) in the same tar file. To see it all being used,
look at handout1 (pages 269 thru 271 of text), which explores a couple of
different implementations of a 3 input OR.
- 3 Jan 2007: Course outline
on line.
- 3 Jan 2007: Created announcements page.
I can be contacted by email at webber@csd.uwo.ca .
Be sure and include course number in subject line of your email.