- 18 Apr 2008: Homework 5 summary is
available. We had 12 people hand in working homework 5 solutions.
In homework 4, there were 7 working projects; now 2 weeks later,
there are 8 working projects. The slowest solution is now 509
times slower than the fastest solution. This is primarily because
the slowest solution was doing way too many tape moves (it didn't
implement a logarithmic merge pattern) and caching didn't change
the tape overhead. Still, much better to have working code than not.
So, marks for the working code will range from 8 to 10. Marks for
the non-working code will range from 3 to 4. This assignment was
7% of the mark.
- 10 Apr 2008: Last Class Announcements:
- Remember, when sending me email, include CS350 in the subject line.
Otherwise your email may end up in the spam pile where it never gets
looked at.
- With end of semester, regular office hours vanish.
- Monday 3-4pm on 14 Apr and 21 Apr, I plan to be in my office.
Most other afternoons around 3pm I will be in, but you may need to
send email before noon for me to be aware that you are looking for
me. If Star Wars VII comes out, I may decide to not come in some
afternoon.
- Marks as of Monday were passed out in today's class. If you didn't
get them, you can drop by during office hours. If there are problems,
we need to sort them out before the final.
- The final exam is comprehensive (closed book, closed notes, no
calculators like midterm and quiz) and counts for 37% of your mark
(having gotten the extra weight from the cancellation of homework 6).
The majority of the final will be on material already covered by the
midterm, since the midterm came more than halfway through the semester.
New material includes: Pipelining (textbook and example gone over in
Thursday's class), Virtual memory (handout from Heuring and Jordan book),
Multicore and CAS (textbook examples), and Circuitry to implement D-flipflop
(textbook). As with the quiz and midterm, there will be a focus on problem
solving and tracing questions. However, there may be a few short answer
questions to test general knowledge of vocabulary covered this semester.
- Study advice: a lot of people managed not to get various homeworks to
work. Now you have time to go back and sort that out. Remember the exam
is based on the assumption you have the skills of someone who actually did
get the homeworks to work. Also, many people didn't get all the questions
on the quiz and midterm right. Most of those questions can be viewed as
tasks to be coded and put through the Circuit simulator and/or the CPU
simulators (Computer01, Computer02). One of the neat things about computer
science is that often the computer will demonstrate to you that there are
still gaps in your understanding of a solution.
- Possible bonus marks: although with the projects completed, there is
less value to bug reports, properly filed bug reports going to cs350 and
including a short piece of code that pinpoints the problem will still be
processed and some bonus will be rewarded for the belated reporting of these
issues.
- 8 Apr 2008: The H3 stats is revised
as I missed that the test result wasn't right for one of the ones
I built the table from (i.e., we now have 6 groups with working
solutions).
- 8 Apr 2008: PDF file of worksheet
for doing traces of the 6-stage pipeline from page 58 of volume 2 of text.
- 8 Apr 2008: As of Monday midnight, 21 of the 25 people who took
the midterm had handed in homework 5.
- 8 Apr 2008: The H3 stats are available.
- 5 Apr 2008: The H4 stats have been
revised. When I went to look closer into why the H4 solutions weren't
working on H5g, I found that actually only the fastest H4 solution wasn't
working, most of the others seemed fine. So, although still not clear,
it looks more like H5g is ok.
- 2 Apr 2008: Regarding the schedule of the last 2 weeks. Tuesday
we talked about virtual memory and there was a significant handout.
Thursday we will pick up any questions remaining from Tuesday and
move into the coverage of latches and flipflops at the gate level
that we didn't over when we were in Chapter 3 earlier. Since we don't
have a working multicore simulator, there isn't much more detail on
that I can give, so after the circuitry with loops discussion from
Chapter 3, we will move into the pipelining material in part 2 of the
text. There is enough detail there to allow us to trace the executation
of a particular pipelined system.
- 2 Apr 2008: Of the 17 groups that handed in homework 4, 7 seem to
be working. Their stats can be found here.
The fastest on the benchmark was 59,020,185 clock cycles on the h4j
Computer01 implementation. When run on the h5g Computer02 implementation,
it took 24,974,481. None of hte programs when run on h5g seem to produce
sorted tapes. So there is either a timing problem in these codes or a yet
to be announced bug report for h5g. Interestingly, 6 of the 7 produce
identical resulting tapes.
- 26 Mar 2008: H5g.tar is now available. It works the same as H5f.tar,
but if your code defines the same label twice, it gives you an error
message. The previous behavior would have been whatever Java's hash tables
do when you try to enter the same key with a different value (the correspondance
between labels and the location in memory they correspond to is tracked during
the assembly phase of the program by a hash table).
- 26 Mar 2008: As per the vote on Tuesday, Homework 6 has been cancelled
with its weight going to the final exam and Homework 5's due date has
been extended to 7 April. The restriction of testing to 900 sector tapes
that was taken for Homework 4 will not be applied in the case of Homework 5.
We will still be restricted to the tape format and sizes specified in the
original Homework 4 spec.
With regards to the H5f.tar simulator, at the moment it has no known
bugs and I do not intend to make any future changes that would impact
the timing of your solutions. I will, of course, continue to make bug
fixes as the need arises. I also will undertake to improve the interface
when suggestions are made that are consistant with the basic design (for
example, I hope to soon add a check so that it complains if there are
two label definitions of the same symbol, which apparently happens
sometimes when different team members try to merge together the code
they have written).
- 22 Mar 2008: The Homework 5 spec has
been updated to recommend /usr/bin/tr rather than /usr/ucb/tr.
- 22 Mar 2008: H5f.tar is now available for Homework 5. It implements
the fetch instruction properly (i.e., it starts a Ram request, but doesn't
wait for the request to finish). If H5f.tar breaks, report the bug and
fall back to H5e.tar. To test H5f.tar, I created CopyArray.asm,
CopyArrayUnroll.asm, and CopyArrayFetch.asm. CopyArray is a straightforward
implementation for copying a 40 word array from location 104 to 152.
CopyArrayUnroll does the same thing 4 words at a time. CopyArrayFetch
is CopyArrayUnroll that prefetches the block to be used by the next
iteration of the loop. The performance stats for these were:
Clock cycles Clock cycles
Computer01 Computer02
CopyArray 3396 882
CopyArrayUnroll 3006 779
CopyArrayFetch 3348 747
As you can see, in this example, the cache wins big. It turns out that
the code in the copy loop gets loaded once and sits in cache for the
remainder of the execution. Also, some of the copies are still in cache
when the halt is encountered and so are never pay the cost of being
written to ram.
I still would like to fix the dma I/O transfers so that they don't
alter the cache, but I will wait until after marking the midterm to
get back to that.
- 22 Mar 2008: H5e.tar is now available for Homework 5. It implements
a 16 entry 4 word line associative cache (using LRU replacement and
delayed write). Both fetch and dma transfers block access to the cache
until they have completed. The dma code solves the cache coherence problem
by pushing the dma information through cache (thus altering cache contents).
If H5e.tar fails to work,
generate a bug report and fall back to working with H5c.tar. The same
programs should work on both systems, just the timing will be different.
While one might expect that pulling a 4 word line from ram at a cost
of a single ram access would have a significant speedup on programs,
if the programming is busy-waiting, then most of the speedup goes to
executing more idling instructions. For the regressionTest programs,
the change was:
clock cycles clock cycles
program without cache with cache
AddcsBug 196 79
Circular 3000 1181
CopyBlock 2398 1972
CopyTape 4914 3924
CopyTapeTiming 378 81 // note this aborts on icount
CopyTwice 10709 8601
CopyUsingInterrupts 4965 3963 // note: no overlapped i/o
ReadBlockIntoBuffer 1231 1006
Sampler 1940 748
Store5AtLocation200 74 38
It is also worth noting that at the end of the run, some values are
still in cache having never been written to ram. To make debugging easier,
when I dump ram, if there is a value still in cache, I show the value in
ram and mark the line `in cache'.
Note: my preference would be for fetch to not wait for ram to return
and for dma I/Os to not alter the cache. So, expect another release of
the homework 5 simulator that addresses these issues better.
- 20 Mar 2008: Leaving for the day Thursday evening. Didn't finish
caching yet. Got all but one of the regression tests to work, so think
it is close. Plan on bringing it in Saturday.
- 20 Mar 2008: A `reminder': device command registers are mapped to
memory locations 2, 9, 16, 23, 30, 37, 44, and 51. Since no actual
device is associated with 30, 37, 44, and 51 in Assignment 4, attempts
to store values at these locations result in:
Exception in thread "main" java.lang.NullPointerException
at Ram.store(Ram.java:82)
at Computer01.SimulateCpu(Computer01.java:232)
at Computer01.main(Computer01.java:694)
A prettier error message may be generated in future versions, but at
this point I am resolving the issue by documenting the behavior.
- 20 Mar 2008: Agreed in class today that the only test cases I would
run against Homework 4 solutions would be those with tapes the same size
as the benchmark tape.
- 20 Mar 2008: Coming out of class today,
Schedule 02
has been modified to start the discussion of virtual memory next Thursday
(since we don't have that much more to say about threaded programming).
- 20 Mar 2008: It has been pointed out that the homework 4 due date
has gotten very close to the homework 5 due date. So, I have moved
the Homework 5 due date to 1 April. A new
Schedule showing the remainder of the semester is available. So far,
we are holding on to the last three assignments. If Assignment 6 should
end up being cancelled, then, as per the course outline, its weight would
shift to the final exam. I will point this out in class today (Thursday)
and have a closed ballot vote on Tuesday (27th) on whether to keep Assignment
6 in the mix or to abandon it (at this point).
- 20 Mar 2008: Homework 5 spec is
now available.
- 20 Mar 2008: H5c.tar is now available. It contains the fixes of
the H4 series up to and including j and contains Computer02 class for
homework 5. Computer02 has been configured to handle 6 tapes rather than
the 4 tapes of Computer01 (there is also a copy of Computer01 in H5c.tar).
I expect to soon post a Homework 5 spec. There will be a later release
of H5 today as the current release does not support caching. I expect at
least two more releases, one to support caching and another to implement
the fetch instruction properly. However, the six tape version will at least
allow you to start debugging homework 5 algorithms even if it isn't giving
the best timings yet.
- 19 Mar 2008: Taking into account the lastest fix to the simulator,
the due date for assignment 4 has been moved to next Monday, 24 March 2008.
The due dates for assignments 5 and 6 will remain the same.
- 19 Mar 2008: An error was reported with the simulator that
meant that in some cases a store instruction might be given an
address that was longer than 16 bits. While the actual store would
mask out the extra bits, the code that checked for memory-mapped
I/O didn't mask out the extra bits. This has now been fixed in
Hj4.tar in gaul:~webber/CS350.
- 18 Mar 2008: Someone stopped by with problems with their tape
handling code. When I looked at the simulator, I noticed a problem
that was causing the time it takes to do a rewind to be wrong (I think
it was originally not charging for rewinds and now it does). I don't
see how that could cause a problem, but I have fixed it in the H4h.tar
release. Also in the H4h.tar release, I have added a new debug option
called trace_tape. If you include this in your .asm file, then the
debug flag in the Tape.java module gets set to true and a lot of
information about what the Tape module is doing gets dumped at you,
which may help if you are having trouble with the tape. If you have
code that is broken by the change from H4g to H4h, please let me know --
my assumption at the moment is that the change shouldn't alter people's
ongoing work.
- 17 Mar 2008: For those already thinking toward Homework 5, I have
decided to stay with tapes rather than move to a disk drive simulator.
Specifically, rather than the four tapes of homework 4, there will
be 6 tapes (1 data tape and 5 empty tapes). I am currently working
on the cache, which is aimed at being 64 words, 16 four-word blocks
supporting delayed write and LRU replacement. The benchmark tape
will be the same. I am still planning on a multi-core design for
Homework 6, but expect to stay with tape I/O (probably 8 tapes)
rather than disk I/O.
- 17 Mar 2008: From cs350b mailing list:
- Thanks for the bug report. You are correct. H4e.tar and H4f.tar are
now obsolete. There is a new H4g.tar up that contains the patch to
fix problem with addcs noted over the weekend. I have also updated
the regressionTest script to check the fix.
- I am also extending the due date for project 4 from Monday to
Wednesday of this week, with the corresponding ripple effect on
late penalties.
- 11 Mar 2008: H4f.tar has just been put up on gaul. I have changed
the register and memory dump format to show you both the signed and
unsigned version of each value stored. I have also changed all the
asm files to start with capital letters (i.e., sampler.asm was
changed to Sampler.asm) and a new test file Circular.asm was added.
Finally, a script regressionTest was added that reruns all the asm
files and makes sure they still do the same thing they used to.
Note there have been no bug fixes in this release as no bugs are
currently known to exist in H4e, so moving to version H4f is entirely
optional. The only new functionality besides the change in dump
format was the implementation of the fetch instruction (which is
pretty useless since there is no cache for this assignment).
Other than that, both versions should perform the same.
- 11 Mar 2008: There is a new version of the Homework 4 simulator
(H4e.tar) in the usual place. The circular shift operations have now
been coded. Also, the code was refactored in preparation for the Homework 5
implementation -- during the refactoring, I discovered an error in the
implementation of the conditional jump instructions that compared two
registers, which has now been fixed. Thus I now view the H4d.tar
implementation as broken and strongly encourage its replacement with
the H4e.tar version.
This raises the question of if anyone else has tried out the code yet.
It is important that people try out the code as soon as it is available
and report problems as they occur. To encourage this, I will be awarding
bonus points for properly formated `bug reports'. To be a properly formatted
bug report it should:
- be emailed to both cs350b@gaul.csd.uwo.ca and webber@csd.uwo.ca
(that way, everyone in the class is aware of the problem being reported).
- the subject line should start: CS350 bug report:
(that way my spam software will pass it through)
- indicate which release of the software was used (bugs only count
if they are in the most up to date release at the time of reporting,
so be sure and double check the announcements page to make sure you
aren't running an obsolete version. At the moment, the current
release is H4e.tar .)
- it should include a small piece of code that illustrates the bug.
you should show what the simulator does with the code and explain what
it should be doing instead.
(that way it can be easily reproduced and verified)
- it must be a different bug from those previously reported as judged
by the professor.
(that way credit goes to the first reporter -- if it is the product of
group work, be sure and indicate the group among which the credit should
be split)
- a bug report that includes a useful (in the opinion of the prof)
suggestion on how to fix the problem (such as specific code changes)
will be worth more than a bug report that leaves the issue of what in
the code caused the bug up to the prof to track down.
As I process the bug reports and create new software releases, these will
be announced on the course announcements page as usual. This process
applies to homeworks 4, 5, and 6. If you have any short pieces of code
that reproduce bugs in the circuit simulator or libraries used in homework,
they are also worth bonus marks. Note that lack of documentation is not
viewed as a bug.
- 10 Mar 2008: Well, no problems have been reported with the Homework
4 simulator since the 7th, so I guess we are no longer in alpha stage,
but are now up to beta.
- 7 Mar 2008: Homework 4 Spec is
completed now and the simulator in working order as mentioned on
previous annoucement. Please email me if there are any problems.
- 7 Mar 2008: In gaul:~webber/CS350/H4d.tar is the current code
for the homework 4 simulator. It is probably still best viewed as
an alpha version, so if you run across problems, please let me know.
It passed a piece of code that copied tapes using interrupts
CopyUsingInterrupts.asm. It also passed another collection of
random instructino testing sampler.asm. There are still pieces of
code that have never been executed, though. The only instructions
that haven't yet been implemented are the circular shifts (shiftrc,
shiftrccu), the processing communication signal senders (sleep and wake),
the cache manipulating instruction fetch and the thread manipulating
instruction cas. So I figure everything you need for homework 4 is
there (although I will probably have another release with circular
shifts implemented early next week). Also in H4d is the benchmark.tape
with 900 sectors, 3600 items to be sorted for Homework 4. Source to
the program longtapegen.c that was used to generate the tape is included.
To run, benchmark.tape would be copied to t0. So, I am now ready to
writeup Homework 4 and then call it a day. Enjoy the weekend.
- 7 Mar 2008: In gaul:~webber/CS350/H4b.tar is the current code
for the homework 4 simulator. It has now been tested on two more programs
and various errors fixed. The first program is CopyTape.asm which copies
the entire contents of t0 onto t1. The second program is CopyTwice.asm
which first copies t0 to t1, then rewinds t0 and copies t0 onto t2.
One typically runs the simulator by typing something like:
cp shortsampletape t0
cp emptytape t1
cp emptytape t2
cp emptytape t3
# the above to initialize the tapes you are going to be using
java Computer01 < CopyTwice.asm
The debug messages that were tracing the simulator have been turned
off (they are controlled by a private boolean variable debug at the
top of the code for each class). In addition to the trace and
dump_on_halt features mentioned earlier, I have also added max_dump_address
and min_dump_address so that you don't have to dump the entire contents
of RAM if you are only using a small portion of it. To help with infinite
loops as well as debugging when you want to get a memory dump earlier than
the normal halt, I have added abort_icount which cause the program to
behave as if the n'th instruction fetched was a halt. So, for example,
since copying tapes with just two sectors
shouldn't take more than 2000 instruction executions
and all the memory I use is before address 250, I put the lines
max_dump_address 250
abort_icount 2000
into the CopyTwice.asm source listing. All the programs that are
being used to debug the simulator are also included as samples with
the simulator code. I am now starting in on the code for handling
interrupts. After that, there are various instructions that have so
far not been implemented, like sub, that I will attend to.
- 6 Mar 2008: In gaul:~webber/CS350/H4a.tar is the current code
for the simulator. It has been tested on 3 programs so far (included
in the directory). The first one is Store5AtLocation200.asm .
It contains code that stores the number 5 at location 200. All the
debug options are turned on and currently extra debug options are
in the source handled by a global flag debug. This program works.
The next program tested was ReadBlockIntoBuffer.asm which reads
32 bytes from tape t0 into a buffer starting at location 200 in
Ram. This also works. The third program is CopyBlock.asm. This
reads 32 bytes from tape t0 and writes them to tape t1. This also
works. Perhaps everything else works, but since it is late, I am
doubtful. Certainly these three programs would be worth looking
at. When I get in Friday, I plan on writing program to copy a whole
tape (both not using interrupts and using interrupts -- the three
programs above all use busy waits rather than interrupts). The
format of a tape is to have 32 byte blocks separated by the letter G,
with a G at the beginning of the tape and an E at the end of the tape.
Thus, my shortsampletape looks like:
G 5234567 7234567 6234567 8234567G 1234567 2234567 3234567 4234567E
holding two sectors of data (although the above look like numbers,
they are actually asci bytes). An emptytape looks like:
E
Running the third program above, I started out with t1 being an
empty tape. After running the program, t1 held:
G 5234567 7234567 6234567 8234567E
which is good.
For your assignments, the tapes are going to look like the above.
I am going to view a sector as containing 4 8-byte entries. It is
these 8-byte entries that you are sorting in ascending numeric order
(across the whole tape, not just within a sector). The intial setup
with be t0 holding the unsorted data and t1, t2, and t3 being
empty tapes. The solution will be to have t1 contain exactly the
data from t0, but in ascending numeric order. For example, if
asked to sort the tape
G 5234561 7234562 4234563 8234564G 1234565 2234566 3234567 6234568E
your answer would be
G 1234565 2234566 3234567 4234563G 5234561 6234568 7234562 8234564E
One other thing I noticed when writing the simulator was that I had
left out an instruction. So a new instruction, ireturn with opcode
106 has been created to handle returning from an interrupt handler
(which is different from returning from a regular procedure call).
Well, that is all for now. More Friday afternoon (incidently, I have
regular office hours Friday afternoon as well).
- 3 Mar 2008: Part 2 of the course text (84 pages) was handed
over to the bookstore today. They say it should be available Thursday.
Regarding Homework 3, it is worth noting that the CPU class example
(in text and test_data directory) is a large example of what homework
3 is a smaller version of. The CPU class example is a bit complicated,
so the SumMemory.java example in the test_data directory is probably
a better place to start. The benchmark data, h3bench.java was announced
16 Feb 2008; the project task has been fully described since 9 Feb 2008.
Regarding debugging, remember you can always put a wire name on the output
list to see what value it takes on as you change states. The wires that
correspond to state names in your controller can be particularly interesting
to trace.
- 19 Feb 2008: For people looking for some writeup of polyphase
mergesort, you might start with pages 30 through 32 of the 36 page document:
http://www.cs.auckland.ac.nz/software/AlgAnim/niemann/s_man.pdf.
Also, this material at http://distancelearning.ksi.edu/demo/501/cis501.htm looks promising,
although my computer can't process the videos included.
The most extensive discussion of it I have found online is
pages 59 [63] through 80 [84]of
http://www.oberon.ethz.ch/WirthPubl/AD.pdf (which is an online copy
of Nicholas Wirth's 1985 textbook Algorithms and Data Structures (revised
August 2004 to use Oberon system). [Nicholas Wirth was the inventor of
Pascal and Modula which were the main programming languages taught in
first and second year here before Java. There is a wikipedia entry for him. Famous quotes include:
"Software gets slower faster than hardware gets gets faster" (sometimes called
Wirth's Law); "In our profession, precision and perfection are not a dispensable luxury, but a simple necessity."; and
"Reliable and transparent programs are usually not in the interest of the designer."]
- 17 Feb 2008: Homework 2 marking is done. Results will be passed
back at end of Tuesday's class. Some people are still handing in files
I can't figure out -- so if your homework is marked as not working and
you think it is working, be sure to see me to straighten it out.
- 16 Feb 2008: The Homework 3 description
has been updated (and completed). The additions were to describe how
you should report the performance of your circuit in your report and
a discussion of how performance is being measured for marking.
- 16 Feb 2008: In gaul:CS350, there are two new files of particular
interest. One is Shift.java which implements the shift operation part
of assignment 2 using only the components library. The primary use
for this is for people who weren't able to get Shift implemented properly
in their assignment 2 solution (remember my assignment 2 test data sets
can be found in Tester.java and h2test2.cdn2). The other file of note
is h3bench.java. This creates the RAM for the benchmark program for
the homework 3 design. The particular program your assignment will be
interpretting for performance measurement scans through an array and
outputs the minimum value in that array. In order to do this with
the instruction set of homework 3, the program was written as self-modifying
code. In testing it, I found an error in the code that was
being used to test the CPU in h3.java and h3a.java. The problem was
that when a min operation was being done, negative numbers weren't being
handled correctly. This has now been fixed in the printh2 method in
all three of these files on gaul. This change does not impact the
homework 2 tester, which had been coded differently. With these files
completed, expect the discussion of the benchmark and handin material to
be added soon to the homework 3 assignment description, thus finalizing
that document.
- 11 Feb 2008: Looking forward, I have plotted out the following
schedule for the remainder of the course.
As you can see from it, next Tuesday we will be into material that isn't
in Part 1 of the textbook. I am still in the process of creating simulators
to support assignments 4, 5, and 6 and then integrating them into the
Part 2 text. As soon as it is ready, I will let you know. Til then,
we will have to rely mostly on class notes and handouts.
- 11 Feb 2008: The test data generator that I am using to mark
Homework 2 can be found in gaul:~webber/CS350 and is called Tester.java .
- 9 Feb 2008: The Homework 3 description
has been updated so that it now has enough to let you make substantial
progress on the task. The is also a new h3a.java file on gaul
that is discussed in the description. Finally, there is a new section
on how to organize your approach to getting this homework done incrementally
and hints about
how to improve performance once you got it working. Performance should
be particularly important this time as the due date change announced on
the 8th gives significantly more time to work on this homework.
- 8 Feb 2008: In putting together the homework 3 tester described
below, I was thinking it would be good to talk a bit about it in class
(which would be this Thursday since we have a quiz on Tuesday). Since
homework 3 is the capstone project for the course in the sense that it
gives you the opportunity to put together a CPU for yourself, I have
decided to extend the due date a week, which pushes us into Reading Week,
yielding a due date for homework 3 of Monday 3 March 2008. This will
probably push assignments 4 and 5 a week later and still give us 2 weeks
before the homework 6 due date.
- 8 Feb 2008: The homework 3 description is still under construction.
I hope to finish it up Saturday (weather permitting). The tester code
that I promised for homework 3 can be found in gaul:~webber/CS350 under
the name h3.java. It is meant to be compiled in the test_data directory
with the components there. As it sits, it lets you do things like
write a program that:
0: adds location 31's contents to location 32's contents
1: jump directly to the instruction at address 4
2: put the contents of location 31 on output
3: halt
4: add location 31's contents to itself
5: skip next instruction if contents of location 32 less than contents
of location 31
6: add contents of location 31 to itself
7: skip next instruction if contents of location 31 less than contents
of location 32
8: shift contents of location 31 by contents of location 33
9: negate contents of location 31
10: put contents of location 31 on output
11: make conents of location 32 the minimum of the contents of
locations 32 and 33
12: halt
31: initialize location 31 to 7
32: initialize location 32 to -1
33: initialize location 33 to 3
with a notation like:
String[][] data = zeros(awidth,dwidth);
datum(data, 0, addI(32,31));
datum(data, 1, jumpI(4));
datum(data, 2, outputI(31));
datum(data, 3, haltI());
datum(data, 4, addI(31,31));
datum(data, 5, skipI(32,31));
datum(data, 6, addI(31,31));
datum(data, 7, skipI(31,32));
datum(data, 8, shiftI(31,33));
datum(data, 9, negI(31));
datum(data, 10, outputI(31));
datum(data, 11, minI(32,33));
datum(data, 12, haltI());
datum(data, 31, 7);
datum(data, 32, -1);
datum(data, 33, 3);
This array data can then be used to pre-initialize RAM. To find out
what the program does, one can:
printh2(data, awidth, dwidth);
and get a bunch of prints starting with // that tell you what the
contents of memory was before the simulation, that output would contain
-112 on halt, and what the contents of memory would be after the simulation.
This routine changes data when run, so RAM should be initialized before
this routine is called.
- 8 Feb 2008: In gaul:~webber/CS350 there is a file called h2test2.cdn2
that contains test data (stuff that can be put after the INPUTS and OUTPUTS
statements) that breaks the circuits whose performance doesn't appear on
the h2 preliminary results list.
- 7 Feb 2008: As mentioned in class today, I will have office hours
on Monday 11 Feb 2008 from 3pm to 5pm.
- 7 Feb 2008: Found an error in the processing of one of the
handins and revised the preliminary results on h2
accordingly (leaving us with 10 working handins and 4 not working).
As noted, these are preliminary and hope to be able to post some initial
test data for your soon.
- 6 Feb 2008: Reminder, 1 Feb 2008, I announced Thursday 7 Feb
office hours would be cancelled. Class will still meet at usual time.
- 6 Feb 2008: preliminary results on h2
marking. Roughly a factor of 8 between best and most not best this
time, with 14 distinct circuits handed of which 9 passed preliminary
testing -- more testing to come.
- 6 Feb 2008: Homework 3 description
is now available. It is incomplete in that I haven't given you
some Java code to make testing easier nor spelled out how to do the
report for this project. But otherwise it is stable.
- 6 Feb 2008: For those with more general interests in computer
design, there is a nice article at Ars Technica
(
Sun: Can you smell what the Rock is cookin'? by Jon Stokes on Sun's
latest 16 core (32 thread) high-end SPARC processor design called Rock.
Relevant Wikipedia entries would be:
- 4 Feb 2008: A few people have asked about addition overflow.
The homework 2 spec does not create any special behavior for the
situation that you would interpret as an addition overflow. It just
takes the lower 16 bits of the result.
- 1 Feb 2008: Office hours on Thursday 7 Feb are cancelled due to
a conflicting meeting.
- 1 Feb 2008:
- A few people have mentioned that the circular shift
implementation that I give in the text doesn't have the same behavior
as the homework spec for large values or negative values. This is
right. (For that matter, the shift operator in Java does not behave the
way specified in the homework spec either (there are many ways to implement
shifting).) In order to do assignment 2 you will need to do a bit of
rewriting on that component.
- Another thing people have mentioned
is problems relating to the mod function returning a negative value
when applied to a negative input in Java. This does indeed happen.
When I want the low order byte of a word (unsigned), instead of taking
its value by x mod 256, I use x & 0xFF, which directly masks out the
higher order bits of the word.
- A third comment is that there are a number of test cases given
in the homework spec. It would be poor judgement to had in your circuit
thinking it worked and never having tried the example test cases.
- 28 Jan 2008: Regarding marking of assignment 1, it will be handed
back at the end of Tuesday's class. From the reports, it looks like
everyone who handed in a non-working assignment knew they were doing that
when they wrote their reports. Most of the marks were 9 or above (being
4.443 times slower than best got you a 9). So, a good job was done on
most of the first assignments. People who had trouble for Assignment 1
should be sure and drop by office hours and get things straightened out
since Assignments 2 and 3 build on the same skills (and count 20% of your
mark).
- 28 Jan 2008: The course outline lists the due date for Assignment 3
as Monday, February 18th (President's Day). A rather obscure US holiday,
but apparently you all want to celebrate it in solidarity with us, so
I am moving the due date to Tuesday, February 19th in appreciation.
- 23 Jan 2008: I have started evaluating the assignment 1s handed in
so far. Based on just one test case, there were 16 distinct performance
values presented in the table. The best
performance was 8016 (334 * 24). These results are preliminary as
more testing needs to be done to verify that these really are working
circuits (but things look good so far for them). Note: two people's
handins I was not able to decrypt -- email has been sent to them and
hopefully they will drop by soon to sort out the problem with their
encryption keys.
- 22 Jan 2008: At 5:30 pm Tuesday, 23 people had handed in something for
homework 1, whereas I have encryption key forms for 26 people.
- 22 Jan 2008: Homework 2 description
now contains a full set of examples. I don't expect to add anymore unless
a problem arises over the next few weeks with it. Be sure and look at all
the boundary cases when testing (using signed 2's complement values and two
control signals, there
are more special cases than the first homework).
- 22 Jan 2008: Homework 2 description
is now available. It is incomplete in that I haven't put examples of
the usage of all the options in it. But otherwise it is stable.
- 22 Jan 2008: Referring to the Homework 2 overview posted on 21 Jan 2008,
note that y is a 16-bit 2's complement value, so when c == 1, you have to
worry about shifts that are both positive (to the left) and negative (to the
right). The kind of shift is logical, i.e., 0s come into the word from the
opposite direction as the bits of the word leave. For example, 8 shifted left
by 1 is 16, but 8 shifted left by -1 is 4. Also, -1 shifted left by 1 is
-2 and -1 shifted left by -1 is 32767. By the way, I am expecting homework 2's
solution to play a role in homework 3.
- 21 Jan 2008: I expect to have a writeup for Homework 2 soon. For
those who want to start thinking about it, I am thinking in terms of
a circuit that has 3 inputs: a 16-bit 2's complement number we will
think of as x, a 16-bit 2's complement number we will think of as y,
and a 2-bit unsigned number we will think of as c. The circuit will have one
output: a 16-bit 2's complement number we will think of as r. The
circuit will compute the following:
if c == 0 then r = x + y
if c == 1 then r = x << y
if c == 2 then r = -x
if c == 3 and x < y then r = x
if c == 3 and x >= y then r = y
- 21 Jan 2008: Well, so far (5:30 pm)
6 people have handed in something online. 26 people have handed in the
encryption key form. The registrar's office
has 33 people enrolled in this course. So I guess gaul will be busy tonight.
For the people who haven't yet handed in their encryption key form -- remember,
you can go ahead and submit something online using a key that you haven't
yet sent me at a 5% penalty providing you get the key and an encryption form
and given to me (along with verifying who you are via your student id) within
a week of the handin. Also, remember that in order to get credit for the
assignment, each person has to do their own handin (even if they are working
in groups).
- 18 Jan 2008: Friday 2-3pm added to the regular weekly office
hours times.
- 17 Jan 2008: Taking into account Monday's 340 class, I have extended
my office hours for the next couple of days to:
Friday (18th) from 2-3pm and Monday (21st) from 3-5pm.
- 17 Jan 2008: For this upcoming assignment, I will be in my office
Friday (18th) from 2-3pm and Monday (21st) from 3-4pm.
- 16 Jan 2008: Clarification on group work:
- Each group member is responsible for handing in a copy of the assignment.
The copies should all be the same. The mark for each member of the group
would be the same, except for possible differences in late penalties if
all members aren't handing it in by the same deadline.
- The report in the handin should contain a list of all group members.
It should NOT contain anyone's student ID number.
- 16 Jan 2008: By way of clarification on project implementation:
- Your h1.cdn file should contain the circuit (including table definitions,
INPUTS and OUTPUTS commands, as well as the relays, power, ground, and join
commands).
- What is happening is
that your circuit will have 32 distinct input wires named x[31], x[30],
..., x[0]. Normally how you would do this is by having a method like
String pHomework1(String base, int width, String result)
which you would invoke with pHomework1("x", 32, "r").
- Inside the method, when you want to refer to wire x[31], you would do it
by iW(base, 31). More generally, the i'th wire's name would be the string
returned by iW(base,i) .
- For example, if the assignment had been to have r[1] be 1 if all the 32 bits
of x was 1 and r[0] to be 1 if at least 1 of the 32 bits of x was 1, then
a solution might be:
String pExample1(String base, int width, String result) {
int i = width;
String chain1 = iW(base,0);
String chain2 = iW(base,0);
for (i = 1; i < width; i++) {
chain1 = pOr(chain1, iW(base,i));
chain2 = pAnd(chain2, iW(base,i));
}
pJoin(chain1, iW(result,0));
pJoin(chain2, iW(result,1));
return result;
}
then to create the Example.cdn file, you would do
pStandardTables();
pExample1("x", 32, "r");
p("INPUTS " + wireNames("x",32));
p("OUTPUTS " + wireNames("r",2));
p("CLEAR_CIRCUIT U");
p("INPUT_DATA " + binaryDigits(193,32)
p("END_OF_SIMULATION");
- 15 Jan 2008: Although I haven't set them as regular hours yet, I
will be in my office from 3-4pm 16 Jan 2008, if people have questions
they want to discuss or forms to hand in.
- 14 Jan 2008: Office hours have been announced (see top of this
page). They start this week. I am still settling the rest of my
schedule.
- 14 Jan 2008: A form for collecting encryption keys has been
created. Copies of it will be passed out in Tuesday's class. For
reference, the form is:
Note, if you need to change your encryption keys (for example, because
you lost or forgot them), print off another copy of the above form
and fill it out (and bring it to the professor along with your student
ID). You cannot change your encryption keys via email.
- 11 Jan 2008: General advice regarding working on Homework 1
- Since this is the first circuit you will have built in the system,
I would advise you to work bottom-up rather than top down. I.e.,
pick a small component you think will fit into the structure,
design it, build it, and then test it on the simulator and see if
it has the properties you expected. Then try and connect it to another
to build something closer to what you want. And so on. Once
you got something working, then you want to look at its structure to
see where the longest path in it is, so you can think about possible
alternatives that might shorten that path and hence the depth of the
whole circuit. You would also want to look at which subcomponents used
the most parts and think about other ways of building them or if they
are actually necessary.
- With regards to what is a `good' value of nodes x depth for this
assignment, I used to give out values that I have achieved, but
they turned out to cause more trouble than they solved because
on the one hand there is no guarantee anyone will find the approach
that I did and so a worse result might still get a good mark and
on the other hand there is no guarantee that someone else might
not find a better approach than I did in which case the people
who gave up once they reached my number would still end up with
a lower mark than they expected. From a general point of view,
we have 32 inputs and 2 outputs that are dependent on all 32
values. Thus, as a lower bound, we can observe that in order to
use 32 inputs, it would take 16 relays to reduce 32 inputs to 16
outputs, 8 more relays to reduce to 8 outputs, 4 more relays to
reduce to 4 outputs, 2 more relays to reduce to 2 outputs, and
1 more relay to reduce to one output, thus the least value to produce
one of the outputs would be 16 + 8 + 4 + 2 + 1 = 31 relays with a
depth of 5 or a performance of 31 * 5 = 155. How many more relays
are necessary to produce the other output is difficult to tell. If
we assume no sharing between the relay networks for the two different
outputs, then we would be looking at 62 (31 + 31) relays with a depth
of 5 or a performance value of 310. As an upper bound, I would say
that this problem requires less resources than addition of two 16-bit
numbers. While the book addition is not optimal, we can still look
at it as an upperbound on where this assignment should be. The simulator
shows that the 16-bit addition version of pAdd contains 368 nodes and
has a depth of 100, for a performance of 36,800. Thus, I expect
people's results to fall between 310 and 36,800 (although there is
an outside chance of a value smaller than 310 but more than 155).
- 9 Jan 2008: Homework 1 description.
- 9 Jan 2008: NOTE: All references to 2007 in the course outline,
specifically, class schedule section, have now been changed to 2008. Sigh.
- 9 Jan 2008: Regarding student information privacy concerns, the
following links may be of interest:
- 9 Jan 2008: The Java libraries and circuit simulator used for the first
three computer assignments can be found at
gaul:~webber/CS350/CircuitSimulator.BkWi06a4.tar.gz
- 7 Jan 2008:
Course Outline done
- 7 Jan 2008: Created announcements page.
I can be contacted by email at webber@csd.uwo.ca .
Be sure and include course number in subject line of your email.