As decided in class today, the due date for H3
has been moved from Monday 21 Nov 2011 to Monday 28 Nov 2011.
H3.html has been updated accordingly.
14 Nov 2011: 9:45pm: H2 and Practice final marks have been emailed
out along with a plan for Tuesday's class.
14 Nov 2011: H3.html gathers together the
spec and hint info for H3. The report is basically the same as for H2
except there is an extra category for average number of cycles per test
case (which is part of the performance measure for sequential circuits).
14 Nov 2011: With regards to H2, performance values were:
15,390 (my sample solution which is in the software distributions)
1.35 = 20792/15390
x/20792 x
1 20,792
1.22 25,388
1.45 30,160
1.919 39,920
2.05 42,640
2.649 55,080
3.929 79,620
4.989 103,744
7.986 166,054
54.117 1,125,216
54.1919 1,126,760
two handins had errors in only one opcode, seven handins had errors in 3 or
more opcodes, and 1 handin didn't compile.
With regards to the practice final, the marks were:
.7
.666
.633
.583
.566
.566
.466
.45
.45
.433
.4
.366
.366
.3
.3
.1
10 Nov 2011: The notes on answers to questions
on midterm has been updated. It was pointed out that there was an
error in my answer to question 5 in that I had indicated a signed right
shift was needed where in reality an unsigned right shift was needed.
8 Nov 2011: PracticeFinalNotes are
now up. As with the midterm, there probably won't be time for full converage
in class, so look over the notes and if you have any questions, raise them
at the start of Thursday's class or during office hours or email. Expect
at least one question like these on Final (in addition to questions like those
on the Midterm as the Final is cummulative).
5 Nov 2011: We now have H3.c in the distribution
CSv3.111105.tar.gz (new in gaul:~webber/CS350). Also, In addition
to SequentialRandom.java (which matches the output of H3.c on a correct
circuit), we also have SequentialRandomStepper.java which lets you watch
the output of your circuit change after each clock cycle. This new tester
is particularly useful for debugging, particularly if you augment your output
with the current state and the inputs and outputs of the controller (be
sure and remember that the done line should always be in the least significant
position as this is how the testers know to move onto another test case).
My initial solution takes 3764 transistors, has a depth of 57, and
the average test case is 107.56640625 . I haven't yet tested it on other
test streams nor spent any time optimizing it, so I expect these performance
values can be significantly improved upon.
4 Nov 2011: Regarding note on H3 below, the other change to Q5
was that instead of 255 being the upper limit, the upper limit is now
is now 256 * 256 * 256.
4 Nov 2011: For H3, you will be implementing the collatz circuit
from question 5 of the midterm. However, the input size will be changed
to 24 bits and the internal arithmetic will be as accurate as 32-bit
integer arithmetic. H3.c hasn't been written yet, but there the prototype
for it called colllatz2.c in CSv3.111104a.tar.gz (new in gaul:~webber/CS350).
In order to make collatz2.c into H3.c, the only thing that needs to be
changed is to make the output binary so it matches the result files.
With 24 input lines, SequentialExhaustive would take too long, so there
is a new tester called SequentialRandom. The random sequence collatz2.c
and SequentialRandom.java are using is:
r[-1] = seed * 4;
r[i+1] = abs(r[i] * (r[i] + 1)) % (256 * 256 * 256)
The program collatz2.c has been set up so that seed is 3967 and the numberOfTests
is 256 (seed and numberOfTests are macros in collatz2.c and class level statics
in SequentialRandom). Performance for this homework will be measured by
number of transistors times depth times average number of clock cycles per
test case on the sequence of 256 test cases where seed is set to 3967. However,
to determine if your circuit is working, I will pick other values of seed
and numberOfTests to run also. For 80 and above, the circuit has to work on
any test case possible. For the 50 to 70 category, it is enough that it works
on the 256 test cases coming from the seed 3967. Although the circuit is
simple, there are a number of interesting possibilities for optimization
in terms of number of transistors, depth, and even changing average number of
cycles per test case.
4 Nov 2011: CSv3.111103.tar.gz is in gaul:~webber/CS350 . It includes
a few small fixes to marie.c (and rebuild of marie.c.line and marie.c.line.pdf)
and the completion of the controller in marie.csd (and rebuild of marie.csd.line
and marie.csd.line.pdf). As mentioned before, marie.c.line.pdf and
marie.csd.line.pdf are appendices for the upcoming Practice Final (and will
evolve into appendices for the actual Final as well). The StandardComponents
appendix that accompanied the midterm will also accompany the practice final.
[Next up is to announce Homework 3.]
3 Nov 2011: CSv3.111103.tar.gz is in gaul:~webber/CS350 . It includes
a new version of marie.c with the fixes discussed in class today (marie.line
is the line numbered version and marie.line.pdf is the file that will be
printed to be an appendix to Tuesday's practice final). The new marie.c
has more line numbers and so the marie.csd file (marie.csd.line being the
line numbered version) reflects the change in line numbers. There is sample
testdata for marie.c in prog01 which produces the result in result01 by:
make marie
./marie < prog01 > result01
There will be
another distribution tommorrow to include the finishing up of the controller
in marie.csd which will then also be an appendix to the practice final on
Tuesday).
In addition to these, the file Midterm.tex is included for people
looking for an online copy of the midterm, collatz.c is the C code referenced
in question 5 and Add3.csd is the csd code referenced in question 1.
My solution to homework 2 is in H2.csd as well.
2 Nov 2011: notes on answers to questions
on midterm.
2 Nov 2011: The latest emailing of marks includes the midterm mark
as well as updating the practice midterm mark where appropriate. At the
bottom is a weighted combination of the marks so far (h1, practice midterm,
and midterm). So you can see where you stand relative to the rest of the
class, the following were the weighted combination marks:
0.839429
0.765714
0.757086
0.728286
0.717286
0.688286
0.658
0.652571
0.638286
0.612857
0.599
0.592571
0.564
0.557143
0.543714
0.507086
0.506857
0.505714
0.464286
27 Oct 2011: Also, the TA points out that the textbook has an error
on page 101 in its Hamming code example. The book says bit 4 checks parity
over bits 4 and 5 and it should be 4, 5, and 6. Also, note Example 2.39
says `assume even parity' and Example 2.40 says `assume odd parity'. My
Hamming code C code assumes even parity. Basically the book is saying there
are two Hamming code systems, even parity Hamming code and odd parity Hamming
code.
27 Oct 2011: The TA will have a special office hour 11:30 - 12:30
on 28 Oct 2011 in room 4A of Middlesex College for last minute questions
before the midterm on Friday
24 Oct 2011: There is a new release of the course software in
gaul:~webber/CS350 with the name CSv3.111024.tar.gz . The changes
don't impact H2, but relate to CRC and Hamming codes. There are three
new .c files in CSDExammples. mod2divrem.c calculates the remainder
for general mod2 division (as used in CRC codes). If you type:
./mod2divrem 75 11
you get the result 5, just like Example 2.36 in text. If you type
./mod2divrem 5 3 2
it gives the exhaustiveTester table for a generla mod2 remainder
circuit whose first input has width 5, second input width 3, and
the width of the output is 2. Rather than outputting in hex, I
figured out how to make binary output so that it would match the
ExhaustiveTester output directly (not needing a mod2divremReader program).
Note that division into 0 or by 0 are special cases for such a
circuit which I produce and output of 0 for.
If you recall, back when we were covering this in class, I mentioned that
it is an interesting exercise to build a circuit that checks a 12 bit
input to see if it is the valid CRC encoding of an 8 bit input where
the divsor was 31. crc31checker.c generates the exhaustiveTester output
for such a circuit. crc31checker.csd is my solution to that problem.
Finally, hammingFix.c generates the tables for a circuit that tries
to fix inputs to conform to the Hamming code conventions. If you type
./hammingFix 6
it produces the exhaustiveTester tables for such a circuit with input
width 6 and output width 7. The most significant bit of the output indicates
whether or not the input was fixable, and the remaining bits are the original
input if it wasn't fixable and the fixed input if it was fixable. It only
works for widths up to 30.
22 Oct 2011: By the way, when I post an answer to H2, it will
have a depth of 15 and take 1026 transistors for a performance of
15,390. As with H1, I expect some of the handins to be even better
than this.
19 Oct 2011: In gaul:~webber/CS350 there is now a new directory
called ContribFall2011 . The first files added to this directory,
H2Conv.README and H2Conv.py come from a fellow classmate who wrote a
python script to make debugging his H2 solution easier by working in
binary instead of hex . Attribution, description, and usage info can
be found in H2Conv.README. The software is provided on an `as is' basis.
18 Oct 2011: After class today, a student reported having trouble
with H2Reader.c Upon closer examination, it turned out that the bound
on how many rows H2Reader processes was wrong -- it has now been fixed
and the correct version of H2Reader.c can be found in CSv3.111018.tar.gz
in gaul:~webber/CS350. Note: this new version was tested against the
circuit H2Broken.csd in the CSDExamples directory. By the way, C code
is not always portable between Intel and Gaul/Sparc machines due to byte
ordering issues. It is the behavior of the C code on Gaul that defines
a check of the solution of a working circuit.
17 Oct 2011: Just sent out encrypted emails for the practice midterm
marks. 14 people took the practice midterm. Of these the top mark was
25 out of 30, five papers were in a cluster from 21 to 25; the rest were
uniformly distributed from 17.5 down to 11. There were two 10s on question 3,
one 10 on question 2, and two 9s on question 1. All but one of the people
in the 17.5 and below group were there because at least one of the questions
they answered was below 3. 1 person got question 1 below 3 (by skipping it
completely), 3 people were below 3 on question 2, and 5 people were below
3 on question 3. Questions 1 and 3 will be
covered in depth in class on Tuesday. Question 2 will be left with just
the hint provided on the 13th so that people can take a second attempt at
it. The TA will collect second attempts on Question 2 (and attempts to
apply DeMorgan's law to improve question 2 answers) on Thursday for
purposes of feedback (not to effect past mark, but hopefully to make
people better prepared for future marks).
16 Oct 2011: H2's handin spec is now
available. It is much like H1, except the problem is defined by H2.c
in the distributions since CSv3.11107.tar.gz that have been available
for over a week now. Three changes have been made regarding the
report2 handed in along with your csd file:
- A request that you use the question numbers so it is clear which
question you are answering.
- The marking scheme for non-working circuits depends on how many
opcodes you were able to get to work and so there is a new question 2a
asking you which op codes you got to work.
- As indicated before, question 8 on optimization has taken on greater
importance. The maximum penalty on messing it up has been set at 15%
(a day and a half of lateness). For details on what is wanted, see
the handin spec.
As indicated in the handin spec, in gaul:~webber/CS350 there is a
new software release CSv3.111016.tar.gz . New additions are the gcd
example in CSDExamples we discussed in class Thursday and the
new SequentialExhaustive tester also discussed in last Thursday's class.
I also renamed the CSv311107.tar.gz distribution to be CSv3111007.tar.gz .
13 Oct 2011: The practice midterm was done in class today and
is now available online. Also
available are notes on answers to practice
midterm, which may be updated after marking is done.
11 Oct 2011: Announced in today's class, due date for H2 was moved
from Monday 17 Oct 2011 to Monday 24 Oct 2011. Everything you need to solve
it has already been posted as of the 7 Oct 2011.
7 Oct 2011: CSv3.11107.tar.gz is now available in gaul:~webber/CS350 .
In the CSDExamples subdirectory, it contains the H1a.csd file discussed
in class Tuesday. It also contains H2.c, which is a C implementation of
the Homework 2 spec announced 6 Oct 2011. There is also an untested
program H2Reader.c which should convert a result file from the circuit
simulator into the same format as the output of H2.c for comparison
(just like I used normReader.c to compare the results of my norm2.csd
circuit to my norm1.c program output). Once I get a chance to solve H2,
I will post an actual result file for it as well. Incidently, H2.c has
an option where if you define the macro DEBUG, it will print out the
separate fo, so, op, r, and c parts of the input and output in addition
to the flat 11 bit input pattern and 6 bit output pattern. With 11
bit input, the truth table is 2048 entries and so ExhaustiveTester
still works fine for this problem.
6 Oct 2011: Homework 2 problem. A more complete presentation with
a table file will come out Friday or Saturday as well as handin instructions.
However, the basic specification is that your circuit will take 11-bits
of input bit[4] fo, bit[4] so, bit[3] opcode and will be producing a
four bit result bit[4] r and a two bit condition code bit[2] c. Basically
you implementing an ALU as we discussed in class and as illustrated in
Figure 3.17 of your text. The 8 operations that you will be supporting
(assuming fo and so are viewed as 2's complement integers) are:
opcode operation
0 r = fo + so
1 r = fo - so
2 r = fo << 1
3 r = fo >> 1
4 r = fo ^ so (XOR)
5 r = fo
6 r = S-AES(fo)
7 r = S-INVERSE-AES(fo)
S-AES is a simplified version of the AES encryption algorithm that ccrypt
uses. In regular AES, the function takes in 8-bits and produces an 8-bit
result. In the simplied S-AES algorithm, it takes in 4-bits and produces
a 4-bit result. Its boolean table is:
0000 -> 1001
0001 -> 0100
0010 -> 1010
0011 -> 1011
0100 -> 1101
0101 -> 0001
0110 -> 1000
0111 -> 0101
1000 -> 0110
1001 -> 0010
1010 -> 0000
1011 -> 0011
1100 -> 1100
1101 -> 1110
1110 -> 1111
1111 -> 0111
As you can see, this function is 1-1 and so it has an inverse. S-INVERSE-AES
has the table of the inverse function. For example, the inverse function
maps 1001 to 0000 whereas the regular function is mapping 1001 to 0010.
The idea for this simplified version of the AES encryption standard comes
from William Stallings' Cryptography and Network Security: Principles and
Practice -- Fifth Edtion. Of course, to implement the circuit for these
two truth tables, you don't need to know anything about the AES encryption
algorithm. However, we could think about this ALU as one that was
optimized to make S-AES encryption more efficient for the programmer running
on a CPU that had this ALU.
The condition code c[0] is 1 if r is zero and 0 otherwise. The condition
code c[1] is 1 if r is negative (2's complement) and 0 otherwise.
For the 50-70 category, at least 6 of the 8 opcodes must work
perfectly. For the 80 - 100 category, we are still looking at
optimizing number of transistors times depth on a circuit that
gets the right answer for all inputs (including all opcodes).
6 Oct 2011: As mentioned in class, last semester's midterm is online.
Actually, what is online is the questions with answers. So I have
created a new file
justQuestionsMidtermFeb11 so you can
work on the questions without being influenced by the answers. When you
are done you can then look at
notesMidtermFeb11 to see the answers.
Note that questions 1 and 2 are most relevant to the practice midterm
as they just involve combinational circuits. Question 3 relates to
material we covered today in class as it involves tracing the negative
edge-triggered D-flipflop. If you didn't get the circuit copied down
in class today, it can also be found online at
play-hookey and along with a bunch of other circuits at
ece385 flipflop lecture. For purposes of this question, the
first column of NOR gates were labelled from top to bottom P0, P1,
P2, and P3. Questions 4 and 5 are about material we are heading into
next week. Remember that the primitive FLIPFLOP referred to in these
last two questions have been replaced with SYNC in the latest CSD
software (same inputs and outputs, just a different name for the
primitive). As you can imagine, last semester's students were interested
in what they could look at to prepare for their midterm. There was less
for them, but what there was was linked to in the
oldProblems file.
5 Oct 2011: The marks for H1 have been emailed out to your uwo.ca
email account (7:31pm). If you did not get email, then that means I
didn't get homework handin from you. The email includes instructions on
how to decrypt it.
Many of the report1s had poorly written descriptions of the optimizations
done. This will count for more in future assignments. A properly written
description of optimization would start with the basic design/approach
to the problem and then from there describe how it was alterred to improve
performance. A number of people didn't bother doing any optimizations
such as DeMorgan's Law applications. Note that on my exams, there are
often small problems that require people to apply standard optimizations
in order to get the right answer, so if you don't understand how they
work, ask either me or the TA soon (certainly before the exam or next
homework).
3 Oct 2011: Of the 17 people who handed in solutions to H1, 12
worked. The best performance was 636, so below you see transistors *
depth = performance ( (performance / 636) * 636 ) and mark worth before
late (and other) penalties. As you can see, this fits nicely on a
quasi-logarithmic scale where each performance improvement by a factor
of 2 moved us 2.5% . 636 was more than 8 times better than my sample
solution, which in turn was 25 times better than the worst performance
solution (which was still better than nearly half the class (9 out of 21)
who were not able to get the circuit to work). The marks won't be out
til Tuesday as I still have to sort through the non-working solutions and
read the reports.
1 = 2 ** 0 (100)
106 * 6 = 636 (1 * 636) 100
116 * 6 = 696 (1.09434 * 636) 99
122 * 6 = 732 (1.15094 * 636) 99
116 * 7 = 812 (1.27673 * 636) 99
130 * 8 = 1040 (1.63522 * 636) 98
2 = 2 ** 1 (97.5)
198 * 8 = 1584 (2.49057 * 636) 97
176 * 14 = 2464 (3.87421 * 636) 96
4 = 2 ** 2 (95)
8 = 2 ** 3 (92.5)
402 * 14 = 5628 (8.84905 * 636) --- my sample solution
410 * 14 = 5740 (9.02516 * 636) 92
16 = 2 ** 4 (90)
714 * 23 = 16422 (25.8208 * 636) 89
32 = 2 ** 5 (87.5)
750 * 32 = 24000 (37.7358 * 636) 87
802 * 47 = 37694 (59.2673 * 636) 86
64 = 2 ** 6 (85)
128 = 2 ** 7 (82.5)
1690 * 84 = 141960 (223.208 * 636) 81
256 = 2 ** 8 (80)
30 Sept 2011: Well, the testing on norm2.csd finished and it passed
all 33,554,432 test cases. Incidently, the output of norm1.c was 568
megabytes. The output of ExhaustiveTester run on norm2.csd was 3,825
megabytes (or 3.8 gigabytes). Basically, the test cases took a week to
run on a 2.26 GHz Quad-Core Intel Xeon with 8 GB of 1.066 GHz DDR3 ram.
So, 25 input pins is a bit large for exhaustive testing using the current
system. Note that, all other things being equal, the amount of time it
takes to do exhaustive testing (as well as the size of the tables produced)
should be exponential in the number of input pins.
29 Sept 2011: as of 2pm today, 17 students have handed in assignment 1
and 21 encryption keys have been registered. All of the files handed in
decrypted successfully. 15 claim to work and 2 claim not to work.
Reminder: you can hand in an assignment more than once and handing in a
non-working circuit now is better than handing in the same non-working
circuit later. By the way, regarding the norm2.csd circuit, 33.44 million
test cases have executed so far of the 33.55 million needed, so by tommorrow
we should know if the circuit worked.
28 Sept 2011: It turns out there is a clearer discussion of CRC
on wikipedia.
The division shown in example 2.36 of the text is just to show how
division works -- it isn't the calculation of a CRC remainder. To make
it the calculation of a CRC remainder, one would have to append 000 to
1001011 making 1001011000 as mentioned in Step 3 on page 94 (the perils
of reading too quickly). By appending a pattern of 0's one less than the
number of bits in the divisor, this allows a 17-bit pattern like CRC-16
to be used as a check on a 16-bit word (which becomes 31 bits once the
zeros are appended). Also, the wikipedia page notes that you get parity
by using 11 as your divisor rather than 10 (note division here is different
from `integer' division). When building a circuit, one should be sure
and handle this correctly. Incidently CRC checksums are part of a wide
range of communication protocols from USB to ATM to Ethernet to Bluetooth
to SCSI to cordless telephones to OpenPGP.
27 Sept 2011: Reminder: in class today, I mentioned a sample
problem people could work on to keep improving their circuit design
skills. Coming Section 2.7.1 of the text, we have the CRC algorithm.
I proposed people work on the problem of having an 8-bit value with a
4-bit syndrome which is the remainder when dividing by 31 using the
bitwise modulo-2 division they show in the text.
27 Sept 2011: So far, 20 people have handed in an encryption form
and 14 people have handed in an assignment 1 solution (one of which didn't
have an encryption form and so can't be read yet, leaving 13). Of the 13,
12 claim to be handing in something that compiles and meets the spec.
26 Sept 2011: A few people have mentioned getting error messages
about their circuits when running ExhaustiveTester. The line numbers
mentioned in those error messages are not line numbers in the original
CSD file, but rather line numbers in the NL file produced by CSDDesign.
If you look up the line number in the NL file, you will find that there
are a bunch of comments around it which indicate what part of the original
CSD file is at issue.
26 Sept 2011: Of the 33,554,432 test cases, I have so far run,
17,795,448 of them (3:40 on Friday 23rd til 7:30 on Monday 26th).
So far, norm2.csd has passed each test case, so CSv3.110923a.tar.gz
still looks good, but it will be a few more days before testing will
be completed on norm2.csd.
23 Sept 2011: Of the 33,554,432 test cases, I have so far run
424,716 of them (taking a couple of hours to do so). So far, norm2.csd
has passed each case (after making the fixes talked about in class
Thursday, i.e., compute tz from in rather than temp, sway order of
args to ite in box definition, and
add 1 rather than 0 in incr5 definition). As mentioned below, norm4.c
was deleted. A new program, normReader.c, was written to convert the
output of ExhaustiveTester run on norm2.csd into the format of the
output of norm1.c, so that they can be compared with diff.
The new release including the above is CSv3.110923a.tar.gz in
the usual place.
23 Sept 2011: Testing the handout material from Thursday's
class, norm1.c, norm2.c, and norm3.c seem to work fine. norm4.c
turns out to be a failed idea as in order for the countLeadingZeros
function to work the way it is designed, it has to also be shifting
i while it counts (like norm in norm3.c does), so appears to be no
benefit to having a separate shiftLeft function. A new release with
norm4.c deleted and the fixes to norm2.csd made should be out later
today (after I have a chance to test the norm2.csd changes).
23 Sept 2011: Looking at some people's CSD designs, I have noticed
a few problems that are worth keeping an eye out for:
- When you have multiple variables that differ by a single letter,
it can easily happen that the wrong variable name gets typed. Since
there is no logical rational for typing the wrong variable name (it is
just something that happens from time to time), the resulting behavior
tends to be perplexing.
- When you grab the wrong variable name, it is possible to connect
wires that completely bypass your circuit (great for performance, but
tends not to generate right answers).
- When you grab the wrong variable name, it is also possible to
generate a loop in your circuit. Using the primitives you have been
taught so far, a circuit with a loop in it will generally not do what
you want it to. Generally the error of noticing a loop in the circuit
doesn't occur until the simulator is running, at which point the error
messages have little to do with the csd file and more to do with the
netlist nl file. If you look at the nl file, you will find comments
in it indicating what part of the nl data corresponds to what part
of the original csd file.
22 Sept 2011: A new release CSv3.110922a.tar.gz has been put up
in the usual place. There was a mistake in norm2.csd that I fixed
before class.
22 Sept 2011: A new release CSv3.110922.tar.gz has been put up
in the usual place. No changes have been made that relate to the
homework. What has been done is that this release contains the file
for the reference card CSD_RefCard.ps and CSD_RefCard and some C
programs for discussing different approaches to the normalization
problem we spent time on Tuesday and will some more in today's class
(so these will be handouts today). There is also a copy of CSD code
similar to that developed in class Tuesday under CSDExamples with
the name norm2.csd. The C programs and the CSD code all compile,
but I haven't had time to exhaustively test them to make sure they work
(with 25-bit inputs, exhaustive testing involves a table of 33,554,432
entries).
20 Sept 2011: As a general observation on problem solving, often
it works best to attack simpler related problems before going to the
target problem. For example, our target problem is -109 <= x <= 109
on 8-bit inputs. A number of simpler problems present themselves:
- -5 <= x <= 5 on 4-bit numbers
- x <= 109 on positive 8-bit numbers
- -109 <= x on negative 8-bit numbers
- -64 <= x <= 64 on 8-bit numbers
- -63 <= x <= 63 on 8-bit numbers
In the case of the first example, what is going on would be small
enough that you could output most of the relevant internal results
and hand trace what was going on.
Note that in the case of the last four, not only would you gain
understanding of how to do the homework, but you would have something
that is right for a substantial numbers of input and so worth a mark
in the 50 to 70 range.
20 Sept 2011: For details on handing in the assignment etc.,
see H1.
15 Sept 2011: Office hours announced (see top of this page).
15 Sept 2011: Forms for establishing an encryption code for online
handins will be/were handed out in class today. In case you need to change
your encryption code or lose your form, copies of the form are also
available online:
14 Sept 2011: By the way, homework 1 will involve you creating
a circuit that takes in an 8-bit 2's complement number and outputs
1 if the input is between -109 and 109 inclusive (i.e., -109, -108, ...,
0, 1, ..., 108, 109). The result of running my sample solution
through the compiler and simulator can be found in CSDExamples/H1a.result
in the CSv3.110914.tar.gz distribution. My solution used 402 transistors
and had a depth of 14 (for a performance measure of 5628). Doubtless
some of you will find even better solutions. The basic marking scheme
will be that solutions that work on all test data will get a mark between
80% and 100% based on the performance measure. Solutions that work on
a reasonable amount of test data will get a mark between 50% and 70%
based on how close they appeared to come to getting all the test data
right and how good their performance measure was. Solutions that
don't work on a reasonable amount of test data but appear to be
plausible starts on the project will get a mark between 20% and 40%,
based on how close they seemed to come to starting to work on a reasonable
amount of test data. Handing in nothing yields 0%. A more detailed
homework description will come out soon, but the above should give
you enough information to get started. We will be talking more about
using the CSD circuit notation in class Thursday (the 15th). Remember
these are NOT group projects, but individual efforts. Also, being able
to do this sort of stuff will definitely show up on exams, so the homeworks
are good practice/ study.
14 Sept 2011: CSv3.110914.tar.gz is up in the usual place. As
mentioned in Tuesday's class, in CSDExamples, there is the file
FourBitAdder.visual from the handout FourBitAdder.ps . You may recall
that FourBitAdder.csd took 360 transistors and had a depth of 20
(for a performance figure of 360 * 20 = 7200). The file FourBitAdderKM.csd
uses Karnaugh Maps to tighten up the calculation of carry improving
the circuit to 296 transistors and a depth of 17 (17 * 296 = 5032, which
is 1.4 times better). Applying DeMorgan's Law to FourBitAdderKM.csd
produces FourBitAdderKMDL.csd which uses 224 transistors and a depth of 9
(9 * 224 = 2016 which is 3.5 times better than the original). By
ad hocly tweaking the calculation of carry, I got a further improvement
with FourBitAdderKMDLX.csd of 216 transistors and a depth of 7
(7 * 216 = 1512 which is 4.76 times better than the original).
Using diff, you can see that the .result files produced by the simulator
for each of these designs are identical (except for the performance
statistics of number of transistors and depth).
13 Sept 2011: As mentioned in class today, the software and documentation
we will be using for the homeworks/ projects is in:
gaul:~webber/CS350/CSv3.110910.tar.gz
The compiler is in the root directory, and adder example can be found
in the CSDExamples subdirectory and the simulator can be found in the
CurrentSimulator subdirectory. To run the example, one would:
javac *.java
cd CurrentSimulator
javac *.java
cd ..
java CSDDesign < CSDExamples/FourBitAdder.csd > CSDExamples/FourBitAdder.nl
cd CurrentSimulator
java ExhaustiveTester < ../CSDExamples/FourBitAdder.nl > ../CSDExamples/x
The file x would then show a truth table for the circuit .
13 Sept 2011: Some links of interest re Chpt 2 & 3:
8 Sept 2011: The textbook is in the bookstore -- they have over 30 copies of it
and charge 161.50 for it. Interestingly enough,
Amazon sells the same edition for US 126.89.
8 Sept 2011: Two more references of note re Chpt 1:
7 Sept 2011: Four references of note regarding the future of computing:
Regarding history, there is, of course, a
wikipedia entry that gives a good start.
1 Sept 2011: For those with connections to past versions of this course, you
will note that we have a new textbook this semester: The Essentials of
Computer Organization and Architecture (ECOA), 3rd Edition by Linda Null and
Julia Lobur as opposed to Computer Architecture and Organization: An
Integrated Approach (CAOIA) by Miles Murdocca and Vincent Heuring. Most of
the textbooks in this area tend to be written by engineers, whereas
ECOA is written more from a computer science education point of view.
For people with some exposure to CAOIA, the following points will be of
interest:
- There is a copy of ECOA on reserve in Taylor and the exams are
closed book, so no one will be consulting their copy of the textbook
during exams.
- In CAOIA, ARC was used as an example CPU in much of the discussion.
ARC was a SPARC-like design. In CAOIA, MARIE is the used to illustrate
basic concepts. By way of comparison, ARC had 25 op codes and 32
registers whereas MARIE has 15 op codes and a single accumulator
register.
MARIE is discussed most directly in Chapter 4 of ECOA
but details about it can also be found in a word document that accompanies
the simulator's online
dowload site
and in a
general article discribing the system.
- One focus of the course is seeing how software can be converted into
hardware by sequential and combinational circuits. CAOIA handled this
in an Appendix, whereas ECOA handles in in the body of the text as
Chapter 3. In both cases the discussion is rudimentary and needs to be
supplemented by homeworks and class discussion. ECOA is probably a bit
better intro to the topic, but CAOIA expands more details.
In particular, ECOA lacks the schematic of an edge-triggered D-flipflop
as well as a discussion of carry-lookahead adders.
- Also of interest is the memory hierarchy with particular
attention to Caching. CAOIA handles this in Chapter 7 whereas ECOA handles
it in Chapter 6. At first glance, both seem to develop the same terminology
in a similar fashion.
- Also of interest is performance evaluation and benchmarking. Much
of the discussion is in CAOIA, but it is scattered, whereas ECOA has it
focussed in Chapter 11.
- Also of interest is software/hardware access to I/O. CAOIA handles
this in Chapter 8 while ECOA handles it in Chapter 7.
- As a general observation, both CAOIA and ECOA cover much the same
material, so the shift in text is not a major change in the course.
However, CAOIA tends to be more detailed in 500 pages whereas ECOA
clocks in at 738 pages (nearly 50% larger). Ultimately, I think ECOA
will prove to be a better aid to study in this course.
1 Sept 2011: Course outline
is available.