CS 2120: Class #12b
====================
.. Warning::
We are into the domain of *nontrivial* algorithms. You should **not** expect
to look at these algorithms and immediately understand how they work.
(In fact, if you can do that, you should immediately enroll in grad school
in Computer Science). These algorithms require *careful study* to make
sense of. The best tool you have for understanding how they work is to
*carefully trace their operation, step by step*. Trying to understand
complex algorithms "all in one go" is a recipe for frustration. Take your
time, and step through the algorithm (with pen and paper) on a small input to get a feel
for what it's doing.
Sorting
^^^^^^^^
* Believe it or not, there are a *lot* of ways to sort a list!
* Some algorithms are just plain bad.
* Some are usually pretty good.
* There is no *best* algorithm for *any* list. Just tradeoffs.
* We're going to look at a few simple sorting algorithms now.
* At the end of the class, we're going to learn about *recursion*, which will allow
us to revist sorting and see some really cool (and efficient) algorithms.
Insertion sort
^^^^^^^^^^^^^^^
.. image:: ../img/insertion.gif
* You give me a list called ``inlist``
* I create a new, empty, list called ``sortedlist``
* For each element in ``inlist``
* I *insert* that element into ``sortedlist`` in the correct spot.
* e.g., if I'm asked to insert ``5`` into the list ``[1,3,7]``, I should end up with: ``[1,3,5,7]``
.. admonition:: Activity
Do an insertion sort, with pencil and paper, on the list ``[3,7,15,9,4,11,1,5,2]``. Record
the value of ``sortedlist`` at each step.
* If our lists were huge and we didn't want to waste space, we could make a
small change and *remove* each element from ``inlist`` before we insert it into
``sortedlist``.
Let's have a look at insertion sort in Python::
def insertion_sort(inlist):
sorted_list = []
for element in inlist:
i = 0
while i < len(sorted_list) and (element > sorted_list[i]):
i = i + 1
sorted_list.insert(i,element)
return sorted_list
.. admonition:: Activity
Modify the ``insertion_sort()`` function above so that it prints out the value
of ``sorted_list`` after each iteration of the `for` loop. Try sorting a few
lists and following the output. Does it make sense to you?
* Now that we've got the idea down, let's be computer science nerds about it.
.. admonition:: Activity
How many times do I go around the `for` loop in ``insertion_sort()`` ? On each
trip through the `for` loop, I also have to go through the inner `while` loop.
* How many times do I go through the `while` loop, on average?
* In the worst case?
Is Insertion sort the best possible sort? Can we do better?
.. raw:: html
Selection Sort
^^^^^^^^^^^^^^^
.. image:: ../img/selection.gif
* You give me a list called ``inlist``
* I scan through the whole list to find the smallest element
* I swap the smallest element with the first element in the list
* I repeat the above process for the remainder of the list (excluding the first element)
* Lather, rinse, repeat.
.. admonition:: Activity
Do a selection sort, with pencil and paper, on the list ``[3,7,15,9,4,11,1,5,2]``. Record
the value of your list at each step.
* Different idea than Insertion sort, but still gets the job done!
* This is a very important thing to understand:
* *Sorting* is a *problem*, not an algorithm
* There are (infinitely) *many* algorithms to solve any (solvable) problem
* Some algorithms will always solve the problem more efficiently than others
* Some will solve the problem more efficiently only for certain conditions
* For some problems we can *prove* that a particular algorithm is the best
(in the sense that any other algorithm can, at best, be equally efficient)
* For many problems, we *still don't know* how to do this!
* Fortunately, for sorting we *do* know how to do this analysis... and both
Insertion Sort and Selection sort kinda suck.
Let's see Selection sort in action::
def selection_sort(inlist):
for i in range(len(inlist)):
# Find the smallest remaining element
min_index = i
min_val = inlist[i]
for j in range(i+1,len(inlist)):
if inlist[j] < min_val:
min_val = inlist[j]
min_index = j
# Swap it to the left side of the list
inlist[min_index] = inlist[i]
inlist[i] = min_val
return inlist
.. admonition:: Activity
Modify the ``selection_sort()`` function above so that it prints out the value
of ``inlist`` after each iteration of the outer `for` loop. Try sorting a few
lists and following the output.
.. admonition:: Activity
How many times do I go around the outer `for` loop in ``selection_sort()`` ? How
about the inner `for` loop?
.. raw:: html
Bubble Sort
^^^^^^^^^^^^
.. image:: ../img/bubble.gif
* Maybe you find Insertion sort or Selection sort ugly or offensive?
* No problem. Remember: there are *many* algorithms to solve any one problem.
* You give me a list called ``inlist``
* I scan through the list, looking at adjacent pairs of values.
* If I see a pair that is "out of order" (e.g., ``[17, 9]`` ), I swap the two values to be in order ( ``[9,17]`` ).
* I keep doing that until the list is sorted.
.. admonition:: Activity
Do a bubble sort, with pencil and paper, on the list ``[3,7,15,9,4,11,1,5,2]``. Record
the value of your list at each step.
* It's called "bubble sort" because the smaller values seem to "bubble up to the top".
* Kinda cool because:
* We end up effecting a *global* change on the list (it goes from unsorted to sorted)...
* ... but we only use *local* information about the elements (we only ever compare neighbours in the list)
Let's see Bubble sort in Python::
def bubble_sort(inlist):
swapped_something = True
while swapped_something:
swapped_something = False
for i in range(len(inlist)-1):
if inlist[i] > inlist[i+1]:
tmp = inlist[i]
inlist[i]=inlist[i+1]
inlist[i+1]=tmp
swapped_something = True
return inlist
* Ugh... Wouldn't the above code be better if there were comments?
.. admonition:: Activity
Modify the ``bubble_sort()`` function above so that it prints out the value
of ``inlist`` after each iteration of the outer `while` loop. Try sorting a few
lists and following the output.
.. admonition:: Activity
How many times do I go around the outer `while` loop ? How
about the inner `for` loop?
.. admonition:: Activity
When I swapped elements, I used a third, temporary, variable called ``tmp``. Could I have done
the swap without ``tmp``? Was using ``tmp`` wasteful?
.. raw:: html
Bogosort
^^^^^^^^^^
Here's another attempt at a sorting algorithm::
import random
def is_sorted(inlist):
last = inlist[0]
for element in inlist[1:]:
if last > element:
return False
last = element
return True
def bogosort(inlist):
while not is_sorted(inlist):
random.shuffle(inlist)
return inlist
.. admonition:: Activity
How does this sorting algorithm work? We're "working backwards" this time. Starting from the code,
come up with an English explanation for how the algorithm works. You might want to add a ``print`` statement
after the ``random.shuffle(inlist)`` line to get some intuition. If you aren't sure what ``random.shuffle()``
does... look it up, or just *try* it on some sample lists. Likewise, you'll have to figure out what
``is_sorted()`` is doing (though the name should help).
.. admonition:: Activity
Is this a good sorting algorithm? How many times do I have to go through the ``while`` loop in
``bogosort``? How about the ``for`` loop in ``is_sorted()``?
WTF!?
^^^^^
* With respect to SEARCHING, why would someone want to sort a list in order to search it slightly faster when sorting is so slow?
* Well, we might want to search the same list many times.
* We only need to sort it once.
* We might want to sort something without the end goal of searching.
* BUT, actually there are better sorting algorithms...
Why are we doing this again?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* In your day-to-day life as a research programmer, you won't write your own
sorting routines. You'll rely on routines written by others, like Python's
built-in ``sort()`` (which, by the way, uses the `Timsort algorithm `_ )
* BUT... even if you don't build the tools yourself, you should understand how they work
* More importantly: you may need to develop your own algorithms for some task that is
much less well-studied than sorting.
* You're learning fundamentals of algorithm development here... not just the details of sorting.
The horrible truth
^^^^^^^^^^^^^^^^^^^
* All three of Insertion, Selection, and Bubble sort generally suck as sorting algorithms.
* BUT... they are within our current means.
* Once we've studied *recursion*, we will revist sorting and see two *very good* sorting
algorithms (Quicksort and Mergesort).
* If you want to geek out on sorting *right now*:
* `The relevant Wikipedia page is very good `_
* Knuth's `The Art of Computer Programming Volume 3: Sorting and Searching `_ .
* It would be nearly impossible to overstate the importance of Donald Knuth's contributions to Computer Science.
Let's see some sorting in action!
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* http://www.sorting-algorithms.com/