CS 2120: Class #9
=================

.. This chapter on NumPy Arrays, Matrices and floats!

Array -- first steps
^^^^^^^^^^^^^^^^^^^^^

* The *list* was our first data structure.
* Now we're going to meet a similar, but slightly different, one: the *array*.
* Let's get started:

    >>> import numpy
    >>> a=numpy.array([5,4,2])
    >>> print a
    [5 4 2]

* Looks a lot like a list, doesn't it?
* Can we manipulate it like a list?

    >>> print a[0]
    5
    >>> print a[1]
    4

* We can definitely *index* it, the same as a list.
* I wonder if arrays are *mutable*?

    >>> a[1]=7
    >>> print a
    [5 7 2]

* Yes, arrays are *mutable*.
* With lists, I could mix types in a single list. Like this:

    >>> l = [5,4,3]
    >>> l[2] = 'walrus'
    >>> print l
    [5, 4, 'walrus']

* Can I do that with arrays?

    >>> a=numpy.array([5,4,2])
    >>> a[2]='walrus'
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: invalid literal for long() with base 10: 'walrus'

* Ah ha! We found a way in which arrays are different.
* Lists are just collections of stuff. Any old stuff. Each element can be of a different type.
* In an array, *every element must have the same type*!

.. admonition:: Activity

   Create two arrays of integers, each having the same number of elements. What mathematical operations can you do on the arrays? (``+,-,*,/``).

   What happens if you try to perform the operations on arrays of different sizes?

   How does ``+`` work differently on arrays than lists?

NumPy object attributes
^^^^^^^^^^^^^^^^^^^^^^^^^

* We can ask NumPy what type the items in an array have like this:

    >>> a.dtype
    dtype('int64')

* This is a new notation for us. We're used to passing something to a function, which will tell us the type. Like this:

    >>> type(something)

* Here, we instead asked NumPy to tell us about an *attribute* of our array (in this case the attribute ``dtype`` -- standing for "data type"):

    >>> a.dtype

* *Objects* in NumPy have many attributes. These will mostly get set automatically for you, but we'll need to set a few of them manually later.
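
To make the idea of an *attribute* a little more concrete, here is a minimal sketch (not part of the original notes) that looks up a few attributes of a small array and shows what the "single type" rule means in practice. It assumes NumPy is installed and uses ``print()`` as a function so that it also runs under Python 3; the name ``rejected`` is just an illustrative choice. The exact integer ``dtype`` reported depends on your platform.

```python
import numpy

# Build the same small integer array used in the notes.
a = numpy.array([5, 4, 2])

# Attributes are looked up with a dot; no parentheses are needed.
print(a.dtype)   # an integer type such as int64 (platform dependent)
print(a.ndim)    # 1
print(a.size)    # 3

# Every element shares the array's dtype, so a float stored into an
# integer array is truncated to fit rather than kept as a float.
a[1] = 7.9
print(a)         # [5 7 2]

# A value that cannot be converted to the dtype is rejected outright.
rejected = False
try:
    a[2] = 'walrus'
except ValueError:
    rejected = True
print(rejected)  # True
```

Note that the float assignment fails silently (the ``.9`` simply disappears), while the string assignment fails loudly. Both follow from the array having exactly one ``dtype``.
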
* If you want to see all the attributes your array has you can either type ``a.`` and then press the [Tab] key (if you're using ipython) or you can do:

    >>> dir(a)

* That's a lot of attributes!
* Some of those attributes are things like ``dtype`` that store information about the state of the object.
* Some are special functions that can only be applied to that object. For example, every NumPy object comes with its own ``view`` function. Check it out:

    >>> a.view()
    array([5, 4, 2])

* When a function appears after a ``.``, that function is automatically applied to the object appearing before the ``.``.
* These special functions built in to objects can also take parameters.
* For example, we can change the types of the elements of our array:

    >>> b=a.astype(numpy.float32)
    >>> b.view()
    array([ 5.,  4.,  2.], dtype=float32)

.. admonition:: Activity

   Create an array ``a = numpy.array([1,2,6,7,5,4,3])``. Figure out how to find/do the following things with attributes of ``a``:

   1. "view" ``a``
   2. sort ``a``
   3. find the maximum value in ``a``
   4. find the minimum value in ``a``
   5. add up (sum) all the values in ``a``
   6. find the average (mean) of the values in ``a``

Making arrays bigger
^^^^^^^^^^^^^^^^^^^^^

* With lists, we could always append items to make them bigger (``+``):

    >>> [1,2,3] + [4]
    [1, 2, 3, 4]

* Arrays are meant to have *fixed* size.
* Why do you think this is?
* If you really, really, want to make an array bigger... you can't.
* You *can*, however, make a *new* array that is bigger using ``numpy.append()``:

    >>> a = numpy.array([1,2,3,4])
    >>> a.view()
    array([1, 2, 3, 4])
    >>> b = numpy.append(a,5)
    >>> a.view()
    array([1, 2, 3, 4])
    >>> b.view()
    array([1, 2, 3, 4, 5])

* **Note** carefully that ``numpy.append()`` did *not* change *a*. It created a **new** array, *b*.

.. admonition:: Activity

   Create an array of 4 integers. Create a new, bigger, array by appending the integer ``7`` on to your array. Create another new array by appending the string ``'walrus'``.
   Did that last one work? What happened?

Flexibility vs Power
^^^^^^^^^^^^^^^^^^^^^

* Arrays are less flexible than lists:

  * We can't change their size
  * They can only store data of a single type

* But... it is this very lack of flexibility that lets us do all sorts of cool stuff like have a ``.sum()`` attribute.

.. admonition:: Activity

   How would you implement ``.sum()`` for a list?

Higher dimensions
^^^^^^^^^^^^^^^^^^

* NumPy arrays generalize to higher dimensions.
* Let's create a 2D array:

    >>> a=numpy.array([[1,2,3],[4,5,6],[7,8,9]])
    >>> a.view()
    array([[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]])

* Note the format in our call to ``numpy.array``: a list of lists.
* Each row of the array gets its own list.
* As long as two 2D arrays have the same *shape*, you can do arithmetic on them, just like 1D arrays.
* How do we check the *shape* of an array?

    >>> a.shape
    (3, 3)

.. admonition:: Activity

   Create a 4x4 array. Verify that it has ``shape`` ``(4,4)``.

   You've changed your mind. The array should actually be 2x8. ``reshape`` your 4x4 array into a 2x8 array without recreating it from scratch. Verify that the reshaped array is ``(2,8)``.

   Finally, ``flatten`` your 2D array into a 1D array.

Starting points
^^^^^^^^^^^^^^^^

* Sometimes you want an array of shape ``(n,m)`` that contains all zeros:

    >>> a=numpy.zeros([n,m])

* Guess what ``numpy.ones()`` does?
* How about ``numpy.eye()``?

Slicing
^^^^^^^^^^

* We've already seen that you can index arrays like lists (and strings).
* Likewise, you can use Python's powerful *slicing* on arrays.

.. admonition:: Activity

   Create an array ``arr = numpy.array([0,1,2,3,4,5,6,7])``. Using a single command:

   1. Print the first 3 elements
   2. Print the last 3 elements
   3. Print the even elements of ``arr``

* Slicing works for higher dimensional arrays, too.
  For example:

    >>> a=numpy.arange(25).reshape(5,5)
    >>> a.view()
    array([[ 0,  1,  2,  3,  4],
           [ 5,  6,  7,  8,  9],
           [10, 11, 12, 13, 14],
           [15, 16, 17, 18, 19],
           [20, 21, 22, 23, 24]])
    >>> print a[0:2,1:4]
    [[1 2 3]
     [6 7 8]]

* Note the use of ``numpy.arange``, which works like ``range`` but returns an array.
* If you want a whole column/row/etc, you can use a plain ``:`` as the index. For example, if I wanted to pull out every row of the first two columns:

    >>> print a[:,0:2]
    [[ 0  1]
     [ 5  6]
     [10 11]
     [15 16]
     [20 21]]

.. admonition:: Activity

   Modify the previous command to print all of the columns of the first two *rows*.

For loops
^^^^^^^^^^

* If ``for`` loops work for lists, do you think they'll work for arrays?

.. admonition:: Activity

   Write a function ``printeach(arr)`` that uses a ``for`` loop to print each element of an array that is passed in as a parameter. Test it on a 1D array.

   Now try a 2D array. If you're feeling bold, how about a 3D array?

NumPy Matrices
^^^^^^^^^^^^^^^^

* NumPy makes a distinction between a 2D array and a matrix.
* Every matrix is definitely a 2D array, but not every 2D array is a matrix.

    >>> a = numpy.eye(4)
    >>> b=numpy.array([1,2,3,4])
    >>> b*a
    array([[ 1.,  0.,  0.,  0.],
           [ 0.,  2.,  0.,  0.],
           [ 0.,  0.,  3.,  0.],
           [ 0.,  0.,  0.,  4.]])

* That... wasn't what we expected. On arrays, ``*`` multiplies element by element (each row of ``a`` gets multiplied by ``b``). It's a perfectly reasonable interpretation of our request, but here we really wanted matrix-vector multiplication.
* If we want Python to treat the NumPy arrays as matrices and vectors, we have to explicitly say so. Like this:

    >>> numpy.dot(a,b)
    array([ 1.,  2.,  3.,  4.])

* Another option is to convert ``a`` to the ``matrix`` type:

    >>> a = numpy.matrix(a)
    >>> b*a
    matrix([[ 1.,  2.,  3.,  4.]])

* The preferred option is the first one. Keep everything as an array and use ``numpy.dot`` when you want matrix/vector multiplication.
* Moral of the story: if you want your 2D array to behave like a matrix when you do things like multiplication... you need to tell Python that it's a matrix.

.. admonition:: Activity

   Write a function ``chain(matrix,n)`` that will take an input matrix, and return the result of multiplying it by itself ``n`` times. Test it out on a square matrix with every entry between ``0`` and ``1`` and the entries in each row summing exactly to ``1``. Like this:

       >>> a=numpy.array([[0.2,0.8,0.0],[0.1,0.3,0.6],[0.0,0.8,0.2]])

   This is an example of something called a `Right Stochastic Matrix `_

.. note::

   That last activity took you most of the way to implementing a `Markov Chain `_ . If your interests run towards state-based probabilistic simulation... you'll want to follow up on this.

NumPy Linear Algebra
^^^^^^^^^^^^^^^^^^^^^

* As you might expect, NumPy already has a whoooole bunch of linear algebra routines built right in:

    >>> a=numpy.random.rand(5,5)   # Create a random matrix
    >>> matshow(a)                 # Visualize it (matshow comes from matplotlib/pylab)
    >>> numpy.linalg.eigvals(a)    # Find its eigenvalues
    >>> numpy.linalg.det(a)        # Find its determinant
    >>> numpy.linalg.inv(a)        # Find its inverse
    >>> b=numpy.random.rand(5)     # Create a random vector
    >>> numpy.linalg.solve(a,b)    # Solve the system of linear equations ax = b

* And much, *much*, more.
* Bottom line: If there's linear algebra you need to do, NumPy/SciPy *almost certainly* already have built-in routines to do it. Google and the NumPy docs are your friends!

Going further with NumPy
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* We've only just scratched the surface of what NumPy can do.
* If you foresee yourself using NumPy frequently, read the assigned reading carefully.
* Even that is just the bare-bones basics.
* You'll eventually want to dig in to `The NumPy docs `_. Don't try to read those "cover-to-cover" though. The best way to use the docs is to look up stuff that you actually need and just read about that. Eventually you'll get a feel for the capabilities of the package.

For next class
^^^^^^^^^^^^^^^

* If you think you'll use NumPy a lot (and you probably will), read the `NumPy Tutorial `_.
* Just skim.
  Reading it in detail at this point might give you a headache.
* `Read parts 1 through 4 of chapter 11 `_
* `Read a little bit of chapter 12 `_
* `Read chapter 13 of the text to get an idea of wtf objects are `_
  Don't panic. Just get an idea of wtf they are.