NA Python recap

Section 4.9 Python recap

Matrices. The code below shows how to define a matrix in NumPy. The example uses a \(2\times2\) matrix, but the same syntax works for any \(n\times m\) matrix.
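A minimal sketch of such a definition follows. The name A and the entries are arbitrary choices made here for illustration; a matrix is passed to np.array as a list of rows.

```python
import numpy as np

# A 2x2 matrix given as a list of rows (entries chosen arbitrarily).
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

print(A)
print(A.shape)   # number of rows and columns, here (2, 2)
print(A[1, 0])   # entry in the second row, first column (indices start at 0)
```

For an \(n\times m\) matrix the list simply contains \(n\) rows with \(m\) entries each.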

Operations with matrices. Matrices can be added, subtracted, multiplied and, when they are square and nonsingular, inverted. They can also be transposed, so that their rows become columns and vice versa. The code below shows examples of these operations. Two points are worth noting:

The operator for matrix multiplication is @ (the * operator multiplies arrays entry by entry instead).
Solving a linear system by explicitly inverting the matrix is not optimal; it is better to use the LU decomposition or other methods.
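The operations and the two remarks above can be sketched as follows. The matrices A, B and the right-hand side b are illustrative values chosen here. np.linalg.solve relies on a LAPACK routine based on an LU factorization with pivoting, so it stands in for "LU decomposition or other methods" in this sketch.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[0.0, 1.0],
              [1.0, 0.0]])
b = np.array([1.0, 2.0])

S = A + B                 # sum
D = A - B                 # difference
P = A @ B                 # matrix product
E = A * B                 # entrywise product, NOT the matrix product
At = A.T                  # transpose
Ainv = np.linalg.inv(A)   # inverse (A must be square and nonsingular)

# Two ways to solve A x = b:
x_inv = Ainv @ b              # via the explicit inverse (not recommended)
x = np.linalg.solve(A, b)     # via an LU-based solver (preferred)

print(P)
print(E)
print(x, x_inv)
```

Both approaches give the same solution here, but the solver avoids forming the inverse, which is cheaper and usually more accurate for larger systems.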

Vectors and covectors. A row vector, technically called a covector, can be stored as a two-dimensional array with a single row, i.e. of shape (1, n). A row vector v can be made into a column vector by transposition, v.T. (A one-dimensional array of shape (n,) is neither a row nor a column, and transposing it has no effect.) Note that a matrix multiplies a column vector placed on its right and a row vector, i.e. a covector, placed on its left, as the code below illustrates:
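A minimal sketch of this behaviour, assuming the row vector is stored as a two-dimensional array of shape (1, 2); the names A, v, w and the values are illustrative.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

v = np.array([[1.0, 1.0]])   # row vector (covector), shape (1, 2)
w = v.T                      # column vector, shape (2, 1)

col = A @ w   # matrix times column vector: a column, shape (2, 1)
row = v @ A   # row vector times matrix: a row, shape (1, 2)
print(col)
print(row)

# Swapping the order makes the inner dimensions disagree.
try:
    A @ v     # (2, 2) @ (1, 2)
except ValueError as err:
    print("A @ v fails:", err)

try:
    w @ A     # (2, 1) @ (2, 2)
except ValueError as err:
    print("w @ A fails:", err)
```

With a one-dimensional array u = np.array([1.0, 1.0]), both A @ u and u @ A are accepted, because NumPy treats u as a column or as a row depending on which side of @ it appears.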

The time library. To find out which implementation is faster, we need a way to measure the time Python spends executing a given set of lines. We can do this with the time library. This library works as a clock: the call time.time() returns the current time in seconds, as a floating-point number. Hence, by storing the time right before and right after a set of lines and taking the difference of the two numbers, we get the time spent on those lines, as shown in the example below.
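A minimal sketch of this before/after pattern; the timed loop is an arbitrary workload chosen here for illustration.

```python
import time

start = time.time()           # time right before the lines of interest

total = 0
for i in range(1_000_000):    # the code being timed
    total += i

elapsed = time.time() - start  # time right after, minus time before
print(f"elapsed: {elapsed:.4f} s")
```

The measured value changes from run to run and from machine to machine. For very short intervals, time.perf_counter() is a higher-resolution alternative that is used in the same way.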

Vectorized calculations in NumPy. Python's for loops are much slower than loops in C/C++. The reason is that Python variables are not declared with a type, so Python must work out at run time which kind of data a variable holds. Inside a for loop this means that, at each iteration, Python has to perform a series of checks, such as determining the type of each variable, looking up its name in the current scope and checking for invalid operations. Over a large number of iterations, this can add up to a big slowdown (we will verify this below!).

When arrays are the main dish of a loop, this bottleneck can be bypassed by using NumPy's "vectorization". The easiest way to understand vectorization is through examples. In the example below we print the running times for the evaluation of the dot product of two large vectors (\(10^4\) components each) implemented in three different ways: first with a naive for loop, then by applying the sum function to the array a1*a2 (whose component \(i\) is equal to a1[i]*a2[i]), and finally by applying the function dot to the arrays a1, a2.
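A sketch of this comparison follows. The random entries, the fixed seed and the use of np.sum for "the sum function" are assumptions made here; the names a1 and a2 are taken from the text.

```python
import time
import numpy as np

n = 10**4
rng = np.random.default_rng(0)   # fixed seed so that runs are repeatable
a1 = rng.random(n)
a2 = rng.random(n)

# 1) Naive for loop over the components.
t0 = time.time()
s_loop = 0.0
for i in range(n):
    s_loop += a1[i] * a2[i]
t_loop = time.time() - t0

# 2) Vectorized entrywise product followed by a sum.
t0 = time.time()
s_sum = np.sum(a1 * a2)
t_sum = time.time() - t0

# 3) NumPy's dot function.
t0 = time.time()
s_dot = np.dot(a1, a2)
t_dot = time.time() - t0

print(f"for loop: {t_loop:.6f} s, result {s_loop:.6f}")
print(f"sum:      {t_sum:.6f} s, result {s_sum:.6f}")
print(f"dot:      {t_dot:.6f} s, result {s_dot:.6f}")
```

The three results agree up to rounding, while the timings depend on the machine. The two vectorized versions are so fast at this size that a single measurement is noisy, so increasing n or repeating the call gives a more reliable comparison.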