**matplotlib** is a python 2D plotting library which produces publication quality figures in a variety of hardcopy formats and interactive environments across platforms. **matplotlib** can be used in python scripts, the python and ipython shell, web application servers, and some graphical user interface toolkits. matplotlib makes easy things easy and hard things possible! In the context of the **Super Computing Challenge** matplotlib can really make your final project report look professional! matplotlib provides a MATLAB-like plotting framework. You can check out more about matplotlib here: [https://matplotlib.org/3.3.1/index.html](https://matplotlib.org/3.3.1/index.html).

There are a few things you will need to install in order to use the **matplotlib** module. I have also made a few assumptions about your coding level in order to introduce this topic. First, I expect that you are at least familiar with basic **python** syntax and could easily write a `hello world` program in python and run it. I also expect that you understand how to **import** different modules into your python programs. You might even have some experience drawing with the **turtle** module. Ultimately, you don't need too much experience with python in order to use the **matplotlib** module ~ this is the glory of python! A little bit of perseverance will go a long way (:

Another assumption I am making is that you have access to a **Linux** environment. You can still run python on any OS, but the basic installation instructions will be different for a Windows or Mac computer. Instructions you see below are meant to be run in a Linux environment.

So, back to that part about what you will need to install to generate awesome figures for your final report and really wow those judges. First, you will want to make sure you have **python3** installed on your computer.

`python3 --version`

If this command errors on you, you will need to install python.

`sudo apt install python3`

Once you have python3 installed, you will want to install **pip**, the package manager used by python.

`sudo apt install python3-pip`

With pip installed, you are now ready to grab **matplotlib** and get started making static, animated, and interactive visualizations in python!

`python3 -m pip install matplotlib`
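
A quick way to confirm that the install worked is to import the module from python and print its version. This is only a sanity check, and the version you see depends on what pip installed on your machine.

```python
# Sanity check: confirm that matplotlib can be imported after installation.
import matplotlib

version = matplotlib.__version__
print("matplotlib version:", version)
```

If the import fails with a `ModuleNotFoundError`, the install step above did not complete for the python3 you are running.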





Let's take a look at a very simple figure, a line graph. It's going to be a little boring at first, but don't you worry, we will get into some pretty interesting looking figures by the end of this presentation. As someone famous once said: 
> *A journey of a thousand miles begins with a single step.*

So, without further ado, let's take our first **step** into our journey with matplotlib.

In [None]:
import matplotlib.pyplot as plt 
plt.plot([1,2,3,4])
plt.show()

Pretty boring ... but hey, it's just our first step, right?

Let's jump in and take a look at what these two functions actually did. The first one, `plt.plot([1,2,3,4])`, is the function that supplies the data we want displayed on the graph, but notice we only really gave it **part** of that data. So which part of the data did we provide? The **y-axis** data for four **(x,y)** points that could be connected by a line. The position (index) of each value within the brackets ([]) is what provides matplotlib with the **x-axis** data, and python starts counting positions at 0. So the points that we provided to the figure were: (0,1), (1,2), (2,3), (3,4).

The `plt.show()` function displays the figure built from the data we provided in the earlier function calls.
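
If you want to check those points for yourself, one option is to ask the line object that `plt.plot()` returns for the data it is holding. The snippet below is a small sketch of that idea; the variable names are my own, and it closes the figure instead of showing it.

```python
import matplotlib.pyplot as plt

y_values = [1, 2, 3, 4]

# plt.plot() returns a list of Line2D objects, one per plotted line.
(line,) = plt.plot(y_values)

# Pair each automatically generated x value with the y value we supplied.
points = [(float(x), float(y))
          for x, y in zip(line.get_xdata(), line.get_ydata())]
print(points)  # [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0), (3.0, 4.0)]

plt.close()  # discard the figure; nothing is displayed here
```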

Now that we can at least generate a figure, let's begin to make our simple figure a little more complex.

In [None]:
import matplotlib.pyplot as plt 
plt.plot([1,2,3,4])
plt.ylabel('A label on the y-axis')
plt.xlabel('Another label for the x-axis')
plt.show()

I bet your maths teacher wishes **all** your figures looked this good! 

We can now label our axes, and provide some y-axis data when generating our figures. But what if we want to provide both x-axis and y-axis data?

In [None]:
import matplotlib.pyplot as plt 
plt.plot([1,2,3,4], [1,4,9,16])
plt.ylabel('Squared values')
plt.xlabel('Values to be squared')
plt.show()

Up to this point we have been using the basic blue line to generate our figures. But **matplotlib** has sooooo much more to offer. Let's take a look at how we might alter the way our **data appears** on the figure.

In [None]:
import matplotlib.pyplot as plt 
plt.plot([1,2,3,4], [1,4,9,16], 'r+')
plt.ylabel('Squared values')
plt.xlabel('Values to be squared')
plt.show()

The `r+` we used in the `plt.plot()` function displayed our data as **red +'s** instead of the default blue line. That is just one of many options available within matplotlib for visualizing data. Some other colors you can use within matplotlib include the following:

|character| color  |
|:---     |    ---:|
|'b'      | blue   |
|'g'      | green  |
|'r'      | red    |
|'c'      | cyan   |
|'m'      | magenta|
|'y'      | yellow |
|'k'      | black  |
|'w'      | white  |

Some of the options for line or marker styles include (but are not limited to) the following:

|character | description          |
| :---     |                -----:|
|'-'       | solid line style     |
|'--'      | dashed line style    |
|'-.'      | dash-dot line style  |
|':'       | dotted line style    |
|'.'       | point marker         |
|','       | pixel marker         |
|'o'       | circle marker        |
|'v'       | triangle-down marker |
|'^'       | triangle-up marker   |
|'<'       | triangle-left marker |
|'>'       | triangle-right marker|
|'1'       |tri-down marker       |

We can now begin to alter the way our data appears in our figures by using some of the options above. We can use these options to differentiate between two data sets displayed in a single figure.
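
A format string such as `'bo-'` is simply a color character, a marker character, and a line style written next to each other. To make that concrete, here is a simplified sketch that pulls a format string apart. The `split_format` helper is hypothetical and is not how matplotlib parses these strings internally; it only covers the characters listed in the tables above plus the `+` marker used earlier.

```python
# A simplified, hypothetical helper (not matplotlib's own parser) that splits
# a format string such as 'bo-' into its color, marker, and line style parts.
COLORS = "bgrcmykw"
LINE_STYLES = ["--", "-.", "-", ":"]  # longer codes first so '--' wins over '-'


def split_format(fmt):
    line_style = ""
    for style in LINE_STYLES:
        if style in fmt:
            line_style = style
            fmt = fmt.replace(style, "", 1)
            break
    color = ""
    for ch in fmt:
        if ch in COLORS:
            color = ch
            fmt = fmt.replace(ch, "", 1)
            break
    marker = fmt  # whatever is left is treated as the marker code
    return color, marker, line_style


print(split_format("r+"))   # ('r', '+', '')
print(split_format("bo-"))  # ('b', 'o', '-')
print(split_format("g--"))  # ('g', '', '--')
```

Notice that `'r+'` has no line style at all, which is why that plot showed markers without a connecting line.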

In [None]:
import matplotlib.pyplot as plt 
plt.plot([1,2,3,4], [1,4,9,16], 'r+')
plt.plot([1,2,3,4], [1,2,3,4], 'bo-')
plt.show()

I can also generate the above figure by combining the `plt.plot()` functions into a single function call.

In [None]:
import matplotlib.pyplot as plt 
plt.plot([1,2,3,4], [1,4,9,16], 'r+', [1,2,3,4], [1,2,3,4], 'bo-')
plt.show()

Aside from being able to change the color of our data and the way that data appears (line, dotted-line, marker, etc) we can also change a few other things about the appearance of the data in our figure. We can change the **label**, **linewidth**, and **markersize** in our figures.

Line properties we can change include (but are not limited to):
- **label**: set the label within the legend
- **linewidth**: set the linewidth in points
- **markersize**: set the marker size in points

If I use the **label** property, I also need to tell matplotlib to include a legend for my figure by using the `plt.legend()` function.

In [None]:
import matplotlib.pyplot as plt 
plt.plot([1,2,3,4], [1,4,9,16], 'r+', label='line 1', markersize=10)
plt.plot([1,2,3,4], [1,2,3,3], 'bo-', label='line 2', linewidth=5)
plt.legend()
plt.show()


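Instead of putting our two data sets onto the same figure, we can also generate two different plots, or **subplots**, to hold the data. The three digits passed to `plt.subplot()` are the number of rows, the number of columns, and the index of the plot we want to draw on next, so `211` means 2 rows, 1 column, first plot.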

In [None]:
import matplotlib.pyplot as plt 
plt.subplot(211)
plt.plot([1,2,3,4], [1,4,9,16], 'r+', markersize=10)
plt.subplot(212)
plt.plot([1,2,3,4], [1,2,3,4], 'bo-', linewidth=5)
plt.show()
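
The same two stacked plots can also be built with `plt.subplots()`, which creates the whole grid in one call and hands back one axes object per plot. This is a sketch of that alternative rather than a replacement for the cell above; the names `ax_top` and `ax_bottom` are my own.

```python
import matplotlib.pyplot as plt

# Create a grid of 2 rows and 1 column; subplots() returns the figure
# and an array holding one Axes object per cell of the grid.
fig, (ax_top, ax_bottom) = plt.subplots(2, 1)

ax_top.plot([1, 2, 3, 4], [1, 4, 9, 16], 'r+', markersize=10)
ax_bottom.plot([1, 2, 3, 4], [1, 2, 3, 4], 'bo-', linewidth=5)

num_axes = len(fig.axes)
print(num_axes)  # 2
# In a script or notebook, finish with plt.show() to display the figure.
```

Calling methods on `ax_top` and `ax_bottom` makes it explicit which plot each command applies to, which helps once a figure has more than a couple of subplots.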

So far, we have had some fun with line charts and we have a basic idea of how we can utilize the different functions of matplotlib to alter the way our data appears within the line chart. Let's take a look at a different type of chart, the **bar chart**.

To create a bar chart, we will use the `plt.bar()` function. By default the **width** of a bar is 0.8. 

In [None]:
import matplotlib.pyplot as plt 
plt.bar(list(range(1, 4+1)), [1,4,9,16], align='center', width=0.5)
plt.show()

Notice that the **center** of each bar sits over its tick on the x-axis. This is what the `align='center'` argument does. 

Generally, when we make a bar chart, the labels on the x-axis, or **xticks**, should be meaningful. Right now, in the chart above, those numbers really don't have any relevance to our data. So let's take a look at how to alter those xticks so that the labels are meaningful.

In [None]:
import matplotlib.pyplot as plt 
lst = list(range(1, 4+1))
plt.bar(lst, [1,4,9,16], align='center', width=0.5)
glst = ['g' + str(i) for i in lst]
plt.xticks(lst, glst)
plt.show()

Just like we can have two charts in a single line graph figure, we can place two charts into a single bar graph figure. 

In [None]:
import matplotlib.pyplot as plt 
lst = list(range(1, 4+1))
plt.bar(lst, [1,4,9,16], align='center', width=0.5)
plt.bar(lst, [1,2,3,4], color='r', align='center', width=0.5)
plt.show()

The graph above places one bar graph on top of the other. But what if we wanted to set the second bar graph **beside** the first? We can do this by offsetting the midpoint of each bar in our second bar graph. Let's take a look at the code to do this.

In [None]:
import matplotlib.pyplot as plt 
lst = list(range(1, 4+1))
plt.bar(lst, [1,4,9,16], align='center', width=0.5)
plt.bar([i+0.5 for i in lst], [1,2,3,4], color='r', align='center', width=0.5)
plt.show()

Let's add **xticks** to the above graph so that we can provide proper labels for the double bar graph. 

In [None]:
import matplotlib.pyplot as plt 
lst = list(range(1, 4+1))
glst = ['g' + str(i) for i in lst]
plt.bar(lst, [1,4,9,16], align='center',width=0.5)
plt.bar([i+0.5 for i in lst], [1,2,3,4], color='r', align='center', width=0.5)
plt.xticks([i+0.5/2 for i in lst], glst)
plt.show()

What if we wanted to add a little spacing between these double bars? How would that alter the code above?

In [None]:
import matplotlib.pyplot as plt 
w = 0.35
lst = list(range(1, 4+1))
glst = ['g' + str(i) for i in lst]
plt.bar(lst, [1,4,9,16], align='center', width=w)
plt.bar([i+w for i in lst], [1,2,3,4], color='r', align='center', width=w)
plt.xticks([i+w/2 for i in lst], glst)
plt.show()
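
To see why the narrower width opens up a gap, it helps to work out where the bars land on the x-axis. The sketch below does that arithmetic in plain python; the helper names are hypothetical and nothing here calls matplotlib.

```python
# Work out where grouped bars land on the x-axis for a given bar width.
def group_layout(centers, width):
    layout = []
    for c in centers:
        left_edge = c - width / 2             # left edge of the first bar
        right_edge = c + width + width / 2    # right edge of the second bar
        tick = c + width / 2                  # midpoint of the pair
        layout.append((left_edge, tick, right_edge))
    return layout


def gap_between_groups(centers, width):
    layout = group_layout(centers, width)
    # distance from the right edge of one pair to the left edge of the next
    return round(layout[1][0] - layout[0][2], 10)


centers = [1, 2, 3, 4]
print(gap_between_groups(centers, 0.5))   # 0.0 -> pairs touch
print(gap_between_groups(centers, 0.35))  # 0.3 -> visible gap
```

Because the group centers are one unit apart, two bars of width `w` leave `1 - 2*w` of empty space between groups, so any width below 0.5 produces a gap.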

Another pretty common type of chart you may want to try out is the **scatter plot**. The `plt.scatter()` function makes a scatter plot of *x* vs. *y*, where *x* and *y* are sequence-like objects of the same length.

For example, let's generate a scatter plot of 10 points. To make this a little more interesting, let's use the `random` module available in python. The random module simply allows us to generate random values.
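
Because the values are random, the scatter plots in this section will look different every time the cells are run. If you need the same figure on every run, for example in a final report, one option is to seed the generator first. The sketch below assumes nothing beyond the standard `random` module; the helper name and the seed value are arbitrary choices of mine.

```python
import random


def random_heights(n, seed):
    random.seed(seed)  # same seed -> same sequence of "random" values
    return [random.randint(1, 10) for _ in range(n)]


first = random_heights(10, seed=2020)
second = random_heights(10, seed=2020)
print(first == second)  # True
```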

In [None]:
import matplotlib.pyplot as plt 
import random 

x = [i for i in range(1, 10+1)]
y = [random.randint(1,10) for i in range(1, 10+1)]

plt.scatter(x,y)
plt.show()

The dots in the chart above are fairly small ... let's have some fun with the color and size of these dots. Much like we can change the width of lines and the size of markers in line charts, we can alter the appearance of **points** in our scatter plots.

In [None]:
import matplotlib.pyplot as plt 
import random

x = [i for i in range(1, 10+1)]
y = [random.randint(1, 10) for i in range(1, 10+1)]

plt.scatter(x, y, s=200)
plt.show()

Now that we have some decently sized dots, let's try to change their color.

In [None]:
import matplotlib.pyplot as plt 
import random

x = [i for i in range(1, 10+1)]
y = [random.randint(1, 10) for i in range(1, 10+1)]

plt.scatter(x, y, s=200, c='red')
plt.show()

Sometimes you will want to use **color** to show something **about** your data. For example, a heat map where the color of a marker relays information about how hot or cold a specific point is. We can perform this type of data visualization within our scatter charts by changing the **color** of the marker based on the location of that marker on the y axis. 

In [None]:
import matplotlib.pyplot as plt 
import random 

x = [i for i in range(1, 10+1)]
y = [random.randint(1,10) for i in range(1, 10+1)]

plt.scatter(x, y, s=200, c=y)
plt.show()

Notice that the colors run from dark purple when y is small, through blue, blue-green, and green, to yellow as y increases. This color scale is matplotlib's default colormap, called viridis, which converts each y value into a color stored in the basic **RGBA** format (red green blue alpha). 
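
The mapping from a y value to a color can be inspected directly. As a rough sketch: the values passed to `c` are rescaled to the range 0 to 1 (by default between the smallest and largest value supplied), and the colormap then turns that fraction into four numbers. The limits of 1 and 10 below are an assumption that matches the range of `random.randint(1, 10)`.

```python
import matplotlib.pyplot as plt

cmap = plt.get_cmap("viridis")  # the colormap scatter() uses by default

y_min, y_max = 1, 10            # assumed limits, matching randint(1, 10)


def color_for(y):
    fraction = (y - y_min) / (y_max - y_min)  # rescale y to the range 0..1
    return cmap(fraction)                     # RGBA tuple, each part in 0..1


low = color_for(1)    # smallest value: dark purple
high = color_for(10)  # largest value: yellow
print(low)
print(high)
```

Each result holds four numbers: the red, green, and blue amounts followed by the alpha value.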

The RGB color model is an additive color model in which red, green, and blue light are added together in various ways to reproduce a broad array of colors. The name of the model comes from the initials of the three additive primary colors, red, green, and blue. The **alpha** is a color component that represents the degree of transparency (or **opacity**) of a color. In other words, the alpha component determines how **see-through** our color is going to be. An **alpha of 1** generates an **opaque** color, while an **alpha of 0** means that a color is completely **transparent** or clear.
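
One way to build intuition for alpha is to blend a color with whatever is behind it by hand. The function below is a simplified model of that mixing, written in plain python; it is a sketch of the idea and not the code matplotlib runs when it draws.

```python
def blend(foreground, background, alpha):
    """Mix two RGB colors; alpha is the opacity of the foreground."""
    return tuple(alpha * f + (1 - alpha) * b
                 for f, b in zip(foreground, background))


red = (1.0, 0.0, 0.0)
white = (1.0, 1.0, 1.0)

print(blend(red, white, 1.0))  # (1.0, 0.0, 0.0) -> opaque red
print(blend(red, white, 0.5))  # (1.0, 0.5, 0.5) -> pink
print(blend(red, white, 0.0))  # (1.0, 1.0, 1.0) -> background only
```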

Armed with this knowledge of color, let's alter the above figure to have more transparent markers (not completely transparent, mind you) so that we can see some blending when our markers meet. We will also enlarge our markers significantly so that they are much more likely to overlap and we can see this blending occurring.

In [None]:
import matplotlib.pyplot as plt 
import random 

x = [i for i in range(1, 10+1)]
y = [random.randint(1,10) for i in range(1, 10+1)]

plt.scatter(x, y, s=10000, c=y, alpha=0.5)
plt.show()

This time, let's alter the size of our markers based on their location on the y-axis.

In [None]:
import matplotlib.pyplot as plt 
import random 

x = [i for i in range(1, 10+1)]
y = [random.randint(1, 10) for i in range(1, 10+1)]

plt.scatter(x, y, s=[i*500 for i in y], c=y, alpha=0.5)
plt.show()

Sometimes you have a piece of data that you want to specifically mark in your figure. We can do this in matplotlib by utilizing the **annotate** function, which creates a text string referring to a specific data point.

The annotate function looks like this:
`plt.annotate(text, xy)`

where **text** is a string that holds the text you want to use for your annotation (releases before matplotlib 3.3 named this parameter `s`), and **xy** is the data point in coordinates that you want to annotate. Passing the string as the first positional argument works with either name.

In [None]:
import matplotlib.pyplot as plt 
import random 

x = [i for i in range(1, 10+1)]
y = [random.randint(1,10) for i in range(1, 10+1)]

plt.scatter(x, y, s=200, c=y)
plt.annotate('annotation text', xy=(x[0],y[0]))
plt.show()

As you can see in the chart above, the *annotation text* that we added gets overrun a little bit by the data point that we wanted to mark. We can use `xytext` to set the location in coordinates at which we want our text to begin. We can also use `arrowprops`, a dictionary of line properties for the arrow that connects the annotation to the data point we want to annotate.

Let's take a look at how we might alter the code to include specific annotation text placement along with defining a type of arrow that will point from the annotation text to the specified data point.

For even more information on annotation points visit the matplotlib tutorials on annotations: [https://matplotlib.org/tutorials/text/annotations.html](https://matplotlib.org/tutorials/text/annotations.html)

In [None]:
import matplotlib.pyplot as plt 
import random 

x = [i for i in range(1, 10+1)]
y = [random.randint(1,10) for i in range(1, 10+1)]

plt.scatter(x, y, s=200, c=y)
plt.annotate('annotation text', xy=(x[0],y[0]), xytext=(2,5), arrowprops={"arrowstyle": "wedge, tail_width=1"})
plt.show()

The figures we have taken a look at represent only the tip of the iceberg. There are so many interesting and publication quality figures you can generate using matplotlib. For example, some of the other plots you can create include (but are not limited to):

- histograms: a graphical representation that organizes a group of data points into user-specified ranges.
- path: allows you to define paths in your visualization using `moveto`, `lineto`, and `curveto` commands to draw simple and compound lines consisting of line segments and splines.
- pie charts: a graphical representation in which a circle is divided into sectors that each represent a portion of the whole.
- table: a graphical representation in which data is organized into columns and rows.
- polar chart: a graphical representation which utilizes a polar coordinate system (an angle and a distance from the center).

In closing, let's take a look at some more complicated and interesting figures that can be developed in matplotlib.

In [None]:
import matplotlib 
import matplotlib.pyplot as plt 
import numpy as np 

# Data for plotting
t = np.arange(0.0, 2.0, 0.01)
s = 1 + np.sin(2 * np.pi * t)

fig, ax = plt.subplots()
ax.plot(t, s)

ax.set(xlabel='time (s)', ylabel='voltage (mV)', title='A simple sine wave')
ax.grid()

plt.show()

In [None]:
import numpy as np 
import matplotlib.cm as cm 
import matplotlib.pyplot as plt 
import matplotlib.cbook as cbook 
from matplotlib.path import Path 
from matplotlib.patches import PathPatch 

# Fixing random state for reproducibility
np.random.seed(19680801)

First we will generate a simple bivariate normal distribution made up of two independent random variables.

In [None]:
delta = 0.025
x = y = np.arange(-3.0, 3.0, delta)
X, Y = np.meshgrid(x, y)
Z1 = np.exp(-X**2 - Y**2)
Z2 = np.exp(-(X-1)**2 - (Y-1)**2)
Z = (Z1 - Z2) * 2

fig, ax = plt.subplots()
im = ax.imshow(Z, interpolation='bilinear', cmap=cm.RdYlGn, origin='lower', extent=[-3, 3, -3, 3], vmax=abs(Z).max(), vmin=-abs(Z).max())
plt.show()

It is also possible to show images of pictures.

In [None]:
# A sample image
with cbook.get_sample_data('ada.png') as image_file:
    image = plt.imread(image_file)

fig, ax = plt.subplots()
ax.imshow(image)
ax.axis('off') # Clear the x-axis and y-axis

# Add another image
w,h = 512, 512

with cbook.get_sample_data('ct.raw.gz') as datafile:
    s = datafile.read()

A = np.frombuffer(s, np.uint16).astype(float).reshape((w,h))
A /= A.max()

fig, ax = plt.subplots()
extent = (0, 25, 0, 25)
im = ax.imshow(A, cmap=plt.cm.hot, origin='upper', extent=extent)

markers = [(15.9, 14.5), (16.8, 15)]
x,y = zip(*markers)
ax.plot(x, y, 'o')

ax.set_title('CT density')

plt.show()

The `pcolormesh()` function can make a colored representation of a two-dimensional array, even if the horizontal dimensions are unevenly spaced. The `contour()` function is another way to represent the same data.
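
One detail worth checking before calling `pcolormesh()` is the length of the coordinate arrays. When `x` and `y` are each one element longer than the data in their direction, as they are in the cell that follows, they describe the **edges** of the colored cells. The sketch below only checks those shapes with numpy; `np.zeros` stands in for the real data.

```python
import numpy as np

Z = np.zeros((6, 10))        # 6 rows and 10 columns of cell values
x = np.arange(-0.5, 10, 1)   # 11 x edges: -0.5, 0.5, ..., 9.5
y = np.arange(4.5, 11, 1)    # 7 y edges: 4.5, 5.5, ..., 10.5

rows, cols = Z.shape
edges_match = (len(x) == cols + 1) and (len(y) == rows + 1)
print(edges_match)  # True
```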

In [None]:
import matplotlib 
import matplotlib.pyplot as plt 
from matplotlib.colors import BoundaryNorm 
from matplotlib.ticker import MaxNLocator 
import numpy as np 

np.random.seed(19680801)
Z = np.random.rand(6, 10)
x = np.arange(-0.5, 10, 1) # len = 11
y = np.arange(4.5, 11, 1) # len = 7

fig, ax = plt.subplots()
ax.pcolormesh(x,y,Z)
plt.show()

The `hist()` function automatically generates histograms and returns the bin counts or probabilities. In addition to the basic histogram, this demo shows a few optional features:

- setting the number of data bins
- the *density* parameter, which normalizes bin heights so that the integral of the histogram is 1. The resulting histogram is an approximation of the probability density function.
- setting the face color of the bars.
- setting the opacity (alpha value).

Selecting different bin counts and sizes can significantly affect the shape of a histogram. The Astropy documentation has a great [section](https://docs.astropy.org/en/stable/visualization/histogram.html) on how to select these parameters.
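
To make the *density* option less mysterious, here is the normalization worked out by hand on a tiny data set: each bin count is divided by the total number of points times the bin width, so the bar areas add up to 1. This is a plain python sketch with made-up data and a hypothetical helper name, not matplotlib's implementation.

```python
def histogram_density(data, edges):
    """Return (counts, densities) for bins defined by consecutive edges."""
    counts = [0] * (len(edges) - 1)
    for value in data:
        for i in range(len(counts)):
            last_bin = (i == len(counts) - 1)
            # every bin is half-open [left, right) except the last, which
            # also includes its right edge
            if edges[i] <= value < edges[i + 1] or (last_bin and value == edges[i + 1]):
                counts[i] += 1
                break
    total = sum(counts)
    densities = [c / (total * (edges[i + 1] - edges[i]))
                 for i, c in enumerate(counts)]
    return counts, densities


data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6]   # made-up sample values
edges = [1, 3, 5, 7]                    # three bins, each 2 units wide

counts, densities = histogram_density(data, edges)
area = sum(d * (edges[i + 1] - edges[i]) for i, d in enumerate(densities))
print(counts)     # [3, 5, 2]
print(densities)  # [0.15, 0.25, 0.1]
```

The bar heights are no longer counts, but the total area under the bars is 1, which is what lets the histogram be compared with a probability density curve.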

In [None]:
import matplotlib 
import numpy as np 
import matplotlib.pyplot as plt 

np.random.seed(19680801)

# Example data
mu = 100    # mean of distribution
sigma = 15  # standard deviation of distribution
x = mu + sigma * np.random.randn(437)

num_bins = 50

fig, ax = plt.subplots()

# The histogram of the data
n, bins, patches = ax.hist(x, num_bins, density=True)

# Add a 'best fit' line
y = ((1/ (np.sqrt(2 * np.pi) * sigma)) * np.exp(-0.5 * (1 / sigma *(bins - mu))**2))
ax.plot(bins, y, '--')
ax.set_xlabel('Smarts')
ax.set_ylabel("Probability density")
ax.set_title(r'Histogram of IQ: $\mu=100$, $\sigma=15$')

# Tweak spacing to prevent clipping of ylabel
fig.tight_layout()
plt.show()

You can add arbitrary paths in matplotlib using the `matplotlib.path` module. The following example shows how to create `Path` and `PathPatch` objects through matplotlib's API.

In [None]:
import matplotlib.path as mpath 
import matplotlib.patches as mpatches 
import matplotlib.pyplot as plt 

fig, ax = plt.subplots()

Path = mpath.Path
path_data = [
    (Path.MOVETO, (1.58, -2.57)),
    (Path.CURVE4, (0.35, -1.1)),
    (Path.CURVE4, (-1.75, 2.0)),
    (Path.CURVE4, (0.375, 2.0)),
    (Path.LINETO, (0.85, 1.15)),
    (Path.CURVE4, (2.2, 3.2)),
    (Path.CURVE4, (3, 0.05)),
    (Path.CURVE4, (2.0, -0.5)),
    (Path.CLOSEPOLY, (1.58, -2.57)),
]

codes, verts = zip(*path_data)
path = mpath.Path(verts, codes)
patch = mpatches.PathPatch(path, facecolor='r', alpha=0.5)
ax.add_patch(patch)

# Plot control points connecting lines
x, y = zip(*path.vertices)
line, = ax.plot(x, y, 'go-')

ax.grid()
ax.axis('equal')
plt.show()

The mplot3d toolkit has support for simple 3D graphs including surface, wireframe, scatter, and bar charts. The following example demonstrates plotting a 3D surface colored with the coolwarm color map. The surface is made opaque by using `antialiased=False`. This example also demonstrates using the `LinearLocator` and custom formatting for the z-axis tick labels.

In [None]:
import matplotlib.pyplot as plt 
from matplotlib import cm 
from matplotlib.ticker import LinearLocator 
import numpy as np 

fig, ax = plt.subplots(subplot_kw={"projection": "3d"})

# Make the data
X = np.arange(-5, 5, 0.25)
Y = np.arange(-5, 5, 0.25)
X, Y = np.meshgrid(X, Y)
R = np.sqrt(X**2 + Y**2)
Z = np.sin(R)

# Plot the surface
surf = ax.plot_surface(X, Y, Z, cmap=cm.coolwarm, linewidth=0, antialiased=False)

# Customize the z axis
ax.set_zlim(-1.01, 1.01)
ax.zaxis.set_major_locator(LinearLocator(10))

# A StrMethodFormatter is used automatically
ax.zaxis.set_major_formatter('{x:.02f}')

# Add a color bar which maps values to colors
fig.colorbar(surf, shrink=0.5, aspect=5)

plt.show()

The following example is a demo of a basic pie chart plus a few additional features. In addition to the basic pie chart, this demo shows a few optional features:

- slice labels
- auto-labeling the percentage
- offsetting a slice with `explode`
- drop-shadow
- custom start angle
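
The percentage labels come from a printf-style format string passed as `autopct`. As a small sketch of what that string produces, the snippet below applies the same format to the slice sizes used in the cell that follows, using plain python.

```python
sizes = [15, 30, 45, 10]
labels = ['Frogs', 'Hogs', 'Dogs', 'Logs']

total = sum(sizes)
# Apply the same printf-style format string that is passed as autopct.
percent_labels = ['%1.1f%%' % (100 * size / total) for size in sizes]

for name, text in zip(labels, percent_labels):
    print(name, text)
```

The `%1.1f` part prints one digit after the decimal point, and `%%` prints a literal percent sign.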

In [None]:
import matplotlib.pyplot as plt 

# Pie chart, where the slices will be ordered and plotted counter-clockwise
labels = 'Frogs', 'Hogs', 'Dogs', 'Logs'
sizes = [15, 30, 45, 10]
explode = (0, 0.1, 0, 0) # Only "explode" the 2nd slice (i.e. 'Hogs')

fig1, ax1 = plt.subplots()
ax1.pie(sizes, explode=explode, labels=labels, autopct='%1.1f%%', shadow=True, startangle=90)
ax1.axis('equal') # Equal aspect ratio ensures that pie is drawn as a circle.

plt.show()

I will leave you with an example of XKCD-style sketch plots just for fun! 

Before I run this code, I wanted to provide you with a little bit of information about me. I am an instructor in the Computer Science & Engineering Department at New Mexico Tech. You can contact me at amy.knowles@nmt.edu. 

Thank you for joining this section on data visualization with matplotlib. Good luck on your Super Computing Challenge projects. Enjoy the journey!

In [None]:
import matplotlib.pyplot as plt 
import numpy as np 

with plt.xkcd():
    # Based on "Stove Ownership" from XKCD by Randall Munroe
    # https://xkcd.com/418/

    fig = plt.figure()
    ax = fig.add_axes((0.1, 0.2, 0.8, 0.7))
    ax.spines['right'].set_color('none')
    ax.spines['top'].set_color('none')
    ax.set_xticks([])
    ax.set_yticks([])
    ax.set_ylim([-30,10])

    data = np.ones(100)
    data[70:] -= np.arange(30)

    ax.annotate(
        'THE DAY I REALIZED\nI COULD COOK BACON\nWHENEVER I WANTED',
        xy=(70, 1), arrowprops=dict(arrowstyle='->'), xytext=(15, -10)
        )
    
    ax.plot(data)

    ax.set_xlabel('time')
    ax.set_ylabel('my overall health')
    fig.text(
        0.5, 0.05,
        '"Stove Ownership" from xkcd by Randall Munroe',
        ha='center'
    )

plt.show()