Jonderwu 2020-02-10
This is an old article from my notebook. I have found it very useful and often look back at it for sample code for my work, so I have revised it and am posting it here.
Many matplotlib tutorials and articles skip this topic, and I don't know why.
Introduction
Labelling is an eternal topic in data analysis. Even after visualizing data with matplotlib or another library, people still want to see exactly what the numbers are: not only how long one bar is compared to another, but also how long each bar actually is.
Another benefit of labelling is that once the numbers are on the plot, we no longer need the y-axis. We can hide it and make the plot cleaner.
The bad news is that, at the time of writing, matplotlib has no built-in function or argument that does this for us. When I studied this topic there was not much useful information on the internet, and some of it was simply wrong. Today I am taking the chance to summarize what I have used and learned about labelling over time, including both numeric and string labels.
In this article we will talk about:
1. How to label a vertical bar chart.
2. How to label a horizontal bar chart.
3. How to label a stacked bar chart.
4. How to label a line plot.
5. How to label a pie chart.
We will import these libraries in the article:
import pandas as pd
import matplotlib.pyplot as plt
How to label a vertical bar chart
First, we make up some example data.
# fake some data
companies = {
    'Company': ['Intel', 'IBM', 'Nvidia'],
    'Revenue': [12000, 11100, 12200]
}
companies = pd.DataFrame.from_dict(companies)
companies
I usually use the object-oriented style, that is, creating a figure and axes explicitly. Some people prefer the function style, which uses "plt." a lot, but the idea behind the process is the same.
# set up the figure and axes
fig = plt.figure(figsize=(10, 5))
ax1 = fig.add_subplot(1, 1, 1)

# draw
ax1.bar(companies['Company'], companies['Revenue'], width=0.5,
        color=['lightblue', 'grey', 'salmon'])
Without any labels or customization, this is what we get. Now let's add the labels.
# set up the figure and axes
fig = plt.figure(figsize=(10, 5))
ax1 = fig.add_subplot(1, 1, 1)

# draw; keep the returned container so we can loop over the bars
rects = ax1.bar(companies['Company'], companies['Revenue'], width=0.5,
                color=['lightblue', 'grey', 'salmon'])

# labelling: one annotation per bar, centred above its top edge
for rect in rects:
    height = rect.get_height()
    ax1.annotate('{}'.format(height),
                 xy=(rect.get_x() + rect.get_width() / 2, height),
                 xytext=(0, 3),  # 3 points vertical offset
                 textcoords="offset points",
                 ha='center', va='bottom')

# customize
ax1.set_ylim([0, 14000])
This method relies on the fact that ax.bar() returns a container of rectangle objects, one per bar. Each rectangle comes with several handy methods, such as .get_height(), .get_x() and .get_width(), which we can use to position the labels.
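The loop above can be wrapped in a small reusable function. This is only a sketch: the name label_vertical_bars and its fmt and offset parameters are made up for illustration and are not part of matplotlib. The "Agg" backend line is there so the snippet runs without a display; drop it in a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line in a notebook
import matplotlib.pyplot as plt


def label_vertical_bars(ax, rects, fmt="{:.0f}", offset=3):
    """Hypothetical helper: write each bar's height just above its top edge."""
    texts = []
    for rect in rects:
        height = rect.get_height()
        # horizontal centre of the bar, in data coordinates
        x_center = rect.get_x() + rect.get_width() / 2
        texts.append(
            ax.annotate(fmt.format(height),
                        xy=(x_center, height),
                        xytext=(0, offset),  # shift upwards, in points
                        textcoords="offset points",
                        ha="center", va="bottom")
        )
    return texts


fig, ax = plt.subplots(figsize=(10, 5))
rects = ax.bar(["Intel", "IBM", "Nvidia"], [12000, 11100, 12200], width=0.5)
labels = label_vertical_bars(ax, rects)
label_texts = [t.get_text() for t in labels]
```

Returning the annotation objects makes it possible to restyle them later, for example with t.set_fontsize(12).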
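One quick note: some examples on the internet tell you to use ax1.text(). That function works, but it places the label directly at the coordinates you give it and has no equivalent of the xytext / textcoords offset arguments of ax.annotate(), so I generally don't use it.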
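To make the difference concrete, here is a small comparison on made-up data (the bar values and the 20-unit gap are arbitrary assumptions). With ax.text() the gap between bar and label has to be expressed in data units, so it depends on the scale of the y-axis; with ax.annotate() the gap is given in points and looks the same at any scale.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line in a notebook
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
rects = ax.bar(["A", "B"], [10, 1000])

rect = rects[1]
x_center = rect.get_x() + rect.get_width() / 2
height = rect.get_height()

# ax.text: the label sits at a data coordinate, so the gap is in data units
gap_in_data_units = 20  # arbitrary; would need retuning if the y scale changes
plain = ax.text(x_center, height + gap_in_data_units, str(height),
                ha="center", va="bottom")

# ax.annotate: anchor at the bar top, then shift by 3 points on screen
ann = ax.annotate(str(height),
                  xy=(x_center, height),
                  xytext=(0, 3),
                  textcoords="offset points",
                  ha="center", va="bottom")
```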
Adding labels may leave the y-axis too short to fit them. We often use ax.set_ylim([]) to extend the y-axis.
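Instead of hard-coding the upper limit, it can be derived from the data. The sketch below assumes that 15% of extra headroom is enough for one line of label text; that fraction is a guess and may need adjusting for larger fonts or smaller figures.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line in a notebook
import matplotlib.pyplot as plt

revenue = [12000, 11100, 12200]

fig, ax = plt.subplots(figsize=(10, 5))
ax.bar(["Intel", "IBM", "Nvidia"], revenue, width=0.5)

headroom = 0.15  # assumed fraction of the tallest bar left free for labels
y_top = max(revenue) * (1 + headroom)
ax.set_ylim(0, y_top)
```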
Also, once the exact numbers are on the plot, we can use ax1.get_yaxis().set_visible(False) to hide the y-axis, which makes the plot cleaner.
# set up the figure and axes
fig = plt.figure(figsize=(10, 5))
ax1 = fig.add_subplot(1, 1, 1)

# draw
rects = ax1.bar(companies['Company'], companies['Revenue'], width=0.5,
                color=['lightblue', 'grey', 'salmon'])

# labelling: show revenue in units of 10,000 with one decimal place
for rect in rects:
    height = rect.get_height()
    ax1.annotate('{:.1f}'.format(height / 10000),
                 xy=(rect.get_x() + rect.get_width() / 2, height),
                 xytext=(0, 3),  # 3 points vertical offset
                 textcoords="offset points",
                 ha='center', va='bottom', fontsize=12)

# customize
ax1.set_ylim([0, 14000])
ax1.set_title('Picture: Companies Revenue', fontsize=14)
ax1.get_yaxis().set_visible(False)
# fix the tick positions before renaming them, to avoid a matplotlib warning
ax1.set_xticks(range(len(companies)))
ax1.set_xticklabels(['Intel', 'IBM', 'Nvidia'], fontsize=12)
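For readers on matplotlib 3.4 or later, the library now ships Axes.bar_label, which was not available when this article was first written. The sketch below assumes such a version is installed and reproduces the basic labelled chart with it; the manual loop above is still useful when you need full control over each annotation.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line in a notebook
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(10, 5))
rects = ax.bar(["Intel", "IBM", "Nvidia"], [12000, 11100, 12200], width=0.5,
               color=["lightblue", "grey", "salmon"])

# requires matplotlib >= 3.4; padding is the gap above each bar, in points
labels = ax.bar_label(rects, fmt="%d", padding=3)

ax.set_ylim(0, 14000)
ax.get_yaxis().set_visible(False)
bar_label_texts = [t.get_text() for t in labels]
```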
How to label a horizontal bar chart
We might think that labelling a horizontal bar chart is the same as labelling a vertical one. Yes and no.
There are certain differences we need to know. Without this knowledge, we may waste a lot of time or get lost.