
I was looking for a way to annotate my bars in a Pandas bar plot with the rounded numerical values from my DataFrame.

>>> df = pd.DataFrame({'A': np.random.rand(2), 'B': np.random.rand(2)}, index=['value1', 'value2'])
>>> df
                 A         B
  value1  0.440922  0.911800
  value2  0.588242  0.797366

I would like to get something like this:

bar plot annotation example

I tried with this code sample, but the annotations are all centered on the x ticks:

>>> ax = df.plot(kind='bar')
>>> for idx, label in enumerate(df.index):
...     for acc in df.columns:
...         # df.ix has been removed from pandas; df.loc does the label-based lookup
...         value = np.round(df.loc[label, acc], decimals=2)
...         ax.annotate(str(value),
...                     (idx, value),
...                     xytext=(0, 15),
...                     textcoords='offset points')

 Answers

67

You get it directly from the axes' patches:

for p in ax.patches:
    ax.annotate(str(p.get_height()), (p.get_x() * 1.005, p.get_height() * 1.005))

You'll want to tweak the string formatting and the offsets to get things centered, maybe use the width from p.get_width(), but that should get you started. It may not work with stacked bar plots unless you track the offsets somewhere.
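The tweak described above might look like the sketch below. It centers each label with p.get_width() and moves it up by a fixed offset in points instead of scaling by 1.005. The helper name annotate_bars, the format string, and the offset are illustrative assumptions, and it is only meant for non-stacked bars.

```python
import pandas as pd
import matplotlib.pyplot as plt


def annotate_bars(ax, fmt="{:.2f}", offset_pts=3):
    """Label every bar patch on `ax` with its height, centered above the bar.

    `annotate_bars` is a hypothetical helper, not a pandas or matplotlib API.
    """
    labels = []
    for p in ax.patches:
        x_center = p.get_x() + p.get_width() / 2  # horizontal middle of the bar
        label = ax.annotate(
            fmt.format(p.get_height()),
            (x_center, p.get_height()),
            xytext=(0, offset_pts),       # shift the text up by a few points
            textcoords="offset points",
            ha="center", va="bottom",
        )
        labels.append(label)
    return labels


# Fixed values taken from the question's printed DataFrame
df = pd.DataFrame({"A": [0.440922, 0.588242], "B": [0.911800, 0.797366]},
                  index=["value1", "value2"])
ax = df.plot(kind="bar")
labels = annotate_bars(ax)
```

Call plt.show() afterwards to display the figure. For negative bars you would probably want va="top" and a negative offset.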

Tuesday, June 1, 2021
 
SheppardDigital
answered 7 Months ago
85
    import matplotlib.pyplot as plt
    import seaborn as sns

    # Seaborn: catplot (named factorplot, with size= instead of height=, before seaborn 0.9)

    colors = ["windows blue", "orange red", "grey", "amber"]
    myPalette = sns.xkcd_palette(colors)  # pass "colors" to the xkcd_palette function

    sns.set(style="white")  # white background
    g = sns.catplot(x="Stages", y="Accuracy", hue="Dataset", data=df, saturation=5, height=4, aspect=3, kind="bar",
                    palette=myPalette, legend=False)  # suppress the legend

    g.set(ylim=(0, 140))
    g.despine(right=False)
    g.set_xlabels("")
    g.set_ylabels("")
    g.set_yticklabels("")


    # Matplotlib: legend creation

    myLegend = plt.legend(bbox_to_anchor=(0., 1.2, 1., .102), prop={'size': 10}, loc=10, ncol=4,  # left, bottom, width, height
                          title=r'TOTAL ACCURACY AND PER STAGE-RANDOM FOREST')
    myLegend.get_title().set_fontsize('24')


    # Matplotlib: bar annotation

    ax = g.ax  # annotate on the seaborn axis

    def annotateBars(ax=ax):
        for p in ax.patches:
            ax.annotate("%.2f" % p.get_height(), (p.get_x() + p.get_width() / 2., p.get_height()),
                        ha='center', va='center', fontsize=11, color='gray', rotation=90, xytext=(0, 20),
                        textcoords='offset points')  # vertical annotations

    # One pass over the patches is enough; calling this per row with df.apply would draw duplicate labels
    annotateBars()


Saturday, July 3, 2021
 
MoarCodePlz
answered 5 Months ago
38

You need to specify the axes more explicitly. Try it like this:

%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt

fig, tsax = plt.subplots()
barax = tsax.twinx()

data = pd.read_csv('mpg.csv', skipinitialspace=True,index_col='Date')
data['Trip Miles'].plot(kind='bar', ax=barax)
barax.set_ylabel('Miles')
tsax.set_ylabel('Miles/Gallon')
data['MPG'].plot(ax=tsax)
data['MPG-D'].plot(ax=tsax)

Edit

So a big problem here is that pandas bar plots and line plots format the x-axis in fundamentally different ways. Specifically, bar plots attempt to make qualitative scales with ticks and labels for every single bar. But it seems here that you're interested in getting a format more like a typical time series.
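A quick way to see the difference, sketched with made-up data: in a pandas bar plot the bars sit on the positions 0, 1, 2, ... whatever dates are in the index, and the dates only show up as tick label text.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Made-up data for illustration only
idx = pd.date_range("2012-01-01", periods=5, freq="D")
s = pd.Series(np.arange(1.0, 6.0), index=idx)

fig, ax = plt.subplots()
s.plot(kind="bar", ax=ax)

# Bars are centered on integer positions rather than on date values
bar_centers = [p.get_x() + p.get_width() / 2 for p in ax.patches]
# The dates survive only as the text of the tick labels
tick_labels = [t.get_text() for t in ax.get_xticklabels()]
print(bar_centers)
print(tick_labels)
```

A line plot of the same Series uses a date-aware x-axis instead, which is why the two do not line up when drawn on twinned axes.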

So here's where I suggest that you forget about dual axis charts. Instead, just plot on two completely separate axes. Like this:

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.gridspec as mgrid
import pandas as pd

fig = plt.figure(figsize=(12,5))
grid = mgrid.GridSpec(nrows=2, ncols=1, height_ratios=[2, 1])

barax = fig.add_subplot(grid[0])
tsax = fig.add_subplot(grid[1])
# ten monthly timestamps starting 2012-01-01 (month-start frequency)
data = pd.DataFrame(np.random.randn(10,3), columns=list('ABC'), index=pd.date_range(start='2012-01-01', periods=10, freq='MS'))

data['A'] **= 2
data['A'].plot(ax=barax, style='o--')
barax.set_ylabel('Miles')
tsax.set_ylabel('Miles/Gallon')

barax.xaxis.tick_top()

data['B'].plot(ax=tsax)
data['C'].plot(ax=tsax)
fig.tight_layout()

Which gives me: separate axes

However, if you really need bars or you really want everything on the same twin x-axes, then you have to plot with matplotlib's API like this:

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.gridspec as mgrid
import pandas as pd

fig, tsax = plt.subplots(figsize=(12,5))
barax = tsax.twinx()

# ten monthly timestamps starting 2012-01-01 (month-start frequency)
data = pd.DataFrame(np.random.randn(10,3), columns=list('ABC'), index=pd.date_range(start='2012-01-01', periods=10, freq='MS'))
data['A'] **= 2

# the `width` is specified in days -- adjust for your data
barax.bar(data.index, data['A'], width=5, facecolor='indianred')

barax.set_ylabel('Miles')
tsax.set_ylabel('Miles/Gallon')

barax.xaxis.tick_top()

fig.tight_layout()

tsax.plot(data.index, data['B'])
tsax.plot(data.index, data['C'])

Which then gives me

single axes
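Since the bar width above is given in days, one way to "adjust for your data" is to derive it from the spacing of the index. This is only a sketch: bar_width_in_days is a hypothetical helper, and the 0.8 fraction is an arbitrary choice.

```python
import numpy as np
import pandas as pd
import matplotlib.dates as mdates


def bar_width_in_days(index, fraction=0.8):
    """Return a bar width in days, as a fraction of the smallest gap in `index`.

    Hypothetical helper; assumes `index` is a sorted DatetimeIndex with at
    least two entries.
    """
    positions = mdates.date2num(index.to_pydatetime())  # dates as float days
    return fraction * np.diff(positions).min()


index = pd.date_range("2012-01-01", periods=10, freq="MS")
width = bar_width_in_days(index)
print(width)
```

The result can be passed as width= to barax.bar(...) so that neighbouring bars never overlap, even across the shortest month.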

Thursday, July 29, 2021
 
waylaidwanderer
answered 5 Months ago
65

The coordinates you were using in your for loop were wrong. You were also using plt.colorbar instead of something cleaner like fig.colorbar. Try this (it gets the job done, with no effort made to otherwise clean up the code):

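import matplotlib.colors as mcolors
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

def heatmap_binary(df,
            edgecolors='w',
            #cmap=mpl.cm.RdYlGn,
            log=False):    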
    width = len(df.columns)/7*10
    height = len(df.index)/7*10

    fig, ax = plt.subplots(figsize=(20,10))#(figsize=(width,height))

    cmap, norm = mcolors.from_levels_and_colors([0, 0.05, 1],['Teal', 'MidnightBlue'] ) # ['MidnightBlue', Teal]['Darkgreen', 'Darkred']

    heatmap = ax.pcolor(df ,
                        edgecolors=edgecolors,  # put white lines between squares in heatmap
                        cmap=cmap,
                        norm=norm)
    data = df.values
    for y in range(data.shape[0]):
        for x in range(data.shape[1]):
            plt.text(x + 0.5 , y + 0.5, '%.4f' % data[y, x], #data[y,x] +0.05 , data[y,x] + 0.05
                 horizontalalignment='center',
                 verticalalignment='center',
                 color='w')


    ax.autoscale(tight=True)  # get rid of whitespace in margins of heatmap
    ax.set_aspect('equal')  # ensure heatmap cells are square
    ax.xaxis.set_ticks_position('top')  # put column labels at the top
    ax.tick_params(bottom=False, top=False, left=False, right=False)  # turn off ticks

    ax.set_yticks(np.arange(len(df.index)) + 0.5)
    ax.set_yticklabels(df.index, size=20)
    ax.set_xticks(np.arange(len(df.columns)) + 0.5)
    ax.set_xticklabels(df.columns, rotation=90, size= 15)

    # ugliness from http://matplotlib.org/users/tight_layout_guide.html
    from mpl_toolkits.axes_grid1 import make_axes_locatable
    divider = make_axes_locatable(ax)
    cax = divider.append_axes("right", "3%", pad="1%")
    fig.colorbar(heatmap, cax=cax)

Then

df1 = pd.DataFrame(np.random.choice([0, 0.75], size=(4,5)), columns=list('ABCDE'), index=list('WXYZ'))
heatmap_binary(df1)

gives:

The Answer
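Stripped of the styling, the two fixes can be isolated in a few lines. This is a minimal sketch with made-up values, not a replacement for the function above: the text goes at the center of each cell, and the colorbar is attached through the figure.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up 2x3 example values
data = np.array([[0.0, 0.75, 0.0],
                 [0.75, 0.0, 0.75]])

fig, ax = plt.subplots()
heatmap = ax.pcolor(data, edgecolors="w")

# Cell (row y, column x) spans [x, x+1] by [y, y+1], so its center is (x + 0.5, y + 0.5)
texts = []
for y in range(data.shape[0]):
    for x in range(data.shape[1]):
        texts.append(ax.text(x + 0.5, y + 0.5, "%.2f" % data[y, x],
                             ha="center", va="center"))

# Attach the colorbar through the figure, naming the axes it belongs to
cbar = fig.colorbar(heatmap, ax=ax)
```

Passing ax=ax lets matplotlib carve the colorbar out of that axes' space, which is a simpler alternative to the make_axes_locatable approach when exact sizing does not matter.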

Monday, September 13, 2021
 
Andro Selva
answered 3 Months ago
50

Roger that Ajean and Alios!

Well I did finally find the answer to the question. This is something I've been trying to do for days now. The problem was apparently an issue in an earlier version of Pandas. I installed Pandas 0.15.0 and you can now reference another data frame and use the data for error bars on grouped bar plots like Ceflo was trying to do above. So the following code now works in Pandas 0.15.0.

import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame([[4,6,1,3], [5,7,5,2]], columns = ['mean1', 'mean2', 'std1', 'std2'], index=['A', 'B'])
print(df)

df[['mean1', 'mean2']].plot(kind='bar', yerr=df[['std1', 'std2']].values.T, alpha = 0.5,error_kw=dict(ecolor='k'))
plt.show()
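
A variation that may read more clearly, assuming a pandas version that accepts a DataFrame for yerr: rename the std columns so they match the mean columns by name, then pass the errors DataFrame directly instead of a transposed array. The variable names means and errors are just illustrative.

```python
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.container import BarContainer

df = pd.DataFrame([[4, 6, 1, 3], [5, 7, 5, 2]],
                  columns=["mean1", "mean2", "std1", "std2"], index=["A", "B"])

means = df[["mean1", "mean2"]]
# Rename the std columns so they line up with the mean columns by name
errors = df[["std1", "std2"]].rename(columns={"std1": "mean1", "std2": "mean2"})

ax = means.plot(kind="bar", yerr=errors, alpha=0.5, error_kw=dict(ecolor="k"))

# Each plotted column becomes one BarContainer carrying its own error bars
bars = [c for c in ax.containers if isinstance(c, BarContainer)]
```

As before, plt.show() displays the result.
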
Sunday, October 10, 2021
 
The Coding Wombat
answered 2 Months ago