Asked  7 Months ago    Answers:  5   Viewed   2k times

I'm generating a bar chart with matplotlib. Everything works, but I can't figure out how to prevent the x-axis labels from overlapping each other. Here is an example:
[image: bar chart with overlapping x-axis labels]

Here is some sample SQL for a postgres 9.1 database:

drop table if exists mytable;
create table mytable(id bigint, version smallint, date_from timestamp without time zone);
insert into mytable(id, version, date_from) values

('4084036', '1', '2006-12-22 22:46:35'),
('4084938', '1', '2006-12-23 16:19:13'),
('4084938', '2', '2006-12-23 16:20:23'),
('4084939', '1', '2006-12-23 16:29:14'),
('4084954', '1', '2006-12-23 16:28:28'),
('4250653', '1', '2007-02-12 21:58:53'),
('4250657', '1', '2007-03-12 21:58:53')
;  

And this is my python-script:

#!/usr/bin/python2.7
# -*- coding: utf-8 -*-
import psycopg2
import matplotlib.pyplot as plt
fig = plt.figure()

# for savefig()
import pylab

###
### Connect to database with psycopg2
###

try:
  conn_string="dbname='x' user='y' host='z' password='pw'"
  print "Connecting to database\n->%s" % (conn_string)

  conn = psycopg2.connect(conn_string)
  print "Connection to database was established successfully"
except:
  print "Connection to database failed"

###
### Execute SQL query
###  

# New cursor method for sql
cur = conn.cursor()

# Execute SQL query. For a query that spans more than one line, use triple quotes
try:
  cur.execute(""" 

-- In which year/month have these points been created?
-- Need 'yyyymm' because I only need months with years (values are summed up). Without it, the query returns every day the db has an entry.

SELECT to_char(s.day,'yyyymm') AS month
      ,count(t.id)::int AS count
FROM  (
   SELECT generate_series(min(date_from)::date
                         ,max(date_from)::date
                         ,interval '1 day'
          )::date AS day
   FROM   mytable t
   ) s
LEFT   JOIN mytable t ON t.date_from::date = s.day
GROUP  BY month
ORDER  BY month;

  """)

  # Return the results of the query. fetchall() = all remaining rows, fetchone() = next row
  records = cur.fetchall()
  cur.close()

except:
  print "Query could not be executed"

# Unzip the data from the db-query. Order is the same as db-query output
year, count = zip(*records)

###
### Plot (Barchart)
###

# Count the length of the range of the count-values, y-axis-values, position of axis-labels, legend-label
plt.bar(range(len(count)), count, align='center', label='Amount of created/edited points')

# Add database-values to the plot with an offset of 10px/10px
ax = fig.add_subplot(111)
for i,j in zip(year,count):
    ax.annotate(str(j), xy=(i,j), xytext=(10,10), textcoords='offset points')

# Rotate x-labels on the x-axis
fig.autofmt_xdate()

# Label-values for x and y axis
plt.xticks(range(len(count)), (year))

# Label x and y axis
plt.xlabel('Year')
plt.ylabel('Amount of created/edited points')

# Locate legend on the plot (http://matplotlib.org/users/legend_guide.html#legend-location)
plt.legend(loc=1)

# Plot-title
plt.title("Amount of created/edited points over time")

# show plot
pylab.show()

Is there a way to prevent the labels from overlapping each other? Ideally automatically, because I can't predict the number of bars.

 Answers

21

Edit 2014-09-30

pandas now has a read_sql function. You definitely want to use that instead.
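
A minimal sketch of what that could look like, assuming pandas is installed. An in-memory SQLite database stands in for the PostgreSQL connection from the question (that substitution is an assumption made only to keep the example self-contained); the table and column names are the ones from the question.

```python
import sqlite3

import pandas as pd

# Stand-in database (assumption: the question uses PostgreSQL; SQLite is
# used here only so the sketch runs without a server).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, version INTEGER, date_from TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (4084036, 1, "2006-12-22 22:46:35"),
        (4084938, 1, "2006-12-23 16:19:13"),
        (4084938, 2, "2006-12-23 16:20:23"),
        (4084939, 1, "2006-12-23 16:29:14"),
        (4084954, 1, "2006-12-23 16:28:28"),
        (4250653, 1, "2007-02-12 21:58:53"),
        (4250657, 1, "2007-03-12 21:58:53"),
    ],
)

# read_sql runs the query and returns a DataFrame; parse_dates turns the
# text column into real datetime values.
df = pd.read_sql(
    "SELECT id, version, date_from FROM mytable",
    conn,
    parse_dates=["date_from"],
)

# Count rows per calendar month.
per_month = df.groupby(df["date_from"].dt.to_period("M")).size()
print(per_month)
```

Unlike the `generate_series` query in the question, this grouping only contains months that actually have rows, so empty months would still need to be filled in if you want them on the chart.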

Original Answer

Here's how you should convert your date strings into real datetime objects:

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
data_tuples = [
    ('4084036', '1', '2006-12-22 22:46:35'),
    ('4084938', '1', '2006-12-23 16:19:13'),
    ('4084938', '2', '2006-12-23 16:20:23'),
    ('4084939', '1', '2006-12-23 16:29:14'),
    ('4084954', '1', '2006-12-23 16:28:28'),
    ('4250653', '1', '2007-02-12 21:58:53'),
    ('4250657', '1', '2007-03-12 21:58:53')]
datatypes = [('col1', 'i4'), ('col2', 'i4'), ('date', 'S20')]
data = np.array(data_tuples, dtype=datatypes)
col1 = data['col1']
dates = mdates.num2date(mdates.datestr2num(data['date']))
fig, ax1 = plt.subplots()
ax1.bar(dates, col1)
fig.autofmt_xdate()

Getting a simple list of tuples out of your database cursor should be as simple as...

data_tuples = []
for row in cursor:
    data_tuples.append(row)

However, I posted a version of a function that I use to take db cursors directly to record arrays or pandas dataframes here: "How to convert SQL Query result to PANDAS Data Structure?"

Hopefully that helps too.
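
To tie this back to the overlapping labels: once the x values are real dates, matplotlib's date locators can pick the tick positions, so the number of labels no longer grows with the number of bars. A rough sketch, assuming the (month, count) rows have already been fetched in the 'yyyymm' format of the question's query. The `records` list is made-up sample data, and `AutoDateLocator` with `ConciseDateFormatter` (matplotlib 3.1 or later) is one possible choice, not something the original code used.

```python
from datetime import datetime

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to use plt.show()
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Hypothetical (month, count) rows in 'yyyymm' format
records = [("200612", 5), ("200701", 0), ("200702", 1), ("200703", 1)]

# Convert 'yyyymm' strings to datetime objects (first day of each month)
months = [datetime.strptime(m, "%Y%m") for m, _ in records]
counts = [c for _, c in records]

fig, ax = plt.subplots()
# On a date axis the bar width is measured in days
ax.bar(months, counts, width=20, align="center")

# Let matplotlib choose how many date ticks fit on the axis
locator = mdates.AutoDateLocator()
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(mdates.ConciseDateFormatter(locator))
fig.autofmt_xdate()
fig.canvas.draw()
```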

Thursday, June 3, 2021
 
barden
answered 7 Months ago
71

Try replacing

plt.bar(range(len(my_dict)), my_dict.values(), align='center')

with

plt.figure(figsize=(20, 3))  # width:20, height:3
plt.bar(range(len(my_dict)), my_dict.values(), align='edge', width=0.3)
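
If the number of bars is not known in advance, one option is to derive the figure width from it instead of hard-coding 20. A small sketch: `my_dict` is made-up data, and the 0.4 inches per bar and the 6-inch minimum are arbitrary assumptions to tune for your font size.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to use plt.show()
import matplotlib.pyplot as plt

# Made-up data: 40 labelled bars
my_dict = {"label %d" % i: i % 7 + 1 for i in range(40)}

n = len(my_dict)
inches_per_bar = 0.4                      # assumption: room for one rotated label
fig_width = max(6.0, n * inches_per_bar)  # never go below 6 inches

fig, ax = plt.subplots(figsize=(fig_width, 3))
positions = list(range(n))
ax.bar(positions, list(my_dict.values()), align="center", width=0.6)

# One tick per bar, labels rotated so they do not run into each other
ax.set_xticks(positions)
ax.set_xticklabels(list(my_dict.keys()), rotation=90)
fig.tight_layout()
```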
Thursday, August 12, 2021
 
ala
answered 4 Months ago
78
  • You can "lift" the graph by setting a lower ylim with ax.set_ylim.
  • Marker dots can be added to the plot using the marker = 'o' parameter setting in the call to plt.plot:

import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)

xticks=['Jan','Feb','Mar','April','May']
x=[1,2,3,4,5]
yticks = ['Windy', 'Sunny', 'Rainy', 'Cloudy', 'Snowy']
y=[2,1,3,5,4]

plt.plot(x,y,'b-', marker = 'o') #.2,.1,.7,.8
plt.subplots_adjust(left =0.2)

plt.xticks(x,xticks)
plt.yticks(y,yticks)
ax.set_ylim(0.5,max(y))
plt.show()


Monday, August 16, 2021
 
nighter
answered 4 Months ago
28

You could provide a dummy x-range and then override the xtick labels. I do agree with the comments above questioning whether it's the best solution, but that's hard to judge without any context.

If you really want to, this might be an option:

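import numpy as np
import matplotlib.pyplot as plt

fig, ax = plt.subplots(1, 2, figsize=(10, 4))

x = [2, 4, 3, 6, 1, 7]
y = [1, 2, 3, 4, 5, 6]

ax[0].plot(x, y)

# evenly spaced dummy positions, relabelled with the real x values
positions = np.arange(len(x))
ax[1].plot(positions, y)
ax[1].set_xticks(positions)
ax[1].set_xticklabels(x)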


Edit: If you work with dates, why not plot the real dates on the axis (and perhaps format them by day of month, if you do want 29, 30, 1, 2, etc. on the axis)?
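
A sketch of that suggestion, assuming daily data that crosses a month boundary. The dates and values are made up; `DayLocator` and `DateFormatter` are standard `matplotlib.dates` classes.

```python
from datetime import date, timedelta

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to use plt.show()
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Six consecutive days: Sep 29, Sep 30, Oct 1, Oct 2, Oct 3, Oct 4
start = date(2021, 9, 29)
days = [start + timedelta(days=i) for i in range(6)]
values = [1, 2, 3, 4, 5, 6]

fig, ax = plt.subplots()
ax.plot(days, values, marker="o")

# One tick per day, labelled with the zero-padded day of month only
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%d"))
fig.canvas.draw()
```

Because the x values are real dates, the spacing stays correct even if a day is missing from the data, which the dummy-range approach cannot guarantee.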

Friday, October 8, 2021
 
Indranil
answered 2 Months ago
33

Thank you @arsho for your input. I have made it a little more compact. It also fixes the error on the last group of bars in your code. See the comments in the code. Hope this helps.

For those like me who are new to matplotlib: we can simply plot a line on the subplot, regardless of whether it already contains bars.

import numpy as np
import matplotlib.pyplot as plt

# fig is the whole figure; ax1 is a subplot in the figure,
# so we reference it to plot bars and lines there
fig, ax1 = plt.subplots()

ind = np.arange(3)
width = 0.15

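# one color and marker per group; one tick label per x position
colors = ['#00ff00', '#0000ff', '#ff00ff']
markers = ['x','o','v']
xticklabels = ['50/50', '60/40', '70/30']

# bar heights for each group
group1 = [12,6,5]
group2 = [6,8,12]
group3 = [2,4,9]

# collect the groups so we can loop over them
all_groups = [ group1, group2, group3 ]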

# plot each group of bars; loop-variable bar_values contains values for bars
for i, bar_values in enumerate( all_groups ):

  # compute position for each bar
  bar_position = width*i
  ax1.bar( ind + bar_position, bar_values, width, color=colors[i] )

# plot line for each group of bars; loop-variable y_values contains values for lines
for i, y_values in enumerate( all_groups ):

  # moves the beginning of a line to the middle of the bar
  additional_space = (width*i) + (width/2)
  # x_values contains list indices plus additional space
  x_values = [ x + additional_space for x,_ in enumerate( y_values ) ]

  # simply plot the values in y_values
  ax1.plot( x_values, y_values, marker=markers[i], color=colors[i] )

plt.setp([ax1], xticks=ind + width, xticklabels=xticklabels)

plt.tight_layout()
plt.show()
Saturday, November 13, 2021
 
KatokichiSoft
answered 3 Weeks ago