# Haversine Formula in Python (Bearing and Distance between two GPS points)

## Problem

I would like to know how to get the distance and bearing between two GPS points. I have researched the haversine formula, and someone told me that I could also find the bearing using the same data.

## Edit

Everything is working fine, but the bearing doesn't quite work right yet: it comes out negative, but it should be between 0 and 360 degrees. With the data set below, the horizontal bearing should be `96.02166666666666`:

``````Start point: 53.32055555555556 , -1.7297222222222221
Bearing:  96.02166666666666
Distance: 2 km
Destination point: 53.31861111111111, -1.6997222222222223
Final bearing: 96.04555555555555
``````

Here is my new code:

``````from math import *

Aaltitude = 2000
Oppsite  = 20000

lat1 = 53.32055555555556
lat2 = 53.31861111111111
lon1 = -1.7297222222222221
lon2 = -1.6997222222222223

lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])

dlon = lon2 - lon1
dlat = lat2 - lat1
a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
c = 2 * atan2(sqrt(a), sqrt(1-a))
Base = 6371 * c

Bearing =atan2(cos(lat1)*sin(lat2)-sin(lat1)*cos(lat2)*cos(lon2-lon1), sin(lon2-lon1)*cos(lat2))

Bearing = degrees(Bearing)
print ""
print ""
print "--------------------"
print "Horizontal Distance:"
print Base
print "--------------------"
print "Bearing:"
print Bearing
print "--------------------"

Base2 = Base * 1000
distance = Base * 2 + Oppsite * 2 / 2
Caltitude = Oppsite - Aaltitude

a = Oppsite/Base
b = atan(a)
c = degrees(b)

distance = distance / 1000

print "The degree of vertical angle is:"
print c
print "--------------------"
print "The distance between the Balloon GPS and the Antenna GPS is:"
print distance
print "--------------------"
``````
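
A minimal sketch of one way to get a compass bearing in the 0 to 360 range, assuming the usual initial-bearing formula: `atan2` takes the east-west term first (the code above passes the two arguments in the opposite order), and the result in degrees is normalised with a modulo. The function name `initial_bearing` is illustrative.

```python
from math import atan2, cos, degrees, radians, sin

def initial_bearing(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, in degrees clockwise from north."""
    lat1, lon1, lat2, lon2 = map(radians, [lat1, lon1, lat2, lon2])
    dlon = lon2 - lon1
    y = sin(dlon) * cos(lat2)                                      # east-west term
    x = cos(lat1) * sin(lat2) - sin(lat1) * cos(lat2) * cos(dlon)  # north-south term
    # atan2 returns a value in (-180, 180]; shift it into [0, 360)
    return (degrees(atan2(y, x)) + 360) % 360

bearing = initial_bearing(53.32055555555556, -1.7297222222222221,
                          53.31861111111111, -1.6997222222222223)
print(bearing)
```

For the sample points this gives a bearing of roughly 96 degrees, just south of due east. A small difference from the expected value is plausible because the destination coordinates are rounded.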


Here's a Python version:

``````from math import radians, cos, sin, asin, sqrt

def haversine(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees)
    """
    # convert decimal degrees to radians
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])

    # haversine formula
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a))
    r = 6371 # Radius of earth in kilometers. Use 3956 for miles
    return c * r
``````
Tuesday, June 1, 2021


This link might be helpful to you, as it details the use of the Haversine formula to calculate the distance.

Excerpt:

This script [in Javascript] calculates great-circle distances between the two points – that is, the shortest distance over the earth’s surface – using the ‘Haversine’ formula.

``````function getDistanceFromLatLonInKm(lat1,lon1,lat2,lon2) {
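  var R = 6371; // Radius of the earth in km
  var dLat = deg2rad(lat2-lat1);  // deg2rad below
  var dLon = deg2rad(lon2-lon1);
  var a =
    Math.sin(dLat/2) * Math.sin(dLat/2) +
    Math.cos(deg2rad(lat1)) * Math.cos(deg2rad(lat2)) *
    Math.sin(dLon/2) * Math.sin(dLon/2)
    ;
  var c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1-a));
  var d = R * c; // Distance in km
  return d;
}

function deg2rad(deg) {
  return deg * (Math.PI/180)
}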
``````
Tuesday, June 1, 2021


Seems like you could use the magic of `pandas`.

It's easy to create a `pandas dataframe` from a csv file using the `read_csv()` function:

``````import pandas as pd
df = pd.read_csv('data.csv')  # hypothetical path; point this at your own CSV file
``````

Based on your sample data, this will create the following `dataframe`:

``````    ID        timestamp   latitude  longitude
0    3   6/9/2017 22:20  38.795333  77.008883
1    1   5/5/2017 13:10  38.889011  77.050061
2    2  2/10/2017 16:23  40.748249  73.984191
3    1   5/5/2017 12:35  38.920602  77.222329
4    3  6/10/2017 10:00  42.366211  71.020943
5    1   5/5/2017 20:00  38.897416  77.036833
6    2   2/10/2017 7:30  38.851426  77.042298
7    3   6/9/2017 10:20  38.917346  77.222553
8    2  2/10/2017 19:51  40.782869  73.967544
9    3   6/10/2017 6:42  38.954268  77.449695
10   1   5/5/2017 16:35  38.872875  77.007763
11   2  2/10/2017 10:00  40.776931  73.876155
``````

### Convert the timestamp column

Pandas (and Python in general) has extensive libraries for date and time operations. But first, you will need to prepare your data by converting the timestamp column (a string) into a datetime object. I am assuming your dates are in the format `"MM/DD/YYYY"` (since you didn't specify).

``````df['timestamp'] = pd.to_datetime(df['timestamp'], format='%m/%d/%Y %H:%M')
``````

### Helper functions

You're going to have to define some functions to compute the distance and the velocity. The Haversine distance function is adapted from this answer.

``````from math import sin, cos, sqrt, atan2, radians

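def getDistanceFromLatLonInKm(lat1,lon1,lat2,lon2):
    R = 6371 # Radius of the earth in km
    # differences and latitudes converted from degrees to radians
    dLat = radians(lat2-lat1)
    dLon = radians(lon2-lon1)
    rLat1 = radians(lat1)
    rLat2 = radians(lat2)
    a = sin(dLat/2) * sin(dLat/2) + cos(rLat1) * cos(rLat2) * sin(dLon/2) * sin(dLon/2)
    c = 2 * atan2(sqrt(a), sqrt(1-a))
    d = R * c # Distance in km
    return d

def calc_velocity(dist_km, time_start, time_end):
    """Return 0 if time_start == time_end, avoid dividing by 0"""
    # total_seconds() covers the whole interval, including spans longer than a day
    return dist_km / (time_end - time_start).total_seconds() if time_end > time_start else 0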
``````
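
When dividing by elapsed time, note that `timedelta.seconds` holds only the seconds component within the last day, whereas `total_seconds()` covers the full duration. A small standard-library sketch of the difference (the dates are made up for illustration):

```python
from datetime import datetime

start = datetime(2017, 6, 9, 10, 20)
end = datetime(2017, 6, 11, 10, 20)   # exactly two days later (illustrative)
delta = end - start

print(delta.seconds)          # sub-day component only
print(delta.total_seconds())  # full elapsed time in seconds
```

In the sample data no ID spans more than a day, so either attribute gives the same velocities there.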

### Make some intermediate variables

We want to compute the Haversine function on each row, but we need some information from the first row for each group. Luckily, `pandas` makes this easy with `sort_values()`, `groupby()` and `transform()`.

The following code makes 3 new columns, one each for the initial latitude, longitude, and time for each ID.

``````# First sort by ID and timestamp:
df = df.sort_values(by=['ID', 'timestamp'])

# Group the sorted dataframe by ID, and grab the initial value for lat, lon, and time.
df['lat0'] = df.groupby('ID')['latitude'].transform(lambda x: x.iat[0])
df['lon0'] = df.groupby('ID')['longitude'].transform(lambda x: x.iat[0])
df['t0'] = df.groupby('ID')['timestamp'].transform(lambda x: x.iat[0])
``````
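
To see what these `transform` calls produce, here is a small sketch on a made-up three-row frame. It assumes `pandas` is installed and uses the built-in `'first'` aggregation, which for data without missing values gives the same result as taking the first element of each group.

```python
import pandas as pd

# Made-up sample: two rows for ID 1, one row for ID 2
df = pd.DataFrame({
    "ID": [1, 1, 2],
    "latitude": [38.92, 38.88, 38.85],
})

# Broadcast each group's first latitude to every row of that group
df["lat0"] = df.groupby("ID")["latitude"].transform("first")
print(df)
```

Every row now carries the starting latitude of its own ID, which is what the distance function needs as its second point.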

### Apply the functions

``````# create a new column for distance
df['dist_km'] = df.apply(
    lambda row: getDistanceFromLatLonInKm(
        lat1=row['latitude'],
        lon1=row['longitude'],
        lat2=row['lat0'],
        lon2=row['lon0']
    ),
    axis=1
)

# create a new column for velocity
df['velocity_kmps'] = df.apply(
    lambda row: calc_velocity(
        dist_km=row['dist_km'],
        time_start=row['t0'],
        time_end=row['timestamp']
    ),
    axis=1
)
``````

### The Result

``````>>> print(df[['ID', 'timestamp', 'latitude', 'longitude', 'dist_km', 'velocity_kmps']])

    ID           timestamp   latitude  longitude     dist_km  velocity_kmps
3    1 2017-05-05 12:35:00  38.920602  77.222329    0.000000       0.000000
1    1 2017-05-05 13:10:00  38.889011  77.050061   15.314742       0.007293
10   1 2017-05-05 16:35:00  38.872875  77.007763   19.312148       0.001341
5    1 2017-05-05 20:00:00  38.897416  77.036833   16.255868       0.000609
6    2 2017-02-10 07:30:00  38.851426  77.042298    0.000000       0.000000
11   2 2017-02-10 10:00:00  40.776931  73.876155  344.880549       0.038320
2    2 2017-02-10 16:23:00  40.748249  73.984191  335.727502       0.010498
8    2 2017-02-10 19:51:00  40.782869  73.967544  339.206320       0.007629
7    3 2017-06-09 10:20:00  38.917346  77.222553    0.000000       0.000000
0    3 2017-06-09 22:20:00  38.795333  77.008883   22.942974       0.000531
9    3 2017-06-10 06:42:00  38.954268  77.449695   20.070609       0.000274
4    3 2017-06-10 10:00:00  42.366211  71.020943  648.450485       0.007611
``````

From here, I will leave it to you to figure out how to grab the last entry for each ID.
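
One possible way to do that, sketched on a made-up frame and assuming the rows are sorted by ID and timestamp as above: `groupby('ID').tail(1)` keeps the last row of each group with all of its columns.

```python
import pandas as pd

# Made-up sample with two IDs and two timestamps each
df = pd.DataFrame({
    "ID": [1, 1, 2, 2],
    "timestamp": pd.to_datetime(["2017-05-05 12:35", "2017-05-05 20:00",
                                 "2017-02-10 07:30", "2017-02-10 19:51"]),
    "dist_km": [0.0, 16.3, 0.0, 339.2],
})

df = df.sort_values(by=["ID", "timestamp"])
last_rows = df.groupby("ID").tail(1)   # last (latest) row per ID
print(last_rows)
```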

Wednesday, August 11, 2021


No, because lines of longitude converge towards the poles. If your points are relatively close together, you can approximate the distance (as an angle, with all inputs in radians) thus:

``````d = sqrt(pow(lat2-lat1, 2) + pow(cos(lat1)*(lon2-lon1), 2))
``````
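
A sketch of this approximation as a function, assuming coordinates in degrees, a mean Earth radius of 6371 km, and the cosine of the latitude applied to the longitude difference before squaring. The function name and the use of the mean latitude are illustrative choices.

```python
from math import cos, radians, sqrt

def approx_distance_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Flat-earth (equirectangular) approximation, reasonable for nearby points."""
    lat1, lon1, lat2, lon2 = map(radians, [lat1, lon1, lat2, lon2])
    x = (lon2 - lon1) * cos((lat1 + lat2) / 2)  # shrink longitude by cos(latitude)
    y = lat2 - lat1
    return radius_km * sqrt(x * x + y * y)      # angle in radians times radius

d = approx_distance_km(53.32055555555556, -1.7297222222222221,
                       53.31861111111111, -1.6997222222222223)
print(d)
```

For the two sample points from the question at the top of this page, the result is close to 2 km.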

If you need greater accuracy over large distances, there are several fancy formulae for computing great-circle distances, but I find it simpler to convert to 3D coordinates on a unit sphere, take a simple Pythagorean distance d, and then apply 2 sin⁻¹(d/2) to convert back to an angle (though I can understand that some might not find this simpler :-).
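
A minimal sketch of that 3D approach, assuming a spherical Earth of radius 6371 km and coordinates in degrees. The function names are illustrative.

```python
from math import asin, cos, radians, sin, sqrt

def to_unit_vector(lat_deg, lon_deg):
    """Convert latitude/longitude in degrees to a point on the unit sphere."""
    lat, lon = radians(lat_deg), radians(lon_deg)
    return (cos(lat) * cos(lon), cos(lat) * sin(lon), sin(lat))

def chord_distance_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance via the straight-line (chord) distance in 3D."""
    p, q = to_unit_vector(lat1, lon1), to_unit_vector(lat2, lon2)
    d = sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))  # chord length, between 0 and 2
    angle = 2 * asin(min(1.0, d / 2))                  # central angle in radians
    return radius_km * angle

print(chord_distance_km(0, 0, 0, 90))   # a quarter of the way around the equator
```

The `min(1.0, ...)` guard only protects `asin` from rounding error when the points are almost exactly antipodal.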

Tuesday, August 17, 2021


This sounds like a classic use case for k-D trees.

If you first transform your points into Euclidean space then you can use the `query_pairs` method of `scipy.spatial.cKDTree`:

``````from scipy.spatial import cKDTree

tree = cKDTree(data)
# where data is (nshops, ndim) containing the Euclidean coordinates of each shop
# in units of km

pairs = tree.query_pairs(50, p=2)   # 50km radius, L2 (Euclidean) norm
``````

`pairs` will be a `set` of `(i, j)` tuples corresponding to the row indices of pairs of shops that are ≤50km from each other.
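
A sketch of the transformation step, assuming the shops are given as latitude/longitude in degrees and treating the Earth as a sphere of radius 6371 km. The points are mapped to 3D Cartesian coordinates in km, so the tree measures straight-line (chord) distance, which is slightly shorter than the surface distance but very close at the 50 km scale. The helper name and the sample coordinates are made up.

```python
import numpy as np
from scipy.spatial import cKDTree

def latlon_to_xyz_km(lat_deg, lon_deg, radius_km=6371.0):
    """Map latitude/longitude arrays (degrees) to an (n, 3) array of km coordinates."""
    lat, lon = np.radians(lat_deg), np.radians(lon_deg)
    return np.column_stack((
        radius_km * np.cos(lat) * np.cos(lon),
        radius_km * np.cos(lat) * np.sin(lon),
        radius_km * np.sin(lat),
    ))

# Three made-up shops: the first two are near each other, the third is far away
lats = np.array([38.90, 38.95, 40.75])
lons = np.array([-77.04, -77.45, -73.98])

data = latlon_to_xyz_km(lats, lons)
tree = cKDTree(data)
pairs = tree.query_pairs(50, p=2)   # index pairs within 50 km of each other
print(pairs)
```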

The output of `tree.sparse_distance_matrix` is a `scipy.sparse.dok_matrix`. Since the matrix will be symmetric and you're only interested in unique row/column pairs, you could use `scipy.sparse.tril` to zero out the upper triangle, giving you a `scipy.sparse.coo_matrix`. From there you can access the nonzero row and column indices and their corresponding distance values via the `.row`, `.col` and `.data` attributes:

``````from scipy import sparse

tree_dist = tree.sparse_distance_matrix(tree, max_distance=10000, p=2)
udist = sparse.tril(tree_dist, k=-1)    # keep the strictly lower triangle (k=-1 also zeros the main diagonal)
ridx = udist.row    # row indices
cidx = udist.col    # column indices
dist = udist.data   # distance values
``````
Thursday, September 16, 2021