# Moving average or running mean


Is there a SciPy function or NumPy function or module for Python that calculates the running mean of a 1D array given a specific window?

## Answers

57

For a short, fast solution that does the whole thing in one loop, without dependencies, the code below works great.

```python
mylist = [1, 2, 3, 4, 5, 6, 7]
N = 3
cumsum, moving_aves = [0], []

for i, x in enumerate(mylist, 1):
    cumsum.append(cumsum[i-1] + x)
    if i >= N:
        moving_ave = (cumsum[i] - cumsum[i-N]) / N
        # can do stuff with moving_ave here
        moving_aves.append(moving_ave)
```
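Since the question asks about NumPy, the same windowed mean can also be sketched with `np.convolve` (a minimal sketch; the `running_mean` helper name is my own, and `mode="valid"` keeps only windows that fit entirely inside the input, so the output is `N - 1` elements shorter):

```python
import numpy as np

def running_mean(x, N):
    # Convolve with a uniform kernel of weight 1/N; "valid" keeps
    # only the windows that fit entirely inside the input.
    return np.convolve(x, np.ones(N) / N, mode="valid")

print(running_mean([1, 2, 3, 4, 5, 6, 7], 3))  # [2. 3. 4. 5. 6.]
```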
Tuesday, June 1, 2021

answered 6 Months ago
37

The agg framework now has `$map`, `$reduce`, and `$range` built in, so array processing is much more straightforward. Below is an example of calculating a moving average on a set of data where you wish to filter by some predicate. The basic setup is that each doc contains filterable criteria and a value, e.g.

```
{sym: "A", d: ISODate("2018-01-01"), val: 10}
{sym: "A", d: ISODate("2018-01-02"), val: 30}
```

Here it is:

```javascript
// This controls the number of observations in the moving average:
days = 4;

c = db.foo.aggregate([

// Filter down to what you want.  This can be anything or nothing at all.
{$match: {"sym": "S1"}}

// Ensure dates are going earliest to latest:
,{$sort: {d: 1}}

// Turn docs into a single doc with a big vector of observations, e.g.
//     {sym: "A", d: d1, val: 10}
//     {sym: "A", d: d2, val: 11}
//     {sym: "A", d: d3, val: 13}
// becomes
//     {_id: "A", prx: [ {v:10,d:d1}, {v:11,d:d2}, {v:13,d:d3} ] }
//
// This will set us up to take advantage of array processing functions!
,{$group: {_id: "$sym", prx: {$push: {v: "$val", d: "$d"}} }}

// Nice additional info.  Note use of dot notation on the array to get
// just the scalar date at elem 0, not the object {v:val, d:date}:
,{$addFields: {numDays: days, startDate: {$arrayElemAt: [ "$prx.d", 0 ]}} }

// The Juice!  Assume we have a variable "days" which is the desired number
// of days in the moving average.
// The complex expression below does this in Python pseudocode:
//
// for z in range(0, len(vector) - (days - 1)):
//     seg = vector[z:z+days]
//     tot = sum(x.v for x in seg)
//     avg = tot / len(seg)
//
// Note that it is possible to overrun the segment at the end of the "walk"
// along the vector, i.e. not have enough date-values.  So we only run the
// vector to (len(vector) - (days - 1)).
// Also, for extra info, we add the number of days *actually* used in the
// calculation AND the as-of date, which is the tail date of the segment!
//
// Again we take advantage of dot notation to turn the vector of
// objects {v:val, d:date} into two vectors of simple scalars [v1,v2,...]
// and [d1,d2,...] with $prx.v and $prx.d.
//
,{$addFields: {"prx": {$map: {
    input: {$range: [0, {$subtract: [{$size: "$prx"}, (days - 1)]}]},
    as: "z",
    in: {
        avg: {$avg: {$slice: [ "$prx.v", "$$z", days ] } },
        d: {$arrayElemAt: [ "$prx.d", {$add: ["$$z", (days - 1)] } ]}
    }
}}
}}

]);
```

This might produce the following output:

``````{
"_id" : "S1",
"prx" : [
{
"avg" : 11.738793632512115,
"d" : ISODate("2018-09-05T16:10:30.259Z")
},
{
"avg" : 12.420766702631376,
"d" : ISODate("2018-09-06T16:10:30.259Z")
},
...

],
"numDays" : 4,
"startDate" : ISODate("2018-09-02T16:10:30.259Z")
}
``````
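The walk that the `$map`/`$slice`/`$avg` stage performs can be sketched in plain Python (the `vals` and `dates` lists here are hypothetical stand-ins for the `prx` vector built by `$group`):

```python
# Plain-Python sketch of the $map/$slice/$avg stage above, with
# hypothetical vals/dates lists standing in for the prx vector.
days = 4
vals = [10, 30, 20, 40, 50, 60]
dates = ["d1", "d2", "d3", "d4", "d5", "d6"]

prx = []
for z in range(len(vals) - (days - 1)):
    seg = vals[z:z + days]             # $slice: ["$prx.v", "$$z", days]
    prx.append({
        "avg": sum(seg) / len(seg),    # $avg of the slice
        "d": dates[z + days - 1],      # tail (as-of) date of the segment
    })

print(prx[0])  # {'avg': 25.0, 'd': 'd4'}
```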
Tuesday, July 6, 2021

answered 5 Months ago
57

I think you don't need to do fftshift(), and you can pass the sampling period to fftfreq():

```python
import numpy as np
import scipy.fftpack
import pylab

t = np.linspace(0, 120, 4000)
acc = lambda t: 10*np.sin(2*np.pi*2.0*t) + 5*np.sin(2*np.pi*8.0*t) + 2*np.random.random(len(t))

signal = acc(t)

FFT = abs(scipy.fftpack.fft(signal))
freqs = scipy.fftpack.fftfreq(signal.size, t[1] - t[0])

pylab.subplot(211)
pylab.plot(t, signal)
pylab.subplot(212)
pylab.plot(freqs, 20*np.log10(FFT), 'x')
pylab.show()
```

From the graph you can see there are two peaks, at 2 Hz and 8 Hz.
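As a sketch of the same check without plotting, the two tones can be recovered numerically with NumPy's one-sided real FFT (noise omitted here so the result is deterministic; `rfft`/`rfftfreq` are the modern equivalents of the `fftpack` calls above):

```python
import numpy as np

t = np.linspace(0, 120, 4000)
signal = 10*np.sin(2*np.pi*2.0*t) + 5*np.sin(2*np.pi*8.0*t)

spectrum = np.abs(np.fft.rfft(signal))            # one-sided FFT of a real signal
freqs = np.fft.rfftfreq(signal.size, t[1] - t[0]) # bin frequencies in Hz

# The two strongest bins should sit at the two test tones:
peaks = freqs[np.argsort(spectrum)[-2:]]
print(sorted(np.round(peaks, 1).tolist()))  # [2.0, 8.0]
```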

Friday, July 9, 2021

answered 5 Months ago
34

You can use `rolling` with `transform`:

```python
df['moving'] = df.groupby('object')['value'].transform(lambda x: x.rolling(10, 1).mean())
```

The `1` in `rolling` is the minimum number of periods (`min_periods`), so the first rows of each group still get a (shorter-window) mean instead of NaN.
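A minimal reproduction, with hypothetical `object`/`value` columns matching the snippet:

```python
import pandas as pd

df = pd.DataFrame({
    "object": ["a", "a", "a", "b", "b"],
    "value":  [1.0, 2.0, 3.0, 10.0, 20.0],
})

# Rolling mean over up to 10 rows, computed per group; min_periods=1
# keeps the first rows of each group from becoming NaN.
df["moving"] = df.groupby("object")["value"].transform(
    lambda x: x.rolling(10, 1).mean()
)

print(df["moving"].tolist())  # [1.0, 1.5, 2.0, 10.0, 15.0]
```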

Tuesday, July 13, 2021

answered 5 Months ago
99

SQL Fiddle

```sql
select
    "date",
    shop_id,
    amount,
    extract(dow from date),
    case when
        row_number() over (order by date) > 3
    then
        avg(amount) over (
            order by date desc
            rows between 1 following and 3 following
        )
    else null end
from (
    select *
    from ro
    where extract(dow from date) = 4
) s
```

What is wrong with the OP's query is the frame specification:

```sql
ROWS BETWEEN 0 PRECEDING AND 2 FOLLOWING
```

Other than that, my query avoids unneeded computation by filtering down to Thursdays before applying the expensive window functions.

If it is necessary to partition by shop_id, then obviously add `partition by shop_id` to both window functions, `avg` and `row_number`.
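The frame semantics can be checked with any engine that supports window functions; a small sketch with Python's built-in `sqlite3` (requires SQLite ≥ 3.25; the tiny `ro` table here is hypothetical sample data):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ro (d INTEGER, amount REAL);
    INSERT INTO ro VALUES (1, 10), (2, 20), (3, 30), (4, 40), (5, 50);
""")

# Average of the NEXT three rows (1 FOLLOWING .. 3 FOLLOWING), as in
# the answer's descending-order trick; an empty frame yields NULL.
rows = conn.execute("""
    SELECT d, AVG(amount) OVER (
        ORDER BY d
        ROWS BETWEEN 1 FOLLOWING AND 3 FOLLOWING
    ) FROM ro ORDER BY d
""").fetchall()

print(rows[0])  # (1, 30.0)  -> mean of rows 2, 3, 4
```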

Friday, September 17, 2021

answered 3 Months ago