Skip to content

Latest commit

 

History

History
154 lines (104 loc) · 6.63 KB

README.md

File metadata and controls

154 lines (104 loc) · 6.63 KB

RollingFunctions.jl

Roll a [weighted] function or run a statistic along windowed data.

Copyright © 2017-2024 by Jeffrey Sarnoff. Released under the MIT License.


IMPORTANT <<<

For now, rather than use e.g runmean, runcov, with version 0.8 use running(cov, ..) here is an example



Package Downloads


works with integers, floats, and missings

works with unweighted data

  • data that is a simple vector
  • data that is a CartesianIndexed vector

works with weights

  • weights given as a simple vector
  • weights given as a kind of StatsBase.AbstractWeights

applies functions (1-arg, .., 4-args)

  • applied over unweighted or weighted data

works with data matrices

  • same 1-arg function applied to each column

reasonable uses

  • with a simple vector
  • with a DataFrame column
  • with a TimeSeries column
  • with your own function

Rolling a function over data

With ndata = length(data), using a window of length windowsize, rolling a function results in a vector of ndata - windowsize + 1 elements. So there will be obtained windowsize - 1 fewer values than there are data values. All exported functions named with the prefix roll behave this way unless the keyword padding is given with the value to use for padding (e.g. missing). Using padding will fill the initial windowsize - 1 values with that padding value; the result will match the length of the data.

julia> data = collect(1.0f0:5.0f0); print(data)
Float32[1.0, 2.0, 3.0, 4.0, 5.0]
julia> windowsize = 3;

julia> result = rollmean(data, windowsize); print(result)
Float32[2.0, 3.0, 4.0]

julia> result = rollmean(data, windowsize; padding=missing); print(result)
Union{Missing, Float32}[missing, missing, 2.0, 3.0, 4.0]
julia> weights = normalize([1.0f0, 2.0f0, 4.0f0])
3-element Array{Float32,1}:
 0.21821788
 0.43643576
 0.8728715 
 
julia> result = rollmean(data, windowsize, weights); print(result)
Float32[1.23657, 1.74574, 2.25492]

julia> result = rollmean(data, windowsize, weights; padding=missing); print(result)
Union{Missing,Float32}[missing, missing, 1.23657, 1.74574, 2.25492]

Running a function over data

To obtain the same number of output data values as are given, the initial windowsize - 1 values output must be generated outside of the rolling behavior. This is accomplished by tapering the needed values -- using the same function, rolling it over successively smaller window sizes. All exported functions named with the prefix run behave this way.

julia> using RollingFunctions
julia> data = collect(1.0f0:5.0f0); print(data)
Float32[1.0, 2.0, 3.0, 4.0, 5.0]
julia> windowsize = 3;

julia> result = runmean(data, windowsize); print(result)
Float32[1.0, 1.5, 2.0, 3.0, 4.0]
julia> using RollingFunctions
julia> using LinearAlgebra: normalize

julia> weights = normalize([1.0f0, 2.0f0, 4.0f0]);
 
julia> result = runmean(data, windowsize, weights); print(result)
Float32[1.0, 1.11803, 1.23657, 1.74574, 2.25492]

rolling stats

  • rollmin, rollmax, rollmean, rollmedian
  • rollvar, rollstd, rollsem, rollmad, rollmad_normalized
  • rollskewness, rollkurtosis, rollvariation
  • rollcor, rollcov (over two data vectors)

running stats

  • runmin, runmax, runmean, runmedian
  • runvar, runstd, runsem, runmad, runmad_normalized
  • runskewness, runkurtosis, runvariation
  • runcor, runcov (over two data vectors)

Some of these use a limit value for running over vec of length 1.

works with functions over 1, 2, 3 or 4 data vectors

  • rolling(function, data1, data2, windowsize)

  • rolling(function, data1, data2, windowsize, weights) (weights apply to both data vectors)

  • rolling(function, data1, data2, windowsize, weights1, weights2)

  • rolling(function, data1, data2, data3, windowsize)

  • rolling(function, data1, data2, data3, windowsize, weights) (weights apply to all data vectors)

  • rolling(function, data1, data2, data3, windowsize, weights1, weights2, weights3)

  • running(function, data1, data2, data3, data4, windowsize)

  • running(function, data1, data2, data3, data4, windowsize, weights) (weights apply to all data vectors)

  • running(function, data1, data2, data3, data4, windowsize, weights1, weights2, weights3, weights4)

!! CHANGE ME !! Many statistical functions of two or more vector variables are not well defined for vectors of length 1. To run these functions and get an initial tapered value that is well defined, supply the desired value as firstresult.

  • running(function, data1, data2, windowsize, firstresult)
  • running(function, data1, data2, windowsize, weights, firstresult) (weights apply to both data vectors)

Philosophy and Purpose

This package provides a way for rolling and for running a functional window over data. Data is conveyed either as a vector or as a means of obtaining a vector from a matrix or 3D array or other data structure (e.g. dataframes, timeseries). Windows move over the data. One may use unweighted windows or windows wherein each position carries weight. Weighted windows apply the weight sequence through the window as it moves over the data.

When a window is provided with weights, the weights should must be normalized. We provide an algorithmically safe normalizing function that you may rely upon. Adding the sequence of normalized values one to the next obtains 1.0 or a value very slightly less than 1.0 -- their sum will not exceed unity. I do not know how to augment something already whole while respecting its integrity.

When running with a weighted window, the initial (first, second ..) values are determined using a tapering of the weighted window's span. This requires that the weights themselves be tapered along with the determinative function that is rolled. In this case, the weight subsequence is normalized (sums to one(T)), and that reweighting is used with the foreshortened window to taper that which rolls.

This software exists to simpilfy some of what you create and to faciliate some of the work you do.

Some who use it insightfully share the best of that. Others write words that smile.

All of this is expressed through the design of RollingFunctions.

Also Consider

  • The mapwindow function from ImageFiltering supports multidimensional window indexing and different maps.