danielor/BeamStats

BeamStats

BeamStats is a collection of statistical transforms for the Apache Beam framework written in Java.

Usage

Apply Moment.globally to a PCollection<Double> to compute a moment of its distribution: the mean, variance, skew, or kurtosis.

import com.danielor.beamstats.Moment;
import org.apache.beam.sdk.values.PCollection;

// buildInput() stands in for whatever step of your pipeline produces the values.
PCollection<Double> input = buildInput();

// Each moment is computed over the whole input collection.
PCollection<Double> mean = input.apply(Moment.globally(Moment.MomentType.MEAN));
PCollection<Double> variance = input.apply(Moment.globally(Moment.MomentType.VARIANCE));
PCollection<Double> skew = input.apply(Moment.globally(Moment.MomentType.SKEW));
PCollection<Double> kurtosis = input.apply(Moment.globally(Moment.MomentType.KURTOSIS));
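
For reference, the four moments can be sketched in plain Java without Beam. This is only an illustration of the quantities involved, not the BeamStats implementation: the class MomentSketch and its methods are hypothetical, and it assumes population (divide by n) central moments, with skew and kurtosis standardized by the variance. BeamStats may use different conventions, such as sample variance or excess kurtosis.

```java
import java.util.List;

// Hypothetical helper, not part of BeamStats: population moments of a list of doubles.
public final class MomentSketch {
    private MomentSketch() {}

    // Arithmetic mean of the values.
    public static double mean(List<Double> xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.size();
    }

    // k-th central moment: the average of (x - mean)^k.
    public static double centralMoment(List<Double> xs, int k) {
        double m = mean(xs);
        double sum = 0.0;
        for (double x : xs) {
            sum += Math.pow(x - m, k);
        }
        return sum / xs.size();
    }

    // Assumption: population variance (divides by n, not n - 1).
    public static double variance(List<Double> xs) {
        return centralMoment(xs, 2);
    }

    // Assumption: standardized third moment, m3 / variance^1.5.
    public static double skew(List<Double> xs) {
        return centralMoment(xs, 3) / Math.pow(variance(xs), 1.5);
    }

    // Assumption: standardized fourth moment, m4 / variance^2 (not excess kurtosis).
    public static double kurtosis(List<Double> xs) {
        double v = variance(xs);
        return centralMoment(xs, 4) / (v * v);
    }

    public static void main(String[] args) {
        List<Double> xs = List.of(1.0, 2.0, 3.0, 4.0, 5.0);
        System.out.println("mean     = " + mean(xs));
        System.out.println("variance = " + variance(xs));
        System.out.println("skew     = " + skew(xs));
        System.out.println("kurtosis = " + kurtosis(xs));
    }
}
```

Comparing these values against the output of Moment.globally on a small input is a quick way to find out which conventions the library actually follows.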

Apply SimpleRegression.globally to a PCollection<Row> of 2D points to fit a simple linear regression.

import com.danielor.beamstats.SimpleRegression;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.Row;

// Each input row must use the schema SimpleRegression.point2dSchema.
PCollection<Row> input = buildInput();

// The output rows use the schema SimpleRegression.outputSchema.
PCollection<Row> regression = input.apply(SimpleRegression.globally());
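
As a point of comparison, a simple linear regression can be sketched in plain Java with ordinary least squares. This is an assumption about the method, not a description of how BeamStats computes it: the class RegressionSketch and its fit method are hypothetical, and the fields that SimpleRegression.outputSchema actually exposes are not shown here.

```java
// Hypothetical helper, not part of BeamStats: ordinary least squares for y = slope * x + intercept.
public final class RegressionSketch {
    private RegressionSketch() {}

    // Returns a two-element array {slope, intercept}.
    public static double[] fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need at least two (x, y) pairs of equal length");
        }
        int n = x.length;

        // Means of x and y.
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        // Sum of cross-deviations and sum of squared x-deviations.
        double sxy = 0.0;
        double sxx = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            sxy += dx * (y[i] - meanY);
            sxx += dx * dx;
        }
        if (sxx == 0.0) {
            throw new IllegalArgumentException("all x values are identical; slope is undefined");
        }

        double slope = sxy / sxx;
        double intercept = meanY - slope * meanX;
        return new double[] {slope, intercept};
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0, 3.0, 4.0};
        double[] y = {3.0, 5.0, 7.0, 9.0};
        double[] line = fit(x, y);
        System.out.println("slope = " + line[0] + ", intercept = " + line[1]);
    }
}
```

The slope is the ratio of the covariance of x and y to the variance of x, and the intercept places the line through the point of means.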
