bistats

~~CLOSETOC~~

~~TOC 1-3 wide~~

```juliarepl
julia> pkgchk.( [ "julia" => v"1.0.3", "DataFrames" => v"0.14.1", "GLM" => v"1.0.1", "Loess" => v"0.5.0", "Plots" => v"0.21.0" ] );

```

# Bivariate Relations

## Regressing Y on X

Bivariate regressions are simply a special case of [[multistats|Multivariate Regressions]].

## Splines and Loess

Loess has become the most popular method for fitting data in two dimensions without a functional specification. It runs regressions localized around each point, akin to splines. The canonical example, adapted from the [Loess.jl](https://github.com/JuliaStats/Loess.jl) docs, is:

```juliarepl
julia> using Loess; using Random;

julia> Random.seed!(0); xs= sort(10 .* rand(100)); ys= sin.(xs) .+ 0.5 * rand(100);  ## sample points to fit

julia> model= loess(xs, ys);                                   ## the loess engine

julia> xpoints= collect(minimum(xs):0.1:maximum(xs));          ## where to fit

julia> ypoints= Loess.predict(model, xpoints);                 ## the fitted values

julia> (hcat(xpoints,ypoints))[1:5, :]
5×2 Array{Float64,2}:
 0.353445  0.896297
 0.453445  0.92369
 0.553445  0.946465
 0.653445  0.964635
 0.753445  0.978406

julia> using Plots

julia> plot( xs, ys, seriestype= :scatter, legend= false )

julia> plot!( xpoints, ypoints )

julia> savefig("plotting/loess.png");

```

See [[graphing|Graphing and Plotting]].
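The "Regressing Y on X" subsection above carries no code. A minimal sketch with GLM.jl (the package pinned in the header) might look like the following; the data and the names `x`, `y`, `df`, and `ols` are illustrative, not from the text:

```julia
## Minimal bivariate-OLS sketch with GLM.jl; data and names are made up.
using DataFrames, GLM

x  = collect(1.0:10.0)
y  = 2.0 .+ 3.0 .* x            ## exact line, so the fit recovers (2, 3)
df = DataFrame(x=x, y=y)

ols = lm(@formula(y ~ x), df)   ## regress y on x (intercept included by default)

println(coef(ols))              ## [intercept, slope] ≈ [2.0, 3.0]
```

The same coefficients fall out of the multivariate machinery, which is why the text treats the bivariate case as a special case.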
## Vector and Matrix Moments for a Bivariate Matrix

The following can be useful when working with bivariate data:

```juliarepl
julia> using StatsBase, Statistics

julia> m= reshape( [1.0:6.0;], 2,3 )
2×3 Array{Float64,2}:
 1.0  3.0  5.0
 2.0  4.0  6.0

julia> mean( m )                ## overall mean
3.5

julia> mean( m, dims=1 )        ## column means
1×3 Array{Float64,2}:
 1.5  3.5  5.5

julia> mean( m, dims=2 )        ## row means
2×1 Array{Float64,2}:
 3.0
 4.0

julia> std( m )                 ## overall stddev
1.8708286933869707

julia> std( m, dims=1 )         ## column stddevs
1×3 Array{Float64,2}:
 0.707107  0.707107  0.707107

julia> std( m, dims=2 )         ## row stddevs
2×1 Array{Float64,2}:
 2.0
 2.0

```

### Vector Distances

StatsBase has [fast functions](http://juliastats.github.io/StatsBase.jl/latest/deviation.html) to calculate distances between two vectors, such as

* the number of equal elements (`counteq`) and of differing elements (`countne`),
* the sum of absolute differences (`L1dist`), and its mean (`meanad`),
* the sum of squared differences (`sqL2dist`), its mean (`msd`), and the Euclidean distance (`L2dist`),
* the root mean-squared deviation (`rmsd`), etc.

StatsBase also has [Scatter Matrix and Covariances](http://juliastats.github.io/StatsBase.jl/latest/cov.html) functionality, principally `cov()` and `cor()`.

# Backmatter

## Useful Packages on the Julia Repository

* [Mocha](https://github.com/pluskid/Mocha.jl) for neural-network learning

* [LightML](https://juliaobserver.com/packages/LightML) for many statistical techniques now grouped as "machine learning."

## Notes

* `linreg(x,y)` computed the coefficients of a bivariate regression in Julia 0.6; it was removed from Base in Julia 1.0, where `[ones(length(x)) x] \ y` (or GLM.jl, as above) serves the same purpose.

* See also [[unistats|Univariate Statistics]] and [[unitimeseries|Time-Series]].

## References

- [GLM.jl](https://github.com/JuliaStats/GLM.jl)
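The StatsBase deviation functions listed under "Vector Distances" can be exercised on a small example; the vectors `a` and `b` here are made up for illustration:

```julia
## Sketch of the StatsBase deviation functions; a and b are illustrative.
using StatsBase

a = [1.0, 2.0, 3.0, 4.0]
b = [1.0, 2.0, 4.0, 6.0]        ## elementwise differences: [0, 0, 1, 2]

@show counteq(a, b)             ## 2, number of equal elements
@show countne(a, b)             ## 2, number of differing elements
@show L1dist(a, b)              ## 3.0, sum of absolute differences
@show sqL2dist(a, b)            ## 5.0, sum of squared differences
@show L2dist(a, b)              ## √5 ≈ 2.236, Euclidean distance
@show rmsd(a, b)                ## √(5/4) ≈ 1.118, root mean-squared deviation
```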