User Tools

Site Tools


random

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

random [2018/12/27 13:27] (current)
Line 1: Line 1:
 +
 +~~CLOSETOC~~
 +
 +~~TOC 1-3 wide~~
 +
 +```juliarepl
 +julia> pkgchk.( [ "​julia"​ => v"​1.0.3",​ "​Distributions"​ => v"​0.16.4"​ ] );
 +
 +```
 +
 +
 +# Statistical Distributions and Random Variates
 +
 +An array of random numbers can be created using the `rand` or `randn` functions.
 +
 +The rand function can also be called as `rand(T, n)` to generate n random numbers of type `T`.
 +
 +Base julia has facilities to generate uniform numbers (rand) and Gaussian (Normal) random numbers (randn). ​ Other distributions rely on the [Distributions](https://​github.com/​JuliaStats/​Distributions.jl) package.
 +
 +```juliarepl
 +julia> using Distributions
 +```
 +
 +
 +
 +## Drawing Random Variables
 +
 +`rand( 1 )` draws a vector with 1 element, while `rand()` draws a scalar. ​ Thus, `y= 3.0 + rand( 1 )` results in a vector, not a scalar.
 +
 +The following is a selection of random numbers drawn from some common distributions. ​ For a more complete list of available distributions,​ refer to [Distributions](http://​juliastats.github.io/​Distributions.jl/​stable/​).
 +
 +
 +### Setting the Seed
 +
 +```juliarepl
 +julia> using Random; ​ Random.seed!(0);​
 +
 +```
 +
 +
 +
 +### Uniform Floats
 +
 +```juliarepl
 +julia> using Random; Random.seed!(0);​
 +
 +julia> rand(3,​2) ​           ## multiple int args are array dimensions
 +3×2 Array{Float64,​2}:​
 + ​0.823648 ​ 0.177329
 + ​0.910357 ​ 0.27888
 + ​0.164566 ​ 0.203477
 +```
 +
 +### Uniform Integers
 +
 +```juliarepl
 +julia> using Random; Random.seed!(0); ​
 +
 +julia> rand(Int, 3)         ## Types as argument to rand mean over the whole range
 +3-element Array{Int64,​1}:​
 + ​1339893410598768192
 + ​1575814717733606317
 + ​2119082727832020007
 +```
 +
 +
 +### Uniform Ranged Integers (e.g., Bernoullis)
 +
 +```juliarepl
 +julia> using Random, Distributions;​ Random.seed!(0);​
 +
 +julia> rand( 1:10, 3 ) ## 3 draws from 1 to 10, equally likely
 +3-element Array{Int64,​1}:​
 +  1
 +  3
 + 10
 +
 +julia> Random.seed!(0);​
 +
 +julia> rand( 0:1, 10) ## 10 draws from 0 to 1, equally likely
 +10-element Array{Int64,​1}:​
 + 0
 + 0
 + 1
 + 1
 + 0
 + 0
 + 0
 + 1
 + 1
 + 0
 +
 +julia> Random.seed!(0);​
 +
 +julia> rand( Bernoulli( 0.15 ), 10 )    ## 10 draws, each is 1 with prob 0.15
 +10-element Array{Int64,​1}:​
 + 0
 + 0
 + 0
 + 0
 + 0
 + 0
 + 1
 + 1
 + 0
 + 0
 +
 +```
 +
 +* Note: If speed is of the essence, there are faster methods than `rand()`. ​ For example, to draw 1,000 uniform numbers between 1 and 4, you can replace `[ rand(1:4) for i=1:1000 ]` with `rand( Random.SamplerRangeFast(1:​4),​ 1000)` or  with `1 .+ Int.(floor.(rand(1000)*4))`.
 +
 +
 +### Binomial Integers
 +
 +```juliarepl
 +julia> using Random, Distributions; ​ Random.seed!(0);​
 +
 +julia> rand( Binomial( 10, 0.2 ), 10 )   ## each draw has 0.2 prob of 1, summed up 10 times
 +10-element Array{Int64,​1}:​
 + 3
 + 4
 + 1
 + 1
 + 1
 + 1
 + 0
 + 0
 + 1
 + 5
 +```
 +
 +
 +### Poissons
 +
 +```juliarepl
 +julia> using Random, Distributions; ​ Random.seed!(0);​
 +
 +julia> rand( Poisson( 2 ), 20 )
 +20-element Array{Int64,​1}:​
 + 3
 + 4
 + 1
 + 1
 + 1
 + 1
 + 0
 + 0
 + 1
 + 5
 + 2
 + 2
 + 1
 + 4
 + 1
 + 2
 + 2
 + 4
 + 5
 + 3
 +
 +```
 +
 +
 +### Random Orthogonal Matrix
 +
 +```juliarepl
 +julia> using Random, LinearAlgebra; ​ Random.seed!(0);​
 +
 +julia> QR= qr(randn(3,​3)); ​       ## QR is already orthogonal
 +
 +julia> Q= QR.Q * Diagonal(sign.(diag(QR.R))) ​       ## orthogonal *and* uniformly distributed
 +3×3 Array{Float64,​2}:​
 +  0.60212 ​  ​-0.466433 ​ -0.647991
 +  0.7345 ​    ​0.641769 ​  ​0.220551
 + ​-0.312988 ​  ​0.608747 ​ -0.729017
 +
 +julia> Q'​*Q ​               ## show that we have orthogonal vectors
 +3×3 Array{Float64,​2}:​
 +  1.0          -1.94289e-16 ​ 0.0
 + ​-1.94289e-16 ​  ​1.0 ​         1.66533e-16
 +  0.0           ​1.66533e-16 ​ 1.0
 +
 +```
 +
 +
 +
 +
 +### Shuffling an Array
 +
 +Shuffled values from 1:N can be used as indices to rearrange N vector elements.
 +
 +```juliarepl
 +julia> using Random, Distributions; ​ Random.seed!(0);​
 +
 +julia> Random.seed!(0);​ randperm(5)
 +5-element Array{Int64,​1}:​
 + 5
 + 4
 + 3
 + 1
 + 2
 +
 +julia> Random.seed!(0);​ shuffle( collect(1:​5) ) ## a different way of shuffling
 +5-element Array{Int64,​1}:​
 + 5
 + 4
 + 2
 + 3
 + 1
 +
 +julia> Random.seed!(0);​ sample( collect(10:​20),​ 3 ) ## Pick 3 random numbers from 10:20 array
 +3-element Array{Int64,​1}:​
 + 10
 + 12
 + 19
 +
 +```
 +
 +StatsBase offers many [sampling variants](http://​juliastats.github.io/​StatsBase.jl/​latest/​sampling.html).
 +
 +
 +## Drawing "​Perfect-Density"​ Random Variables
 +
 +Independent random draws do not have empirical densities that perfectly match their theoretical distributions. ​ For example, a million fair coin flips may have 500,021 heads, rather than 500,000 heads. ​ Sometimes, this type of true randomness is desirable. ​ Often, it is not and simply adds extra noise where it is not desired. ​ Users may want to have random draws on each draw, but have exactly 500,000 heads and 500,000 tails. ​ The randomness is in when what value shows up.
 +
 +The following draws from a "​perfect density"​ distribution,​ guaranteed to match the theoretical density function, with randomness in when each value is drawn:
 +
 +
 +```juliarepl
 +julia> using Random, Distributions; ​ Random.seed!(0); ​
 +
 +julia> randPerfect(d::​UnivariateDistribution,​ n::​Integer)= shuffle!(quantile.(d,​ (1:​n)/​(n+1)));​
 +
 +julia> randPerfect( Normal(0,​1),​ 5 ) ## Example: draws from a normal distribution
 +5-element Array{Float64,​1}:​
 +  0.9674215661017014
 +  0.43072729929545733
 + ​-0.4307272992954576
 +  0.0
 + ​-0.9674215661017014
 +
 +```
 +
 +## Fitting Parameters of Statistical Distributions To Data
 +
 +```juliarepl
 +julia> using Random, Distributions; ​ Random.seed!(0);​
 +
 +julia> fit_mle( Normal, rand( Normal( 10 , 5 ) , 10_000_000 ) )    ## Fitting the correct distribution -> recover params
 +Normal{Float64}(μ=9.99913328006548,​ σ=5.000592281736906)
 +
 +julia> fit_mle( Normal, rand(1_000_000) )         ## Fitting the wrong distribution--a normal fitted to uniforms
 +Normal{Float64}(μ=0.5001626330884961,​ σ=0.288518549254692)
 +
 +```
 +
 +
 +### Testing for Statistical Distributions
 +
 +See [[unistats]] for tests whether a set of random draws come from a particular distributin.
 +
 +
 +## The Gaussian (Normal) Distribution
 +
 +### Random Numbers
 +
 +```juliarepl
 +julia> using Random; ​ Random.seed!(0); ​
 +
 +julia> randn(3)
 +3-element Array{Float64,​1}:​
 +  0.6791074260357777
 +  0.8284134829000359
 + ​-0.3530074003005963
 +
 +```
 +
 +
 +In distributions,​
 +
 +```juliarepl
 +julia> using Random, Distributions; ​ Random.seed!(0);​
 +
 +julia> rand( Normal(0,​1),​ 4 )
 +4-element Array{Float64,​1}:​
 +  0.6791074260357777
 +  0.8284134829000359
 + ​-0.3530074003005963
 + ​-0.13485387193052173
 +
 +```
 +
 +See also the example of perfect-density draws above, which also generate Normal variates.
 +
 +
 +### Multivariate Gaussian (Normal) Distribution with Covariance
 +
 +
 +```juliarepl
 +
 +julia> using Random, Distributions;​ Random.seed!(0);​
 +
 +julia> gen3dim= MvNormal( 3, 100.0 )    ## generate 3-dim normal SD=100
 +ZeroMeanIsoNormal(
 +dim: 3
 +μ: [0.0, 0.0, 0.0]
 +Σ: [10000.0 0.0 0.0; 0.0 10000.0 0.0; 0.0 0.0 10000.0]
 +)
 +
 +
 +julia> rand( gen3dim, 5 )
 +3×5 Array{Float64,​2}:​
 +  67.9107 ​ -13.4854 ​   6.49475 ​ 157.433 ​   39.7482
 +  82.8413 ​  ​58.6617 ​ -10.9017 ​  ​-68.8907 ​  ​81.163
 + ​-35.3007 ​  ​29.7336 ​ -51.421 ​   -76.2804 ​ -34.6355
 +
 +julia> Random.seed!(0);​
 +
 +julia> x= 10.0*randn(1000);​ y= 10.0*randn(1000)+x; ​ covmat= cov(hcat( x, y )); x= y= nothing
 +
 +julia> gen2dim= MvNormal( [10.0, 20.0], covmat ) ## specific mean and covariance matrix
 +FullNormal(
 +dim: 2
 +μ: [10.0, 20.0]
 +Σ: [106.834 98.4571; 98.4571 190.721]
 +)
 +
 +
 +julia> rand( gen2dim, 5 )
 +2×5 Array{Float64,​2}:​
 + ​11.4939 ​ 23.7481 ​ 17.1309 ​ -1.47097 ​ -8.87016
 + ​30.0597 ​ 33.7182 ​ 14.2002 ​ 13.7076 ​   5.54033
 +
 +```
 +
 +
 +### Probability Density Function (PDF)
 +
 +```juliarepl
 +julia> using Random, Distributions;​ Random.seed!(0);​
 +
 +julia> pdf.( Normal(0,​1),​ [ -2.0, 0, 2.0 ] )  ### use logpdf for tiny values
 +3-element Array{Float64,​1}:​
 + ​0.05399096651318806
 + ​0.3989422804014327
 + ​0.05399096651318806
 +```
 +
 +### Log Likelihood
 +
 +```juliarepl
 +julia> using Random, Distributions;​ Random.seed!(0);​
 +
 +julia> loglikelihood( Normal(0,​1),​ rand( Normal(0,​1),​ 1_000 ) )
 +-1452.9989715615147
 +
 +```
 +
 +
 +### Cumulative Density Funtion (CDF)
 +
 +```juliarepl
 +julia> using Distributions;​ Random.seed!(0);​
 +
 +julia> cdf.( Normal(0,​1),​ [ -2.0, 0, 2.0 ] )  ### use logcdf for tiny values
 +3-element Array{Float64,​1}:​
 + ​0.022750131948179205
 + 0.5
 + ​0.9772498680518208
 +
 +```
 +
 +### Inverse CDF
 +
 +```juliarepl
 +julia> using Distributions;​ Random.seed!(0);​
 +
 +julia> ​ quantile.( Normal(0,​1),​ [0.025,​0.05,​0.5,​0.95,​0.975] )
 +5-element Array{Float64,​1}:​
 + ​-1.9599639845400592
 + ​-1.6448536269514724
 +  0.0
 +  1.6448536269514717
 +  1.9599639845400576
 +```
 +
 +
 +
 +
 +## Histograms
 +
 +```juliarepl
 +julia> using Random, Distributions,​ StatsBase; Random.seed!(0);​
 +
 +julia> x= rand( Normal(0,​1),​ 10_000);
 +
 +julia> fit( Histogram, x , closed= :left, nbins=10 ​ )
 +Histogram{Int64,​1,​Tuple{StepRangeLen{Float64,​Base.TwicePrecision{Float64},​Base.TwicePrecision{Float64}}}}
 +edges:
 +  -4.0:​1.0:​5.0
 +weights: [10, 224, 1398, 3426, 3373, 1342, 213, 13, 1]
 +closed: left
 +isdensity: false
 +
 +julia> fit( Histogram, x , closed= :right, nbins=10 ​ )
 +Histogram{Int64,​1,​Tuple{StepRangeLen{Float64,​Base.TwicePrecision{Float64},​Base.TwicePrecision{Float64}}}}
 +edges:
 +  -4.0:​1.0:​5.0
 +weights: [10, 224, 1398, 3426, 3373, 1342, 213, 13, 1]
 +closed: right
 +isdensity: false
 +```
 +
 +Julia'​s [[plotting]] makes it easy to plot histograms:
 +
 +```juliaeval
 +using Plots
 +using Distributions,​ StatsBase; n=1_000; rn= quantile.( Normal(0.0,​1.0),​ [1:​n;​]/​(n+1));​
 +histogram( rn.^2, legend=false ); # a chi-squared
 +savefig( "​plotting/​histogram1.png"​ );
 +```
 +
 +{{histogram1.png}}
 +
 +
 +
 +
 +* `StatsBase` contains an `ecdf` function that is the direct empirical cdf, whereas `EmpiricalUnivariateDistribution` produces an distribution function for input into cdf, pdf, rand, etc.
 +
 +
 +
 +# Backmatter
 +
 +## Useful Packages on Julia Repository
 +
 +## Notes
 +
 +* EmpiricalUnivariateDistribution is [broken](https://​github.com/​JuliaStats/​Distributions.jl/​pull/​800).
 +
 +```julianoeval
 +julia> using Random, Distributions,​ Plots; Random.seed!(0);​
 +
 +julia> xr= -4.0:​0.01:​4.0;​ plot( xr, cdf.( Normal(0,​1),​ xr ), legend= false) ## plots the standard normal first
 +
 +julia> x=  max.(randn(10000),​ (rand(10000) .- 0.5));
 +
 +julia> empcdfun= EmpiricalUnivariateDistribution( x );
 +
 +julia> cdf.( empcdfun, [-0.5,​-0.25,​0.0,​0.5,​ 10.0] )
 +5-element Array{Float64,​1}:​
 + 0.0
 + ​0.0998
 + ​0.2453
 + ​0.6972
 + 1.0
 +
 +julia> rand( empcdfun, 4 )
 +4-element Array{Float64,​1}:​
 + ​0.43878137920825677
 + ​0.17368409076394875
 + ​0.17510611936949205
 + ​1.0969671028632133
 +
 +julia> plot!( x -> cdf(empcdfun,​ x) )
 +
 +```
 +
 +## References
  
random.txt · Last modified: 2018/12/27 13:27 (external edit)