snippet.juliarepl
julia> pkgchk( [ "julia" => v"1.0.2", "CategoricalArrays" => v"0.4.0" ] )

Vectors (1-Dimensional Arrays)

  • Vector is an alias for a 1-dimensional array.
snippet.juliarepl
julia> Vector{Float64}
Array{Float64,1}
  • All vectors are, by default, (displayed as) column vectors.
  • Vectors come with stack, queue, and set operations. Inside the index operator, end refers to the last index, which for an ordinary vector equals length().
  • Row vectors are rarely useful. In fact, Julia has both transposed-vector wrapper types (Transpose and Adjoint) and matrices with one row and many columns. Matrices are discussed in the arraysmatrix chapter.
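
As a small illustration of end inside the index operator (a sketch; the variable names are made up for this example):

```julia
x = [10, 20, 30, 40]

last_elem   = x[end]       # same as x[length(x)]
second_last = x[end-1]     # arithmetic on end is allowed
tail        = x[2:end]     # everything from the second element on

println(last_elem, " ", second_last, " ", tail)
```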

Creating a Vector from Values

A vector can be created by specifying all elements in a comma-separated list. (Space-separated values create matrices.) Mixed numeric elements are usually promoted to a common type as expected; for example, [1, 2.0] becomes a Float64 vector.
FIXME add comprehensions from Special Loops for Vectors and Tuples. perhaps add for vectors of other items, like structs
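
Besides literal lists, vectors can also be built with comprehensions. The following is only a minimal sketch with made-up variable names, not the fuller treatment from the Special Loops chapter:

```julia
squares = [ i^2 for i in 1:5 ]               # one element per loop value
evens   = [ i for i in 1:10 if iseven(i) ]   # comprehension with a filter
halves  = Float64[ i/2 for i in 1:3 ]        # element type stated explicitly

println(squares, " ", evens, " ", halves)
```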

Column Vectors

The default orientation of Vector is column.

snippet.juliarepl
julia> x1= [ 1, 2, 3 ]
3-element Array{Int64,1}:
 1
 2
 3

julia> show( IOContext(stdout, :compact => true), x1 )              ## compact display
[1, 2, 3]

A column vector is not the same as a matrix with 1 column:

snippet.juliarepl
julia> cv= fill(0.0, 3)
3-element Array{Float64,1}:
 0.0
 0.0
 0.0

julia> mcv= fill(0.0, 3, 1)            ## 3 by 1 array = matrix
3×1 Array{Float64,2}:
 0.0
 0.0
 0.0

julia> typeof(cv) == typeof(mcv)
false

Row Vectors

Julia's row vectors are rarely useful, which is why the Transpose and Adjoint wrapper types live in the LinearAlgebra standard library and are not part of base Julia. Julia understands that a row within a one-row matrix is different from a row vector.

snippet.juliarepl
julia> [1 2 3]                        ## space, instead of comma: one row in a two-dimensional 1x3 matrix!!  
1×3 Array{Int64,2}:
 1  2  3

julia> using LinearAlgebra

julia> Transpose( [1,2,3] )           ## confirm this is not exactly the same type.
1×3 Transpose{Int64,Array{Int64,1}}:
 1  2  3

Initializing a Vector of Specific Type (e.g., Float32)

snippet.juliarepl
julia> [1.0,NaN,2.0]                          ## Floats default to Float64
3-element Array{Float64,1}:
   1.0
 NaN
   2.0

julia> Vector{Float32}( [1.0,NaN,2.0] )       ## Force Float32 vector
3-element Array{Float32,1}:
     1.0
 NaN
     2.0
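
A typed literal gives the same result more directly. A brief sketch (the variable names are illustrative only):

```julia
a = Float32[1.0, NaN, 2.0]                       # typed array literal
b = Vector{Float32}([1.0, NaN, 2.0])             # constructor, as above
c = convert(Vector{Float32}, [1.0, NaN, 2.0])    # explicit conversion

# isequal (unlike ==) treats NaN as equal to NaN
println(eltype(a), " ", isequal(a, b), " ", isequal(b, c))
```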

Checking whether a Vector is Numeric

snippet.juliarepl
julia> isnumeric(x::Vector)::Bool= (eltype(x) <: Union{Missing,Real});

julia> for x in ( [1,2], [1,2.0], [1,2.0,missing], [1,'a',2] ); @info("'$x' is **$(isnumeric(x))**"); end
[ Info: '[1, 2]' is **true**
[ Info: '[1.0, 2.0]' is **true**
[ Info: 'Union{Missing, Float64}[1.0, 2.0, missing]' is **true**
[ Info: 'Any[1, 'a', 2]' is **false**

Initializing an Array with Same Values

For good programming practice, avoid uninitialized arrays and always initialize your variables. A common initialization is to zero or NaN, which can be achieved, e.g., with zeros, fill, or repeat:

snippet.juliarepl
julia> zeros(2)                         ## built in
2-element Array{Float64,1}:
 0.0
 0.0

julia> fill( NaN, 3 )                   ## usually best for initialization
3-element Array{Float64,1}:
 NaN
 NaN
 NaN

julia> using LinearAlgebra

julia> repeat( [ NaN, 1.0 ], 3 )
6-element Array{Float64,1}:
 NaN
   1.0
 NaN
   1.0
 NaN
   1.0

Do not use fill as a substitute for repeat. Given a vector, fill creates a vector of vectors (all referring to the same inner array):

snippet.juliarepl
julia> fill( [ NaN, 1.0 ], 3 )
3-element Array{Array{Float64,1},1}:
 [NaN, 1.0]
 [NaN, 1.0]
 [NaN, 1.0]

These functions also work on higher-dimensional arrays. Matrices also often use fill(). The type-specialized zeros(Float64,5,5) (five by five zeros) and ones(Bool,4,4) (four by four trues), as well as the rand(3,5) function explained in Random Variables, can also initialize arrays of any dimension.
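
A short sketch of the same initializers on two-dimensional arrays (smaller sizes than in the text, chosen only to keep the example compact):

```julia
z = zeros(Float64, 2, 3)     # 2 by 3 matrix of 0.0
t = ones(Bool, 2, 2)         # 2 by 2 matrix of true
n = fill(NaN, 2, 3)          # 2 by 3 matrix of NaN

println(size(z), " ", all(t), " ", all(isnan, n))
```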

Printing a Numeric Vector With a Particular Format

snippet.juliarepl
julia> using Formatting

julia> join( sprintf1.("%4.1f", [ 0.123, pi, exp(3.0) ]), " : " )
" 0.1 :  3.1 : 20.1"

Stack and Queue Operations

As already explained in the Inquiring chapter, functions with a trailing '!' modify their first argument, while functions without it just return a result and leave the contents of their first argument untouched.
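
The sort and sort! pair shows the convention in miniature (a sketch with illustrative variable names):

```julia
v = [3, 1, 2]

s = sort(v)      # returns a sorted copy; v is left alone
before = copy(v) # snapshot of v after sort(), still unsorted

sort!(v)         # trailing '!': sorts v itself

println(s, " ", before, " ", v)
```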

push!, pop!, pushfirst!, popfirst! (formerly unshift! and shift!)

snippet.juliarepl
julia> x= [1,2,3]; y= append!(x,4); (y === x)	## both y and x have 4 elements now
true

julia> x= [1,2,3];

julia> pushfirst!(x,0)				## shift right and put 0 into pos 1 (often also called unshift)
4-element Array{Int64,1}:
 0
 1
 2
 3

julia> println( popfirst!(x) , "   ", x)	## drop 0 from pos 1 and shift left (formerly called shift!)
0   [1, 2, 3]

julia> push!(x,4)               		## add 4 at the end, like append!
4-element Array{Int64,1}:
 1
 2
 3
 4

julia> println( pop!(x) , "   ", x)		## drop 4 from the end
4   [1, 2, 3]

Appending (Values or Arrays) to an Array

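vcat, append!, and push! all add elements at the end of a vector. vcat returns a new array and leaves its arguments untouched, while push! and append! modify their first argument.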

snippet.juliarepl
julia> x= [1,2,3];

julia> vcat(x,4)                   ## appends 4, but does not modify x
4-element Array{Int64,1}:
 1
 2
 3
 4

julia> x
3-element Array{Int64,1}:
 1
 2
 3

julia> push!(x,4); x               ## push!() acts like an in-place vcat(); appends 4 and modifies x
4-element Array{Int64,1}:
 1
 2
 3
 4
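
The difference between the three shows up when the second argument is itself an array. A minimal sketch (variable names are illustrative):

```julia
x = [1, 2, 3]

y = vcat(x, [4, 5])       # new array; x still has 3 elements
n_before = length(x)

append!(x, [4, 5])        # in place; adds each element of the collection
push!(x, 6)               # in place; adds exactly one element

nested = Any[1, 2]
push!(nested, [3, 4])     # the whole array becomes a single (third) element

println(y, " ", x, " ", nested)
```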

Flat and Nested Joins

snippet.juliarepl
julia> a1=[ "a", "b" ]; a2= [ "c", "d" ];

julia> [ a1, a2 ]                               ## not auto-flattened
2-element Array{Array{String,1},1}:
 ["a", "b"]
 ["c", "d"]

julia> vcat( a1, a2 )                           ## flattened
4-element Array{String,1}:
 "a"
 "b"
 "c"
 "d"

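When the pieces are already stored in a vector of vectors, one level of nesting can be removed with reduce or with splatting. A sketch, with made-up variable names:

```julia
parts = [ ["a", "b"], ["c", "d"], ["e"] ]

flat1 = reduce(vcat, parts)    # concatenate all inner vectors
flat2 = vcat(parts...)         # same result via splatting

println(flat1)
```
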
Flattening Everything (Vectors of Vectors)

snippet.juliarepl
julia> using Base.Iterators

julia> function flattenall(a::AbstractArray)
           while ( any( x->(typeof(x) <: AbstractArray), a ) )
               a= collect(Base.Iterators.flatten(a))
           end
           a
       end#function##
flattenall (generic function with 1 method)

julia> flattenall( [ [1,2], 1, [[3]] ] )                   ## PS: you may want to convert this to Vector{Int}
4-element Array{Int64,1}:
 1
 2
 1
 3
  • You can also often flatten and nest via reshaping. For example,
snippet.juliarepl
julia> using Base.Iterators

julia> arr= collect( product( [1,2,3], [4,5] ) )
3×2 Array{Tuple{Int64,Int64},2}:
 (1, 4)  (1, 5)
 (2, 4)  (2, 5)
 (3, 4)  (3, 5)


julia> reshape( ans, (6,1) )	## A flatter list
6×1 Array{Tuple{Int64,Int64},2}:
 (1, 4)
 (2, 4)
 (3, 4)
 (1, 5)
 (2, 5)
 (3, 5)

julia> reshape( ans, (2,3) )
2×3 Array{Tuple{Int64,Int64},2}:
 (1, 4)  (3, 4)  (2, 5)
 (2, 4)  (1, 5)  (3, 5)
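
Note that a 6×1 result is still a matrix. To obtain a true one-dimensional vector, vec or a one-dimensional reshape can be used. A small sketch with an integer matrix (names are illustrative):

```julia
m = reshape(collect(1:6), (3, 2))   # 3 by 2 matrix, filled column by column

v = vec(m)           # one-dimensional; shares its data with m
r = reshape(m, 6)    # also one-dimensional

println(v, " ", ndims(v), " ", ndims(m))
```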

Arithmetic Operations With Flattening

snippet.juliarepl
julia> @. [ 10 + [-1,+1]*2; 20 + [-3,+5]^1 ]
4-element Array{Int64,1}:
  8
 12
 17
 25

julia> vcat(@. (10 + (-i:i)*2 for i=1:2)...)
8-element Array{Int64,1}:
  8
 10
 12
  6
  8
 10
 12
 14

Leading or Lagging Arrays

  • A lag() function pushes NaN or missing onto the front of a vector and pops off the vector's last element.
  • The functions chapter defines a good lag() vector function that also handles missing well.
  • ShiftedArrays.jl offers similar functionality (and even some event-study capability to shift multiple vectors for alignment). It is a little more efficient because it works with views rather than copies, but it always uses missing instead of NaN. missing works for all types, but is much slower.
  • TimeSeries.jl Package provides TimeArray object lead() and lag() functions. It is covered in Univariate Timeseries.
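
The first bullet can be sketched in a few lines. This is only an illustration for Float64 vectors with NaN padding; the names lagvec and leadvec are hypothetical and this is not the lag() function from the functions chapter:

```julia
# Hypothetical helper: shift values one position later, padding with NaN
function lagvec(x::Vector{Float64})
    isempty(x) && return copy(x)
    y = copy(x)
    pop!(y)                # drop the last element
    pushfirst!(y, NaN)     # put NaN in front
    y
end

# Hypothetical helper: shift values one position earlier, padding with NaN
function leadvec(x::Vector{Float64})
    isempty(x) && return copy(x)
    y = copy(x)
    popfirst!(y)           # drop the first element
    push!(y, NaN)          # put NaN at the end
    y
end

println(lagvec([1.0, 2.0, 3.0]), " ", leadvec([1.0, 2.0, 3.0]))
```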

Reversing Arrays

snippet.juliarepl
julia> reverse([1,2,3])
3-element Array{Int64,1}:
 3
 2
 1

Like sort, the reverse function creates a copy of the argument array and reverses that. If you'd like to reverse the input array, use the reverse! function.
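
A brief sketch of the copying and the in-place variant side by side (variable names are illustrative):

```julia
x = [1, 2, 3]

y = reverse(x)        # reversed copy; x is unchanged
unchanged = copy(x)

reverse!(x)           # reverses x itself

println(y, " ", unchanged, " ", x)
```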

Iterating over Arrays

Many use cases for arrays involve iterating over each element, and applying some operation on each element. The result could be a new array, changes to the same array, or an aggregate of some kind.

Julia provides the in operator, which together with a for loop iterates over every element of an array.

Iterating over Array Contents

snippet.juliarepl
julia> for item in ["A","B","C"];  println(item);  end
A
B
C

Iterating Circularly over Array Contents

snippet.juliarepl
julia> circidx( i::Int, arrlen::Int )::Int=  mod( i-1, arrlen ) + 1;

julia> for i=1:9; println( i, " => ", circidx(i,4) ); end
1 => 1
2 => 2
3 => 3
4 => 4
5 => 1
6 => 2
7 => 3
8 => 4
9 => 1
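
Base Julia's mod1 function computes the same one-based wrap-around index. The sketch below restates circidx for comparison and uses mod1 to pick elements circularly (the letters vector is made up for this example):

```julia
circidx(i::Int, arrlen::Int) = mod(i - 1, arrlen) + 1

letters = ["A", "B", "C", "D"]

# walk past the end of the vector and wrap around
picked = [ letters[mod1(i, length(letters))] for i in 1:6 ]

# the two formulations agree, also for zero and negative i
agree = all(circidx(i, 4) == mod1(i, 4) for i in -10:10)

println(picked, " ", agree)
```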

Iterating over Array Index and Contents

snippet.juliarepl
julia> for (index,item) in enumerate(["A","B","C"]);  println("$(index) -> $(item)");  end#for
1 -> A
2 -> B
3 -> C

Iterating over Array Index

snippet.juliarepl
julia> for index in eachindex(["A","B","C"]);  println(index);  end
1
2
3

Iterating Synchronously over Multiple Arrays

snippet.juliarepl
julia> for (item1,item2) in zip(["A","B","C"], ["a","b","c"]);  println(item1, " ", item2);  end
A a
B b
C c

Applying the same Function to each Element (Map)

Most of the time, you can just use a 'postfix-dot' function equivalent call (e.g., sqrt.([1.0,2.0])) to apply a function to each element of a vector. Sometimes, the map() function is more convenient:

snippet.juliarepl
julia> map( x->(x+1), [1,2,3] )
3-element Array{Int64,1}:
 2
 3
 4
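
For comparison, the dot-call forms give the same result as map. A minimal sketch (variable names are illustrative):

```julia
v = [1, 2, 3]

a = map(x -> x + 1, v)        # explicit map
b = v .+ 1                    # dot operator
c = (x -> x + 1).(v)          # dot call on an anonymous function
r = sqrt.([1.0, 4.0, 9.0])    # dot call on a named function

println(a, " ", b, " ", c, " ", r)
```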

For very large arrays and slow functions, the pmap function (from the Distributed standard library) can perform the operation in parallel, but it rarely pays. See Parallel Processing.

Element-Wise Comparisons (Dot Operators)

Use .== (and the equivalent dot operators):

snippet.juliarepl
julia> [ 1, 5, 1, 5, 1, 5 ] .>= [ 2, 10, 10, 2, 1, 5 ]
6-element BitArray{1}:
 false
 false
 false
  true
  true
  true

Or use map:

snippet.juliarepl
julia> y = [ 2, 1, 1, 1, 1 ];  x = [ 1, 3, 5, 0, 2 ];

julia> map( (x,y)->((x>y) ? 1 : 2), x, y )
5-element Array{Int64,1}:
 2
 1
 1
 2
 1

Larger or Smaller Elements of Two Equal-Length Arrays

To find the maximum or minimum values across two arrays (generating an array with the larger of the two values at each index), use the max. or min. function:

snippet.juliarepl
julia> max.( [1,3,5,0,2] , [2,1,1,1,1] )
5-element Array{Int64,1}:
 2
 3
 5
 1
 2

julia> min.( [1,3,5,0,2] , [2,1,1,1,1] )
5-element Array{Int64,1}:
 1
 1
 1
 0
 1

Applying Operation and Aggregating (Map Reduce)

To perform an operation on every element in the array, and then perform some sort of aggregation on the result (min, max, sum, etc.), use mapreduce:

snippet.juliarepl
julia> mapreduce(x->x^2, +, [1,2,3,4,5])       ## sum of squares of first five numbers
55

This is often faster than sum(map(x->x^2, [1,2,3,4,5])), because mapreduce does not need to allocate the intermediate array. (In this example, mapreduce() is twice as fast as map(x->x^2, [1,2,3,4,5]) |> sum.)
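
Several aggregating functions also accept the element function as their first argument, which reads like a specialized mapreduce. A short sketch (variable names are illustrative):

```julia
v = [1, 2, 3, 4, 5]

s1 = mapreduce(x -> x^2, +, v)         # general form
s2 = sum(x -> x^2, v)                  # sum with a function argument
s3 = sum(abs2, v)                      # abs2 is the built-in squaring function
m  = mapreduce(abs, max, [-7, 3, 5])   # largest absolute value

println(s1, " ", s2, " ", s3, " ", m)
```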

Recoding all Values in an Array

snippet.juliarepl
julia> using Missings

julia> v= [1.0, NaN, missing, 4.0]; replace( v, NaN => missing )       ## special case: could use Missings.replace(v, NaN)
4-element Array{Union{Missing, Float64},1}:
 1.0
  missing
  missing
 4.0
  • Missing is a type, missing is a value.
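
The distinction, and the way missing propagates through computations, can be sketched with base Julia functions only (variable names are illustrative):

```julia
v = [1.0, missing, 4.0]

is_value_of_type = missing isa Missing   # missing is the only value of type Missing
flags = ismissing.(v)                    # elementwise test
total = sum(v)                           # missing propagates: the sum is missing
clean = sum(skipmissing(v))              # skip the missing entries instead

println(eltype(v), " ", flags, " ", total, " ", clean)
```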

R-Like List Apply (lapply) and Parallel List Apply (mclapply)

snippet.juliarepl
julia> using CategoricalArrays

julia> function lapply(obj::AbstractVector, ind::AbstractVector, func::Function, x...)
           map( elem->(elem, func(obj[findall(elem .== ind)], x...)), levels(ind))
       end;#function##

julia> using Random

julia> Random.seed!(0);

julia> randcateg4 = rand('a':'d', 100);			## sample testset

julia> hundredsquares= [1:100;].^2 ;

julia> using Statistics

julia> lapply( hundredsquares, randcateg4, mean )	## for each element, by randcateg4, calculate mean
4-element Array{Tuple{Char,Float64},1}:
 ('a', 3575.25)
 ('b', 3704.0)
 ('c', 3516.217391304348)
 ('d', 2854.310344827586)

julia> lapply( hundredsquares, randcateg4, quantile, [0.25, 0.50, 0.75] )
4-element Array{Tuple{Char,Array{Float64,1}},1}:
 ('a', [885.25, 2234.0, 6321.0])
 ('b', [720.25, 3192.5, 5932.0])
 ('c', [626.0, 1936.0, 6376.5])
 ('d', [625.0, 2209.0, 4489.0])

See also Univariate Statistics -- Classifications for calculating an original-size vector of group means (R ave()).

SUGGESTION: AVOID Any (and Uninitialized) Arrays

An array of type Any can support elements of all types:

snippet.juliarepl
julia> Any[1, 1.0, "1", true]  ## often evil; avoid this when you know the types of elements
4-element Array{Any,1}:
 1
 1.0
 "1"
 true

You should avoid use of generic (Any) types, especially in arrays, unless you desperately need them (which is rare).

You can even create uninitialized arrays, whose values contain junk.

The following type combines both sins. It describes a 5-dimensional array of Any, which the Array constructor would leave uninitialized when called with undef:

snippet.juliarepl
julia> Array{Any, 5}  ## evil-squared
Array{Any,5}
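
For contrast, the sketch below shows what uninitialized construction looks like in Julia 1.0 syntax and how a typed, initialized alternative avoids it (variable names are illustrative):

```julia
junk = Vector{Float64}(undef, 3)   # contents are arbitrary until assigned
fill!(junk, NaN)                   # now every element is defined

anyvec = Vector{Any}(undef, 2)     # elements of an Any array start unassigned
assigned = isassigned(anyvec, 1)

typed = Float64[]                  # empty, but with a known element type
push!(typed, 1.5)

println(all(isnan, junk), " ", assigned, " ", typed)
```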

Range Iterators

Range objects can be viewed as smart unexpanded sequences (consecutive tuples or arrays). To expand a range object into an array, use the collect() function:

snippet.juliarepl
julia> 1:3
1:3

julia> typeof(1:3)
UnitRange{Int64}

julia> collect(1:3)
3-element Array{Int64,1}:
 1
 2
 3

julia> [1:3;]                    ## [ range ;]  (or [ range ... ]) means collect, too
3-element Array{Int64,1}:
 1
 2
 3

Arbitrary Steps

For steps other than 1, use the three-part start:step:stop form:

snippet.juliarepl
julia> collect(1:0.5:3)
5-element Array{Float64,1}:
 1.0
 1.5
 2.0
 2.5
 3.0
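
A range can be queried and indexed without expanding it, and negative steps count downward. A small sketch (variable names are illustrative):

```julia
r = 1:0.5:3

n    = length(r)          # number of elements, no collect needed
s    = step(r)            # the step size
e2   = r[2]               # ranges can be indexed like vectors
down = collect(5:-2:1)    # negative step

println(n, " ", s, " ", e2, " ", down)
```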

Specific Number of Steps

To obtain a specific number of elements in a sequence, use range. For example, six elements from 1 to 3 are

snippet.juliarepl
julia> collect( range(1; stop=3, length=6) )
6-element Array{Float64,1}:
 1.0
 1.4
 1.8
 2.2
 2.6
 3.0

Backmatter

Notes

AbstractArray{T,N} types can be useful, e.g., for sorting, but are too complex for this tutorial and thus ignored.

It is a pity that Julia does not force type declarations of all variables, and especially of its Any type.

When an array display would exceed the terminal display, Julia fits it with dot indicators, omitting middle elements. For brutalists, show( [1:10000;] ) would print all 10,000 numbers, even when the terminal can hold only 50 rows!

arraysvector.txt · Last modified: 2018/11/22 20:47 (external edit)