snippet.juliarepl
julia> pkgchk.( [ "julia" => v"1.0.3", "CategoricalArrays" => v"0.4.0" ] );

Vectors (1-Dimensional Arrays)

  • Vector is an alias for a 1-dimensional array. (The next chapter (arraysmatrix) discusses two-dimensional matrices and more linear algebra.)
snippet.juliarepl
julia> Vector{Float64}
Array{Float64,1}

julia> [1,2,3]		## a 1-dimensional array (of Ints) of length 3
3-element Array{Int64,1}:
 1
 2
 3

julia> [1 2 3]		## but spaces between numbers: a 2-dimensional array (of Ints) with 3 columns and 1 row
1×3 Array{Int64,2}:
 1  2  3
  • Vectors offer stack, queue, and set operations. Inside the index operator, end stands for the last index of the vector (for an ordinary vector, the same number as length()).
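
A minimal sketch of indexing with end. The vector v and its values are made up for illustration:

```julia
v = [10, 20, 30, 40]

v[end]        # last element: 40
v[end-1]      # second-to-last element: 30
v[2:end]      # everything from the second element on: [20, 30, 40]

# inside the brackets, `end` means lastindex(v), which equals length(v) for a plain vector
v[end] == v[lastindex(v)] == last(v)    # true
```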

Plain and Linear Algebra Vectors and Matrices

Julia can distinguish between ordinary vectors and LinearAlgebra vectors. LinearAlgebra vectors can have a (column) orientation, too. When needed, julia converts ordinary vectors into column vectors and vice-versa. A row vector is typically represented as a matrix with one row and many columns.

Thus, a transpose on a plain vector considers its input as a column vector, and renders it such that it can be used as if it were a row vector (which is really more like a matrix with one row, albeit with a different type name). This “transposed-vector” type is still different from a “1-row-matrix” type, but the “transposed-vector” values can seamlessly be treated as or compared to the “1-row-matrix” values.

snippet.juliarepl
julia> a= [1,2,3]		## a plain (column) vector
3-element Array{Int64,1}:
 1
 2
 3

julia> ta1= transpose( a )	## transpose is *not fully* a matrix, but a similar *sort-of-row* vector now
1×3 LinearAlgebra.Transpose{Int64,Array{Int64,1}}:
 1  2  3

julia> b= [1 2 3]		## this is a matrix, with one row
1×3 Array{Int64,2}:
 1  2  3

julia> tb1= transpose( b )	## this is a matrix, with one column
3×1 LinearAlgebra.Transpose{Int64,Array{Int64,2}}:
 1
 2
 3

julia> ( a==b, a==ta1, a==tb1, b==a, b==ta1, b==tb1 )	## note the fifth element that is 'true'
(false, false, false, false, true, false)
  • Despite its type 1×3 LinearAlgebra.Transpose{Int64,Array{Int64,1}}, ta1 can be used just like b which is a 1×3 Array{Int64,2}. This is why b==ta1. However typeof(b) == typeof(ta1) is false. (The type information is useful, because if you transpose ta1 [again], you get the plain vector back; but if you transpose b, you get the two-dimensional matrix back.)
  • Note that this is not symmetric: although b can be the same as the transpose of a, a can never be the same as a transpose of b. This is because a transpose of b remains a matrix with one column, and a plain one-dimensional vector is never the same as a two-dimensional matrix with one column.
  • Not shown: The REPL-printed name of the transpose vector changes when LinearAlgebra is loaded. This is not because the type has changed, but because the display changes: after LinearAlgebra is loaded, the prefix is no longer shown.

Linear Algebra Operations and Element-by-Element Operations

snippet.juliarepl
julia> [1 2 3] * [2,3,4]		## even without LinearAlgebra.jl loaded, julia knows vector products
1-element Array{Int64,1}:
 20

julia> [1,2,3] .* [2,3,4]		## with the dot, this is an element-by-element operation
3-element Array{Int64,1}:
  2
  6
 12

julia> transpose([1,2,3]) .* [2,3,4]	## a row broadcast against a column: this yields the outer product (not a cross product)
3×3 Array{Int64,2}:
 2  4   6
 3  6   9
 4  8  12

julia> transpose([1,2,3]) .* transpose([2,3,4])	## but this is again just element by element
1×3 Array{Int64,2}:
 2  6  12
  • [1,2,3] * [1,2,3] is an error: * is not defined for two column vectors. Transpose the first vector (or use dot() from LinearAlgebra) to obtain the inner product.
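
The following sketch collects the usual ways to multiply two plain vectors. The names a and b are illustrative; dot() comes from the LinearAlgebra standard library:

```julia
using LinearAlgebra        # standard library; provides dot()

a = [1, 2, 3]
b = [2, 3, 4]

dot(a, b)                  # inner product as a scalar: 1*2 + 2*3 + 3*4 = 20
transpose(a) * b           # also the scalar 20: transposed vector times vector
sum(a .* b)                # element-by-element product, then summed: 20
a * transpose(b)           # outer product: a 3×3 matrix
```

Note that transpose(a) * b returns a scalar, whereas the one-row matrix in [1 2 3] * [2,3,4] returns a 1-element vector.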

Creating a Vector from Values

A vector can be created by specifying all elements in a comma-separated list. (Space-separated values create matrices.) Vectors usually convert seamlessly and as expected.

Plain Vectors

A plain vector, without orientation, looks like a column vector for most purposes:

snippet.juliarepl
julia> x1= [ 1, 2, 3 ]
3-element Array{Int64,1}:
 1
 2
 3

julia> show( IOContext(stdout, :compact => true), x1 )              ## compact display
[1, 2, 3]

Comprehensions

As mentioned in Special Loops for Vectors and Tuples, you can also initialize a vector with a comprehension:

snippet.juliarepl
julia> [ (i+2,i/2) for i=4:6 ]
3-element Array{Tuple{Int64,Float64},1}:
 (6, 2.0)
 (7, 2.5)
 (8, 3.0)
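
Comprehensions also accept a filter, an explicit element type, and nested loops. A short sketch (the variable names are made up for illustration):

```julia
evensq = [ i^2 for i in 1:10 if iseven(i) ]       # keep only even i: [4, 16, 36, 64, 100]

halves = Float64[ i/2 for i in 1:3 ]              # element type stated up front: [0.5, 1.0, 1.5]

combos = [ (i, j) for i in 1:2 for j in 1:2 ]     # two loops, flat result: (1,1), (1,2), (2,1), (2,2)
```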

Row Vectors (Are Matrices)

As noted, Julia no longer features row vectors but uses two-dimensional matrices with one row instead. Equivalently, you can transpose a (column) vector. To initialize a quasi-row vector (which is really just a matrix with one row), you have two choices: create a one-row matrix or a transposed vector.

snippet.juliarepl
julia> fill(0.0, 1, 3)            ## 1 by 3 array = 2-dim array = matrix with one row
1×3 Array{Float64,2}:
 0.0  0.0  0.0

julia> fill(0.0,3)'		## the ' is the adjoint operator; for real numbers it equals transpose()
1×3 LinearAlgebra.Adjoint{Float64,Array{Float64,1}}:
 0.0  0.0  0.0

Initializing a Vector of Specific Type (e.g., Float32)

snippet.juliarepl
julia> [1.0,NaN,2.0]                          ## Floats default to Float64
3-element Array{Float64,1}:
   1.0
 NaN
   2.0

julia> Vector{Float32}( [1.0,NaN,2.0] )       ## Force Float32 vector
3-element Array{Float32,1}:
     1.0
 NaN
     2.0

julia> typeof([ 1.0f0, NaN32, 2.0f0 ])			## f0 designates F32, so this is the same
Array{Float32,1}

julia> typeof([ 1.0f0, NaN32, 2.0 ])			## 2.0 requires F64, so it is promoted
Array{Float64,1}
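
If the vector already exists with another element type, it can be converted afterwards. A minimal sketch, with illustrative variable names:

```julia
v64 = [1.0, NaN, 2.0]                    # Vector{Float64}

v32 = Float32.(v64)                      # broadcast the Float32 constructor over the elements
w32 = convert(Vector{Float32}, v64)      # the same conversion via convert

empty32 = Float32[]                      # an empty vector that can only hold Float32 values
push!(empty32, 1)                        # the Int 1 is converted to 1.0f0 on insertion
```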

Initializing an Array with Same Values

For good programming practice, avoid uninitialized arrays and always initialize your variables. A common initialization is to zero or NaN, which can be achieved, e.g., with zeros, fill, or repeat:

snippet.juliarepl
julia> zeros(2)                         ## built in
2-element Array{Float64,1}:
 0.0
 0.0

julia> fill( NaN, 3 )                   ## usually best for initialization
3-element Array{Float64,1}:
 NaN
 NaN
 NaN

julia> using LinearAlgebra

julia> repeat( [ NaN, 1.0 ], 3 )	## for more than one value, start with an array and use repeat
6-element Array{Float64,1}:
 NaN
   1.0
 NaN
   1.0
 NaN
   1.0

Do not use fill instead of repeat, or you will get an array of arrays:

snippet.juliarepl
julia> fill( [ NaN, 1.0 ], 3 )
3-element Array{Array{Float64,1},1}:
 [NaN, 1.0]
 [NaN, 1.0]
 [NaN, 1.0]

These functions also work on higher-dimensional arrays, and matrices are often initialized with fill(), too. The type-specialized zeros(Float64,5,5) (five-by-five zeros) and ones(Bool,4,4) (four-by-four trues), as well as rand(3,5) (explained in Random Variables), can initialize arrays of any dimension.
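
There is a second reason to be careful with fill on array values: fill stores the same object in every slot. The sketch below (with made-up variable names) contrasts this with a comprehension, which builds a fresh inner vector for each slot:

```julia
shared = fill([NaN, 1.0], 3)              # three references to ONE inner vector
shared[1][1] = 0.0                        # so this change shows up in every slot
shared[3][1]                              # 0.0

separate = [ [NaN, 1.0] for _ in 1:3 ]    # a new inner vector per slot
separate[1][1] = 0.0                      # only the first inner vector changes
isnan(separate[3][1])                     # true
```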

Checking whether a Vector is Numeric

snippet.juliarepl
julia> isnumeric(x::Vector)::Bool= (eltype(x) <: Union{Missing,Real});

julia> for x in ( [1,2], [1,2.0], [1,2.0,missing], [1,'a',2] ); @info("'$x' is **$(isnumeric(x))**"); end
[ Info: '[1, 2]' is **true**
[ Info: '[1.0, 2.0]' is **true**
[ Info: 'Union{Missing, Float64}[1.0, 2.0, missing]' is **true**
[ Info: 'Any[1, 'a', 2]' is **false**

Printing a Numeric Vector With a Particular Format

snippet.juliarepl
julia> using Formatting

julia> join( sprintf1.("%4.1f", [ 0.123, pi, exp(3.0) ]), " : " )
" 0.1 :  3.1 : 20.1"

Stack and Queue Operations

As already explained in the Inquiring chapter, functions with a trailing '!' modify their first argument, while functions without it just return a result and leave the contents of their first argument untouched.

Appending (Values or Arrays) to an Array

vcat returns a new, longer array and leaves its arguments unmodified; push! and append! modify the array in place instead.

snippet.juliarepl
julia> x= [1,2,3];

julia> vcat(x,4)                   ## appends 4, but does not modify x
4-element Array{Int64,1}:
 1
 2
 3
 4

julia> x			## the original vector x is not modified
3-element Array{Int64,1}:
 1
 2
 3

To modify the array in place, use push! (appends one element) or append! (appends every element of a collection):


julia> push!(x,4); x               ## push!() appends 4 in place, i.e., it *modifies* x
4-element Array{Int64,1}:
 1
 2
 3
 4

There is a convenient short-hand “bracket” notation for vcat:

snippet.juliarepl
julia> [ [1,2,3] ; [ 4,5,6 ] ]
6-element Array{Int64,1}:
 1
 2
 3
 4
 5
 6

Push!, Pop!, PopFirst!, PushFirst!

snippet.juliarepl
julia> x= [1,2,3];

julia> append!(x,4)				## append!() adds every element of a collection; a scalar counts as one element, so this equals push!(x,4)
4-element Array{Int64,1}:
 1
 2
 3
 4

julia> pushfirst!(x,0)				## shift right and place 0 into pos 1 (often also called unshift)
5-element Array{Int64,1}:
 0
 1
 2
 3
 4
 
julia> ( popfirst!(x), " -- ", x )		## drop 0 from pos 1 and shift left
(0, " -- ", [1, 2, 3, 4])

julia> ( pop!(x), " -- ", x )                   ## drop 4 from last pos
(4, " -- ", [1, 2, 3])
  • In other languages, shift and unshift play the same roles as popfirst! and pushfirst!.
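
Put together, these functions give a stack (last in, first out) and a queue (first in, first out). A minimal sketch with illustrative names, which also shows where push! and append! differ:

```julia
s = Int[]                    # an empty stack of Ints
push!(s, 1)
push!(s, 2)
append!(s, [3, 4])           # appends each element: s is now [1, 2, 3, 4]
top = pop!(s)                # 4, the most recently added element; s is [1, 2, 3]

q = ["a", "b", "c"]          # a queue
push!(q, "d")                # enqueue at the back
head = popfirst!(q)          # "a", the oldest element; q is ["b", "c", "d"]

nested = Any[1, 2]
push!(nested, [3, 4])        # push! stores the vector itself as ONE element
length(nested)               # 3
```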

Flat and Nested Joins

snippet.juliarepl
julia> a1=[ "a", "b" ]; a2= [ "c", "d" ];

julia> [ a1, a2 ]                               ## not auto-flattened
2-element Array{Array{String,1},1}:
 ["a", "b"]
 ["c", "d"]

julia> vcat( a1, a2 )                           ## flattened.  or use shortcut [ a1 ; a2 ]
4-element Array{String,1}:
 "a"
 "b"
 "c"
 "d"

Flattening Everything (Vectors of Vectors)

snippet.juliarepl
julia> using Base.Iterators

julia> function flattenall(a::AbstractArray)
           while ( any( x->(typeof(x) <: AbstractArray), a ) )
               a= collect(Base.Iterators.flatten(a))
           end
           a
       end#function##
flattenall (generic function with 1 method)

julia> flattenall( [ [1,2], 1, [[3]] ] )                   ## PS: you may want to convert this to Vector{Int}
4-element Array{Int64,1}:
 1
 2
 1
 3
  • You can also often flatten and nest via reshaping. For example,
snippet.juliarepl
julia> using Base.Iterators

julia> arr= collect( product( [1,2,3], [4,5] ) )
3×2 Array{Tuple{Int64,Int64},2}:
 (1, 4)  (1, 5)
 (2, 4)  (2, 5)
 (3, 4)  (3, 5)


julia> reshape( ans, (6,1) )	## A flatter list
6×1 Array{Tuple{Int64,Int64},2}:
 (1, 4)
 (2, 4)
 (3, 4)
 (1, 5)
 (2, 5)
 (3, 5)

julia> reshape( ans, (2,3) )
2×3 Array{Tuple{Int64,Int64},2}:
 (1, 4)  (3, 4)  (2, 5)
 (2, 4)  (1, 5)  (3, 5)
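
Note that reshape( ans, (6,1) ) still returns a two-dimensional array (with one column). To obtain a genuine one-dimensional vector, vec() is one option; the sketch below reuses the product example with illustrative variable names:

```julia
using Base.Iterators: product

arr  = collect( product( [1,2,3], [4,5] ) )    # 3×2 matrix of tuples
flat = vec(arr)                                # 6-element vector, in column-major order

size(flat)                     # (6,)
size(reshape(arr, (6, 1)))     # (6, 1): still 2-dimensional
flat[4]                        # (1, 5): the first element of the second column
```

vec() does not copy: flat shares its memory with arr, so writing into one changes the other.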

Arithmetic Operations With Flattening

The '@.' macro is syntactic sugar that inserts dots wherever they are needed, i.e., into every function call and operator of the expression.

snippet.juliarepl
julia> [ 10 .+ [-1,+1].*2; 20 .+ [-3,+5].^1 ]	## accurate
4-element Array{Int64,1}:
  8
 12
 17
 25

julia> @. [ 10 + [-1,+1]*2; 20 + [-3,+5]^1 ]	## lazy way of doing this
4-element Array{Int64,1}:
  8
 12
 17
 25

julia> vcat(@. (10 + (-i:i)*2 for i=1:2)...)	## a more useful example: changes iterator into range
8-element Array{Int64,1}:
  8
 10
 12
  6
  8
 10
 12
 14

Leading or Lagging Arrays

It is often useful to lead or lag an array. For example, you may want to see whether values in a timeseries like [ 1, 2, 4, 7, 11 ] can be explained by their past value.

snippet.juliarepl
julia> lag(x::Vector{Float64}, num=1)::Vector{Float64}= vcat( fill(NaN,num), [ x[i] for i=1:length(x)-num] );

julia> x= [ 1.0,2,4,7,11 ]; [ x lag(x) ]
5×2 Array{Float64,2}:
  1.0  NaN
  2.0    1.0
  4.0    2.0
  7.0    4.0
 11.0    7.0

You could now run a regression explaining the first column with the second.

  • The function dispatch chapter defines a better lag() vector function that works for more types.
  • ShiftedArrays.jl offers similar functionality (and even some event-study capability to shift multiple vectors for alignment). It is a little more efficient because it works with views rather than copies, but it always uses missing instead of NaN. missing works for all types, but is much slower.
  • The TimeSeries.jl package provides lead() and lag() functions for its TimeArray objects. It is covered in Univariate Timeseries.
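
A lead is the mirror image of the lag above. The function below is a hypothetical counterpart written for this page (it is not part of Julia or of the packages just mentioned) and assumes that num does not exceed the length of x:

```julia
# shift the values forward by num positions and pad the end with NaN
lead(x::Vector{Float64}, num::Int=1)::Vector{Float64} = vcat( x[num+1:end], fill(NaN, num) )

x = [ 1.0, 2, 4, 7, 11 ]

lead(x)        # [2.0, 4.0, 7.0, 11.0, NaN]
lead(x, 2)     # [4.0, 7.0, 11.0, NaN, NaN]
```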

Reversing Arrays

snippet.juliarepl
julia> reverse([1,2,3])
3-element Array{Int64,1}:
 3
 2
 1

Like sort, the reverse function creates a copy of the argument array and reverses that copy. To reverse the input array itself (in place), use the reverse! function.
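
The difference between the copying and the in-place version, as a short sketch (v and w are illustrative names):

```julia
v = [1, 2, 3]

w = reverse(v)       # new array [3, 2, 1]; v is still [1, 2, 3]
reverse!(v)          # reverses v itself; v is now [3, 2, 1]

u = v[end:-1:1]      # indexing with a descending range also returns a reversed copy: [1, 2, 3]
```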

Iterating over Arrays

Many use cases for arrays involve iterating over each element, and applying some operation on each element. The result could be a new array, changes to the same array, or an aggregate of some kind.

Julia's for loop, combined with the in operator, iterates over every element of an array:

Iterating over Array Contents

snippet.juliarepl
julia> for item in ["A","B","C"];  println(item);  end
A
B
C

julia> [ item for item in ["a", "b", "c" ] ]	## a "comprehension" is often convenient, creating a new array
3-element Array{String,1}:
 "a"
 "b"
 "c"

Iterating Circularly over Array Contents

snippet.juliarepl
julia> circidx( i::Int, arrlen::Int )::Int=  mod( i-1, arrlen ) + 1;

julia> for i=1:9; println( i, " => ", circidx(i,4) ); end
1 => 1
2 => 2
3 => 3
4 => 4
5 => 1
6 => 2
7 => 3
8 => 4
9 => 1
  • The IterTools.jl package offers a set of common iteration tools.
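
Base Julia already covers this case: mod1(i, n) computes the same 1-based wrap-around index as circidx, and Base.Iterators.cycle repeats a collection endlessly. A minimal sketch (the colors vector is made up for illustration):

```julia
colors = ["red", "green", "blue", "black"]
n = length(colors)

( mod1(4, n), mod1(5, n) )                         # (4, 1): wraps around after n

picked = [ colors[mod1(i, n)] for i in 1:9 ]       # the ninth element is "red" again

# the same nine elements, without any index arithmetic
cycled = collect( Base.Iterators.take( Base.Iterators.cycle(colors), 9 ) )
```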

Iterating over Array Index and Contents

snippet.juliarepl
julia> for (index,item) in enumerate(["A","B","C"]);  println("$(index) -> $(item)");  end#for
1 -> A
2 -> B
3 -> C

Iterating over Array Index

snippet.juliarepl
julia> for index in eachindex(["A","B","C"]);  println(index);  end
1
2
3

Iterating Synchronously over Multiple Arrays

snippet.juliarepl
julia> for (item1,item2) in zip(["A","B","C"], ["a","b","c"]);  println(item1, " ", item2);  end
A a
B b
C c

Applying the same Function to each Element (Map)

Most of the time, you can just use a 'postfix-dot' function equivalent call (e.g., sqrt.([1.0,2.0])) to apply a function to each element of a vector. Sometimes, the map() function is more convenient:

snippet.juliarepl
julia> map( x->(x+1), [1,2,3] )
3-element Array{Int64,1}:
 2
 3
 4

For very large arrays and slow functions, the pmap function (from the Distributed standard library) can perform the operation in parallel, but it rarely pays. See Parallel Processing.

Element-Wise Comparisons (Dot Operators)

Use .== and its relatives such as .>= (see the dot operators chapter):

snippet.juliarepl
julia> [ 1, 5, 1, 5, 1, 5 ] .>= [ 2, 10, 10, 2, 1, 5 ]
6-element BitArray{1}:
 false
 false
 false
  true
  true
  true

Or use map:

snippet.juliarepl
julia> y = [ 2, 1, 1, 1, 1 ];  x = [ 1, 3, 5, 0, 2 ];

julia> map( (x,y)->((x>y) ? 1 : 2), x, y )
5-element Array{Int64,1}:
 2
 1
 1
 2
 1

Larger or Smaller Elements of Two Equal-Length Arrays

To find the maximum or minimum values across two arrays (generating an array with the larger or smaller of the two values at each index), broadcast the max or min function with the dot syntax:

snippet.juliarepl
julia> max.( [1,3,5,0,2] , [2,1,1,1,1] )
5-element Array{Int64,1}:
 2
 3
 5
 1
 2

julia> min.( [1,3,5,0,2] , [2,1,1,1,1] )
5-element Array{Int64,1}:
 1
 1
 1
 0
 1

Applying Operation and Aggregating (Map Reduce)

To perform an operation on every element of an array and then aggregate the results (min, max, sum, etc.), use mapreduce:

snippet.juliarepl
julia> mapreduce(x->x^2, +, [1,2,3,4,5])       ## sum of squares of first five numbers
55

This is often faster than sum(map(x->x^2, [1,2,3,4,5])), because mapreduce() does not allocate the intermediate array that map() creates. (In this example, mapreduce() is twice as fast as map(x->x^2, [1,2,3,4,5]) |> sum.)
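
For the most common aggregations, Base functions such as sum and maximum accept the mapping function as their first argument, which reads a little more directly. A short sketch with illustrative values:

```julia
v = [1, 2, 3, 4, 5]

sum(x -> x^2, v)                 # 55, like mapreduce(x -> x^2, +, v)
maximum(abs, [-7, 3, 5])         # 7, like mapreduce(abs, max, [-7, 3, 5])

# for a possibly empty input, give mapreduce a starting value via init
mapreduce(x -> x^2, +, Int[]; init=0)    # 0
```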

Recoding all Values in an Array

snippet.juliarepl
julia> using Missings

julia> v= [1.0, NaN, missing, 4.0]; replace( v, NaN => missing )       ## NaN becomes missing; the opposite direction (missing to NaN) is Missings.replace(v, NaN)
4-element Array{Union{Missing, Float64},1}:
 1.0
  missing
  missing
 4.0
  • Important: Missing is a type, missing is a value.
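
Once NaN values have been recoded as missing, the Base functions ismissing, skipmissing, and coalesce handle them uniformly. A minimal sketch that reuses the vector from above:

```julia
v = [1.0, NaN, missing, 4.0]

ismissing.(v)                            # [false, false, true, false]
sum(skipmissing(v))                      # NaN: skipmissing drops missing, but the NaN remains

recoded = replace(v, NaN => missing)     # both gaps are now missing
clean   = collect(skipmissing(recoded))  # plain Vector{Float64}: [1.0, 4.0]
sum(clean)                               # 5.0

zeroed = coalesce.(recoded, 0.0)         # replace every missing with 0.0: [1.0, 0.0, 0.0, 4.0]
```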

R-Like By

It is not a great idea to keep R habits in Julia. It is better to learn to think the Julia way. However, when needed, the following function will work like by in R, applying a function to all elements split by a (categorical) vector:

snippet.juliarepl
julia> using CategoricalArrays

julia> function Rby(obj::AbstractVector, ind::AbstractVector, func::Function, x...)
           map( elem->(elem, func(obj[findall(elem .== ind)], x...)), levels(ind))
       end;#function##

julia> using Random

julia> Random.seed!(0);

julia> randcateg4 = rand('a':'d', 100);			## sample testset: 4 categories

julia> hundredsquares= [1:100;].^2 ;

julia> using Statistics

julia> Rby( hundredsquares, randcateg4, mean )		## for each element, by randcateg4, calculate mean
4-element Array{Tuple{Char,Float64},1}:
 ('a', 3575.25)
 ('b', 3704.0)
 ('c', 3516.217391304348)
 ('d', 2854.310344827586)

julia> Rby( hundredsquares, randcateg4, quantile, [0.25, 0.50, 0.75] )	## or calculate the three quantiles
4-element Array{Tuple{Char,Array{Float64,1}},1}:
 ('a', [885.25, 2234.0, 6321.0])
 ('b', [720.25, 3192.5, 5932.0])
 ('c', [626.0, 1936.0, 6376.5])
 ('d', [625.0, 2209.0, 4489.0])

See also Univariate Statistics -- Classifications for calculating an original-size vector of group means (R ave()).

Suggestion: Avoid `Any` and Uninitialized Arrays

An array of type Any can support elements of all types:

snippet.juliarepl
julia> Any[1, 1.0, "1", true]  ## often evil; avoid this when you know the types of elements
4-element Array{Any,1}:
 1
 1.0
 "1"
 true

You should avoid use of generic (Any) types, especially in arrays, unless you desperately need them (which is rare).

You can even create uninitialized arrays, whose values contain junk.

The following type combines both sins: a 5-dimensional array with elements of type Any. Calling its constructor with undef, as in Array{Any,5}(undef,2,2,2,2,2), would create such an array without initializing it:

snippet.juliarepl
julia> Array{Any, 5}  ## evil-squared
Array{Any,5}

Such constructs can rob you not only of compile-time type checking, but also of memory and speed if they are ever accidentally used the wrong way. After all, your computer is good at operating on its native types; if these are all you need, then restrict yourself to them.
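
The sketch below shows how to check whether a vector has a concrete element type, how to convert an Any vector once its contents are known, and how to overwrite an uninitialized vector right after creating it. All variable names are illustrative:

```julia
a_any = Any[1, 2, 3]
a_int = [1, 2, 3]

isconcretetype(eltype(a_any))        # false: every element has to be inspected at run time
isconcretetype(eltype(a_int))        # true: the elements are stored as plain Int64 values

a_fix = convert(Vector{Int}, a_any)  # works here because every element is an Int

u = Vector{Float64}(undef, 3)        # uninitialized: the contents are arbitrary
fill!(u, NaN)                        # so overwrite them immediately
```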

Range Iterators

Range objects can be viewed as smart unexpanded sequences (consecutive tuples or arrays). To expand a range object into an array, use the collect() function:

snippet.juliarepl
julia> 1:3
1:3

julia> typeof(1:3)
UnitRange{Int64}

julia> collect(1:3)		## convert iterator (range) now into an array of ints
3-element Array{Int64,1}:
 1
 2
 3

julia> [1:3;]                    ## [ range ;]  (or [ range ... ]) means collect, too
3-element Array{Int64,1}:
 1
 2
 3

Arbitrary Steps

For steps other than 1, use the three-part start:step:stop form (with two colons):

snippet.juliarepl
julia> collect(1:0.5:3)
5-element Array{Float64,1}:
 1.0
 1.5
 2.0
 2.5
 3.0

Specific Number of Steps

To obtain a specific number of elements in a sequence, use range. For example, six evenly spaced elements from 1 to 3:

snippet.juliarepl
julia> collect( range(1; stop=3, length=6) )
6-element Array{Float64,1}:
 1.0
 1.4
 1.8
 2.2
 2.6
 3.0
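
Because ranges are unexpanded, most questions about them can be answered without calling collect() first. A minimal sketch with illustrative names:

```julia
r = range(1, stop=3, length=6)

length(r)         # 6
r[2]              # about 1.4, computed on demand
step(r)           # about 0.4

d = 10:-2:1       # a negative step counts down
last(d)           # 2: the stop value 1 is never reached
collect(d)        # [10, 8, 6, 4, 2]

sum(1:100)        # 5050, without ever building a 100-element array
```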

Backmatter

Notes

AbstractArray{T,N} types can be useful, e.g., for sorting, but they are too complex for this tutorial and are thus ignored.

It is a pity that julia does not force type declarations of all variables, and especially of its Any type.

When an array display would exceed the terminal display, Julia fits it with dot indicators, omitting middle elements. If you really want to see a long display, you can use show( [1:10000;] ).

arraysvector.txt · Last modified: 2018/12/28 11:31 (external edit)