# julia

snippet.juliarepl
`julia> pkgchk.( [ "julia" => v"1.0.3", "CategoricalArrays" => v"0.4.0" ] );`

# Vectors (1-Dimensional Arrays)

• `Vector` is an alias for a 1-dimensional array. (The next chapter (arraysmatrix) discusses two-dimensional matrices and more linear algebra.)
snippet.juliarepl
```julia> Vector{Float64}
Array{Float64,1}

julia> [1,2,3]		## a 1-dimensional array (of Ints) of length 3
3-element Array{Int64,1}:
1
2
3

julia> [1 2 3]		## but spaces between numbers: a 2-dimensional array (of Ints) with 3 columns and 1 row
1×3 Array{Int64,2}:
1  2  3```
• Vectors offer stack, queue, and set operations. `end` is a valid abbreviation for `length()` inside the index operator.
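A minimal sketch of `end` inside the index operator. The variable names are illustrative and do not come from the original text.

```julia
x = [10, 20, 30, 40]

last_elem   = x[end]        # same element as x[length(x)]
second_last = x[end-1]      # arithmetic on `end` is allowed
tail        = x[2:end]      # `end` also works as a range endpoint
```

Outside of square brackets, `end` keeps its usual meaning as a block terminator, so this shorthand only applies within an indexing expression.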

## Plain and Linear Algebra Vectors and Matrices

Julia can distinguish between ordinary vectors and LinearAlgebra vectors. LinearAlgebra vectors can have a (column) orientation, too. When needed, julia converts ordinary vectors into column vectors and vice-versa. A row vector is typically represented as a matrix with one row and many columns.

Thus, a transpose on a plain vector considers its input as a column vector, and renders it such that it can be used as if it were a row vector (which is really more like a matrix with one row, albeit with a different type name). This “transposed-vector” type is still different from a “1-row-matrix” type, but the “transposed-vector” values can seamlessly be treated as or compared to the “1-row-matrix” values.

snippet.juliarepl
```julia> a= [1,2,3]		## a plain (column) vector
3-element Array{Int64,1}:
1
2
3

julia> ta1= transpose( a )	## transpose is *not fully* a matrix, but a similar *sort-of-row* vector now
1×3 LinearAlgebra.Transpose{Int64,Array{Int64,1}}:
1  2  3

julia> b= [1 2 3]		## this is a matrix, with one row
1×3 Array{Int64,2}:
1  2  3

julia> tb1= transpose( b )	## this is a matrix, with one column
3×1 LinearAlgebra.Transpose{Int64,Array{Int64,2}}:
1
2
3

julia> ( a==b, a==ta1, a==tb1, b==a, b==ta1, b==tb1 )	## note the fifth element that is 'true'
(false, false, false, false, true, false)```
• Despite its type `1×3 LinearAlgebra.Transpose{Int64,Array{Int64,1}}`, `ta1` can be used just like `b` which is a `1×3 Array{Int64,2}`. This is why `b==ta1`. However `typeof(b) == typeof(ta1)` is false. (The type information is useful, because if you transpose `ta1` [again], you get the plain vector back; but if you transpose `b`, you get the two-dimensional matrix back.)
• Note that this is not symmetric: although `b` can be the same as the transpose of `a`, `a` can never be the same as a transpose of `b`. This is because a transpose of `b` remains a matrix with one column, and a plain one-dimensional vector is never the same as a two-dimensional matrix with one column.
• Not shown: The REPL-printed name of the transpose vector changes when LinearAlgebra is loaded. This is not because the type has changed, but because the display changes: after LinearAlgebra is loaded, the prefix is no longer shown.
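The round-trip behavior described in the bullets above can be checked with a short sketch. The names `back_a`, `back_b`, and `row` are made up for this illustration.

```julia
using LinearAlgebra

a = [1, 2, 3]      # plain vector
b = [1 2 3]        # matrix with one row

back_a = transpose(transpose(a))   # unwraps to a plain vector again
back_b = transpose(transpose(b))   # unwraps to a one-row matrix again
row    = collect(transpose(a))     # materializes the wrapper as an ordinary 1×3 matrix
```

If a function insists on a genuine `Array{Int64,2}`, `collect` is one way to drop the transpose wrapper, at the cost of a copy.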

### Linear Algebra Operations and Element-by-Element Operations

snippet.juliarepl
```julia> [1 2 3] * [2,3,4]		## even without LinearAlgebra.jl loaded, julia knows vector products
1-element Array{Int64,1}:
20

julia> [1,2,3] .* [2,3,4]		## with the dot, this is an element-by-element operation
3-element Array{Int64,1}:
2
6
12

julia> transpose([1,2,3]) .* [2,3,4]	## broadcasting a row against a column: this is the linear algebra outer product (not a cross product)
3×3 Array{Int64,2}:
2  4   6
3  6   9
4  8  12

julia> transpose([1,2,3]) .* transpose([2,3,4])	## but this is again just element by element
1×3 Array{Int64,2}:
2  6  12```
• `[1,2,3] * [1,2,3]` is an error, because `*` is not defined for two column vectors. For an inner product, transpose the first vector or use `dot()` from LinearAlgebra.
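As a hedged sketch, the following shows the usual ways to multiply two plain vectors. The variable names are illustrative only.

```julia
using LinearAlgebra

a = [1, 2, 3]
b = [2, 3, 4]

inner_dot = dot(a, b)             # scalar inner product
inner_row = transpose(a) * b      # transposed-vector times vector also yields a scalar
outer     = a * transpose(b)      # vector times transposed-vector yields a 3×3 matrix

# `a * b` itself has no method; catch the error instead of letting it abort.
product_failed = try
    a * b
    false
catch
    true
end
```

Note that `transpose(a) * b` returns a plain number, whereas the one-row matrix `[1 2 3] * b` returns a 1-element array.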

## Creating a Vector from Values

A vector can be created by specifying all elements in a comma-separated list. (Space separated values create matrices.) Vectors usually convert seamlessly and as expected.

### Plain Vectors

A plain vector, without orientation, looks like a column vector for most purposes:

snippet.juliarepl
```julia> x1= [ 1, 2, 3 ]
3-element Array{Int64,1}:
1
2
3

julia> show( IOContext(stdout, :compact => true), x1 )              ## compact display
[1, 2, 3]```

### Comprehensions

As mentioned in Special Loops for Vectors and Tuples, you can also initialize a vector with a comprehension:

snippet.juliarepl
```julia> [ (i+2,i/2) for i=4:6 ]
3-element Array{Tuple{Int64,Float64},1}:
(6, 2.0)
(7, 2.5)
(8, 3.0)```

### Row Vectors (Are Matrices)

As noted, Julia no longer features row vectors but uses two-dimensional matrices with one row instead. Equivalently, you can transpose a (column) vector. To initialize a quasi-row vector (which is really just a matrix with one row), you therefore have two choices: create a one-row matrix or transpose a vector:

snippet.juliarepl
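```julia> fill(0.0, 1, 3)            ## 1 by 3 array = 2-dim array = matrix with one row
1×3 Array{Float64,2}:
0.0  0.0  0.0

julia> fill(0.0,3)'		## the ' is the adjoint operator, which equals transpose() for real values
1×3 LinearAlgebra.Adjoint{Float64,Array{Float64,1}}:
0.0  0.0  0.0```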

### Initializing a Vector of Specific Type (e.g., Float32)

snippet.juliarepl
```julia> [1.0,NaN,2.0]                          ## Floats default to Float64
3-element Array{Float64,1}:
1.0
NaN
2.0

julia> Vector{Float32}( [1.0,NaN,2.0] )       ## Force Float32 vector
3-element Array{Float32,1}:
1.0
NaN
2.0

julia> typeof([ 1.0f0, NaN32, 2.0f0 ])			## f0 designates F32, so this is the same
Array{Float32,1}

julia> typeof([ 1.0f0, NaN32, 2.0 ])			## 2.0 requires F64, so it is promoted
Array{Float64,1}```
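Two further spellings reach the same result. This is a small sketch with illustrative names, not a recommendation of one form over another.

```julia
a = Float32[1, NaN, 2]                          # typed array literal converts each element
b = Float32.([1.0, NaN, 2.0])                   # broadcast the Float32 constructor
c = convert(Vector{Float32}, [1.0, NaN, 2.0])   # explicit conversion of a Float64 vector
```

The typed literal avoids building an intermediate `Float64` vector, which may matter for long literal lists.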

## Initializing an Array with Same Values

For good programming practice, avoid uninitialized arrays. Please always initialize your variables. A common initialization is to zero or NaN, which can be achieved, e.g., with `zeros`, `fill`, or `repeat`:

snippet.juliarepl
```julia> zeros(2)                         ## built in
2-element Array{Float64,1}:
0.0
0.0

julia> fill( NaN, 3 )                   ## usually best for initialization
3-element Array{Float64,1}:
NaN
NaN
NaN

julia> using LinearAlgebra

julia> repeat( [ NaN, 1.0 ], 3 )	## for more than one value, start with an array and use repeat
6-element Array{Float64,1}:
NaN
1.0
NaN
1.0
NaN
1.0```

Do not use fill instead of repeat, or you will get an array of arrays:

snippet.juliarepl
```julia> fill( [ NaN, 1.0 ], 3 )
3-element Array{Array{Float64,1},1}:
[NaN, 1.0]
[NaN, 1.0]
[NaN, 1.0]```
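There is a second pitfall with `fill` on an array argument: every slot refers to the same inner array. The sketch below (with made-up names) contrasts this with a comprehension, which builds a fresh inner array per slot.

```julia
shared = fill([NaN, 1.0], 3)                  # three references to ONE inner array
shared[1][2] = 99.0                           # changes what all three slots see

independent = [[NaN, 1.0] for _ in 1:3]       # three separate inner arrays
independent[1][2] = 99.0                      # changes only the first
```

After these assignments, `shared[3][2]` has also changed, while `independent[3][2]` has not.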

These functions also work on higher-dimensional arrays, and matrices are often initialized with `fill()`, too. The type-specialized `zeros(Float64,5,5)` (five by five zeros) and `ones(Bool,4,4)` (four by four trues), as well as `rand(3,5)` (explained in Random Variables), can initialize arrays of any dimension.

## Checking whether a Vector is Numeric

snippet.juliarepl
```julia> isnumeric(x::Vector)::Bool= (eltype(x) <: Union{Missing,Real});

julia> for x in ( [1,2], [1,2.0], [1,2.0,missing], [1,'a',2] ); @info("'$x' is **$(isnumeric(x))**"); end
[ Info: '[1, 2]' is **true**
[ Info: '[1.0, 2.0]' is **true**
[ Info: 'Union{Missing, Float64}[1.0, 2.0, missing]' is **true**
[ Info: 'Any[1, 'a', 2]' is **false**```

## Printing a Numeric Vector With a Particular Format

snippet.juliarepl
```julia> using Formatting

julia> join( sprintf1.("%4.1f", [ 0.123, pi, exp(3.0) ]), " : " )
" 0.1 :  3.1 : 20.1"```
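If you prefer to avoid an extra package, the `Printf` standard library can do the same job. This is a sketch under the assumption that the format string is known at parse time, which `@sprintf` requires; the names `vals` and `formatted` are illustrative.

```julia
using Printf

vals = [0.123, pi, exp(3.0)]      # promoted to a Float64 vector

# @sprintf needs a literal format string, so loop with a comprehension.
formatted = join([@sprintf("%4.1f", v) for v in vals], " : ")
```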

## Stack and Queue Operations

As explained in the Inquiring chapter, functions with a trailing '!' modify their first argument, while functions without it just return a result and leave the contents of their first argument untouched.

### Appending (Values or Arrays) to an Array

`vcat`, `push!`, and `append!` all add elements to the end of a vector, but only `vcat` returns a new array and leaves its arguments untouched.

snippet.juliarepl
```julia> x= [1,2,3];

julia> vcat(x,4)                   ## appends 4, but does not modify x
4-element Array{Int64,1}:
1
2
3
4

julia> x			## the original vector x is not modified
3-element Array{Int64,1}:
1
2
3```

`push!` (for a single element) and `append!` (for all elements of a collection) end in '!' and modify the array in place. There is no `vcat!`.

```
julia> push!(x,4); x               ## push!() is the in-place counterpart of vcat(x,4); appends 4 *and* modifies x
4-element Array{Int64,1}:
1
2
3
4
```
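The difference between the three functions shows up once you add more than one value. A minimal sketch, with illustrative variable names:

```julia
x = [1, 2, 3]

push!(x, 4)            # adds one element in place
append!(x, [5, 6])     # adds every element of the collection in place
y = vcat(x, [7, 8])    # builds a new vector; x is left unchanged
```

Calling `push!(x, [5, 6])` on this `Vector{Int64}` would fail instead, because an array cannot be converted to a single `Int64` element.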

There is a convenient short-hand “bracket” notation for vcat:

snippet.juliarepl
```julia> [ [1,2,3] ; [ 4,5,6 ] ]
6-element Array{Int64,1}:
1
2
3
4
5
6```

### Push!, Pop!, PopFirst!, PushFirst!

snippet.juliarepl
```julia> x= [1,2,3];

julia> append!(x,4)				## append! adds every element of a collection (a scalar counts as one element).  note '!' for changing contents of array
4-element Array{Int64,1}:
1
2
3
4

julia> pushfirst!(x,0)				## shift right and place 0 into pos 1 (often also called unshift)
5-element Array{Int64,1}:
0
1
2
3
4

julia> ( popfirst!(x), " -- ", x )		## drop 0 from pos 1 and shift left
(0, " -- ", [1, 2, 3, 4])

julia> ( pop!(x), " -- ", x )                   ## drop 4 from last pos
(4, " -- ", [1, 2, 3])```
• In other languages, `shift` and `unshift` play the same roles as `popfirst!` and `pushfirst!`.
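These four functions are enough to use a plain vector as a stack (last in, first out) or as a queue (first in, first out). The following is only a sketch with made-up names; for heavy use, a dedicated data structure may be a better fit.

```julia
# Stack: push! and pop! both work at the end of the vector.
stack = Int[]
push!(stack, 1); push!(stack, 2); push!(stack, 3)
top = pop!(stack)            # most recently pushed element

# Queue: push! at the end, popfirst! at the front.
queue = Int[]
push!(queue, 1); push!(queue, 2); push!(queue, 3)
front = popfirst!(queue)     # oldest element
```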

## Flat and Nested Joins

snippet.juliarepl
```julia> a1=[ "a", "b" ]; a2= [ "c", "d" ];

julia> [ a1, a2 ]                               ## not auto-flattened
2-element Array{Array{String,1},1}:
["a", "b"]
["c", "d"]

julia> vcat( a1, a2 )                           ## flattened.  or use shortcut [ a1 ; a2 ]
4-element Array{String,1}:
"a"
"b"
"c"
"d"```
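When the arrays to be joined are themselves stored in a vector, `reduce` with `vcat` flattens one level of nesting. A small sketch (the third array `a3` is added only for illustration):

```julia
a1 = ["a", "b"]; a2 = ["c", "d"]; a3 = ["e"]

nested = [a1, a2, a3]           # a vector of vectors
flat   = reduce(vcat, nested)   # same result as vcat(a1, a2, a3)
```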

## Flattening Everything (Vectors of Vectors)

snippet.juliarepl
```julia> using Base.Iterators

julia> function flattenall(a::AbstractArray)
    while ( any( x->(typeof(x) <: AbstractArray), a ) )
        a= collect(Base.Iterators.flatten(a))
    end
    a
end#function##
flattenall (generic function with 1 method)

julia> flattenall( [ [1,2], 1, [[3]] ] )                   ## PS: you may want to convert this to Vector{Int}
4-element Array{Int64,1}:
1
2
1
3```
• You can also often flatten and nest via reshaping. For example,
snippet.juliarepl
```julia> using Base.Iterators

julia> arr= collect( product( [1,2,3], [4,5] ) )
3×2 Array{Tuple{Int64,Int64},2}:
(1, 4)  (1, 5)
(2, 4)  (2, 5)
(3, 4)  (3, 5)

julia> reshape( ans, (6,1) )	## A flatter list
6×1 Array{Tuple{Int64,Int64},2}:
(1, 4)
(2, 4)
(3, 4)
(1, 5)
(2, 5)
(3, 5)

julia> reshape( ans, (2,3) )
2×3 Array{Tuple{Int64,Int64},2}:
(1, 4)  (3, 4)  (2, 5)
(2, 4)  (1, 5)  (3, 5)```
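Note that `reshape( ans, (6,1) )` still returns a two-dimensional array with one column. If a genuinely one-dimensional vector is wanted, `vec` is one option, sketched here with illustrative names:

```julia
arr  = collect(Iterators.product([1, 2, 3], [4, 5]))   # 3×2 matrix of tuples
flat = vec(arr)                                        # 1-dimensional, in column-major order
```

For an ordinary `Array`, `vec` shares memory with its argument rather than copying it, so changes to `flat` are visible in `arr`.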

## Arithmetic Operations With Flattening

The `@.` macro is syntactic sugar that inserts dots wherever they are needed in operators and function calls.

snippet.juliarepl
```julia> [ 10 .+ [-1,+1].*2; 20 .+ [-3,+5].^1 ]	## accurate
4-element Array{Int64,1}:
8
12
17
25

julia> @. [ 10 + [-1,+1]*2; 20 + [-3,+5]^1 ]	## lazy way of doing this
4-element Array{Int64,1}:
8
12
17
25

julia> vcat(@. (10 + (-i:i)*2 for i=1:2)...)	## a more useful example: changes iterator into range
8-element Array{Int64,1}:
8
10
12
6
8
10
12
14```

## Leading or Lagging Arrays

It is often useful to lead or lag an array. For example, you may want to see whether values in a timeseries like `[ 1, 2, 4, 7, 11 ]` can be explained by their past value.

snippet.juliarepl
```julia> lag(x::Vector{Float64}, num=1)::Vector{Float64}= vcat( fill(NaN,num), [ x[i] for i=1:length(x)-num] );

julia> x= [ 1.0,2,4,7,11 ]; [ x lag(x) ]
5×2 Array{Float64,2}:
1.0  NaN
2.0    1.0
4.0    2.0
7.0    4.0
11.0    7.0```

You could now run a regression explaining the first column with the second.

• The function dispatch chapter defines a better `lag()` vector function that works for more types.
• ShiftedArrays.jl offers similar functionality (and even some event-study capability to shift multiple vectors for alignment). It is a little more efficient because it works with views rather than copies, but it always uses `missing` instead of `NaN`. `missing` works for all types, but is much slower.
• The TimeSeries.jl package provides `lead()` and `lag()` functions for its `TimeArray` objects. It is covered in Univariate Timeseries.
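A matching `lead()` can be written in the same style as the `lag()` above. The version below is a hypothetical sketch (the name `lead` and the slicing-based `lag` are this example's own choices), restricted to `Vector{Float64}` like the original.

```julia
# Shift values down (lag) or up (lead), padding the gap with NaN.
lag(x::Vector{Float64}, num=1)  = vcat(fill(NaN, num), x[1:end-num])
lead(x::Vector{Float64}, num=1) = vcat(x[num+1:end], fill(NaN, num))

x = [1.0, 2, 4, 7, 11]
table = [x lag(x) lead(x)]      # 5×3 matrix: value, previous value, next value
```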

## Reversing Arrays

snippet.juliarepl
```julia> reverse([1,2,3])
3-element Array{Int64,1}:
3
2
1```

Like `sort`, the `reverse` function creates a copy of the argument array and reverses that. If you'd like to reverse the input array in place, use the `reverse!` function.
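A short sketch of the difference, with illustrative names:

```julia
x = [1, 2, 3]

y = reverse(x)           # reversed copy
x_untouched = copy(x)    # snapshot: x has not changed so far

reverse!(x)              # now x itself is reversed
```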

## Iterating over Arrays

Many use cases for arrays involve iterating over each element, and applying some operation on each element. The result could be a new array, changes to the same array, or an aggregate of some kind.

Julia's `for` loop, combined with the `in` keyword, iterates over every element of an array. The subsections below show the common patterns.

### Iterating over Array Contents

snippet.juliarepl
```julia> for item in ["A","B","C"];  println(item);  end
A
B
C

julia> [ item for item in ["a", "b", "c" ] ]	## a "comprehension" is often convenient, creating a new array
3-element Array{String,1}:
"a"
"b"
"c"```

### Iterating Circularly over Array Contents

snippet.juliarepl
```julia> circidx( i::Int, arrlen::Int )::Int=  mod( i-1, arrlen ) + 1;

julia> for i=1:9; println( i, " => ", circidx(i,4) ); end
1 => 1
2 => 2
3 => 3
4 => 4
5 => 1
6 => 2
7 => 3
8 => 4
9 => 1```
• The IterTools.jl package offers a set of common iteration tools, and `Base.Iterators` covers the basics.
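Base Julia already covers the circular case in two ways: `mod1` computes the same one-based wrap-around index as `circidx`, and `Iterators.cycle` repeats a collection indefinitely. A sketch with made-up names:

```julia
items = ["A", "B", "C", "D"]

# mod1(i, n) maps 1, 2, 3, ... onto 1..n, wrapping around after n.
via_mod1 = [items[mod1(i, length(items))] for i in 1:9]

# cycle() never ends on its own, so take() limits it to nine elements.
via_cycle = collect(Iterators.take(Iterators.cycle(items), 9))
```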

### Iterating over Array Index and Contents

snippet.juliarepl
```julia> for (index,item) in enumerate(["A","B","C"]);  println("$(index) -> $(item)");  end#for
1 -> A
2 -> B
3 -> C```

### Iterating over Array Index

snippet.juliarepl
```julia> for index in eachindex(["A","B","C"]);  println(index);  end
1
2
3```

### Iterating Synchronously over Multiple Arrays

snippet.juliarepl
```julia> for (item1,item2) in zip(["A","B","C"], ["a","b","c"]);  println(item1, " ", item2);  end
A a
B b
C c```

## Applying the same Function to each Element (Map)

Most of the time, you can just use a 'postfix-dot' function equivalent call (e.g., `sqrt.([1.0,2.0])`) to apply a function to each element of a vector. Sometimes, the `map()` function is more convenient:

snippet.juliarepl
```julia> map( x->(x+1), [1,2,3] )
3-element Array{Int64,1}:
2
3
4```
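The following sketch puts the dot call and `map` side by side, and shows the `do` block form that makes `map` convenient for longer function bodies. The names are illustrative.

```julia
v = [1.0, 4.0, 9.0]

dotted = sqrt.(v)          # dot call
mapped = map(sqrt, v)      # same result with map

# A do block passes a multi-line anonymous function as map's first argument.
clipped = map(v) do x
    x > 3 ? 3.0 : x        # cap every value at 3.0
end
```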

For very large arrays and slow functions, the `pmap` function can perform the operation in parallel, but it rarely pays. See Parallel Processing.

## Element-Wise Comparisons (Dot Operators)

Use `.==` and its relatives, such as `.>=` (see the dot operators chapter):

snippet.juliarepl
```julia> [ 1, 5, 1, 5, 1, 5 ] .>= [ 2, 10, 10, 2, 1, 5 ]
6-element BitArray{1}:
false
false
false
true
true
true```

Or use map:

snippet.juliarepl
```julia> y = [ 2, 1, 1, 1, 1 ];  x = [ 1, 3, 5, 0, 2 ];

julia> map( (x,y)->((x>y) ? 1 : 2), x, y )
5-element Array{Int64,1}:
2
1
1
2
1```
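The same recoding can also be written with a dotted comparison and `ifelse`, which avoids the anonymous function. This is one possible alternative, sketched with illustrative names:

```julia
x = [1, 3, 5, 0, 2]
y = [2, 1, 1, 1, 1]

codes    = ifelse.(x .> y, 1, 2)   # 1 where x is larger, 2 otherwise
n_larger = count(x .> y)           # number of positions where x is larger
```

Unlike the ternary operator, `ifelse` evaluates both branches, which is harmless for constants like these.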

## Larger or Smaller Elements of Two Equal-Length Arrays

To find the maximum or minimum values across two arrays (generating an array with the larger or smaller of the two values at each index), broadcast `max` or `min` with the dot syntax:

snippet.juliarepl
```julia> max.( [1,3,5,0,2] , [2,1,1,1,1] )
5-element Array{Int64,1}:
2
3
5
1
2

julia> min.( [1,3,5,0,2] , [2,1,1,1,1] )
5-element Array{Int64,1}:
1
1
1
0
1```

## Applying Operation and Aggregating (Map Reduce)

To perform an operation on every element of an array and then aggregate the results (min, max, sum, etc.), use `mapreduce`:

snippet.juliarepl
```julia> mapreduce(x->x^2, +, [1,2,3,4,5])       ## sum of squares of first five numbers
55```

This is often faster than `sum(map(x->x^2, [1,2,3,4,5]))`. (In this example, `mapreduce()` is twice as fast as `map(x->x^2, [1,2,3,4,5]) |> sum`.)
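Several common reductions accept the mapping function directly as their first argument, which reads a little shorter than `mapreduce`. A sketch with illustrative names (no timing claims are made here):

```julia
v = [1, 2, 3, 4, 5]

a = mapreduce(x -> x^2, +, v)     # explicit map and reduce
b = sum(x -> x^2, v)              # sum with a mapping function
c = sum(abs2, v)                  # abs2 is the built-in squared magnitude

largest = maximum(abs, [-7, 3, 5])   # the same pattern works for maximum
```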

## Recoding all Values in an Array

snippet.juliarepl
```julia> using Missings

julia> v= [1.0, NaN, missing, 4.0]; replace( v, NaN => missing )       ## Base replace; Missings.replace(v, NaN) goes the other way (missing to NaN)
4-element Array{Union{Missing, Float64},1}:
1.0
missing
missing
4.0```
• Important: `Missing` is a type, `missing` is a value.
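A brief sketch of the type/value distinction and of the Base functions that deal with `missing` values. The variable names are illustrative.

```julia
v = [1.0, missing, 4.0]            # element type is Union{Missing, Float64}

is_value_of_type = missing isa Missing     # the value belongs to the type
n_missing = count(ismissing, v)            # test for the value with ismissing
total     = sum(skipmissing(v))            # skip missing values when aggregating
filled    = coalesce.(v, 0.0)              # replace missing values by a default
```

Comparing with `==` does not work as a test here, because `missing == missing` itself evaluates to `missing`.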

## R-Like By

It is not a great idea to keep R habits in Julia. It is better to learn to think the Julia way. However, when needed, the following function will work like `by` in R, applying a function to all elements split by a (categorical) vector:

snippet.juliarepl
```julia> using CategoricalArrays

julia> function Rby(obj::AbstractVector, ind::AbstractVector, func::Function, x...)
map( elem->(elem, func(obj[findall(elem .== ind)], x...)), levels(ind))
end;#function##

julia> using Random

julia> Random.seed!(0);

julia> randcateg4 = rand('a':'d', 100);			## sample testset: 4 categories

julia> hundredsquares= [1:100;].^2 ;

julia> using Statistics

julia> Rby( hundredsquares, randcateg4, mean )		## for each element, by randcateg4, calculate mean
4-element Array{Tuple{Char,Float64},1}:
('a', 3575.25)
('b', 3704.0)
('c', 3516.217391304348)
('d', 2854.310344827586)

julia> Rby( hundredsquares, randcateg4, quantile, [0.25, 0.50, 0.75] )	## or calculate the three quantiles
4-element Array{Tuple{Char,Array{Float64,1}},1}:
('a', [885.25, 2234.0, 6321.0])
('b', [720.25, 3192.5, 5932.0])
('c', [626.0, 1936.0, 6376.5])
('d', [625.0, 2209.0, 4489.0])```

See also Univariate Statistics -- Classifications for calculating an original-size vector of group means (R `ave()`).

## Suggestion: Avoid `Any` and Uninitialized Arrays

An array of type `Any` can support elements of all types:

snippet.juliarepl
```julia> Any[1, 1.0, "1", true]  ## often evil; avoid this when you know the types of elements
4-element Array{Any,1}:
1
1.0
"1"
true```

You should avoid use of generic (Any) types, especially in arrays, unless you desperately need them (which is rare).

You can even create uninitialized arrays, whose values contain junk.

The following example names a type that combines both sins: a 5-dimensional array of `Any`. Constructing it with the `Array` constructor and `undef` would leave its contents uninitialized:

snippet.juliarepl
```julia> Array{Any, 5}  ## evil-squared
Array{Any,5}```

Such constructs can rob you not only of the compiler's type inference, but also of memory and speed if they are ever accidentally used the wrong way. After all, your computer is good at operating on its native types, and if this is all you need, then restrict yourself to them.
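An easy place to pick up an `Any` array by accident is the empty literal `[]`. The sketch below (illustrative names) contrasts it with a typed empty vector.

```julia
untyped = []            # empty Vector{Any}: accepts anything
typed   = Float64[]     # empty Vector{Float64}

push!(untyped, 1); push!(untyped, "one")   # mixed contents are silently accepted
push!(typed, 1)                            # the Int is converted to 1.0
```

With the typed vector, `push!(typed, "one")` would raise an error instead of quietly storing a string.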

## Range Iterators

Range objects can be viewed as smart unexpanded sequences (consecutive tuples or arrays). To expand a range object into an array, use the `collect()` function:

snippet.juliarepl
```julia> 1:3
1:3

julia> typeof(1:3)
UnitRange{Int64}

julia> collect(1:3)		## convert iterator (range) now into an array of ints
3-element Array{Int64,1}:
1
2
3

julia> [1:3;]                    ## [ range ;]  (or [ range ... ]) means collect, too
3-element Array{Int64,1}:
1
2
3```
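Because a range stores only its endpoints (and step), many operations work on it directly, and `collect` is often unnecessary. A sketch with illustrative names:

```julia
r = 1:1_000_000          # no million-element array is allocated here

n       = length(r)
third   = r[3]           # ranges support indexing
total   = sum(r)         # and reductions
has_six = 6 in 2:2:10    # and membership tests
```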

### Arbitrary Steps

For steps other than 1, use the three-part `start:step:stop` form:

snippet.juliarepl
```julia> collect(1:0.5:3)
5-element Array{Float64,1}:
1.0
1.5
2.0
2.5
3.0```

### Specific Number of Steps

To obtain a specific number of elements in a sequence, use `range`. For example, six evenly spaced elements from 1 to 3:

snippet.juliarepl
```julia> collect( range(1; stop=3, length=6) )
6-element Array{Float64,1}:
1.0
1.4
1.8
2.2
2.6
3.0```
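The step that `range` chose can be inspected with `step`, and `LinRange` is a closely related Base type that is also defined by its endpoints and length. A sketch with illustrative names:

```julia
r = range(1; stop=3, length=6)

spacing = step(r)            # distance between neighboring elements
lin     = LinRange(1, 3, 6)  # same endpoints and length
```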

# Backmatter

• DataStructures.jl contains many more useful data structures, including queues, stacks, accumulators, heaps, trees, etc.

## Notes

`AbstractArray{T,N}` types can be useful, e.g., for sorting, but they are too complex for this tutorial and are thus ignored.

It is a pity that julia does not force type declarations of all variables, and especially of its `Any` type.

When an array display would exceed the terminal display, Julia fits it with dot indicators, omitting middle elements. If you really want to see a long display, you can use `show( [1:10000;] )`.

## References

arraysvector.txt · Last modified: 2018/12/28 11:31 (external edit)