arraysvector


- snippet.juliarepl
    julia> pkgchk.( [ "julia" => v"1.0.3", "CategoricalArrays" => v"0.4.0" ] );

`Vector` is an alias for a 1-dimensional array. (The next chapter (arraysmatrix) discusses two-dimensional matrices and more linear algebra.)

- snippet.juliarepl
    julia> Vector{Float64}
    Array{Float64,1}

    julia> [1,2,3]    ## a 1-dimensional array (of Ints) of length 3
    3-element Array{Int64,1}:
     1
     2
     3

    julia> [1 2 3]    ## but spaces between numbers: a 2-dimensional array (of Ints) with 3 columns and 1 row
    1×3 Array{Int64,2}:
     1  2  3

- Vectors offer stack, queue, and set operations.
- `end` is a valid abbreviation for `length()` inside the index operator.
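
The `end` abbreviation mentioned above can be sketched as follows. The vector `x` and the variable names are illustrative assumptions, not part of the original chapter.

```julia
# Inside square brackets, `end` stands for the last valid index of the vector.
x = [10, 20, 30, 40]

last_elem   = x[end]       # same element as x[length(x)]
second_last = x[end-1]     # arithmetic on `end` is allowed
tail        = x[2:end]     # everything from position 2 onward (a new vector)

println((last_elem, second_last, tail))
```

Outside the index operator, `end` keeps its usual role of closing a block, so `length(x)` is still needed there.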

Julia can distinguish between ordinary vectors and LinearAlgebra vectors. LinearAlgebra vectors can have a (column) orientation, too. When needed, julia converts ordinary vectors into column vectors and vice-versa. A row vector is typically represented as a matrix with one row and many columns.

Thus, a transpose on a plain vector considers its input as a column vector, and renders it such that it can be used as if it were a row vector (which is really more like a matrix with one row, albeit with a different type name). This “transposed-vector” type is still different from a “1-row-matrix” type, but the “transposed-vector” values can seamlessly be treated as or compared to the “1-row-matrix” values.

- snippet.juliarepl
    julia> a= [1,2,3]    ## a plain (column) vector
    3-element Array{Int64,1}:
     1
     2
     3

    julia> ta1= transpose(a)    ## transpose is *not fully* a matrix, but a similar *sort-of-row* vector now
    1×3 LinearAlgebra.Transpose{Int64,Array{Int64,1}}:
     1  2  3

    julia> b= [1 2 3]    ## this is a matrix, with one row
    1×3 Array{Int64,2}:
     1  2  3

    julia> tb1= transpose(b)    ## this is a matrix, with one column
    3×1 LinearAlgebra.Transpose{Int64,Array{Int64,2}}:
     1
     2
     3

    julia> (a==b, a==ta1, a==tb1, b==a, b==ta1, b==tb1)    ## note the fifth element that is 'true'
    (false, false, false, false, true, false)

- Despite its type `1×3 LinearAlgebra.Transpose{Int64,Array{Int64,1}}`, `ta1` can be used just like `b`, which is a `1×3 Array{Int64,2}`. This is why `b==ta1`. However, `typeof(b) == typeof(ta1)` is false. (The type information is useful, because if you transpose `ta1` [again], you get the plain vector back; but if you transpose `b`, you get the two-dimensional matrix back.)
- Note that this is not symmetric: although `b` can be the same as the transpose of `a`, `a` can never be the same as a transpose of `b`. This is because a transpose of `b` remains a matrix with one column, and a plain one-dimensional vector is never the same as a two-dimensional matrix with one column.
- Not shown: The REPL-printed name of the transpose vector changes when LinearAlgebra is loaded. This is *not* because the type has changed, but because the display changes: after LinearAlgebra is loaded, the prefix is no longer shown.
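
The points about transposed vectors can be checked with a short script. This is a minimal sketch; the variable names `same_values`, `same_types`, `roundtrip`, and `collected` are assumptions made for illustration.

```julia
using LinearAlgebra

a   = [1, 2, 3]        # plain vector
b   = [1 2 3]          # matrix with one row
ta1 = transpose(a)     # "sort-of-row" wrapper around a

same_values = (ta1 == b)                   # compares shape and contents
same_types  = (typeof(ta1) == typeof(b))   # compares the types themselves
roundtrip   = transpose(ta1)               # transposing again gives a plain vector
collected   = collect(ta1)                 # materializes an ordinary one-row matrix

println((same_values, same_types, roundtrip == a, size(collected)))
```

If an ordinary `Array` is required (for example, by a function that dispatches on `Matrix`), `collect` is one way to drop the wrapper type.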

- snippet.juliarepl
    julia> [1 2 3] * [2,3,4]    ## even without LinearAlgebra.jl loaded, julia knows vector products
    1-element Array{Int64,1}:
     20

    julia> [1,2,3] .* [2,3,4]    ## with the dot, this is an element-by-element operation
    3-element Array{Int64,1}:
      2
      6
     12

    julia> transpose([1,2,3]) .* [2,3,4]    ## this is quasi-element by element, but really an outer product (row broadcast against column)
    3×3 Array{Int64,2}:
     2  4   6
     3  6   9
     4  8  12

    julia> transpose([1,2,3]) .* transpose([2,3,4])    ## but this is again just element by element
    1×3 Array{Int64,2}:
     2  6  12

`[1,2,3] * [1,2,3]` is an error. Two plain (column) vectors have no `*` product; one of them must first be transposed into a row.
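
A minimal sketch of the usual alternatives when two plain vectors need to be multiplied. The names `inner1`, `inner2`, `outer`, and `failed` are illustrative assumptions; `dot` comes from the LinearAlgebra standard library.

```julia
using LinearAlgebra    # provides dot()

x = [1, 2, 3]
y = [2, 3, 4]

inner1 = transpose(x) * y    # row-like times column: a scalar
inner2 = dot(x, y)           # the same inner product
outer  = x * transpose(y)    # column times row-like: a 3×3 matrix

# x * y is not defined for two plain vectors, so the call throws
failed = try
    x * y
    false
catch
    true
end

println((inner1, inner2, size(outer), failed))
```

Note the difference from `[1 2 3] * [2,3,4]` in the snippet above, which returns a 1-element array because its left operand is a true one-row matrix.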

A vector can be created by specifying all elements in a comma-separated list. (Space separated values create matrices.) Vectors usually convert seamlessly and as expected.

A plain vector, without orientation, looks like a column vector for most purposes:

- snippet.juliarepl
    julia> x1= [1,2,3]
    3-element Array{Int64,1}:
     1
     2
     3

    julia> show(IOContext(stdout, :compact => true), x1)    ## compact display
    [1, 2, 3]

As mentioned in Special Loops for Vectors and Tuples, you can also initialize a vector with a comprehension:

- snippet.juliarepl
    julia> [ (i+2, i/2) for i=4:6 ]
    3-element Array{Tuple{Int64,Float64},1}:
     (6, 2.0)
     (7, 2.5)
     (8, 3.0)

As noted, Julia no longer features row vectors but uses two-dimensional matrices with one row instead. Equivalently, you can transpose a (column) vector. To initialize a quasi-row vector (which is really just a matrix with one row), you have two choices: create a matrix or create a transposed vector.

- snippet.juliarepl
    julia> fill(0.0, 3, 1)    ## 3 by 1 array = 2-dim array = matrix
    3×1 Array{Float64,2}:
     0.0
     0.0
     0.0

    julia> fill(0.0, 3)'    ## the ' is the adjoint operator, which equals transpose() for real values
    1×3 LinearAlgebra.Adjoint{Float64,Array{Float64,1}}:
     0.0  0.0  0.0

- snippet.juliarepl
    julia> [1.0, NaN, 2.0]    ## Floats default to Float64
    3-element Array{Float64,1}:
       1.0
     NaN
       2.0

    julia> Vector{Float32}([1.0, NaN, 2.0])    ## Force Float32 vector
    3-element Array{Float32,1}:
       1.0
     NaN
       2.0

    julia> typeof([1.0f0, NaN32, 2.0f0])    ## f0 designates F32, so this is the same
    Array{Float32,1}

    julia> typeof([1.0f0, NaN32, 2.0])    ## 2.0 requires F64, so it is promoted
    Array{Float64,1}

For good programming practice, avoid uninitialized arrays. Please always initialize your variables. A common initialization is to zero or NaN, which can be achieved, e.g., with `zeros`, `fill`, or `repeat` (which replaced the older `repmat`):

- snippet.juliarepl
    julia> zeros(2)    ## built in
    2-element Array{Float64,1}:
     0.0
     0.0

    julia> fill(NaN, 3)    ## usually best for initialization
    3-element Array{Float64,1}:
     NaN
     NaN
     NaN

    julia> using LinearAlgebra

    julia> repeat([NaN, 1.0], 3)    ## for more than one value, start with an array and use repeat
    6-element Array{Float64,1}:
     NaN
       1.0
     NaN
       1.0
     NaN
       1.0

Do not use fill instead of repeat, or you will get an array of arrays:

- snippet.juliarepl
    julia> fill([NaN, 1.0], 3)
    3-element Array{Array{Float64,1},1}:
     [NaN, 1.0]
     [NaN, 1.0]
     [NaN, 1.0]

These functions also work on higher-dimensional arrays. Matrices also often use `fill()`. Type-specialized calls such as `zeros(Float64,5,5)` (five by five zeros) or `ones(Bool, 4, 4)` (four by four trues), as well as the `rand(3,5)` function explained in Random Variables, can also initialize arrays of any dimension.
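
As a small illustration of the initializers named above, the following sketch builds a few matrices and inspects their shapes and element types. The variable names are assumptions chosen for the example.

```julia
z = zeros(Float64, 2, 3)    # 2×3 matrix of 0.0
o = ones(Bool, 2, 2)        # 2×2 matrix of true
f = fill(NaN, 2, 3)         # 2×3 matrix of NaN
r = rand(3, 5)              # 3×5 matrix of uniform random Float64 values

for m in (z, o, f, r)
    println(size(m), " with element type ", eltype(m))
end
```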

- snippet.juliarepl
    julia> isnumeric(x::Vector)::Bool= (eltype(x) <: Union{Missing,Real});

    julia> for x in ([1,2], [1,2.0], [1,2.0,missing], [1,'a',2]); @info("'$x' is $(isnumeric(x))"); end
    [ Info: '[1, 2]' is true
    [ Info: '[1.0, 2.0]' is true
    [ Info: 'Union{Missing, Float64}[1.0, 2.0, missing]' is true
    [ Info: 'Any[1, 'a', 2]' is false

- snippet.juliarepl
    julia> using Formatting

    julia> join( sprintf1.("%4.1f", [0.123, pi, exp(3.0)]), " : " )
    " 0.1 :  3.1 : 20.1"

As already explained in the Inquiring chapter, functions with a trailing '!' modify their first argument, while functions without it just return a result and leave the contents of their first argument untouched.

`vcat`, `append!`, and `push!` all add elements to the end of a vector. `vcat` returns a new vector and leaves its arguments untouched:

- snippet.juliarepl
    julia> x= [1,2,3];

    julia> vcat(x, 4)    ## appends 4, but does not modify x
    4-element Array{Int64,1}:
     1
     2
     3
     4

    julia> x    ## the original vector x is not modified
    3-element Array{Int64,1}:
     1
     2
     3

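The functions ending in '!' modify the array in place: `push!` adds a single element and `append!` adds all elements of a collection. (Base Julia has no `vcat!`, and no `push` or `append` without the '!'.)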

    julia> push!(x,4); x    ## push!() acts like an in-place vcat(); appends 4 *and* modifies x
    4-element Array{Int64,1}:
     1
     2
     3
     4

There is a convenient short-hand “bracket” notation for vcat:

- snippet.juliarepl
    julia> [ [1,2,3]; [4,5,6] ]
    6-element Array{Int64,1}:
     1
     2
     3
     4
     5
     6

- snippet.juliarepl
    julia> x= [1,2,3];

    julia> append!(x, 4)    ## append = push. note '!' for changing contents of array
    4-element Array{Int64,1}:
     1
     2
     3
     4

    julia> pushfirst!(x, 0)    ## shift right and place 0 into pos 1 (often also called unshift)
    5-element Array{Int64,1}:
     0
     1
     2
     3
     4

    julia> (popfirst!(x), "--", x)    ## drop 0 from pos 1 and shift left
    (0, "--", [1, 2, 3, 4])

    julia> (pop!(x), "--", x)    ## drop 4 from last pos
    (4, "--", [1, 2, 3])

- In other languages, shift and unshift play the same role as `popfirst!` and `pushfirst!`.
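
The in-place functions above are enough to use a plain vector as a stack or as a queue. The following is a minimal sketch; the names `stack`, `queue`, `top`, and `front` are illustrative assumptions.

```julia
# Stack (last in, first out): push! and pop! both work at the end
stack = Int[]
push!(stack, 1); push!(stack, 2); push!(stack, 3)
top = pop!(stack)           # removes and returns the most recent element

# Queue (first in, first out): push! at the end, popfirst! at the front
queue = Int[]
push!(queue, 1); push!(queue, 2); push!(queue, 3)
front = popfirst!(queue)    # removes and returns the oldest element

append!(queue, [7, 8])      # append! adds several elements at once

println((top, stack, front, queue))
```

For heavier use, the DataStructures.jl package mentioned at the end of this chapter provides dedicated stack and queue types.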

- snippet.juliarepl
    julia> a1= ["a","b"]; a2= ["c","d"];

    julia> [a1, a2]    ## not auto-flattened
    2-element Array{Array{String,1},1}:
     ["a", "b"]
     ["c", "d"]

    julia> vcat(a1, a2)    ## flattened. or use shortcut [a1 ; a2]
    4-element Array{String,1}:
     "a"
     "b"
     "c"
     "d"

- snippet.juliarepl
    julia> using Base.Iterators

    julia> function flattenall(a::AbstractArray)
               while (any(x->(typeof(x) <: AbstractArray), a))
                   a= collect(Base.Iterators.flatten(a))
               end
               a
           end#function##
    flattenall (generic function with 1 method)

    julia> flattenall([[1,2], 1, [[3]]])    ## PS: you may want to convert this to Vector{Int}
    4-element Array{Int64,1}:
     1
     2
     1
     3

- You can also often flatten and nest via reshaping. For example,

- snippet.juliarepl
    julia> using Base.Iterators

    julia> arr= collect(product([1,2,3], [4,5]))
    3×2 Array{Tuple{Int64,Int64},2}:
     (1, 4)  (1, 5)
     (2, 4)  (2, 5)
     (3, 4)  (3, 5)

    julia> reshape(ans, (6,1))    ## A flatter list
    6×1 Array{Tuple{Int64,Int64},2}:
     (1, 4)
     (2, 4)
     (3, 4)
     (1, 5)
     (2, 5)
     (3, 5)

    julia> reshape(ans, (2,3))
    2×3 Array{Tuple{Int64,Int64},2}:
     (1, 4)  (3, 4)  (2, 5)
     (2, 4)  (1, 5)  (3, 5)

The `@.` macro is syntactic sugar that inserts dots wherever they are needed in function calls and operators.

- snippet.juliarepl
    julia> [ 10 .+ [-1,+1].*2; 20 .+ [-3,+5].^1 ]    ## accurate
    4-element Array{Int64,1}:
      8
     12
     17
     25

    julia> @. [ 10 + [-1,+1]*2; 20 + [-3,+5]^1 ]    ## lazy way of doing this
    4-element Array{Int64,1}:
      8
     12
     17
     25

    julia> vcat( @.(10 + (-i:i)*2 for i=1:2)... )    ## a more useful example: changes iterator into range
    8-element Array{Int64,1}:
      8
     10
     12
      6
      8
     10
     12
     14

It is often useful to lead or lag an array. For example, you may want to see whether values in a timeseries like `[ 1, 2, 4, 7, 11 ]` can be explained by their past value.

- snippet.juliarepl
    julia> lag(x::Vector{Float64}, num=1)::Vector{Float64}= vcat( fill(NaN,num), [ x[i] for i=1:length(x)-num ] );

    julia> x= [1.0,2,4,7,11]; [ x lag(x) ]
    5×2 Array{Float64,2}:
      1.0  NaN
      2.0    1.0
      4.0    2.0
      7.0    4.0
     11.0    7.0

You could now run a regression explaining the first column with the second.
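
One way such a regression could be run is with the backslash (least-squares) operator. This is only a sketch under stated assumptions: `lagvec` is a hypothetical stand-in for the `lag()` function above, and the rows containing NaN are simply dropped before fitting.

```julia
using LinearAlgebra

# hypothetical helper with the same idea as lag() above: shift right, pad with NaN
lagvec(x::Vector{Float64}, num=1) = vcat(fill(NaN, num), x[1:end-num])

x  = [1.0, 2, 4, 7, 11]
lx = lagvec(x)

keep = .!isnan.(lx)                 # rows where the lagged value exists
y    = x[keep]                      # dependent variable
X    = [ones(sum(keep)) lx[keep]]   # regressors: intercept and lagged value
beta = X \ y                        # least-squares coefficients (intercept, slope)

println(beta)
```

The backslash operator solves the least-squares problem directly when `X` has more rows than columns, so no explicit matrix inverse is needed.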

- The function dispatch chapter defines a better `lag()` vector function that works for more types.
- ShiftedArrays.jl offers similar functionality (and even some event-study capability to shift multiple vectors for alignment). It is a little more efficient (working with views rather than copies), but it always uses Missing instead of NaN. Missing works for all types, but is much slower.
- The TimeSeries.jl package provides `lead()` and `lag()` functions for its `TimeArray` object. It is covered in Univariate Timeseries.

- snippet.juliarepl
    julia> reverse([1,2,3])
    3-element Array{Int64,1}:
     3
     2
     1

Like `sort`, the `reverse` function creates a copy of the argument array and reverses that. If you'd like to reverse the input array in place, use the `reverse!` function.

Many use cases for arrays involve iterating over each element, and applying some operation on each element. The result could be a new array, changes to the same array, or an aggregate of some kind.

Julia provides the `in` operator, which, in conjunction with a `for` loop, iterates over every element of the array:

- snippet.juliarepl
    julia> for item in ["A","B","C"]; println(item); end
    A
    B
    C

    julia> [ item for item in ["a","b","c"] ]    ## a "comprehension" is often convenient, creating a new array
    3-element Array{String,1}:
     "a"
     "b"
     "c"

- snippet.juliarepl
    julia> circidx(i::Int, arrlen::Int)::Int= mod(i-1, arrlen) + 1;

    julia> for i=1:9; println(i, " => ", circidx(i,4)); end
    1 => 1
    2 => 2
    3 => 3
    4 => 4
    5 => 1
    6 => 2
    7 => 3
    8 => 4
    9 => 1
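
Base Julia also has a `mod1` function that performs this kind of 1-based wrap-around. The sketch below compares it against `circidx` as defined above and uses it to cycle through a short vector; the names `v` and `wrapped` are illustrative assumptions.

```julia
# circidx as defined in the snippet above
circidx(i::Int, arrlen::Int)::Int = mod(i - 1, arrlen) + 1

v = ["A", "B", "C", "D"]

# mod1(i, n) maps any integer i into 1:n, so indexing wraps around
wrapped = [v[mod1(i, length(v))] for i in 1:9]

# the two functions agree, also for zero and negative positions
agree = all(circidx(i, 4) == mod1(i, 4) for i in -10:10)

println((wrapped, agree))
```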

- The IterTools.jl package offers a set of common iteration tools.

- snippet.juliarepl
    julia> for (index,item) in enumerate(["A","B","C"]); println("$(index) -> $(item)"); end#for
    1 -> A
    2 -> B
    3 -> C

- snippet.juliarepl
    julia> for index in eachindex(["A","B","C"]); println(index); end
    1
    2
    3

- snippet.juliarepl
    julia> for (item1,item2) in zip(["A","B","C"], ["a","b","c"]); println(item1, " ", item2); end
    A a
    B b
    C c

Most of the time, you can just use a 'postfix-dot' function equivalent call (e.g., `sqrt.([1.0,2.0])`) to apply a function to each element of a vector. Sometimes, the `map()` function is more convenient:

- snippet.juliarepl
    julia> map(x->(x+1), [1,2,3])
    3-element Array{Int64,1}:
     2
     3
     4

For very large arrays and slow functions, the `pmap` function can perform the operation in parallel, but it rarely pays. See Parallel Processing.

To compare two vectors element by element, use `.==` (and equivalents such as `.>=`; see the dot operators chapter):

- snippet.juliarepl
    julia> [1,5,1,5,1,5] .>= [2,10,10,2,1,5]
    6-element BitArray{1}:
     false
     false
     false
      true
      true
      true

Or use map:

- snippet.juliarepl
    julia> y = [2,1,1,1,1]; x = [1,3,5,0,2];

    julia> map( (x,y) -> ((x>y) ? 1 : 2), x, y )
    5-element Array{Int64,1}:
     2
     1
     1
     2
     1

To find the maximum or minimum values across two arrays (generating an array with the larger of the two values at each index), use the `max.` or `min.` function:

- snippet.juliarepl
    julia> max.([1,3,5,0,2], [2,1,1,1,1])
    5-element Array{Int64,1}:
     2
     3
     5
     1
     2

    julia> min.([1,3,5,0,2], [2,1,1,1,1])
    5-element Array{Int64,1}:
     1
     1
     1
     0
     1

To perform an operation on every element in the array, and then perform some sort of aggregation on the result (min, max, sum, etc.), use `mapreduce`:

- snippet.juliarepl
    julia> mapreduce(x->x^2, +, [1,2,3,4,5])    ## sum of squares of first five numbers
    55

This is often faster than `sum(map(x->x^2, [1,2,3,4,5]))`, because `mapreduce()` does not need to allocate the intermediate array of squares. (In this example, `mapreduce()` is twice as fast as `map(x->x^2, [1,2,3,4,5]) |> sum`.)
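
The following sketch lists several equivalent ways to write such a map-then-aggregate step in Base Julia. The variable names are assumptions; no timings are claimed here.

```julia
v = [1, 2, 3, 4, 5]

s1 = mapreduce(x -> x^2, +, v)     # map and reduce in one pass
s2 = sum(x -> x^2, v)              # sum() accepts the function as its first argument
s3 = sum(map(x -> x^2, v))         # allocates the intermediate array first
s4 = sum(x^2 for x in v)           # generator: no intermediate array

m  = mapreduce(abs, max, [-7, 3, 5])   # largest absolute value

println((s1, s2, s3, s4, m))
```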

- snippet.juliarepl
    julia> using Missings

    julia> v= [1.0,NaN,missing,4.0]; replace(v, NaN => missing)    ## special case: could use Missings.replace(v,NaN)
    4-element Array{Union{Missing, Float64},1}:
     1.0
      missing
      missing
     4.0

**Important**: **M**issing is a type, **m**issing is a value.
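
The distinction between the type and the value, and a few common ways of handling missing entries, can be sketched with Base functions only. The variable names below are illustrative assumptions.

```julia
v = [1.0, missing, 4.0]

is_type_of  = (typeof(missing) == Missing)   # Missing is the type of the value missing
elem_type   = eltype(v)                      # a Union of Missing and Float64

raw_sum     = sum(v)                         # missing propagates through arithmetic
clean_sum   = sum(skipmissing(v))            # skip the missing entries
filled      = coalesce.(v, 0.0)              # replace missing by a default value

println((is_type_of, elem_type, raw_sum, clean_sum, filled))
```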

It is not a great idea to keep R habits in Julia. It is better to learn to think the Julia way. However, when needed, the following function will work like `by` in R, applying a function to all elements split by a (categorical) vector:

- snippet.juliarepl
    julia> using CategoricalArrays

    julia> function Rby(obj::AbstractVector, ind::AbstractVector, func::Function, x...)
               map( elem->(elem, func(obj[findall(elem .== ind)], x...)), levels(ind) )
           end;#function##

    julia> using Random

    julia> Random.seed!(0);

    julia> randcateg4 = rand('a':'d', 100);    ## sample testset: 4 categories

    julia> hundredsquares= [1:100;].^2 ;

    julia> using Statistics

    julia> Rby(hundredsquares, randcateg4, mean)    ## for each element, by randcateg4, calculate mean
    4-element Array{Tuple{Char,Float64},1}:
     ('a', 3575.25)
     ('b', 3704.0)
     ('c', 3516.217391304348)
     ('d', 2854.310344827586)

    julia> Rby(hundredsquares, randcateg4, quantile, [0.25, 0.50, 0.75])    ## or calculate the three quantiles
    4-element Array{Tuple{Char,Array{Float64,1}},1}:
     ('a', [885.25, 2234.0, 6321.0])
     ('b', [720.25, 3192.5, 5932.0])
     ('c', [626.0, 1936.0, 6376.5])
     ('d', [625.0, 2209.0, 4489.0])

See also Univariate Statistics -- Classifications for calculating an original-size vector of group means (R `ave()`).
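
A variant of the same split-and-apply idea that needs no extra package can be sketched with `unique` and a `Dict`. The function name `groupapply` and the toy data are assumptions made for illustration; unlike `Rby`, the result is a dictionary with no guaranteed order.

```julia
# hypothetical helper: apply func to the elements of obj, grouped by the labels in ind
groupapply(func, obj::AbstractVector, ind::AbstractVector) =
    Dict(k => func(obj[ind .== k]) for k in unique(ind))

vals = [1, 2, 3, 4, 5, 6]
cats = ['a', 'b', 'a', 'b', 'a', 'c']

sums = groupapply(sum, vals, cats)       # one sum per category
counts = groupapply(length, vals, cats)  # group sizes

println((sums, counts))
```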

An array of type `Any` can support elements of all types:

- snippet.juliarepl
    julia> Any[1, 1.0, "1", true]    ## often evil; avoid this when you know the types of elements
    4-element Array{Any,1}:
     1
     1.0
     "1"
     true

You should avoid use of generic (Any) types, especially in arrays, unless you desperately need them (which is rare).

You can even create uninitialized arrays, whose values contain junk.

The following example combines both sins: a 5-dimensional array of element type `Any`, which the `Array` constructor can create uninitialized:

- snippet.juliarepl
    julia> Array{Any,5}    ## evil-squared
    Array{Any,5}
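
For contrast, the sketch below shows what an uninitialized construction looks like with a concrete element type, next to the initialized alternatives recommended earlier. The variable names are assumptions; the contents of `junk` are arbitrary and must not be relied upon.

```julia
junk  = Vector{Float64}(undef, 3)   # allocated but NOT initialized: contents are arbitrary
safe  = fill(NaN, 3)                # initialized: every element is NaN
typed = Float64[]                   # empty vector with a concrete element type
push!(typed, 1.5)                   # grows as needed, stays Vector{Float64}

println((length(junk), safe, typed))
```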

Using such constructs can not only rob you of compile-time type checking, but also of memory and speed if ever accidentally used the wrong way. After all, your computer is good at operating on its native types, and if this is all you need, then restrict yourself to it.

Range objects can be viewed as smart unexpanded sequences (consecutive tuples or arrays). To expand a range object into an array, use the `collect()` function:

- snippet.juliarepl
    julia> 1:3
    1:3

    julia> typeof(1:3)
    UnitRange{Int64}

    julia> collect(1:3)    ## convert iterator (range) now into an array of ints
    3-element Array{Int64,1}:
     1
     2
     3

    julia> [1:3;]    ## [range ;] (or [range ...]) means collect, too
    3-element Array{Int64,1}:
     1
     2
     3

For steps other than 1, use the three-part `start:step:stop` form:

- snippet.juliarepl
    julia> collect(1:0.5:3)
    5-element Array{Float64,1}:
     1.0
     1.5
     2.0
     2.5
     3.0

To obtain a specific number of elements in a sequence, use `range`. For example, six elements from 1 to 3 are:

- snippet.juliarepl
    julia> collect(range(1; stop=3, length=6))
    6-element Array{Float64,1}:
     1.0
     1.4
     1.8
     2.2
     2.6
     3.0
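
Because ranges are unexpanded, many operations work on them directly without `collect()`. The following is a minimal sketch with illustrative variable names.

```julia
r  = 1:0.5:3                        # start:step:stop, not yet expanded
r6 = range(1, stop=3, length=6)     # six evenly spaced values from 1 to 3

n        = length(r)                # number of elements, computed from the endpoints
second   = r[2]                     # ranges can be indexed like vectors
total    = sum(1:1_000_000)         # no million-element array is created
expanded = collect(r)               # only now is an array allocated

println((n, second, total, expanded, length(r6)))
```

Expanding a range is only needed when the result must be modified in place or passed to code that requires an `Array`.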

- DataStructures.jl contains many more useful data structures, including queues, stacks, accumulators, heaps, trees, etc.

`AbstractArray{T,N}` types can be useful, e.g., for sorting, but are too complex for this tutorial and thus ignored.

It is a pity that julia does not force type declarations of all variables, and especially of its `Any` type.

When an array display would exceed the terminal display, Julia fits it with dot indicators, omitting middle elements. If you really want to see a long display, you can use `show( [1:10000;] )`.

arraysvector.txt · Last modified: 2018/12/28 11:31 (external edit)