snippet.juliarepl
julia> pkgchk.( [ "julia" => v"1.0.3" ] );

Arrays

It is the first-class treatment of arrays (of floats) that is the killer feature that turns a general-purpose language into a scientific language. Julia is a first-class scientific language!

In Python and R, code runs much faster when users call specialized vector operations instead of writing explicit loops. This is because user programs are interpreted, whereas the underlying language libraries are carefully coded and compiled programs, often written in C. In contrast, expressing operations in vector form usually brings no speed advantage in Julia, because Julia compiles the user code, too. Thus, the fastest Julia code is often based on plain old iterative index-into-array manipulations: exactly the kind of operation that Python and R advise users to avoid.
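
As a minimal sketch of this point, the same sum of squares can be written as an explicit loop or in vector form. The function names sumsq_loop and sumsq_vec are made up for this illustration; no timings are claimed here.

```julia
# Sum of squares with an explicit index loop; Julia compiles this function.
function sumsq_loop(x::Vector{Float64})
    total = 0.0
    for i in eachindex(x)
        total += x[i]^2
    end
    return total
end

# The vector spelling first allocates a temporary array for x .^ 2.
sumsq_vec(x::Vector{Float64}) = sum(x .^ 2)

x = [1.0, 2.0, 3.0]
println(sumsq_loop(x))   # 14.0
println(sumsq_vec(x))    # 14.0
```

Both versions return the same value; the loop version simply avoids the temporary array.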

A Python or R user who wants to write a new, efficient library (usually) needs to learn and code in C. A Julia user does not.

Arrays, Vectors, and Matrices are Composite Types

  • An array is an ordered mutable (i.e., read-write) collection of (same-type) elements. (PS: if the element type is Any, then each cell may contain a value of a different type.)
  • Arrays can be of any type (e.g., an array of Bools, such as Array{Bool,1}, which means a 1-dimensional array [i.e., a vector] of Booleans). However, in practice, arrays are most often used for numerical containers, such as Array{Float64,1} (a vector of Floats) or Array{Int,2} (a 2-dimensional matrix of integers).
  • Avoid arrays of Any to keep the discipline and type-checking of narrow types. Ideally, use only arrays of machine-native types. Vector operations on machine-native types tend to be fast.
  • Vectors and matrices are aliases for one- and two-dimensional arrays:
snippet.juliarepl
julia> Array{Int64, 1} == Vector{Int64}
true

julia> Array{Int64, 2} == Matrix{Int64}
true

julia> Vector
Array{T,1} where T

julia> Matrix
Array{T,2} where T

“Dot functions” apply scalar functions to arrays on an element-by-element basis: abs.([-2,3,-2]) yields a new array [2,3,2]. The dot is syntax, not a new function definition. Only the scalar abs was defined; the dot applies it to each element. Moreover, Julia fuses nested dot operations into a single loop. If A, B, and C are arrays of the same size, A .* (B .+ C) does not create an intermediate array for B .+ C; instead, each cell is calculated as A[i]*(B[i]+C[i]).
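
The following sketch illustrates the dot syntax with a user-defined scalar function (halfsq is a made-up name) and a fused expression written into a preallocated array.

```julia
# A user-defined scalar function; no array method is ever written for it.
halfsq(x) = x^2 / 2

v = [-2, 3, -2]
println(abs.(v))        # [2, 3, 2], a new array; v itself is unchanged
println(halfsq.(v))     # [2.0, 4.5, 2.0]

A = [1.0, 2.0]; B = [3.0, 4.0]; C = [5.0, 6.0]
out = similar(A)        # preallocated result array
out .= A .* (B .+ C)    # one fused loop: out[i] = A[i] * (B[i] + C[i])
println(out)            # [8.0, 20.0]
```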

Primitive and Composite Types: Assignment and Passing

Arrays, like dictionaries, pairs, structures, dataframes, etc., are not primitive types (like numerics or bits) but composite types. Their behavior is different, especially in the context of assignments and passing. You have been warned. (A third kind is the abstract type, but abstract types are mostly names for broader categories.)

Julia uses “call by sharing” (like Perl or Python). Effectively, a function receiving an argument holds a “name alias” to the object. This is not like the “call-by-value” or “call-by-reference” of more traditional languages like C. It means that the receiving function cannot replace the caller's object with a different one, but it can alter the contents of the object. If the function assigns a new value to the alias, the alias merely points to the new value, and the function loses all access to the object that its caller passed.

It is more akin to “passing by reference.” The function never gets its own unique copy of the composite object, just the alias. Assignments are like function calls in that assigning an object does not make a copy, either. Instead, an assignment also only creates an “alias” to the same object.

snippet.juliarepl
julia> a= [1,2];

julia> b= a; ( b[1]= -99; b[2]= -98; );		## b is an alias to a; assigning to b alters a

julia> a					## a has been altered
2-element Array{Int64,1}:
 -99
 -98
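
The same aliasing applies to function arguments. In this sketch (clobber! and rebind are hypothetical names), altering the contents through the alias is visible to the caller, while assigning to the alias is not.

```julia
# Altering the contents through the alias is visible to the caller.
function clobber!(v)
    v[1] = -99
    return nothing
end

# Assigning to the local name only makes `v` point to a new array.
function rebind(v)
    v = [0, 0]
    return v
end

a = [1, 2]
clobber!(a)
println(a)          # [-99, 2]
r = rebind(a)
println(a)          # still [-99, 2]
println(r === a)    # false: r is a different object
```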

There are two ways to work with and alter a copy of a without clobbering a itself. The .= assignment copies the contents:

snippet.juliarepl
julia> a= [1,2]; b= [0,0];

julia> b.= a; ( b[1]= -99; b[2]= -98; );	## the '.=' operator assigns the contents of a to the contents of b

julia> a		   			## thus, a is unaffected.
2-element Array{Int64,1}:
 1
 2

You can also use copy() or deepcopy().

snippet.juliarepl
julia> a= [1,2];

julia> b= copy(a); ( b[1]= -99; b[2]= -98; ); 	## copy() creates a new array b with the same contents as a

julia> a		   			## thus, a is unaffected.
2-element Array{Int64,1}:
 1
 2
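
The difference between copy() and deepcopy() only shows up when the elements are themselves mutable. A small sketch with a vector of vectors:

```julia
a = [[1, 2], [3, 4]]     # a vector whose elements are themselves vectors

b = copy(a)              # shallow: new outer vector, same inner vectors
b[1][1] = -99            # reaches through to a's first inner vector
println(a[1])            # [-99, 2]

c = deepcopy(a)          # recursive: the inner vectors are duplicated, too
c[2][1] = -98
println(a[2])            # [3, 4]
```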

Accidental Confusion about Types vs Values

It is easy for newcomers to accidentally mix up types and values, which can be confusing.

statement | result | explanation
v= Vector{Float64} | a type | desired? v is now an alias for a type. x= v([NaN]) creates a float vector variable
v= Vector{Float64}(undef,10) | an uninitialized variable | bad. v is a 10-element vector that holds garbage noise.
v= Vector{Float64}([10]) | an initialized variable | good. v is a 1-element vector object, containing the number 10.0.

Uninitialized arrays contain garbage, and the resulting errors are often difficult to track down.

A common beginner's mistake is some variant of v= Vector{Float64}; v[1]= 1.0. Here v is a type, not an array, so there is no element to assign to and the indexed assignment fails.
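
A short sketch of the safe alternatives, using the standard constructors zeros, fill, and fill! (the variable names are arbitrary):

```julia
T = Vector{Float64}            # T is a type, not an array
println(T isa Type)            # true

v = zeros(Float64, 3)          # initialized array of zeros
w = fill(NaN, 3)               # initialized with a sentinel value
u = Vector{Float64}(undef, 3)  # uninitialized...
fill!(u, 1.0)                  # ...so fill it before reading from it

v[1] = 1.0                     # assigning into an array object works
println(v)                     # [1.0, 0.0, 0.0]
```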

Tuples

Tuples may be intuitive enough that you may not need to read the remainder of this chapter.

  • A tuple is like a read-only array.
  • Tuples use parentheses notation () instead of array bracket notation [].
  • Think of a tuple as a value that can only appear on the right side of an assignment. Think of an array as an object that can appear on the left or on the right side of an assignment.
  • Tuples with mixed element types are less harmful than Any arrays, because tuples are read-only and a tuple's type records the type of each element, which creates less havoc in the compiler optimizations.
  • Tuples can also be considered to be immutable structs (albeit nameless); or as the arguments to a function sans the function name.
  • Tuples are well suited to returning multiple values from a function. They can be typed like Tuple{Float64,Float64,Float64} or NTuple{3,Float64}.

A tuple can pack elements, of any type, into a container. Tuples can hold specific values that are handed off as arguments to functions (or assignment to other data structures).
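
As an illustration of returning several values at once, the following sketch uses a hypothetical function summary3 that returns a tuple, which the caller then unpacks.

```julia
# Returns a tuple of three Float64 values: (minimum, mean, maximum).
function summary3(x::Vector{Float64})
    return (minimum(x), sum(x) / length(x), maximum(x))
end

lo, avg, hi = summary3([1.0, 2.0, 6.0])   # unpack the returned tuple
println((lo, avg, hi))                     # (1.0, 3.0, 6.0)
println(summary3([1.0, 2.0, 6.0]) isa NTuple{3,Float64})   # true
```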

Creating Tuples

snippet.juliarepl
julia> ( ( 12, ("ab", true), 13.0 ) )       ## nested tuple --- outermost parens are grouping, not tuple!
(12, ("ab", true), 13.0)

julia> x= ( ( 12, ("ab", true), 13.0 ) )    ## variable x now holds (read-only) tuple
(12, ("ab", true), 13.0)

julia> typeof(x)
Tuple{Int64,Tuple{String,Bool},Float64}

julia> dump(x)
Tuple{Int64,Tuple{String,Bool},Float64}
  1: Int64 12
  2: Tuple{String,Bool}
    1: String "ab"
    2: Bool true
  3: Float64 13.0

Single-Element Tuples

Because parentheses also serve for grouping, single-element tuples require special notation:

snippet.juliarepl
julia> x= (1); typeof(x)      ## parens are groupings, so they group evaluations, and now you get a scalar, not a tuple!
Int64

julia> x= ((1)); typeof(x)    ## multiple parens are no better
Int64

julia> x= (1,); typeof(x)     ## if you need a single-element tuple, the trailing comma is the special case notation
Tuple{Int64}

julia> typeof( (1,2,) ) == typeof( (1,2) )    ## here the trailing comma is optional
true

WARNING: Parentheses are also used to group statements.

Accessing Elements of Tuples

snippet.juliarepl
julia> ('a','b',3)[1]
'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)

julia> x= ('a','b',3);  x[ [3,1,2,2] ]
(3, 'a', 'b', 'b')

julia> x[2]
'b': ASCII/Unicode U+0062 (category Ll: Letter, lowercase)

julia> x[2]='c'                           ## tuples are readonly!
ERROR: MethodError: no method matching setindex!(::Tuple{Char,Char,Int64}, ::Char, ::Int64)
Stacktrace:

Nesting and Merging Tuples

snippet.juliarepl
julia> t1= (:a,:b); t2= (:c,:d);	## define two tuples

julia> (t1,t2)				## creates a nested tuple
((:a, :b), (:c, :d))

julia> vcat( t1, t2 )			## vertical concat: an *array* of tuples
2-element Array{Tuple{Symbol,Symbol},1}:
 (:a, :b)
 (:c, :d)

julia> (t1...,t2...)			## a merged single tuple
(:a, :b, :c, :d)

Tuples as Function Arguments

The contents of a tuple can be unpacked and passed as the arguments of a function, too. This is done with the ... (splat) operator. For example:

snippet.juliarepl
julia> f(x, y) = x + y
f (generic function with 1 method)

julia> t= (1, 2)
(1, 2)

julia> f(t...)
3

Named Tuples

The elements of a tuple can have names:

snippet.juliarepl
julia> t= (a=1, b=2)
(a = 1, b = 2)

julia> typeof( t )
NamedTuple{(:a, :b),Tuple{Int64,Int64}}

julia> t[1]
1
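
Beyond positional indexing, the elements of a named tuple can be reached by name. A brief sketch:

```julia
t = (a = 1, b = 2)

println(t.a)          # 1
println(t[:b])        # 2
println(keys(t))      # (:a, :b)
println(values(t))    # (1, 2)
```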

Tuples and Arrays

You may never need the information in this section.

Tuple-Array Conversions

Tuple to Array

To convert a tuple into an array, iterate over it in a comprehension:

snippet.juliarepl
julia> tup3= ( (12,3), [1.0, 2.0], ["A", 'n', 13] )
((12, 3), [1.0, 2.0], Any["A", 'n', 13])

julia> typeof(tup3)
Tuple{Tuple{Int64,Int64},Array{Float64,1},Array{Any,1}}

julia> [ i for i in tup3 ]
3-element Array{Any,1}:
 (12, 3)
 [1.0, 2.0]
 Any["A", 'n', 13]
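
The built-in collect function offers an alternative to the comprehension. A small sketch with a homogeneous tuple:

```julia
tup = (1, 2, 3)
arr = collect(tup)            # gathers the tuple's elements into a vector

println(arr)                  # [1, 2, 3]
println(arr isa Vector{Int})  # true
```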

Array to Tuple

snippet.juliarepl
julia> x= [ 1, 2, 3, "ab", true ]
5-element Array{Any,1}:
    1
    2
    3
     "ab"
 true

julia> tuple(x)                ## tuple with array inside, not a conversion
(Any[1, 2, 3, "ab", true],)

julia> tuple(x...)             ## tuple of elements from array
(1, 2, 3, "ab", true)

julia> Tuple(x)
(1, 2, 3, "ab", true)

The trailing comma in the output of tuple(x) marks a single-element tuple; it is what distinguishes the tuple (1+2,) from the grouped expression (1+2).

Mixing Tuples and Arrays: Tuples of Tuples, Arrays of Tuples, and Tuples of Arrays

A tuple is not an array. A variable can hold either a tuple or an array.

snippet.juliarepl
julia> ( ( 12, ("ab", true), 13.0 ), ( 12, 13.0 ) )
((12, ("ab", true), 13.0), (12, 13.0))

julia> [ ( 12, ("ab", true), 13.0 ), ( 12, 13.0 ) ]
2-element Array{Tuple{Int64,Any,Vararg{Float64,N} where N},1}:
 (12, ("ab", true), 13.0)
 (12, 13.0)

A tuple can also mix tuples and arrays:

snippet.juliarepl
julia> x= ( (12,3), [1.0, 2.0], ["A", 'n', 13] )
((12, 3), [1.0, 2.0], Any["A", 'n', 13])

julia> typeof(x)
Tuple{Tuple{Int64,Int64},Array{Float64,1},Array{Any,1}}

You can now change x[2][2] (because it sits in an array), but not x[1][2] (because it sits in a tuple).
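
A quick check of this, using the same x as above. The try/catch is only there so that the failing assignment does not abort the script.

```julia
x = ((12, 3), [1.0, 2.0], ["A", 'n', 13])

x[2][2] = 9.0              # fine: x[2] is an array
println(x[2])              # [1.0, 9.0]

try
    x[1][2] = 0            # x[1] is a tuple, so this throws
catch err
    println(err isa MethodError)   # true
end
```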

Convert Mixed Into Tuples or Arrays Only

snippet.juliarepl
julia> x= ( (12,3, (1,2,3), [1,2,3]), [1.0, 2.0], ["A", 'n', 13, (3,4), [5,6]], )
((12, 3, (1, 2, 3), [1, 2, 3]), [1.0, 2.0], Any["A", 'n', 13, (3, 4), [5, 6]])

julia> typeof(x)
Tuple{Tuple{Int64,Int64,Tuple{Int64,Int64,Int64},Array{Int64,1}},Array{Float64,1},Array{Any,1}}

FIXME How to descend into mixed tuple/array structure, and make all tuple or all array out of it.

FIXME (Andreas:) Maybe (t -> tuple(t...)).(x) and (t -> [t...]).(x) do what you want here. — nope, this does not do the job

snippet.juliafix
julia> tupelize( x )
((12, 3, (1, 2, 3), (1, 2, 3)), (1.0, 2.0), ("A", 'n', 13, (3, 4), (5, 6)))
 
julia> arraylize( x )
[[12, 3, [1, 2, 3], [1, 2, 3]], [1.0, 2.0], ["A", 'n', 13, [3, 4], [5, 6]]]
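
One possible way to get this behavior is a pair of small recursive functions. The following is only a sketch: tupelize and arraylize are not part of Julia and are defined here, and the sketch assumes that only tuples and arrays should be descended into (strings and other values are left alone) and that the resulting arrays may have element type Any.

```julia
# Fallback: anything that is not a tuple or an array is returned unchanged.
tupelize(x) = x
# Convert each element recursively, then splat the result into a tuple.
tupelize(x::Union{Tuple,AbstractArray}) = (map(tupelize, x)...,)

arraylize(x) = x
# Any[...] keeps every value as-is instead of promoting to a common type.
arraylize(x::Union{Tuple,AbstractArray}) = Any[arraylize(e) for e in x]

x = ((12, 3, (1, 2, 3), [1, 2, 3]), [1.0, 2.0], ["A", 'n', 13, (3, 4), [5, 6]])

println(tupelize(x))
# ((12, 3, (1, 2, 3), (1, 2, 3)), (1.0, 2.0), ("A", 'n', 13, (3, 4), (5, 6)))

a = arraylize(x)
println(a[1][3] == [1, 2, 3])   # true: the inner tuple became an array
println(a[3][4] == [3, 4])      # true
```

Multi-dimensional arrays would be flattened by this sketch, so it is only suitable for nested vectors and tuples.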

Backmatter

Notes

  • Unfortunately, it is not possible to turn on a compiler warning whenever an uninitialized variable is used before assignment.
  • There is also a more general AbstractArray type. It is usually used for function arguments when the function should work not only on arrays but also on similar types (data structures like Diagonal). AbstractArray{T,N} can also be useful, e.g., for sorting, but it is too complex for this tutorial and thus ignored.
  • It is a pity that Julia does not force type declarations for objects, but silently obliges and creates them anyway. This makes it easy to accidentally declare arrays with too broad a type, especially Any array types, which hoses both Julia's type-checking and efficiency.

FIXME (Andreas:) This is indeed a common pitfall in Julia, but sometimes you do want a heterogeneous array. (ivo:) Yes, but this is the language's fault. For these obscure cases, the user should have to declare 'Any' explicitly!
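
To see how an array literal ends up with element type Any, and how to state the element type explicitly, consider this small sketch:

```julia
println(eltype([1, 2.5]))        # Float64: the Int is promoted
println(eltype([1, "ab"]))       # Any: no common concrete type
println(eltype([]))              # Any: an empty untyped literal

v = Float64[]                    # explicitly typed, still empty
push!(v, 1)                      # converted to 1.0 on insertion
println(v)                       # [1.0]

h = Any[1, "ab"]                 # a heterogeneous array, declared on purpose
println(eltype(h))               # Any
```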

arraysintro.txt · Last modified: 2018/12/28 13:40 (external edit)