arraysintro

Differences

This shows you the differences between two versions of the page.

arraysintro [2018/12/11 01:22]
julia [Notes]
arraysintro [2018/12/28 13:40] (current)
Line 11: Line 11:
  
 ```juliarepl ```juliarepl
-julia> pkgchk( [ "julia" => v"1.0.2" ] )+julia> pkgchk.( [ "julia" => v"1.0.3" ] );
 ``` ```
- 
  
 # Arrays # Arrays
  
-First-class treatment of arrays (of floats) are the feature that turns general-purpose languages into scientific languages.  Julia is first-class!+It is the first-class treatment of arrays (of floats) that is the killer feature turning a general-purpose language into a scientific language.  Julia is a first-class scientific language!
  
  
-- An array is an ordered mutable (i.e., read-write) collection of (same-type) elements. FIXME (Andreas:) "same type" might be misleading here since an `Any` array can contain elements of any types. This is a key difference between concretely and abstractly typed arrays.+In Python and R, execution speed is much better when users call specialized vector operations rather than work with iterative operations.  This is because user programs are interpreted, but the underlying language libraries are carefully coded and compiled programs, often written in C.  In contrast, expressing operations in vectors is usually *not* helpful in Julia, because Julia compiles the user code, too.  Thus, the fastest Julia operations are often based on plain old iterative index-into-array manipulations---exactly the kind of operation that Python or R advise users to avoid.
  
-- Arrays can be of any type (e.g., an array of Bools, such as `Array{Bool,1}`, which is a vector of Booleans).  However, in practice, arrays are most often used for numerical containers, such as `Array{Float64,1}` (a vector of Floats) or `Array{Int,2}` (a matrix of integers). FIXME (Andreas:) I think it would be more clear if you briefly explain the two type parameters here.+A Python or R user who wants to write a new, efficient language library usually needs to learn and code in C.  A Julia user does not.
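
The point about loops can be illustrated with a minimal sketch. The function names `sumsq_loop` and `sumsq_vec` are hypothetical and chosen only for this illustration; no timings are claimed here, since they depend on the machine and the Julia version.

```julia
# Sum of squares written as a plain index loop, the style that
# Python or R users are usually told to avoid.
function sumsq_loop(x::Vector{Float64})
    total = 0.0
    for i in eachindex(x)
        total += x[i]^2
    end
    return total
end

# The same computation in "vectorized" style.
# Note that x .^ 2 allocates a temporary array before sum() runs.
sumsq_vec(x::Vector{Float64}) = sum(x .^ 2)

x = collect(1.0:5.0)
println(sumsq_loop(x))
println(sumsq_vec(x))
```

Both versions are compiled by Julia, so the loop is not a second-class citizen; it simply avoids the temporary array.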
  
-- Avoid arrays of `Any` to keep the discipline and type-checking of narrow types. ​ Ideally, use only arrays of machine-native types. ​FIXME (Andreas:) Briefly explain why.+ 
 +## Arrays, Vectors, and Matrices are Composite Types 
 + 
 +- An array is an ordered mutable (i.e., read-write) collection of (same-type) elements. (PS: if the element type is `Any`, then each cell may hold a value of a different type.)
 + 
 +- Arrays can be of any type (e.g., an array of Bools, such as `Array{Bool,​1}`,​ which means a 1-dimensional array [i.e., a vector] of Booleans). ​ However, in practice, arrays are most often used for numerical containers, such as `Array{Float64,​1}` (a vector of Floats) or `Array{Int,​2}` (a 2-dimensional matrix of integers). 
 + 
 +- Avoid arrays of `Any` to keep the discipline and type-checking of narrow types. ​ Ideally, use only arrays of machine-native types. ​ ​Vector operations on machine-native types tend to be fast.
  
 - Vectors and matrices are aliases for one- and two-dimensional arrays: - Vectors and matrices are aliases for one- and two-dimensional arrays:
Line 44: Line 50:
 ``` ```
  
-Dot functions operate scalar functions on arrays on an element-by-element basis: `abs.([-2,3,-2])` yields `[2,3,2]` in-situ. FIXME (Andreas:) explain that this is *syntax* and not special functions.+"Dot functions" apply scalar functions to arrays on an element-by-element basis: `abs.([-2,3,-2])` yields `[2,3,2]`.  The dot is syntax, not a new function definition.  Only `abs` was defined; writing `abs.` applies the scalar function `abs` to each element.  Moreover, Julia fuses nested dot operations.  If `A`, `B`, and `C` are arrays, `A .* (B .+ C)` does not create an intermediate array for `B .+ C`; instead, each element is calculated in a single loop as `A[i] * (B[i] + C[i])`.
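
A short sketch of the dot syntax follows. The function `halve` is a hypothetical scalar function defined only for this illustration, to show that the dot works on any function without an array method having been written.

```julia
# A user-defined scalar function; no array version is defined anywhere.
halve(x) = x / 2

v = [-2, 3, -2]
r1 = abs.(v)      # applies abs to each element and returns a new array
r2 = halve.(v)    # the same dot syntax works for a user-defined function
println(r1)
println(r2)
println(v)        # v itself is unchanged

A = [1.0, 2.0]; B = [3.0, 4.0]; C = [5.0, 6.0]
out = similar(A)        # preallocated result array
out .= A .* (B .+ C)    # fused into one loop; results are written into out
println(out)
```

Using `.=` with a preallocated `out` is one way to keep the whole expression free of temporary arrays.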
  
  
 +## Primitive and Composite Types: Assignment and Passing
  
 +Arrays, like dictionaries, pairs, structures, dataframes, etc., are not [[https://docs.julialang.org/en/v1/manual/types/index.html|primitive types (like numerics or bits) but composite types]].  Their behavior is different, especially in the context of assignments and passing.  You have been warned.  (A third kind is abstract types, but they are mostly names for broader categories.)
  
-* WARNING: JULIA PASSES EFFECTIVELY REFERENCES 
  
  
-Functions never pass arrays, but references to arrays ("objects").  Thus, the function does not have copies of the vectors but aliases that refer to the same vector content.  FIXME (Andreas:) I don't think the ("objects") part helps here.+Julia uses [[https://stackoverflow.com/questions/35235597/julia-function-argument-by-reference%E2%80%8C%E2%80%8B|"call by sharing"]] (like Perl or Python).  Effectively, a function receiving a parameter holds a "name alias" to the object.  This is not like "call-by-value" or "call-by-reference," as in more traditional languages like C.  It means that the receiving function cannot rebind the caller's variable to a different object, but it can alter the contents of the object.  If the function assigns a new value to the alias, the function loses all access to the passed object of its caller.
 + 
 +It is more akin to "passing by references."  The function never gets its own unique copy of the composite type, but just the alias.  **Assignments are like function calls**, in that assigning an object does not make a copy, either.  Instead, assignments also only create "aliases" to the same object.
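
The difference between altering contents and rebinding the alias can be sketched as follows. The function names `clobber!` and `rebind` are hypothetical and exist only for this illustration.

```julia
# Mutates the contents of the array it was handed; the caller sees the change.
function clobber!(v)
    v[1] = -99
    return nothing
end

# Rebinds the local name v to a brand-new array; the caller's array is untouched.
function rebind(v)
    v = [0, 0]
    return v
end

a = [1, 2]
clobber!(a)
println(a)    # [-99, 2]
rebind(a)
println(a)    # still [-99, 2]
```

By convention, Julia function names that modify their arguments end in `!`, which is why the first function is spelled `clobber!`.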
  
-- Assignments are function calls, so even assignments do not make copies: FIXME (Andreas:) I don't think it is correct to say that assignments are function calls. Maybe just say that assignments don't create copies. 
  
 ```juliarepl ```juliarepl
-julia> a= [1,​2]; ​ ​( ​b= a; ); ( b[1]= -99; b[2]= -98; );     ​a+julia> a= [1,2]; 
 + 
 +julia> ​b= a; ( b[1]= -99; b[2]= -98; ); ## b is an alias to a; assigning to b alters a 
 + 
 +julia> a ## ​has been altered
 2-element Array{Int64,​1}:​ 2-element Array{Int64,​1}:​
  -99  -99
Line 63: Line 75:
  
 ``` ```
-FIXME (Andreas:) I still think one-liners are bad for readability but I'll not comment further on that. 
  
-- To make copy of the contents *in situ*, you can use the `.=` operator:+There are two ways to work with the contents of `a` without clobbering it.  The `.=` assignment copies the contents into an existing array:
  
 ```juliarepl ```juliarepl
-julia> a= [1,2]; b= [0,​0]; ​   ​( ​b.= a ); ( b[1]= -99; b[2]= -98; );    a+julia> a= [1,2]; b= [0,0]; 
 + 
 +julia> b.= a; ( b[1]= -99; b[2]= -98; ); ## the '.=' operator assigns the contents of a to the contents of b
 + 
 +julia> a     ## thus, a is unaffected.
 2-element Array{Int64,​1}:​ 2-element Array{Int64,​1}:​
  1  1
Line 74: Line 89:
  
 ``` ```
-FIXME (Andreas:) I know what you mean but `.=` is not really an operator and I think it would be better explain what it actually is to avoid confusion later on. 
  
-You can also use `copy()` or `deepcopy()` ​if you really need a copy and you do not want to clobber your original inadvertently.+You can also use `copy()` or `deepcopy()`.
  
 +```juliarepl
 +julia> a= [1,2];
  
 +julia> b= copy(a); ( b[1]= -99; b[2]= -98; ); ## copy() creates a new array, so b is not an alias of a
  
-* WARNING: ACCIDENTAL TYPES AND OBJECTS DEFINITION MIXUPS+julia> a     ## thus, a is unaffected.
 +2-element Array{Int64,​1}: 
 + 1 
 + 2
  
 +```
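
When arrays contain other arrays, `copy()` and `deepcopy()` behave differently. The following is a small sketch of that difference; the variable names are illustrative only.

```julia
nested = [[1, 2], [3, 4]]

shallow = copy(nested)       # new outer array, but the same inner arrays
deep    = deepcopy(nested)   # new outer array and new inner arrays

shallow[1][1] = -99          # mutates an inner array that nested also holds
deep[2][1]    = -98          # mutates an inner array owned only by deep

println(nested[1][1])        # -99: the change made through shallow is visible
println(nested[2][1])        # 3: the change made through deep is not
```

For a flat array of numbers, as in the examples above, `copy()` is sufficient.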
  
-^ **statement** ​ ^ **result** ​ ^ **explanation** ​ ^ 
-| `v= Vector{Float64}` | a type | desired? v is now an alias for a type. `x= v([NaN])` creates a float vector variable ​ | 
-| `v= Vector{Float64}(undef,​10)` | an uninitialized variable | bad.  v is a 10-element vector that holds garbage noise. ​ | 
-| `v= Vector{Float64}([10])` | an initialized variable | good. v is 1-element vector object. ​ | 
  
-FIXME (Andreas:) Maybe explain what the last example is actually doing here. Also explain why uninitialized memory might be bad. 
  
-A common beginner'​s mistake is some variant of `v= Vector{Float64}; ​ v + 0.0`.  A type and a value cannot be added.+## Accidental Confusion about Types vs Values
  
 +It is easy for newcomers to accidentally mix up types and values, which can be confusing.
  
-Arrays are discussed in great detail in the next chapters.+^ **statement** ​ ^ **result** ​ ^ **explanation** ​ ^ 
 +| `v= Vector{Float64}` | a type | desired? v is now an alias for a type. `x= v([NaN])` creates a float vector variable ​ | 
 +| `v= Vector{Float64}(undef,​10)` | an uninitialized variable | bad.  v is a 10-element vector that holds garbage noise. ​ | 
 +| `v= Vector{Float64}([10])` | an initialized variable | good. v is a 1-element vector object containing the number 10.0 |
  
 +Uninitialized variables can contain garbage values, which cause errors that are often difficult to track down.
  
-- In R, execution speed demands expressing every operation in vectors. ​ This is because the R user program is interpreted,​ but its underlying libraries are compiled. ​ Expressing operations in vectors is usually *not* helpful in Julia, because Julia compiles the user code, too.  ​Thus, the fastest Julia operations are often based on plain old index-into-array manipulations---exactly the kind of operation that R suggests you to avoid.+A common beginner'​s mistake ​is some variant of `v= Vector{Float64}; ​ v[1]= 1.0`.  ​A type cannot be assigned a value.
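
The type-versus-value distinction can be checked directly. This is a minimal sketch with illustrative variable names; `zeros` and `fill!` are used here as two common ways to avoid uninitialized contents.

```julia
T = Vector{Float64}            # T is a type, not a vector
println(T isa DataType)        # true

v = zeros(Float64, 3)          # an initialized vector of three 0.0s
v[1] = 1.0                     # assigning into a value works
println(v)                     # [1.0, 0.0, 0.0]

u = Vector{Float64}(undef, 3)  # allocated, but the contents are arbitrary
fill!(u, NaN)                  # overwrite every element before first use
println(all(isnan, u))         # true
```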
  
  
Line 192: Line 213:
 ((:a, :b), (:c, :d)) ((:a, :b), (:c, :d))
  
-julia> vcat( t1, t2 ) ## an *array* of tuples+julia> vcat( t1, t2 ) ## ​vertical concat: ​an *array* of tuples
 2-element Array{Tuple{Symbol,​Symbol},​1}:​ 2-element Array{Tuple{Symbol,​Symbol},​1}:​
  (:a, :b)  (:a, :b)
Line 206: Line 227:
 ## Tuples as Function Arguments ## Tuples as Function Arguments
  
-The contents of a tuple can be unpacked for passing as the arguments of a [[functions#tuples_as_arguments|function]], too.  This can work even using the `...` operator. For example:+The contents of a tuple can be unpacked for passing as the arguments of a [[funother#tuples_as_arguments|function]], too.  This can work even using the `...` operator. For example:
  
 ```juliarepl ```juliarepl
Line 219: Line 240:
 ``` ```
  
 +
 +## Named Tuples
 +
 +Tuple elements can have names:
 +
 +```juliarepl
 +julia> t= (a=1, b=2)
 +(a = 1, b = 2)
 +
 +julia> typeof( t )
 +NamedTuple{(:​a,​ :​b),​Tuple{Int64,​Int64}}
 +
 +julia> t[1]
 +1
 +
 +```
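
Beyond positional indexing, named tuples can be accessed by name. The following sketch uses only functions from Base; the variable names are illustrative.

```julia
t = (a = 1, b = 2)

println(t.a)          # access by field name: 1
println(t[:b])        # access by symbol: 2
println(keys(t))      # (:a, :b)
println(values(t))    # (1, 2)

# Named tuples are immutable; merge() builds a new one with a changed field.
t2 = merge(t, (b = 20,))
println(t2)           # (a = 1, b = 20)
```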
  
  
Line 246: Line 283:
 ``` ```
  
-Julia could not know whether you meant 
- 
-```juliarepl 
-julia> ( (1,2), (3,4), (5,6) ) 
-((1, 2), (3, 4), (5, 6)) 
-``` 
- 
-to be a matrix or a 3-tuple of 2-tuples, so if you want to interpret it and convert it to a 3x2 matrix, then use the [[arraysmatrix#​creating_multidimensional_grids|`reinterpret()`]] function. 
- 
-FIXME (Andreas:) I don't understand the last sentence here. In what sense would the tuple be a matrix and how would you use `reinterpret`?​ 
  
  
Line 314: Line 341:
  
 ```juliarepl ```juliarepl
-julia> x= ( (12,3), [1.0, 2.0], ["​A",​ '​n',​ 13, (3,4), [5,6]], ) + 
-((12, 3), [1.0, 2.0], Any["​A",​ '​n',​ 13, (3,4), [5,6]])+julia> x= ( (12,3, (1,2,3), [1,2,3]), [1.0, 2.0], ["​A",​ '​n',​ 13, (3,4), [5,6]], ) 
 +((12, 3, (1, 2, 3), [1, 2, 3]), [1.0, 2.0], Any["​A",​ '​n',​ 13, (3, 4), [5, 6]]) 
 + 
 +julia> typeof(x) 
 +Tuple{Tuple{Int64,​Int64,​Tuple{Int64,​Int64,​Int64},​Array{Int64,​1}},​Array{Float64,​1},​Array{Any,​1}} 
 ``` ```
  
 FIXME How to descend into mixed tuple/array structure, and make all tuple or all array out of it. FIXME How to descend into mixed tuple/array structure, and make all tuple or all array out of it.
-FIXME (Andreas:) Maybe `(t -> tuple(t...)).(x)` and `(t -> [t...]).(x)` do what you want here.+ 
 +FIXME (Andreas:) Maybe `(t -> tuple(t...)).(x)` and `(t -> [t...]).(x)` do what you want here. --- nope, this does not do the job
  
 ```juliafix ```juliafix
 julia> tupelize( x ) julia> tupelize( x )
-((12, 3), (1.0, 2.0), ("​A",​ '​n',​ 13, (3,​4), ​[5,6]))+((12, 3, (1, 2, 3), (1, 2, 3)), (1.0, 2.0), ("​A",​ '​n',​ 13, (3, 4), (5, 6)))
  
 julia> arraylize( x ) julia> arraylize( x )
-[[12, 3], [1.0, 2.0], Any["​A",​ '​n',​ 13, [3,4], [5,6]]]+[[12, 3, [1, 2, 3], [1, 2, 3]], [1.0, 2.0], ["​A",​ '​n',​ 13, [3, 4], [5, 6]]]
 ``` ```
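
One possible way to get the behavior wished for above is recursion with multiple dispatch. This is only a sketch: `tupelize` and `arraylize` are the hypothetical names from the example, not existing Julia functions, and the element types of the resulting arrays (often `Any`) are simply whatever the comprehension infers.

```julia
# Leaves (numbers, strings, chars, ...) are returned unchanged.
tupelize(x) = x
# Containers are converted element by element, descending recursively.
tupelize(x::Union{Tuple,AbstractArray}) = Tuple(tupelize(e) for e in x)

arraylize(x) = x
arraylize(x::Union{Tuple,AbstractArray}) = [arraylize(e) for e in x]

x = ( (12, 3, (1,2,3), [1,2,3]), [1.0, 2.0], ["A", 'n', 13, (3,4), [5,6]], )

println(tupelize(x))
println(arraylize(x))
```

Strings are not `AbstractArray`s in Julia, so `"A"` is treated as a leaf and is not split into characters.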
 +
  
  
Line 343: Line 377:
 - There is also a more general `AbstractArray` type.  It is usually used for function arguments when the function should work not only on arrays but also on similar types (data structures like `Diagonal`).  This chapter ignores this.   `AbstractArray{T,N}` can also be useful, e.g., for sorting, but is too complex for this tutorial and thus ignored. - There is also a more general `AbstractArray` type.  It is usually used for function arguments when the function should work not only on arrays but also on similar types (data structures like `Diagonal`).  This chapter ignores this.   `AbstractArray{T,N}` can also be useful, e.g., for sorting, but is too complex for this tutorial and thus ignored.
  
-- It is a pity that Julia does not force type declarations for objects, but silently obliges to create them.  This makes it easy to accidentally declare arrays with too broad a type, especially Any array types---which hoses both julia's type-checking and efficiency.+- It is a pity that Julia does not force type declarations for objects, but silently obliges and creates them.  This makes it easy to accidentally declare arrays with too broad a type, especially `Any` array types---which hoses both Julia's type-checking and efficiency.
  
-FIXME (Andreas:) This is indeed a common pitfall in Julia but sometimes you do want a heterogeneous array+FIXME (Andreas:) This is indeed a common pitfall in Julia but sometimes you do want a heterogeneous array --- ivo: yes, but this is the language's fault.  For these obscure cases, the user should have to declare `Any` explicitly!
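
How an `Any` array can arise by accident, and how an explicit element type avoids it, can be seen in a short sketch. The variable names are illustrative only.

```julia
narrow      = [1.0, 2.0, 3.0]   # Vector{Float64}
mixed       = [1, "two", 3.0]   # no common concrete type, so Vector{Any}
empty_any   = []                # Vector{Any}: a frequent accidental source
empty_float = Float64[]         # explicitly typed empty vector

println(eltype(narrow))         # Float64
println(eltype(mixed))          # Any
println(eltype(empty_any))      # Any

push!(empty_float, 1)           # the Int is converted to 1.0 on insertion
println(empty_float)            # [1.0]
```

Starting a collection with `Float64[]` rather than `[]` keeps the narrow type, and a later `push!` of an incompatible value (such as a string) raises an error instead of silently widening the array.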
  
-- NamedTuples used to allow `x= @NT( a=1, b=2 )`, but are not working in Julia 1.0 in Sep 2018. 
- 
-FIXME (Andreas:) It's just that named tuples are now in Base and have a different syntax. It's `(a = 1, b = 2)`. 
  
  
 ## References ## References
- 
-- [Julia DataArrays](https://​github.com/​JuliaStats/​DataArrays.jl) which allow for Missing data 
  
 - [Julia Array documentation](http://​docs.julialang.org/​en/​release-0.5/​stdlib/​arrays/​) - [Julia Array documentation](http://​docs.julialang.org/​en/​release-0.5/​stdlib/​arrays/​)
arraysintro.txt · Last modified: 2018/12/28 13:40 (external edit)