arraysintro

Differences

This shows you the differences between two versions of the page.

arraysintro [2018/12/11 02:05]
julia [References]
arraysintro [2018/12/28 13:40] (current)
Line 11: Line 11:
  
 ```juliarepl
-julia> pkgchk( [ "julia" => v"1.0.2" ] )
+julia> pkgchk.( [ "julia" => v"1.0.3" ] );
 ```
-
  
 # Arrays
  
-First-class treatment of arrays (of floats) are the feature that turns general-purpose languages into scientific languages.  Julia is first-class!
+First-class treatment of arrays (of floats) is the killer feature that turns a general-purpose language into a scientific language.  Julia is a first-class scientific language!
  
  
-- An array is an ordered mutable (i.e., read-write) collection of (same-type) elements. FIXME (Andreas:) "same type" might be misleading here since an `Any` array can contain elements of any types. This is a key difference between concretely and abstractly typed arrays.
+In Python and R, execution speed is much better when users call specialized vector operations rather than work with iterative operations.  This is because user programs are interpreted, but the underlying language libraries are carefully coded and compiled programs, often written in C.  In contrast, expressing operations in vectors is usually *not* necessary for speed in Julia, because Julia compiles the user code, too.  Thus, the fastest Julia operations are often based on plain old iterative index-into-array manipulations, which are exactly the kind of operation that Python and R advise users to avoid.
  
-- Arrays can be of any type (e.g., an array of Bools, such as `Array{Bool,1}`, which is a vector of Booleans).  However, in practice, arrays are most often used for numerical containers, such as `Array{Float64,1}` (a vector of Floats) or `Array{Int,2}` (a matrix of integers). FIXME (Andreas:) I think it would be more clear if you briefly explain the two type parameters here.
+A Python or R user who wants to write a new efficient language library (usually) needs to learn and code in C.  A Julia user does not.
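
The loop-based style described above can be sketched as follows. This is only a minimal illustration: the function name `sumsq_loop` is hypothetical, and no timing results are claimed here.

```julia
# Hypothetical helper: sum of squares written as a plain indexed loop.
function sumsq_loop(x::Vector{Float64})
    total = 0.0
    for i in eachindex(x)       # iterate over the valid indices of x
        total += x[i] * x[i]
    end
    return total
end

x = [1.0, 2.0, 3.0]
s_loop = sumsq_loop(x)          # loop version; no temporary array is created
s_vec  = sum(x .^ 2)            # vectorized version; allocates a temporary array for x .^ 2
```

Both forms compute the same value; the loop is ordinary compiled Julia code rather than a call into a separate C library.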
  
-- Avoid arrays of `Any` to keep the discipline and type-checking of narrow types.  Ideally, use only arrays of machine-native types. FIXME (Andreas:) Briefly explain why.
+
+## Arrays, Vectors, and Matrices are Composite Types
+
+- An array is an ordered mutable (i.e., read-write) collection of (same-type) elements. (PS: if the element type is `Any`, then each cell may contain a value of a different concrete type.)
+
+- Arrays can be of any type (e.g., an array of Bools, such as `Array{Bool,1}`, which means a 1-dimensional array [i.e., a vector] of Booleans).  The first type parameter is the element type; the second is the number of dimensions.  However, in practice, arrays are most often used for numerical containers, such as `Array{Float64,1}` (a vector of 64-bit floats) or `Array{Int,2}` (a 2-dimensional matrix of integers).
+
+- Avoid arrays of `Any` to keep the discipline and type-checking of narrow types.  Ideally, use only arrays of machine-native types.  Operations on arrays of machine-native types tend to be fast, because the compiler knows the type of every element.
  
 - Vectors and matrices are aliases for one- and two-dimensional arrays:
Line 44: Line 50:
 ``` ```
  
-Dot functions operate scalar functions on arrays on an element-by-element basis: `abs.([-2,3,-2])` yields `[2,3,2]` in-situ. FIXME (Andreas:) explain that this is *syntax* and not special functions.
+"Dot functions" apply scalar functions to arrays on an element-by-element basis: `abs.([-2,3,-2])` yields a new array `[2,3,2]`.  The dot is syntax, and not a new function definition.  Only `abs` was defined.  Writing `abs.` applies the scalar function `abs` to each element.  Moreover, Julia cleverly fuses nested dot operations into a single loop.  If A, B, and C are arrays, `A .* (B .+ C)` does not create an intermediate array for `B .+ C`; instead, each element is calculated as `A[i]*(B[i]+C[i])`.
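
A short sketch of the dot syntax, assuming nothing beyond Base Julia. The function name `halve` is hypothetical and only shows that the dot works for user-defined scalar functions as well.

```julia
# Hypothetical scalar function; no array method is defined for it.
halve(x) = x / 2

v = [-2, 3, -2]
w = abs.(v)                 # applies abs to each element and returns a new array
h = halve.(v)               # the same dot syntax works for user-defined functions

A = [1, 2, 3]; B = [10, 20, 30]; C = [1, 1, 1]
D = A .* (B .+ C)           # fused into one loop: D[i] = A[i] * (B[i] + C[i])
```

Note that `v` itself is left unchanged by `abs.(v)`; an in-place update would be written `v .= abs.(v)`.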
  
  
+## Primitive and Composite Types: Assignment and Passing
 
+Arrays, like dictionaries, pairs, structures, dataframes, etc., are not [[https://docs.julialang.org/en/v1/manual/types/index.html|primitive types (like numerics or bits) but composite types]].  They behave differently from primitive values, especially in assignments and when passed to functions.  You have been warned.  (A third kind is abstract types, but they are mostly names for broader categories.)
  
-* WARNING: JULIA PASSES EFFECTIVELY REFERENCES 
  
  
-Functions never pass arrays, but references to arrays ("objects").  Thus, the function does not have copies of the vectors but aliases that refer to the same vector content.  FIXME (Andreas:) I don't think the ("objects") part helps here.
+Julia uses [[https://stackoverflow.com/questions/35235597/julia-function-argument-by-reference%E2%80%8C%E2%80%8B|"call by sharing"]] (like Perl or Python).  Effectively, a function receiving a parameter holds a "name alias" to the object.  This is not like "call-by-value" or "call-by-reference," as in more traditional languages like C.  It means that the receiving function cannot rebind the caller's variable to a different object, but it can alter the contents of the object.  If the function assigns a new value to the alias, the caller's object is untouched and the function loses all access to it.
+
+In practice, it is more akin to "passing by reference."  The function never gets its own unique copy of the composite object, but just the alias.  **Assignments are like function calls**, in that assigning an object does not make a copy, either.  Instead, an assignment also only creates an "alias" to the same object.
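
The difference between altering contents and rebinding a name can be sketched as follows. The function names `mutate!` and `rebind` are hypothetical and exist only for this illustration.

```julia
# Hypothetical functions illustrating call by sharing.
function mutate!(v)
    v[1] = -99          # alters the contents of the caller's array
    return nothing
end

function rebind(v)
    v = [0, 0]          # rebinds the local name only; the caller's array is untouched
    return v
end

a = [1, 2]
mutate!(a)              # a is now [-99, 2]
r = rebind(a)           # r is a new array; a is still [-99, 2]
b = a                   # assignment creates another alias, not a copy
same = (b === a)        # both names refer to the same object
```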
  
-- Assignments are function calls, so even assignments do not make copies: FIXME (Andreas:) I don't think it is correct to say that assignments are function calls. Maybe just say that assignments don't create copies. 
  
 ```juliarepl
-julia> a= [1,2];  ( b= a; ); ( b[1]= -99; b[2]= -98; );     a
+julia> a= [1,2];
+
+julia> b= a; ( b[1]= -99; b[2]= -98; ); ## b is an alias to a; assigning to elements of b alters a
+
+julia> a ## has been altered
 2-element Array{Int64,1}:
  -99
Line 63: Line 75:
 
 ```
-FIXME (Andreas:) I still think one-liners are bad for readability but I'll not comment further on that. 
  
-- To make copy of the contents *in situ*, you can use the `.=` operator:
+There are two ways to work with the contents of `a` without clobbering the original.  The `.=` (broadcast) assignment copies the contents into an existing array:
  
 ```juliarepl
-julia> a= [1,2]; b= [0,0];    ( b.= a ); ( b[1]= -99; b[2]= -98; );    a
+julia> a= [1,2]; b= [0,0];
+
+julia> b.= a; ( b[1]= -99; b[2]= -98; ); ## '.=' copies the contents of a into the existing array b
+
+julia> a     ## thus, a is unaffected.
 2-element Array{Int64,1}:
  1
Line 74: Line 89:
 
 ```
-FIXME (Andreas:) I know what you mean but `.=` is not really an operator and I think it would be better explain what it actually is to avoid confusion later on. 
  
-You can also use `copy()` or `deepcopy()` if you really need a copy and you do not want to clobber your original inadvertently.
+You can also use `copy()` (a shallow copy) or `deepcopy()` (which also copies nested objects).
  
+```juliarepl
+julia> a= [1,2];
 
+julia> b= copy(a); ( b[1]= -99; b[2]= -98; ); ## copy() creates a new array with the same contents as a
 
-* WARNING: ACCIDENTAL TYPES AND OBJECTS DEFINITION MIXUPS
+julia> a     ## thus, a is unaffected.
+2-element Array{Int64,1}:
+ 1
+ 2
 
+```
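
For a flat array of numbers, `copy()` and `deepcopy()` behave the same. The difference shows only with nested containers, as in this minimal sketch (the variable names are arbitrary):

```julia
nested = [[1, 2], [3, 4]]       # an array whose elements are themselves arrays

shallow = copy(nested)          # new outer array, but the inner arrays are shared
deep    = deepcopy(nested)      # the inner arrays are copied as well

shallow[1][1] = -99             # alters the shared inner array, so nested changes too
deep[2][1]    = -98             # alters only the deep copy
```

After these statements, `nested[1][1]` is `-99`, while `nested[2][1]` is still `3`.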
  
-^ **statement** ​ ^ **result** ​ ^ **explanation** ​ ^ 
-| `v= Vector{Float64}` | a type | desired? v is now an alias for a type. `x= v([NaN])` creates a float vector variable ​ | 
-| `v= Vector{Float64}(undef,​10)` | an uninitialized variable | bad.  v is a 10-element vector that holds garbage noise. ​ | 
-| `v= Vector{Float64}([10])` | an initialized variable | good. v is 1-element vector object. ​ | 
  
-FIXME (Andreas:) Maybe explain what the last example is actually doing here. Also explain why uninitialized memory might be bad. 
  
-A common beginner's mistake is some variant of `v= Vector{Float64};  v + 0.0`.  A type and a value cannot be added.
+## Accidental Confusion about Types vs Values
 
+It is easy for newcomers to accidentally mix up types and values.
 
-Arrays are discussed in great detail in the next chapters.
+^ **statement**  ^ **result**  ^ **explanation**  ^
+| `v= Vector{Float64}` | a type | desired? v is now an alias for a type. `x= v([NaN])` creates a float vector variable.  |
+| `v= Vector{Float64}(undef,10)` | an uninitialized variable | bad.  v is a 10-element vector that holds arbitrary garbage values until its elements are assigned.  |
+| `v= Vector{Float64}([10])` | an initialized variable | good. v is a 1-element vector object, containing the number 10.0.  |
 
+Uninitialized arrays can contain garbage, which causes errors that are often difficult to track down.
 
-- In R, execution speed demands expressing every operation in vectors.  This is because the R user program is interpreted, but its underlying libraries are compiled.  Expressing operations in vectors is usually *not* helpful in Julia, because Julia compiles the user code, too.  Thus, the fastest Julia operations are often based on plain old index-into-array manipulations---exactly the kind of operation that R suggests you to avoid.
+A common beginner's mistake is some variant of `v= Vector{Float64};  v[1]= 1.0`.  A type is not a container, so it cannot be indexed or assigned into.
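
The distinction between a type and a value can be checked directly. The following sketch uses only Base Julia; the variable names are arbitrary.

```julia
T = Vector{Float64}             # T is a type, not an array
T isa DataType                  # true
# T[1] = 1.0                    # would throw a MethodError: a type is not a container

v = Vector{Float64}(undef, 3)   # a value: 3 elements, contents arbitrary until assigned
fill!(v, 0.0)                   # overwrite every element with a known value

z = zeros(Float64, 3)           # allocate and initialize in one step
w = T([10])                     # call the type as a constructor: converts [10] to [10.0]
```

Using `zeros` (or `fill`) instead of `undef` is one way to avoid reading garbage by accident.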
  
  
Line 192: Line 213:
 ((:a, :b), (:c, :d))
 
-julia> vcat( t1, t2 ) ## an *array* of tuples
+julia> vcat( t1, t2 ) ## vertical concat: an *array* of tuples
 2-element Array{Tuple{Symbol,Symbol},1}:
  (:a, :b)
Line 206: Line 227:
 ## Tuples as Function Arguments
 
-The contents of an tuple can be unpacked for passing as the arguments of a [[functions#tuples_as_arguments|function]], too.  This can work even using the `...` operator. For example:
+The contents of a tuple can be unpacked and passed as the arguments of a [[funother#tuples_as_arguments|function]], too.  This works with the `...` (splat) operator. For example:
 
 ```juliarepl
Line 219: Line 240:
 ```
  
+
+## Named Tuples
+
+Tuple elements can have names:
+
+```juliarepl
+julia> t= (a=1, b=2)
+(a = 1, b = 2)
+
+julia> typeof( t )
+NamedTuple{(:a, :b),Tuple{Int64,Int64}}
+
+julia> t[1]
+1
+
+```
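
Besides positional indexing, the elements of a named tuple can be reached by name. A brief sketch using only Base functions:

```julia
t = (a=1, b=2)

x1 = t.a                    # access by name
x2 = t[:b]                  # access by symbol
x3 = t[1]                   # access by position still works
k  = keys(t)                # the names, as a tuple of symbols
vals = values(t)            # the values, as a plain tuple

# Named tuples are immutable; merge returns a new one with a field replaced.
t2 = merge(t, (b=20,))
```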
  
  
Line 246: Line 283:
 ``` ```
  
-Julia could not know whether you meant 
- 
-```juliarepl 
-julia> ( (1,2), (3,4), (5,6) ) 
-((1, 2), (3, 4), (5, 6)) 
-``` 
- 
-to be a matrix or a 3-tuple of 2-tuples, so if you want to interpret it and convert it to a 3x2 matrix, then use the [[arraysmatrix#​creating_multidimensional_grids|`reinterpret()`]] function. 
- 
-FIXME (Andreas:) I don't understand the last sentence here. In what sense would the tuple be a matrix and how would you use `reinterpret`?​ 
  
  
Line 314: Line 341:
  
 ```juliarepl
-julia> x= ( (12,3), [1.0, 2.0], ["A", 'n', 13, (3,4), [5,6]], )
-((12, 3), [1.0, 2.0], Any["A", 'n', 13, (3,4), [5,6]])
+
+julia> x= ( (12,3, (1,2,3), [1,2,3]), [1.0, 2.0], ["A", 'n', 13, (3,4), [5,6]], )
+((12, 3, (1, 2, 3), [1, 2, 3]), [1.0, 2.0], Any["A", 'n', 13, (3, 4), [5, 6]])
+
+julia> typeof(x)
+Tuple{Tuple{Int64,Int64,Tuple{Int64,Int64,Int64},Array{Int64,1}},Array{Float64,1},Array{Any,1}}
+
 ```
  
 FIXME How to descend into a mixed tuple/array structure and make an all-tuple or all-array structure out of it.
-FIXME (Andreas:) Maybe `(t -> tuple(t...)).(x)` and `(t -> [t...]).(x)` do what you want here.
+
+FIXME (Andreas:) Maybe `(t -> tuple(t...)).(x)` and `(t -> [t...]).(x)` do what you want here. (No, this does not do the job: it does not recurse into nested elements.)
  
 ```juliafix
 julia> tupelize( x )
-((12, 3), (1.0, 2.0), ("A", 'n', 13, (3,4), [5,6]))
+((12, 3, (1, 2, 3), (1, 2, 3)), (1.0, 2.0), ("A", 'n', 13, (3, 4), (5, 6)))
 
 julia> arraylize( x )
-[[12, 3], [1.0, 2.0], Any["A", 'n', 13, [3,4], [5,6]]]
+[[12, 3, [1, 2, 3], [1, 2, 3]], [1.0, 2.0], ["A", 'n', 13, [3, 4], [5, 6]]]
 ```
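
One possible way to approach the FIXME above is a pair of small recursive functions. This is only a sketch under assumptions: `tupelize` and `arraylize` are hypothetical helpers, not existing Julia functions, and the printed form of the results may differ from the desired output shown above (for example, in how element types are displayed).

```julia
# Hypothetical helpers: recurse into tuples and arrays, leave everything else alone.
tupelize(x) = x                                     # leaves (numbers, strings, chars) are returned as is
tupelize(x::Union{Tuple,AbstractArray}) = tuple(map(tupelize, x)...)

arraylize(x) = x
arraylize(x::Union{Tuple,AbstractArray}) = [arraylize(e) for e in x]

x = ( (12,3, (1,2,3), [1,2,3]), [1.0, 2.0], ["A", 'n', 13, (3,4), [5,6]], )
xt = tupelize(x)            # every nested array becomes a tuple
xa = arraylize(x)           # every nested tuple becomes an array
```

The design relies on multiple dispatch: the generic method handles leaves, and the `Union{Tuple,AbstractArray}` method handles containers. Strings are not `AbstractArray`s, so they are treated as leaves.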
 +
  
  
Line 343: Line 377:
 - There is also a more general `AbstractArray` type.  It is usually used for function arguments when the function should work not only on arrays but also on similar types (data structures like `Diagonal`).  `AbstractArray{T,N}` can also be useful, e.g., for sorting, but it is too complex for this tutorial and is thus ignored.
 
-- It is a pity that Julia does not force type declarations for objects, but silently obliges to create them.  This makes it easy to accidentally declare arrays with too broad a type, especially Any array types---which hoses both julia's type-checking and efficiency.
+- It is a pity that Julia does not force type declarations for objects, but silently creates them.  This makes it easy to accidentally declare arrays with too broad a type, especially `Any` array types, which hoses both Julia's type-checking and efficiency.
 
-FIXME (Andreas:) This is indeed a common pitfall in Julia but sometimes you do want a heterogeneous array
+FIXME (Andreas:) This is indeed a common pitfall in Julia but sometimes you do want a heterogeneous array (ivo: yes, but this is the language's fault.  For these obscure cases, the user should have to declare `Any` explicitly!)
  
-- NamedTuples used to allow `x= @NT( a=1, b=2 )`, but are not working in Julia 1.0 in Sep 2018. 
- 
-FIXME (Andreas:) It's just that named tuples are now in Base and have a different syntax. It's `(a = 1, b = 2)`. 
  
  
 ## References
- 
-- [Julia DataArrays](https://​github.com/​JuliaStats/​DataArrays.jl) which allow for Missing data 
- 
-FIXME (Andreas) DataArrays.jl is deprecated 
  
 - [Julia Array documentation](http://docs.julialang.org/en/release-0.5/stdlib/arrays/)
arraysintro.1544522718.txt.gz · Last modified: 2018/12/11 02:05 by julia