julia> pkgchk.( [ "julia" => v"1.0.3" ] );


Strings are perhaps the most important building blocks of general-purpose computer languages. In modern languages, regular expressions have become central to dealing with strings. Regular expressions are covered in Regular Expressions.

  • A Julia string is an immutable sequence of characters (like a tuple of many indexed characters starting at 1). Because strings cannot be modified, programmers create new strings from existing strings. (When building strings sequentially, to avoid the copying cost, use a character vector and then join, or use an `IOBuffer()`.)
  • Julia strings are encoded in UTF-8, so you may need to refer back to the Julia UTF string docs.
Quoting   Example    Explanation
"         "mytext"   A standard double-quoted string, with interpolation
r"        r"[abc]"   A preceding letter means a special type of string
'         'c'        A single quote designates a (Unicode) character
``        `ls`       A backquote (backtick), used for operating system commands
:         :sym       A symbol is a string-like name limited to Julia identifier characters (and without interpolation)
  • When defining a function that takes a string argument, use AbstractString instead of String. This will be explained in functions. Basically, it allows the function to work on string-like objects (like substrings), too.
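
A minimal sketch of the AbstractString advice above. The helper name firstword is hypothetical and chosen only for illustration; the point is that one method signature accepts both a String and a SubString:

```julia
# Hypothetical helper, named for illustration only.
# Declaring the argument as AbstractString accepts String and SubString alike.
firstword(s::AbstractString) = first(split(s))

full = "hello world"
part = SubString(full, 7, 11)      # a view into `full`, not a copy

println(firstword(full))           # called with a String
println(firstword(part))           # called with a SubString
println(typeof(part))
```

Had the signature been firstword(s::String), the second call would fail with a MethodError.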

Creating Strings

Most strings are created with double quotes. Backslashes are used for quoting. Strings can contain newlines:

julia> "ab\"cd
       ef"				## string with escaped quote and embedded newline
"ab\"cd\nef"

julia> "\u00a5 \u20ac ¥"		## string with UTF-8, quoted and direct
"¥ € ¥"

Raw Quoting of Strings

There are a number of variants of strings, which can have different meanings. For example, r"[abc]" is a regular expression. A "raw" string eliminates interpolation and most special-character interpretation, which can make it easier to create a String with many special-meaning characters. The result of entering a raw string is an ordinary String, though:

julia> raw"a$x\ab\n\u00A5"
"a\$x\\ab\\n\\u00A5"

julia> typeof( ans )
String

Triple-Quoting of Strings

Triple-quoted strings have special meaning and (primarily) make it easier to include doublequotes:

julia> """		## start of string

       ab"cd

       """		## end of string ##
"\t\t## start of string\n\nab\"cd\n\n"

Repeat Patterns

julia> repeat("-+", 10)		## two times ten is twenty characters
"-+-+-+-+-+-+-+-+-+-+"

Random Strings

julia> using Random

julia> Random.seed!(0);

julia> randstring(12)		## ASCII only, not UTF-8

Converting ASCII and UTF-8


julia>  Int('A'), convert(Int, 'A') 					## Unicode code point
(65, 65)

julia> convert(Char, 65)
'A': ASCII/Unicode U+0041 (category Lu: Letter, uppercase)

julia> '\u41'								## unicode entry
'A': ASCII/Unicode U+0041 (category Lu: Letter, uppercase)

julia> string( UInt32('A'); base=16 )
"41"

julia> string( UInt32('A'); base=2 )
"1000001"

UTF-8 Strings to Pure Ascii

julia> ascii("ab")					## only tests that entire string is ASCII, throws exception otherwise
"ab"

julia> ascii("abπ")					## Julia does not know how to convert UTF-8 into ASCII
ERROR: ArgumentError: invalid ASCII at index 3 in "abπ"

julia>  map( c->(isascii(c) ? c : '?'), "abπ")		## replace all non-ASCII characters with '?'
"ab?"

To convert UTF-8 into ASCII with proper escape sequences:

julia> hex(a::Char; kwargs...)= string( UInt32(a); base=16, kwargs... )
hex (generic function with 1 method)

julia> function escape_unicode(s::AbstractString)
	    buf= IOBuffer()
	    for c in s
	        if isascii(c)
	            print(buf, c)				## ASCII passes through unchanged
	        else
	            i= UInt32(c)
	            if i < 0x10000
	                print( buf, "\\u", hex(c; pad= 4) )	## 4-digit escape
	            else
	                print( buf, "\\U", hex(c; pad= 8) )	## 8-digit escape
	            end#if
	        end#if
	    end#for
	    return String( take!(buf) )
	end#function
escape_unicode (generic function with 1 method)

julia> const yen= unescape_string("\\u00a5")
"¥"

julia> const oneyen= unescape_string("1 \\u00a5")
"1 ¥"

julia> escape_unicode(oneyen)
"1 \\u00a5"

Finding the String Length

In UTF-8, characters can be more than one byte long.

julia> length("AéB𐅍CD")				## character length, not byte length!
6

julia> sizeof('❤'), sizeof("❤"), length("❤")		## in UTF-8, a heart requires 3 bytes
(4, 3, 1)

julia> lastindex("AéB𐅍CD")				## byte index of the last character, not character length
10
  • All characters are 4 bytes, but their string encoding can be smaller. This is why sizeof('❤') > sizeof("❤").
  • length(s) is always less than or equal to lastindex(s).
  • ind2chr(s,i) and chr2ind(s,i) converted between byte and character indexes in older versions. In Julia 1.0, use length(s,1,i) (byte index to character index) and nextind(s,0,i) (character index to byte index).
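
The difference between character counts and byte indexes can be sketched as follows. The variable names are only illustrative; the functions are from Base as of Julia 1.0:

```julia
s = "AéB𐅍CD"

nchars  = length(s)        # number of characters
nbytes  = ncodeunits(s)    # number of bytes in the UTF-8 encoding
lastidx = lastindex(s)     # byte index at which the last character starts

# Character position -> byte index, and back again.
byteidx = nextind(s, 0, 4)         # byte index of the 4th character
charidx = length(s, 1, byteidx)    # character position of that byte index

println((nchars, nbytes, lastidx, byteidx, charidx))
```

Here lastidx equals nbytes only because the last character ('D') occupies a single byte.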

Testing for String Content with Digits, Alphas, UTF-8, Etc.

For Characters

julia> function whatis(c)
	    for isfunction in (isletter, isascii, iscntrl, isdigit, islowercase,
            			isnumeric, isprint, ispunct, isspace, isuppercase, isxdigit)
		println("$(isfunction)($c)= \t$(isfunction(c))")
	    end#for
	end#function
whatis (generic function with 1 method)

julia> whatis('\u20AC')			## for characters
isletter(€)= 	false
isascii(€)= 	false
iscntrl(€)= 	false
isdigit(€)= 	false
islowercase(€)= 	false
isnumeric(€)= 	false
isprint(€)= 	true
ispunct(€)= 	false
isspace(€)= 	false
isuppercase(€)= 	false
isxdigit(€)= 	false

For Strings

julia> all(isletter, "abc23")
false

julia> all(isascii, "AéB𐅍CD")
false

julia> any(isascii, "AéB𐅍CD")
true

These tests can also be accomplished with regular expressions.

String Concatenation

julia> "1" * "2" * "3" * "h"		## works only with strings, not with mixed numbers and strings.
"123h"

Strings can also be concatenated with the string() function. Unlike the * operator, it converts non-string objects into strings, provided they have a show() method:

julia> string("One+", "Two+", 3, '+', :four)			## the last is a "Symbol" type
"One+Two+3+four"

julia> const many= ( "one|" , "two|" , 3 , 3.0 , '|' , :four)	## a tuple of elements  (PS: internally uses show() methods)
("one|", "two|", 3, 3.0, '|', :four)

julia> typeof(many)						## most suitable type of each element; const is always ignored
Tuple{String,String,Int64,Float64,Char,Symbol}

julia> string(many)						## the tuple is one string() argument
"(\"one|\", \"two|\", 3, 3.0, '|', :four)"

julia> string(many...)						## the tuple is turned into many string() arguments
"one|two|33.0|four"
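
A short sketch of the two ways to append a number to a string. The variable names are illustrative only:

```julia
n = 3

a = "count: " * string(n)     # * needs strings on both sides, so convert first
b = string("count: ", n)      # string() does the conversion itself

println(a)
println(a == b)
```

Writing "count: " * n without the conversion raises a MethodError.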

String Interpolation (Expanding Variables in Strings)

The coolest feature of Julia strings is that the $ notation can be used to substitute a user-defined variable or expression (more precisely, its string equivalent) into any position in a string:

julia> const w= "world"; const x= 1; "hello $w $x"
"hello world 1"

julia> "two= $(x + 1)"
"two= 2"
  • Dollar signs in strings must be quoted (\$) in order not to be confused with interpolation.
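
The three cases (variable, expression, literal dollar sign) can be combined in one string. This is a small sketch with made-up variable names:

```julia
price = 4.5
qty   = 3

# $name interpolates a variable, $(...) an arbitrary expression,
# and \$ produces a literal dollar sign.
msg = "total for $qty items: \$$(price * qty)"

println(msg)
```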

Converting Strings to Numbers or Anything (parse, tryparse, and Meta.parse)

julia> parse( Float32, "1.1" )                          ## basic parse with known type
1.1f0

julia> typeof( Meta.parse("12") )			## Meta.parse tries to convert strings into a most suitable type
Int64

julia> typeof( Meta.parse("12.0") )
Float64

julia> typeof( Meta.parse("haha") )
Symbol

julia> typeof( Meta.parse("5+6") )
Expr

julia> Float64( Meta.parse("12") )			## request specific type, can throw exception
12.0

julia> Meta.parse("a12"; raise=false)			## a valid identifier parses to a Symbol
:a12
  • Int("9") is illegal. Int('9') gives the ASCII code (57), not the integer value (9). Use parse(Int, "9") instead.
  • The optional 'raise' specifies whether a syntax error raises an exception (the default) or returns an expression that raises the error upon evaluation.
  • parse() is not only less subject to surprises than Meta.parse(), but also far more efficient:
julia> using BenchmarkTools

julia> @btime parse(Float64, "1.1")
  23.581 ns (0 allocations: 0 bytes)
julia> @btime Meta.parse("1.1")		## roughly a factor of 1,000 slower!
  18.917 μs (10 allocations: 256 bytes)
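
When the input may not be a valid number, tryparse() is a third option: it returns nothing instead of throwing. A minimal sketch, with illustrative variable names:

```julia
# tryparse returns `nothing` instead of throwing when the text is not a number.
good = tryparse(Int, "42")
bad  = tryparse(Int, "4x2")

# `something` substitutes a fallback value for `nothing`.
value = something(bad, 0)

println((good, bad, value))
```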

Converting Numbers to Strings


julia> string(8.89)			## lowercase string()
"8.89"

C-Style Macros: @printf and @sprintf

C-style printf and sprintf are available as macros, so they require the '@' prefix:

julia> using Printf

julia> @printf("%12.5f", pi)
     3.14159
julia> @sprintf("%.3f",pi)		## the format string must be a literal, not a computed value
"3.142"

C-Style Julia: printf and sprintf

julia> using Formatting			## for a compiled version

julia> sprintf1("%'d", 1000000)		## note the quote, and commas in the output

julia> sprintf1("%'f", 1000000.0)
  • These functions cannot deal with vectors. To convert a vector of numbers into a vector of strings, use a comprehension or broadcasting (see Vectorized (S)printing).

Vectorized (S)printing

Just use the comprehension expression itself:

julia> x= 1:3;

julia> using Printf;   [ @sprintf("%.3f", xi ) for xi in x ]
3-element Array{String,1}:

julia> using Formatting;  sprintf1.("%.3f", x)
3-element Array{String,1}:
  • You could write this into a function that operates on a vector, but this is not the Julia way.

Converting Function Name to Strings

julia> f= [ sqrt, exp, sin ] ; [ "$fi(10) = $(fi(10))" for fi in f ]
3-element Array{String,1}:
 "sqrt(10) = 3.1622776601683795"
 "exp(10) = 22026.465794806718"
 "sin(10) = -0.5440211108893698"

Converting Strings to Function Names

julia> s= Symbol("sqrt")		## first convert to symbol
:sqrt

julia> eval(s)(9)		## dangerous: an eval on a user input string could wreak havoc!
3.0
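
One way to avoid eval on untrusted input is a lookup table of permitted functions. This is only a sketch: the names allowed and callbyname are hypothetical, not part of Julia or of this page's original code.

```julia
# Hypothetical whitelist: only these names can be called.
allowed = Dict("sqrt" => sqrt, "exp" => exp, "sin" => sin)

# Hypothetical helper: look the name up instead of evaluating it.
function callbyname(table, name::AbstractString, x)
    haskey(table, name) || throw(ArgumentError("function not allowed: $name"))
    return table[name](x)
end

println(callbyname(allowed, "sqrt", 9))
```

A name that is not in the table raises an ArgumentError rather than running arbitrary code.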

julia> f= [ :sqrt, :exp, :sin ] ; [ "$fi(10) = $(eval(fi)(10))" for fi in f ]
3-element Array{String,1}:
 "sqrt(10) = 3.1622776601683795"
 "exp(10) = 22026.465794806718"
 "sin(10) = -0.5440211108893698"

Preview: Arrays or Tuples With Strings

Arrays and Tuples can hold strings. For example,

julia> [ "a" "b"; "c" "d" ]		## a two-dimensional array of strings
2×2 Array{String,2}:
 "a"  "b"
 "c"  "d"
  • WARNING: [1,2]' transposes numerical arrays just fine, but this does not work for arrays of strings ["1","2"]. Use permutedims(["1","2"]) instead.
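
A minimal sketch of turning a column of strings into a row. permutedims only rearranges the elements and does not try to transpose each element, which is why it works for strings:

```julia
v = ["1", "2"]

# permutedims rearranges dimensions without touching the elements.
row = permutedims(v)      # a 1×2 array of strings

println(size(row))
println(row[1, 2])
```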

Stringifying Numeric Arrays

To convert an array of numbers into an array of strings, use the element-wise version of string() or use map():

julia> string.( [ 1.0, 2.0, 3.0 ] )	## or map( string, [1.0, 2.0, 3.0] )
3-element Array{String,1}:
 "1.0"
 "2.0"
 "3.0"
  • You could also use the sprintf1 or @sprintf facilities.
  • Without the dot, string() turns the whole array into one string. For example, string([1 2; 3 4]) gives "[1 2; 3 4]".

Converting Character Ranges to String Ranges

julia> string.( '1':'4'; )		## convert character array to string array; note semicolon
4-element Array{String,1}:

julia> map( x-> x[1], ans )		## convert back to char array
4-element Array{Char,1}:

Converting A String to A Numeric Array

julia> using DelimitedFiles

julia> readdlm( IOBuffer("1 2 3\n4 5 6"), Int )
2×3 Array{Int64,2}:
 1  2  3
 4  5  6
  • readdlm() is usually a file operation. However, by wrapping a string into an IOBuffer, file operations work.
  • readdlm() has many options. Try ?readdlm.

Replacing Characters

julia> replace("ab cd ef gh", " " => "|"; count=2)			## only the first two replacements
"ab|cd|ef gh"

Escaping and Unescaping C Strings

julia> print("a\tb\n");
a	b

julia> escape_string("a\tb\n")
"a\\tb\\n"

julia> unescape_string("a\\tb\\n")
"a\tb\n"

Trimming Spaces from Starts and/or Ends of a String

The rstrip() (lstrip()) function removes trailing (leading) whitespace from a string.

julia> const sa1= ("ab\n", "ab\r", "ab\r\n", "ab\n\r", "ab\r\n\r\n");		## 5 test strings

julia> chomp.(sa1)							## removes just trailing \n or \r\n
("ab", "ab\r", "ab", "ab\n\r", "ab\r\n")

julia> rstrip.(sa1)							## trailing [\n\r]
("ab", "ab", "ab", "ab", "ab")

julia> const s2= "  ab\ncd  ";						## another test string

julia> strip(s2), rstrip(s2), lstrip(s2)				## leaves interior [\n\r] alone
("ab\ncd", "  ab\ncd", "ab\ncd  ")

For more lines,

julia> const s3=  "  ab  \n  cd  \r  ef  \r\n   gh   \n\r  ij  ";		## messy multi-line test string

julia> join(  strip.(  split( s3, r"[\n\r]+" )  ), "\n"  )		## split lines, wipe \n\r, and rejoin w/ \n
"ab\ncd\nef\ngh\nij"

Left- and Right-Justifying (Padding With Spaces)

julia> rpad("ab", 10), lpad("abcdefgh", 10)				## result is ten characters long
("ab        ", "  abcdefgh")
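
Both functions take an optional third argument, the padding character. A short sketch with illustrative variable names:

```julia
# The optional third argument is the padding character (default is a space).
invoice = lpad(42, 6, '0')        # non-strings are converted with string() first
label   = rpad("Total", 10, '.')

println(invoice)
println(label)
```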

Changing Capitalization

julia> uppercase("aBcD"), " | ", lowercase("aBcD"), " | ", titlecase("aBcD efG"), " | ", uppercasefirst("aBcD efG")
("ABCD", " | ", "abcd", " | ", "Abcd Efg", " | ", "ABcD efG")

julia> titlecase( lowercase("aBcD efG") )
"Abcd Efg"

Checking Capitalization

julia> all(isuppercase,"aBcD"), all(isuppercase,"ABCD"), all(isuppercase,"abcd"), all(isuppercase,"Abcd")
(false, true, false, false)
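
Note that all(isuppercase, s) is false as soon as the string contains a digit, space, or punctuation mark. The sketch below ignores non-letters; the helper name isallupper is hypothetical:

```julia
# Hypothetical helper: true if the string has at least one letter and
# every letter is uppercase; digits, spaces and punctuation are ignored.
function isallupper(s::AbstractString)
    letters = filter(isletter, s)
    return !isempty(letters) && all(isuppercase, letters)
end

println(all(isuppercase, "ABC 123"))   # the space and digits are not uppercase
println(isallupper("ABC 123"))
println(isallupper("ABc 123"))
```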


The getindex notation [] can be used on String to generate substrings (just as in arrays or tuples):

julia> const str= "abcdefg"

julia> str[1:3]
"abc"

julia> str[5:end]
"efg"

Note that str[1] is a character, while str[[1]] or str[1:1] is a string. To convert a character to a string, use string('c').

WARNING: This does not work when an index falls inside a multi-byte (non-ASCII) character:

julia> x="æble"

julia> x[1]
'æ': Unicode U+00e6 (category Ll: Letter, lowercase)

julia> x[2]
ERROR: StringIndexError("æble", 2)
 [1] string_index_err(::String, ::Int64) at ./strings/string.jl:12
 [2] getindex_continued(::String, ::Int64, ::UInt32) at ./strings/string.jl:216
 [3] getindex(::String, ::Int64) at ./strings/string.jl:209
 [4] top-level scope at none:0
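
Some ways to index such a string safely, sketched with Base functions only (the variable names are illustrative):

```julia
x = "æble"

# eachindex yields only the valid byte indexes.
valid = collect(eachindex(x))

# nextind steps from one valid index to the next one.
second = x[nextind(x, 1)]

# collect gives a Vector{Char}, which can be indexed by character position.
chars = collect(x)

println((valid, second, chars[2]))
```

The collect() version copies the string, so nextind() is preferable for long strings.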

Processing a String One Character at a Time

Strings in Julia are similar to arrays of characters. Thus, it is possible to iterate over the characters in a String:

julia> const s= "abcd"

julia> for ch in s;  println(ch+1); end#for
b
c
d
e

julia> for chi in eachindex(s); println( "$chi is $(s[chi]+1)" ); end#for
1 is b
2 is c
3 is d
4 is e

The map() function can be more convenient when operating over every character in the String:

julia> map(x -> x + 1, "abcd")			## operate and return as string
"bcde"

See also Create a String Character by Character below.

Splitting Strings by Character

julia> const ssa= split("1,2,3,4,5", ',')		## comma is splitting char. default is whitespace.
5-element Array{SubString{String},1}:
 "1"
 "2"
 "3"
 "4"
 "5"

julia> String.(ssa) 				## (rarely) needed, convert from Substrings to pure String
5-element Array{String,1}:
 "1"
 "2"
 "3"
 "4"
 "5"

julia> split("abc","")				## collect("abc") gives the same pieces, but as Chars
3-element Array{SubString{String},1}:
 "a"
 "b"
 "c"

Joining Strings with Character

Vector objects can be joined-and-stringified with inserted characters between them:

julia> join( ("1", "2", "3", "4", "5"), '|' )		## join strings
"1|2|3|4|5"

julia> join( (1, 2, 3, 4, 5, 6), ", " )			## convert ints to strings and then join
"1, 2, 3, 4, 5, 6"

julia> join( (1, 2, 3, 4, 5, 6), ", ", ", and " )	## Julia's clever (optional) last argument
"1, 2, 3, 4, 5, and 6"



julia> reverse("abcdefg")
"gfedcba"


If you want to reverse the individual words in the String, first split into words, reverse the string array, and recombine them:

julia> join(  reverse( split("abc def ghi jkl", ' ') ), ' '  )
"jkl ghi def abc"

Non-Regex Finding, Searching, Parsing, Replacing

Most searching, replacing, etc., functions work not only with regular expressions, but also with plain strings.

Testing if one String is a Substring of the Other

julia> occursin("ab", "abcd")
true

julia> startswith("abcd", "ab")
true

julia> endswith("abcd", "ab")
false

Filtering (Grepping) Only Matching Strings in Vector of Strings

The Julia way is to use filter() or a broadcast mask:

julia> heystack= [ "ab1", "ab2", "cd1", "ab3", "ef5" ];

julia> filter( x -> occursin("ab", x), heystack )	## method 1
3-element Array{String,1}:

julia> w= occursin.( "ab", heystack )		## method 2
5-element BitArray{1}:

julia>  heystack[ w ]
3-element Array{String,1}:

The non-Julia way is to define a vector function

julia> gnep(needle,heystack::Vector)= filter( x -> occursin(needle, x), heystack )
gnep (generic function with 1 method)

julia> gnep( "ab", [ "ab1", "ab2", "cd1", "ab3", "ef5" ] )
3-element Array{String,1}:

Finding the Location of Substring(s) in a String

julia> s= "ab cd as cd more cd end";

julia> search( heystack::AbstractString, needle::AbstractString )= something(findfirst(needle, heystack), 0:-1);

julia> search( heystack::AbstractString, needle::AbstractString, start::Int )= something(findnext(needle, heystack, start), 0:-1);

julia> rsearch( heystack::AbstractString, needle::AbstractString )= something(findlast(needle, heystack), 0:-1);

julia> rsearch( heystack::AbstractString, needle::AbstractString, start::Int )= something(findprev(needle, heystack, start), 0:-1);

julia> search(s, "cd")				## first match -> type is UnitRange{Int64}
4:5

julia> first(search( s, "cd" ))			## first match, but just start index
4

julia> search(s, "cd", 5)			## start search beginning char #5
10:11

julia> rsearch(s, "cd")				## last match
18:19

Finding the Location of Char in a String

julia> s= "ab cd as cd more cd end";

julia> findfirst(isequal('c'),s)
4

julia> findlast(isequal('c'),s)
18

julia> findfirst(isequal('0'),s)		## no match: returns 'nothing', so the REPL prints no output
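
Because a failed search returns nothing, the result should be checked before it is used as an index. A minimal sketch; the helper name charpos is hypothetical:

```julia
s = "ab cd as cd more cd end"

# Hypothetical helper: returns 0 when the character is absent.
charpos(c::Char, str::AbstractString) = something(findfirst(isequal(c), str), 0)

idx = findfirst(isequal('0'), s)

println(idx === nothing)     # test explicitly against `nothing`
println(charpos('c', s))
println(charpos('0', s))
```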

Replacing String Inside Other String

julia> replace("abc.txt", ".txt" => ".csv")
"abc.csv"

Counting Incidences

Characters in Strings

julia> in('a', "abcabcabc")			## also works as 'a' in "abcabcabc"
true

julia> count( c-> (c == 'a') , "abcabcabc")	## count takes a function and an object
3

Strings in Strings

julia> matchall(r::Regex,s::AbstractString; overlap::Bool=false)= collect((m.match for m = eachmatch(r, s, overlap=overlap)));

julia> length( matchall(r"ab", "abcabsdabkab") )
4

julia> ( matchall(r"abc", "abcabsdabkab") )
1-element Array{SubString{String},1}:
 "abc"
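
The overlap keyword matters when matches can share characters. A small sketch; the helper name countmatches is hypothetical and only wraps eachmatch():

```julia
# Hypothetical helper: count regex matches, optionally allowing overlaps.
countmatches(r::Regex, s::AbstractString; overlap::Bool=false) =
    length(collect(eachmatch(r, s; overlap=overlap)))

println(countmatches(r"aa", "aaaa"))                  # matches at 1 and 3
println(countmatches(r"aa", "aaaa"; overlap=true))    # matches at 1, 2 and 3
```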

Parsing Data Fields from a CSV-Delimited String

CSV is an old format, long associated with Microsoft Excel, that is now a ubiquitous standard. Parsing CSV can be challenging, because string fields can contain commas themselves, and strings can but need not be quoted.


julia> s= "\"abcd\" , \"abc,d\" , \"ab,c,d\" , \"a,b,c,d\"";

julia> split(s, ",")					## works only if quoted strings do not contain commas; here, yikes
10-element Array{SubString{String},1}:
 "\"abcd\" "
 " \"abc"
 "d\" "
 " \"ab"
 "c"
 "d\" "
 " \"a"
 "b"
 "c"
 "d\""


readdlm() understands CSV and can be used on individual lines or on whole files (see dataio). The older readcsv() was removed in Julia 1.0; pass ',' as the delimiter to readdlm() instead.

julia> using DelimitedFiles

julia> readdlm( IOBuffer("\"abcd\",\"abc,d\",\"ab,c,d\",\"a,b,c,d\""), ',' )
1×4 Array{Any,2}:
 "abcd"  "abc,d"  "ab,c,d"  "a,b,c,d"
  • For nontrivial cases, use the optimized CSV.jl.
  • TextParse offers a TextParse.csvread(filename) function.
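
To illustrate why a plain split() is not enough, here is a sketch of a quote-aware splitter. The function splitquoted is hypothetical and deliberately simplified: it assumes that fields never contain escaped quotes, which real CSV allows.

```julia
# Hypothetical helper, for illustration only: split one line on commas,
# treating commas inside double quotes as ordinary characters.
# Assumes there are no escaped quotes ("") inside fields.
function splitquoted(line::AbstractString)
    fields = String[]
    buf = IOBuffer()
    inquotes = false
    for c in line
        if c == '"'
            inquotes = !inquotes                 # toggle state, drop the quote itself
        elseif c == ',' && !inquotes
            push!(fields, String(take!(buf)))    # an unquoted comma ends the field
        else
            print(buf, c)
        end
    end
    push!(fields, String(take!(buf)))            # the last field has no trailing comma
    return String.(strip.(fields))
end

println(splitquoted("\"abcd\" , \"abc,d\" , \"ab,c,d\" , \"a,b,c,d\""))
```

For real data, readdlm() or CSV.jl remain the better choice.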

Reading Strings from and Writing Strings to Files

See also File IO.


julia> filename= "/tmp/myhellostring.txt"; mytext= "hello\n\n";

julia> write(filename, mytext)						## returns number of bytes written
7

julia> open(filename, "w") do ofile;  print(ofile, mytext);  end#do#		## another way to write to file filename


julia> filename= "/tmp/myhellostring.txt"

julia> read(filename, String)				## reading back from the file

julia> open(filename) do ifile
		for ln in enumerate(eachline(ifile)); println(ln); end#for#
	end#do#
(1, "hello")
(2, "")

Processing a String as an Input File

See Converting A String to A Numeric Array. Wrap the string into an IOBuffer first, and use the provided IO file-like operations. For example,

julia> readline( IOBuffer( "abc\nde\n" ) )
"abc"

Create a String Character by Character

Strings are read-only. Thus, it is often convenient and fast to write to an IOBuffer first, and then convert this IOBuffer into a string.

julia> using Random;  Random.seed!(0); 

julia> destbuf= IOBuffer();

julia> for srs in randstring(50); print(destbuf, srs); end;#for  ## just give me random stuff

julia> s= String(take!(destbuf))

julia> close(destbuf);
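
Two compact alternatives to managing the IOBuffer by hand, sketched with illustrative variable names: join() on a vector of characters, and sprint(), which creates the buffer, passes it to a function, and returns the resulting String.

```julia
# Alternative 1: collect characters in a vector, then join them.
chars = Char[]
for c in 'a':'e'
    push!(chars, uppercase(c))
end
s1 = join(chars)

# Alternative 2: sprint creates the buffer and returns its contents as a String.
s2 = sprint() do io
    for c in 'a':'e'
        print(io, uppercase(c))
    end
end

println((s1, s2))
```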

See also Processing a String One Character at a Time.

Writing String Arrays to Files

writedlm can work with arrays of any type, but you must make sure that your strings do not contain the delimiting character. See the File IO chapter for more. For example:

julia> sar= string.( "5", [".1" ".2"; ".3" ".4"], "6" )		## create an example string array by clever concatenation
2×2 Array{String,2}:
 "5.16"  "5.26"
 "5.36"  "5.46"

julia> writedlm("/tmp/fourstrings1.txt", sar, ",");		## simplest way to write a string array.

julia> open("/tmp/fourstrings2.txt", "w") do ofile		## an alternative
		writedlm(ofile, sar, ",")
	end;#do#


Commonly Useful Packages on Julia Repository


  • In previous versions of Julia, the string type was implemented through an abstract type, AbstractString and concrete types such as ASCIIString.
  • Julia allows pure binary strings, e.g., b"\xff" or b"\uff hello", and raw strings, e.g., raw"the us$".
  • Possibly add tabify and untabify
  • The colon prefix is used not only for a Symbol, but also for expressions, such as a=3; :($a+3).
  • FIXME Look into StringLiterals:
The StringLiterals package is an attempt to bring a cleaner string literal syntax to Julia, as well as having an easier way of producing formatted strings, borrowing from both Python and C formatted printing syntax. It also adds support for using LaTeX, Emoji, HTML, or Unicode entity names that are looked up at compile-time.
Currently, it adds a Swift style string macro, f"...", which uses the Swift syntax for interpolation, i.e. \(expression). 


strings.txt · Last modified: 2018/12/28 11:19 (external edit)