Commit 1af3c2b

Author: José Valim

Add sigils chapter
1 parent 8634ef0 commit 1af3c2b

1 file changed: 174 additions, 0 deletions

getting_started/19.markdown

Lines changed: 174 additions & 0 deletions
@@ -5,3 +5,177 @@ guide: 19
---

# {{page.title }}

We have already learned that Elixir provides double-quoted strings and single-quoted char lists. However, this only scratches the surface of the structures that have a textual representation in the language. Atoms, for example, are another such structure, and they are mostly created via the `:atom` representation.

One of Elixir's goals is extensibility: developers should be able to extend the language to particular domains. Computer science has become such a wide field that it is impossible for a language to tackle every field as part of its core. Our best bet is rather to make the language extensible, so developers, companies and communities can extend the language to their relevant domains.

In this chapter, we are going to explore sigils, which are one of the mechanisms provided by the language for working with textual representations.

## 19.1 Regular expressions

Sigils start with the tilde (`~`) character, which is followed by a letter and then a separator. The most common sigil in Elixir is `~r` for [regular expressions](https://en.wikipedia.org/wiki/Regular_Expressions):

```iex
# A regular expression that returns true if the text has foo or bar
iex> regex = ~r/foo|bar/
~r/foo|bar/
iex> "foo" =~ regex
true
iex> "bat" =~ regex
false
```

Elixir provides Perl-compatible regular expressions (regexes), as implemented by the [PCRE](http://www.pcre.org/) library. Regexes also support modifiers. For example, the `i` modifier makes a regular expression case insensitive:

```iex
iex> "HELLO" =~ ~r/hello/
false
iex> "HELLO" =~ ~r/hello/i
true
```
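
Beyond the `=~` operator, the `Regex` module offers functions for extracting and replacing matches. The following is a minimal sketch of a few of them; the sample strings and variable names are chosen here for illustration only:

```elixir
regex = ~r/foo|bar/

# match?/2 returns a boolean, much like the =~ operator
matched = Regex.match?(regex, "food bar")

# run/2 returns the first match wrapped in a list
first = Regex.run(regex, "food bar")

# scan/2 returns every match, each one wrapped in its own list
all = Regex.scan(regex, "food bar")

# replace/3 replaces every match by default
replaced = Regex.replace(regex, "food bar", "x")

IO.inspect({matched, first, all, replaced})
```

Here `first` should be `["foo"]`, `all` should be `[["foo"], ["bar"]]` and `replaced` should be `"xd x"`.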

Check out the [`Regex` module](/docs/stable/Regex.html) for more information on other modifiers and the supported operations with regular expressions.

So far, all examples have used `/` to delimit a regular expression. However, sigils support 8 different separators:

```
~r/hello/
~r|hello|
~r"hello"
~r'hello'
~r(hello)
~r[hello]
~r{hello}
~r<hello>
```

The reason for supporting different separators is that some separators are more convenient than others for a given sigil. For example, using parentheses for regular expressions may be a confusing choice, as they can get mixed up with the parentheses inside the regex. However, parentheses can be handy for other sigils, as we will see in the next section.

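
To illustrate why the choice of separator matters, here is a small sketch that matches the start of a URL. The URL and the variable names are assumptions made for this example. With `/` as the separator, every slash in the pattern has to be escaped, while braces need no escaping:

```elixir
url = "http://elixir-lang.org"

# With `/` as the separator, each slash in the pattern must be escaped
escaped = ~r/^https?:\/\//

# With braces as the separator, the same pattern reads naturally
braces = ~r{^https?://}

both_match = (url =~ escaped) and (url =~ braces)
IO.inspect(both_match)
```

Both regexes match the same strings; only the readability of the source differs.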

## 19.2 Strings, char lists and words sigils

Besides regular expressions, Elixir ships with three other sigils.

The `~s` sigil is used to generate strings, similar to double quotes:

```iex
iex> ~s(this is a string with "quotes")
"this is a string with \"quotes\""
```

The `~c` sigil is used to generate char lists:

```iex
iex> ~c(this is a string with "quotes")
'this is a string with "quotes"'
```
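
As a quick check of what these two sigils produce, the sketch below compares their results with the equivalent literals. The variable names are illustrative only:

```elixir
# ~s builds the same value as a double-quoted string
string = ~s(a string with "quotes")
same_string = string == "a string with \"quotes\""

# ~c builds a char list, that is, a list of code points
chars = ~c(abc)
same_chars = chars == [?a, ?b, ?c]

IO.inspect({is_binary(string), is_list(chars), same_string, same_chars})
```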

The `~w` sigil is used to generate a list of words separated by white space:

```iex
iex> ~w(foo bar bat)
["foo", "bar", "bat"]
```

The `~w` sigil also accepts the `c`, `s` and `a` modifiers (for char lists, strings and atoms, respectively) to choose the format of the result:

```iex
iex> ~w(foo bar bat)a
[:foo, :bar, :bat]
```
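
For comparison, the sketch below applies each of the three modifiers to the same word list. The variable names are chosen for this example:

```elixir
# `s` yields strings, which is also the default without a modifier
strings = ~w(foo bar bat)s

# `c` yields char lists
chars = ~w(foo bar bat)c

# `a` yields atoms
atoms = ~w(foo bar bat)a

IO.inspect({strings, chars, atoms})
```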

Besides lowercase sigils, Elixir supports uppercase sigils. While both `~s` and `~S` will return strings, the first one allows escape codes and interpolation while the second does not:

```elixir
iex> ~s(String with escape codes \x26 interpolation)
"String with escape codes & interpolation"
iex> ~S(String without escape codes and without #{interpolation})
"String without escape codes and without \#{interpolation}"
```
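
The difference is easiest to see when the same text goes through both sigils. In this sketch, `name` is a variable introduced purely for illustration:

```elixir
name = "world"

# Lowercase: #{name} is interpolated and \n becomes a newline character
interpolated = ~s(hello #{name}\n)

# Uppercase: the text is kept exactly as written
raw = ~S(hello #{name}\n)

IO.inspect({interpolated, raw})
```

The first value ends with a real newline, while the second still contains the literal characters `#{name}` followed by a backslash and an `n`.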

The following escape codes apply to strings and char lists:

* `\"` - double quote
* `\'` - single quote
* `\\` - single backslash
* `\a` - bell/alert
* `\b` - backspace
* `\d` - delete
* `\e` - escape
* `\f` - form feed
* `\n` - newline
* `\r` - carriage return
* `\s` - space
* `\t` - tab
* `\v` - vertical tab
* `\DDD`, `\DD`, `\D` - character with octal representation DDD, DD or D (example: `\377`)
* `\xDD` - character with hexadecimal representation DD (example: `\x13`)
* `\x{D...}` - character with hexadecimal representation with one or more hexadecimal characters (example: `\x{abc13}`)

Sigils also support heredocs, that is, triple double- or single-quotes used as separators:

```iex
iex> ~s"""
...> this is
...> a heredoc string
...> """
```
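
A heredoc sigil evaluates to an ordinary string in which each line, including the last one, ends with a newline. A small sketch, with an illustrative variable name:

```elixir
text = ~s"""
this is
a heredoc string
"""

# The heredoc is equivalent to a regular string with explicit newlines
same = text == "this is\na heredoc string\n"
IO.inspect(same)
```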

The most common use of heredoc sigils is writing documentation. For example, writing escape characters in documentation is error prone, because some characters need to be escaped twice:

```elixir
@doc """
Converts double-quotes to single-quotes.

## Examples

    iex> convert("\\\"foo\\\"")
    "'foo'"

"""
def convert(...)
```

By using `~S`, we can avoid this problem altogether:

```elixir
@doc ~S"""
Converts double-quotes to single-quotes.

## Examples

    iex> convert("\"foo\"")
    "'foo'"

"""
def convert(...)
```
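
For a version that can actually be compiled, here is a minimal sketch. The module name `Quotes` and the body of `convert/1` are assumptions made for this example, since the chapter elides both:

```elixir
defmodule Quotes do
  @doc ~S"""
  Converts double-quotes to single-quotes.

  ## Examples

      iex> Quotes.convert("\"foo\"")
      "'foo'"

  """
  # Hypothetical implementation: the chapter does not show the function body
  def convert(string), do: String.replace(string, "\"", "'")
end

converted = Quotes.convert("\"foo\"")
IO.inspect(converted)
```

Because the documentation uses `~S`, the example inside it reads exactly as it would be typed in IEx.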

## 19.3 Custom sigils

As hinted at the beginning of this chapter, sigils in Elixir are extensible. In fact, the sigil `~r/foo/i` is equivalent to calling the `sigil_r` function with two arguments:

```iex
iex> sigil_r(<<"foo">>, 'i')
~r"foo"i
```

This means we can access the documentation for the `~r` sigil via the `sigil_r` function:

```iex
iex> h sigil_r
...
```

We can also provide our own sigils by simply implementing the proper function. For example, let's implement the `~i(13)` sigil that returns an integer:

```iex
iex> defmodule MySigils do
...>   def sigil_i(binary, []), do: String.to_integer(binary)
...> end
iex> import MySigils
iex> ~i(13)
13
```
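
Custom sigils can also react to modifiers, which arrive as a char list in the second argument. The sketch below extends the example with an `n` modifier that negates the result. That modifier and the `MySigilsDemo` module are hypothetical additions for illustration, and `String.to_integer/1` is assumed to be available for the conversion:

```elixir
defmodule MySigils do
  # No modifiers: ~i(13) returns the integer as written
  def sigil_i(string, []), do: String.to_integer(string)

  # Hypothetical `n` modifier: ~i(13)n returns the negated integer
  def sigil_i(string, [?n]), do: -String.to_integer(string)
end

defmodule MySigilsDemo do
  import MySigils

  # Returns both forms so the results can be inspected together
  def run, do: {~i(13), ~i(13)n}
end

IO.inspect(MySigilsDemo.run())
```

Matching on the modifier list in the function head keeps each variant in its own clause, and any unsupported modifier fails with a `FunctionClauseError`.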

Sigils can also be used to do compile-time work when required. For example, regular expressions in Elixir are compiled at compilation time, which avoids repeating that work at runtime. However, in order to understand how this works, we need to talk about macros. And that's the direction we will take in the next chapters.
