Learn Elixir with a Rubyist, episode IV


Elixir Types, Data Structures and Pattern Matching

It’s already the fourth episode of Learn Elixir with a Rubyist! If you have followed the episodes so far, you probably already have a good feel for Elixir and some of its concepts, congrats! In this episode we’ll take a few steps back and cover some basics we skipped in order to get a real taste of Elixir first, but it’ll be fun! If you’re liking it, let me know on Twitter :D. If you haven’t checked the past episodes yet, I’d recommend you do so; it will help you understand what we have seen so far.

Types

Elixir, for now, is what we call a dynamically typed language. This basically means that the types of your values (like integer, string and others that we will see below) are checked at run time and not at compile time. You can program a little bit faster by not specifying types all over, but it’s a trade-off: you won’t get type checking when compiling, which could have caught some silly bugs.
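To make the trade-off concrete, here is a minimal sketch. The `TypeDemo` module and its `double/1` function are made up for illustration. The same function happily accepts an integer or a float, and the mistake of passing a string only shows up when the code actually runs:

```elixir
defmodule TypeDemo do
  # No type annotations: `x` can be any value as far as the compiler cares
  def double(x), do: x * 2
end

IO.inspect(TypeDemo.double(2))
# => 4
IO.inspect(TypeDemo.double(1.5))
# => 3.0

# Multiplying a string is only rejected at run time, with an ArithmeticError
outcome =
  try do
    TypeDemo.double("two")
  rescue
    ArithmeticError -> :arithmetic_error
  end

IO.inspect(outcome)
# => :arithmetic_error
```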

Here are some of the basic types Elixir has:

1          # integer
0x1F       # integer / hexadecimal
1.0        # float
true       # boolean
:atom      # atom / symbol
"elixir"   # string
[1, 2, 3]  # list
{1, 2, 3}  # tuple

Most of these types are pretty straightforward, but I do think there are three we should look at closely: atoms, strings and tuples.

Atoms and Maps

Atoms, known in many other languages (Ruby included) as Symbols, are something we Rubyists are used to seeing everywhere. They are the way to go when creating a Hash or passing data when creating an instance. You have probably done this over and over:

# This is the most usual way to create a Hash.
# The keys are all Symbols and the values, in this case,
# are Strings or Booleans
user = {
  name: "Joao M. D. Moura", # String
  email: "me@awesome.com",  # String
  admin: true               # Boolean
}

# In Ruby we use it all the time, if you ever used Rails
# you're probably really familiar with using it to create
# an instance, or a record on the database

# Creating an instance
User.new(user)

# Creating a record in the database when using Rails conventions
User.create(user)

# Talking about Rails, it also uses it a lot behind the scenes,
# the main example is the usual `params` on controllers
# that's filtered by StrongParameters.
# This is a representation of a `params` Hash:
#
# params = {
#   user: {
#     name: "Joao M. D. Moura",
#     email: "me@awesome.com"
#   }
# }

# This is the Rails convention to filter the params; here you
# can see Symbols being used again.
def user_params
  params.require(:user).permit(:name, :email)
end

# We can easily access them with the following syntax
user[:name]
# => "Joao M. D. Moura"
user[:email]
# => "me@awesome.com"

In Elixir they work pretty much the same way, and they’re mainly used in a data structure known as a Map. As we already discussed a bit in our previous episode, Episode III - Maps, Functions + Pattern Matching = ❤️, maps are the “go to” data structure in Elixir. You can easily create one; let’s check some examples below:

# Pretty similar to Ruby's Hash syntax, but with a `%` placed
# before the curly brackets.
user = %{
  name: "Joao M. D. Moura", # String
  email: "me@awesome.com",  # String
  admin: true               # Boolean
}

# You can also access the values following the same syntax
# you would use on Ruby
user[:name]
# => "Joao M. D. Moura"

# Or even better, by using the `.` syntax when you have
# an Atom-Keyed Map (The keys are atoms instead of strings)
user.email
# => "me@awesome.com"

# In the same way, instead of `atoms` you could have the keys
# of a Map as `strings`. These are known as String-Keyed Maps
# and are preferable in some cases
string_map_user = %{
  "name" => "Joao M. D. Moura", # String
  "email" => "me@awesome.com",  # String
  "admin" => true               # Boolean
}

# This also changes the way you access the values, just as
# it would in Ruby: you have to use strings instead of
# atoms to reference a key
string_map_user["name"]
# => "Joao M. D. Moura"

There are some details you need to know. Atoms are constants whose name is their own value, and they are not garbage-collected (basically, atoms are never cleaned from memory). The atom table also has a fixed size limit (1,048,576 atoms by default), and the VM crashes when that limit is exceeded. Creating atoms dynamically at runtime can therefore be dangerous, leading to memory exhaustion over time, and because of that we need to take some precautions.
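You can watch the atom table grow with a quick experiment. This is only a sketch: the `made_up_atom_` prefix is invented for the demo, and `:erlang.system_info/1` is used here just to peek at the VM's counters.

```elixir
# A unique prefix, so the atoms are new even if you run this twice in a session
prefix = "made_up_atom_#{System.unique_integer([:positive])}_"

count_before = :erlang.system_info(:atom_count)

# Each distinct string becomes a brand new atom that is never freed
for i <- 1..100, do: String.to_atom(prefix <> Integer.to_string(i))

count_after = :erlang.system_info(:atom_count)

# The table grew by (at least) the 100 atoms we just created
IO.inspect(count_after - count_before >= 100)
# => true

# Prints the maximum number of atoms this VM will accept
IO.inspect(:erlang.system_info(:atom_limit))
```

Now imagine the strings in that loop coming from a query string instead of a range, and you can see where the problem starts.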

A lot of data comes from outside your application, and by that I mean anything that can be put in there by a user (forms, query strings and stuff). This data can’t be trusted. Since allocating atoms can lead to memory exhaustion, converting external data into atoms could open the door to a denial of service (DoS) attack. That’s why we now have a new rule:

Always Use String-Keyed Maps for External Data
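One way to follow this rule while still enjoying atom keys inside your own code is to translate only the keys you expect. The sketch below is just an illustration: the `ExternalData` module, its `to_user/1` function and the whitelist are all made up for this example.

```elixir
defmodule ExternalData do
  # Hypothetical whitelist: the only atoms involved are written here in the
  # source code, so user input can never create new ones
  @allowed_keys %{"name" => :name, "email" => :email}

  def to_user(params) when is_map(params) do
    for {key, value} <- params, Map.has_key?(@allowed_keys, key), into: %{} do
      {Map.fetch!(@allowed_keys, key), value}
    end
  end
end

# A string-keyed map, as it would arrive from a form or a query string
params = %{
  "name" => "Joao M. D. Moura",
  "email" => "me@awesome.com",
  "some_unexpected_key" => "ignored"
}

user = ExternalData.to_user(params)

IO.inspect(user.name)
# => "Joao M. D. Moura"
IO.inspect(user |> Map.keys() |> Enum.sort())
# => [:email, :name]
```

If you really need to turn a string into an atom, `String.to_existing_atom/1` is the safer cousin of `String.to_atom/1`: it raises an `ArgumentError` instead of creating an atom that doesn't exist yet.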

Strings

Strings can be another tricky thing, because in Elixir strings are immutable. We haven’t talked a lot about immutability yet, but it’s a main characteristic of functional languages and another key point in achieving full concurrency. If you haven’t yet, you can read more about the basics of concurrency in Episode II - Actor Model, Modules and functions.

So strings are immutable; what does this mean for us? To understand that, let’s check a quick comparison:

Ruby Code

# Let's start by creating a simple method that
# receives a string and converts it to upcase
def scream(word)
  word.upcase!
end

# Here we define a variable `real_word` with
# the value `'hey'`
real_word = 'hey'

# Now we can call the scream method passing
# real_word as the argument
scream(real_word)
# => "HEY"

# Okay, now my question is, what is the value
# of `real_word` now?
puts real_word
# => HEY

# This shows how strings are mutable objects in
# Ruby. There are methods that will change
# the value of a string even when called inside
# other methods.

Elixir Code

# Let's start by defining our string
real_word = "hey"

# Here we convert the string to upcase
word = String.upcase(real_word)
# => "HEY"

# But if we check the value of `real_word`
# it will remain the same, because strings are
# immutable
IO.inspect(real_word)
# => "hey"

This might seem like a small thing, but it can be really helpful once you realize you don’t need to worry about some crazy function changing the value of your variable. It’s one of those small things that, once it clicks, makes your application way easier to understand and debug.
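To mirror the Ruby `scream` method more closely, here is a small sketch with a function of our own (the `Shout` module is just a name picked for this example). Even though the function rebinds `word` internally, the caller's value is untouched:

```elixir
defmodule Shout do
  def scream(word) do
    # This rebinds the local name `word`; the caller's string is not modified
    word = String.upcase(word)
    word <> "!"
  end
end

real_word = "hey"
loud_word = Shout.scream(real_word)

IO.inspect(loud_word)
# => "HEY!"
IO.inspect(real_word)
# => "hey"
```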

Now the biggest question: if strings are immutable, why do we have atoms? The answer is pretty straightforward: atoms are more efficient. Each atom maps to an integer index in a table that’s kept in memory, so when comparing atoms you are actually comparing integers, which is really simple and efficient. By the way, fun fact: this atom table is shared between all processes.

Tuples

You will see tuples a lot in Elixir. It’s a pretty common type, used by convention for the return values of many functions, and it’s mostly seen in a format like {:ok, "hello"}. Tuples are also stored contiguously in memory, which makes it really fast to access an element by index or to get the tuple’s size.
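Here is a quick sketch of those index and size operations, using the built-in `elem/2`, `tuple_size/1` and `put_elem/3` functions (the variable names are just for the example):

```elixir
result = {:ok, "hello"}

# Indexes are zero-based
IO.inspect(elem(result, 0))
# => :ok
IO.inspect(elem(result, 1))
# => "hello"
IO.inspect(tuple_size(result))
# => 2

# Tuples are immutable too: put_elem/3 returns a new tuple
updated = put_elem(result, 1, "world")
IO.inspect(updated)
# => {:ok, "world"}
IO.inspect(result)
# => {:ok, "hello"}
```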

Let’s check an example below where we use a function of the Map module that returns a tuple.

# We start by declaring a new atom keyed Map, with
# keys `:a` and `:b`
map = %{a: 1, b: 2}

# We use the `fetch` function from the Map module to get the value
# for a specific key.
# We could have used the `.` syntax to get the value as well,
# but `fetch` lets you handle the case where the requested key
# doesn't exist, whereas the `.` would raise an error.
# Another alternative would be the regular brackets, which would
# return only the value itself (or `nil` for a missing key), but
# here we will see a tuple as the response.
Map.fetch(map, :a)
# => {:ok, 1}

# If it doesn't find the key the `fetch` function will return the
# :error atom
Map.fetch(map, :c)
# => :error
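To compare the alternatives mentioned in the comments above side by side, here is a small sketch of what each access style does with a missing key. The `AccessDemo` module is made up for this example; it only wraps the `.` access so we can rescue the error and show it as a value.

```elixir
defmodule AccessDemo do
  # Wraps the `.` syntax so the KeyError can be turned into a tuple
  def dot_access(map) do
    {:ok, map.c}
  rescue
    error in KeyError -> {:raised, error.key}
  end
end

map = %{a: 1, b: 2}

# Brackets return nil when the key is missing
IO.inspect(map[:c])
# => nil

# Map.get/3 lets you pick the default instead
IO.inspect(Map.get(map, :c, 0))
# => 0

# Map.fetch/2 tells "missing" apart from "present" explicitly
IO.inspect(Map.fetch(map, :c))
# => :error

# The `.` syntax raises a KeyError
IO.inspect(AccessDemo.dot_access(map))
# => {:raised, :c}
```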

Bonus: Tuples + Pattern Matching

You may be wondering why you’d use tuples at all and what benefit you could get from them, so let’s check one of the good ways to take advantage of them: using them with Pattern Matching.

We have already seen how pattern matching can be used with functions, but it also integrates with other features of Elixir, like conditionals. Let’s check below how we could use tuples and pattern matching to smoothly handle some cases.

case is one of the control flow structures available in Elixir. We also have cond and if, but we will talk about those in a different episode; for now let’s stick with case and how we can use it together with pattern matching.

# Let's create two maps, one representing a user and another
# one representing an admin
user = %{name: "Joe Doe"}
admin = %{name: "Jane Doe", admin: true}

defmodule MapTest do
  def run(map) do
    # We'll use the same `fetch` function we used in the previous
    # example. It will return a tuple `{:ok, value}` if it
    # finds the key, or the atom :error if it doesn't
    case Map.fetch(map, :admin) do

      # Here we handle the first option using pattern matching. It will
      # try to match the result of `Map.fetch(map, :admin)` with
      # `{:ok, var}`, so if it finds the key it will match and
      # bind `var` to the value for that key
      {:ok, var} ->
        IO.inspect "Admin: #{var}"

      # After failing to match the first clause it'll then try to match
      # the next one, in this case the atom `:error`, so it will match
      # when the key is not found in the map
      :error ->
        IO.inspect "No :admin key found"
    end
  end
end

# It won't find the `admin` key on user map, so it will match
# the second declaration on the `case`
MapTest.run(user)
# => "No :admin key found"

# It will find the `admin` key on admin map, so the return will be
# `{:ok, true}` and it will match `{:ok, var}` assigning `true` as
# `var` value
MapTest.run(admin)
# => "Admin: true"
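Patterns can also match on specific values inside the tuple, not just on its shape. As a sketch of that idea (the `Role` module and its `describe/1` function are made up for this example), we can tell a real admin apart from someone who merely has the key set to something else:

```elixir
defmodule Role do
  def describe(map) do
    case Map.fetch(map, :admin) do
      # Matches only when the key exists AND its value is exactly `true`
      {:ok, true} ->
        "admin"

      # The key exists, but holds anything other than `true`
      {:ok, _other} ->
        "regular user (admin flag is not true)"

      # The key is missing altogether
      :error ->
        "regular user (no :admin key)"
    end
  end
end

IO.inspect(Role.describe(%{name: "Jane Doe", admin: true}))
# => "admin"
IO.inspect(Role.describe(%{name: "Joe Doe", admin: false}))
# => "regular user (admin flag is not true)"
IO.inspect(Role.describe(%{name: "Joe Doe"}))
# => "regular user (no :admin key)"
```

The clauses are tried from top to bottom, so the more specific `{:ok, true}` has to come before the catch-all `{:ok, _other}`.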

What’s Next?

This is the fourth episode of Learn Elixir with a Rubyist, yay! It’s awesome to have you here. This is a series of short bar-like conversations around Elixir, aimed mostly at helping Ruby developers understand it. In the next episodes we will gain some traction, talk about recursion and dive a little deeper into the Task module and concurrency in Elixir. If you liked it, please let me know in the comments below and over on Twitter.