In a recent post, we started discussing how to scrape the SEC website for 13F filings.

I have spent most of my career so far working with object-oriented languages like Ruby and JavaScript, so Elixir’s functional approach threw me off a bit at first.

In Ruby, you often work with arrays, hashes, and combinations of those. Each of these has an easy way to access the data inside.

For an array:

array = [1, 2, 3.2, "bob", Object.new]
array[0]
=> 1
array[3]
=> "bob"

For a hash:

stock = { ticker: "AAPL", price: 262.64 }
stock[:ticker]
=> "AAPL"
stock[:price]
=> 262.64

Easy.

In Ruby, hashes and arrays are objects, which means when you use these square bracket access methods, you are actually just calling methods on an object.

For complex objects, you might see a chain of accessors like:

stock[:history][1][:prices][:max][1]

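In Elixir, square brackets work on maps and keyword lists, but not for indexing into lists or tuples by position, so a chain like this doesn’t carry over directly. Let’s look at how we access these types of things in Elixir.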

First, Elixir has an array-like data type called a List. Lists implement the “Enumerable” protocol, and therefore we can use Enum functions on them.

list = [1, 2, 3.2, "abcd"]
Enum.at(list, 0)
=> 1
Enum.at(list, 3)
=> "abcd"
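
A few other behaviors of Enum.at are worth knowing when poking around in data. The sketch below reuses the list from above; the variable names are only for illustration. Keep in mind that Elixir lists are linked lists, so Enum.at has to walk the list to reach the index you ask for.

```elixir
list = [1, 2, 3.2, "abcd"]

# An out-of-range index returns nil instead of raising.
missing = Enum.at(list, 10)

# A third argument supplies a default for out-of-range indexes.
fallback = Enum.at(list, 10, :none)

# Negative indexes count from the end of the list.
last = Enum.at(list, -1)

# Pattern matching splits a list into its head and tail.
[first | rest] = list

IO.inspect({missing, fallback, last, first, rest})
```

Since a missing index quietly comes back as nil, a nil further down a chain of lookups is often a sign that an earlier index was off.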

Elixir has another data type called a Tuple that turned up a lot in the previously discussed article. Here is a quick look at a Tuple:

items = {"apple", "banana", "carrot", 42}
elem(items, 0)
=> "apple"
elem(items, 3)
=> 42

What makes this different from a List? From the docs:

Tuples are intended as fixed-size containers for multiple elements. To manipulate a collection of elements, use a list instead. Enum functions do not work on tuples.
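
To see what that quote means in practice, here is a small sketch using the same tuple as above. The variable names are mine, and converting with Tuple.to_list is just one way to get at the Enum functions, not necessarily what you would want for large tuples.

```elixir
items = {"apple", "banana", "carrot", 42}

# tuple_size/1 returns the number of elements in the tuple.
size = tuple_size(items)

# Passing a tuple to an Enum function raises, because tuples
# do not implement the Enumerable protocol.
enum_result =
  try do
    Enum.count(items)
  rescue
    Protocol.UndefinedError -> :not_enumerable
  end

# Converting to a list first makes the Enum functions available.
strings = Enum.map(Tuple.to_list(items), &to_string/1)

IO.inspect({size, enum_result, strings})
```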

I haven’t gotten super deep into the philosophy yet, so we’ll just leave it there for now. What I’m interested in is how to access the nested data types I’ve been getting back from scraping the web. Something that looks like:

data = [
  { 
    "stock", ["AAPL"], [
      245.21, 0.4, "positive"
    ]
  }
]

This is fake data, but suppose the 245.21 is the last price for the stock AAPL and we want to access it. We can do that like:

stock_tuple = Enum.at(data, 0)
price_info = elem(stock_tuple, 2)
price = Enum.at(price_info, 0)
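
When the shape of the data is known ahead of time, pattern matching is another option. This is a sketch that assumes the data always looks exactly like the fake example above (a one-element list holding a three-element tuple); the names ticker and price are my own.

```elixir
data = [
  {"stock", ["AAPL"], [245.21, 0.4, "positive"]}
]

# Destructure the outer list, the tuple, and the inner lists in one match.
# Names starting with an underscore mark values we do not need.
[{_label, [ticker], [price | _rest]}] = data

IO.inspect({ticker, price})
```

If the data does not have this exact shape, the match raises a MatchError, so this approach fits best when the structure is predictable.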

Breaking the steps out onto separate lines works. If we want to shorten it, we can merge everything onto one line like this:

Enum.at(elem(Enum.at(data, 0), 2), 0)

That works, but it’s basically unreadable. Elixir comes with a really cool pipe operator (|>) that passes the result of the expression on its left side into the function call on its right side as the first argument. For example, our previous code becomes:

Enum.at(data, 0) |> elem(2) |> Enum.at(0)

In the first part, Enum.at(data, 0) returns what we previously stored in stock_tuple above. In the middle section, elem(2) receives the result of the first function (stock_tuple) as an automatic first argument. Then the same thing happens in the last section.
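
As a quick check that the forms really are equivalent, the sketch below runs the nested call and a pipeline that starts from the raw data with one step per line, which is how pipelines are commonly laid out. It also shows a path-style lookup using get_in with the accessor helpers from Elixir's Access module, which is probably the closest thing to the Ruby accessor chain. The variable names are just for illustration.

```elixir
data = [
  {"stock", ["AAPL"], [245.21, 0.4, "positive"]}
]

# The nested one-liner from earlier.
nested = Enum.at(elem(Enum.at(data, 0), 2), 0)

# Starting the pipeline with the raw value, one step per line.
piped =
  data
  |> Enum.at(0)
  |> elem(2)
  |> Enum.at(0)

# Path-style access: list index 0, tuple position 2, list index 0.
by_path = get_in(data, [Access.at(0), Access.elem(2), Access.at(0)])

IO.inspect({nested, piped, by_path})
```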

I find this particularly handy when I’m working in the terminal doing data exploration. I can run a command to see what data I get back, press the up arrow to recall it, and run it again with an additional pipe to explore the data further.
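
One more trick that fits this kind of exploration: IO.inspect prints a value and then returns it unchanged, so it can be dropped between pipeline steps without affecting the result. This is a sketch using the fake data from above; the label text is arbitrary.

```elixir
data = [
  {"stock", ["AAPL"], [245.21, 0.4, "positive"]}
]

price =
  data
  |> Enum.at(0)
  # Prints the tuple, then hands it on to the next step untouched.
  |> IO.inspect(label: "after Enum.at(0)")
  |> elem(2)
  # Prints the inner list of price info.
  |> IO.inspect(label: "after elem(2)")
  |> Enum.at(0)

IO.inspect(price, label: "price")
```

Once the pipeline returns what you expect, the IO.inspect lines can simply be deleted.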
