Lazy functional Ruby

Today, I was working with some Ruby code that had to find the first product in one of the current contexts. Here is the code:

def find_product_in_current_contexts
  context_ids = [1, 2, 3]

  context_ids.each do |context_id|
    product = Product.find_by(context_id: context_id)
    return product if product
  end
end

This code tries to find the first product in the current contexts in the order they are defined. However, the above code has a tiny bug. Can you figure out what it is?

In cases where there are no products in any of the contexts, this function returns the array [1, 2, 3] instead of nil, because Array#each returns the array it was called on, and when we don’t find a product we never return early.

We can easily fix this by adding an extra return at the end of the function.

def find_product_in_current_contexts
  context_ids = [1, 2, 3]

  context_ids.each do |context_id|
    product = Product.find_by(context_id: context_id)
    return product if product
  end

  # if it reaches this point we haven't found a product
  return nil
end

The fix is awkward; let us see if we can improve it.

We could use .map to find a product for every context and return the first non-nil record, like so:

def find_product_in_current_contexts
  context_ids = [1, 2, 3]

  context_ids
    .map { |context_id| Product.find_by(context_id: context_id) }
    .find { |x| x }
end

This looks much cleaner! And it doesn’t have the previous bug either. However, this code is not efficient: we want to return the first product we find, but the above code looks in all the contexts even if it finds a product in the first one. We need to be lazy!

Lazy enumerator for the win!

Calling .lazy on an enumerable gives you a lazy enumerator, and the neat thing about it is that it executes the chain of functions only as many times as needed.

Here is a short example which demonstrates its use:

def find(id)
  puts "> finding #{id}"
  return :product if id == 2
end

# without lazy
(1..3).map { |id| find(id) }.find { |x| x }
# > finding 1
# > finding 2
# > finding 3
# => :product

# The above `.map` gets executed for every element in the range every time!

# using the lazy enumerator
(1..3).lazy.map { |id| find(id) }.find { |x| x }
# > finding 1
# > finding 2
# => :product

As you can see from the above example, the lazy enumerator executes only as many times as necessary. Here is another example from the Ruby docs, to drive the point home:

irb> (1..Float::INFINITY).lazy.select(&:odd?).drop(10).take(2).to_a
# => [21, 23]
# Without the lazy enumerator, this would hang forever trying to
# build an infinite intermediate array!

Now applying this to our code is pretty straightforward: we just need to add a call to .lazy before we map and we are all set!

def find_product_in_current_contexts
  context_ids = [1, 2, 3]

  context_ids
    .lazy # this gives us the lazy enumerator
    .map { |context_id| Product.find_by(context_id: context_id) }
    .find { |x| x }
end

Ah, nice functional Ruby!

How to know which of the Enum functions to use in Elixir

When you are writing functional code, it is sometimes difficult to figure out which of the Enum functions you want to use. Here are a few common use cases.

Use Enum.map

You can use Enum.map when you want to transform a set of elements into another set of elements. Note that the count of elements remains unchanged: if you transform a list of 5 elements using Enum.map, you get an output list containing exactly 5 elements. However, the shape of the elements might be different.

Examples

# transform names into their lengths
iex> Enum.map(["jack", "mujju", "danny boy"], fn x -> String.length(x) end)
[4, 5, 9]

If you look at the count of input and output elements, it remains the same. However, the shape is different: the input elements are all strings whereas the output elements are all numbers.

# get ids of all users from a list of structs
iex> Enum.map([%{id: 1, name: "Danny"}, %{id: 2, name: "Mujju"}], fn x -> x.id end)
[1, 2]

In this example we transform a list of maps into a list of numbers.

Use Enum.filter

When you want to whittle down your input list, use Enum.filter. Filtering doesn’t change the shape of the data, i.e. you are not transforming elements, so the shape of the output data will be the same as the shape of the input data. However, the count of elements will be different; to be more precise, it will be less than or equal to the input list’s count.

Examples

# filter a list to only get names which start with `m`
iex> Enum.filter(["mujju", "danny", "min", "moe", "boe", "joe"], fn x -> String.starts_with?(x, "m") end)
["mujju", "min", "moe"]

The shape of the data here is the same: we use a list of strings as the input and get a list of strings as the output. Only the count has changed; in this case, we have fewer elements.

# filter a list of users to only get active users
iex> Enum.filter([%{id: 1, name: "Danny", active: true}, %{id: 2, name: "Mujju", active: false}], fn x -> x.active end)
[%{active: true, id: 1, name: "Danny"}]

In this example too, the shape of the input elements is a map (a user) and the shape of the output elements is still a map.

Use Enum.reduce

The last of the commonly used Enum functions is Enum.reduce, and it is also one of the most powerful. You can use Enum.reduce when you need to change the shape of the input list into something else, for instance a map or a number.

Examples

Change a list of elements into a number by computing its product or sum

iex> Enum.reduce(
       _input_enumerable = [1, 2, 3, 4],
       _start_value_of_acc = 1,
       fn x, acc -> x * acc end)
24

iex> Enum.reduce(
       _input_list = [1, 2, 3, 4],
       _start_value_of_acc = 0,
       fn x, acc -> x + acc end)
10

Enum.reduce takes three arguments: the first is the input enumerable, which is usually a list or a map; the second is the starting value of the accumulator; and the third is a function that is applied to each element along with the current accumulator, whose result becomes the accumulator for the next application.
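
To make the accumulator threading concrete, here is the sum example from above expanded step by step:

Enum.reduce([1, 2, 3, 4], 0, fn x, acc -> x + acc end)
# step 1: x = 1, acc = 0 -> returns 1
# step 2: x = 2, acc = 1 -> returns 3
# step 3: x = 3, acc = 3 -> returns 6
# step 4: x = 4, acc = 6 -> returns 10
# the final accumulator, 10, is the result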

Let’s try and understand this using an equivalent JavaScript example.

// input list
const inputList = [1, 2, 3, 4]

// starting value of the accumulator, we want to choose this wisely, for instance
// when we want addition, we should use a `0` as the start value to avoid
// impacting the output and if we want to compute a product we use a `1`, this
// is usually called the identity element for the function: https://en.wikipedia.org/wiki/Identity_element
// It is also the value that is returned when the input list is empty
let acc = 0

// loop over all the input elements and for each element compute the new
// accumulator as the sum of the current accumulator and the current element
for (const x of inputList) {
  // compute the next value of our accumulator, in our Elixir code this is
  // done by the third argument which is a function which gets `x` and `acc`
  acc = acc + x
}

// in Elixir, the final value of the accumulator is returned

Let’s look at another example: converting an employee list into a map of employee ids to their names.

iex> Enum.reduce(
       _input_list = [%{id: 1, name: "Danny"}, %{id: 2, name: "Mujju"}],
       _start_value_of_acc = %{},
       fn x, acc -> Map.put(acc, x.id, x.name) end)

%{1 => "Danny", 2 => "Mujju"}

So, with reduce you end up collapsing an input list into a single output value, in this case a map.
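
In fact, reduce is general enough that map and filter can be written in terms of it. Here is a small sketch (my own illustration, not how Enum is actually implemented):

defmodule MyEnum do
  # map: transform every element, the count stays the same
  def map(list, fun) do
    list
    |> Enum.reduce([], fn x, acc -> [fun.(x) | acc] end)
    |> Enum.reverse()
  end

  # filter: keep only matching elements, the shape stays the same
  def filter(list, pred) do
    list
    |> Enum.reduce([], fn x, acc -> if pred.(x), do: [x | acc], else: acc end)
    |> Enum.reverse()
  end
end

# iex> MyEnum.map([1, 2, 3], fn x -> x * 2 end)
# [2, 4, 6]
# iex> MyEnum.filter([1, 2, 3, 4], fn x -> rem(x, 2) == 0 end)
# [2, 4]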

How to control the enqueuing speed of Sidekiq jobs and their concurrency

At my work, we use Ruby heavily, and Sidekiq is an essential part of our stack. Sometimes, I long for the concurrency primitives from Elixir, but that’s not what today’s post is about.

A few days ago, I caused a minor incident by overloading our databases. Having been away from Ruby for a bit, I had forgotten that Sidekiq runs multiple threads per worker instance. So, I ended up enqueuing about 10K jobs on Sidekiq, and Sidekiq started executing them immediately. We have 50 worker instances and run Sidekiq with a concurrency of 20, so essentially we had 400 worker threads ready to start crunching these jobs. Coincidentally, we have 400 database connections available, and my batch background job ended up consuming all the connections for 5 minutes, during which the other parts of the application were connection-starved and started throwing errors 😬.

That was a dumb mistake. Whenever you find yourself making a dumb mistake, make sure that no one else can repeat it. To fix this, we could set up our database with multiple users in such a way that the web app would connect with a user which could only open a maximum of 100 connections, the background worker with a user with its own limits, and so on. This would stop these kinds of problems from happening again. However, we’ll get there when we get there, as this would require infrastructure changes.

I had another batch job lined up which had to process millions of rows in a similar fashion, so I started looking for solutions. One suggestion was to run these jobs on a single worker or a small set of workers; you can do this by having a custom queue for this job and executing a separate Sidekiq instance just for this one queue. However, that would require some infrastructure work. So, I started looking at other options.

I thought that Redis might have something to help us here, and it did! Redis allows you to make blocking pops from a list using the BLPOP command: if you run BLPOP myjob 10, it pops the first available element of the list; however, if the list is empty, it blocks for up to 10 seconds, and if an element is inserted during that window, it pops it and returns its value. Using this knowledge, I thought we could control the enqueuing based on the elements in the list. The idea is simple (a sketch of the blocking behavior follows the list below):

  1. Before the background job starts, I would seed this list with n elements, where n is the desired concurrency. So, if I seed this list with 2 elements, Sidekiq would execute only 2 of these jobs at any point in time, regardless of the number of worker instances or the concurrency of the Sidekiq workers.
  2. This is enforced by the enqueue function doing a BLPOP before it enqueues: as soon as the enqueuer starts, it pops the first 2 elements from the Redis list and enqueues 2 jobs. At this point, the enqueuer is stuck till we add more elements to the list.
  3. That’s where the background jobs come into play: at the end of each background job, we add one element back to the list using LPUSH, and as soon as an element is added, the enqueuer blocked at BLPOP pops this element and enqueues another job. This goes on till all your background jobs are enqueued, all the while making sure that there are never more than 2 jobs running at any given time.
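
Here is BLPOP’s blocking behavior in isolation, as a minimal sketch in Elixir using the Redix client (Redix, the connection details, and the list contents here are my assumptions for illustration, not part of the setup below):

{:ok, conn} = Redix.start_link(host: "localhost", port: 6379)

# BLPOP on an empty list blocks for up to 5 seconds, then returns nil
{:ok, nil} = Redix.command(conn, ["BLPOP", "myjob", "5"], timeout: 6_000)

# once an element is pushed (possibly by another process)...
{:ok, _len} = Redix.command(conn, ["LPUSH", "myjob", "slot-1"])

# ...a BLPOP returns immediately with the list name and the popped value
{:ok, ["myjob", "slot-1"]} = Redix.command(conn, ["BLPOP", "myjob", "5"])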

Let’s put this into concrete Ruby code.

module ControlledConcurrency
  # I love module_function
  module_function

  # The name of our list needs to be constant per worker type, you could
  # probably extract this into a Sidekiq middleware with a little effort
  LIST_NAME = "migrate"

  def setup(concurrency:)
    # if our list already has elements before we start, our concurrency will be
    # screwed, so, this is a safety check!
    slot_count = Redis.current.llen(LIST_NAME)
    raise "Key '#{LIST_NAME}' is being used, it already has #{slot_count} slots" if slot_count > 0

    # Seed our list with as many items as the concurrency, the contents of this
    # list don't matter.
    Redis.current.lpush(LIST_NAME, concurrency.times.to_a)
  end

  # A helper function to bump up concurrency if you need to
  def increase_concurrency(n = 1)
    Redis.current.lpush(LIST_NAME, n.times.to_a)
  end

  # A helper function to bump the concurrency down if you need to
  def decrease_concurrency(n = 1)
    n.times do
      puts "> waiting"
      Redis.current.blpop(LIST_NAME)
      puts "> decrease by 1"
    end
  end

  # This is our core enqueuer, it runs in a loop because our blpop might get a
  # timeout and return nil, we keep trying till it returns a value
  def nq(&block)
    loop do
      puts "> waiting to enqueue"
      slot = Redis.current.blpop(LIST_NAME)
      if slot
        puts "> found slot #{slot}"
        yield
        return
      end
    end
  end

  # Function which allows background workers to signal that a job has been
  # completed, so that the enqueuer can nq more jobs.
  def return_slot
    puts "> returning slot"
    Redis.current.lpush(LIST_NAME, 1)
  end
end

# This is our Sidekiq worker
class HardWorker
  include Sidekiq::Worker

  # Our setup doesn't enforce concurrency across retries, if you want this,
  # you'll probably have to tweak the code a little more :)
  sidekiq_options retry: false

  # the only custom code here is in the ensure block
  def perform(user_id)
    puts "> start: #{user_id}"
    # mock work
    sleep 1
    puts "> finish: #{user_id}"
  ensure
    # make sure that we return this slot at the end of the background job, so
    # that the next job can be enqueued. This doesn't handle retries because of
    # failures, we disabled retries for our job, but if you have them enabled,
    # you might end up having more jobs than the set concurrency because of
    # retried jobs.
    ControlledConcurrency.return_slot
  end
end

# ./concurrency_setter.rb
ControlledConcurrency.setup(concurrency: ARGV.first.to_i)

# ./enqueuer.rb
# Before running the enqueuer, we need to set up the concurrency using the above script
# This is our enqueuer, and it makes sure that the block passed to
# ControlledConcurrency.nq doesn't enqueue more jobs than our concurrency
# setting.
100.times do |i|
  ControlledConcurrency.nq do
    puts "> enqueuing user_id: #{i}"
    HardWorker.perform_async(i)
  end
end

That’s all folks! Hope you find this useful!

The full code for this can be found at: https://github.com/minhajuddin/sidekiq-controlled-concurrency

How to create a web server using Cowboy without Plug or Phoenix - Part 01

Cowboy is an amazing web server that is used by Plug/Phoenix out of the box; I don’t think Phoenix supports any other web servers at the moment. However, the Plug adapter is fairly abstracted, and Plug implements this adapter for Cowboy through the plug_cowboy hex package. In theory, you should be able to write a new adapter if you just implement the Plug adapter behaviour. The plug_cowboy adapter has a lot of interesting code, and you’ll learn a lot from reading it. Anyway, this blog post isn’t about Plug or Phoenix. I wanted to show off how you can create a simple Cowboy server without using Plug or Phoenix (I had to learn how to do this while creating my side project webpipe).

We want an application which spins up a Cowboy server and renders a hello world message. Here is the required code for that:

defmodule Hello do
  # The handler module which handles all requests, its `init` function is called
  # by Cowboy for all matching requests.
  defmodule Handler do
    def init(req, _opts) do
      resp =
        :cowboy_req.reply(
          _status = 200,
          _headers = %{"content-type" => "text/html; charset=utf-8"},
          _body = "<!doctype html><h1>Hello, Cowboy!</h1>",
          _request = req
        )

      {:ok, resp, []}
    end
  end

  def start do
    # compile the routes
    routes =
      :cowboy_router.compile([
        {:_,
         [
           # { wildcard, handler module (needs to have an init function), options }
           {:_, Handler, []}
         ]}
      ])

    require Logger
    Logger.info("Starting server at http://localhost:4001/")

    # start an http server
    :cowboy.start_clear(
      :hello_http,
      [port: 4001],
      %{env: %{dispatch: routes}}
    )
  end
end
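
To take it for a spin, start the server from an IEx session (a quick sketch; it assumes :cowboy is already among your project’s dependencies):

# in an IEx session: iex -S mix
iex> Hello.start()
# => {:ok, #PID<0.123.0>}

# then, from a shell:
#   curl http://localhost:4001/
#   <!doctype html><h1>Hello, Cowboy!</h1>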

And, here is a quick test to assert that it works!

defmodule HelloTest do
  use ExUnit.Case

  test "returns hello world" do
    assert {:ok, {{'HTTP/1.1', 200, 'OK'}, _headers, '<!doctype html><h1>Hello, Cowboy!</h1>'}} =
             :httpc.request('http://localhost:4001/')
  end
end

Full code on GitHub

My first SVG creation

SVG is amazing; I want to design the logo of my next company using it!

<svg style='border: solid 1px #0f0' viewBox='0 0 200 200' stroke='#44337a' fill='#6b46c1'>
  <circle cx='100' cy='100' r='80' fill='none' />

  <circle cx='60' cy='60' r='10' fill='none' stroke='black' />
  <circle cx='60' cy='60' r='6' fill='#0074D9' stroke='none' />

  <circle cx='140' cy='60' r='10' fill='none' stroke='black' />
  <circle cx='140' cy='60' r='6' fill='#0074D9' stroke='none' />

  <path d="
    M 90,140
    A 8 5 0 1 0 110,140
    z
  " fill='none' />

  <circle cx='100' cy='100' r='20' fill='#FF4136' stroke='none' />
</svg>

How to use a single Aurora cluster for multiple databases, each with its own restricted user

I have been playing around with Terraform for the last few days, and it is an amazing tool to manage infrastructure. For my AWS infrastructure, I needed an Aurora PostgreSQL cluster which would allow hosting of multiple databases, one per side project, while also keeping them isolated and preventing one app’s user from accessing another app’s databases.

Terraform has an awesome postgresql provider which can be used for managing databases. However, there are a few parts which are tricky and needed trial and error to get right.

Connecting to an RDS database via an SSH tunnel

The first roadblock was that my RDS cluster wasn’t accessible publicly (which is how it should be, for security reasons). I do have a way to connect to my Postgres servers via a bastion host, so I thought we could use an SSH tunnel over the bastion host to get to the RDS cluster from my local computer. However, Terraform doesn’t support connecting to the Postgres server over an SSH tunnel via its configuration.

So, it required a little bit of jerry-rigging. The postgresql provider is happy as long as it can reach the Postgres cluster using a host, port, and password. So, I set up a local tunnel outside Terraform via my SSH config like so:

Host bastion
  Hostname ec2-180-21-145-48.us-east-2.compute.amazonaws.com
  IdentityFile ~/.ssh/aws_ssh.pem

Host ecs1-pg
  LocalForward localhost:3333 hn-aurora-pg-1.hosturl.us-east-2.rds.amazonaws.com:5432

Host ecs1 ecs1-pg
  Hostname 20.10.22.214
  User ec2-user
  IdentityFile ~/.ssh/aws_ssh.pem
  ForwardAgent yes
  ProxyJump bastion

The relevant line here is the LocalForward declaration, which wires up a local port forward so that when your network traffic hits port 3333 on localhost, it is tunneled over the bastion and then the ecs server, and is routed to your cluster’s port 5432. One thing to note here is that your ECS cluster should be able to connect to your RDS cluster via proper security group rules.

Setting up the postgres provider

Once you have the SSH tunnel set up, you can start wiring up your postgres provider for Terraform like so:

provider "postgresql" {
  version = "~> 1.5"

  # LocalForwarded on the local computer via an SSH tunnel to
  # module.hn_db.this_rds_cluster_endpoint
  # via
  # LocalForward localhost:3333 module.hn_db.this_rds_cluster_endpoint:5432
  host            = "localhost"
  port            = 3333
  username        = "root"
  superuser       = false
  password        = module.hn_db.this_rds_cluster_master_password
  sslmode         = "require"
  connect_timeout = 15
}

The provider config is pretty straightforward: we point it to localhost:3333 with a root user (which is the master user created by the RDS cluster). So, when you connect to localhost:3333, you are actually connecting to the RDS cluster through an SSH tunnel (make sure that your SSH connection is open at this point, via ssh ecs1-pg in a separate terminal). We also need to set superuser to false because RDS doesn’t give us a postgres superuser; getting this wrong initially caused me a lot of frustration.

Setting up the database and its user

Now that our cluster connectivity is set up, we can start creating the databases and users, each for one of our apps.

Below is a sensible configuration for a database called liveform_prod and its user called liveform.

locals {
  lf_connection_limit  = 5
  lf_statement_timeout = 60000 # 1 minute
}

resource "postgresql_database" "liveform_db" {
  name             = "liveform_prod"
  owner            = postgresql_role.liveform_db_role.name
  connection_limit = local.lf_connection_limit
}

resource "postgresql_role" "liveform_db_role" {
  name              = "liveform"
  login             = true
  password          = random_password.liveform_db_password.result
  connection_limit  = local.lf_connection_limit
  statement_timeout = local.lf_statement_timeout
}

resource "random_password" "liveform_db_password" {
  length  = 40
  special = false
}

output "liveform_db_password" {
  description = "Liveform db password"
  value       = random_password.liveform_db_password.result
}

A few things to note here:

  1. The database liveform_prod is owned by a new user called liveform.
  2. It has a connection limit of 5; you should always set a sensible connection limit to prevent this app from crashing the cluster.
  3. The db user too has a connection limit of 5 and a statement timeout of 1 minute, which is big enough for web apps; you should set it to the shortest duration that works for your app.
  4. A random password (via the random_password resource) is used as the password of our new liveform role. It can be viewed by running terraform show.

Isolating this database from other users

By default Postgres allows all users to connect to all databases and create/view all the tables. We want our databases to be isolated properly so that one app’s user cannot access another app’s database. This requires running some SQL on the newly created database. We can easily do this using a null_resource and a local-exec provisioner like so:

resource "null_resource" "liveform_db_after_create" {
  depends_on = [
    postgresql_database.liveform_db,
    postgresql_role.liveform_db_role
  ]

  provisioner "local-exec" {
    command = "./pg_database_roles_setup.sh"
    environment = {
      PG_DB_ROLE_NAME = postgresql_role.liveform_db_role.name
      PG_DB_NAME      = postgresql_database.liveform_db.name
      PGPASSWORD      = random_password.liveform_db_password.result
    }
  }
}

./pg_database_roles_setup.sh script:

#!/bin/bash

set -e

# This needs an SSH TUNNEL to be set up
# password needs to be supplied via the PGPASSWORD env var
psql --host "localhost" \
     --port "3333" \
     --username "$PG_DB_ROLE_NAME" \
     --dbname "$PG_DB_NAME" \
     --file - <<SQL
REVOKE CONNECT ON DATABASE $PG_DB_NAME FROM PUBLIC;
GRANT CONNECT ON DATABASE $PG_DB_NAME TO $PG_DB_ROLE_NAME;
GRANT CONNECT ON DATABASE $PG_DB_NAME TO root;
SQL

The pg_database_roles_setup.sh script connects to our RDS cluster over the SSH tunnel, to the newly created database as the newly created user, revokes connect privileges on this database from all users, and then grants connect privileges to the app user and the root user. You can add more queries to this script that you might want to run after the database is set up. Finally, the local-exec provisioner passes the right data via environment variables and calls the database setup script.

Gotchas

If you create a postgresql_role before setting the connection’s superuser to false, you’ll get stuck trying to update or delete the new role. To work around this, manually log in to the RDS cluster via psql and DROP the role, then remove this state from Terraform using: terraform state rm postgresql_role.liveform_db_role

How to create temporary bastion EC2 instances using Terraform

I have recently started learning Terraform to manage my AWS resources, and it is a great tool for maintaining your infrastructure! I use a bastion host to SSH into my main servers and bring it up on demand only when I need it, which gives me some cost savings. Here are the required Terraform files to get this working.

Set up the bastion.tf file like so:

# get a reference to aws_ami.id using a data resource by finding the right AMI
data "aws_ami" "ubuntu" {
  # pick the most recent version of the AMI
  most_recent = true

  # Find the 20.04 image
  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-*"]
  }

  # With the right virtualization type
  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }

  # And the image should be published by Canonical (which is a trusted source)
  owners = ["099720109477"] # Canonical's owner_id, don't change this
}

# Configuration for your bastion EC2 instance
resource "aws_instance" "bastion" {
  # Use the AMI from the above step
  ami = data.aws_ami.ubuntu.id

  # We don't need a heavy duty server, t2.micro should suffice
  instance_type = "t2.micro"

  # We use a variable which can be set to true or false in the terraform.tfvars
  # file to control creating or destroying the bastion resource on demand.
  count = var.bastion_enabled ? 1 : 0

  # The ssh key name
  key_name = var.ssh_key_name

  # This should refer to the subnet in which you want to spin up the Bastion host
  # You can even hardcode this ID by getting a subnet id from the AWS console
  subnet_id = aws_subnet.subnet[0].id

  # The 2 security groups here have 2 important rules
  # 1. hn_bastion_sg: opens up Port 22 for just my IP address
  # 2. default: sets up an open network within the security group
  vpc_security_group_ids = [aws_security_group.hn_bastion_sg.id, aws_default_security_group.default.id]

  # Since we want to access this via the internet, we need a public IP
  associate_public_ip_address = true

  # Some useful tags
  tags = {
    Name = "Bastion"
  }
}

# We want to output the public_dns name of the bastion host when it spins up
output "bastion-public-dns" {
  value = var.bastion_enabled ? aws_instance.bastion[0].public_dns : "No-bastion"
}

Set up the terraform.tfvars file like so:

# Set this to `true` and do a `terraform apply` to spin up a bastion host
# and when you are done, set it to `false` and do another `terraform apply`
bastion_enabled = false

# My SSH keyname (without the .pem extension)
ssh_key_name = "hyperngn_aws_ohio"

# The IP of my computer. Do a `curl -sq icanhazip.com` to get it
# Look for the **ProTip** down below to automate this!
myip = ["247.39.103.23/32"]

Set up the vars.tf file like so:

variable "ssh_key_name" {
  description = "Name of AWS key pair"
}

variable "myip" {
  type        = list(string)
  description = "My IP to allow SSH access into the bastion server"
}

variable "bastion_enabled" {
  description = "Spins up a bastion host if enabled"
  type        = bool
}

Below are the relevant sections from my vpc.tf. You could just hardcode these values in bastion.tf, or use data resources if you’ve set these up manually and resources if you use Terraform to control them.

resource "aws_subnet" "subnet" {
  # ...
}

# Allows SSH connections from our IP
resource "aws_security_group" "hn_bastion_sg" {
  name   = "hn_bastion_sg"
  vpc_id = aws_vpc.vpc.id

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = var.myip
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# Allow inter security group connections
resource "aws_default_security_group" "default" {
  vpc_id = aws_vpc.vpc.id

  ingress {
    protocol  = -1
    self      = true
    from_port = 0
    to_port   = 0
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

Finally, you need to set up your ~/.ssh/config to use the bastion as the jump host, like so:

# Bastion config
Host bastion
  # Change the hostname to whatever you get from terraform's output
  Hostname ec2-5-55-128-160.us-east-2.compute.amazonaws.com
  IdentityFile ~/.ssh/hyperngn_aws_ohio.pem

# ECS cluster machines
Host ecs1
  Hostname 20.10.21.217
  User ec2-user
  IdentityFile ~/.ssh/hyperngn_aws_ohio.pem
  ForwardAgent yes
  ProxyJump bastion

# This section is optional but allows you to reuse SSH connections
Host *
  User ubuntu
  Compression yes
  # send a keep-alive ping every 60 seconds
  ServerAliveInterval 60
  ControlMaster auto
  ControlPath /tmp/ssh-%r@%h:%p

Once you are done, you can just log in by running the following command, and it should work seamlessly:

ssh ecs1

Pro-Tip: Put the following in your terraform folder’s .envrc, so that you don’t have to manually copy-paste your IP every time you bring your bastion host up (you also need to have direnv installed for this to work).

$ cat .envrc
export TF_VAR_myip="[\"$(curl -sq icanhazip.com)/32\"]"

Gotchas

  1. If you run into any issues, use the ssh -vv ecs1 command to get copious logs, and read through all of them to figure out what might be wrong.
  2. Make sure you are using the correct User: Ubuntu AMIs create a user called ubuntu whereas Amazon ECS-optimized AMIs create an ec2-user user; if you get the user wrong, SSH will fail.
  3. Use private IPs for the target servers that you are jumping into, and the public IP or public DNS for your bastion host.
  4. Make sure your bastion host is in the same VPC, with a default security group which allows inter security group communication and a security group which opens up the SSH port for your IP. If they are not on the same VPC, make sure they have the right security groups to allow communication from the bastion host to the target host, specifically on port 22. You can use VPC flow logs to figure out problems in your network.

From a security point of view this is a pretty great setup: your normal servers don’t allow any SSH access (and in my case aren’t even public and are fronted by ALBs), and your bastion host is not up all the time; even when it is up, it only allows traffic from your single IP. It also saves cost by tearing down the bastion instance when you don’t need it.

many_to_many relationships in Ecto and Phoenix for Products and Tags

The other day I was helping a friend set up a Phoenix app which required the use of tags on products. We have all used tags in our day-to-day to add information to notes, images, and other stuff. Tags are just labels/chunks of text associated with an entity like a product, blog post, image, etc. This blog post has a few tags too (Ecto, Elixir, Phoenix, etc.). Tags help us organize information by annotating records with useful fragments of information. Modeling these in a database is pretty straightforward; it is usually implemented like the following design.

We have a many-to-many relation between the products and tags tables via a products_tags join table, which has just two columns, product_id and tag_id, and a composite primary key (while also having an index on tag_id to make lookups faster). The use of a join table is required; however, you usually want the join table to be invisible in your domain, as you don’t want to deal with a ProductTag model: it doesn’t serve any purpose other than helping you bridge the object model with the relational model. Anyway, here is how we ended up building the many-to-many relationship in Phoenix and Ecto.

Scaffolding the models

We use a nondescript Core context for our Product model by running the following scaffold code:

mix phx.gen.html Core Product products name:string description:text

This generates the following migration (I’ve omitted the boilerplate to make reading the relevant code easier):

create table(:products) do
  add :name, :string
  add :description, :text

  timestamps()
end

Don’t forget to add the following to your router.ex

resources "/products", ProductController

Then, we add the Tag in the same context by running the following scaffold generator:

mix phx.gen.html Core Tag tags name:string:unique

This generates the following migration. Note the unique index on name, as we don’t want tags with duplicate names; if you have separate tags per user, you would instead have a unique index on [:user_id, :name].

create table(:tags) do
  add :name, :string

  timestamps()
end

create unique_index(:tags, [:name])

Finally, we generate the migration for the join table products_tags (by convention it uses the pluralized names of both entities joined by an underscore, so products and tags joined by an _ gives us the name products_tags).

mix phx.gen.schema Core.ProductTag products_tags product_id:references:products tag_id:references:tags

This scaffolded migration requires a few tweaks to make it look like the following:

create table(:products_tags, primary_key: false) do
  add :product_id, references(:products, on_delete: :nothing), primary_key: true
  add :tag_id, references(:tags, on_delete: :nothing), primary_key: true
end

create index(:products_tags, [:product_id])
create index(:products_tags, [:tag_id])

Note the following:

  1. We added a primary_key: false declaration to the table() function call to avoid creating a wasted id column.
  2. We got rid of the timestamps() declaration as we don’t want to track inserts and updates on the joins. You might want to track inserts if you want to know when a product was tagged with a specific tag which makes things a little more complex, so, we’ll avoid it for now.
  3. We added a , primary_key: true to the :product_id and :tag_id lines to make [:product_id, :tag_id] a composite primary key.

Now our database is set up nicely for our many-to-many relationship. Here is how our tables look in the database:

product_tags_demo_dev=# \d products
Table "public.products"
┌─────────────┬────────────────────────────────┬───────────┬──────────┬─────────────────────────────┐
│ Column │ Type │ Collation │ Nullable │ Default │
├─────────────┼────────────────────────────────┼───────────┼──────────┼─────────────────────────────┤
│ id │ bigint │ │ not null │ nextval('products_id_seq'::…│
│ │ │ │ │…regclass) │
│ name │ character varying(255) │ │ │ │
│ description │ text │ │ │ │
│ inserted_at │ timestamp(0) without time zone │ │ not null │ │
│ updated_at │ timestamp(0) without time zone │ │ not null │ │
└─────────────┴────────────────────────────────┴───────────┴──────────┴─────────────────────────────┘
Indexes:
"products_pkey" PRIMARY KEY, btree (id)
Referenced by:
TABLE "products_tags" CONSTRAINT "products_tags_product_id_fkey" FOREIGN KEY (product_id) REFERENCES products(id)

product_tags_demo_dev=# \d tags
Table "public.tags"
┌─────────────┬────────────────────────────────┬───────────┬──────────┬─────────────────────────────┐
│ Column │ Type │ Collation │ Nullable │ Default │
├─────────────┼────────────────────────────────┼───────────┼──────────┼─────────────────────────────┤
│ id │ bigint │ │ not null │ nextval('tags_id_seq'::regc…│
│ │ │ │ │…lass) │
│ name │ character varying(255) │ │ │ │
│ inserted_at │ timestamp(0) without time zone │ │ not null │ │
│ updated_at │ timestamp(0) without time zone │ │ not null │ │
└─────────────┴────────────────────────────────┴───────────┴──────────┴─────────────────────────────┘
Indexes:
"tags_pkey" PRIMARY KEY, btree (id)
"tags_name_index" UNIQUE, btree (name)
Referenced by:
TABLE "products_tags" CONSTRAINT "products_tags_tag_id_fkey" FOREIGN KEY (tag_id) REFERENCES tags(id)

product_tags_demo_dev=# \d products_tags
Table "public.products_tags"
┌────────────┬────────┬───────────┬──────────┬─────────┐
│ Column │ Type │ Collation │ Nullable │ Default │
├────────────┼────────┼───────────┼──────────┼─────────┤
│ product_id │ bigint │ │ not null │ │
│ tag_id │ bigint │ │ not null │ │
└────────────┴────────┴───────────┴──────────┴─────────┘
Indexes:
"products_tags_pkey" PRIMARY KEY, btree (product_id, tag_id)
"products_tags_product_id_index" btree (product_id)
"products_tags_tag_id_index" btree (tag_id)
Foreign-key constraints:
"products_tags_product_id_fkey" FOREIGN KEY (product_id) REFERENCES products(id)
"products_tags_tag_id_fkey" FOREIGN KEY (tag_id) REFERENCES tags(id)

Getting tags to work!

Now comes the fun part, modifying our controllers and contexts to get our tags working!

The first thing we need to do is add a many_to_many relationship on the Product schema like so:

schema "products" do
  field :description, :string
  field :name, :string
  many_to_many :tags, ProductTagsDemo.Core.Tag, join_through: "products_tags"

  timestamps()
end

(Note that we don’t need to add this relationship on the other side, i.e., on Tag, to get this working.)

Now, we need to modify our Product form to show an input mechanism for tags. The easy way to do this is to ask the users to provide a comma-separated list of tags in an input textbox; a nicer way is to use a JavaScript library like select2. For us, a text box with comma-separated tags will suffice.

The easiest way to do this is to add a text field like so:

<%= label f, :tags %>
<%= text_input f, :tags %>
<%= error_tag f, :tags %>

However, as soon as you wire this up you’ll get an error on the /products/new page like below:

protocol Phoenix.HTML.Safe not implemented for #Ecto.Association.NotLoaded<association :tags is not loaded> of type Ecto.Association.NotLoaded (a struct).

This is telling us that the to_string function can’t convert an Ecto.Association.NotLoaded struct into a string. When a relation like belongs_to, has_one, or many_to_many isn’t loaded on a struct, it has this default value. This is coming from our controller; we can remedy it by changing our action to the following:

def new(conn, _params) do
  changeset = Core.change_product(%Product{tags: []})
  render(conn, "new.html", changeset: changeset)
end

Notice the tags: []: we are creating a new product with an empty tags collection so that it renders properly in the form.

Now that we have fixed our form, we can try submitting some tags through it. However, when you enter any tags and hit Save, nothing happens, which is not surprising because we haven’t set up the handling of these tags on the backend yet.

We know that the tags field has comma-separated tags, so we need to do the following to be able to save a product.

  1. Split tags on a comma.
  2. Strip them of whitespace.
  3. Lowercase them to make them homogeneous (if you want your tag names to be persisted using the input casing and still treat the uppercased version the same as the lowercased or capitalized versions, you can use :citext (short for case-insensitive text); read more about how to set up :citext columns in my blog post about storing username/email in a case insensitive fashion).
  4. Once we have all the tag names we can insert any new tags and then fetch the existing tags, combine them, and use put_assoc to put them on the product.

Step #4 creates a race condition in your code, which can happen when 2 requests try to create tags with the same name at the same time. An easy way to work around this is to treat all the tags as new and do an upsert using Repo.insert_all with an on_conflict: :nothing option, which adds the fragment ON CONFLICT DO NOTHING to your SQL, making your query run successfully even if there are tags with the same name in the database; it just skips inserting the conflicting rows. Also, note that this function inserts all the tags in a single query, doing a bulk insert of all the input tags. Once you upsert all the tags, you can then find them and use a put_assoc to create the association.

This is what ended up as the final Core.create_product function:

def create_product(attrs \\ %{}) do
  %Product{}
  |> Product.changeset(attrs)
  # use put_assoc to associate the input tags to the product
  |> Ecto.Changeset.put_assoc(:tags, product_tags(attrs))
  |> Repo.insert()
end

defp parse_tags(nil), do: []

defp parse_tags(tags) do
  # Repo.insert_all requires the inserted_at and updated_at to be filled out
  # and they should have time truncated to the second, that is why we need this
  now = NaiveDateTime.utc_now() |> NaiveDateTime.truncate(:second)

  for tag <- String.split(tags, ","),
      tag = tag |> String.trim() |> String.downcase(),
      tag != "",
      do: %{name: tag, inserted_at: now, updated_at: now}
end

defp product_tags(attrs) do
  tags = parse_tags(attrs["tags"]) # => [%{name: "phone", inserted_at: ..}, ...]

  # do an upsert ensuring that all the input tags are present
  Repo.insert_all(Tag, tags, on_conflict: :nothing)

  tag_names = for t <- tags, do: t.name
  # find all the input tags
  Repo.all(from t in Tag, where: t.name in ^tag_names)
end

It does the following:

  1. Normalize our tags
  2. Ensure that all the tags are in our database using Repo.insert_all with on_conflict: :nothing in a single SQL query.
  3. Load all the tag structs using the names.
  4. Use put_assoc to associate the tags with the newly created product.
  5. From here Ecto takes over and makes sure that our product has the right association records in the products_tags table.

Notice how, through all of our code, we haven’t used the products_tags table except for defining the many_to_many relationship in the Product schema.

This is all you need to insert a product with multiple tags. However, we still want to show the tags of a product on the product details page. We can do this by tweaking our action and the Core module like so:

defmodule Core do
  def get_product_with_tags!(id), do: Product |> preload(:tags) |> Repo.get!(id)
end

defmodule ProductTagsDemoWeb.ProductController do
  def show(conn, %{"id" => id}) do
    product = Core.get_product_with_tags!(id)
    render(conn, "show.html", product: product)
  end
end

Here we are preloading the tags with the product and we can use it in the view like below to show all the tags for a product:

Tags: <%= (for tag <- @product.tags, do: tag.name) |> Enum.join(", ") %>

This takes care of creating and showing a product with tags. However, if we try to edit a product, we are greeted with the following error:

protocol Phoenix.HTML.Safe not implemented for #Ecto.Association.NotLoaded<association :tags is not loaded> of type Ecto.Association.NotLoaded (a struct).

Hmmm, we have seen this before, when we rendered a new Product without tags. However, in this case our product does have tags; they just haven’t been loaded/preloaded. We can remedy that easily by tweaking our edit action to the following:

def edit(conn, %{"id" => id}) do
  product = Core.get_product_with_tags!(id)
  changeset = Core.change_product(product)
  render(conn, "edit.html", product: product, changeset: changeset)
end

This gives us a new error:

lists in Phoenix.HTML and templates may only contain integers representing bytes, binaries or other lists, got invalid entry: %ProductTagsDemo.Core.Tag{__meta__: #Ecto.Schema.Metadata<:loaded, "tags">, id: 1, inserted_at: ~N[2020-05-04 05:20:45], name: "phone", updated_at: ~N[2020-05-04 05:20:45]}

This is because we are using a text_input for a collection of tags, and when Phoenix tries to convert the list of tags into a string it fails. This is a good place to add a custom input function:

defmodule ProductTagsDemoWeb.ProductView do
  use ProductTagsDemoWeb, :view

  def tag_input(form, field, opts \\ []) do
    # get the input tags collection
    tags = Phoenix.HTML.Form.input_value(form, field)
    # render using text_input after converting the tags to text,
    # merging the value into any options that were passed in
    Phoenix.HTML.Form.text_input(form, field, Keyword.put(opts, :value, tags_to_text(tags)))
  end

  defp tags_to_text(tags) do
    tags
    |> Enum.map(fn t -> t.name end)
    |> Enum.join(", ")
  end
end

With this helper we can tweak our form to:

<%= label f, :tags %>
<%= tag_input f, :tags %>
<%= error_tag f, :tags %>
<small class="help-text">tags separated by commas</small>

Note that the text_input has been changed to tag_input.

Now, when we go to edit a product, it should render the form with the tags separated by commas. However, updating the product by changing tags still doesn’t work because we haven’t updated our backend code to handle this. To complete this, we need to tweak the controller and the Core context like so:

defmodule ProductTagsDemoWeb.ProductController do
  def update(conn, %{"id" => id, "product" => product_params}) do
    product = Core.get_product_with_tags!(id)
    # ... rest is the same
  end
end

defmodule ProductTagsDemo.Core do
  def update_product(%Product{} = product, attrs) do
    product
    |> Product.changeset(attrs)
    |> Ecto.Changeset.put_assoc(:tags, product_tags(attrs))
    |> Repo.update()
  end
end

Note that in the controller we are using get_product_with_tags!, and in the context we inserted a put_assoc line similar to the one in create_product.

Astute readers will observe that our create and update product implementations don’t roll back newly created tags when create_product or update_product fails. Let us handle this case and wrap up our post!

Ecto provides Ecto.Multi to allow easy handling of database transactions. This just needs changes to our context and our view like so:

defmodule ProductTagsDemo.Core do
  alias Ecto.Multi

  def create_product(attrs \\ %{}) do
    multi_result =
      Multi.new()
      # use multi to insert all the tags, so the tags are rolled back when there
      # is an error in product creation
      |> ensure_tags(attrs)
      |> Multi.insert(:product, fn %{tags: tags} ->
        # This chunk of code remains the same, the only difference is we let
        # Ecto.Multi handle insertion of the product
        %Product{}
        |> Product.changeset(attrs)
        |> Ecto.Changeset.put_assoc(:tags, tags)
      end)
      # Finally, we run all of this in a single transaction
      |> Repo.transaction()

    # a multi result can be an :ok tagged tuple with the data from all steps
    # or an error tagged tuple with the failure step's atom and relevant data
    # in this case we only expect failures in Product insertion
    case multi_result do
      {:ok, %{product: product}} -> {:ok, product}
      {:error, :product, changeset, _} -> {:error, changeset}
    end
  end

  # This is identical to `create_product`
  def update_product(%Product{} = product, attrs) do
    multi_result =
      Multi.new()
      |> ensure_tags(attrs)
      |> Multi.update(:product, fn %{tags: tags} ->
        product
        |> Product.changeset(attrs)
        |> Ecto.Changeset.put_assoc(:tags, tags)
      end)
      |> Repo.transaction()

    case multi_result do
      {:ok, %{product: product}} -> {:ok, product}
      {:error, :product, changeset, _} -> {:error, changeset}
    end
  end

  # parse_tags is unchanged

  # We have created an ensure_tags function which uses the multi struct passed
  # along and the repo associated with it to allow rolling back tag inserts
  defp ensure_tags(multi, attrs) do
    tags = parse_tags(attrs["tags"])

    multi
    |> Multi.insert_all(:insert_tags, Tag, tags, on_conflict: :nothing)
    |> Multi.run(:tags, fn repo, _changes ->
      tag_names = for t <- tags, do: t.name
      {:ok, repo.all(from t in Tag, where: t.name in ^tag_names)}
    end)
  end
end

defmodule ProductTagsDemoWeb.ProductView do
  use ProductTagsDemoWeb, :view
  import Phoenix.HTML.Form

  def tag_input(form, field, opts \\ []) do
    text_input(form, field, Keyword.put(opts, :value, tag_value(form.source, form, field)))
  end

  # if there is an error, pass the input params along
  defp tag_value(%Ecto.Changeset{valid?: false}, form, field) do
    form.params[to_string(field)]
  end

  defp tag_value(_source, form, field) do
    form
    |> input_value(field)
    |> tags_to_text
  end

  defp tags_to_text(tags) do
    tags
    |> Enum.map(fn t -> t.name end)
    |> Enum.join(", ")
  end
end

Whew, that was long, but hopefully, this gives you a comprehensive understanding of how to handle many_to_many relationships in Ecto and Phoenix.

The source code associated with this blog post can be found at https://github.com/minhajuddin/product_tags_demo

P.S. There is a lot of duplication in our final create_product and update_product functions, try removing the duplication in an elegant way! I’ll share my take on it in the next post!

How to dump a partial/sample table (1000 rows) in Postgres using pg_dump

The other day, I wanted to export a sample of one of my big Postgres tables from the production server to my local computer. This was a huge table and I didn’t want to move around a few GBs just to get a sample onto my local environment. Unfortunately, pg_dump doesn’t support exporting partial tables. I looked around and found a utility called pg_sample which is supposed to help with this; however, I wasn’t comfortable installing it on my production server or letting my production data through this script. Thinking a little more made the solution obvious. The idea was simple:

  1. Create a table called tmp_page_caches, where page_caches is the table that you want to copy, using the following SQL in psql; this gives you a lot of freedom in SELECTing just the rows you want.

    CREATE TABLE tmp_page_caches AS (SELECT * FROM page_caches LIMIT 1000);
  2. Export this table using pg_dump as below. Here we are exporting the data to a SQL file and rewriting the table name back to the original midstream.

    pg_dump app_production --table tmp_page_caches | sed 's/public.tmp_/public./' > page_caches.sql
  3. Copy this file to your local machine using scp, and run it against the local database:

    scp minhajuddin@server.prod:page_caches.sql .
    psql app_development < page_caches.sql
  4. Get rid of the temporary table on the production server

    DROP TABLE tmp_page_caches; -- be careful not to drop the real table!

Voila! We have successfully copied over a sample of our production table to our local environment. Hope you find it useful.

How to copy output of a function to your clipboard in Elixir or Ruby

Having the ability to drive your development using just a keyboard is very productive. However, when you are using a terminal and have to copy the output of a command to use it somewhere else, it breaks your flow: you need to move your hands away from your keyboard and use the mouse to select and copy the text.

When I want to copy passwords from my browser to be used elsewhere, I usually open the developer tools, inspect the password input box, and then run the following code in the console:

copy($0.value)

Chrome sets $0 to refer to the currently selected DOM element, so $0.value gives us the value of the password field, and passing it to the copy function copies this text to the OS clipboard.

I have a similar script set up for my terminal, when I want to copy the output of a command like rake secret I run the following command:

rake secret | xc # copies a new secret to the clipboard.
echo "Hello" | xc # copies the string `Hello` to the clipboard.

xc is aliased to the following in my bashrc:

alias xc='tee /dev/tty | xclip -selection clipboard'

This command prints the output to the terminal (using tee /dev/tty) and copies it to the OS clipboard using the xclip package.

I wanted the same ability in my Ruby and Elixir REPLs. It was pretty straightforward to do in Ruby. Here is the annotated code:

puts 'loading ~/.pryrc ...'

require 'open3'

# copy takes an argument and converts it into a string and copies it to the OS
# clipboard using the `xclip` command line package.
def copy(text)
  # start running the `xclip` command to copy the stdin to the OS primary
  # clipboard. Also pass the stdin and stdout, stderr to the block
  Open3.popen3('xclip', '-selection', 'clipboard') do |stdin, _stdout, _stderr, _wait_thr|
    # convert the input argument to a string and write it to the stdin of the
    # spawned `xclip` process and then close the input stream
    stdin.puts text.to_s
    stdin.close
  end

  # print out an informational message to signal that the argument has been
  # copied to the clipboard.
  puts "copied to clipboard: #{text.to_s[0..10]}..."
end

# e.g. running `copy SecureRandom.uuid` will print the following
# pry(main)> copy SecureRandom.uuid
# copied to clipboard: 14438d5c-62...
# and copies: `14438d5c-62b9-40a1-a324-5d2bd2205990` to the OS clipboard

Below is a similar script for Elixir:

IO.puts("loading ~/.iex.exs")

# open a module called `H` as we can't have functions outside modules
defmodule H do
  # copy takes the input and converts it into a string before copying
  # it to the OS clipboard.
  def copy(text) do
    # convert the input argument to a string
    text = to_s(text)

    # spawn a new xclip process configured to copy its stdin to the OS's
    # primary clipboard
    port = Port.open({:spawn, "xclip -selection clipboard"}, [])
    # send the input text as stdin to the xclip process
    Port.command(port, text)
    # close the port
    Port.close(port)

    # print out an informational message to signal that the text has been copied
    # to the OS's clipboard
    IO.puts("copied to clipboard: #{String.slice(text, 0, 10)}...")
  end

  # to_s converts an elixir term to a string if it implements the `String.Chars`
  # protocol, otherwise it uses `inspect` to convert it into a string.
  defp to_s(text) do
    to_string(text)
  rescue
    _ -> inspect(text)
  end
end
iex(2)> :crypto.strong_rand_bytes(16) |> Base.encode16 |> H.copy
# copied to clipboard: 347B175C6F...
# it has also copied `347B175C6F397B2808DE7168444ED428` to the OS's clipboard

All these utilities (except for the browser’s copy function) depend on the xclip utility, which can be installed on Ubuntu using sudo apt-get install xclip. You can emulate the same behaviour on a Mac using the pbcopy utility; you might have to tweak things a little bit, but it should be pretty straightforward.

You can do the same in your favorite programming language too: just find the right way to spawn an xclip process and send the text you want copied to its stdin. Hope this makes your development a little more pleasant :)
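
For instance, the Elixir helper above only needs its spawn line changed for macOS (a small sketch assuming pbcopy is available; I haven’t tested this):

# same as H.copy/1 above, but spawning pbcopy (macOS) instead of xclip
port = Port.open({:spawn, "pbcopy"}, [])
Port.command(port, "copied via pbcopy")
Port.close(port)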

How to store username or email with case insensitive search using Ecto - Part 2

In a previous blog post I was trying to store username/email in a case insensitive way in Postgres. A few folks commented that the citext Postgres extension actually does this in a very easy and straightforward way. So, I went back to my code and ripped out the unnecessary complexity, and here is what I ended up with:

defmodule SF.Repo.Migrations.EnableCitextExtension do
  use Ecto.Migration

  def change do
    execute "CREATE EXTENSION citext", "DROP EXTENSION citext"
  end
end

defmodule SF.Repo.Migrations.CreateUsers do
  use Ecto.Migration

  def change do
    create table(:users, primary_key: false) do
      add :id, :binary_id, primary_key: true
      add :email, :citext, null: false
      add :magic_token, :uuid
      add :magic_token_created_at, :naive_datetime
      add :confirmation_token, :uuid
      add :confirmed_at, :naive_datetime

      timestamps()
    end

    create index(:users, [:email], unique: true)
    create index(:users, [:magic_token], unique: true)
    create index(:users, [:confirmation_token], unique: true)
  end
end

defmodule SF.User do
  use Ecto.Schema
  import Ecto.Changeset

  @primary_key {:id, :binary_id, autogenerate: true}
  @foreign_key_type :binary_id
  schema "users" do
    field :email, :string
    field :magic_token, Ecto.Base64UUID
    field :confirmation_token, Ecto.Base64UUID
    field :confirmed_at, :naive_datetime

    timestamps()
  end

  @doc false
  def changeset(user, attrs) do
    user
    |> cast(attrs, [:email, :confirmation_token])
    |> validate_required([:email])
    |> unique_constraint(:email)
  end
end

defmodule SF.UserService do
  import Ecto.Query, only: [from: 2]

  def find_by_email(email) do
    Repo.one(from u in User, where: u.email == ^email)
  end
end
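
A quick sanity check in IEx (a sketch; the user and email here are made up):

# with the email column as citext, lookups match regardless of case
iex> user = SF.UserService.find_by_email("DANNY@M.COM")
iex> user.email
"danny@m.com"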

So, the way citext works is similar to our previous approach. If you want to get into all the gory details about how citext is implemented, you can check out the code on GitHub at: https://github.com/postgres/postgres/blob/6dd86c269d5b9a176f6c9f67ea61cc17fef9d860/contrib/citext/citext.c

How to store username or email with case insensitive search using Ecto

I am building a small personal project which stores users in a users table, where every user has a unique email. So, my first model looked something like below:

defmodule SF.Repo.Migrations.CreateUsers do
  use Ecto.Migration

  def change do
    create table(:users, primary_key: false) do
      add :id, :binary_id, primary_key: true
      add :email, :string, null: false
      add :magic_token, :uuid
      add :confirmation_token, :uuid
      add :confirmed_at, :naive_datetime

      timestamps()
    end

    create index(:users, [:email], unique: true)
    create index(:users, [:magic_token], unique: true)
    create index(:users, [:confirmation_token], unique: true)
  end
end

defmodule SF.User do
  use Ecto.Schema
  import Ecto.Changeset

  @primary_key {:id, :binary_id, autogenerate: true}
  @foreign_key_type :binary_id
  schema "users" do
    field :email, :string
    field :magic_token, Ecto.Base64UUID
    field :confirmation_token, Ecto.Base64UUID
    field :confirmed_at, :naive_datetime

    timestamps()
  end

  @doc false
  def changeset(user, attrs) do
    user
    |> cast(attrs, [:email, :confirmation_token])
    |> validate_required([:email])
    |> unique_constraint(:email)
  end
end

Like all good developers I had a unique index on the email field to make searches faster: when I do a Repo.get_by(User, email: "danny@m.com"), Postgres doesn't have to scan the whole table to find my user. However, users sometimes enter their email in mixed case, so someone might enter the above email as `DANNY@m.com`, and since Postgres distinguishes between upper-cased and lower-cased strings, we would end up returning a 404 Not Found error to the user. To work around this I would usually lower-case the email whenever it entered the system; in Rails you would do something like below:

class CreateUsers < ActiveRecord::Migration[5.2]
  def change
    create_table :users, id: :uuid do |t|
      # ...
    end
    add_index :users, %i[email], unique: true
  end
end

class User < ActiveRecord::Base
  # downcase email before saving
  before_save :normalize_email

  def normalize_email
    self.email = email&.downcase
  end

  # always downcase before you find a record
  def self.find_by_email(email)
    find_by(email: email.downcase)
  end
end

One downside of this approach is the need to ensure that all the emails in the database are stored in lower case. If you mess up your data-entry code, you might end up with a table containing the same email in different cases.

A better way to do this in Ecto would be to create an index on a lower cased email like so:

create index(:users, ["(lower(email))"], unique: true)

This way you would never end up with a table with duplicate emails, and when you want to find a user with an email you can do something like below:

defmodule SF.UserService do
  import Ecto.Query
  alias SF.{Repo, User}

  def find_by_email(email) do
    email = String.downcase(email)

    user =
      Repo.one(
        from u in User,
          where: fragment("lower(?)", u.email) == ^email
      )

    if user != nil, do: {:ok, user}, else: {:error, :not_found}
  end
end

This would also make sure that your index is actually used. You can take the SQL logged in your IEx session and run a quick EXPLAIN to confirm that the index is being hit:

# EXPLAIN ANALYZE SELECT u0."id", u0."email", u0."magic_token", u0."confirmation_token", u0."confirmed_at", u0."inserted_at", u0."updated_at" FROM "users" AS u0 WHERE (lower(u0."email") = 'foobar@x.com');
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ QUERY PLAN │
├─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Index Scan using users__lower_email_index on users u0 (cost=0.14..8.16 rows=1 width=588) (actual time=0.013..0.014 rows=0 loops=1) │
│ Index Cond: (lower((email)::text) = 'foobar@x.com'::text) │
│ Planning time: 0.209 ms │
│ Execution time: 0.064 ms │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(4 rows)

Time: 1.086 ms

A common rookie mistake is creating an index on the plain email column and then comparing in SQL using the lower function; since the index is on the raw values, the planner can't use it and falls back to a sequential scan, like so:

simpleform_dev=# EXPLAIN ANALYZE select * from users where lower(email) = 'danny';
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ QUERY PLAN │
├─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Seq Scan on users (cost=10000000000.00..10000000001.01 rows=1 width=580) (actual time=0.034..0.034 rows=0 loops=1) │
│ Filter: (lower((email)::text) = 'danny'::text) │
│ Rows Removed by Filter: 1 │
│ Planning time: 0.158 ms │
│ Execution time: 0.076 ms │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(5 rows)

Time: 1.060 ms
simpleform_dev=#

How to view documentation of callbacks in IEx for Elixir

The other day, I was playing around with GenServers and needed to see the documentation for the handle_call hook. I knew this wasn't a function defined on GenServer, so I didn't expect h to find it by name. I thought to myself that there must be a way to get callback documentation using h, so I typed h h in IEx.

iex(9)> h h

def h()

Prints the documentation for IEx.Helpers.


defmacro h(term)

Prints the documentation for the given module or for the given function/arity
pair.

## Examples

iex> h(Enum)

It also accepts functions in the format fun/arity and module.fun/arity, for
example:

iex> h(receive/1)
iex> h(Enum.all?/2)
iex> h(Enum.all?)

iex(10)>

No luck with that! Nothing in there references callback documentation. Still, I wanted to do the naive thing and just see what h GenServer.handle_call returned. And, to my surprise, it ended up returning something useful:

iex(10)> h GenServer.handle_call
No documentation for function GenServer.handle_call was found, but there is a callback with the same name.
You can view callback documentation with the b/1 helper.

iex(11)>

Aha! These are the little things which make me love Elixir so much :') So, the next time you want to look up documentation for callbacks, just use the b helper in IEx; hope that saves you some time :) It even accepts a module and shows you all the callbacks that the module defines!
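For instance, b GenServer prints the spec for every GenServer callback; trimmed, the output looks roughly like this:

iex> b GenServer
@callback code_change(old_vsn, state :: term(), extra :: term()) :: ...

@callback handle_call(request :: term(), from(), state :: term()) :: ...

@callback handle_cast(request :: term(), state :: term()) :: ...

# ...and so on for the rest of the callbacks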

P.S.: The curse of knowledge is real. If I hadn't tried the naive way, I wouldn't have known it was this easy to get documentation for callbacks, and I would have ended up creating a GenServer, sending it a message, and inspecting the arguments to figure out what they were. So, the next time you run into a problem, it might be worth your while to take a step back and ask yourself: how would an Elixir beginner do this?

iex(13)> b GenServer.handle_call
@callback handle_call(request :: term(), from(), state :: term()) ::
{:reply, reply, new_state}
| {:reply, reply, new_state,
timeout() | :hibernate | {:continue, term()}}
| {:noreply, new_state}
| {:noreply, new_state,
timeout() | :hibernate | {:continue, term()}}
| {:stop, reason, reply, new_state}
| {:stop, reason, new_state}
when reply: term(), new_state: term(), reason: term()

Invoked to handle synchronous call/3 messages. call/3 will block until a reply
is received (unless the call times out or nodes are disconnected).

request is the request message sent by a call/3, from is a 2-tuple containing
the caller's PID and a term that uniquely identifies the call, and state is the
current state of the GenServer.

Returning {:reply, reply, new_state} sends the response reply to the caller and
continues the loop with new state new_state.

Returning {:reply, reply, new_state, timeout} is similar to {:reply, reply,
new_state} except handle_info(:timeout, new_state) will be called after timeout
milliseconds if no messages are received.

Returning {:reply, reply, new_state, :hibernate} is similar to {:reply, reply,
new_state} except the process is hibernated and will continue the loop once a
message is in its message queue. If a message is already in the message queue
this will be immediately. Hibernating a GenServer causes garbage collection and
leaves a continuous heap that minimises the memory used by the process.

Returning {:reply, reply, new_state, {:continue, continue}} is similar to
{:reply, reply, new_state} except c:handle_continue/2 will be invoked
immediately after with the value continue as first argument.

Hibernating should not be used aggressively as too much time could be spent
garbage collecting. Normally it should only be used when a message is not
expected soon and minimising the memory of the process is shown to be
beneficial.

Returning {:noreply, new_state} does not send a response to the caller and
continues the loop with new state new_state. The response must be sent with
reply/2.

There are three main use cases for not replying using the return value:

• To reply before returning from the callback because the response is
known before calling a slow function.
• To reply after returning from the callback because the response is not
yet available.
• To reply from another process, such as a task.

When replying from another process the GenServer should exit if the other
process exits without replying as the caller will be blocking awaiting a reply.

Returning {:noreply, new_state, timeout | :hibernate | {:continue, continue}}
is similar to {:noreply, new_state} except a timeout, hibernation or continue
occurs as with a :reply tuple.

Returning {:stop, reason, reply, new_state} stops the loop and c:terminate/2 is
called with reason reason and state new_state. Then the reply is sent as the
response to call and the process exits with reason reason.

Returning {:stop, reason, new_state} is similar to {:stop, reason, reply,
new_state} except a reply is not sent.

This callback is optional. If one is not implemented, the server will fail if a
call is performed against it.

iex(14)>

Pearls of Elixir - Interesting patterns from popular Elixir packages

I had a wonderful time giving a talk at the Elixir January Tech Meetup here in Toronto. Big thanks to Mattia for organizing and PagerDuty for hosting the meetup!

I wanted to capture the talk in a blog post and here it is.

1. Canada

Many of us have used cancan for authorization in our Rails applications. When I was searching for a similar package in Elixir, I found the awesome canada package.

It’s DSL is pretty straightforward

# In this example we have a User and a Post entity.
defmodule User do
  defstruct id: nil, name: nil, admin: false
end

defmodule Post do
  defstruct user_id: nil, content: nil
end

# Followed by a protocol definition which allows you to define the rules on what
# is allowed and what is forbidden.

defimpl Canada.Can, for: User do
  def can?(%User{id: user_id}, action, %Post{user_id: user_id})
      when action in [:update, :read, :destroy, :touch],
      do: true

  def can?(%User{admin: admin}, action, _)
      when action in [:update, :read, :destroy, :touch],
      do: admin

  def can?(%User{}, :create, Post), do: true
end

# Finally, when we want to use this we just use the following syntax which reads
# very nicely.

import Canada, only: [can?: 2]

if some_user |> can? read(some_post) do
  # render the post
else
  # sorry (raise a 403)
end

When using packages, I try to take a peek at the source code and understand how things work. And, I was shocked when I saw just 10 lines of code in the lib folder! See for yourself:

# lib/canada.ex
defmodule Canada do
  defmacro can?(subject, {action, _, [argument]}) do
    quote do
      Canada.Can.can? unquote(subject), unquote(action), unquote(argument)
    end
  end
end

# lib/canada/can.ex
defprotocol Canada.Can do
  @doc "Evaluates permissions"
  def can?(subject, action, resource)
end

The protocol is what allows you to define your custom rules for authorization, and the Canada module defines a neat little macro which lets you test whether a user is authorized to perform an action using syntax like can? user, read(post). How cool is that!
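The trick is that read(some_post) is never executed as a function call; the macro receives it as a quoted expression, and the pattern {action, _, [argument]} picks it apart. Roughly, the expansion works like this:

# `read(some_post)` reaches the macro as the AST tuple {:read, _meta, [some_post]},
# so {action, _, [argument]} binds action = :read and argument = some_post.
# This call:
some_user |> can? read(some_post)
# expands at compile time into:
Canada.Can.can?(some_user, :read, some_post)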

2. Readable binary match specs

Postgrex is another one of those packages which is filled with neat Elixir code. When I was skimming through the code, I ran into a piece of code which surprised me:

defmodule Postgrex.BinaryUtils do
  @moduledoc false

  defmacro int64 do
    quote do: signed-64
  end

  # ...

  defmacro uint16 do
    quote do: unsigned-16
  end
end

I was having a difficult time understanding how signed-64 could be valid Elixir code. I quickly spun up an iex console, typed in signed-64, and unsurprisingly it threw an error. Upon further searching, I found that these macros were actually used in binary pattern matches all over the code:

defmodule Postgrex.Messages do
  import Postgrex.BinaryUtils

  # ....

  def parse(<<type :: int32, rest :: binary>>, ?R, size) do
    # ....
  end

  def parse(<<pid :: int32, key :: int32>>, ?K, _size) do
    # ....
  end
end

So, the macro int32 actually gets spliced inside a binary pattern match. I would never have thought of doing this! And it makes the code so much more readable and easy to follow.
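Here is a tiny example of the same trick, assuming int32 is among the elided macros in BinaryUtils (it quotes to signed-32, and Messages clearly relies on it):

import Postgrex.BinaryUtils

# int32 is spliced into the bit syntax, so this pattern reads a
# big-endian signed 32-bit integer followed by the remaining bytes:
<<n :: int32, rest :: binary>> = <<0, 0, 1, 0, "tail">>
n    # => 256
rest # => "tail"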

3. Compiling lookup tables in Modules

While browsing through postgrex, I found a text file called errcodes.txt which I thought was a bit strange. Here is a snippet of that file:

#
# errcodes.txt
# PostgreSQL error codes
#
# Copyright (c) 2003-2015, PostgreSQL Global Development Group

# ...

Section: Class 00 - Successful Completion

00000 S ERRCODE_SUCCESSFUL_COMPLETION successful_completion

Section: Class 01 - Warning

# do not use this class for failure conditions
01000 W ERRCODE_WARNING warning
0100C W ERRCODE_WARNING_DYNAMIC_RESULT_SETS_RETURNED dynamic_result_sets_returned
01008 W ERRCODE_WARNING_IMPLICIT_ZERO_BIT_PADDING implicit_zero_bit_padding
01003 W ERRCODE_WARNING_NULL_VALUE_ELIMINATED_IN_SET_FUNCTION null_value_eliminated_in_set_function
01007 W ERRCODE_WARNING_PRIVILEGE_NOT_GRANTED privilege_not_granted
01006 W ERRCODE_WARNING_PRIVILEGE_NOT_REVOKED privilege_not_revoked
01004 W ERRCODE_WARNING_STRING_DATA_RIGHT_TRUNCATION string_data_right_truncation
01P01 W ERRCODE_WARNING_DEPRECATED_FEATURE deprecated_feature

# ...

This file maps error codes to their symbols. The reason it lives in the lib folder is that it serves as the source for the error-code mapping. Upon further reading I found that it was being used in a module called Postgrex.ErrorCode. Here are the interesting pieces of that module:

defmodule Postgrex.ErrorCode do
  @external_resource errcodes_path = Path.join(__DIR__, "errcodes.txt")

  errcodes = for line <- File.stream!(errcodes_path),
    # ...

  # errcode duplication removal

  # defining a `code_to_name` function for every single error code which maps
  # the code to a name.
  for {code, errcodes} <- Enum.group_by(errcodes, &elem(&1, 0)) do
    [{^code, name}] = errcodes
    def code_to_name(unquote(code)), do: unquote(name)
  end

  def code_to_name(_), do: nil
end

This module reads the errcodes text file at compile time and defines around 400 function heads which embed the actual code-to-name mapping. Whenever you want to do a lookup, you just call Postgrex.ErrorCode.code_to_name(error_code).
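The pattern generalizes nicely. Here is a minimal sketch of the same technique with a hypothetical codes.txt whose lines look like "00000 successful_completion"; @external_resource tells the compiler to recompile the module whenever the file changes:

defmodule Lookup do
  @external_resource path = Path.join(__DIR__, "codes.txt")

  # At compile time, read the file and define one function clause per line.
  for line <- File.stream!(path),
      [code, name] = line |> String.trim() |> String.split(" ") do
    def code_to_name(unquote(code)), do: unquote(name)
  end

  # Fallback clause for unknown codes.
  def code_to_name(_), do: nil
end

Lookup.code_to_name("00000") # => "successful_completion"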

4. Validating UUIDs

Did you know that you don't need the uuid package to generate UUIDs? UUID generation is available in Ecto as part of the Ecto.UUID module, and it even has a function which allows you to validate a UUID. Most of us would quickly reach for a regex pattern to validate a UUID. However, the Ecto library uses an interesting approach:

defmodule Ecto.UUID do
  @doc """
  Casts to UUID.
  """
  @spec cast(t | raw | any) :: {:ok, t} | :error
  def cast(<< a1, a2, a3, a4, a5, a6, a7, a8, ?-,
              b1, b2, b3, b4, ?-,
              c1, c2, c3, c4, ?-,
              d1, d2, d3, d4, ?-,
              e1, e2, e3, e4, e5, e6, e7, e8, e9, e10, e11, e12 >>) do
    << c(a1), c(a2), c(a3), c(a4),
       c(a5), c(a6), c(a7), c(a8), ?-,
       c(b1), c(b2), c(b3), c(b4), ?-,
       c(c1), c(c2), c(c3), c(c4), ?-,
       c(d1), c(d2), c(d3), c(d4), ?-,
       c(e1), c(e2), c(e3), c(e4),
       c(e5), c(e6), c(e7), c(e8),
       c(e9), c(e10), c(e11), c(e12) >>
  catch
    :error -> :error
  else
    casted -> {:ok, casted}
  end

  def cast(<< _::128 >> = binary), do: encode(binary)
  def cast(_), do: :error

  @compile {:inline, c: 1}

  defp c(?0), do: ?0
  defp c(?1), do: ?1
  defp c(?2), do: ?2
  defp c(?3), do: ?3
  defp c(?4), do: ?4
  defp c(?5), do: ?5
  defp c(?6), do: ?6
  defp c(?7), do: ?7
  defp c(?8), do: ?8
  defp c(?9), do: ?9
  defp c(?A), do: ?a
  defp c(?B), do: ?b
  defp c(?C), do: ?c
  defp c(?D), do: ?d
  defp c(?E), do: ?e
  defp c(?F), do: ?f
  defp c(?a), do: ?a
  defp c(?b), do: ?b
  defp c(?c), do: ?c
  defp c(?d), do: ?d
  defp c(?e), do: ?e
  defp c(?f), do: ?f
  defp c(_), do: throw(:error)
end

This code is pretty self-explanatory: it is a literal translation of how you would validate a UUID with pen and paper, checking one character at a time.
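Using it is as simple as:

iex> Ecto.UUID.cast("601d74e4-a8d3-4b6e-8365-eddb4c893327")
{:ok, "601d74e4-a8d3-4b6e-8365-eddb4c893327"}

iex> Ecto.UUID.cast("not-a-uuid")
:error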

5. Honorable Mentions

Static struct assertions/checks in functions

With Elixir you can assert that the argument your function receives is of a specific type by using a pattern like below:

defmodule User do
  def authorized?(%User{} = user) do
    # ....
  end
end

This code would blow up if the argument passed was not a User struct, which makes it a nice way of asserting the type. However, you can overdo this by using it everywhere; a good rule of thumb is to apply the pattern in your public API, at the periphery where data comes in.

Tagged with blocks

You can wrap your with matches in tagged tuples like below if you want to handle errors differently for different failures.

with {:parse, {:ok, user_attrs}} <- {:parse, Jason.parse(body)},
     {:persist, {:ok, user}} <- {:persist, Users.create(user_attrs)},
     {:welcome_email, :ok} <- {:welcome_email, Emailer.welcome(user)} do
  :ok
else
  {:parse, _err} ->
    # raise an error
    {:error, :parse_error}

  {:persist, {:error, changeset}} ->
    # return validation errors
    {:error, changeset}

  {:welcome_email, _err} ->
    # it is ok if email sending failed, we just log this
    Logger.error("SENDING_WELCOME_EMAIL_FAILED")
    :ok
end

Delegating function calls on your root API module

defdelegate allows you to delegate function calls to a different module using the same arguments.

defmodule API do
  defdelegate create_customer(customer_json), to: API.CustomerCreator
end

defmodule API.CustomerCreator do
  def create_customer(customer_json) do
    # ...
  end
end
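Callers only ever touch the root module, so the internals are free to move around:

# Same argument, forwarded as-is to API.CustomerCreator.create_customer/1
API.create_customer(%{"name" => "Jane Doe"})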

Enforcing Keys

While defining a struct you can also define which keys are mandatory.

defmodule User do
  @enforce_keys [:email, :name]
  defstruct [:email, :name]
end
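Now leaving out an enforced key raises as soon as you try to build the struct, with an error along these lines:

iex> %User{email: "jane@example.com"}
** (ArgumentError) the following keys must also be given when building struct User: [:name]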

Interpolation in docs

defmodule HTTPClient do
  @timeout 60_000
  @doc """
  Times out after #{@timeout} milliseconds
  """
  def get(_url) do
  end
end

Suppressing logs in your tests

ExUnit.start()

defmodule HTTPTest do
  use ExUnit.Case
  require Logger

  @moduletag :capture_log

  test "suppress logs" do
    Logger.info "AAAAAAAAAAAAAAAAAAAHHHHHHHHH"
  end
end

Solution to Advent of Code 2018 Day 5 in Elixir

defmodule Day5 do
  def scan(polymer) do
    chars =
      polymer
      |> String.to_charlist()

    res =
      chars
      |> Enum.reduce({:none, []}, fn
        c, {:none, acc} ->
          {:prev, c, acc}

        c, {:prev, prev, acc} ->
          if react?(c, prev) do
            {:none, acc}
          else
            {:prev, c, [prev | acc]}
          end
      end)

    reduced_polymer =
      case res do
        {_, acc} -> acc
        {:prev, c, acc} -> [c | acc]
      end
      |> Enum.reverse()
      |> to_string

    if reduced_polymer == polymer do
      polymer
    else
      scan(reduced_polymer)
    end
  end

  def react?(c1, c2), do: abs(c1 - c2) == 32

  @all_units Enum.zip(?a..?z, ?A..?Z) |> Enum.map(fn {c1, c2} -> ~r[#{[c1]}|#{[c2]}] end)
  def smallest(polymer) do
    @all_units
    |> Enum.map(fn unit_to_be_removed ->
      polymer
      |> String.replace(unit_to_be_removed, "")
      |> scan
      |> String.length()
    end)
    |> Enum.min()
  end
end

defmodule Day5Test do
  use ExUnit.Case

  import Day5

  test "reduces 2 reacting units" do
    assert scan("aA") == ""
  end

  test "reduces 2 non reacting units" do
    assert scan("aB") == "aB"
    assert scan("Ba") == "Ba"
  end

  test "reduces 3 non reacting units" do
    assert scan("aBc") == "aBc"
    assert scan("aBA") == "aBA"
    assert scan("BaD") == "BaD"
  end

  test "reduces 3 reacting units" do
    assert scan("aAB") == "B"
    assert scan("abB") == "a"
    assert scan("aBb") == "a"
    assert scan("BaA") == "B"
  end

  test "reduces recursively" do
    assert scan("baAB") == ""
  end

  test "large polymer" do
    assert scan("dabAcCaCBAcCcaDA") == "dabCBAcaDA"
    assert scan("abcdDCBA") == ""
  end

  test "input" do
    assert File.read!("./input.txt") |> String.trim() |> scan |> String.length() == 0
  end

  test "smallest" do
    assert smallest("dabAcCaCBAcCcaDA") == 4
  end

  test "smallest for input" do
    assert File.read!("./input.txt") |> String.trim() |> smallest == 0
  end
end

Solution to Advent of Code 2018 Day 4 in Elixir

defmodule Day4 do
  defmodule State do
    defstruct [:guard_id, :start, :sleep]
  end

  defp desc_minutes({_k, ranges}) do
    ranges
    |> Enum.reduce(0, fn x, sum ->
      sum + Enum.count(x)
    end)
    |> Kernel.*(-1)
  end

  def find_sleep_constant(spec) do
    {guard, sleep_durations} =
      spec
      |> parse
      |> Enum.sort_by(&desc_minutes/1)
      |> hd

    guard * most_sleepy_minute(sleep_durations)
  end

  def sleepiest_guard_minute(spec) do
    {guard_id, {min, _count}} =
      spec
      |> parse # => %{ guard_id => [min_start1..min_end1] }
      |> Enum.map(fn {guard_id, durations} ->
        {min, occurences} =
          durations
          |> Enum.flat_map(&Enum.to_list/1)
          |> Enum.group_by(& &1)
          |> Enum.max_by(fn {_min, occurences} -> Enum.count(occurences) end)

        {guard_id, {min, length(occurences)}}
      end)
      |> Enum.max_by(fn {_guard_id, {_min, count}} -> count end)

    {guard_id, min}
  end

  def most_sleepy_minute(sleep_durations) do
    {minute, _} =
      sleep_durations
      |> Enum.flat_map(&Enum.to_list/1)
      |> Enum.group_by(& &1)
      |> Enum.sort_by(fn {_k, v} -> -1 * Enum.count(v) end)
      |> hd

    minute
  end

  def parse(spec) do
    {_state, logs} =
      spec
      |> String.split("\n", trim: true)
      |> Enum.sort()
      |> Enum.map(&parse_line/1)
      |> Enum.reduce({_state = %State{}, _out = %{}}, fn x, {state, out} ->
        case x do
          {:start, guard_id, _minutes} ->
            {%{state | guard_id: guard_id}, out}

          {:sleep, minutes} ->
            {%{state | start: minutes}, out}

          {:wake, minutes} ->
            prev_sleep = out[state.guard_id] || []
            {state, Map.put(out, state.guard_id, [state.start..(minutes - 1) | prev_sleep])}
        end
      end)

    logs
  end

  def parse_line(line) do
    <<"[", _year::32, "-", _month::16, "-", _day::16, " ", _hour::16, ":",
      minutes_bin::binary-size(2), "] ", note::binary>> = line

    parse_note(note, String.to_integer(minutes_bin))
  end

  def parse_note("wakes up", minutes) do
    {:wake, minutes}
  end

  def parse_note("falls asleep", minutes) do
    {:sleep, minutes}
  end

  def parse_note(begin_note, minutes) do
    guard_id =
      Regex.named_captures(~r[Guard #(?<guard_id>\d+) begins shift], begin_note)
      |> Map.get("guard_id")
      |> String.to_integer()

    {:start, guard_id, minutes}
  end
end

defmodule Day4Test do
  use ExUnit.Case

  import Day4

  test "parses the times when each guard sleeps" do
    assert parse("""
           [1518-11-01 00:00] Guard #10 begins shift
           [1518-11-01 00:05] falls asleep
           [1518-11-01 00:25] wakes up
           [1518-11-01 00:30] falls asleep
           [1518-11-01 00:55] wakes up
           [1518-11-01 23:58] Guard #99 begins shift
           [1518-11-02 00:40] falls asleep
           [1518-11-02 00:50] wakes up
           [1518-11-03 00:05] Guard #10 begins shift
           [1518-11-03 00:24] falls asleep
           [1518-11-03 00:29] wakes up
           [1518-11-04 00:02] Guard #99 begins shift
           [1518-11-04 00:36] falls asleep
           [1518-11-04 00:46] wakes up
           [1518-11-05 00:03] Guard #99 begins shift
           [1518-11-05 00:45] falls asleep
           [1518-11-05 00:55] wakes up
           [1518-11-08 00:03] Guard #99334 begins shift
           [1518-11-08 00:45] falls asleep
           [1518-11-08 00:55] wakes up
           """) == %{
             10 => [5..24, 30..54, 24..28] |> Enum.reverse(),
             99 => [40..49, 36..45, 45..54] |> Enum.reverse(),
             99334 => [45..54]
           }
  end

  test "find_sleep_constant" do
    assert find_sleep_constant("""
           [1518-11-01 00:00] Guard #10 begins shift
           [1518-11-01 00:05] falls asleep
           [1518-11-01 00:25] wakes up
           [1518-11-01 00:30] falls asleep
           [1518-11-01 00:55] wakes up
           [1518-11-01 23:58] Guard #99 begins shift
           [1518-11-02 00:40] falls asleep
           [1518-11-02 00:50] wakes up
           [1518-11-03 00:05] Guard #10 begins shift
           [1518-11-03 00:24] falls asleep
           [1518-11-03 00:29] wakes up
           [1518-11-04 00:02] Guard #99 begins shift
           [1518-11-04 00:36] falls asleep
           [1518-11-04 00:46] wakes up
           [1518-11-05 00:03] Guard #99 begins shift
           [1518-11-05 00:45] falls asleep
           [1518-11-05 00:55] wakes up
           """) == 240
  end

  test "parses line" do
    assert parse_line("[1518-11-01 00:08] wakes up") == {:wake, 8}
    assert parse_line("[1518-11-01 00:30] falls asleep") == {:sleep, 30}
    assert parse_line("[1518-11-01 00:23] Guard #10 begins shift") == {:start, 10, 23}
    assert parse_line("[1518-11-01 00:23] Guard #99 begins shift") == {:start, 99, 23}
  end

  test "file" do
    assert 240 ==
             find_sleep_constant("""
             [1518-11-01 00:00] Guard #10 begins shift
             [1518-11-01 00:05] falls asleep
             [1518-11-01 00:25] wakes up
             [1518-11-01 00:30] falls asleep
             [1518-11-01 00:55] wakes up
             [1518-11-01 23:58] Guard #99 begins shift
             [1518-11-02 00:40] falls asleep
             [1518-11-02 00:50] wakes up
             [1518-11-03 00:05] Guard #10 begins shift
             [1518-11-03 00:24] falls asleep
             [1518-11-03 00:29] wakes up
             [1518-11-04 00:02] Guard #99 begins shift
             [1518-11-04 00:36] falls asleep
             [1518-11-04 00:46] wakes up
             [1518-11-05 00:03] Guard #99 begins shift
             [1518-11-05 00:45] falls asleep
             [1518-11-05 00:55] wakes up
             """)

    assert File.read!("./input.txt")
           |> find_sleep_constant == 30630
  end

  test "sleepiest_guard_minute" do
    assert {99, 45} ==
             sleepiest_guard_minute("""
             [1518-11-01 00:00] Guard #10 begins shift
             [1518-11-01 00:05] falls asleep
             [1518-11-01 00:25] wakes up
             [1518-11-01 00:30] falls asleep
             [1518-11-01 00:55] wakes up
             [1518-11-01 23:58] Guard #99 begins shift
             [1518-11-02 00:40] falls asleep
             [1518-11-02 00:50] wakes up
             [1518-11-03 00:05] Guard #10 begins shift
             [1518-11-03 00:24] falls asleep
             [1518-11-03 00:29] wakes up
             [1518-11-04 00:02] Guard #99 begins shift
             [1518-11-04 00:36] falls asleep
             [1518-11-04 00:46] wakes up
             [1518-11-05 00:03] Guard #99 begins shift
             [1518-11-05 00:45] falls asleep
             [1518-11-05 00:55] wakes up
             """)

    assert {guard, min} =
             File.read!("./input.txt")
             |> sleepiest_guard_minute

    assert guard * min == 99
  end
end

Easy way to add the frozen_string_literal magic comment to your Ruby files

comm -23 \
<(git ls-files|sort) \
<(git grep -l 'frozen_string_literal'|sort) \
| grep -E '\.rb$' \
| xargs -n1 sed -i '1s/^/# frozen_string_literal: true\n\n/'

The code is pretty self-explanatory: comm -23 prints the lines unique to its first input, so we take the list of all files in the repo, drop the ones which already contain the magic comment, filter what's left down to just the Ruby files, and finally prepend the magic comment to each of them with sed.

Solution to Advent of Code 2018 Day 3 in Elixir

Solving Day 3 turned out to be a bit more challenging for me, as I don't usually do these kinds of exercises. Nevertheless, it was fun!


defmodule Rect do
  defstruct [:id, :left, :top, :width, :height]

  alias __MODULE__

  def parse(spec) do
    [id, dimensions] = String.split(spec, "@", trim: true)
    [coords, size] = String.split(dimensions, ":", trim: true)

    [left, top] = String.split(coords, ",", trim: true) |> Enum.map(&parse_number/1)

    [width, height] = String.split(size, "x", trim: true) |> Enum.map(&parse_number/1)

    %Rect{
      id: String.trim(id),
      left: left,
      top: top,
      width: width,
      height: height
    }
  end

  defp parse_number(str), do: str |> String.trim() |> String.to_integer()

  def order_horizontal(r1, r2) do
    if r1.left < r2.left do
      {r1, r2}
    else
      {r2, r1}
    end
  end

  def order_vertical(r1, r2) do
    if r1.top < r2.top do
      {r1, r2}
    else
      {r2, r1}
    end
  end

  def squares(%Rect{width: w, height: h}) when w <= 0 or h <= 0, do: []

  def squares(%Rect{} = r) do
    for x <- r.left..(r.left + r.width - 1), y <- r.top..(r.top + r.height - 1), do: {x, y}
  end
end

defmodule Overlap do
  def area(spec) when is_binary(spec) do
    spec
    |> String.split("\n", trim: true)
    |> Enum.map(&Rect.parse/1)
    |> area([])
  end

  # Note: area/2 takes an explicit accumulator; a default argument here
  # would clash with the area/1 clause above.
  def area([h | tl], prev_squares) do
    squares =
      tl
      |> Enum.map(fn x -> overlap(h, x) |> Rect.squares() end)

    area(tl, [squares | prev_squares])
  end

  def area([], squares) do
    squares
    |> List.flatten()
    |> Enum.uniq()
    |> Enum.count()
  end

  def find_non_overlap(spec) when is_binary(spec) do
    rects =
      spec
      |> String.split("\n", trim: true)
      |> Enum.map(&Rect.parse/1)

    find_non_overlap(rects, rects)
  end

  def find_non_overlap([h | tl], all_rects) do
    if all_rects
       |> Enum.filter(fn x -> x.id != h.id end)
       |> Enum.all?(fn x ->
         o = overlap(h, x)
         o.width <= 0 || o.height <= 0
       end) do
      h
    else
      find_non_overlap(tl, all_rects)
    end
  end

  def find_non_overlap([], _), do: raise("Not found")

  def overlap(%Rect{} = r1, %Rect{} = r2) do
    {l, r} = Rect.order_horizontal(r1, r2)

    width = min(l.left + l.width - r.left, r.width)

    {t, b} = Rect.order_vertical(r1, r2)

    height = min(t.top + t.height - b.top, b.height)

    %Rect{
      left: r.left,
      top: b.top,
      width: width,
      height: height
    }
  end
end

defmodule OverlapTest do
  use ExUnit.Case

  import Overlap

  test "computes the overlapping area" do
    assert area("""
           # 1 @ 1,3: 4x4
           # 2 @ 3,1: 4x4
           # 3 @ 5,5: 2x2
           """) == 4

    assert area("""
           # 1 @ 1,3: 4x4
           # 2 @ 3,1: 4x4
           # 3 @ 1,3: 4x4
           """) == 16

    assert File.read!("input.txt") |> area == 0
  end

  test "overlap between 2 rects" do
    assert overlap(
             %Rect{id: "# 1", left: 1, top: 3, width: 4, height: 8},
             %Rect{id: "# 2", left: 3, top: 1, width: 4, height: 4}
           ) == %Rect{id: nil, left: 3, top: 3, width: 2, height: 2}

    assert overlap(
             %Rect{id: "# 1", left: 1, top: 3, width: 4, height: 4},
             %Rect{id: "# 3", left: 5, top: 5, width: 2, height: 2}
           ) == %Rect{height: 2, id: nil, left: 5, top: 5, width: 0}
  end

  test "find_non_overlap" do
    assert find_non_overlap("""
           # 1 @ 1,3: 4x4
           # 2 @ 3,1: 4x4
           # 3 @ 5,5: 2x2
           """).id == "# 3"

    assert File.read!("input.txt") |> find_non_overlap == 0
  end
end

defmodule RectTest do
  use ExUnit.Case

  test "parse" do
    assert Rect.parse("# 1 @ 1,3: 4x3") == %Rect{id: "# 1", left: 1, top: 3, width: 4, height: 3}
  end

  test "order_horizontal" do
    {%{id: "# 1"}, %{id: "# 2"}} =
      Rect.order_horizontal(
        %Rect{id: "# 1", left: 1, top: 3, width: 4, height: 3},
        %Rect{id: "# 2", left: 3, top: 3, width: 4, height: 3}
      )

    {%{id: "# 4"}, %{id: "# 3"}} =
      Rect.order_horizontal(
        %Rect{id: "# 3", left: 10, top: 3, width: 4, height: 3},
        %Rect{id: "# 4", left: 3, top: 3, width: 4, height: 3}
      )
  end

  test "order_vertical" do
    {%{id: "# 1"}, %{id: "# 2"}} =
      Rect.order_vertical(
        %Rect{id: "# 1", left: 1, top: 1, width: 4, height: 1},
        %Rect{id: "# 2", left: 3, top: 3, width: 4, height: 3}
      )

    {%{id: "# 4"}, %{id: "# 3"}} =
      Rect.order_vertical(
        %Rect{id: "# 3", left: 10, top: 10, width: 4, height: 10},
        %Rect{id: "# 4", left: 3, top: 3, width: 4, height: 3}
      )
  end

  test "squares" do
    assert Rect.squares(%Rect{id: "# 1", left: 1, top: 3, width: 2, height: 2}) == [
             {1, 3},
             {1, 4},
             {2, 3},
             {2, 4}
           ]

    assert Rect.squares(%Rect{id: "# 1", left: 1, top: 3, width: 0, height: 0}) == []
    assert Rect.squares(%Rect{id: "# 1", left: 1, top: 3, width: 0, height: 4}) == []
    assert Rect.squares(%Rect{id: "# 1", left: 1, top: 3, width: 4, height: 0}) == []
    assert Rect.squares(%Rect{id: "# 1", left: 1, top: 3, width: 4, height: -4}) == []
    assert Rect.squares(%Rect{id: "# 1", left: 1, top: 3, width: -4, height: 4}) == []
  end
end
end