settings, or plans, or products (some apps have fewer than 1,000 products). You can do a quick size check of your tables by running the following query:

```sql
-- https://stackoverflow.com/a/21738732/24105
```
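The query itself is truncated above; the linked Stack Overflow answer has a more thorough version, but a minimal sketch of such a size check might look like this:

```sql
SELECT relname AS table_name,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC;
```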
At this point you need to decide how much memory in your app you can allocate for these lookup tables. In my experience, for a moderately sized app you could go as high as 100MB of tables. Make sure you add some metrics and benchmark this before and after doing any optimizations.
Say you have 4 tables which are small enough to fit in memory and which you read a lot from. The first thought that comes to mind is to use caching, and when someone says caching you reach for Redis or memcache or some other network service. I would ask you to stop and think at this point: how would you cache in a way that is faster than Redis or memcache?
Once you ask that question, the answer becomes obvious: you cache things in your app's memory. If you have any data in your app's memory, you can just reach for it. Read this excellent gist to get a sense of the latency of different kinds of storage strategies.
When using your app's memory you don't have to pay the network cost plus the serialization/deserialization tax. Every time you cache something in Redis or memcached, your app has to make a network call to these services and push out a serialized version of the data while saving it, and do the opposite while reading it. This cost adds up if you do it on every page load.
I work with an app which keeps a website maintenance flag in memcache, and this ends up adding 30ms to every request that hits our servers. There is a better way! Move your settings to your app's memory. This can easily be done by defining something like the below (in Ruby):
```ruby
# config/initializers/settings.rb
```
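Only the first line of the initializer survives above; a minimal sketch, assuming a `Setting` ActiveRecord model, might look like this:

```ruby
# config/initializers/settings.rb
# Load the settings table into memory once at boot
class Settings
  ALL = Setting.all.index_by(&:key).freeze

  def self.[](key)
    ALL[key]&.value
  end
end
```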
However, as they say, one of the two hard problems in computer science is cache invalidation. What do you do when your data changes? This is the hard part.
The easiest strategy for this is to restart the server. This might be a perfectly valid strategy. We do restart our apps when config values change, so restarting for lookup tables with low-frequency changes is a fair strategy.
If that doesn't work for your app because your lookup data changes frequently, let us say every 5 minutes, another strategy is to poll for this data. The idea is simple:
Fortunately there is an easy way to see if there is any change on a table in Postgres: aggregate the whole table into a single text column and then compute the md5sum of it. This should change any time there is a change to the data.
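The original query is truncated below; a hedged sketch of the idea, assuming a `plans` table with an `id` column, might look like this:

```sql
SELECT md5(string_agg(plans::text, '' ORDER BY id)) AS content_hash
FROM plans;
```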
Output of this query:

```
┌──────────────────────────────────┐
│           content_hash           │
├──────────────────────────────────┤
│ 337f91e1e09b09e96b3413d27102c761 │
└──────────────────────────────────┘
(1 row)
```
Now, all you do is keep a tab on this content hash every 5 minutes or so and reload the tables when it changes.
Postgres has support for subscriptions, so you could add a mechanism where each table has a subscription that you push to whenever you modify data using triggers: https://www.postgresql.org/docs/10/sql-createsubscription.html
If all your changes go through the app through some kind of admin webpage, you could also add pub/sub to broadcast an update whenever data is modified, to which all your app servers listen and refresh the data.
Since Elixir and Erlang are all about concurrency, they lend themselves nicely to this idiom. Let us see how this can be done in Elixir.
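Here is a minimal sketch of the pub/sub approach, assuming Phoenix.PubSub; the module names are made up:

```elixir
# A GenServer that holds the lookup tables in memory and reloads them
# whenever an invalidation message is broadcast over Phoenix.PubSub.
defmodule Lookup.Cache do
  use GenServer

  def start_link(_opts), do: GenServer.start_link(__MODULE__, nil, name: __MODULE__)

  def init(_) do
    Phoenix.PubSub.subscribe(MyApp.PubSub, "lookup:invalidate")
    {:ok, load_tables()}
  end

  # Any app server can broadcast this after modifying lookup data
  def invalidate do
    Phoenix.PubSub.broadcast(MyApp.PubSub, "lookup:invalidate", :reload)
  end

  def handle_info(:reload, _state), do: {:noreply, load_tables()}

  defp load_tables do
    # load your lookup tables from the database into memory here
    %{}
  end
end
```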
You could also build a button on your admin console which just pings a specific endpoint, e.g. /admin/:table/cache-invalidate, and allows for manual cache invalidation. The handler for this would just reload the global data.
I feel like the polling strategy is the most robust with the least number of moving pieces. Please try this out in your app and let me know how this impacts your performance.
In a future blog post, I'll explain the Elixir implementation.
```elixir
# application.ex
```
Once the clustering was set up, I wanted to try sending messages through the cluster and see how it performed. The simplest test I could think of was a baton relay: essentially, I spin up one GenServer per node, and it relays a counter to the next node, which sends it to the next node, and so on, like the picture below (psa, psb, psc, and psd are the names of the nodes):
The code for this ended up being very straightforward. We create a GenServer and make one of the nodes a `main_node` so that it can kick off the baton relay. And whenever we get a counter with a `:pass` message, we increment the counter and forward it to the next node. Here is the full code:
```elixir
defmodule Herd.Baton.ErlangProcess do
```
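The full module is truncated above; a minimal sketch of the same idea, with the node-ordering logic as an assumption, could look like this:

```elixir
defmodule Herd.Baton.ErlangProcess do
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  def init(opts) do
    # The main node kicks off the relay once it boots
    if opts[:main_node], do: send(self(), :kickoff)
    {:ok, nil}
  end

  def handle_info(:kickoff, state) do
    pass(0)
    {:noreply, state}
  end

  # Receive the baton, increment the counter, and pass it on
  def handle_cast({:pass, counter}, state) do
    pass(counter + 1)
    {:noreply, state}
  end

  defp pass(counter) do
    GenServer.cast({__MODULE__, next_node()}, {:pass, counter})
  end

  # Treat the sorted node list as a ring and pick our successor
  defp next_node do
    nodes = Enum.sort([Node.self() | Node.list()])
    index = Enum.find_index(nodes, &(&1 == Node.self()))
    Enum.at(nodes, rem(index + 1, length(nodes)))
  end
end
```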
Finally, here is the Datadog graph for the counter. The big thing to note is that the 4 GenServers on a local LAN were able to pass around 100M messages in 8 hours, which amounts to about 3.5K messages per second, which is impressive:
I also wanted to track various metrics while running the cluster, so I set up Datadog APM on all of them, and since the Pis usually run hot, I wanted to track their temperatures for warning signs. Here is how you can send your temperature info to Datadog.
Create 2 files, a `temp.yaml` and a `temp.py` (the names should match).
```yaml
# /etc/datadog-agent/conf.d/temp.yaml
```
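The rest of the file is truncated; for a custom Agent check, a minimal configuration sketch would be:

```yaml
# /etc/datadog-agent/conf.d/temp.yaml
init_config:

instances:
  - min_collection_interval: 30
```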
```python
# /etc/datadog-agent/checks.d/temp.py
```
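The check is truncated too; a hedged sketch, assuming the Agent's `AgentCheck` base class from `datadog_checks.base`, might be:

```python
# /etc/datadog-agent/checks.d/temp.py
from pathlib import Path

from datadog_checks.base import AgentCheck


class TempCheck(AgentCheck):
    def check(self, instance):
        # Read the CPU temperature (millidegrees C) and report it in degrees
        self.gauge(
            "custom.temperature",
            (int(Path("/sys/class/thermal/thermal_zone0/temp").read_text().strip()) / 1000),
            tags=[],
        )
```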
The meat of this code is the following, where we send a gauge metric named `custom.temperature` with the temperature read from `/sys/class/thermal/thermal_zone0/temp` (this is how you can read the temperature for a Pi with Ubuntu installed; you may have to tweak this bit for other distros):
```python
self.gauge(
    "custom.temperature",
    (int(Path("/sys/class/thermal/thermal_zone0/temp").read_text().strip()) / 1000),
    tags=[],
)
```
That's it; you can tack other metrics in there too if you'd like to. You'll also need to restart your Datadog agent for it to start sending these metrics.
1. Take a `pg_dump` of your db: `pg_dump -Fc --no-acl --no-owner --table forms my_forms_prod > my_forms_prod.pgdump`
2. `pg_restore` it into a temporary `scratch` database: `pg_restore --verbose --clean --no-acl --no-owner -d scratch my_forms_prod.dump`
3. Rename the table in the scratch database: `ALTER TABLE forms RENAME TO old_forms;`
4. Take a `pg_dump` of the scratch database: `pg_dump -Fc --no-acl --no-owner scratch > my_old_forms_prod.pgdump`, which will have the "RENAMED" table :D
5. Finally, restore this dump into your target database: `pg_restore --verbose --clean --no-acl --no-owner -d my_new_forms_prod my_old_forms_prod.dump`
This is just a hack though. Hope you find it useful 😀
Rails stores its migration info in the `schema_migrations` table. Phoenix uses Ecto for managing the database, and Ecto and Rails both use a table called `schema_migrations` to store the database migration info. So, you'll have to rename it to avoid errors when you run Ecto migrations:

```bash
psql db
```
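The psql session is truncated above; a hedged sketch of the rename, with the new table name as an assumption, would be:

```sql
-- run inside psql: move the Rails migration bookkeeping out of Ecto's way
ALTER TABLE schema_migrations RENAME TO rails_schema_migrations;
```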
Then run the `mix ecto.create` command. This will set up the `schema_migrations` table in the existing database. Now you've successfully migrated your database, and you can run your Phoenix/Ecto migrations like you would in a normal Phoenix app.
There are a few wrong ways of sharing secrets. Make sure you don't do any of these 🙂
The Parameter Store is a free and easy tool to save your secrets. There are fancier options like the Secrets Manager, but they cost money.
One way of storing secrets is to create one parameter per environment variable. E.g., if you have an app called money, you could create parameters called `money_database_url`, `money_secret_access_token`, etc. Make sure you create them as 'SecureString' types. And then, in your task definition, use the following code:
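The snippet is truncated in the original; an ECS task definition's `secrets` block, with a made-up account ARN, looks like this:

```json
"secrets": [
  {
    "name": "DATABASE_URL",
    "valueFrom": "arn:aws:ssm:us-east-1:123456789012:parameter/money_database_url"
  },
  {
    "name": "SECRET_ACCESS_TOKEN",
    "valueFrom": "arn:aws:ssm:us-east-1:123456789012:parameter/money_secret_access_token"
  }
]
```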
This will make your secrets available to your ECS container via environment variables called `DATABASE_URL` and `SECRET_ACCESS_TOKEN`. However, if you have lots of secrets, this becomes unwieldy.
I create a file called `secrets.json` with all the secrets (you can tweak this step and use some other format):
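The file's contents are truncated in the original; the shape is just env var names mapped to values (the values here are made up):

```json
{
  "DATABASE_URL": "ecto://user:pass@host/money_prod",
  "SECRET_ACCESS_TOKEN": "super-secret-token"
}
```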
Once I have all the secrets listed in this file, I pass it through the following command:
```bash
jq -c . < "secrets.json" | base64 --wrap 0
```
This strips the spaces in the JSON and base64-encodes it. I plug this value into a single parameter called `money_config` and then use the same strategy as before to pass it as an env var:
1 | "secrets": [ |
Now, in the app, I just decode the base64 and then decode the JSON to get all the values. Here is how I do it in my Elixir apps:
```elixir
# config/releases.exs
```
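The file is truncated above; a hedged sketch, assuming the env var is called `CONFIG` and Jason is available for JSON decoding:

```elixir
# config/releases.exs
import Config

secrets =
  "CONFIG"
  |> System.fetch_env!()
  |> Base.decode64!()
  |> Jason.decode!()

config :money, Money.Repo, url: secrets["DATABASE_URL"]
config :money, :secret_access_token, secrets["SECRET_ACCESS_TOKEN"]
```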
This approach allows you to use around 70 secrets in one parameter, because parameter values are limited to a size of 4K characters.
If you have more than 70 environment variables, you can add `gzip` to the pipe to fit more environment variables in a single parameter:
```bash
jq -c . < "secrets.json" | gzip | base64 --wrap 0
```
You'll have to do things in the opposite order in your app to read this data. With gzip, you can get almost 140 env variables.
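A sketch of the reverse pipeline in Elixir (again assuming a `CONFIG` env var):

```elixir
secrets =
  "CONFIG"
  |> System.fetch_env!()
  |> Base.decode64!()
  |> :zlib.gunzip()
  |> Jason.decode!()
```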
There is a way you can do this by updating your data in small batches. The idea is to first find the ids of the records you want to update, and then update a small batch of them in each transaction.
For our example, let us say we have a `users` table which has 3M records created in the year 2019 whose authentication token needs to be reset. Simple enough!
Doing this in a single update is the easiest and is possible if you don't use this table a lot. However, as I said, it is prone to deadlocks and statement timeouts.
```sql
UPDATE users
```
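The statement is truncated above; a hedged sketch of the single-statement version (assuming a reset means nulling the token):

```sql
UPDATE users
SET authentication_token = NULL
WHERE date_part('year', created_at) = 2019;
```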
Doing this through a CTE in multiple batches works, but is not the most efficient.
```sql
-- first get all the records that you want to update by using rolling OFFSETs
```
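Only the first comment line survives above; one batch of that approach might look like this sketch:

```sql
-- one batch: filter and order everything, then skip ahead with OFFSET
WITH batch AS (
  SELECT id
  FROM users
  WHERE date_part('year', created_at) = 2019
  ORDER BY id
  LIMIT 1000 OFFSET 0 -- bump the offset by 1000 on each run
)
UPDATE users
SET authentication_token = NULL
WHERE id IN (SELECT id FROM batch);
```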
That works. However, it is not the most efficient update, because for every batch (in this example, a batch of 1000) we perform the filtering and ordering of all the data. So, we end up making the same query 3M/1K or 3000 times. Not the most efficient use of our database resources!
So, to remove the inefficiency from the previous step, we can create a temporary table to store the filtered user ids while we update the records. Also, since this is a temp table, it is discarded automatically once the session finishes.
```sql
CREATE TEMP TABLE users_to_be_updated AS
```
So, in the above SQL we are creating a temporary table containing a row_id, which is a serial number going from 1 to the total number of rows, and also adding an index on it because we'll be using it in our batch update WHERE clause. And finally, we do our batch update by selecting the rows from 0..1000 in the first iteration, 1000..2000 in the second iteration, and so on.
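Here is a hedged sketch of the whole temp-table approach:

```sql
CREATE TEMP TABLE users_to_be_updated AS
SELECT row_number() OVER (ORDER BY id) AS row_id, id
FROM users
WHERE date_part('year', created_at) = 2019;

CREATE INDEX idx_users_to_be_updated_row_id ON users_to_be_updated (row_id);

-- one batch: rows 0..1000; the next run uses 1000..2000, and so on
UPDATE users
SET authentication_token = NULL
FROM users_to_be_updated utbu
WHERE users.id = utbu.id
  AND utbu.row_id > 0
  AND utbu.row_id <= 1000;
```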
```ruby
# sql_generator.rb
```
This tiny script generates a sql file which can then be executed via psql to do the whole process in one fell swoop.
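The script is truncated above; a sketch of what such a generator could look like (batch size and filter are assumptions):

```ruby
# sql_generator.rb
BATCH_SIZE = 1000
TOTAL_ROWS = 3_000_000

puts <<~SQL
  CREATE TEMP TABLE users_to_be_updated AS
  SELECT row_number() OVER (ORDER BY id) AS row_id, id
  FROM users
  WHERE date_part('year', created_at) = 2019;
  CREATE INDEX idx_users_to_be_updated_row_id ON users_to_be_updated (row_id);
SQL

(TOTAL_ROWS / BATCH_SIZE).times do |i|
  from = i * BATCH_SIZE
  puts <<~SQL
    UPDATE users
    SET authentication_token = NULL
    FROM users_to_be_updated utbu
    WHERE users.id = utbu.id
      AND utbu.row_id > #{from}
      AND utbu.row_id <= #{from + BATCH_SIZE};
  SQL
end
```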
```bash
# generate the sql file
```
Once we have the sql file, we run it through psql like so:
```bash
psql --echo-all --file=user_batch_update.psql "DATABASE_URL"
```
That's all folks; now your updates should be done in batches and shouldn't cause any deadlocks or statement timeouts.
```ruby
def find_product_in_current_contexts
```
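The method body is truncated above; a hedged reconstruction of the buggy version, where `current_context_ids` and `product_id` are assumed helpers:

```ruby
def find_product_in_current_contexts
  # current_context_ids is assumed to return something like [1, 2, 3]
  current_context_ids.each do |context_id|
    product = Product.find_by(context_id: context_id, id: product_id)
    return product if product
  end
end
```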
This code tries to find the first product in the current contexts in the order they are defined. However, the above code has a tiny bug. Can you figure out what it is?
In cases where there are no products in any of the contexts, this function returns the array `[1, 2, 3]` instead of returning `nil`, because `Array#each` returns the array, and in the case where we don't find the product we don't return early.
We can easily fix this by adding an extra return at the end of the function.
```ruby
def find_product_in_current_contexts
```
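Sketched out, the fix looks like this:

```ruby
def find_product_in_current_contexts
  current_context_ids.each do |context_id|
    product = Product.find_by(context_id: context_id, id: product_id)
    return product if product
  end
  nil # explicit return value so we don't leak the array
end
```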
The fix is awkward though; let us see if we can improve it.
We could use `.map` to find a product for every context and return the first non-`nil` record like so:
```ruby
def find_product_in_current_contexts
```
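A sketch of that version:

```ruby
def find_product_in_current_contexts
  current_context_ids
    .map { |context_id| Product.find_by(context_id: context_id, id: product_id) }
    .find { |product| !product.nil? }
end
```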
This looks much cleaner! And it doesn't have the previous bug either. However, this code is not efficient: we want to return the first product we find across the contexts, but the above code always looks in all contexts even if it finds a product in the first context. We need to be lazy!
Calling `.lazy` on an enumerable gives you a lazy enumerator, and the neat thing about that is it only executes the chain of functions as many times as needed.
Here is a short example which demonstrates its use:
```ruby
def find(id)
```
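A sketch of such a demo, with a `find` that logs each call so you can see how often it runs:

```ruby
def find(id)
  puts "find(#{id})"
  id * 10
end

[1, 2, 3, 4, 5].lazy.map { |id| find(id) }.first(2)
# find(1)
# find(2)
# => [10, 20]
```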
As you can see from the above example, the lazy enumerator executes only as many times as necessary. Here is another example from the Ruby docs, to drive the point home:
```ruby
irb> (1..Float::INFINITY).lazy.select(&:odd?).drop(10).take(2).to_a
```
Now, applying this to our code is pretty straightforward: we just need to add a call to `.lazy` before we map and we are all set!
```ruby
def find_product_in_current_contexts
```
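The final version, sketched:

```ruby
def find_product_in_current_contexts
  current_context_ids
    .lazy
    .map { |context_id| Product.find_by(context_id: context_id, id: product_id) }
    .find { |product| !product.nil? }
end
```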
Ah, nice functional Ruby!
`Enum` functions you want to use. Here are a few common use cases.

`Enum.map`

You can use `Enum.map` when you want to transform a set of elements into another set of elements. Note that the count of elements remains unchanged: if you transform a list of 5 elements using `Enum.map`, you get an output list containing exactly 5 elements; however, the shape of the elements might be different.
```elixir
# transform names into their lengths
```
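The example is truncated above; a minimal sketch of that comment's idea:

```elixir
iex> Enum.map(["muhammad", "ali", "minhajuddin"], &String.length/1)
[8, 3, 11]
```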
If you look at the count of input and output elements, it remains the same; however, the shape is different: the input elements are all strings whereas the output elements are all numbers.
```elixir
# get ids of all users from a list of structs
```
In this example we transform a list of maps to a list of numbers.
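A sketch of that example:

```elixir
iex> users = [%{id: 1, name: "Dana"}, %{id: 2, name: "Ali"}]
iex> Enum.map(users, & &1.id)
[1, 2]
```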
`Enum.filter`

When you want to whittle down your input list, use `Enum.filter`. Filtering doesn't change the shape of the data, i.e. you are not transforming elements, and the shape of the input data will be the same as the shape of the output data. However, the count of elements will be different; to be more precise, it will be less than or equal to the input list count.
```elixir
# filter a list to only get names which start with `m`
```
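A sketch:

```elixir
iex> Enum.filter(["muhammad", "ali", "minhajuddin"], &String.starts_with?(&1, "m"))
["muhammad", "minhajuddin"]
```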
The shape of data here is the same: we use a list of strings as the input and get a list of strings as an output; only the count has changed, and in this case we have fewer elements.
```elixir
# filter a list of users to only get active users
```
In this example too, the shape of the input elements is a map (user) and the shape of output elements is still a map.
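A sketch:

```elixir
iex> users = [%{name: "Dana", active: true}, %{name: "Ali", active: false}]
iex> Enum.filter(users, & &1.active)
[%{name: "Dana", active: true}]
```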
`Enum.reduce`

The last of the commonly used `Enum` functions is `Enum.reduce`, and it is also one of the most powerful. You can use `Enum.reduce` when you need to change the shape of the input list into something else, for instance a `map` or a `number`.
Change a list of elements into a number by computing its product or sum
```elixir
iex> Enum.reduce(
```
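The call is truncated above; computing a product and a sum looks like this:

```elixir
iex> Enum.reduce([1, 2, 3, 4], 1, fn x, acc -> x * acc end)
24
iex> Enum.reduce([1, 2, 3, 4], 0, fn x, acc -> x + acc end)
10
```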
`Enum.reduce` takes three arguments: the first is the input enumerable, which is usually a list or map; the second is the starting value of the accumulator; and the third is a function which is applied for each element, whose result is then sent to the next function application as the accumulator.
Let's try and understand this using an equivalent JavaScript example.
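The original JavaScript snippet is truncated; an equivalent reduce might be:

```javascript
// multiply every element into the accumulator, starting from 1
const product = [1, 2, 3, 4].reduce((acc, x) => acc * x, 1);
console.log(product); // 24
```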
Let's look at another example of converting an employee list into a map containing an employee id and their name.
```elixir
iex> Enum.reduce(
```
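A sketch of that conversion:

```elixir
iex> employees = [%{id: 1, name: "Dana"}, %{id: 2, name: "Ali"}]
iex> Enum.reduce(employees, %{}, fn employee, acc ->
...>   Map.put(acc, employee.id, employee.name)
...> end)
%{1 => "Dana", 2 => "Ali"}
```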
So, with a reduce, you end up folding an input list into one output value.
A few days ago, I caused a minor incident by overloading our databases. Having been away from Ruby for a bit, I had forgotten that Sidekiq runs multiple threads per worker instance. So, I ended up enqueuing about 10K jobs on Sidekiq, and Sidekiq started executing them immediately. We have 50 worker instances and run Sidekiq with a concurrency of 20. So, essentially we had 400 worker threads ready to start crunching these jobs. Coincidentally, we have 400 database connections available, and my batch background job ended up consuming all the connections for 5 minutes, during which the other parts of the application were connection-starved and started throwing errors 😬.
That was a dumb mistake. Whenever you find yourself making a dumb mistake, make sure that no one else can repeat that mistake. To fix that, we could set up our database with multiple users in such a way that the web app would connect with a user which could only open a maximum of 100 connections, the background worker with a user with its own limits, and so on. This would stop these kinds of problems from happening again. However, we'll get there when we get there, as this would require infrastructure changes.
I had another batch job lined up which had to process millions of rows in a similar fashion, and I started looking for solutions. One suggestion was to run these jobs on a single worker or a small set of workers; you can do this by having a custom queue for this job and executing a separate Sidekiq instance just for this one queue. However, that would require some infrastructure work. So, I started looking at other options.
I thought that Redis might have something to help us here, and it did! Redis allows you to make blocking pops from a list using the `BLPOP` command. If you run `BLPOP myjob 10`, it will pop the first available element in the list; however, if the list is empty, it will block for 10 seconds, during which, if an element is inserted, it will pop it and return its value. Using this knowledge, I thought we could control the enqueuing based on the elements in the list. The idea is simple:
1. Seed a Redis list with `n` elements, where `n` is the desired concurrency. So, if I seed this list with `2` elements, Sidekiq would execute only 2 jobs at any point in time, regardless of the number of worker instances or the concurrency of the Sidekiq workers.
2. Make the enqueuer do a `BLPOP` before it enqueues. So, as soon as the enqueuer starts, it pops the first 2 elements from the Redis list and enqueues 2 jobs. At this point, the enqueuer is stuck till we add more elements to the list.
3. Make each job do an `LPUSH` when it finishes, and as soon as an element is added, the enqueuer which is blocked at `BLPOP` pops this element and enqueues another job. This goes on till all your background jobs are enqueued, all the while making sure that there are never more than 2 jobs at any given time.

Let's put this into concrete Ruby code.
```ruby
module ControlledConcurrency
```
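The module is truncated above; a minimal sketch of the idea (the real code lives in the repo linked below, and the queue name and token value here are assumptions):

```ruby
require "redis"
require "sidekiq"

module ControlledConcurrency
  REDIS = Redis.new
  QUEUE = "controlled_concurrency_slots"

  # Seed the list with n tokens, where n is the desired concurrency
  def self.seed(n)
    n.times { REDIS.lpush(QUEUE, "token") }
  end

  # Blocks until a slot is free, then enqueues the job
  def self.enqueue(worker_class, *args)
    REDIS.blpop(QUEUE) # waits for a token
    worker_class.perform_async(*args)
  end

  # Workers call this at the end of perform to release their slot
  def self.release
    REDIS.lpush(QUEUE, "token")
  end
end
```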
That’s all folks! Hope you find this useful!
The full code for this can be found at: https://github.com/minhajuddin/sidekiq-controlled-concurrency
We want an application which spins up a Cowboy server and renders a hello world message. Here is the required code for that:
```elixir
defmodule Hello do
```
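The module is truncated above; a hedged sketch using Plug with Cowboy (the post's actual wiring may differ):

```elixir
defmodule Hello do
  import Plug.Conn

  def init(opts), do: opts

  def call(conn, _opts) do
    send_resp(conn, 200, "hello world")
  end
end

# started from a supervision tree with something like:
# {Plug.Cowboy, scheme: :http, plug: Hello, options: [port: 4000]}
```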
And, here is a quick test to assert that it works!
```elixir
defmodule HelloTest do
```
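The test is truncated above; a sketch using `Plug.Test`:

```elixir
defmodule HelloTest do
  use ExUnit.Case
  import Plug.Test

  test "renders hello world" do
    conn = conn(:get, "/") |> Hello.call([])

    assert conn.status == 200
    assert conn.resp_body == "hello world"
  end
end
```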
Terraform has an awesome postgresql provider which can be used for managing databases. However, there are a few parts which are tricky and needed trial and error to get right.
The first roadblock was that my RDS cluster wasn't accessible publicly (which is how it should be, for security reasons). I do have a way to connect to my postgres servers via a bastion host, so I thought we could use an SSH tunnel over the bastion host to get to our RDS cluster from my local computer. However, terraform doesn't support connecting to the postgres server over an SSH tunnel via its configuration.
So, it required a little bit of jerry-rigging. The postgresql provider was happy as long as it could reach the postgres cluster using a host, port and password. So, I set up a local tunnel outside terraform via my SSH config like so:
```
Host bastion
```
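The config is truncated above; a sketch with made-up hosts and IPs:

```
Host bastion
  HostName bastion.example.com
  User ubuntu

Host ecs1-pg
  HostName 10.0.1.23          # the ecs instance's private IP
  User ec2-user
  ProxyJump bastion
  LocalForward 3333 my-cluster.cluster-abc123.us-east-1.rds.amazonaws.com:5432
```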
The relevant line here is the `LocalForward` declaration, which wires up a local port forward so that when network traffic hits port `3333` on your `localhost`, it is tunneled over the bastion and then the ecs server, and is routed to your cluster's port `5432`. One thing to note here is that your ecs cluster should be able to connect to your RDS cluster via proper security group rules.
Once you have the ssh tunnel set up, you can start wiring up your postgres provider for terraform like so:
```hcl
provider "postgresql" {
```
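Truncated above; a hedged sketch of the provider block (the password variable is an assumption):

```hcl
provider "postgresql" {
  host      = "localhost"
  port      = 3333
  username  = "root"
  password  = var.rds_master_password # the RDS master password
  superuser = false
  sslmode   = "require"
}
```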
The provider config is pretty straightforward: we point it to `localhost:3333` with a `root` user (which is the master user created by the rds cluster). So, when you connect to `localhost:3333`, you are actually connecting to the RDS cluster through an SSH tunnel (make sure that your ssh connection is open at this point via `ssh ecs1-pg` in a separate terminal). We also need to set the `superuser` to `false` because RDS doesn't give us a postgres superuser; getting this wrong initially caused me a lot of frustration.
Now that our cluster connectivity is set up, we can start creating the databases and users, one set for each of our apps.
Below is a sensible configuration for a database called `liveform_prod` and its user called `liveform`.
```hcl
locals {
```
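The config is truncated above; a hedged sketch of the database and role resources:

```hcl
resource "random_password" "liveform_db_password" {
  length = 32
}

resource "postgresql_role" "liveform" {
  name              = "liveform"
  login             = true
  connection_limit  = 5
  statement_timeout = 60000 # 1 minute, in milliseconds
  password          = random_password.liveform_db_password.result
}

resource "postgresql_database" "liveform_prod" {
  name             = "liveform_prod"
  owner            = postgresql_role.liveform.name
  connection_limit = 5
}
```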
A few things to note here:

- The database `liveform_prod` is owned by a new user called `liveform`.
- The database has a connection limit of `5`. You should always set a sensible connection limit to prevent this app from crashing the cluster.
- The user has a connection limit of `5` and a statement timeout of 1 minute, which is big enough for web apps; you should set it to the least duration which works for your app.
- A random password (from the `random_password` resource) is used as the password of our new `liveform` role. This can be viewed by running `terraform show`.
By default postgres allows all users to connect to all databases and create/view all the tables. We want our databases to be isolated properly, so that a user for one app cannot access another app's database. This requires running some SQL on the newly created database. We can easily do this using a `null_resource` and a `local-exec` provisioner like so:
```hcl
resource "null_resource" "liveform_db_after_create" {
```
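Truncated above; a hedged sketch of that resource (the trigger and env wiring are assumptions):

```hcl
resource "null_resource" "liveform_db_after_create" {
  depends_on = [postgresql_database.liveform_prod]

  provisioner "local-exec" {
    command = "./pg_database_roles_setup.sh"
    environment = {
      PGHOST     = "localhost"
      PGPORT     = 3333
      PGDATABASE = postgresql_database.liveform_prod.name
      PGUSER     = postgresql_role.liveform.name
      PGPASSWORD = random_password.liveform_db_password.result
    }
  }
}
```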
And here is the `./pg_database_roles_setup.sh` script:
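The script body is truncated; based on the description below, it would look roughly like this:

```bash
#!/usr/bin/env bash
# relies on the PG* env vars set by the local-exec provisioner
set -euo pipefail

psql <<SQL
REVOKE CONNECT ON DATABASE $PGDATABASE FROM PUBLIC;
GRANT CONNECT ON DATABASE $PGDATABASE TO $PGUSER;
GRANT CONNECT ON DATABASE $PGDATABASE TO root;
SQL
```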
The `pg_database_roles_setup.sh` script connects to our rds cluster over the SSH tunnel, to the newly created database as the newly created user, and revokes connect privileges for all users on this database, then adds connect privileges for the app user and the root user. You can add more queries to this script that you might want to run after the database is set up. Finally, the `local-exec` provisioner passes the right data via environment variables and calls the database setup script.
If you create a `postgresql_role` before setting the connection's `superuser` to `false`, you'll get stuck trying to update or delete the new role. To work around this, manually log in to the rds cluster via psql and `DROP` the role, then remove this state from terraform using: `terraform state rm postgresql_role.liveform_db_role`
Set up the `bastion.tf` file like so:
```hcl
# get a reference to aws_ami.id using a data resource by finding the right AMI
```
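Truncated above; a hedged sketch of an on-demand bastion (the AMI data source lookup is omitted, and names are assumed):

```hcl
resource "aws_security_group" "bastion" {
  count  = var.bastion_enabled ? 1 : 0
  vpc_id = aws_vpc.vpc.id

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = var.myip # only allow SSH from my IP
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_instance" "bastion" {
  count                       = var.bastion_enabled ? 1 : 0
  ami                         = data.aws_ami.ubuntu.id
  instance_type               = "t3.micro"
  key_name                    = var.ssh_key_name
  subnet_id                   = aws_subnet.subnet.id
  associate_public_ip_address = true
  vpc_security_group_ids      = [aws_security_group.bastion[0].id]
}
```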
Set up the `terraform.tfvars` file like so:

```hcl
# Set this to `true` and do a `terraform apply` to spin up a bastion host
# and when you are done, set it to `false` and do another `terraform apply`
bastion_enabled = false

# My SSH keyname (without the .pem extension)
ssh_key_name = "hyperngn_aws_ohio"

# The IP of my computer. Do a `curl -sq icanhazip.com` to get it
# Look for the **ProTip** down below to automate this!
myip = ["247.39.103.23/32"]
```
Set up the `vars.tf` file like so:

```hcl
variable "ssh_key_name" {
  description = "Name of AWS key pair"
}

variable "myip" {
  type        = list(string)
  description = "My IP to allow SSH access into the bastion server"
}

variable "bastion_enabled" {
  description = "Spins up a bastion host if enabled"
  type        = bool
}
```
Here are the relevant sections from my `vpc.tf`. You could just hardcode these values in the `bastion.tf`, or use `data` sources if you've set these up manually and `resource`s if you use terraform to control them:
```hcl
resource "aws_subnet" "subnet" {
```
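Truncated above; a sketch with made-up CIDR ranges:

```hcl
resource "aws_vpc" "vpc" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_subnet" "subnet" {
  vpc_id                  = aws_vpc.vpc.id
  cidr_block              = "10.0.1.0/24"
  map_public_ip_on_launch = true
}
```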
Finally, you need to set up your ~/.ssh/config to use the bastion as the jump host like so:
```
# Bastion config
```
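A sketch, with made-up IPs:

```
# Bastion config
Host bastion
  HostName 3.135.10.20        # the bastion's public IP
  User ubuntu

Host ecs1
  HostName 10.0.1.23          # the ecs instance's private IP
  User ec2-user
  ProxyJump bastion
```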
Once you are done, you can just log in by running the following command, and it should run seamlessly:
```bash
ssh ecs1
```
Pro-Tip: Put the following in your terraform folder's .envrc, so that you don't have to manually copy-paste your IP every time you bring your bastion host up (you also need to have direnv for this to work):

```bash
$ cat .envrc
export TF_VAR_myip="[\"$(curl -sq icanhazip.com)/32\"]"
```
- If you can't get in, use the `ssh -vv ecs1` command to get copious logs, and read through all of them to figure out what might be wrong.
- Double-check the `User`: Ubuntu AMIs create a user called `ubuntu`, whereas Amazon ECS-optimized AMIs create an `ec2-user` user. If you get the user wrong, `ssh` will fail.

From a security point of view this is a pretty great setup: your normal servers don't allow any SSH access (and in my case aren't even public and are fronted by ALBs), and your bastion host is not up all the time. Even when it is up, it only allows traffic from your single IP. It also saves cost by tearing down the bastion instance when you don't need it.
As you can see, we have a many-to-many relation between the products and tags tables via a products_tags table, which has just 2 columns, the `product_id` and the `tag_id`, and has a composite primary key (while also having an index on the `tag_id` to make lookups faster). The use of a join table is required; however, you usually want the join table to be invisible in your domain, as you don't want to deal with a ProductTag model: it doesn't serve any purpose other than helping you bridge the object model with the relational model. Anyway, here is how we ended up building the many-to-many relationship in Phoenix and Ecto.
We use a nondescript `Core` context for our `Product` model by running the following scaffold code:
```bash
mix phx.gen.html Core Product products name:string description:text
```
This generates the following migration (I've omitted the boilerplate to make reading the relevant code easier):
```elixir
create table(:products) do
```
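Truncated above; the generated migration body is roughly:

```elixir
create table(:products) do
  add :name, :string
  add :description, :text

  timestamps()
end
```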
Don't forget to add the following to your `router.ex`:

```elixir
resources "/products", ProductController
```
Then, we add the `Tag` in the same context by running the following scaffold generator:
```bash
mix phx.gen.html Core Tag tags name:string:unique
```
This generates the following migration. Note the unique index on `name`, as we don't want tags with duplicate names; you might have separate tags per user, in which case you would have a unique index on `[:user_id, :name]`.
```elixir
create table(:tags) do
```
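Truncated above; roughly:

```elixir
create table(:tags) do
  add :name, :string

  timestamps()
end

create unique_index(:tags, [:name])
```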
Finally, we generate the migration for the join table `products_tags` (by convention it uses the pluralized names of both entities joined by an underscore, so `products` and `tags` joined by an `_` gives us the name `products_tags`).
```bash
mix phx.gen.schema Core.ProductTag products_tags product_id:references:products tag_id:references:tags
```
This scaffolded migration requires a few tweaks to make it look like the following:
```elixir
create table(:products_tags, primary_key: false) do
```
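Truncated above; the tweaked migration would look roughly like:

```elixir
create table(:products_tags, primary_key: false) do
  add :product_id, references(:products), primary_key: true
  add :tag_id, references(:tags), primary_key: true
end

create index(:products_tags, [:tag_id])
```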
Note the following:
- We add the `primary_key: false` declaration to the `table()` function call to avoid creating a wasted `id` column.
- We remove the `timestamps()` declaration, as we don't want to track inserts and updates on the joins. You might want to track inserts if you want to know when a product was tagged with a specific tag, which makes things a little more complex, so we'll avoid it for now.
- We add `, primary_key: true` to the `:product_id` and `:tag_id` lines to make `[:product_id, :tag_id]` a composite primary key.

Now our database is set up nicely for our many-to-many relationship. Here is how our tables look in the database:
```
product_tags_demo_dev=# \d products
```
Now comes the fun part: modifying our controllers and contexts to get our tags working!
The first thing we need to do is add a many_to_many relationship on the `Product` schema like so:
```elixir
schema "products" do
```
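Truncated above; a hedged sketch (`on_replace: :delete` is needed later for updates):

```elixir
schema "products" do
  field :description, :string
  field :name, :string

  many_to_many :tags, ProductTagsDemo.Core.Tag,
    join_through: "products_tags",
    on_replace: :delete

  timestamps()
end
```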
(Note that we don't need to add this relationship on the other side, i.e., `Tag`, to get this working.)
Now, we need to modify our `Product` form to show an input mechanism for tags. The easy way to do this is to ask the users to provide a comma-separated list of tags in an input textbox. A nicer way is to use a javascript library like select2. For us, a text box with comma-separated tags will suffice.
The easiest way to do this is to add a text field like so:

```eex
<%= label f, :tags %>
<%= text_input f, :tags %>
<%= error_tag f, :tags %>
```
However, as soon as you wire this up, you'll get an error on the `/products/new` page like below:

```
protocol Phoenix.HTML.Safe not implemented for #Ecto.Association.NotLoaded<association :tags is not loaded> of type Ecto.Association.NotLoaded (a struct).
```
This is because the `to_string` function can't convert an `Ecto.Association.NotLoaded` struct into a string. When you have a relation like a `belongs_to` or `has_one` or `many_to_many` that isn't loaded on a struct, it has this default value. This is coming from our controller; we can remedy it by changing our action to the following:
```elixir
def new(conn, _params) do
```
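Truncated above; roughly (assuming the generated `Core.change_product/1`):

```elixir
def new(conn, _params) do
  changeset = Core.change_product(%Product{tags: []})
  render(conn, "new.html", changeset: changeset)
end
```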
Notice the `tags: []`: we are creating a new product with an empty tags collection so that it renders properly in the form.
Now that we have fixed our form, we can try submitting some tags through this form. However, when you enter any tags and hit `Save`, it doesn't do anything, which is not surprising because we haven't set up the handling of these tags on the backend yet.
We know that the `tags` field has comma-separated tags, so we need to do the following to be able to save a product: split the input into tag names, store the names in a `:citext` column (short for case-insensitive text; read more about how to set up `:citext` columns in my blog post about storing username/email in a case-insensitive fashion), and once we have the `names`, insert any new tags, then fetch the existing tags, combine them, and use `put_assoc` to put them on the product.

That last step creates a race condition in your code, which can happen when 2 requests try to create tags with the same name at the same time. An easy way to work around this is to treat all the tags as new and do an upsert using `Repo.insert_all` with an `on_conflict: :nothing` option, which adds the fragment `ON CONFLICT DO NOTHING` to your SQL, making your query run successfully even if there are tags with the same name in the database; it just doesn't insert new tags. Also, note that this function inserts all the tags in a single query, doing a bulk insert of all the input tags. Once you upsert all the tags, you can then find them and use a `put_assoc` to create an association.
This is what ended up as the final `Core.create_product` function:
```elixir
def create_product(attrs \\ %{}) do
```
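Truncated above; a hedged sketch, where `parse_tags/1` is an assumed helper that splits and trims the comma-separated input:

```elixir
def create_product(attrs \\ %{}) do
  tag_names = parse_tags(attrs["tags"])

  # insert_all doesn't fill timestamps, so we set them ourselves
  now = NaiveDateTime.truncate(NaiveDateTime.utc_now(), :second)
  entries = Enum.map(tag_names, &%{name: &1, inserted_at: now, updated_at: now})

  # upsert all the tags in one query; existing names are left untouched
  Repo.insert_all(Tag, entries, on_conflict: :nothing)

  # fetch all the tags (old and new) by name; assumes `import Ecto.Query`
  tags = Repo.all(from t in Tag, where: t.name in ^tag_names)

  %Product{}
  |> Product.changeset(attrs)
  |> Ecto.Changeset.put_assoc(:tags, tags)
  |> Repo.insert()
end
```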
It does the following:

- Upserts all the tags using `Repo.insert_all` with `on_conflict: :nothing` in a single SQL query.
- Fetches the tags and uses `put_assoc` to associate the tags with the newly created product.
- From here, `Ecto` takes over and makes sure that our product has the right association records in the `products_tags` table.

Notice how, through all of our code, we haven't used the `products_tags` table except for defining the `many_to_many` relationship in the `Product` schema.
This is all you need to insert a product with multiple tags. However, we still want to show the tags of a product on the product details page. We can do this by tweaking our action and the Core module like so:
```elixir
defmodule Core do
```
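Truncated above; a sketch of the relevant pieces:

```elixir
defmodule Core do
  def get_product_with_tags!(id) do
    Product
    |> Repo.get!(id)
    |> Repo.preload(:tags)
  end
end

# and in the controller:
def show(conn, %{"id" => id}) do
  product = Core.get_product_with_tags!(id)
  render(conn, "show.html", product: product)
end
```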
Here we are preloading the tags with the product, and we can use it in the view like below to show all the tags for a product:
```eex
Tags: <%= (for tag <- @product.tags, do: tag.name) |> Enum.join(", ") %>
```
This takes care of creating and showing a product with tags. However, if we try to edit a product, we are greeted with the following error:
```
protocol Phoenix.HTML.Safe not implemented for #Ecto.Association.NotLoaded<association :tags is not loaded> of type Ecto.Association.NotLoaded (a struct).
```
Hmmm, we have seen this before when we rendered a new Product without tags,However, in this case, our product does have tags but they haven’t beenloaded/preloaded. We can remedy that easily by tweaking our edit
action to thefollowing:
```elixir
def edit(conn, %{"id" => id}) do
```
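Truncated above; roughly:

```elixir
def edit(conn, %{"id" => id}) do
  product = Core.get_product_with_tags!(id)
  changeset = Core.change_product(product)
  render(conn, "edit.html", product: product, changeset: changeset)
end
```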
This gives us a new error:
```
lists in Phoenix.HTML and templates may only contain integers representing bytes, binaries or other lists, got invalid entry: %ProductTagsDemo.Core.Tag{__meta__: #Ecto.Schema.Metadata<:loaded, "tags">, id: 1, inserted_at: ~N[2020-05-04 05:20:45], name: "phone", updated_at: ~N[2020-05-04 05:20:45]}
```
This is because we are using a `text_input` for a collection of tags, and when Phoenix tries to convert the list of tags into a string it fails. This is a good place to add a custom input function:
```elixir
defmodule ProductTagsDemoWeb.ProductView do
```
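Truncated above; a hedged sketch of such a helper:

```elixir
defmodule ProductTagsDemoWeb.ProductView do
  use ProductTagsDemoWeb, :view

  # renders the tags as a comma-separated string inside a text input
  def tag_input(form, field, opts \\ []) do
    value =
      case Phoenix.HTML.Form.input_value(form, field) do
        tags when is_list(tags) -> Enum.map_join(tags, ", ", & &1.name)
        other -> other
      end

    Phoenix.HTML.Form.text_input(form, field, Keyword.put(opts, :value, value))
  end
end
```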
With this helper we can tweak our form to:

```eex
<%= label f, :tags %>
<%= tag_input f, :tags %>
<%= error_tag f, :tags %>
<small class="help-text">tags separated by commas</small>
```

Note that `text_input` has been changed to `tag_input`.
Now, when we go to edit a product, it should render the form with the tags separated by commas. However, updating the product by changing tags still doesn't work, because we haven't updated our backend code to handle this. To complete this, we need to tweak the controller and the `Core` context like so:
```elixir
defmodule ProductTagsDemoWeb.ProductController do
```
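Truncated above; a hedged sketch of the update path, where `upsert_and_get_tags/1` is an assumed helper wrapping the insert_all + fetch logic from `create_product`:

```elixir
# in the controller
def update(conn, %{"id" => id, "product" => product_params}) do
  product = Core.get_product_with_tags!(id)

  case Core.update_product(product, product_params) do
    {:ok, product} ->
      redirect(conn, to: Routes.product_path(conn, :show, product))

    {:error, %Ecto.Changeset{} = changeset} ->
      render(conn, "edit.html", product: product, changeset: changeset)
  end
end

# in the context
def update_product(%Product{} = product, attrs) do
  tags = upsert_and_get_tags(attrs["tags"])

  product
  |> Product.changeset(attrs)
  |> Ecto.Changeset.put_assoc(:tags, tags)
  |> Repo.update()
end
```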
Note that in the controller we are using `get_product_with_tags!`, and in the context we inserted a `put_assoc` line similar to the one in `create_product`, so that the update does the same things as `create_product`.
Astute readers will observe that our create and update product implementations don't roll back newly created tags when `create_product` or `update_product` fails. Let us handle this case and wrap up our post!
Ecto provides `Ecto.Multi` to allow easy database transaction handling. This just needs changes to our context and our view like so:
```elixir
defmodule ProductTagsDemo.Core do
```
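Truncated above; a hedged sketch of `create_product` rewritten with `Ecto.Multi`, so the tag upserts roll back if the product insert fails:

```elixir
def create_product(attrs \\ %{}) do
  Ecto.Multi.new()
  |> Ecto.Multi.run(:tags, fn _repo, _changes ->
    {:ok, upsert_and_get_tags(attrs["tags"])}
  end)
  |> Ecto.Multi.insert(:product, fn %{tags: tags} ->
    %Product{}
    |> Product.changeset(attrs)
    |> Ecto.Changeset.put_assoc(:tags, tags)
  end)
  |> Repo.transaction()
  |> case do
    {:ok, %{product: product}} -> {:ok, product}
    {:error, :product, changeset, _changes} -> {:error, changeset}
  end
end
```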
Whew, that was long! But hopefully this gives you a comprehensive understanding of how to handle `many_to_many` relationships in Ecto and Phoenix.
The source code associated with this blog post can be found at https://github.com/minhajuddin/product_tags_demo
P.S. There is a lot of duplication in our final `create_product` and `update_product` functions; try removing the duplication in an elegant way! I'll share my take on it in the next post!
`pg_dump` doesn't support exporting partial tables. I looked around and found a utility called pg_sample which is supposed to help you with this. However, I wasn't comfortable with installing this on my production server or letting my production data go through this script. Thinking a little more made the solution obvious. The idea was simple.

First, create a table called `tmp_page_caches`, where `page_caches` is the table that you want to copy using `pg_dump`, using the following SQL in `psql`; this gives you a lot of freedom in SELECTing just the rows you want:

```sql
CREATE TABLE tmp_page_caches AS (SELECT * FROM page_caches LIMIT 1000);
```
Next, export this table using `pg_dump` as below. Here we are exporting the data to a sql file and transforming our table name to the original table name midstream:

```bash
pg_dump app_production --table tmp_page_caches | sed 's/public.tmp_/public./' > page_caches.sql
```
Copy this file over using `scp` and run it against the local database:

```bash
scp minhajuddin@server.prod:page_caches.sql .
```
Finally, drop the temporary table on the server:

```sql
DROP TABLE tmp_page_caches; -- be careful not to drop the real table!
```
Voila! We have successfully copied over a sample of our production table to our local environment. Hope you find it useful.
When I want to copy passwords from my browser to be used elsewhere, I usually open the developer tools console, inspect the password input box, and then run the following code:
```javascript
copy($0.value)
```
Chrome sets `$0` to refer to the currently selected DOM element; `$0.value` gives us the value of the password field, and sending it to the `copy` function copies this text to the OS clipboard.
I have a similar script set up for my terminal. When I want to copy the output of a command like `rake secret`, I run the following:
```bash
rake secret | xc # copies a new secret to the clipboard.
```
`xc` is aliased to the following in my bashrc:
```bash
alias xc='tee /dev/tty | xclip -selection clipboard'
```
This command prints the output to the terminal (using `tee /dev/tty`) and copies it to the OS clipboard using the `xclip` package.
I wanted the same ability in my Ruby and Elixir REPLs. It was pretty straightforward to do in Ruby. Here is the annotated code:
```ruby
puts 'loading ~/.pryrc ...'
```
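Truncated above; a hedged sketch of such a `.pryrc` helper:

```ruby
puts 'loading ~/.pryrc ...'

# copy the given text to the OS clipboard by piping it to xclip
def copy(text)
  IO.popen('xclip -selection clipboard', 'w') { |io| io.write(text.to_s) }
  text
end
```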
Below is a similar script for Elixir:
```elixir
IO.puts("loading ~/.iex.exs")
```
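Truncated above; a hedged sketch of an `.iex.exs` helper module:

```elixir
IO.puts("loading ~/.iex.exs")

defmodule H do
  # copy the given text to the OS clipboard by piping it to xclip
  def copy(text) do
    port = Port.open({:spawn, "xclip -selection clipboard"}, [:binary])
    send(port, {self(), {:command, to_string(text)}})
    send(port, {self(), :close})
    text
  end
end
```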
```elixir
iex(2)> :crypto.strong_rand_bytes(16) |> Base.encode16 |> H.copy
```
All these utilities (except for the browser's `copy` function) depend on the `xclip` utility, which can be installed on Ubuntu using `sudo apt-get install xclip`. You can emulate the same behaviour on a Mac using the `pbcopy` utility; you might have to tweak things a little bit, but it should be pretty straightforward.
You can do the same in your favorite programming language too: just find the right way to spawn an `xclip` process and send the text you want to be copied to its stdin. Hope this makes your development a little more pleasant :)
Then I found that the `citext` Postgres extension actually did this in a very easy and straightforward way. So, I went back to my code, ripped out the unnecessary complexity, and here is what I ended up with:

```elixir
defmodule SF.Repo.Migrations.EnableCitextExtension do
```
So, the way citext works is similar to our previous approach. If you want to get into all the gory details about how citext is implemented, you can check out the code on GitHub at: https://github.com/postgres/postgres/blob/6dd86c269d5b9a176f6c9f67ea61cc17fef9d860/contrib/citext/citext.c