Advanced Active Record: Optimizing Performance and Robustness with Locking, UUIDs, Fulltext Search, Database Views, and Geospatial Data
Active record advanced concepts
Now that you know how to avoid trouble with Active Record, it is time to take the next step. There are some more advanced database concepts that Rails developers should know (or at least be aware of) to further improve the performance of an application and make its features more robust.
This article is an introduction to those concepts, so I will only scratch the surface; each topic deserves a separate article with a deeper dive into the code.
I tested all the presented examples against a PostgreSQL database; some concepts may not be available on other database engines, or their implementations may differ. All my notes are based on years of hands-on experience at iRonin.IT - a top software development company, where we provide custom software development and IT staff augmentation services for a wide array of technologies.
Locking of the records
Imagine that you created a Rails application where managers cooperate by managing important information about their company. At some point, two managers may attempt to edit the same information at the same time.
If they save their changes nearly simultaneously, one of them will silently lose the changes they introduced. If you want to avoid this or similar problems, you can introduce record locking.
There are two types of locking: optimistic and pessimistic. Let’s take a look at both.
Optimistic locking
With optimistic locking, you assume that another person won’t update the same record at the same time. If it does happen, the application should raise an error.
This behaviour is controlled with a column called lock_version. Add this column to your model, and Rails will automatically apply optimistic locking. This is what it looks like in practice when you have the lock_version column in your table:
company1 = Company.find(1)
company2 = Company.find(1)
company1.title = "Great company"
company1.save # => true
company2.title = "Awesome company"
company2.save # Attempted to update a stale object: Company. (ActiveRecord::StaleObjectError)
The second update will be rolled back because someone has updated the record in the meantime. In this example, we process both objects in the same console, but in a real application this mechanism protects against concurrent processes.
We call it optimistic locking because we still allow the record to be read and assume there will be no conflicts, so there is no need to lock it exclusively and prevent other people from working with it.
Pessimistic locking
In the pessimistic mode, you prevent other processes from updating or even fetching the record while the transaction is not yet committed. Let’s consider this case:
company = Company.find(1)
company.with_lock do
  sleep(10)
  company.update(title: 'Awesome')
end
The transaction will hold the lock for 10 seconds. You can open another terminal window and try to update the record:
company = Company.find(1)
company.update(title: 'Great')
You won’t be able to do it right away. The second update will wait until the locking transaction is finished. After the first transaction commits, the second one completes, and both users will see the title "Great".
If you also want the second process to block when fetching the record (not only when updating it), replace Company.find(1) with Company.lock.find(1). There are a few more ways you can lock records with Rails, but since this article is just an introduction, I won’t go into details about it.
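For completeness, here is a minimal sketch (my addition, not part of the original example) of acquiring the lock at fetch time inside an explicit transaction:

Company.transaction do
  # SELECT ... FOR UPDATE - other sessions calling lock.find on the same row block here
  company = Company.lock.find(1)
  company.update(title: 'Exclusive')
end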
Integer vs. UUID as the primary key
In a typical Rails application, integers are used as the primary keys for your tables. However, sometimes you may need to use UUID instead, especially if you plan to expose the primary key to the users of the application or external systems.
UUID stands for universally unique identifier, and it’s a unique string in a format similar to 28059680-d3d5-11ed-afa1-0242ac120002. Some use cases for UUIDs include:
Distributed databases - when data is stored on multiple servers, you can avoid primary key conflicts by using UUID instead.
Large-scale systems - in large systems, generating sequential primary keys can become problematic; UUIDs make it easier because they can be generated independently.
Security and privacy - when you don’t want anyone to guess the primary keys of other records.
Using UUIDs has both advantages and disadvantages, but in this article I just want to show you that this option exists and how you can use it within a Rails application.
Configuration
There are two steps. One is to enable the pgcrypto extension, and the second is to let Rails know that we would like to use uuid instead of id when generating migrations for new tables. In PostgreSQL 13 and later, you no longer need the pgcrypto extension, so you can skip the migration step.
Here is the migration:
class EnablePgCrypto < ActiveRecord::Migration[7.0]
  def change
    enable_extension 'pgcrypto'
  end
end
Now, create a config/initializers/generators.rb file with the following contents:
Rails.application.config.generators do |g|
  g.orm :active_record, primary_key_type: :uuid
end
Usage
Let’s check if the configuration is working as expected:
rails g model Article title:string
rails db:migrate
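For reference, the migration generated by the command above should now declare the uuid primary key explicitly; a sketch of what to expect (details may vary slightly between Rails versions):

class CreateArticles < ActiveRecord::Migration[7.0]
  def change
    # id: :uuid comes from the generators configuration created earlier
    create_table :articles, id: :uuid do |t|
      t.string :title

      t.timestamps
    end
  end
end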
Create a record and investigate the value of the id attribute:
article = Article.create!(title: 'Active Record good practices')
article.id
# => "f72ca708-c9db-4f8f-b9e1-2706e2b02467"
Fulltext search
In PostgreSQL, the full-text search feature allows us to efficiently search for a given term across one or more related columns in the database. What’s more, there is an excellent gem for that!
As with the other topics, full-text search is broad enough to deserve its own series of articles. This time I will just scratch the surface with the pg_search gem to spark your curiosity.
Is the LIKE expression not enough?
No. If you want to perform a search using multiple words across multiple columns and tolerate spelling mistakes, the LIKE expression won’t be enough. A simple example: an application with job offers and a search bar. A user types “Rails” and you would like to find this keyword in the category, title, and description of the job offer.
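For contrast, a naive pattern-matching query could look like the sketch below (assuming the Job model created in the next section); it matches only a literal substring in a single column and knows nothing about word stems or relevance:

# Finds only rows whose title contains the literal substring "rails";
# matches in the description are missed entirely.
Job.where("title ILIKE ?", "%rails%")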
Configuration
There are two steps: add the gem to the application and add configuration to the model class for which you would like to use full-text search. The gem part:
bundle add pg_search
We can also generate a model that we will search against:
rails g model Job title:string description:text
rails db:migrate
Now, we can put the configuration inside the model:
class Job < ApplicationRecord
  include PgSearch::Model

  pg_search_scope :search_job,
                  against: { title: 'A', description: 'B' },
                  using: { tsearch: { dictionary: 'english' } }
end
With the above configuration, we will search using the title and description columns (with the title given higher priority) and normalize the contents of those columns using the English dictionary. It may sound a little mysterious, but if you want to know more right now, there is an excellent article from which I took this configuration example.
Usage
The first argument of the pg_search_scope method is the name of the class method we can use to perform the search:
Job.search_job("Rails developer")
The above call will generate a fairly complex query, and if you have matching records in the database, it will return an ordinary Active Record relation. We can improve the performance of the query, but a full treatment is beyond the scope of this article.
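That said, one common improvement worth sketching (my addition, not part of the pg_search setup above) is an expression index, so PostgreSQL does not have to compute the search vectors for every row on each query; note that the expression has to mirror what pg_search actually generates for the index to be used:

class AddFulltextIndexToJobs < ActiveRecord::Migration[7.0]
  def change
    # GIN index over a tsvector expression roughly matching the scope above
    add_index :jobs,
              "to_tsvector('english', coalesce(title, '') || ' ' || coalesce(description, ''))",
              using: :gin,
              name: 'index_jobs_on_fulltext'
  end
end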
Database views
Simply put, a database view is the result of a SQL query stored inside the database management system. There are three types of such views: read-only, updatable, and materialized. Since we are touching on a broad topic, I will go with an example of a materialized view provided by the well-known gem called scenic.
Database views are used to get better performance, to isolate a level of abstraction, and to avoid repeating the same more or less complex SQL query definitions across the application’s code.
The idea behind the database view
Imagine that in your application, there is a table called activities. It contains millions of records related to the actions performed in the system. Often, you want to generate detailed reports for specific actions performed in a given period: a week, a month, or even a year.
To generate the report data, you run a long and complex SQL query containing many conditions and joins. Wouldn’t it be easier if you could just use a model called FinancialDepartmentReport and pass a simple where with the time frames? That’s possible with database views. Let’s configure the scenic gem in our Rails application; I will demonstrate a materialized view with a much simpler and quicker example.
Configuration
Start with adding the scenic gem to your application:
bundle add scenic
For demonstration purposes, I will also generate a Person model with columns for storing a person’s name, location, and age:
rails g model Person name:string location:string age:integer
rails db:migrate
And generate some test data that we can operate on:
Person.create!(name: "John Doe", age: 30, location: "United States")
Person.create!(name: "Tina Doe", age: 30, location: "Germany")
Person.create!(name: "Tim Doe", age: 15, location: "Germany")
Person.create!(name: "Alex Doe", age: 14, location: "United States")
Our goal is to create a database view that will contain only adult people from the United States. Instead of doing Person.where("location = ? AND age > 17", "United States"), I would like to be able to call UnitedStatesAdult.all.
View creation
The scenic gem provides commands similar to the ones we use to create models:
rails generate scenic:model united_states_adult --materialized
The above command will generate the following files:
View file - an SQL file where you put the query that will be used to pull the information into the view
Migration file - it will create the view based on the SQL query you provided in the view file
Model - a class that you can use as an ordinary model to query the data from the view
In our case, the SQL query for the view is simple:
SELECT "people".* FROM "people" WHERE (location = 'United States' AND age > 17)
After you save the file, you can run migrations to create the view:
rails db:migrate
Open the rails console and test your new view:
UnitedStatesAdult.pluck(:name)
# => ["John Doe"]
View update
The view is not updated automatically when you change the information in the people table. The UnitedStatesAdult model provides a class method called refresh - you need to call it to regenerate the view so it contains up-to-date information.
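A quick illustration of the workflow (my addition), assuming the model generated by scenic exposes its standard refresh class method:

# New matching records won't show up until the materialized view is refreshed
Person.create!(name: "Anna Doe", age: 42, location: "United States")
UnitedStatesAdult.pluck(:name) # => ["John Doe"]

UnitedStatesAdult.refresh      # re-runs the view's SQL and stores the result
UnitedStatesAdult.pluck(:name) # => ["John Doe", "Anna Doe"]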
There is a lot more
The code that I presented above is just a simple example of one of the database view types - materialized. There is a lot more to explore in terms of the database views, but it’s a topic for the next article.
Geospatial data
If your application deals with information about locations, then you are dealing with geospatial data. If we want to perform more advanced calculations at the database level, we need to familiarize ourselves with GIS.
GIS stands for Geographic Information System. The good news is that in Rails we have a gem called activerecord-postgis-adapter which provides support for GIS in the PostgreSQL database. Let’s quickly configure it and demonstrate how powerful it is.
Configuration
First, you must install the PostGIS extension in your system - https://postgis.net/install/. I decided to set up everything with Docker and pull the dedicated PostgreSQL image to save some time. I followed a few steps to make everything work as expected:
I created Dockerfile for a newly created Rails project
I created a docker-compose.yml file to set up the database and the server
I updated the Gemfile with the activerecord-postgis-adapter gem
I updated the config/database.yml file to reflect that we would like to use a custom adapter
Here is my Dockerfile:
FROM ruby:3.2.1-alpine
COPY Gemfile* /app/
WORKDIR /app
RUN apk update && apk add build-base git libpq-dev nodejs postgresql-client postgresql
RUN gem install bundler -v 2.3.17
RUN bundle install
COPY . .
EXPOSE 3000
CMD rm -f /tmp/server.pid && rails server -b 0.0.0.0 -P /tmp/server.pid
And the corresponding docker-compose.yml:
version: '3.7'

services:
  db:
    image: mdillon/postgis:latest
    volumes:
      - 'postgres:/var/lib/postgresql/data'
    environment:
      POSTGRES_HOST_AUTH_METHOD: trust

  web:
    build:
      context: .
      dockerfile: Dockerfile
    tty: true
    ports:
      - "3000:3000"
    environment:
      DATABASE_HOST: db
      PGUSER: postgres
    depends_on:
      - db
    volumes:
      - '.:/app'

volumes:
  postgres:
The config/database.yml file:
default: &default
  adapter: postgis
  encoding: unicode
  pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
  host: <%= ENV.fetch("DATABASE_HOST") { 'localhost' } %>

development:
  <<: *default
  database: locator_development
Notice the adapter value; without this changed adapter, the application won’t work the way we expect. Now, you can run docker-compose up --build to install and run all components of the application.
Data preparation
We need to fill our database with information about locations so I can demonstrate how we can deal with geospatial data. I decided to create Airport and City models. Each model contains a name field and a lonlat field for storing coordinates.
The migrations are the following:
enable_extension 'postgis'

create_table :cities do |t|
  t.string :name
  t.st_point :lonlat, geographic: true

  t.timestamps
end

create_table :airports do |t|
  t.string :name
  t.st_point :lonlat, geographic: true

  t.timestamps
end
Notice the st_point column type. It is one of the custom column types added by the gem we installed to handle the PostGIS extension in our database. It allows us to store, query, and manage coordinates.
The next step is to insert some data that we can query later (I used airports and cities in Poland):
Airport.create!(name: 'Lech Wałęsa Airport', lonlat: "POINT (18.459664828 54.373165174)")
Airport.create!(name: 'Katowice Airport', lonlat: "POINT (19.074666368 50.471164782)")
City.create!(name: 'Warsaw', lonlat: "POINT (21.017532 52.237049)")
City.create!(name: 'Olsztyn', lonlat: "POINT (20.490189 53.770226)")
City.create!(name: 'Gdańsk', lonlat: "POINT (18.638306 54.372158)")
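As a side note (my addition), instead of WKT strings you can also build point objects with the RGeo library that the adapter relies on; a minimal sketch:

# Geographic factory: longitude/latitude on a spherical model, SRID 4326
factory = RGeo::Geographic.spherical_factory(srid: 4326)
City.create!(name: 'Kraków', lonlat: factory.point(19.944981, 50.064651))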
Querying the data
There is a lot to explore here, but I will only scratch the surface. Let’s say that we want to check which airport is closest to Warsaw:
city = City.find_by(name: 'Warsaw')
airport_table = Airport.arel_table
closest_airport = Airport.order(airport_table[:lonlat].st_distance(city.lonlat)).first
puts closest_airport.name
# => "Katowice Airport"
Thanks to jrochkind for pointing out that the pgcrypto extension is not needed for UUID support in Postgres 13 and later.