SQL view is like a virtual table, combines one or more tables (or even other views), and holds the SQL statement from which it was created. To the user, it looks like a normal table with columns and rows. Views are a way to create custom records subsets:
- simple views, which allows INSERT, UPDATE, or DELETE
- more complex views are read-only
The view is created using the CREATE VIEW
statement, whileCREATE OR REPLACE VIEW
replaces it if it already exists, andDROP VIEW
removes the view.
Setup
All examples are done in Rails 7.0.4 (with Ruby 3.2.2) and PostgreSQL (15.1). To execute SQL statements from examples use any SQL client you like.
Create a new Rails app using PostgreSQL DB.
rails new blog -d postgresql
Add migrations and models
# db/migrate/20230102132134_create_authors.rb
class CreateAuthors < ActiveRecord::Migration[7.0]
def change
create_table :authors do |t|
t.string :first_name
t.string :last_name
t.string :role
end
end
end
# db/migrate/20230102132140_create_posts.rb
class CreatePosts < ActiveRecord::Migration[7.0]
def change
create_table :posts do |t|
t.references :author
t.string :title
t.text :content
t.string :tag
t.datetime :published_at
t.timestamps
end
end
end
# app/models/author.rb
class Author < ApplicationRecord
has_many :posts
end
# app/models/post.rb
class Post < ApplicationRecord
belongs_to :author, optional: true
end
Create DB and run migrations:
rails db:create db:migrate
Simple view
View which queries from one table doesn’t use WITH
, DISTINCT
, GROUP BY
, HAVING
, LIMIT
, OFFSET
, UNION
, INTERSECT
, EXCEPT
clauses, or selects columns without aggregate functions is considered as a simple view. Such views are updatable, so could be used to add/update/remove records to the table it uses. One of the reasons to use simple views is to restrict access to certain data. This could be done with a WHERE
clause.
The following example will create a view that shows only authors with a dev
role.
CREATE VIEW authors_devs AS
SELECT *
FROM authors
WHERE role = 'dev'
Because it's updatable, INSERT is possible:
INSERT INTO authors_devs (first_name, last_name, role)
VALUES ('Jane', 'Doe', 'dev'),
('Joe', 'Foo', 'qa'),
('Jack', 'Bar', 'dev');
INSERT 0 3
It adds 3 rows to the authors’ table:
SELECT * FROM authors;
id | first_name | last_name | role
----+------------+-----------+------
1 | Jane | Doe | dev
2 | Joe | Foo | qa
3 | Jack | Bar | dev
(3 rows)
but only two rows when querying from authors_devs:
SELECT * FROM authors_devs;
id | first_name | last_name | role
----+------------+-----------+------
1 | Jane | Doe | dev
3 | Jack | Bar | dev
(2 rows)
It is possible to update the author with id 3:
UPDATE authors_devs SET first_name = 'John' WHERE id=3;
UPDATE 1
but not one with id 2:
UPDATE authors_devs SET first_name = 'Jessika' WHERE id=2;
UPDATE 0
Similarly deleting is restricted:
DELETE FROM authors_devs WHERE id=3;
DELETE 1
DELETE FROM authors_devs WHERE id=2;
DELETE 0
Complex view
When creating views that join multiple tables, group results and use aggregate functions is the complex view. It is used only to organize data. Typical usage is some kind of report or summary.
The next view example is a summary of authors and posts.
CREATE VIEW author_statistics AS
SELECT
CONCAT(a.first_name, ' ', a.last_name) AS author,
COUNT(DISTINCT p) AS posts_count,
JSON_OBJECT_AGG(DISTINCT tag_counts.name, tag_counts.count) AS tags_count,
TO_CHAR(MAX(p.published_at), 'DD.MM.YYYY HH24:MI') AS last_post_at
FROM authors a
JOIN posts p ON a.id = p.author_id
JOIN (
SELECT
a.id AS author_id,
p.tag AS name,
COUNT(p.tag) AS count
FROM authors a
JOIN posts p ON a.id = p.author_id
GROUP BY a.id, p.tag
) tag_counts ON tag_counts.author_id = a.id
GROUP BY a.id
ORDER BY MAX(p.published_at) DESC
Such a complex view generates a nice table:
SELECT * FROM author_statistics;
author | posts_count | tags_count | last_post_at
-----------------------+-------------+-----------------------------------------------------+------------------
Norbert Dare | 19 | { "backend" : 9, "frontend" : 6, "fullstack" : 4 } | 03.01.2023 11:43
Granville Schmidt | 12 | { "backend" : 3, "frontend" : 4, "fullstack" : 5 } | 03.01.2023 11:30
...
Such views are not automatically updateable. To enable INSERT
, UPDATE
, DELETE
, or TRUNCATE
events on that view, an INSTEAD OF
trigger could be added.
Materialized views
Normal views don’t speed up querying from the database. It’s only a layer to organize the data.
If a query takes a lot of time to execute, it is worth creating a materialized view, it will store the results of the query. This kind of view is perfect for data that is often requested but doesn't change often. A good example is a daily report, which is generated (view is refreshed) once a day.
CREATE MATERIALIZED VIEW author_statistics AS
SELECT ...
To replace the content of the materialized view:
REFRESH MATERIALIZED VIEW author_statistics
Materialized views are regular tables, therefor it is possible to create indexes on them.
Rails
SQL views are treated as a normal database table, so we can add an ActiveRecord model and use AR query language as usual. Having author_statistics
view and AuthorStatistic
AR class model, the following is possible:
> AuthorStatistic.all
AuthorStatistic Load (190.7ms) SELECT "author_statistics".* FROM "author_statistics"
> AuthorStatistic.where('last_post_at > ?', 1.month.ago)
AuthorStatistic Load (202.0ms) SELECT "author_statistics".* FROM "author_statistics" WHERE (last_post_at > '2022-12-11 14:32:39.480871')
We learned how to create SQL views by executing CREATE VIEW
statements in an SQL client. It is fine for fast-checking some solutions without the need to implement them and commit to the repo. When we decide to utilize SQL views in our Rails app a way to store and manage views is needed. A gem that delivers a solution for it is scenic
. Let's take a closer look at it.
Scenic gem
https://github.com/scenic-views/scenic
This is a perfect tool to create SQL views in migrations, keep view's SELECT statements in separate files and keep track of views versions when they evolve.
Scenic has two generators that help with views:
scenic:model
scenic:view
scenic:model
- generates migration file, SQL statement file and model file, whereas scenic:view
does the same without model file.
% rails generate scenic:model author_statistics
create app/models/author_statistic.rb
create db/views/author_statistics_v01.sql
create db/migrate/20230103144413_create_author_statistics.rb
# db/migrate/20230103144413_create_author_statistics
class CreateAuthorStatistics < ActiveRecord::Migration[7.0]
def change
create_view :author_statistics
end
end
# app/models/author_statistic.rb
class AuthorStatistic < ApplicationRecord
end
It is possible to create a materialized view with option --materialized
.
% rails generate scenic:model materialized_author_statistics --materialized
create app/models/materialized_author_statistic.rb
create db/views/materialized_author_statistics_v01.sql
create db/migrate/20230103150813_create_materialized_author_statistics.rb
# 20230103150813_create_materialized_author_statistics
class CreateMaterializedAuthorStatistics < ActiveRecord::Migration[7.0]
def change
create_view :materialized_author_statistics, materialized: true
end
end
# app/models/materialized_author_statistic.rb
class MaterializedAuthorStatistic < ApplicationRecord
# calls refresh on view
# use concurrently: true to not lock view during refresh
def self.refresh
Scenic.database.refresh_materialized_view(table_name, concurrently: false, cascade: false)
end
end
Under db/views/author_statistics_v01.sql
and db/views/materialized_author_statistics_v01.sql
put SQL code
SELECT
CONCAT(a.first_name, ' ', a.last_name) AS author,
COUNT(DISTINCT p) AS posts_count,
JSON_OBJECT_AGG(DISTINCT tag_counts.name, tag_counts.count) AS tags_count,
TO_CHAR(MAX(p.published_at), 'DD.MM.YYYY HH24:MI') AS last_post_at
FROM authors a
JOIN posts p ON a.id = p.author_id
JOIN (
SELECT
a.id AS author_id,
p.tag AS name,
COUNT(p.tag) AS count
FROM authors a
JOIN posts p ON a.id = p.author_id
GROUP BY a.id, p.tag
) tag_counts ON tag_counts.author_id = a.id
GROUP BY a.id
ORDER BY MAX(p.published_at) DESC
Run migrations to create defined views (we will use them later):
% rails db:migrate
Both generators bump up the view version when using the same name:
% rails generate scenic:view author_statistics
create db/views/author_statistics_v02.sql
create db/migrate/20230103145259_update_author_statistics_to_version_2.rb
The new version file - db/views/author_statistics_v02.sql
- contains previous version of SQL, ready for modifications.
# 20230103145259_update_author_statistics_to_version_2.rb
class UpdateAuthorStatisticsToVersion2 < ActiveRecord::Migration[7.0]
def change
update_view :author_statistics, version: 2, revert_to_version: 1
end
end
Benchmarks
Sometimes it is worth measuring different approaches to solving problems. Here we have a fetch from created earlier SQL views and running the same query using plain SQL.
Setup
author_statistics
andmaterialized_author_statistics
SQL views exist with corresponding AR models.- Benchmarks rake tasks use the
Benchmark
module from the Ruby standard library to measure time and gembenchmark-memory
to measure memory. - Seed DB with a significant amount of data
# db/seed.rb
# just random data using Faker
require 'faker'
1000.times do
Author.create(
first_name: Faker::Name.first_name,
last_name: Faker::Name.last_name,
role: ['dev', 'pm', 'qa'].sample
)
end
10000.times do
Post.create(
title: Faker::Lorem.sentence(word_count: 3),
content: Faker::Lorem.paragraphs.join("\n"),
tag: ['backend', 'frontend', 'fullstack'].sample,
author: Author.all.sample,
published_at: rand(24*60*100).minutes.ago
)
end
SQL execute vs SQL view vs Materialized SQL view
For benchmark purpose let's have 5 methods, delivering the same result, but in a different way.
- sql_execute - execute SQL defined in string
- view_model_find - use AR model to fetch data
- view_sql_execute - same as above, but without using a model
- materialized_view_model_find - use AR model for materialized with
- materialized_view_sql_execute - query from view without model
#lib/tasks/benchmarks.rake
ITERATION_NUMBER = ENV.fetch('ITERATION_NUMBER', 100).to_i
def process_result(result)
{
author: result['author'],
posts_count: result['posts_count'],
last_post_at: result['last_post_at']
}
end
namespace :benchmarks do
desc 'author_stats'
task author_stats: :environment do
def sql_execute
sql = <<~SQL
SELECT
...
SQL
ITERATION_NUMBER.times do
ActiveRecord::Base.connection.exec_query(sql)
.map { |result| process_result(result) }
end
end
def view_model_find
ITERATION_NUMBER.times do
AuthorStatistic.select(:author, :posts_count, :last_post_at)
.map { |result| process_result(result) }
end
end
def view_sql_execute
ITERATION_NUMBER.times do
ActiveRecord::Base.connection
.exec_query('SELECT * FROM author_statistics')
.map { |result| process_result(result) }
end
end
def materialized_view_model_find
ITERATION_NUMBER.times do
MaterializedAuthorStatistic.select(:author, :posts_count, :last_post_at)
.map { |result| process_result(result) }
end
end
def materialized_view_sql_execute
ITERATION_NUMBER.times do
ActiveRecord::Base.connection
.exec_query('SELECT * FROM materialized_author_statistics')
.map { |result| process_result(result) }
end
end
functions = [
'sql_execute',
'view_model_find',
'view_sql_execute',
'materialized_view_model_find',
'materialized_view_sql_execute'
]
puts "Time measures:"
Benchmark.bmbm do |x|
functions.each do |function|
x.report("#{function}:") do
send(function)
end
end
end
puts "Memmory measures:"
Benchmark.memory do |x|
functions.each do |function|
x.report("#{function}:") do
send(function)
end
end
end
end
end
Run the rake task:
rake benchmarks:author_stats
**Time measures:**
user system total real
sql_execute: 0.486172 0.036185 0.522357 ( 4.892079)
view_model_find: 1.192921 0.017537 1.210458 ( 4.632283)
view_sql_execute: 0.483697 0.036375 0.520072 ( 4.770821)
materialized_view_model_find: 0.474464 0.006046 0.480510 ( 0.547047)
materialized_view_sql_execute: 0.103913 0.008027 0.111940 ( 0.197640)
**Memmory measures:**
sql_execute: 69.763M memsize
624.501k object
50.000 strings
view_model_find: 139.387M memsize
1.656M objects
50.000 strings
view_sql_execute: 69.516M memsize
624.600k objects
50.000 strings
materialized_view_model_find: 139.377M memsize
1.656M objects
50.000 strings
materialized_view_sql_execute: 69.522M memsize
624.600k object
50.000 strings
Looking at the results there few conclusions:
- As expected querying from SQL view isn’t faster than executing plain SQL. View in fact executes the same SQL.
- Using
ActiveRecord::Base.connection.exec_query
is faster than using ActiveRecord Query language - Querying from materialized view proves that it’s faster
- Using
ActiveRecord::Base.connection.exec_query
consumes less memory (doesn’t need to initialize AR objects).
SQL execute vs Active Record query
Views are written in raw SQL. This means that these could use aggregates, mathematical functions, and operators. Active Records support some of the functions like count
, sum
, min
, and max
, but other than these you need to use Ruby code to achieve the same results as SQL functions.
In this example, we benchmark 3 different functions giving concatenated first and last name, and an array of post titles.
- active_record_query - use only Ruby code
- sql_query - use only SQL code
- mixed_active_record_with_sql - mixture of above
#lib/tasks/benchmarks.rake
ITERATION_NUMBER = ENV.fetch('ITERATION_NUMBER', 10).to_i
namespace :benchmarks do
desc 'author_titles'
task author_titles: :environment do
# active record
def active_record_query
ITERATION_NUMBER.times do
Author.includes(:posts).map do |author|
{
name: "#{author.first_name} #{author.last_name}",
titles: author.posts.map(&:title)
}
end
end
end
def sql_query
sql = <<~SQL
SELECT
CONCAT(a.first_name, ' ', a.last_name) AS name,
ARRAY_AGG(p.title) AS titles
FROM authors a
JOIN posts p ON a.id = p.author_id
GROUP BY a.id
SQL
ITERATION_NUMBER.times do
ActiveRecord::Base.connection.exec_query(sql).map do |result|
{
name: result['name'],
titles: result['titles']
}
end
end
end
def mixed_active_record_with_sql
ITERATION_NUMBER.times do
Author.joins(:posts)
.select(
"CONCAT(authors.first_name, ' ', authors.last_name) AS name",
"ARRAY_AGG(posts.title) AS titles"
)
.group(:id).map do |result|
{
name: result.name,
titles: result.titles
}
end
end
end
functions = [
'active_record_query',
'sql_query',
'mixed_active_record_with_sql'
]
puts "Time measures:"
Benchmark.bmbm do |x|
functions.each do |function|
x.report("#{function}:") do
send(function)
end
end
end
puts "Memmory measures:"
Benchmark.memory do |x|
functions.each do |function|
x.report("#{function}:") do
send(function)
end
end
end
end
end
Run the rake task:
rake benchmarks:author_titles
**Time measures:**
user system total real
active_record_query 2.315652 0.019930 2.335582 ( 2.824983)
sql_query: 0.008803 0.001647 0.010450 ( 0.204779)
mixed_active_record_with_sql 0.134432 0.001860 0.136292 ( 0.288891)
**Memory measures:**
active_record_query 250.648M memsize
2.458M objects
50.000 strings
sql_query: 9.588M memsize
52.451k objects
50.000 strings
mixed_active_record_with_sql 48.192M memsize
552.790k objects
50.000 strings
The conclusion from the results is that using SQL is way faster and consumes less memory compared to Ruby code.
Summary
Now, you may ask why anybody needs SQL views if the same thing could be done with Active Record Query Interface. Well, there are a few reasons that could be considered:
- This kind of views are easy to create and might be used as a read model
- Combine many tables in one logical table
- Might be used across different services, keeping some business logic in SQL statements
- Materialized views are a great help when need to speed up read time
- SQL aggregate functions are in general faster than doing the same in Ruby code