psql is the best postgres client for developers. It gives you everything you need to explore a database schema and develop SQL.
0. Connect with a connection string:
psql postgresql://<user>:<password>@<hostname>/<databasename>
A postgres cluster can have multiple logical databases. Swap the database name to connect to a different database in the same cluster.
1. Explore the database with \d commands (describe)
\dns -- list schemas
\dt -- list tables
\d <tablename> -- table details
\df -- list functions
\df+ <fn name> -- fn details
\sf <fn name> -- fn source
If you are picking up on the pattern, you can add ‘+’ to most \d commands to get more info, and there is usually a \d command to describe something you are interested in.
2. Get help with SQL Syntax
\h create
\h create table
\h with recursive
This command also provides links to further documentation specific to your postgres version!
3. View your data in different ways
-- expanded view good for wide tables
\x
-- output results to file
\o <output filename>
-- stop file output
\o
-- repeat the last command
-- every N seconds
-- good for a 'live' view
\watch 1
--function for viewing jsonb
-- there is a json equivalent
select jsonb_pretty(json_blob)
from table1;
-- print timing of queries
\timing
4. Write self contained scripts for fast iteration
This is a workflow tip for using psql as a development tool. I rarely type queries directly into psql. Instead I type them in my text editor of choice and paste full self contained scripts into psql to test them.
rollback;
begin;
set search_path to org0_com;
insert into users (name)
select
'u' || s
from
-- great for creating
-- lots of dummy data
generate_series(1, 100000) s;
A bonus tip here is learning the generate_series function to generate lots of dummy data. Very useful to test rough performance characteristics for single queries.
5. .psqlrc to run commands on startup
.psqlrc is the default config file for psql. It takes regular SQL or psql slash commands and runs them whenever the client starts up. My psqlrc has this very useful snippet:
\set ON_ERROR_ROLLBACK on
By default if there is an error from running a statement, the whole transaction is aborted and you have to manually type ROLLBACK in order to reset. This implicitly sets a SAVEPOINT before every statement, and rolls back to that savepoint on errors, letting you continue using the transaction you were in. Beware using this to run production scripts where you really do want to abort the full transaction on errors.
6. Log things to debug in the middle of large scripts or functions
do $$ begin
raise log '
ORION %',
(
select count(*)
from table1
where action='I'
);
end;$$;
RAISE LOG is only available in plpgsql, the postgres procedural language. The DO block is an anonymous function which just lets you use plpgsql embedded in a pure SQL script.
7. Explain analyze buffers to understand performance
/*
WARNING: including ANALYZE
runs the actual query.
*/
-- output results to file
\o understand_performance
explain (analyze, buffers)
select * from users;
/*
Use EXPLAIN to show the plan
but not run the query.
*/
explain
select * from users;
EXPLAIN ANALYZE is already powerful tool for understanding single query performance. BUFFERS is a cherry on top because it is a more stable measure of performance than pure timing. Timing can change based on what other load the DB is handling, but buffers is a direct measure of how much IO postgres has to do to execute the query. If you are experimenting with queries, reducing buffers by 20% probably means you improved the query. But improving timing by 20% could be a statistical fluke.
8. Loops and dynamic SQL using plpgsql functions
-- loop over every schema
do $$
declare
schema_name text;
sql_snippet text := '
update users
set name=''user1''
where target=''user'';
';
begin
raise notice 'STARTING...';
-- loop through all schemas
-- that match the criteria
for schema_name in
select
distinct s.schema_name
from information_schema.schemata s
join information_schema.tables t
on s.schema_name = t.table_schema
where t.table_name
in ('users')
loop
raise notice
'processing schema: %',
schema_name;
execute
format(
'set search_path = %I; %s'
, schema_name,
sql_snippet);
end loop;
raise notice 'DONE';
end $$;
Some things can't be done in pure SQL. plpgsql is a useful procedural extension allowing you to introduce loops, conditionals, and log statements. It integrates nicely with pure SQL, and allows you run procedural logic without data needing to leave the database server. It has pitfalls though, it's not as composable as SQL, and the optimizer can't optimize across nested plpgsql like it can with SQL.
9. Understand locks and query activity
A fun exercise to understand locking is to open multiple terminals and hold transactions open that are holding locks.
--- Terminal 1:
rollback;
begin;
set search_path to org0_com;
alter table users
add column new_column text;
-- Terminal 2:
set search_path to org0_com;
select * from users;
In the second terminal you will see that the SELECT statement hangs until you terminate the alter table statement. This is because alter table acquires an exclusive lock on the users table.
Here are some important queries I keep in my back pocket when troubleshooting production issues that I suspect are due to locks.
-- show active queries
select
pid, query
from pg_stat_activity
where state='active';
-- show blocked pids
select
pid,
usename,
pg_blocking_pids(pid) blocker,
query blocked
from pg_stat_activity
where
cardinality(
pg_blocking_pids(pid)
) > 0;
-- kill blocking query
select pg_terminate_backend(<pid>);
10. Explore more on your own!
-- bulk load data
COPY
--More \ commands:
\e \c \p \s \i \pset \set
-- turn pagination off
-- to get all results
\pset pager off