I’ve been helping a friend launch a site he purchased as a way of earning some passive income. The site itself is really old and we’ve rewritten it from scratch, including migrating it from MySQL to PostgreSQL.
As part of the migration process, I’ve occasionally needed to send a dump of my database to him or to the staging server. In doing so, we’re often met with the following error:
PG::UniqueViolation: ERROR: duplicate key value violates unique constraint "posts_pkey"
This error means that PostgreSQL tried to use a value that already exists for the primary key of the new record. PostgreSQL keeps track of what the next auto-incremented primary key value needs to be separately from the table, rather than merely incrementing upon the current max value. Presumably this is to avoid potential race conditions. When the sequence and the table’s primary key get out of sync, we end up with the error above.
When loading a database from another source, if sequences aren’t included in the dump, the sequences are likely to get out of sync from where they are supposed to be. To resolve this issue, the PostgreSQL Wiki has a solution:
SELECT 'SELECT SETVAL(' ||
quote_literal(quote_ident(PGT.schemaname) || '.' || quote_ident(S.relname)) ||
', COALESCE(MAX(' ||quote_ident(C.attname)|| '), 1) ) FROM ' ||
quote_ident(PGT.schemaname)|| '.'||quote_ident(T.relname)|| ';'
FROM pg_class AS S,
pg_depend AS D,
pg_class AS T,
pg_attribute AS C,
pg_tables AS PGT
WHERE S.relkind = 'S'
AND S.oid = D.objid
AND D.refobjid = T.oid
AND D.refobjid = C.attrelid
AND D.refobjsubid = C.attnum
AND T.relname = PGT.tablename
ORDER BY S.relname;
If you were to run this in the PostgreSQL client, you would see output like this:
query
------------------------------------------------------------------------------------------------------------
SELECT SETVAL('public.authentications_id_seq', COALESCE(MAX(id), 1) ) FROM public.authentications;
SELECT SETVAL('public.brands_id_seq', COALESCE(MAX(id), 1) ) FROM public.brands;
SELECT SETVAL('public.comments_id_seq', COALESCE(MAX(id), 1) ) FROM public.comments;
SELECT SETVAL('public.follows_id_seq', COALESCE(MAX(id), 1) ) FROM public.follows;
SELECT SETVAL('public.pg_search_documents_id_seq', COALESCE(MAX(id), 1) ) FROM public.pg_search_documents;
SELECT SETVAL('public.posts_id_seq', COALESCE(MAX(id), 1) ) FROM public.posts;
SELECT SETVAL('public.series_id_seq', COALESCE(MAX(id), 1) ) FROM public.series;
SELECT SETVAL('public.users_id_seq', COALESCE(MAX(id), 1) ) FROM public.users;
SELECT SETVAL('public.votes_id_seq', COALESCE(MAX(id), 1) ) FROM public.votes;
(10 rows)
Time: 2.624 ms
Although we could cut and paste that output into the Postres client each time we needed to resolve this issue, it makes more sense to automate things; and we can do that with a simple rake task:
namespace :db do
desc "reset sequences for a specific table or all tables"
task :sequence_reset => :environment do
sql = <<-SQL
SELECT 'SELECT SETVAL(' ||
quote_literal(quote_ident(PGT.schemaname) || '.' || quote_ident(S.relname)) ||
', COALESCE(MAX(' ||quote_ident(C.attname)|| '), 1) ) FROM ' ||
quote_ident(PGT.schemaname)|| '.'||quote_ident(T.relname)|| ';' as query
FROM pg_class AS S,
pg_depend AS D,
pg_class AS T,
pg_attribute AS C,
pg_tables AS PGT
WHERE S.relkind = 'S'
AND S.oid = D.objid
AND D.refobjid = T.oid
AND D.refobjid = C.attrelid
AND D.refobjsubid = C.attnum
AND T.relname = PGT.tablename
ORDER BY S.relname;
SQL
ActiveRecord::Base.connection.execute(sql).each do |query|
ActiveRecord::Base.connection.execute(query['query'])
end
end
end
You can see we’re just assigning the original SQL we borrowed from the
PostgreSQL Wiki to a sql
variable. From there we are executing that SQL, and
then executing each of the resulting SQL statements.
To execute this Rake task, run rake db:sequence_reset
, and you’re good to go.