I have been working with the pg_stat_statements extension to PostgreSQL, and it gives us a way to see the actual SQL statements being executed by lemmy_server and the number of times they are being called.
This has less overhead than cranking up logging, and several cloud computing services enable it by default (example) - so I don’t believe it will significantly slow down the server.
A DATABASE RESTART WILL BE REQUIRED
It does require that PostgreSQL be restarted, which typically takes 10 or 15 seconds.
Debian / Ubuntu install steps
https://pganalyze.com/docs/install/self_managed/02_enable_pg_stat_statements_deb
Following the conventions of “Lemmy from Scratch” server install commands:
sudo -iu postgres psql -c "ALTER SYSTEM SET shared_preload_libraries = 'pg_stat_statements';"
Followed by a restart of the PostgreSQL service.
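On Debian / Ubuntu that restart is one command, and the extension then needs to be created in the Lemmy database (the database name lemmy is an assumption here - adjust to match your install):

sudo systemctl restart postgresql
sudo -iu postgres psql lemmy -c "CREATE EXTENSION IF NOT EXISTS pg_stat_statements;"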
Update July 24, 2023
Some major stored procedure SQL problems were overlooked until 2 days ago. I submitted revised statements to fix a massive write operation on every single post and comment creation on a local site. In the site_aggregates table, every row was being modified instead of a single row. https://github.com/LemmyNet/lemmy/pull/3704
I was curious why pg_stat_statements didn’t draw more attention to the INSERT statements hitting so many rows, and I found out that by default it does not track statements executed inside stored procedures and functions! https://stackoverflow.com/questions/56741860/pg-stat-activity-how-to-see-current-activity-inside-a-running-stored-procedure
pg_stat_statements.track = all
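For anyone applying this: unlike shared_preload_libraries, this setting should not need a full restart - in my reading of the docs, setting it system-wide and reloading the configuration is enough:

sudo -iu postgres psql -c "ALTER SYSTEM SET pg_stat_statements.track = 'all';"
sudo -iu postgres psql -c "SELECT pg_reload_conf();"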
Now I’m seeing far more activity than I have been looking at for the past 2 months… the stored procedure statements are showing up!
Install steps: https://gist.github.com/rcanepa/535163dc249539912c25
@roundsparrow@lemmy.ml If you find queries that are slow, could you post the EXPLAIN ANALYZE output here or send them to me? You can do it manually, or maybe use https://www.postgresql.org/docs/current/auto-explain.html
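If it helps, a minimal sketch of enabling auto_explain alongside pg_stat_statements (untested on my end; note this replaces the shared_preload_libraries value set earlier and needs another PostgreSQL restart, and the 500ms threshold is just a starting guess to tune):

sudo -iu postgres psql -c "ALTER SYSTEM SET shared_preload_libraries = 'pg_stat_statements,auto_explain';"
sudo -iu postgres psql -c "ALTER SYSTEM SET auto_explain.log_min_duration = '500ms';"
sudo systemctl restart postgresql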
Once the extension is installed and enabled, you can reset the stats whenever you want; it will then start building a new list of queries and, I assume, reset the average execution time.
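The reset itself is a single function call that ships with the extension (same lemmy database-name assumption as above):

sudo -iu postgres psql lemmy -c "SELECT pg_stat_statements_reset();"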
I found a curious query that seems to take 10 or 11 seconds on my server:
SELECT "person"."id", "person"."name", "person"."display_name", "person"."avatar", "person"."banned", "person"."published", "person"."updated", "person"."actor_id", "person"."bio", "person"."local", "person"."banner", "person"."deleted", "person"."inbox_url", "person"."shared_inbox_url", "person"."matrix_user_id", "person"."admin", "person"."bot_account", "person"."ban_expires", "person"."instance_id", "person_aggregates"."id", "person_aggregates"."person_id", "person_aggregates"."post_count", "person_aggregates"."post_score", "person_aggregates"."comment_count", "person_aggregates"."comment_score" FROM ("person" INNER JOIN "person_aggregates" ON ("person_aggregates"."person_id" = "person"."id")) WHERE (("person"."admin" = $1) AND ("person"."deleted" = $2)) ORDER BY "person"."published"
This query is being called over 1000 times an hour with my instance only really doing incoming Federation content. Should this be optimized by doing a subSELECT on the WHERE clause before doing the INNER JOIN?
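To make the question concrete, here is the shape of that sub-select rewrite (sketched with * instead of the full column list, and untested):

SELECT "person".*, "person_aggregates".*
FROM (SELECT * FROM "person" WHERE "admin" = $1 AND "deleted" = $2) AS "person"
INNER JOIN "person_aggregates" ON ("person_aggregates"."person_id" = "person"."id")
ORDER BY "person"."published"

That said, the planner often flattens a subquery like this back into the same join, so the real win may be an index such as one on "person" ("admin", "deleted") or a partial index on "admin".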
IMPORTANT NOTE: I am rusty on my PostgreSQL; it’s been over 10 years since I designed and ran a major time-critical website with it. That is not “SECONDS”, it is “MILLISECONDS” - but it is still a slow query being called frequently.
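For anyone else digging through pg_stat_statements, something along these lines lists the statements eating the most total time (column names assume PostgreSQL 13 or newer; older versions call them total_time and mean_time; same lemmy database-name assumption as above):

sudo -iu postgres psql lemmy -c "SELECT calls, round(total_exec_time::numeric, 2) AS total_ms, round(mean_exec_time::numeric, 2) AS avg_ms, left(query, 80) AS query FROM pg_stat_statements ORDER BY total_exec_time DESC LIMIT 10;"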