I've explained how to use pg_hint_plan
in this blog, and at PostgreSQL conferences. I focused on how to use it correctly without starting religious discussions about hints being a bad practice. If you understand how it works, the scope of good and bad usage becomes obvious. In short: they are good during your performance troubleshooting job, when you test queries, or as a short-term workaround for production queries before the root cause is fixed.
Besides the explications, and how it works, and when to use them, I also take use the following analogy to explain why we may need a hint. The optimizer is the "Google Map" to access your data. Sometimes you may need to few options to select your preferred way or to add some information you have and it doesn't:
I change this slide for each conference I present it. The path from the airport or train station to the conference venue is always a good example. Last time (PgConf.de) Google Map was not aware of the bomb clearing around the train station 😂
I skip the "are hints good or bad" discussion, but I don't ignore it. The PostgreSQL Wiki lists the problems and benefits and here is my opinion on those points.
- Poor application code maintainability: hints in queries require massive refactoring
I would say the opposite: pg_hint_plan
hints, like Oracle ones, provide a way to change the execution behavior without refactoring the SQL code. Hints are added in /*+ */
comments without re-writing or changing the structure of the SQL code.
- Interference with upgrades: today's helpful hints become anti-performance after an upgrade
It has always been recommended to review all hints after upgrades. A good practice, when the regression tests are trusted to detect performance issues, is to remove hints after an upgrade. Or at least check if the bug they were supposed to workaround is fixed in the new version. And, looking at the previous PostgreSQL versions, I don't see any pg_hint_plan
hint changing to the opposite behavior with a new release.
- Encouraging bad DBA habits slap a hint on instead of figuring out the real issue
I don't think DBAs are the ones adding hints to queries. That requires an understanding of the query, the data model, and the access patterns. The developer will do that. Anyway, that's not specific to hints. Many features can be misused. That's why my priority is to explain how it works rather than ignoring them. Using hints to test different execution plans is a great way to figure out the real issue.
- Does not scale with data size: the hint that's right when a table is small is likely to be wrong when it gets larger
One goal of hinting is to get a stable and scalable execution plan. If you force an Index Scan over a Seq Scan, you guarantee that the response time doesn't depend on the size of the table. Even if, for a small table, a Seq Scan is faster. Using hints encourages to look at the execution plan and that's the only way to understand if the execution will scale. Especially when using the /*+ Rows() */ hint to check that the execution plan stays the same when the join result grows.
- Failure to actually improve query performance: most of the time, the optimizer is actually right
Improving query performance starts with finding the best execution plan and understanding why it was not chosen by the query planner. Then fixing the root cause (missing extended statistics, indexes, refactoring the SQL...). To do that, you need hints:
- run the query without hint (the 'bad plan') and look at the cost
- run the query with hint (the 'good plan') and look at the cost
- understand which cost was over estimated in the 'good plan' and why
If most of the time the planner is right, then most of the time users will not be tempted to change the plan with hints. They are used for the few cases where it may not be right.
- Interfering with improving the query planner: people who use hints seldom report the query problem to the project
Yes, if have a workaround for a bug, people may continue with it without raising the issue. This is not specific to hints. However here is what should happen when you encounter such an issue:
- test the alternative execution plan to be sure it is an optimizer bug. I use hints to do this
- build a reproducible test case that shows the correct behavior and the bad behavior. I use hints to do this
- deploy a workaround to get the production service back until the bug is fixed. I use hints to do this
Enterprises can't wait for a fix and patch immediately. That's a major reason for being able to work around the issues. And the best workaround is one that limits its scope to the issue. With hints, you can change a single access path in a complex query. Without them you will have to set parameters with a large scope, risking side effects on what works well.
The Wiki also references some other ways to do it without hints and it is about setting query planner parameters. This not easy to do for a single query and the risk is that it has side effects on other queries. With pg_hint_plan
you can set them for one query with /*+ Set( parameter value) */
. You can also change one index access or join method for one table or one join only. Without hints, people will refactor their SQL, add dummy conditions to skip the indexes, and this will bring more problems than hints.
When PostgreSQL introduced the possibility to inline the Common Table Expressions it was clear that this may not be the best to do for all queries with a WITH clause. Other databases like Oracle add a /*+ materialize */
or /*+ inline */ hint to control this. Because of the lack of hints in PostgreSQL, this was added in the WITH clause syntax, breaking the SQL semantic (which should declare the expected result and not the way to get it). It also break the compatibility as the WITH ... MATERIALIZE provides the same behavior as previous versions but fails with them.
That said, the queries in your application where you use hints should be well-documented with the reason for adding the hint. If there's an issue with the query planner, report the issue and follow up to know when you can remove your hint.
Many PostgreSQL-compatible databases install pg_hint_plan
by default, like Amazon Aurora or YugabyteDB. It is harmless. As any feature, it can hurt if you use it in a bad way, but having it already installed will save you a lot of time the day when you will have to understand an execution plan.