Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only first statement in multi-statement is traced #12

Closed
hlinnaka opened this issue May 7, 2024 · 3 comments
Closed

Only first statement in multi-statement is traced #12

hlinnaka opened this issue May 7, 2024 · 3 comments

Comments

@hlinnaka
Copy link
Contributor

hlinnaka commented May 7, 2024

Steps to reproduce the issue

$ psql postgres -c "select count(*) from pg_am; select count(*) from pg_class"
 count 
-------
     7
(1 row)

 count 
-------
   420
(1 row)

$ psql postgres -c "select trace_id, span_id, span_type, span_operation from pg_tracing_consume_spans"
             trace_id             |     span_id      |  span_type   |       span_operation        
----------------------------------+------------------+--------------+-----------------------------
 266759934ee85fe1100bf7a07bd43421 | 1b40b22bc5da4fa8 | Planner      | Planner
 266759934ee85fe1100bf7a07bd43421 | 88167138ea5945e5 | Executor     | ExecutorRun
 266759934ee85fe1100bf7a07bd43421 | d10fe19e7702ecd0 | SeqScan      | SeqScan on pg_am
 266759934ee85fe1100bf7a07bd43421 | 85e37a824bf3962b | Aggregate    | Aggregate
 266759934ee85fe1100bf7a07bd43421 | 7ffdd4d4338ba069 | Select query | select count(*) from pg_am;
(5 rows)

There are no spans for the second query, select count(*) from pg_class.

@hlinnaka
Copy link
Contributor Author

hlinnaka commented May 7, 2024

I bumped into this while I was reading the code and trying to understand how the spans are created and closed, how the nesting levels work etc. It all feels pretty complicated. I'm not sure what exactly we should do here, but I think the current mapping to spans is not great, and even if we fix this case, we'll have a lot of problems like this.

Some problems (aside from this particular issue with multi-statement) that I've identified so far:

  • There is no span for parse-analyze phase. It's usually quick, but in some cases it might not be
  • issues Assertion failure with cursor #10 and Crash on SQL comments on explicit BEGIN #13
  • There is no span active when the transaction is committed. The time to flush the WAL to disk can be significant, even more so if you use synchronous replication.
  • How to attach a trace parent if you use prepared statements? You can have an SQL comment on the PREPARE and EXECUTE statement, but I don't think that will do the right thing. Also, with extended query protocol, there is no query text with the Bind+Execute messages, so there's nowhere to put the special SQL comment.

I'm not sure what exactly to do here, but I think the nesting level tracking and how the parse, plan, execute hook function are mapped to tracing context and spans needs some more thinking.

@bonnefoa
Copy link
Collaborator

bonnefoa commented Jun 3, 2024

I bumped into this while I was reading the code and trying to understand how the spans are created and closed, how the nesting levels work etc. It all feels pretty complicated.

Agreed, I'm not satisfied with the current complexity and still trying to simplify this but haven't found a good solution yet. AFAIK, there's no easy entry point where to start a span. Most queries will go through the post_parse hook first but a prepared statement won't and may start with the planner hook (directly ExecutorStart hook if it's a cached plan). Transaction block + extended protocol makes things worse since the parsing of the next statement happens while the previous statement is still ongoing (PortalDrop happens when processing the Bind message).

Though for the problem you're reporting, I wasn't able to reproduce it. I do see both statements being traced. What was the pg_tracing.sample_rate used and on which Postgres version?

psql -c "select count(*) from pg_am; select count(*) from pg_class"

psql postgres -c "select trace_id, span_id, span_type, span_operation from pg_tracing_consume_spans"
             trace_id             |     span_id      |   span_type   |                          span_operation
----------------------------------+------------------+---------------+-------------------------------------------------------------------
 b759fe638dc7e1d85357ec926ae7732c | c524f9b23fe8fe6c | Planner       | Planner
 b759fe638dc7e1d85357ec926ae7732c | e5e10e789d9e3f5a | Executor      | ExecutorRun
 b759fe638dc7e1d85357ec926ae7732c | 810690be67920433 | SeqScan       | SeqScan on pg_am
 b759fe638dc7e1d85357ec926ae7732c | 5f98c17515191298 | Aggregate     | Aggregate
 b759fe638dc7e1d85357ec926ae7732c | c4fe67771f18ec07 | Select query  | select count(*) from pg_am;
 b759fe638dc7e1d85357ec926ae7732c | 11c9351b49b872e2 | Planner       | Planner
 b759fe638dc7e1d85357ec926ae7732c | 4ada72970aee8e0f | Executor      | ExecutorRun
 b759fe638dc7e1d85357ec926ae7732c | cf47a615cc790b52 | IndexOnlyScan | IndexOnlyScan using pg_class_tblspc_relfilenode_index on pg_class
 b759fe638dc7e1d85357ec926ae7732c | 34ba1c4550f29542 | Aggregate     | Aggregate
 b759fe638dc7e1d85357ec926ae7732c | d76f6eaf42b900a3 | Select query  | select count(*) from pg_am; select count(*) from pg_class

There is no span for parse-analyze phase. It's usually quick, but in some cases it might not be

I used to generate a span for parse-analyze but it was mostly a guess by assuming that the time between stmtStartTimestamp and post_parse was spent for parse-analyze. I eventually dropped it as it added a lot of complexity and wasn't reliable. A better solution would be to have a parse-analyze hook which also could be used by other extensions like pg_stat_statements.

There is no span active when the transaction is committed. The time to flush the WAL to disk can be significant, even more so if you use synchronous replication.

Good point, I didn't realise commit happened after portal was dropped. That might be doable by using xact callbacks and creating a span between XACT_EVENT_PRE_COMMIT and XACT_EVENT_COMMIT.

How to attach a trace parent if you use prepared statements? You can have an SQL comment on the PREPARE and EXECUTE statement, but I don't think that will do the right thing.

Prepared statements executed through EXECUTE and PREPARE look ok. Tracing PREPARE will show the ProcessUtility call while tracing EXECUTE will show the prepare utility then the nested query.

/*dddbs='postgres.db',traceparent='00-00000000000000000000000000000001-0000000000000001-01'*/ PREPARE test_prepared_one_param (integer) AS SELECT * FROM pgbench_accounts where aid=\$1;
/*dddbs='postgres.db',traceparent='00-00000000000000000000000000000002-0000000000000002-01'*/ EXECUTE test_prepared_one_param(200);

WITH oldest_start AS (
    SELECT min(span_start) as min_start
    FROM pg_tracing_peek_spans
)
select
    extract(MICROSECONDS FROM age(span_start, oldest_start.min_start)) as us_start,
    extract(MICROSECONDS FROM age(span_end, oldest_start.min_start)) as us_end,
    lvl, span_type, span_operation
    FROM peek_ordered_spans, oldest_start;

 us_start | us_end | lvl |   span_type    |                                      span_operation
----------+--------+-----+----------------+-------------------------------------------------------------------------------------------
        0 |    317 |   1 | Utility query  | PREPARE test_prepared_one_param (integer) AS SELECT * FROM pgbench_accounts where aid=$1;
        9 |    317 |   2 | ProcessUtility | ProcessUtility
      277 |    317 |   3 | Select query   | PREPARE test_prepared_one_param (integer) AS SELECT * FROM pgbench_accounts where aid=$1;
      534 |   1021 |   1 | Utility query  | EXECUTE test_prepared_one_param(200);
      541 |   1021 |   2 | ProcessUtility | ProcessUtility
      565 |   1017 |   3 | Select query   | PREPARE test_prepared_one_param (integer) AS SELECT * FROM pgbench_accounts where aid=$1;
      565 |    943 |   4 | Planner        | Planner
      967 |    986 |   4 | Executor       | ExecutorRun
      969 |    985 |   5 | IndexScan      | IndexScan using pgbench_accounts_pkey on pgbench_accounts

Also, with extended query protocol, there is no query text with the Bind+Execute messages, so there's nowhere to put the special SQL comment.

That's definitely been a pain point. Early versions relied on a hack where you could pass SQLCommenter as a parameter

PREPARE test_prepared (text, integer) AS /*$1*/ SELECT $2;
EXECUTE test_prepared('dddbs=''postgres.db'',traceparent=''00-00000000000000000000000000000009-0000000000000009-01''', 1);

It worked but it definitely wasn't a great experience so I dropped it. A better alternative would be to propagate the tracecontext through GUC, either explicitly or with Jelte's proposition.

@bonnefoa
Copy link
Collaborator

I went through multiple iterations and (hopefully) simplifications since the initial report.

  • There is no span active when the transaction is committed. The time to flush the WAL to disk can be significant, even more so if you use synchronous replication.

This is now reported. There's a dedicated TransactionCommit span that covers the time between PRE_COMMIT and COMMIT.
Screenshot 2024-09-12 at 15 41 53

  • How to attach a trace parent if you use prepared statements? You can have an SQL comment on the PREPARE and EXECUTE statement, but I don't think that will do the right thing. Also, with extended query protocol, there is no query text with the Bind+Execute messages, so there's nowhere to put the special SQL comment.

I've added the possibility to propagate the trace context through a GUC value. Not great but there's not much alternative that wouldn't interfere too much with the query itself.

I'm not sure what exactly to do here, but I think the nesting level tracking and how the parse, plan, execute hook function are mapped to tracing context and spans needs some more thinking.

This part was simplified. Active spans are kept in a stack and the parent of new spans will be at the top of the stack (minus some exceptions). This reduced the amount of informations that has to be tracked for a nested level. There's still the need to track 2 traceparents due to the possible overlap created with the extended protocol (the parse of the next statement happens before the ExecutorEnd of the previous statement). I may have leads to get rid of that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants