Retry CloseAuctions procedure in AuctionMark on rollback #354

Open
nuno-faria
Contributor

The CloseAuctions procedure is currently issued by custom code in the executeWork method rather than being scheduled by the worker. As a result, if the transaction fails because of a concurrency-induced conflict, the worker does not retry CloseAuctions; it retries a different procedure instead:

// CloseAuctions conflicting with other transactions 
org.postgresql.util.PSQLException: ERROR: could not serialize access due to read/write dependencies among transactions
  Detail: Reason code: Canceled on identification as a pivot, during conflict out checking.
  Hint: The transaction might succeed if retried.
	at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2713)
	at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2401)
	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:368)
	at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:498)
	at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:415)
	at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:190)
	at org.postgresql.jdbc.PgPreparedStatement.executeUpdate(PgPreparedStatement.java:152)
	at com.oltpbenchmark.benchmarks.auctionmark.procedures.CloseAuctions.run(CloseAuctions.java:168)
	at com.oltpbenchmark.benchmarks.auctionmark.AuctionMarkWorker.executeCloseAuctions(AuctionMarkWorker.java:456)
	at com.oltpbenchmark.benchmarks.auctionmark.AuctionMarkWorker.executeWork(AuctionMarkWorker.java:350)
	at com.oltpbenchmark.api.Worker.doWork(Worker.java:418)
	at com.oltpbenchmark.api.Worker.run(Worker.java:284)
	at java.base/java.lang.Thread.run(Thread.java:833) 

// and when the worker catches the exception, it retries a NewBid instead
Retryable SQLException occurred during [com.oltpbenchmark.benchmarks.auctionmark.procedures.NewBid/03]... current retry attempt [2], max retry attempts [3], sql state [40001], error code [0].

Since this procedure is executed infrequently, it would be useful to guarantee that it succeeds.

Member

apavlo left a comment

I don't think we want to accept this.

done = true;
}
catch (SQLException e) {
if (e.getSQLState().startsWith("40")) {
Member

What is "40"? Is this DBMS specific?

Contributor Author

SQL states starting with 40 belong to the "transaction rollback" class and are related to transactional conflicts, such as "serialization failure" (40001) and "integrity constraint violation" (40002). As far as I know, this class is standard across SQL systems.
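
For illustration, the check discussed here could be isolated in a small helper. This is only a sketch, not code from the PR: the class name `SqlStateClassifier` and the method name are hypothetical. It also guards against `getSQLState()` returning null, which the reviewed snippet would not survive, since calling `startsWith` on a null state throws a NullPointerException.

```java
import java.sql.SQLException;

// Hypothetical helper (not part of BenchBase): classifies an exception by SQLSTATE class.
final class SqlStateClassifier {

    private SqlStateClassifier() {}

    // Returns true when the SQLSTATE is in class "40" (transaction rollback),
    // e.g. 40001 (serialization failure) or PostgreSQL's 40P01 (deadlock detected).
    static boolean isTransactionRollback(SQLException e) {
        String state = e.getSQLState(); // may be null if the driver did not set it
        return state != null && state.startsWith("40");
    }
}
```

Vendor-specific codes can still appear inside class 40, so a driver may report states there that other systems do not define.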

Comment on lines +454 to +462
while (!done) {
try {
results = proc.run(conn, benchmarkTimes, startTime, endTime);
done = true;
}
catch (SQLException e) {
if (e.getSQLState().startsWith("40")) {
conn.rollback();
}
Member

Can this get stuck in an infinite loop?

Contributor Author

I don't think so, because sooner or later the transaction will be able to complete. Any error other than a transactional conflict is raised to the worker as an exception.
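
One way to remove the question entirely would be to cap the number of attempts, as the worker's own retry logic already does (the log above shows "max retry attempts [3]"). The following is a minimal sketch under that assumption, not the code in this PR: `BoundedRetry`, `SqlWork`, `Rollback`, and `runWithRetries` are hypothetical names, and the two functional interfaces stand in for `proc.run(...)` and `conn.rollback()`.

```java
import java.sql.SQLException;

// Hypothetical sketch (not part of BenchBase): a retry loop that cannot spin forever.
final class BoundedRetry {

    @FunctionalInterface
    interface SqlWork<T> {
        T run() throws SQLException;
    }

    @FunctionalInterface
    interface Rollback {
        void rollback() throws SQLException;
    }

    private BoundedRetry() {}

    // Runs the work up to maxAttempts times. Only SQLSTATE class "40" errors are retried;
    // any other SQLException is rethrown immediately. If every attempt hits a conflict,
    // the last conflict is rethrown so the caller can record the failure.
    static <T> T runWithRetries(SqlWork<T> work, Rollback rollback, int maxAttempts)
            throws SQLException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        SQLException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.run();
            } catch (SQLException e) {
                String state = e.getSQLState();
                boolean retryable = state != null && state.startsWith("40");
                if (!retryable) {
                    throw e;
                }
                rollback.rollback(); // discard the aborted transaction before the next attempt
                last = e;
            }
        }
        throw last;
    }
}
```

A call site might look like `BoundedRetry.runWithRetries(() -> proc.run(conn, benchmarkTimes, startTime, endTime), conn::rollback, 3)`, with the limit of 3 chosen here only as an example.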

Comment on lines +454 to +458
while (!done) {
try {
results = proc.run(conn, benchmarkTimes, startTime, endTime);
done = true;
}
Member

I actually don't think we want to do this, because we can't keep track of the # of times that a txn is submitted, so this can cause BenchBase to exceed the defined submission rate. Also, by retrying inside the benchmark worker, we can't keep track of the # of failed txns.
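
The accounting gap described here can be made concrete with a toy model. It is purely illustrative and does not use any BenchBase API: `SubmissionAccounting` and its fields are hypothetical, and the number of conflicts is passed in rather than coming from a database.

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy model (hypothetical, no BenchBase classes): shows how retries hidden inside the
// worker make the harness's view diverge from what the DBMS actually receives.
final class SubmissionAccounting {
    final AtomicLong scheduled = new AtomicLong();      // units of work the harness counts
    final AtomicLong submitted = new AtomicLong();      // transactions actually sent to the DBMS
    final AtomicLong hiddenFailures = new AtomicLong(); // aborted attempts the harness never sees

    // Simulates one scheduled unit of work whose first `conflicts` attempts abort
    // and are retried internally until one attempt succeeds.
    void executeWork(int conflicts) {
        scheduled.incrementAndGet();
        for (int attempt = 0; attempt <= conflicts; attempt++) {
            submitted.incrementAndGet();
            if (attempt < conflicts) {
                hiddenFailures.incrementAndGet();
            }
        }
    }
}
```

In this model, one conflict-free call and one call with two conflicts leave the harness counting 2 units of work while 4 transactions were submitted and 2 failures went unrecorded.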

Contributor Author

I agree that this is not the best approach. I also considered adding an update method to the "TransactionType" class, so that we can update the original "txnType" when we switch it in the "executeWork" function and let the worker deal with retries. This would also mean that the CloseAuctions procedure is correctly logged in the results, since right now, as far as the worker knows, no CloseAuctions procedure is ever executed.
