Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Encapsulate batch operations on multiple tables/entities #160

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Encapsulate batch operations on multiple tables/entities #160

wants to merge 1 commit into from

Conversation

GingerMoon
Copy link
Contributor

@mmatczuk
I made this PR as a draft version for the purpose of POC.
The code has't been carefully polished yet.
Please just have a quick look and let me know if this is what we want.
Then I will spend some time on polishing the PR.

@GingerMoon
Copy link
Contributor Author

By the way, it would be helpful if you could share some suggestions on naming terminologies and variables.

@mmatczuk
Copy link
Contributor

This patch is huge I did not expect it.

Few things:

  • We should use session.NewBatch, building it in qb has many shortcommings
  • There should not be that many abstractions
  • I think we can tackle use case 1, use case 2 is just a normal use of Batch in gocql

Design objectives:

  • Apply the same mutations to many query builders typically originating from tables
  • Build a gocql.Batch from many query builders and bind a struct

@GingerMoon
Copy link
Contributor Author

Thanks.

We should use session.NewBatch, building it in qb has many shortcommings

Would you please share some of the shortcommings?

@mmatczuk
Copy link
Contributor

Batch is a special kind of query. I'd have to double check what happens when you send batch as a normal query. If you get to know anything keep me posted.

@GingerMoon
Copy link
Contributor Author

GingerMoon commented May 29, 2020

The difficiences are below:

  1. gocql.Batch has some more batch specific check, such as BatchSize(65535).
  2. gocql.Batch use BatchObserver instead of QueryObserver.
  3. gocql.Batch has cancelBatch call back.
  4. gocql.Batch prepares the included statments separately, which is more efficient. In contrast, the current gocqlx.qb.BatchBuilder, which use gocql.Query, prepares the whole batch statements as one ordinary query.

Shall we make a PR first for re-implementing gocqlx.Batch?

@mmatczuk
Copy link
Contributor

qb package is just a query builder I think there is nothing to fix at this point. You can send PR add your findings to https://github.com/scylladb/gocqlx/blob/master/qb/batch.go#L27.

The whole point of this PR is adding proper API for working with batches that implement manual "view" creation - that is one entity multiple pks.

@GingerMoon
Copy link
Contributor Author

gocql.Batch cannot be "prepared" beforehand like what I do with gocqlx.qb.BatchBuilder because Query or Bind in gocq.Batch requires args immediately.
Do you have any suggestions on what fields the new type "EntityBatch" has?

@mmatczuk
Copy link
Contributor

That's an interesting point. I'd suggest you measure that first, maybe in your case it's better to prepare the batch?

If we were to use batch I think gocqlx needs to be changed to allow getting values for names - like in BindStruct.

@GingerMoon
Copy link
Contributor Author

Can we postpone deprecating qb.BatchBuilder?
If YES, should I go on the current entity batch implementation (pre-construct the stmt&names via qb.BatchBuilder)?

@GingerMoon
Copy link
Contributor Author

GingerMoon commented Jun 12, 2020

May I go on using qb.BatchBuilder to implement this MR?

@GingerMoon
Copy link
Contributor Author

GingerMoon commented Jun 15, 2020

I am thinking whether I can keep using qb.Batchbuilder for this MR.
The biggest cons of using the current qb.BatchBuilder instead of gocql.Batch is the overhead of "prebuilding" a batch's single cql queries. But this overhead might be not a problem for this MR's target.
Say, we have one entity employee and two tables: employee_by_id, employee_by_name.
This MR's target is encapsulating batches for employee's CRUD operations.
Every single cql in the batches doesn't make much sense by itself (I cannot think of a case where the single cql can be reused), so it doesn't matter if we only prepare the whole batch instead of every single cql inside the batch.
From this point of view, I think it might make sense for this MR to keep using qb.BatchBuilder instead of gocql.Batch since the gocqlx's encapsulation of gocql.Batch hasn't been done yet(maybe later I will do it).
@mmatczuk Anymore insights would be appreciated!

@mmatczuk
Copy link
Contributor

Correct, let's use qb.BatchBuilder for now.

@GingerMoon
Copy link
Contributor Author

Originally I just wanted to add the pre-constructed batch cqls.
Later on I realized that it would be more convenient for developers if they only need to provide the entity and some call back functions.
For example, we have one entity employee and two tables: employee_by_id, employee_by_name.
If we want to update employee.Name, we have to read the existing old entity, delte it and then create a new one with the new name. And delete and create must be in one batch.

So I'd like to chagne the target of this MR from
"pre-construct batch operations cql"
to
"encapsulate entity batch db access operations". (gocqlx.Session.Query will also be used).
@mmatczuk What do you think?

@mmatczuk
Copy link
Contributor

Maybe I'm missing something but it looks quite easy.

  • Create delete stmt with a query builder (or tables)
  • Create insert stmt with a query builder (or tables)
  • Add both of them to batch (without prefix)
  • Bind with a struct
  • Execute

What could we simplify here for a general audience?

@GingerMoon
Copy link
Contributor Author

@mmatczuk It seems that I didn't express myself clearly.
Would you please have a look at entitybatch/sample/singleEntity/operation.go and entitybatch/sample/operation_test.go?

As you will see, instead of the 5 steps what you mentioned, the developer can now call only one API:

err := DataAccess.UpdateContainsSubTablePrimaryKey(&singleEntityUpdate)

Please let me know if it makes sense or not.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants