Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature: .union() & .union_all() query clause methods #309

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

jorroll
Copy link

@jorroll jorroll commented Dec 28, 2017

Fixes #308

Currently, both .union() and .union_all() are working for my (limited) testing.

I wanted to run my implementation by you before writing any tests / docs. You were correct, the query clause method implementation is rather confusing.

  • query.union_all(query) adds a UNION ALL clause
  • query.union(query) adds a UNION clause

You can pass either a query object or a string to union / union_all.

Some added benefits come if you use a query object for the union clause's argument:

  • .pluck()ing the base query object will overwrite the return of the union clause
  • the cypher output in the console gets pretty formatting

Pings:
@cheerfulstoic
@subvertallchris

@coveralls
Copy link

Coverage Status

Coverage decreased (-8.8%) to 65.738% when pulling a0bbf24 on thefliik:master into 00190f4 on neo4jrb:master.

1 similar comment
@coveralls
Copy link

coveralls commented Dec 28, 2017

Coverage Status

Coverage decreased (-8.8%) to 65.738% when pulling a0bbf24 on thefliik:master into 00190f4 on neo4jrb:master.


# If `value` is a query object, returns value.to_cypher. Formatting optional
def from_query(value, pretty: false)
from_string(value.to_cypher(pretty: pretty))
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember seeing other uses of KW args in the neo4j-core source. I'm not sure if this was personal preference or if you're targeting a version of Ruby pre-KW args

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right that they aren't used much. In the past we've supported Ruby 1.9, but we require at least Ruby 2.1 now, so keyword arguments should be fine.

@jorroll
Copy link
Author

jorroll commented Jan 3, 2018

Also: this should probably be updated so that if .proxy_as(Class, :symbol) is called on the query, the returns get overridden with the :symbol (like with .pluck())

@cheerfulstoic
Copy link
Contributor

What happens if you make a new Query object with the union methods and then try to call a clause method? Should a unioned Query object be frozen?

I'm a bit worried about the added complexity (to an already really complex pair of classes). UNION feels like it's a fundamentally different sort of thing than what most of Query already takes care of. Looking briefly at the code, the overwriting of to_cypher is what gives me the thought that "one of these things is not like the other" and that it should perhaps be modeled differently. Being able to build strings or execute a query based on the UNION of a set of Query objects could be one simple way to take care of it, but there could alternately perhaps be a UnionQuery class which does things like

  • Take in a set of Query objects
  • Validate the columns
  • Allow enumerating on results

@jorroll
Copy link
Author

jorroll commented Jan 16, 2018

This PR follows what I proposed over in #308. Namely:

Calling .union() on a query always adds a new union clause to the query. Once added, that union clause cannot be modified. Calling query_as(:n).match(n: {color: 'purple'}).union(query_as(:n).match(n: {color: 'blue'})) would be the same as query_as(:n).union(query_as(:n).match(n: {color: 'blue'})).match(n: {color: 'purple'}). I.e. additional query clause methods modify the original query, ignoring any union clauses.

It looks like Neo4j expects all union returns to have the same columns (both name and number). So it would make sense to validate that and also: if you plucked the root query directly (.pluck()) to simply overwrite the return of the root and all unions so that they returned whatever was in the pluck.

This implementation doesn't attempt to validate the columns. Instead, the recommended usage is to pass in a query object as the argument to the union clause and then pluck() the result (which simply overwrites all the returns so that the columns are identical--which is what Neo4j expects). You can pass in a cypher string though, in which, a pluck() will ignore the string but still overwrite any other query objects.

The modified to_cypher is simply to allow pretty formatting in the console.

but there could alternately perhaps be a UnionQuery class which does things like

  1. Take in a set of Query objects
  2. Validate the columns
  3. Allow enumerating on results

I'm not sure how this would be different from adding a UnionClause class? In the UnionClause you pass in query objects, the columns are ensured to be valid if you pluck the return, and you can enumerate the results. I kinda see what you mean about union feeling different from the other clauses (seeing as you pass in a whole new query to a union clause). Except the result from a union query is the same as the result from any other query, so I'm not sure adding a new interface for union queries makes sense.

@tjad
Copy link

tjad commented Oct 6, 2018

Just my 2 sense.

I've read from #308 until here, and I perceive unnecessary discrepancies for "union_all". There are use cases for this, and it is most beneficial when wanting the database engine to do work before retrieving results.

The syntax is not confusing, and in fact, is much easier to use with the neo4jrb framework. Create 2 queries , q1 and q2

q1.union_all(q2)
Both queries have to contain a return clause - they are both executed by the neo4j database, and the results for both queries are unioned and then returned to the client as if it was a single query.

What am I missing ?

UPDATE: As in, it is up to the query writer to make sure that the result of both queries return similarly structured nodes. The single resultset that the engine returns should be proxied just as per a single query (as there is 1 resultset)

https://neo4j.com/docs/developer-manual/current/cypher/clauses/union/

@tjad
Copy link

tjad commented Oct 6, 2018

Would be great if I could at the very least do
q1 = MyObj.as(:m).some_rel.where(prop1: 'a value')
q2 = MyObj.as(:m).where_not('(m)-[:SOME_REL]-()')
q1.union(q2) #query returning MyObj with some_rel relationship entity's prop1='a value' OR MyObj with no some_rel relationships yet

And a QueryProxy of MyObj were returned. (or an array - I guess why you mentioned to do pluck_union)
Perhaps at this point, if I wanted to continue chaining, the next part of the chain would be fed the result nodes as starting points. But disregard the complex possibilities for now... get the simple use cases working.

UPDATE: Yes, I realise that https://github.com/neo4jrb/neo4j would be responsible for the QueryProxy part of it

@jorroll
Copy link
Author

jorroll commented Oct 6, 2018

@tjad I stopped pushing forward with this because @cheerfulstoic was busy at the time and because I was able to refactor my code to remove the need for a union clause.

At the moment, if you want to perform a union query, you'll need to either use the existing union_cypher or just execute a raw query string.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants