Semantic: Insert Update Delete #81

Open
chrisreu opened this issue Jun 18, 2014 · 5 comments

@chrisreu
Member

There was some confusion about the semantics of insert, update and delete with respect to the index and the DocTable.

I guess we've never really defined this, or at least never written it down.

The idea of this ticket is to discuss the current state and define how it should work in the future.

Here are my thoughts for the DocTable:

DocTable

I think the document table should get a REST-like interface. This interface should expose the document table in such a way that single documents can be retrieved without the need for a query. Think: GetDocumentByUri

Insert would still happen with index inserts, but the interface could provide operations for update in a RESTful way:

  • update: Updates the whole resource
  • patch: Updates parts of the resource

So with update it would be possible to actually remove properties from the document (which is not possible with the current API). Patch could be used by Hayoo to update the weights by patching just the Weight field.
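
To make the difference concrete, here is a minimal sketch in Python. It assumes the DocTable can be modelled as a plain dict from URI to document; the function names `update_doc` and `patch_doc` and the example URI are made up for illustration and are not part of the existing API.

```python
from typing import Any, Dict

Document = Dict[str, Any]
DocTable = Dict[str, Document]  # assumption: URI -> document description


def update_doc(table: DocTable, uri: str, new_doc: Document) -> None:
    """Replace the whole document; fields missing from new_doc are dropped."""
    if uri not in table:
        raise KeyError(uri)
    table[uri] = dict(new_doc)


def patch_doc(table: DocTable, uri: str, changes: Document) -> None:
    """Overwrite or add only the given fields; all other fields are kept."""
    if uri not in table:
        raise KeyError(uri)
    table[uri] = {**table[uri], **changes}


# Hypothetical document, only for demonstration.
table: DocTable = {
    "example://Data.Map.insert": {"title": "insert", "module": "Data.Map", "weight": 1.0}
}

# patch: only the weight changes, title and module survive.
patch_doc(table, "example://Data.Map.insert", {"weight": 2.5})

# update: the document is replaced, so the "module" property is gone afterwards.
update_doc(table, "example://Data.Map.insert", {"title": "insert", "weight": 2.5})
```

With patch, a client like Hayoo would only need to send the field it wants to change; with update, it has to send the complete document it wants to end up with.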

@sebastian-philipp
Member

Another idea that doesn't introduce a new API would be a virtual uri: context that is accessible like any other context and can be integrated into queries.

Such a query would look like this:

/search/uri:document-uri

This looks about as clean as a REST API. Replacing the document description could be done with a new replace command.

@chrisreu
Member Author

That may be possible, but in my opinion it would not be a clean solution, neither on the implementation side nor from a user's point of view.

It would not only duplicate the URI in memory, but also create a whole map, basically from URI to URI, containing redundant data.

Also, the lookup of a single document would be less efficient than necessary. Instead of a direct lookup in the document table, the engine would need to run through all the steps required for a search (query parsing, query processing, computation of the hits).

There is only one scenario where this would make sense: if we stored all the documents in the context as well and got rid of the current DocTable abstractions. The key would be the URI and the value would be the document. I think that might be something to consider after a first release.

@UweSchmidt
Contributor

I think a single update command is sufficient. When the ApiDocument
does not contain an index part, the index update can be skipped and the operation
becomes cheap.

The description fields from the ApiDocument simply overwrite existing fields, or
they are added. The values of the description fields are generalized from Text to JSON
values. Deletion of a field can be implemented by associating that field in the ApiDocument with
the JSON Null value. Null can be used as the indicator for deletion, since Null values
in the doc descriptions seem to be redundant anyway.

With this approach we don't need any change to the interface and don't need any new commands.
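
As a rough illustration of this merge rule, here is a sketch in Python. It assumes a document description is a flat mapping from field name to JSON value, with JSON Null represented as `None`; the function name `merge_description` is made up and does not refer to anything in the code base.

```python
from typing import Any, Dict

Description = Dict[str, Any]  # assumption: field name -> JSON value


def merge_description(stored: Description, incoming: Description) -> Description:
    """Apply an update: given fields overwrite or are added, None deletes a field."""
    result = dict(stored)
    for field, value in incoming.items():
        if value is None:
            # JSON Null acts as the deletion marker.
            result.pop(field, None)
        else:
            result[field] = value
    return result


stored = {"title": "insert", "author": "someone"}
incoming = {"author": None, "year": 2014}
merged = merge_description(stored, incoming)
# "title" is untouched, "author" is removed, "year" is added.
```

Fields that are not mentioned in the incoming description stay as they are, so one command covers overwriting, adding and deleting.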

@chrisreu
Member Author

I like this idea.

The update command pretty much behaves like this right now, doesn't it? I'm not familiar with the new DocDesc structures yet, but might it be possible to integrate the "delete on NULL" directly into this structure?

If we want a RESTful interface in the future that supports insert, update and patch with RESTful semantics, we could still build it on top of the interpreter interface.

@chrisreu chrisreu modified the milestones: 0.1, 0.3.0.0 Aug 10, 2014
@chrisreu
Member Author

I'm still not 100% satisfied with the current semantics. We've got three operations for manipulation now. Insert and Delete pretty much do what everyone would expect. Update still feels a little inconsistent in my opinion. Here is why:

| Current Update | DocTable | Index |
| --- | --- | --- |
| Attribute/Context is given | Attribute gets overwritten with new Value | New values get appended to Context |
| Attribute/Context is not given | Nothing happens | Nothing happens |
| Attribute/Context is NULL | Attribute gets removed | Nothing happens |

Update on the Index is more like an append operation than an update. Here is how I think Update should work:

| Proposed Update | DocTable | Index |
| --- | --- | --- |
| Attribute/Context is given | Attribute gets overwritten with new Value | All words indexed for this Document get removed and the new words get indexed. Basically the Context gets overwritten with regard to this particular Document |
| Attribute/Context is not given | Nothing happens | Nothing happens |
| Attribute/Context is NULL | Attribute gets removed | All words for this Document get removed from this particular Context. So basically the whole Document would be removed from this one Context |
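
The Index column of the proposed Update could be sketched like this in Python. The model is an assumption made only for illustration: the index is a dict from context name to a dict from document URI to the list of indexed words, and `update_index` is a made-up name. The real index structures are more involved.

```python
from typing import Dict, List, Optional

# Assumption: context name -> document URI -> words indexed for that document.
Index = Dict[str, Dict[str, List[str]]]


def update_index(index: Index, uri: str, contexts: Dict[str, Optional[List[str]]]) -> None:
    """Proposed Update: overwrite per context, None removes the document from it."""
    for context, words in contexts.items():
        postings = index.setdefault(context, {})
        if words is None:
            # NULL: the document disappears from this one context.
            postings.pop(uri, None)
        else:
            # Given: old words for this document are dropped, new ones indexed.
            postings[uri] = list(words)
    # Contexts that are not mentioned are left untouched.


index: Index = {
    "title": {"doc1": ["old", "title"]},
    "body": {"doc1": ["some", "text"]},
    "tags": {"doc1": ["haskell"]},
}
update_index(index, "doc1", {"title": ["new", "title"], "tags": None})
# "title" now holds only the new words, "body" is unchanged,
# and doc1 no longer appears in "tags".
```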

If the current behavior for the Index is still needed somewhere, we could easily keep it by adding an Append command. This new command could consistently append things to both DocTable and Index, like so:

| Proposed optional Append | DocTable | Index |
| --- | --- | --- |
| Attribute/Context is given | Value gets appended to Attribute | New values get appended to Context, while old words and positions are preserved |
| Attribute/Context is not given | Nothing happens | Nothing happens |
| Attribute/Context is NULL | Attribute gets removed | All words for this Document get removed from this particular Context. So basically the whole Document would be removed from this one Context |
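
For comparison, the Index column of the optional Append could look like the following Python sketch. It uses the same simplified model as an assumption (context name to document URI to word list), and `append_index` is a made-up name. The DocTable side is left out because what "appending" means for an arbitrary attribute value is not pinned down here.

```python
from typing import Dict, List, Optional

# Assumption: context name -> document URI -> words indexed for that document.
Index = Dict[str, Dict[str, List[str]]]


def append_index(index: Index, uri: str, contexts: Dict[str, Optional[List[str]]]) -> None:
    """Proposed Append: add words per context, keep old ones; None still removes."""
    for context, words in contexts.items():
        postings = index.setdefault(context, {})
        if words is None:
            # NULL behaves as in Update: drop the document from this context.
            postings.pop(uri, None)
        else:
            # Given: existing words are preserved, new ones are added after them.
            postings.setdefault(uri, []).extend(words)
    # Contexts that are not mentioned are left untouched.


index: Index = {"title": {"doc1": ["old"]}, "tags": {"doc1": ["haskell"]}}
append_index(index, "doc1", {"title": ["new"], "tags": None})
# "title" now holds both the old and the new word; doc1 is gone from "tags".
```

The only difference to the proposed Update is the "given" case: Append extends the existing word list instead of replacing it.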

What are your opinions on this?
