Proposal: Add native support to bitstring #577
Would you be able to update mix.exs so it points to your ecto branch? Basically change this part:
to
Then run That way we could ensure the unit tests pass :) |
Done! Sort of :) There wasn't any I wasn't sure how to do this. |
I believe we will have an issue casting to the bitstring type. I was playing around with it and it seems like you must specify the number of bits when casting. If you omit it in postgres, like Not sure atm what the best thing to do is. The only thing I can think is creating a parameterized type for bitstrings that requires the size parameter and has underlying type like |
Yeah you need to specify the size when casting a |
Another thing I just realized we need to think about is literals. I think we always have to tag them in Ecto like with binaries: https://github.com/elixir-ecto/ecto/blob/23c40e3d9b95ac4238b5417c4e3bb7329a0cc8b0/lib/ecto/query/builder.ex#L774 And then handle them specially in the adapter like we do for binary: https://github.com/elixir-ecto/ecto_sql/blob/master/lib/ecto/adapters/postgres/connection.ex#L1013 This just popped into my head though I didn't have time to play around with it yet. But we need to consider it. This is used, for example, when you do something like this |
I attempted to tag bitstrings similarly to binaries, but I couldn't find a reliable way to differentiate between them. My only option is to check whether the size is a multiple of 8 (indicating a binary) or not (indicating a bitstring), but this appears to be a weak distinction. Do you have any ideas? Nevertheless, I also suspect that binary literals might suffice for bitstrings as well. I'll do some integration tests to verify this. |
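As a small illustration of why the multiple-of-8 check is a weak distinction, here is a minimal sketch. The `BitKind` module is hypothetical and not part of Ecto; it only relies on the `is_binary/1` and `is_bitstring/1` guards.

```elixir
defmodule BitKind do
  # Hypothetical helper for illustration only; not part of Ecto.
  # Every binary is also a bitstring, so the binary clause must come first.
  def classify(value) when is_binary(value), do: :binary
  def classify(value) when is_bitstring(value), do: :bitstring
end

# 16 bits: a multiple of 8, so this is a binary
IO.inspect(BitKind.classify(<<1, 2>>))

# 3 bits: not a multiple of 8, so only a bitstring
IO.inspect(BitKind.classify(<<5::size(3)>>))

# 8 bits written with an explicit size: the value is identical to <<5>>,
# so at runtime nothing records that the author meant "8 bits" and not "1 byte"
IO.inspect(BitKind.classify(<<5::size(8)>>))
```

In other words, once the literal has been built, an 8-bit bitstring and a one-byte binary are the same term, so the intent can only be recovered from the type the user asked for.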
Could you point to some of the places where you'd need to make such a distinction but not know the underlying type? I can take a look. |
The process to tag a literal starts with:
    # literals
    def escape({:<<>>, _, args} = expr, type, params_acc, vars, _env) do
      valid? = Enum.all?(args, fn
        {:"::", _, [left, _]} -> is_integer(left) or is_binary(left)
        left -> is_integer(left) or is_binary(left)
      end)
      unless valid? do
        error! "`#{Macro.to_string(expr)}` is not a valid query expression. " <>
               "Only literal binaries and strings are allowed, " <>
               "dynamic values need to be explicitly interpolated in queries with ^"
      end
      {literal(expr, type, vars), params_acc}
    end
where the operator
Then
    defp literal(value, expected, vars),
      do: do_literal(value, expected, quoted_type(value, vars))
    ...
    def quoted_type({:<<>>, _, _}, _vars), do: :binary
that returns
Eventually, the literal is tagged with type
    defp do_literal(value, _, current) when current in [:binary],
      do: {:%, [], [Ecto.Query.Tagged, {:%{}, [], [value: value, type: current]}]}
(it happens here)
In this process, I can't find a neat way to distinguish between bitstrings and binaries, since the operator for both types is the same ( I need that |
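To make the point about the shared operator concrete, the following sketch quotes one binary literal and one bitstring literal and exposes their AST. The `AstShapeSketch` module is hypothetical; it only uses `quote`.

```elixir
defmodule AstShapeSketch do
  # Hypothetical module for illustration; it only returns quoted literals.

  # A plain binary literal: the AST node is {:<<>>, meta, segments}
  def binary_ast, do: quote(do: <<1, 2>>)

  # A 3-bit literal: the outer node is the same :<<>> operator,
  # only the segment carries a :: annotation with the size
  def bits_ast, do: quote(do: <<1::size(3)>>)
end

IO.inspect(AstShapeSketch.binary_ast())
IO.inspect(AstShapeSketch.bits_ast())
```

Both nodes match the `{:<<>>, _, _}` pattern used by `quoted_type/2`, so any distinction has to come from inspecting the segments or from evaluating the literal.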
Thanks that's really helpful. I will take a closer look. |
I'll need to play with this to see if bitstrings are possible in sqlite. I think I'll have to use |
@Gigitsu I think the best we can do here is check |
@warmwaffles from a quick Google search, it seems that in SQLite bitwise operators work on numeric types and not on
@greg-rychlewski I believe I'm hitting the limits of my knowledge of Elixir macros on this. :) Both here
    def escape({:<<>>, _, args} = expr, type, params_acc, vars, _env) do
      valid? = Enum.all?(args, fn
        {:"::", _, [left, _]} -> is_integer(left) or is_binary(left)
        left -> is_integer(left) or is_binary(left)
      end)
      unless valid? do
        error! "`#{Macro.to_string(expr)}` is not a valid query expression. " <>
               "Only literal binaries and strings are allowed, " <>
               "dynamic values need to be explicitly interpolated in queries with ^"
      end
      {literal(expr, type, vars), params_acc}
    end
and here
    # Tagged
    def quoted_type({:<<>>, _, _}, _vars), do: :binary
    def quoted_type({:type, _, [_, type]}, _vars), do: type
the
I need to transform the AST into actual code to test it with
My other option is to add a case match here like this
    defp expr(%Ecto.Query.Tagged{value: binary, type: :binary}, _sources, _query)
         when is_binary(binary) do
      ["'\\x", Base.encode16(binary, case: :lower) | "'::bytea"]
    end
    defp expr(%Ecto.Query.Tagged{value: binary, type: :binary}, _sources, _query)
         when is_bitstring(binary) do
      ["'\\x", Base.encode16(binary, case: :lower) | "'::bytea"]
    end
but the tagged type will always be |
    def escape({:<<>>, _, args} = expr, type, params_acc, vars, _env) do
    ...
I think we can leave this one as is, it should just catch that users are not supplying variables (i.e.
    def quoted_type({:<<>>, _, _}, _vars), do: :binary
The only way I know to check here is using
    def quoted_type({:<<>>, _, _} = expr, _, _vars) do
      # Code.eval_quoted/1 returns {value, binding}, so unpack the value first
      {value, _binding} = Code.eval_quoted(expr)
      if is_binary(value) do
        :binary
      else
        :bitstring
      end
    end
Though I don't know how bad this is considered and would need a second opinion. It might be ok since we know it is a literal. There might also be a better strategy than the one I'm suggesting but I need to think more and possibly get other opinions.
    defp expr(%Ecto.Query.Tagged{value: binary, type: :binary}, _sources, _query)
         when is_binary(binary) do
      ["'\\x", Base.encode16(binary, case: :lower) | "'::bytea"]
    end
Yes, we will have to create a clause like this for bitstring, because that encoding will fail for non-binaries and that representation will not be allowed to be cast to bit types in Postgres. |
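For illustration, a bitstring clause would need a different textual representation than the hex `bytea` form. The sketch below renders each bit as a `0` or `1` inside a Postgres bit-string constant (`b'...'`). The module and function names are hypothetical, and this is an assumption about one possible encoding, not the clause that was eventually merged.

```elixir
defmodule BitLiteralSketch do
  # Hypothetical helper for illustration; not the adapter's actual code.
  # Walks the value one bit at a time and collects the digits into a string.
  def to_sql(bits) when is_bitstring(bits) do
    digits = for <<bit::1 <- bits>>, into: "", do: Integer.to_string(bit)
    "b'" <> digits <> "'"
  end
end

# 5 in three bits is 101
IO.puts(BitLiteralSketch.to_sql(<<5::size(3)>>))
```

Because the comprehension iterates bit by bit, it works for any length, including lengths that are not a multiple of 8, which is where `Base.encode16/2` cannot be used.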
    def quoted_type({:<<>>, _, _} = expr, _, _vars) do
      # Code.eval_quoted/1 returns {value, binding}, so unpack the value first
      {value, _binding} = Code.eval_quoted(expr)
      if is_binary(value) do
        :binary
      else
        :bitstring
      end
    end
There's another complication with this. Our current checks don't guard against dynamic bit size. We allow stuff like this
    size = 3
    <<2::size(size)>>
If we go this route we also need to guard against dynamic size. Otherwise we can't determine the type and I don't think we can guarantee using |
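A minimal sketch of the evaluation approach and of the dynamic-size concern follows. `LiteralKindSketch` is a hypothetical module, and treating any evaluation failure as `:dynamic` is an assumption made for illustration, not a proposal for how Ecto should behave.

```elixir
defmodule LiteralKindSketch do
  # Hypothetical helper for illustration; not part of Ecto.
  # Evaluates a quoted literal and classifies the resulting value.
  def kind(ast) do
    {value, _binding} = Code.eval_quoted(ast)
    if is_binary(value), do: :binary, else: :bitstring
  rescue
    # Assumption: a literal that cannot be evaluated in isolation
    # (for example, one whose size refers to a variable) is dynamic.
    _ -> :dynamic
  end
end

IO.inspect(LiteralKindSketch.kind(quote(do: <<2::size(3)>>)))
IO.inspect(LiteralKindSketch.kind(quote(do: <<2::size(8)>>)))

# No variable named size is bound during evaluation, so this one
# should come back as :dynamic instead of a concrete type.
IO.inspect(LiteralKindSketch.kind(quote(do: <<2::size(size)>>)))
```

The first two calls show that the type can be decided when every size is a literal. The third shows the case that would need an explicit guard, since the type is unknowable at compile time.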
@@ -1443,6 +1459,7 @@ if Code.ensure_loaded?(MyXQL) do | |||
defp ecto_to_db(:bigserial, _query), do: "bigint unsigned not null auto_increment" | |||
defp ecto_to_db(:binary_id, _query), do: "binary(16)" | |||
defp ecto_to_db(:string, _query), do: "varchar" | |||
defp ecto_to_db(:bitstring, _query), do: "bit" |
For MySQL it seems more complicated:
- If you use the bit type in migrations it will default to bit(1).
- You cannot cast to bit, even if you specify the size (https://dev.mysql.com/doc/refman/8.0/en/cast-functions.html#function_cast)
- There is no such thing as varbit
Assuming all the above is correct, maybe we should do this then:
- Add defp ecto_size_to_db(:bitstring), do: "bit".
- Raise here: defp ecto_to_db(:bitstring)
This would mean that migrations work as long as you specify the size, and casting raises. It avoids the surprise of a migration silently defaulting to size 1. And since casting doesn't work, it would be clearer to raise our own message than to leave the user wondering whether we just sent the wrong type to the database.
Done
I totally agree.
I don't like the idea of using
I would differentiate
Should we allow a syntax like this:
    size = 3
    <<2::size(^size)>>
with the |
Let's get Jose's opinion before making any change to this part. Maybe what you are saying is the best way. But at the same time we'd be passing the wrong type through Ecto in the builder, the planner and out to all the adapters (not just the built-in ones). The types are used in places other than the query string building. |
That's ok, let's wait for Jose's opinion. In the meantime I'll add some integration tests |
The other way I can think of is we already iterate through the bitstring here:
So we could possibly determine at this point whether the sum of the sizes is divisible by 8 or not by tracking the remainder after each iteration and adding some pattern matches to size. Though we'd have to guard against variable sizes and raise the same message as if the value was variable:
    unless valid? do
      error! "`#{Macro.to_string(expr)}` is not a valid query expression. " <>
             "Only literal binaries and strings are allowed, " <>
             "dynamic values need to be explicitly interpolated in queries with ^"
    end
i.e. if any part of the literal is variable, interpolate the entire thing ( |
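The remainder-tracking idea could be sketched roughly as follows. `SegmentSizeSketch` is hypothetical and deliberately handles only a few segment shapes (bare integers, bare binaries, and integer segments with a literal size). Everything else, including variable sizes and other modifiers, is reported as `:unknown`.

```elixir
defmodule SegmentSizeSketch do
  # Hypothetical helper for illustration; not part of Ecto.
  # Sums the segment sizes modulo 8 and stops at the first segment
  # whose size cannot be read directly from the AST.
  def kind({:<<>>, _, segments}) do
    segments
    |> Enum.reduce_while(0, fn segment, rem_bits ->
      case segment_bits(segment) do
        {:ok, bits} -> {:cont, rem(rem_bits + bits, 8)}
        :unknown -> {:halt, :unknown}
      end
    end)
    |> case do
      :unknown -> :unknown
      0 -> :binary
      _ -> :bitstring
    end
  end

  # A bare integer segment defaults to 8 bits
  defp segment_bits(int) when is_integer(int), do: {:ok, 8}
  # A bare binary segment always contributes whole bytes
  defp segment_bits(bin) when is_binary(bin), do: {:ok, bit_size(bin)}
  # Integer with a literal size, written as 2::size(3)
  defp segment_bits({:"::", _, [int, {:size, _, [n]}]})
       when is_integer(int) and is_integer(n),
       do: {:ok, n}
  # Integer with the shorthand size, written as 2::3
  defp segment_bits({:"::", _, [int, n]}) when is_integer(int) and is_integer(n),
    do: {:ok, n}
  # Variable sizes and any other modifier are not handled by this sketch
  defp segment_bits(_), do: :unknown
end

IO.inspect(SegmentSizeSketch.kind(quote(do: <<2::size(3), 1::size(5)>>)))
```

The catch-all clause is where the fragility discussed in this thread shows up: every modifier the sketch does not recognize has to be rejected, and a modifier added to the language later would silently fall into the same bucket.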
Yeah I agree, I was experimenting exactly with that, but we should consider that there are different kinds of size definitions, eg |
Yeah I have the same problem. It does not feel very stable to do it this way. Modifiers could even change in the future and Ecto might miss one. If there is a safe way to use |
Regardless, I think we have to consider modifying the checks here:
    valid? = Enum.all?(args, fn
      {:"::", _, [left, _]} -> is_integer(left) or is_binary(left)
      left -> is_integer(left) or is_binary(left)
    end)
Because this will fail for something like this |
Could |
Yeah, that's why I mentioned we'd need to block user input and only accept literals. But given the multitude of modifiers I don't know how feasible this is. Modifiers might also change in the future and Ecto might miss a new one. So maybe it's unacceptable no matter what. Though there might be some protection given it is inside Basically none of the options are looking too appealing right now:
I'm hoping by continuing to discuss/think about it a better way becomes clear. |
Yeah I got the impression looking into the MyXQL history that this is a tough topic. For example: elixir-ecto/myxql#91. My first impulse was to try and figure out the issue, but given the complexity maybe we go ahead with this PR for now without MySQL and then add it in later once we figure it out. |
Agreed. |
I agree too. So my next step is to remove the MySQL implementation and do an integration test only for Postgres, am I right? |
@Gigitsu Yeah you can keep the test in Ecto because that's where all the other types are. But you could pull the And then last thing is to add literal support in Ecto and then catch the tagged value here in the postgres adapter. |
Hi @greg-rychlewski, I've removed everything related to bitstring from MySQL and TDS adapters. Now everything seems to work. Please let me know if there is anything I forgot. Additionally, I've updated the Earthfile to make it compatible with ARM (Apple Silicon in my case) CPUs. However, I only have experience with Docker and haven't used Earthly before, so please take a look at it.
I almost forgot this, I'll work on it |
Co-authored-by: José Valim <[email protected]>
sorry @Gigitsu could you please do |
thanks @Gigitsu ! |
Thank you! |
This is the companion PR of elixir-ecto/ecto#4328