Merge branch 'comtihon:master' into cursor-stop-race-condition
samwar authored Apr 12, 2023
2 parents b90dee0 + 5881a8b commit d993a9e
Showing 15 changed files with 104 additions and 99 deletions.
3 changes: 2 additions & 1 deletion .gitignore
logs/
variables-ct*
*.iml
data
_build
.idea
130 changes: 68 additions & 62 deletions README.md
This is the [MongoDB](https://www.mongodb.org/) driver for Erlang.

[![Run tests](https://github.com/comtihon/mongodb-erlang/actions/workflows/test.yml/badge.svg)](https://github.com/comtihon/mongodb-erlang/actions/workflows/test.yml)
[![Enot](https://enot.justtech.blog/badge?full_name=comtihon/mongodb-erlang)](https://enot.justtech.blog)

### Usage
Add this repo as a dependency:
Rebar

{deps, [
Start all applications needed by mongodb:

> application:ensure_all_started (mongodb).

__Important__:
`mongoc` API was changed in `3.0.0`.
`mc_cursor` API was changed in `3.0.0`.

This driver has two API modules - `mc_worker_api` and `mongo_api`.
`mc_worker_api` works directly with one connection, while all `mongo_api`
interfaces refer to the `mongoc` pool. Although `mongoc` is not stable for now,
you should use it if you have a shard and need to determine the mongo topology.
If you are choosing between using
[mongos](https://docs.mongodb.com/manual/reference/program/mongos/) and
using a mongo shard with `mongo_api` - prefer mongos and use `mc_worker_api`.
mc_worker_api -- direct connection client
---------------------------------

### Connecting
To connect to a database `test` on a mongodb server listening on
`localhost:27017` (or any address & port of your choosing)
use `mc_worker_api:connect/1`.

| {port, integer()}
| {register, atom() | fun()}
| {next_req_fun, fun()}.
To connect mc_worker in your supervised pool, use `mc_worker:start_link/1`
instead and pass all args to it.
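
As a sketch, a connect call using a few of the options above might look like this (the database name and host here are placeholders for your own deployment):

```erlang
% Sketch: connect with explicit options; <<"test">> and localhost
% are placeholders - adjust them to your deployment.
{ok, Connection} = mc_worker_api:connect([{database, <<"test">>},
                                          {host, "localhost"},
                                          {port, 27017}]).
```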

`safe`, along with `{safe, GetLastErrorParams}` and `unsafe`, are
write-modes. Safe mode makes a *getLastError* request
after every write in the sequence. If the reply says it failed then
the rest of the sequence is aborted and returns
`{failure, {write_failure, Reason}}`, or `{failure, not_master}` when
connected to a slave. An example write
failure is attempting to insert a duplicate key that is indexed to be
unique. Alternatively, unsafe mode issues every
write without a confirmation, so if a write fails you won't know about
it and the remaining operations will still be executed.
This is unsafe but faster because there is no round-trip delay.

`master`, along with `slave_ok`, are read-modes. `master` means every
query in the sequence must read fresh data (from
a master/primary server). If the connected server is not a master then
the first read will fail, the remaining operations
will be aborted, and `mongo:do` will return `{failure, not_master}`.
`slave_ok` means every query is allowed to read
stale data from a slave/secondary (fresh data from a master is fine too).

Read-modes only apply to the deprecated `mc_worker_api` query commands. For
the new API, pass a `readopts` map, like `#{<<"mode">> => <<"primary">>}`, to an
`mc_worker_api` function:
`find_one(Conn, Coll, Selector, [{readopts, #{<<"mode">> => <<"primary">>}}])`.

If you set the `{register, Name}` option, the mc_worker process will be
registered under this Name; alternatively you can pass a function
`fun(pid())`, which is run with the worker's own pid.
If you set the `{login, Login}` and `{password, Password}` options,
mc_worker will try to authenticate to the database.

`next_req_fun` is a function called every time the worker sends a
request to the database. It can be used to optimise pool
usage. When you use a poolboy transaction (or a mongoc transaction, which
uses a poolboy transaction internally), `mc_worker` sends the request
to the database and then does nothing while waiting for the reply. You can use
`{next_req_fun, fun() -> poolboy:checkin(?DBPOOL, self()) end}`
to make workers return to the pool as soon as they finish a request.
When the response from the database comes back, it is saved in the
mc_worker mailbox, which is processed just before the next
call to mc_worker.
__Notice__ that poolboy's pool should be created with `{strategy, fifo}`
to make usage of pool workers uniform.
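
Putting the pieces above together, a pool wired up this way might be sketched as follows. This is a hypothetical setup: the pool name, sizes, and database are assumptions, and `mc_worker` is used as poolboy's worker module here.

```erlang
%% Sketch: a fifo poolboy pool whose workers check themselves back in
%% as soon as the request has been sent. All names/sizes are placeholders.
-define(DBPOOL, mongo_pool).

start_pool() ->
    PoolArgs = [{name, {local, ?DBPOOL}},
                {worker_module, mc_worker},
                {size, 10},
                {max_overflow, 20},
                {strategy, fifo}],          % fifo is required for this pattern
    WorkerArgs = [{database, <<"test">>},
                  {next_req_fun,
                   fun() -> poolboy:checkin(?DBPOOL, self()) end}],
    poolboy:start_link(PoolArgs, WorkerArgs).

find_user(Selector) ->
    poolboy:transaction(?DBPOOL, fun(Worker) ->
        mc_worker_api:find_one(Worker, <<"users">>, Selector)
    end).
```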

### Writing
After you have connected to your database you can carry out write operations,
such as `insert`, `update` and `delete`:

> Collection = <<"test">>.
<<"home">>=> #{<<"city">> => <<"Boston">>, <<"state">> => <<"MA">>},
<<"league">> => <<"American">>}
]),
An insert example (from the `mongo_SUITE` test module). `Connection` is your
connection obtained from `mc_worker_api:connect`, `Collection`
is your collection name, and `Doc` is what you want to save.
The Doc will be returned if the insert succeeded. If the Doc doesn't contain an `_id`
field, it is returned with an automatically generated `_id`. If an error occurred, the Connection will
fail.

> mc_worker_api:delete(Connection, Collection, Selector).
A delete example. `Connection` is your Connection, `Collection` is the
collection you want to clean, and `Selector` holds the
rules for cleaning. If you want to clean everything, pass an empty `{}`.
You can also use maps instead of bson documents:

> Collection = <<"test">>.
To call read operations use `find` and `find_one`:

> {ok, Cursor} = mc_worker_api:find(Connection, Collection, Selector)
All params are similar to `delete`.
The difference between `find` and `find_one` is in the return value. `find_one` just
returns your result, while `find` returns a
`Cursor` - a special process' pid. You can query data through this process
with the help of the `mc_cursor` module.

> Result = mc_cursor:next(Cursor),
> mc_cursor:close(Cursor),
__Important!__ Do not forget to close cursors after using them!
`mc_cursor:rest` closes the cursor automatically.
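
One way to guarantee the cursor gets closed even if processing crashes is a `try ... after` wrapper (a sketch; the function name is an assumption):

```erlang
%% Sketch: fetch one result and always close the cursor afterwards,
%% even if the read throws.
find_first(Connection, Collection, Selector) ->
    {ok, Cursor} = mc_worker_api:find(Connection, Collection, Selector),
    try
        mc_cursor:next(Cursor)
    after
        mc_cursor:close(Cursor)
    end.
```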

To search for params - specify `Selector`:
mc_worker_api:find_one(Connection, Collection, {<<"key">>, <<"123">>}).
will return one document from collection Collection with key == <<"123">>.

mc_worker_api:find_one(Connection, Collection, #{<<"key">> => <<"123">>, <<"value">> => <<"built_in">>}).
will return one document from collection Collection with key == <<"123">>
`and` value == <<"built_in">>.
Tuples `{<<"key">>, <<"123">>}` in the first example and `{<<"key">>, <<"123">>,
<<"value">>, <<"built_in">>}` in the second are selectors.

For filtering result - use `Projector`:

mc_worker_api:find_one(Connection, Collection, {}, #{projector => #{<<"value">> => true}}).
will return one document from collection Collection, fetching `only`
_id and value.

mc_worker_api:find_one(Connection, Collection, {}, #{projector => #{<<"key">> => false, <<"value">> => false}}).
will return your data without the key and value params. If there is no other
data, only _id will be returned.

### Updating
This updates selected fields:
<<"tags">> => ["coats", "outerwear", "clothing"]
}},
mc_worker_api:update(Connection, Collection, #{<<"_id">> => 100}, Command),
This will add a new field `expired` if there is no such field, and set it
to true.

Command = #{<<"$set">> => #{<<"expired">> => true}},
To create indexes, use the `mc_worker_api:ensure_index/3` command:
mc_worker_api:ensure_index(Connection, Collection, #{<<"key">> => #{<<"index">> => 1}, <<"name">> => <<"MyI">>}). %advanced
mc_worker_api:ensure_index(Connection, Collection, #{<<"key">> => #{<<"index">> => 1}, <<"name">> => <<"MyI">>, <<"unique">> => true, <<"dropDups">> => true}). %full

ensure_index takes an `mc_worker` pid or atom name as the first parameter,
the collection in which to create the index as the second
parameter, and a bson document with the index
specification as the third parameter. In the index specification one can set all
or only some parameters.
If the index specification is not full, it is automatically filled with the
values `name, Name, unique, false, dropDups, false`,
where `Name` is the index's key.

### Administering
This driver does not provide helper functions for commands. Use
`mc_worker_api:command` directly and refer to the
[MongoDB documentation](http://www.mongodb.org/display/DOCS/Commands)
for how to issue raw commands.
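
For example, a raw command can be issued directly like this (a sketch; `dbStats` is just one possible command, and the exact return shape depends on your driver and server version, so it is left unmatched here):

```erlang
% Sketch: run the raw dbStats server command on the connected database.
Result = mc_worker_api:command(Connection, {<<"dbStats">>, 1}).
```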

### Authentication
To authenticate, use the function `mc_worker_api:connect`, or
`mc_worker:start_link([... {login, <<"login">>}, {password, <<"password">>} ...])`.
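
A connect call with authentication might then be sketched as follows (all values are placeholders):

```erlang
% Sketch: authenticate while connecting; database, login and
% password are placeholders for your own credentials.
{ok, Connection} = mc_worker_api:connect([{database, <<"test">>},
                                          {login, <<"login">>},
                                          {password, <<"password">>}]).
```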

### Timeout

By default the timeout for all connections to the connection gen_server is `infinity`.
If you run into problems with this, you can
modify the timeout.
To do so, add `mc_worker_call_timeout` with a new value to your
application's env config.

A timeout for operations with cursors may be explicitly passed to the `mc_cursor:next/2`,
`mc_cursor:take/3`, `mc_cursor:rest/2`, and `mc_cursor:foldl/5` functions.
By default the value of `cursor_timeout` from the application config is used, or
`infinity` if `cursor_timeout` is not specified.
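
For instance, an explicit per-call timeout (in milliseconds) would look like this sketch:

```erlang
% Sketch: wait at most 5000 ms for the next cursor batch,
% overriding cursor_timeout for this call only.
Result = mc_cursor:next(Cursor, 5000).
```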

### Pooling

If you need a simple pool, use the modified [Poolboy](https://github.com/comtihon/poolboy),
which is included in this app's deps. As the worker module use `mc_worker_api`.
If you need a pool to a mongo shard with topology discovery, use `mongo_api`
for automatic topology discovery and monitoring. It uses poolboy inside.

For opening a connection to a MongoDB server you can call `mongoc:connect/3`:
Where `Seed` contains information about host names and ports to connect
and info about topology of MongoDB deployment.

So you can pass just a hostname with port (or a tuple with a single key) for
connection to a single-server deployment:

```erlang
To connect to a sharded cluster of mongos:
ShardedSeed = { sharded, ["hostname1:port1", "hostname2:port2"] }
```

And if you want your MongoDB deployment metadata to be auto-discovered, use
the `unknown` type in the `Seed` tuple:

```erlang
AutoDiscoveredSeed = { unknown, ["hostname1:port1", "hostname2:port2"] }
4 changes: 2 additions & 2 deletions doc/design.md
I am a 10gen employee and author of the official [Erlang MongoDB driver](http://g

### BSON

At the highest level, the driver is divided into two library applications, [mongodb](http://github.com/mongodb/mongodb-erlang) and [bson](http://github.com/mongodb/bson-erlang). Bson is defined independently of MongoDB at [bsonspec.org](http://bsonspec.org). One design decision was how to represent Bson documents in Erlang. Conceptually, a document is a record, but unlike an Erlang record, a Bson document does not have a single type tag. Furthermore, the same MongoDB collection can hold different types of records. So I decided to represent a Bson document as a tuple with labels interleaved with values, as in `{name, Name, address, Address}`. An alternative would have been to represent a document as a list of label-value pairs, but I wanted to reserve lists for Bson arrays.

A Bson value is one of several types. One of these types is the document type itself, making it recursive. Several value types are not primitive, like objectid and javascript, so I had to create a tagged tuple for each of them. I defined them all to have an odd number of elements to distinguish them from a document which has an even number of elements. Finally, to distinguish between a string and a list of integers, which is indistinguishable in Erlang, I require Bson strings to be binary (UTF-8). Therefore, a plain Erlang string is interpreted as a Bson array of integers, so make sure to always encode your strings, as in `<<"hello">>` or `bson:utf8("hello")`.

Expand All @@ -25,7 +25,7 @@ You may notice that a DB action is analogous to a DB transaction for a relationa

Detailed documentation with examples can be found in the ReadMe's of the two libraries, [mongodb](http://github.com/mongodb/mongodb-erlang) and [bson](http://github.com/mongodb/bson-erlang), and in their source code comments and test modules.

In addition, recent documentation on the API will be available on the [mongodb website](http://api.mongodb.org/erlang/mongodb/).

Should you wish to generate this from the latest code, you can run:

Expand Down
4 changes: 2 additions & 2 deletions enot_config.json
{
"name": "mongodb",
"app_vsn": "3.4.4",
"app_vsn": "3.4.6",
"deps": [
{
"name": "bson",
{
"name": "poolboy",
"url": "git://github.com/comtihon/poolboy",
"branch": "master"
"branch": "1.6.1"
}
]
}
3 changes: 2 additions & 1 deletion include/mongo_protocol.hrl
-record(getmore, {
collection :: colldb(),
batchsize = 0 :: mc_worker_api:batchsize(),
cursorid :: mc_worker_api:cursorid()
cursorid :: mc_worker_api:cursorid(),
database :: database()
}).

%% system
6 changes: 3 additions & 3 deletions rebar.config

{deps, [
{bson, ".*",
{git, "git://github.com/comtihon/bson-erlang", {tag, "v0.2.4"}}},
{git, "https://github.com/comtihon/bson-erlang.git", {tag, "v0.2.4"}}},
{pbkdf2, ".*",
{git, "git://github.com/comtihon/erlang-pbkdf2.git", {tag, "2.0.1"}}},
{git, "https://github.com/comtihon/erlang-pbkdf2.git", {tag, "2.0.1"}}},
{poolboy, ".*",
{git, "git://github.com/comtihon/poolboy.git", {branch, "master"}}}
{git, "https://github.com/comtihon/poolboy.git", {branch, "1.6.1"}}}
]}.

{clean_files, [
12 changes: 0 additions & 12 deletions rebar.lock

This file was deleted.

Binary file modified rebar3
6 changes: 3 additions & 3 deletions src/connection/mc_auth_logic.erl
scram_sha_1_auth(Connection, Database, Login, Password) ->
scram_first_step(Connection, Database, Login, Password)
catch
_:_ ->
erlang:error(<<"Can't pass authentification">>)
erlang:error(<<"Can't pass authentication">>)
end.

%% @private
xorKeys(<<FA, RestA/binary>>, <<FB, RestB/binary>>, Res) ->
xorKeys(RestA, RestB, <<Res/binary, <<(FA bxor FB)>>/binary>>).

%% @private
parse_server_responce(Responce) ->
ParamList = binary:split(Responce, <<",">>, [global]),
parse_server_responce(Response) ->
ParamList = binary:split(Response, <<",">>, [global]),
lists:map(
fun(Param) ->
[K, V] = binary:split(Param, <<"=">>),
5 changes: 3 additions & 2 deletions src/connection/mc_connection_man.erl
read(Connection, Request) -> read(Connection, Request, undefined).

-spec read(pid() | atom(), query(), undefined | mc_worker_api:batchsize()) -> [] | {ok, pid()}.
read(Connection, Request = #'query'{collection = Collection, batchsize = BatchSize}, CmdBatchSize) ->
read(Connection, Request = #'query'{collection = Collection, batchsize = BatchSize, database = DB}, CmdBatchSize) ->
case request_worker(Connection, Request) of
{_, []} ->
[];
{Cursor, Batch} ->
mc_cursor:start_link(Connection, Collection, Cursor, select_batchsize(CmdBatchSize, BatchSize), Batch)
mc_cursor:start_link(Connection, Collection, Cursor, select_batchsize(CmdBatchSize, BatchSize), Batch, DB)
end.


-spec read_one(pid() | atom(), query()) -> undefined | map().
read_one(Connection, Request) ->
