more docs, better readme, also build
erhant committed Feb 2, 2024
1 parent 500434f commit 89e32d1
Showing 11 changed files with 229 additions and 45 deletions.
27 changes: 27 additions & 0 deletions .github/workflows/publish.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
name: Publish Package to npmjs

on:
  release:
    types: [published]

jobs:
  publish:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4
      - uses: oven-sh/setup-bun@v1

      # do we really need this one?
      - uses: actions/setup-node@v3
        with:
          node-version: 18.x

      - run: bun install

      - run: bun run build

      - uses: JS-DevTools/npm-publish@v3
        id: publish
        with:
          token: ${{ secrets.NPM_TOKEN }}
3 changes: 3 additions & 0 deletions .gitignore
@@ -178,3 +178,6 @@ dist
.env.test
.env.prod
.env.dev

# build
build
118 changes: 107 additions & 11 deletions README.md
@@ -11,7 +11,7 @@
</p>
</p>

DriaJS client is a library & CLI that integrates Dria to your application, providing a convenient interface to harness the capabilities of Dria's vector search and retrieval services.
DriaJS client is a library & CLI that integrates [Dria](https://dria.co/) to your application, providing a convenient interface to harness the capabilities of Dria's vector search and retrieval services.

- [x] Create & manage your knowledge bases on Dria.
- [x] Make vector based queries, text based searches or fetch vectors by their IDs.
@@ -30,29 +30,125 @@ bun add dria

## Usage

With Dria, you can connect to an existing knowledge uploaded to Dria by providing its contract txID.
To begin, import Dria to your code:

TODO: add readme
```ts
import Dria from "dria";
```

## Testing
### Queries

Clone the repo, and then install packages:
With Dria, you can connect to an existing knowledge uploaded to Dria by providing its contract ID. You can then ask questions to this knowledge, make vector based queries, or directly fetch embeddings with their IDs.

```sh
bun install
```ts
const dria = new Dria({ apiKey, contractId });

// a text-based search
const searchRes = await dria.search("What is the capital of France?");

// a vector-based query
const queryRes = await dria.query([0.1, 0.2, 0.3]);

// fetch data for specific ids
const fetchRes = await dria.fetch([0, 1, 2]);
```
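
The `apiKey` passed to the constructor above is optional, with `DRIA_API_KEY` as the fallback. The following is only a sketch of what such a lookup could look like; `ClientParams` and `resolveApiKey` are hypothetical names for illustration and are not part of the library. The environment is passed in as a plain map so the snippet stays self-contained.

```ts
// Hypothetical parameter shape, mirroring the `apiKey` / `contractId` options above.
interface ClientParams {
  apiKey?: string;
  contractId?: string;
}

// Illustrative helper (not the library's code): prefer the explicit key,
// then fall back to the DRIA_API_KEY entry of the given environment map.
function resolveApiKey(
  params: ClientParams,
  env: Record<string, string | undefined>,
): string {
  const apiKey = params.apiKey ?? env["DRIA_API_KEY"];
  if (!apiKey) {
    throw new Error("Missing API key: pass `apiKey` or set DRIA_API_KEY.");
  }
  return apiKey;
}

// An explicit key wins over the environment; otherwise the environment is used.
const fromParams = resolveApiKey({ apiKey: "explicit-key" }, { DRIA_API_KEY: "env-key" });
const fromEnv = resolveApiKey({}, { DRIA_API_KEY: "env-key" });
```

In an application, the second argument would typically be `process.env`.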

You can run tests via:
> [!TIP]
>
> You can omit the `apiKey`, in which case Dria will look for it in the `DRIA_API_KEY` environment variable.

### Inserting Data

You can insert new data into your existing knowledge, either as a batch of texts with metadata or as a batch of vectors with metadata.

```ts
const dria = new Dria({ apiKey, contractId });

// insert raw text, which will be converted to vector embeddings
// with respect to the model used by this contract
const insertTextRes = await dria.insertTexts([
  { text: "I am a text.", metadata: { fromReadme: true } },
  { text: "I am another text.", metadata: { fromReadme: true } },
]);

// or, compute embeddings on your own and insert the vectors
const insertVectorRes = await dria.insertVectors([
  { vector: [0.1, 0.2, 0.3], metadata: { fromReadme: true } },
  { vector: [0.3, 0.2, 0.1], metadata: { fromReadme: true } },
]);
```
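
When inserting pre-computed vectors, it can help to check locally that a batch is at least internally consistent before sending it. The sketch below assumes the item shape used in the example above; `VectorItem` and `assertUniformDimension` are hypothetical names, and the library's own schemas may validate more (or differently) than this.

```ts
// Assumed item shape, taken from the example above; not the library's schema.
type Metadata = Record<string, string | number | boolean>;
type VectorItem = { vector: number[]; metadata: Metadata };

// Illustrative check: every vector in a batch should share one dimension.
// Returns that dimension, or throws on an empty or inconsistent batch.
function assertUniformDimension(batch: VectorItem[]): number {
  const first = batch[0];
  if (first === undefined) throw new Error("Empty batch.");
  const dim = first.vector.length;
  for (const item of batch) {
    if (item.vector.length !== dim) {
      throw new Error(`Expected dimension ${dim}, got ${item.vector.length}.`);
    }
  }
  return dim;
}

const batchDim = assertUniformDimension([
  { vector: [0.1, 0.2, 0.3], metadata: { fromReadme: true } },
  { vector: [0.3, 0.2, 0.1], metadata: { fromReadme: true } },
]);
```

This only verifies that the batch agrees with itself; whether the dimension matches the contract's embedding model still has to be checked against that model.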

### Creating a Knowledge

A new knowledge can be created with Dria client as well. In this example, we omit the `contractId` that was provided to the constructor, since we don't have a contract yet. After deploying a contract, we will set that field manually and we will then be able to call all functions described above so far!

```ts
const dria = new Dria({ apiKey });

const contractId = await dria.create(
  "My New Contract",
  "jinaai/jina-embeddings-v2-base-en",
  "Science",
);
dria.contractId = contractId;
```

Our client supports a variety of text embedding models by default:

- OpenAI's Text Embeddings-2 Ada (text-embedding-ada-002)
- OpenAI's Text Embeddings-3 Small (text-embedding-3-small)
- OpenAI's Text Embeddings-3 Large (text-embedding-3-large)
- Jina's Embeddings V2 Base EN (jina-embeddings-v2-base-en)
- Jina's Embeddings V2 Small EN (jina-embeddings-v2-small-en)

> [!WARNING]
>
> If you provide a different embedding model when creating a contract, you are expected to use that same embedding model to create vectors from text queries, and then call the `query` method with those vectors.

### Metadata Types

Each knowledge may have a different metadata type, based on the content they were created from. For example, a CSV knowledge will have each column as a separate field in the metadata. You can provide the metadata type as a template parameter so that all methods are type-safe:

```ts
type MetadataType = { id: number; foo: string; bar: boolean };
const dria = new Dria<MetadataType>();

// metadata is typed as given above
const res = await dria.fetch([0]);
```

The metadata type can also be overridden per method call, if need be:

```ts
const res = await dria.fetch<{ page: number; source: string }>([0]);
```
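
The class-level default with a per-call override can be illustrated with a small stand-in. This is a sketch of the typing pattern only: `TypedRows` and its `get` method are hypothetical, serve canned JSON strings instead of calling the API, and are synchronous (unlike the client's async `fetch`) to keep the example short.

```ts
type MetadataType = Record<string, string | number | boolean>;

// Stand-in for the client: holds JSON-encoded rows instead of calling an API.
class TypedRows<T extends MetadataType = MetadataType> {
  private readonly rows: string[];

  constructor(rows: string[]) {
    this.rows = rows;
  }

  // M defaults to the class-level T, but a caller can override it per call.
  get<M extends MetadataType = T>(ids: number[]): M[] {
    return ids.map((id) => {
      const row = this.rows[id];
      if (row === undefined) throw new Error(`No row with id ${id}.`);
      return JSON.parse(row) as M;
    });
  }
}

const typedRows = new TypedRows<{ id: number; foo: string; bar: boolean }>([
  '{"id":0,"foo":"a","bar":true}',
]);

// Typed with the class-level metadata type.
const asDefault = typedRows.get([0]);
// Typed with the per-call override.
const asOverride = typedRows.get<{ id: number }>([0]);
```

Note that the `as M` cast is a compile-time claim only; nothing here validates the parsed JSON at runtime.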

## Building

You can build the library for NPM via:

```sh
bun test
bun run build
bun b # alias
```

We are using Bun's own [bundler](https://bun.sh/docs/bundler).

> [!NOTE]
>
> The protobuf files are included in the repo, but they can be generated again via:
>
> ```sh
> bun proto:code
> bun proto:type
> bun proto
> ```

## Testing

You can run tests via:

```sh
bun run test
bun t # alias
```

You will need an API key in the `DRIA_API_KEY` environment variable, which you can provide in an `.env.test` file.
10 changes: 10 additions & 0 deletions build.ts
@@ -0,0 +1,10 @@
// eslint-disable-next-line node/no-unpublished-import
import dts from "bun-plugin-dts";

await Bun.build({
  entrypoints: ["./src/index.ts"],
  outdir: "./build",
  target: "node",
  external: ["axios", "zod"],
  plugins: [dts()],
});
Binary file modified bun.lockb
Binary file not shown.
17 changes: 13 additions & 4 deletions package.json
@@ -6,14 +6,22 @@
"contributors": [
"Erhan Tezcan <[email protected]> (https://github.com/erhant)"
],
"files": [
"build/",
"LICENSE",
"README.md"
],
"scripts": {
"cli": "bun run ./src/bin/index.ts",
"build": "bun run ./build.ts",
"b": "bun run build",
"check": "tsc --noEmit && echo \"All good.\"",
"format": "prettier --check '**/*.ts'",
"lint": "eslint '**/*.ts' && echo 'All good.'",
"t": "bun test",
"proto:code:insert": "npx pbjs ./proto/insert.proto -w commonjs -t static-module -o ./proto/insert.js",
"proto:type:insert": "npx pbts ./proto/insert.js -o ./proto/insert.d.ts"
"test": "bun test --timeout 15000",
"t": "bun run test",
"proto:code": "npx pbjs ./proto/insert.proto -w commonjs -t static-module -o ./proto/insert.js",
"proto:type": "npx pbts ./proto/insert.js -o ./proto/insert.d.ts",
"proto": "bun proto:code && bun proto:type"
},
"type": "module",
"module": "index.ts",
@@ -24,6 +32,7 @@
"devDependencies": {
"@types/bun": "^1.0.4",
"@typescript-eslint/eslint-plugin": "^6.20.0",
"bun-plugin-dts": "^0.2.1",
"eslint": "^8.56.0",
"eslint-config-prettier": "^9.1.0",
"eslint-plugin-node": "^11.1.0",
44 changes: 36 additions & 8 deletions src/dria.ts
@@ -8,17 +8,39 @@ import constants from "./constants";
/**
* ## Dria JS Client
*
* If no API key is provided, Dria will look for `DRIA_API_KEY` on the environment.
*
* @param params optional API key and contract txID.
*
* - `apiKey`: User API key.
*
* If not provided, Dria will look for `DRIA_API_KEY` on the environment.
* To find your API key, go to your profile page at [Dria](https://dria.co/profile).
*
* - `contractId`: Contract ID for the knowledge, corresponding to the transaction id of a contract deployment on Arweave.
* In [Dria](https://dria.co/profile), this can be seen at the top of the page when viewing a knowledge.
*
* @template T default type of metadata; a metadata in Dria is a single-level mapping, with string keys and values of type `string`, `number`
*
* @example
* const dria = new Dria({
* apiKey: "your-api-key",
* contractId: "your-contract"
* });
*
* @example
* // provide metadata type
* const dria = new Dria<{foo: string, bar: number}>({
* contractId: "your-contract"
* // apiKey not provided here, so Dria will
* // read it from env as `DRIA_API_KEY`
* });
*
* @example
* // optional metadata type
* type MetadataType = {foo: string, bar: number};
* const dria = new Dria<MetadataType>();
* const dria = new Dria();
* const contractId = await dria.create();
* dria.contractId = contractId;
*/
// eslint-disable-next-line @typescript-eslint/no-explicit-any
export class Dria<T extends MetadataType = any> {
export default class Dria<T extends MetadataType = any> {
protected client: AxiosInstance;
contractId: string | undefined;
/** Cached contract models. */
@@ -95,11 +117,14 @@ export class Dria<T extends MetadataType = any> {
*/
async fetch<M extends MetadataType = T>(ids: number[]) {
if (ids.length === 0) throw "No IDs provided.";
const data = await this.post<string[]>(constants.DRIA_BASE_URL + "/fetch", {
const data = await this.post<{ metadata: string[]; vectors: number[][] }>(constants.DRIA_BASE_URL + "/fetch", {
id: ids,
contract_id: this.getContractId(),
});
return data.map((d) => JSON.parse(d) as M);
return data.metadata.map((m, i) => ({
metadata: JSON.parse(m) as M,
vector: data.vectors[i],
}));
}
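
The new `fetch` body pairs two parallel arrays by index. As a standalone sketch of that step, assuming the response shape given in the `post<...>` type argument above: `FetchResponse` and `zipFetchResponse` are hypothetical names, and the length and bounds checks are additions for illustration that the commit itself does not contain.

```ts
// Assumed response shape, copied from the `post<...>` type argument above.
interface FetchResponse {
  metadata: string[]; // JSON-encoded metadata, one entry per requested id
  vectors: number[][]; // vectors, in the same order as `metadata`
}

// Pair each parsed metadata entry with the vector at the same index.
function zipFetchResponse<M>(data: FetchResponse): { metadata: M; vector: number[] }[] {
  // Illustrative guard (not in the commit): refuse arrays of different lengths.
  if (data.metadata.length !== data.vectors.length) {
    throw new Error("Metadata and vectors have different lengths.");
  }
  return data.metadata.map((m, i) => {
    const vector = data.vectors[i];
    if (vector === undefined) throw new Error(`Missing vector at index ${i}.`);
    return { metadata: JSON.parse(m) as M, vector };
  });
}

const zipped = zipFetchResponse<{ id: string; text: string }>({
  metadata: ['{"id":"0","text":"hello"}', '{"id":"1","text":"world"}'],
  vectors: [
    [0.1, 0.2],
    [0.3, 0.4],
  ],
});
```

Callers that previously read fields directly from the result (for example `r.id`) now need to go through `r.metadata`, which is what the test changes in this commit reflect.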

/**
@@ -157,11 +182,14 @@ export class Dria<T extends MetadataType = any> {
* @param description (optional) description of the knowledge.
* @returns contract txID of the created contract.
* @example
* const dria = new Dria({apiKey: "your-api-key"});
* const contractId = await dria.create(
* "My Contract",
* "jinaai/jina-embeddings-v2-base-en",
* "Science"
* )
* dria.contractId = contractId;
* // you can now make queries, or insert data there
*/
async create(name: string, embedding: ModelTypes, category: CategoryTypes, description: string = "") {
const data = await this.post<{ contract_id: string }>(constants.DRIA_CONTRACT_URL + "/create", {
4 changes: 3 additions & 1 deletion src/index.ts
@@ -1,2 +1,4 @@
export { Dria } from "./dria";
import Dria from "./dria";
export default Dria;

export type { DriaParams } from "./types";
12 changes: 1 addition & 11 deletions src/types/index.ts
@@ -1,14 +1,4 @@
/**
* Dria client parameters.
*
* - `apiKey`: User API key.
*
* If not provided, Dria will look for `DRIA_API_KEY` on the environment.
* To find your API key, go to your profile page at [Dria](https://dria.co/profile).
*
* - `contractId`: Contract ID for the knowledge, corresponding to the transaction id of a contract deployment on Arweave.
* In [Dria](https://dria.co/profile), this can be seen at the top of the page when viewing a knowledge.
*/
/** Dria client parameters. */
export interface DriaParams {
apiKey?: string;
contractId?: string;
32 changes: 25 additions & 7 deletions tests/api.test.ts
@@ -1,7 +1,7 @@
import { expect, describe, it } from "bun:test";
import { randomVector } from "./utils";
import { BatchTexts, BatchVectors } from "../src/schemas";
import { Dria } from "../src";
import Dria from "../src";

describe("API", () => {
// contract of a TypeScript Book uploaded to Dria
@@ -16,14 +16,20 @@ describe("API", () => {
const ids = [0, 1, 2];
const res = await dria.fetch(ids);
res.forEach((r) => {
expect(r.id).toBeString();
expect(r.text).toBeString();
expect(r.metadata.id).toBeString();
expect(r.metadata.text).toBeString();
});
});

it("should NOT fetch without ids", async () => {
expect(async () => await dria.fetch([])).toThrow("No IDs provided.");
});

it("should NOT fetch without contract ID", async () => {
dria.contractId = undefined;
expect(async () => await dria.fetch([0])).toThrow("ContractID was not set.");
dria.contractId = contractId;
});
});

describe("search", () => {
@@ -59,7 +65,13 @@ describe("API", () => {
it("should NOT search with wrong level", async () => {
expect(async () => await dria.search("hi", { level: 5 })).toThrow();
expect(async () => await dria.search("hi", { level: 2.5 })).toThrow();
expect(async () => await dria.search("hi", { level: 0 })).toThrow();
expect(async () => await dria.search("hi", { level: -1 })).toThrow();
});

it("should NOT search without contract ID", async () => {
dria.contractId = undefined;
expect(async () => await dria.search("hi")).toThrow("ContractID was not set.");
dria.contractId = contractId;
});
});

@@ -96,11 +108,17 @@ describe("API", () => {
expect(async () => await dria.query([1], { topK: 10.05 })).toThrow();
expect(async () => await dria.query([1], { topK: 0 })).toThrow();
});

it("should NOT query without contract ID", async () => {
dria.contractId = undefined;
expect(async () => await dria.query([1])).toThrow("ContractID was not set.");
dria.contractId = contractId;
});
});
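
The three "without contract ID" tests above all unset `dria.contractId`, expect the message "ContractID was not set.", and then restore the field. The body of `getContractId` is not part of this diff, so the following is only a guess at the kind of guard those tests exercise; `ContractHolder` is a hypothetical stand-in, and it throws an `Error` where the real client may throw something else.

```ts
// Hypothetical stand-in for the part of the client that holds the contract ID.
class ContractHolder {
  contractId: string | undefined;

  // Assumed guard: return the ID if present, otherwise fail loudly.
  getContractId(): string {
    if (this.contractId === undefined) {
      throw new Error("ContractID was not set.");
    }
    return this.contractId;
  }
}

const holder = new ContractHolder();

// Without an ID, the guard throws and we capture the message.
let guardMessage = "";
try {
  holder.getContractId();
} catch (err) {
  guardMessage = err instanceof Error ? err.message : String(err);
}

// Once the ID is set, the same call succeeds.
holder.contractId = "some-contract-id";
const resolvedId = holder.getContractId();
```

Because the tests mutate shared state, restoring `contractId` at the end of each one (as they do) keeps later tests independent of execution order.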

// waiting for API fix
describe.todo("insert texts", () => {
it("should insert texts", async () => {
describe("insert texts", () => {
// TODO: waiting for API fix on this
it.todo("should insert texts", async () => {
const res = await dria.insertTexts([
{ text: "I am an inserted text.", metadata: { id: 112233, info: "Test_1" } },
{ text: "I am another inserted text.", metadata: { id: 223344, info: "Test_2" } },