add markdown documentation for doc embedding schema
e-maud committed Nov 29, 2024
1 parent d49f6c9 commit b0a58ea
Showing 19 changed files with 414 additions and 0 deletions.
9 changes: 9 additions & 0 deletions docs/embeddings-docs-backup-properties-the-embedder-schema.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
## embedder Type

`string` ([The Embedder Schema](embeddings-docs-backup-properties-the-embedder-schema.md))

## embedder Examples

```json
"Alibaba-NLP/gte-multilingual-base@f7d567e"
```
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
## items Type

`number` ([The Items Schema](embeddings-docs-backup-properties-the-embedding-schema-the-items-schema.md))

## items Examples

```json
-0.11429
```
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
## embedding Type

`number[]` ([The Items Schema](embeddings-docs-backup-properties-the-embedding-schema-the-items-schema.md))
19 changes: 19 additions & 0 deletions docs/embeddings-docs-backup-properties-the-id-schema.md
@@ -0,0 +1,19 @@
## id Type

`string` ([The Id Schema](embeddings-docs-backup-properties-the-id-schema.md))

## id Constraints

**pattern**: the string must match the following regular expression: 

```regexp
^(.*)$
```

[try pattern](https://regexr.com/?expression=%5E\(.*\)%24 "try regular expression with regexr.com")

## id Examples

```json
"actionfem-1940-01-08-a-i0001"
```
9 changes: 9 additions & 0 deletions docs/embeddings-docs-backup-properties-the-length-schema.md
@@ -0,0 +1,9 @@
## len Type

`integer` ([The Length Schema](embeddings-docs-backup-properties-the-length-schema.md))

## len Examples

```json
2976
```
19 changes: 19 additions & 0 deletions docs/embeddings-docs-backup-properties-the-ts-schema.md
@@ -0,0 +1,19 @@
## ts Type

`string` ([The Ts Schema](embeddings-docs-backup-properties-the-ts-schema.md))

## ts Constraints

**pattern**: the string must match the following regular expression: 

```regexp
^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}(\+00:00|Z)$
```

[try pattern](https://regexr.com/?expression=%5E%5B0-9%5D%7B4%7D-%5B0-9%5D%7B2%7D-%5B0-9%5D%7B2%7DT%5B0-9%5D%7B2%7D%3A%5B0-9%5D%7B2%7D%3A%5B0-9%5D%7B2%7D\(%5C%2B00%3A00%7CZ\)%24 "try regular expression with regexr.com")

## ts Examples

```json
"2024-08-29T06:42:53+00:00"
```
147 changes: 147 additions & 0 deletions docs/embeddings-docs-backup.md
@@ -0,0 +1,147 @@
## Document Embeddings JSON Schema Type

`object` ([Document Embeddings JSON Schema](embeddings-docs-backup.md))

# Document Embeddings JSON Schema Properties

| Property | Type | Required | Nullable | Defined by |
| :---------------------- | :-------- | :------- | :------------- | :------------------------------------------------------------------------------------------------------------------------------------------ |
| [id](#id) | `string` | Required | cannot be null | [Document Embeddings JSON Schema](embeddings-docs-backup-properties-the-id-schema.md "#/properties/id#/properties/id") |
| [ts](#ts) | `string` | Required | cannot be null | [Document Embeddings JSON Schema](embeddings-docs-backup-properties-the-ts-schema.md "#/properties/ts#/properties/ts") |
| [embedder](#embedder) | `string` | Required | cannot be null | [Document Embeddings JSON Schema](embeddings-docs-backup-properties-the-embedder-schema.md "#/properties/embedder#/properties/embedder") |
| [len](#len) | `integer` | Optional | cannot be null | [Document Embeddings JSON Schema](embeddings-docs-backup-properties-the-length-schema.md "#/properties/len#/properties/len") |
| [embedding](#embedding) | `array` | Required | cannot be null | [Document Embeddings JSON Schema](embeddings-docs-backup-properties-the-embedding-schema.md "#/properties/embedding#/properties/embedding") |
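
The table above can be read as a checklist for the top-level shape of one record. A minimal Python sketch of that check follows; the function name `check_record`, the mapping from JSON types to Python types, and the sample record assembled from the per-property examples are assumptions made for illustration, not part of the schema or its tooling. It covers only presence and top-level type, not the `pattern` constraints or the element types of `embedding`.

```python
# Property names, JSON types and required flags are copied from the table;
# the Python types they map to are this sketch's own choice.
PROPERTIES = {
    "id": (str, True),
    "ts": (str, True),
    "embedder": (str, True),
    "len": (int, False),
    "embedding": (list, True),
}


def check_record(record: dict) -> list:
    """Return a list of problems; an empty list means the top-level shape is fine."""
    problems = []
    for name, (expected_type, required) in PROPERTIES.items():
        if name not in record:
            if required:
                problems.append(f"missing required property: {name}")
            continue
        value = record[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        # None fails the isinstance check, which mirrors "cannot be null".
        if isinstance(value, bool) or not isinstance(value, expected_type):
            problems.append(f"{name}: expected {expected_type.__name__}")
    return problems


# Hypothetical record built from the example values given for each property.
record = {
    "id": "actionfem-1940-01-08-a-i0001",
    "ts": "2024-08-29T06:42:53Z",
    "embedder": "Alibaba-NLP/gte-multilingual-base@f7d567e",
    "len": 2976,
    "embedding": [-0.11429],
}
print(check_record(record))
```

A real pipeline would more likely hand the schema file to a JSON Schema validator; the hand-written check is only meant to make the Required and Nullable columns concrete.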

## id

The unique identifier for a content item, cf. <https://github.com/impresso/impresso-schemas/blob/master/json/newspaper/contentitem.schema.json>

`id`

* is required

* Type: `string` ([The Id Schema](embeddings-docs-backup-properties-the-id-schema.md))

* cannot be null

* defined in: [Document Embeddings JSON Schema](embeddings-docs-backup-properties-the-id-schema.md "#/properties/id#/properties/id")

### id Type

`string` ([The Id Schema](embeddings-docs-backup-properties-the-id-schema.md))

### id Constraints

**pattern**: the string must match the following regular expression:

```regexp
^(.*)$
```

[try pattern](https://regexr.com/?expression=%5E\(.*\)%24 "try regular expression with regexr.com")

### id Examples

```json
"actionfem-1940-01-08-a-i0001"
```
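
The `id` pattern is very permissive: `^(.*)$` accepts any single-line string, including the empty string, so it does not enforce the structure visible in the example identifier. A small Python sketch makes this concrete. The helper name `id_matches` is hypothetical, and Python's `re` module is used here as a stand-in for the regular-expression dialect of a JSON Schema validator, which may differ in edge cases.

```python
import re

# Pattern copied verbatim from the constraint above.
ID_PATTERN = re.compile(r"^(.*)$")


def id_matches(value: str) -> bool:
    """True if the whole string satisfies the id pattern."""
    # fullmatch avoids Python's special case where "$" also matches
    # just before a trailing newline.
    return ID_PATTERN.fullmatch(value) is not None


print(id_matches("actionfem-1940-01-08-a-i0001"))
print(id_matches(""))
print(id_matches("first line\nsecond line"))
```

Only the multi-line string is rejected, because `.` does not match a newline.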

## ts

The timestamp when the embeddings were created.

`ts`

* is required

* Type: `string` ([The Ts Schema](embeddings-docs-backup-properties-the-ts-schema.md))

* cannot be null

* defined in: [Document Embeddings JSON Schema](embeddings-docs-backup-properties-the-ts-schema.md "#/properties/ts#/properties/ts")

### ts Type

`string` ([The Ts Schema](embeddings-docs-backup-properties-the-ts-schema.md))

### ts Constraints

**pattern**: the string must match the following regular expression:

```regexp
^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}(\+00:00|Z)$
```

[try pattern](https://regexr.com/?expression=%5E%5B0-9%5D%7B4%7D-%5B0-9%5D%7B2%7D-%5B0-9%5D%7B2%7DT%5B0-9%5D%7B2%7D%3A%5B0-9%5D%7B2%7D%3A%5B0-9%5D%7B2%7D\(%5C%2B00%3A00%7CZ\)%24 "try regular expression with regexr.com")

### ts Examples

```json
"2024-08-29T06:42:53+00:00"
```
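
The `ts` pattern allows exactly one UTC designator at the end of the string, either `+00:00` or `Z`, never both and never another offset. The following Python sketch checks a few candidate strings against it. The helper name `ts_matches` and the candidate values are illustrative assumptions, and Python's `re` module stands in for the validator's regular-expression dialect.

```python
import re

# Pattern copied verbatim from the constraint above.
TS_PATTERN = re.compile(
    r"^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}(\+00:00|Z)$"
)


def ts_matches(value: str) -> bool:
    """True if the whole string satisfies the ts pattern."""
    return TS_PATTERN.fullmatch(value) is not None


for candidate in (
    "2024-08-29T06:42:53+00:00",   # numeric UTC offset
    "2024-08-29T06:42:53Z",        # "Z" designator
    "2024-08-29T06:42:53+00:00Z",  # both designators at once
    "2024-08-29T06:42:53+02:00",   # non-UTC offset
):
    print(candidate, ts_matches(candidate))
```

The first two candidates match; the last two do not. The pattern checks only the shape of the string, so an impossible date such as month `13` would still pass.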

## embedder

The model or tool used to generate the embeddings.

`embedder`

* is required

* Type: `string` ([The Embedder Schema](embeddings-docs-backup-properties-the-embedder-schema.md))

* cannot be null

* defined in: [Document Embeddings JSON Schema](embeddings-docs-backup-properties-the-embedder-schema.md "#/properties/embedder#/properties/embedder")

### embedder Type

`string` ([The Embedder Schema](embeddings-docs-backup-properties-the-embedder-schema.md))

### embedder Examples

```json
"Alibaba-NLP/gte-multilingual-base@f7d567e"
```

## len

The length of the document in characters.

`len`

* is optional

* Type: `integer` ([The Length Schema](embeddings-docs-backup-properties-the-length-schema.md))

* cannot be null

* defined in: [Document Embeddings JSON Schema](embeddings-docs-backup-properties-the-length-schema.md "#/properties/len#/properties/len")

### len Type

`integer` ([The Length Schema](embeddings-docs-backup-properties-the-length-schema.md))

### len Examples

```json
2976
```

## embedding

The vector embeddings of the document.

`embedding`

* is required

* Type: `number[]` ([The Items Schema](embeddings-docs-backup-properties-the-embedding-schema-the-items-schema.md))

* cannot be null

* defined in: [Document Embeddings JSON Schema](embeddings-docs-backup-properties-the-embedding-schema.md "#/properties/embedding#/properties/embedding")

### embedding Type

`number[]` ([The Items Schema](embeddings-docs-backup-properties-the-embedding-schema-the-items-schema.md))
9 changes: 9 additions & 0 deletions docs/embeddings-docs-properties-ci_id.md
@@ -0,0 +1,9 @@
## ci\_id Type

`string`

## ci\_id Examples

```json
"actionfem-1940-01-08-a-i0001"
```
3 changes: 3 additions & 0 deletions docs/embeddings-docs-properties-ci_type.md
@@ -0,0 +1,3 @@
## ci\_type Type

`string`
9 changes: 9 additions & 0 deletions docs/embeddings-docs-properties-embedding-oneof-0-items.md
@@ -0,0 +1,9 @@
## items Type

`number`

## items Examples

```json
-0.11429
```
3 changes: 3 additions & 0 deletions docs/embeddings-docs-properties-embedding-oneof-0.md
@@ -0,0 +1,3 @@
## 0 Type

`number[]`
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
## items Type

`number`

## items Examples

```json
-0.11429
```
3 changes: 3 additions & 0 deletions docs/embeddings-docs-properties-embedding-oneof-1-items.md
@@ -0,0 +1,3 @@
## items Type

`number[]`
3 changes: 3 additions & 0 deletions docs/embeddings-docs-properties-embedding-oneof-1.md
@@ -0,0 +1,3 @@
## 1 Type

`number[][]`
9 changes: 9 additions & 0 deletions docs/embeddings-docs-properties-embedding.md
@@ -0,0 +1,9 @@
## embedding Type

merged type ([Details](embeddings-docs-properties-embedding.md))

one (and only one) of

* [Untitled array in Document Embeddings JSON Schema](embeddings-docs-properties-embedding-oneof-0.md "check type definition")

* [Untitled array in Document Embeddings JSON Schema](embeddings-docs-properties-embedding-oneof-1.md "check type definition")
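
"One (and only one) of" means a value must satisfy exactly one of the two branches documented in this commit: `number[]` (a single vector) or `number[][]` (a list of vectors). A minimal Python sketch of that rule is shown below. The function names are hypothetical, and the sketch assumes neither branch sets further constraints such as `minItems`; under that assumption an empty array satisfies both branches and is therefore rejected.

```python
from numbers import Real


def _is_number(value) -> bool:
    # JSON numbers map to int or float; bool is excluded on purpose.
    return isinstance(value, Real) and not isinstance(value, bool)


def _is_vector(value) -> bool:
    return isinstance(value, list) and all(_is_number(x) for x in value)


def _is_matrix(value) -> bool:
    return isinstance(value, list) and all(_is_vector(row) for row in value)


def classify_embedding(value) -> str:
    """Return the single matching branch, or raise if zero or two branches match."""
    branches = (("number[]", _is_vector), ("number[][]", _is_matrix))
    matches = [name for name, check in branches if check(value)]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one matching branch, got {matches}")
    return matches[0]


print(classify_embedding([-0.11429, 0.5]))
print(classify_embedding([[-0.11429], [0.5]]))
```

Mixed values such as `[1, [2]]` match neither branch and raise `ValueError` as well.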
3 changes: 3 additions & 0 deletions docs/embeddings-docs-properties-model_id.md
@@ -0,0 +1,3 @@
## model\_id Type

`string`
3 changes: 3 additions & 0 deletions docs/embeddings-docs-properties-size.md
@@ -0,0 +1,3 @@
## size Type

`integer`
7 changes: 7 additions & 0 deletions docs/embeddings-docs-properties-ts.md
@@ -0,0 +1,7 @@
## ts Type

`string`

## ts Constraints

**date time**: the string must be a date time string, according to [RFC 3339, section 5.6](https://tools.ietf.org/html/rfc3339 "check the specification")
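
A date-time constraint of this kind can be approximated in Python with the standard library. The sketch below is a rough check under stated assumptions, not a strict RFC 3339 validator: `datetime.fromisoformat` accepts some strings the RFC does not and, depending on the Python version, rejects some that it does. The function name `parse_ts` is hypothetical.

```python
from datetime import datetime, timezone


def parse_ts(value: str) -> datetime:
    """Parse a timestamp string and insist on an explicit UTC offset."""
    # Older Python versions reject a trailing "Z", so normalise it first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # RFC 3339 date-time values always carry an offset or "Z".
        raise ValueError("timestamp has no UTC offset")
    return parsed


ts = parse_ts("2024-08-29T06:42:53Z")
print(ts.astimezone(timezone.utc).isoformat())  # prints 2024-08-29T06:42:53+00:00
```

Unlike a purely pattern-based check, parsing also rejects impossible calendar values such as month `13`.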