Trigrams for 460+ languages.
- What is this?
- When should I use this?
- Install
- Use
- API
- Data
- Types
- Compatibility
- Contribute
- Security
- License
This package exposes trigrams for natural languages. They are based on the most translated copyright-free document on this planet: the Universal Declaration of Human Rights (UDHR).
Use this package when you are dealing with natural language detection.
This package is ESM only. In Node.js (version 14.14+, 16.0+), install with npm:
npm install trigrams
In Deno with esm.sh:
import {top, min} from 'https://esm.sh/trigrams@5'
In browsers with esm.sh:
<script type="module">
import {top, min} from 'https://esm.sh/trigrams@5?bundle'
</script>
import {top, min} from 'trigrams'
console.log((await top()).pam)
console.log((await min()).nld)
Yields:
{ // 300 top trigrams.
'isa': 6,
'upa': 6,
'i k': 6,
// …
'ang': 273,
'ing': 282,
'ng ': 572 // Most common trigram with how often it was found.
}
[ // 300 top trigrams.
' ar',
'eer',
'tij',
// …
'de ',
'an ',
'en ' // Most common trigram.
]
This package exports the identifiers top and min.
There is no default export.
Get top trigrams to occurrence counts.
Returns a promise resolving to an object mapping UDHR in Unicode codes to objects mapping the top 300 trigrams to occurrence counts (Promise<Record<string, Record<string, number>>>).
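As a rough illustration of that shape, the sketch below sorts one language's trigram counts from most to least common. The `counts` object is a hand-made stand-in that holds only the six `pam` entries shown in the example output above, not the full 300-entry record that `top()` resolves to, and `mostCommon` is a hypothetical name.

```javascript
// Stand-in for `(await top()).pam`, limited to the entries shown above.
const counts = {
  'isa': 6,
  'upa': 6,
  'i k': 6,
  'ang': 273,
  'ing': 282,
  'ng ': 572
}

// Sort the [trigram, count] pairs by count, highest first, and keep three.
const mostCommon = Object.entries(counts)
  .sort((a, b) => b[1] - a[1])
  .slice(0, 3)

console.log(mostCommon)
// [ [ 'ng ', 572 ], [ 'ing', 282 ], [ 'ang', 273 ] ]
```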
Get top trigrams.
Returns a promise resolving to an object mapping UDHR in Unicode codes to arrays containing the top 300 trigrams, sorted from least occurring to most occurring (Promise<Record<string, Array<string>>>).
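One way such lists might be used for language detection is to count how many trigrams of an input text appear in each language's list. The sketch below is only an illustration under stated assumptions: `rankByOverlap` is a hypothetical helper, not part of this package, and `model` is a toy stand-in built from the few trigrams shown in the example output above rather than the real value resolved by `min()`.

```javascript
// Toy stand-in for the value resolved by `min()`: a handful of trigrams
// per language instead of the real 300-entry lists.
const model = {
  nld: [' ar', 'eer', 'tij', 'de ', 'an ', 'en '],
  pam: ['isa', 'upa', 'i k', 'ang', 'ing', 'ng ']
}

// Hypothetical helper: for every language, count how many of the given
// trigrams occur in its list, then sort languages by that count.
function rankByOverlap(trigramsInText, model) {
  return Object.entries(model)
    .map(([code, list]) => {
      const known = new Set(list)
      const hits = trigramsInText.filter((d) => known.has(d)).length
      return [code, hits]
    })
    .sort((a, b) => b[1] - a[1])
}

const ranked = rankByOverlap(['de ', 'en ', 'ing'], model)

console.log(ranked)
// [ [ 'nld', 2 ], [ 'pam', 1 ] ]
```

A real detector would likely weigh trigrams by rank or frequency instead of counting plain overlap; this version only shows how the returned shape can be traversed.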
The trigrams are based on the Unicode versions of the Universal Declaration of Human Rights.
The files are created from all paragraphs made available by wooorm/udhr and do not include headings and such.
Before creating trigrams,
- the Unicode characters from \u0021 to \u0040 (both inclusive) are removed
- one or more white space characters (\s+) are replaced with a single space
- alphabetic characters are lower cased ([A-Z])
Additionally, the input is padded with two spaces on both sides.
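The cleaning and padding steps above can be sketched roughly as follows. This is a minimal illustration written from the description, not the code used to generate the data: `clean` and `trigrams` are hypothetical helper names, and trimming before padding is an assumption that the text above does not state.

```javascript
// Hypothetical helpers for illustration; they are not exported by this package.

// Apply the cleaning steps listed above, in order.
function clean(value) {
  return String(value)
    // Remove the characters from \u0021 to \u0040 (punctuation, digits, some symbols).
    .replace(/[\u0021-\u0040]/g, '')
    // Collapse runs of white space into a single space.
    .replace(/\s+/g, ' ')
    // Lower case alphabetic characters.
    .replace(/[A-Z]/g, (d) => d.toLowerCase())
    // Assumption: drop leading and trailing white space before padding.
    .trim()
}

// Pad the cleaned text with two spaces on both sides and slide a
// three-character window over it.
function trigrams(value) {
  const padded = '  ' + clean(value) + '  '
  const result = []
  for (let index = 0; index + 3 <= padded.length; index++) {
    result.push(padded.slice(index, index + 3))
  }
  return result
}

console.log(trigrams('Hi!'))
// [ '  h', ' hi', 'hi ', 'i  ' ]
```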
This package is fully typed with TypeScript. It exports no additional types.
This package is at least compatible with all maintained versions of Node.js. As of now, that is Node.js 14.14+ and 16.0+. It also works in Deno and modern browsers.
Yes please! See How to Contribute to Open Source.
This package is safe.