
Commit

space
Trondtr committed Nov 8, 2023
1 parent 3be4d8f commit e322230
Showing 1 changed file with 28 additions and 18 deletions.
46 changes: 28 additions & 18 deletions ling/LinguisticAnalysis.md
@@ -4,24 +4,6 @@ Linguistic analysis
Instead of compiling the grammatical tools yourself (as described elsewhere on these pages), you may also **download ready-compiled analysers for text analysis**. This page explains how. If you **have** compiled the tools on your machine, we recommend [this page](../tools/docu-sme-manual.md) instead.


## Automatic grammatical analysis

**Summary:** When you have downloaded the files (cf. the **Download...** links below), you will be able to run the following command in a terminal window (the language code *sme* is for North Saami; for other languages, see below):


```
cat yourtextfile.txt | hfst-tokenise -cg sme.pmhfst | vislcg3 -g sme.cg3
```


The text file goes through a two-step analysis. First, the support program **hfst-tokenise** runs it through the morphological analyser **sme.pmhfst**; the flag *-cg* ensures morphological analysis in the required format.
Then the support program **vislcg3** disambiguates the output with the disambiguator **sme.cg3**; the flag *-g* identifies the file *sme.cg3* as the grammar file.
To see more options, run *hfst-tokenise -h* and *vislcg3 -h*.

You may also conduct automatic dictionary lookup, see below.

# Download commands

## 1. Download the required *support programs*
@@ -33,20 +15,26 @@ These commands will download the compilers *hfst* and *vislcg3*. They require a
**Download on Mac:**
```
curl http://apertium.projectjj.com/osx/install-nightly.sh > install-nightly.sh
chmod a+x install-nightly.sh
sudo ./install-nightly.sh
```


**Download on Linux Ubuntu:**

```
wget https://apertium.projectjj.com/apt/install-nightly.sh -O - | sudo bash
sudo apt-get -f install apertium-all-dev
```

**Download on Linux Fedora:**

```
curl https://apertium.projectjj.com/rpm/install-nightly.sh |sudo bash
sudo dnf install apertium-all-devel
```
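
Once the install script for your platform has finished, it can be useful to confirm that the support programs are actually on your `PATH` before going further. The following is a minimal sketch: `check_tools` is a hypothetical helper name introduced here for illustration and is not part of hfst or vislcg3. It relies only on the standard shell builtin `command -v`.

```shell
# Report whether each named program can be found on the PATH.
# check_tools is a hypothetical helper, not part of hfst or vislcg3.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # Return non-zero if at least one program was not found.
    return "$missing"
}

check_tools hfst-tokenise vislcg3 || echo "Re-run the install commands for your platform."
```

If both programs are reported as found, the analysis commands on this page should be able to start.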

@@ -92,6 +80,28 @@ Replace the language code **sme** with the language you want (note! the language
More languages may be added upon request, from [this list](https://giellalt.github.io/LanguageModels.html).



# Using the programs

## Automatic grammatical analysis

**Summary:** When you have downloaded the files (cf. the **Download...** links above), you will be able to run the following command in a terminal window (the language code *sme* is for North Saami; for other languages, see above):


```
cat yourtextfile.txt | hfst-tokenise -cg sme.pmhfst | vislcg3 -g sme.cg3
```


The text file goes through a two-step analysis. First, the support program **hfst-tokenise** runs it through the morphological analyser **sme.pmhfst**; the flag *-cg* ensures morphological analysis in the required format.
Then the support program **vislcg3** disambiguates the output with the disambiguator **sme.cg3**; the flag *-g* identifies the file *sme.cg3* as the grammar file.
To see more options, run *hfst-tokenise -h* and *vislcg3 -h*.
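
The two steps can also be run separately, which makes it possible to inspect the morphological analysis before disambiguation. The sketch below assumes that the analyser and grammar files are named after the language code and lie in the current directory. `analyse_text` and the `.analysed` suffix for the intermediate file are hypothetical names chosen for illustration; only `hfst-tokenise -cg` and `vislcg3 -g` are taken from the command above.

```shell
# Run the two-step analysis and keep the intermediate output.
# analyse_text is a hypothetical wrapper, not part of hfst or vislcg3.
analyse_text() {
    lang="$1"     # language code, e.g. sme
    input="$2"    # plain-text file to analyse

    # Stop early if the text, the analyser or the grammar file is missing.
    for f in "$input" "$lang.pmhfst" "$lang.cg3"; do
        if [ ! -f "$f" ]; then
            echo "missing file: $f" >&2
            return 1
        fi
    done

    # Step 1: morphological analysis, saved next to the input file.
    hfst-tokenise -cg "$lang.pmhfst" < "$input" > "$input.analysed" || return 1

    # Step 2: disambiguation of the saved analysis, printed to the terminal.
    vislcg3 -g "$lang.cg3" < "$input.analysed"
}
```

After defining the function, `analyse_text sme yourtextfile.txt` would leave the undisambiguated analysis in `yourtextfile.txt.analysed` and print the disambiguated result.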

You may also conduct automatic dictionary lookup, see below.


## Download other programs

### dictionaries