
Commit

added lexicn package info
trinker committed Feb 27, 2017
1 parent 12f71d5 commit 2f311ff
Showing 2 changed files with 10 additions and 9 deletions.
4 changes: 2 additions & 2 deletions README.Rmd
@@ -73,7 +73,7 @@ data(presidential_debates_2012)

## Stemming Versus Lemmatizing

-Before moving into the meat these two examples highlight the difference between stemming and lemmatizing.
+Before moving into the meat these two examples let's highlight the difference between stemming and lemmatizing.

```{r}
dw <- c('driver', 'drive', 'drove', 'driven', 'drives', 'driving')
@@ -110,7 +110,7 @@ stem_strings(y)

## Lemmatizing

-Lemmatizing is the ["grouping together the inflected forms of a word so they can be analysed as a single item" (wikipedia)](https://en.wikipedia.org/wiki/Lemmatisation). In the example below I reduce the strings to their lemma form. `lemmatize_strings` uses a lookup dictionary. The default uses [Mechura's (2016)](http://www.lexiconista.com) English lemmatization list. The `make_lemma_dictionary` function contains two additional engines for generating a lemma lookup table for use in `lemmatize_strings`.
+Lemmatizing is the ["grouping together the inflected forms of a word so they can be analysed as a single item" (wikipedia)](https://en.wikipedia.org/wiki/Lemmatisation). In the example below I reduce the strings to their lemma form. `lemmatize_strings` uses a lookup dictionary. The default uses [Mechura's (2016)](http://www.lexiconista.com) English lemmatization list available from the [**lexicon**](https://cran.r-project.org/package=lexicon) package. The `make_lemma_dictionary` function contains two additional engines for generating a lemma lookup table for use in `lemmatize_strings`.


```{r}
15 changes: 8 additions & 7 deletions README.md
@@ -117,8 +117,8 @@ Load the Tools/Data
Stemming Versus Lemmatizing
---------------------------

-Before moving into the meat these two examples highlight the difference
-between stemming and lemmatizing.
+Before moving into the meat these two examples let's highlight the
+difference between stemming and lemmatizing.

dw <- c('driver', 'drive', 'drove', 'driven', 'drives', 'driving')

@@ -178,9 +178,10 @@ they can be analysed as a single item"
example below I reduce the strings to their lemma form.
`lemmatize_strings` uses a lookup dictionary. The default uses
[Mechura's (2016)](http://www.lexiconista.com) English lemmatization
-list. The `make_lemma_dictionary` function contains two additional
-engines for generating a lemma lookup table for use in
-`lemmatize_strings`.
+list available from the
+[**lexicon**](https://cran.r-project.org/package=lexicon) package. The
+`make_lemma_dictionary` function contains two additional engines for
+generating a lemma lookup table for use in `lemmatize_strings`.

y <- c(
'the dirtier dog has eaten the pies',
@@ -254,9 +255,9 @@ It's pretty fast too. Observe:

(toc <- Sys.time() - tic)

-## Time difference of 0.122086 secs
+## Time difference of 0.09106207 secs

-That's 2,912 rows of text, or 42,708 words, in 0.12 seconds.
+That's 2,912 rows of text, or 42,708 words, in 0.09 seconds.

Combine With Other Text Tools
-----------------------------

0 comments on commit 2f311ff
