Releases: ikawaha/kagome
Refactoring
- Stabilize the serialization of the index table and the unknown-word (unk) index table
- Rebuild the dictionaries
- Drop support for Go 1.6 and 1.7
Add simple dict mode
- Add simple dict mode to reduce memory usage
- Remove the use of unsafe
- Fix golint warnings
In simple dict. mode the analysis result does not change, but some output fields are omitted: inflection type (活用型), inflection form (活用形), base form (基本形), reading (読み), and pronunciation (発音).
Memory usage:

| dict | old | full | simple |
|---|---|---|---|
| IPA | 193 MB | 175 MB | 132 MB |
| UNI | 872 MB | 795 MB | 590 MB |
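To put the figures in the table in relative terms, the following is a small sketch that computes the percentage saved by simple mode. The `usage` type and `reduction` helper are names made up for this illustration; only the numbers come from the table.

```go
package main

import "fmt"

// usage holds the memory figures (in MB) reported in the table above.
// The type itself is hypothetical and exists only for this calculation.
type usage struct {
	dict              string
	old, full, simple float64
}

// reduction returns how much smaller after is than before, in percent.
func reduction(before, after float64) float64 {
	return (before - after) / before * 100
}

func main() {
	table := []usage{
		{"IPA", 193, 175, 132},
		{"UNI", 872, 795, 590},
	}
	for _, u := range table {
		fmt.Printf("%s: simple vs old %.1f%%, simple vs full %.1f%%\n",
			u.dict, reduction(u.old, u.simple), reduction(u.full, u.simple))
	}
}
```

By these figures, simple mode uses roughly a quarter less memory than full mode and about a third less than the old dictionaries, for both IPA and UNI.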
Full Dict.
BOS
寿司 名詞,一般,*,*,*,*,寿司,スシ,スシ
が 助詞,格助詞,一般,*,*,*,が,ガ,ガ
食べ 動詞,自立,*,*,一段,連用形,食べる,タベ,タベ
たい 助動詞,*,*,*,特殊・タイ,基本形,たい,タイ,タイ
。 記号,句点,*,*,*,*,。,。,。
EOS
Simple Dict.
BOS
寿司 名詞,一般,*,*,*,*
が 助詞,格助詞,一般,*,*,*
食べ 動詞,自立,*,*,一段,連用形
たい 助動詞,*,*,*,特殊・タイ,基本形
。 記号,句点,*,*,*,*
EOS
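In the two sample listings above, each Simple Dict. line matches the first six feature columns of the corresponding Full Dict. line. The sketch below only illustrates that relationship on the sample data; `simpleFeatures` and `simpleFeatureCount` are hypothetical names, and this is not how kagome builds or stores the simple dictionary.

```go
package main

import (
	"fmt"
	"strings"
)

// simpleFeatureCount is the number of leading feature columns that remain
// in the Simple Dict. sample listing (an observation from that listing,
// not a constant taken from the library).
const simpleFeatureCount = 6

// simpleFeatures returns the leading columns of a full feature list.
// Hypothetical helper, for illustration only.
func simpleFeatures(full []string) []string {
	if len(full) <= simpleFeatureCount {
		return full
	}
	return full[:simpleFeatureCount]
}

func main() {
	// Feature string of the token 食べ from the Full Dict. listing.
	full := strings.Split("動詞,自立,*,*,一段,連用形,食べる,タベ,タベ", ",")
	fmt.Println(strings.Join(simpleFeatures(full), ","))
	// prints: 動詞,自立,*,*,一段,連用形
}
```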
Bugfix
Performance tweak
Merge pull request #91 from ikawaha/develop: Performance tweak
Add the appengine build tag
Merge pull request #89 from ikawaha/develop: Add the appengine build tag and some cosmetic changes
Bugfix
Reduce space_alloc
v1.5.1 Update
Add user dictionary builder
Example:
from io.Reader
s := `
日本経済新聞,日本 経済 新聞,ニホン ケイザイ シンブン,カスタム名詞
# 関西国際空港,関西 国際 空港,カンサイ コクサイ クウコウ,カスタム地名
朝青龍,朝青龍,アサショウリュウ,カスタム人名
`
r := strings.NewReader(s)
rec, err := NewUserDicRecords(r)
if err != nil {
	t.Fatalf("user dic build error, %v", err)
}
udic, err := rec.NewUserDic()
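The text input in the example above has four comma-separated columns per line: the surface text, the space-separated tokens, the space-separated readings, and a part-of-speech label. The following self-contained sketch parses that layout with the standard library only. The `record` type and the `parseRecords` and `parseRecordsFromString` functions are hypothetical and are not kagome's `NewUserDicRecords`. Treating lines that start with `#` as comments and requiring one reading per token are assumptions suggested by the sample data.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// record is a hypothetical stand-in for one user dictionary entry.
type record struct {
	Text   string
	Tokens []string
	Yomi   []string
	Pos    string
}

// parseRecords reads "text,tokens,yomi,pos" lines from r.
// Blank lines and lines starting with '#' are skipped (an assumption).
func parseRecords(r io.Reader) ([]record, error) {
	var recs []record
	sc := bufio.NewScanner(r)
	for line := 1; sc.Scan(); line++ {
		s := strings.TrimSpace(sc.Text())
		if s == "" || strings.HasPrefix(s, "#") {
			continue
		}
		f := strings.Split(s, ",")
		if len(f) != 4 {
			return nil, fmt.Errorf("line %d: want 4 fields, got %d", line, len(f))
		}
		tokens, yomi := strings.Fields(f[1]), strings.Fields(f[2])
		if len(tokens) != len(yomi) {
			return nil, fmt.Errorf("line %d: %d tokens but %d readings", line, len(tokens), len(yomi))
		}
		recs = append(recs, record{Text: f[0], Tokens: tokens, Yomi: yomi, Pos: f[3]})
	}
	return recs, sc.Err()
}

// parseRecordsFromString is a convenience wrapper around parseRecords.
func parseRecordsFromString(s string) ([]record, error) {
	return parseRecords(strings.NewReader(s))
}

func main() {
	recs, err := parseRecordsFromString(`
日本経済新聞,日本 経済 新聞,ニホン ケイザイ シンブン,カスタム名詞
# 関西国際空港,関西 国際 空港,カンサイ コクサイ クウコウ,カスタム地名
朝青龍,朝青龍,アサショウリュウ,カスタム人名
`)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, r := range recs {
		fmt.Printf("%s -> %v (%s)\n", r.Text, r.Tokens, r.Pos)
	}
}
```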
from Go struct
r := UserDicRecords{
	{
		Text:   "日本経済新聞",
		Tokens: []string{"日本", "経済", "新聞"},
		Yomi:   []string{"ニホン", "ケイザイ", "シンブン"},
		Pos:    "カスタム名詞",
	},
	{
		Text:   "朝青龍",
		Tokens: []string{"朝青龍"},
		Yomi:   []string{"アサショウリュウ"},
		Pos:    "カスタム人名",
	},
}
udic, err := r.NewUserDic()
from JSON
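var rec UserDicRecords
// Check the decode error instead of discarding it.
if err := json.Unmarshal([]byte(`[
  {
    "text":"日本経済新聞",
    "tokens":["日本","経済","新聞"],
    "yomi":["ニホン","ケイザイ","シンブン"],
    "pos":"カスタム名詞"
  },
  {
    "text":"朝青龍",
    "tokens":["朝青龍"],
    "yomi":["アサショウリュウ"],
    "pos":"カスタム人名"
  }
]`), &rec); err != nil {
	t.Fatalf("user dic JSON decode error, %v", err)
}
udic, err := rec.NewUserDic()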
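The JSON form maps the keys `text`, `tokens`, `yomi`, and `pos` onto the same four fields as the struct form. As a rough, library-independent sketch of that mapping, the code below decodes the same shape into a local type using `encoding/json`. The `jsonRecord` type and the `decodeRecords` function are hypothetical mirrors written for this example; the actual field tags of kagome's `UserDicRecords` are assumed, not confirmed here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// jsonRecord is a hypothetical mirror of one user dictionary entry.
// The JSON keys follow the example above.
type jsonRecord struct {
	Text   string   `json:"text"`
	Tokens []string `json:"tokens"`
	Yomi   []string `json:"yomi"`
	Pos    string   `json:"pos"`
}

// decodeRecords decodes a JSON array of entries and reports malformed input.
func decodeRecords(data []byte) ([]jsonRecord, error) {
	var recs []jsonRecord
	if err := json.Unmarshal(data, &recs); err != nil {
		return nil, fmt.Errorf("decode user dictionary: %w", err)
	}
	return recs, nil
}

func main() {
	recs, err := decodeRecords([]byte(`[
	  {"text":"朝青龍","tokens":["朝青龍"],"yomi":["アサショウリュウ"],"pos":"カスタム人名"}
	]`))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(recs[0].Text, recs[0].Pos)
}
```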
Add a function to get a part-of-speech tag
Merge pull request #71 from ikawaha/feature/token_features_20160317: Add a function to get a part-of-speech tag
UniDic support
Merge pull request #49 from ikawaha/develop: Support UniDic