mirror of https://github.com/fumiama/jieba.git synced 2026-06-05 00:32:51 +08:00

42 Commits

Author SHA1 Message Date
源文雨
36c17a10b5 fs.File -> io.Reader 2022-12-03 10:54:06 +08:00
源文雨
6982ead703 optimize jieba 2022-11-30 16:00:56 +08:00
源文雨
f3da9e6420 optimize dict, add fs.File support 2022-11-30 14:14:48 +08:00
源文雨
d487545eb5 optimize tag_extracker 2022-11-30 13:35:21 +08:00
源文雨
8bbc755ed4 optimize 2022-11-30 12:18:15 +08:00
Wang Bin
6b75cef871 tweak SuggestFrequency, added example 2015-05-08 16:34:28 +08:00
Wang Bin
c48eb5b4a7 added AddWord/DeleteWord/SuggestFrequency functions, this corresponds to jieba commit #59aa8b69b1399569ea6b417280c993da703baba8 2015-05-08 11:57:46 +08:00
Wang Bin
3d91f615cf moved tokenizers to a separate module 2015-05-07 18:52:29 +08:00
Wang Bin
4e8887da5e added package documentation 2015-05-06 15:23:35 +08:00
Wang Bin
122bad0a8d code refactor, added more documents 2015-05-06 12:55:04 +08:00
Wang Bin
500e6bd10e tweak style 2015-05-04 15:11:55 +08:00
Wang Bin
d9f77563bf added util module 2015-04-30 15:26:34 +08:00
Wang Bin
83efde1e61 small refactors, removed sort in dag, save logTotal in segmenter 2015-04-04 17:10:40 +08:00
Wang Bin
84ad6fe25e code refactor, updated RegexpSplit function to match Python's re.split function 2015-04-02 18:25:00 +08:00
Wang Bin
c397cafe8a unify the API 2015-03-30 17:52:09 +08:00
Wang Bin
7a7f8af517 move DAG related functions to a separate file, rename Calc to Routes 2015-03-30 17:10:48 +08:00
Wang Bin
68fed7e250 make struct Jieba's fields private 2015-03-30 16:12:02 +08:00
Wang Bin
c4c3a5f9ad refactor Cut function, make CutAll a separate function, to simplify the logic of Cut function 2015-03-30 15:18:36 +08:00
Wang Bin
556b96b137 removed unused method/property 2015-03-30 14:31:41 +08:00
Wang Bin
328310cfbb removed all cache load/dump related code, benchmark shows reading from the dict file is faster than loading from a gob file 2015-03-30 14:25:08 +08:00
Wang Bin
51c63cb9ad small refactor of the interface, use constructors instead of pointers for entry 2015-03-30 13:00:56 +08:00
Wang Bin
79adffe328 added a new interface for caching 2015-03-28 15:49:32 +08:00
Wang Bin
e11060513c merge trie.go into jieba.go 2015-03-28 12:14:11 +08:00
Wang Bin
73d87e4ed6 refactor posseg, added Posseg struct 2015-03-24 16:54:02 +08:00
Wang Bin
0027927b6d code refactor for RegexpSplit function, moved it to util.go, add return chan string 2015-03-24 14:40:06 +08:00
Wang Bin
858ceb5a0b small tweaks, add docs 2015-02-28 17:08:04 +08:00
Wang Bin
43480db509 unify Cut method, return channel instead of array 2015-02-27 17:30:45 +08:00
Wang Bin
c03b3eac1c unify Cut method, return channel instead of array 2015-02-27 17:15:23 +08:00
Wang Bin
76b9df8511 change cut method to return a channel string, not []string 2015-02-27 11:37:55 +08:00
Wang Bin
f6c298fc65 small refactor for function names 2015-02-26 17:38:26 +08:00
Wang Bin
f7fdb9749d rename GetDAG to DAG 2015-02-26 16:56:18 +08:00
Wang Bin
aa9ad48b1c refactor variable name 2015-02-26 16:07:08 +08:00
Wang Bin
55751ed04d tiny code refactor 2015-02-26 15:34:58 +08:00
Wang Bin
95a27da5cf small refactor, rename files 2015-02-26 11:12:05 +08:00
Wang Bin
67216a8a7d use only one dict to store words and prefixes, this corresponds to jieba commit #f808ea0ebba7056fa1b55081b474329e556933a8 2015-02-25 18:27:24 +08:00
Wang Bin
08ac49d10b small refactor, don't compile regular expression every time, corresponding to jieba commit #32a0e92a09614cf5c72f87b1a59a5c4369200516 2015-02-25 16:32:28 +08:00
Wang Bin
5702495bf6 removed MinFreq, corresponding to jieba commit #caae26fbfafd75062742823a23e1cc81368b1451 2015-02-25 16:01:39 +08:00
Wang Bin
2515d2e5a0 removed unused idx parameter from Calc function, this corresponds to jieba commit #8a2e7f0e7ed205429ae545f5b875af4eaa8490d1 2015-02-25 12:18:24 +08:00
Wang Bin
0f7c56b4ef small code refactor 2015-02-04 14:47:59 +08:00
Wang Bin
9ee7ba2c13 use github.com/deckarep/golang-set instead of Trie, to reduce memory usage and improve performance, this corresponds to jieba commit #4a93f21918a26083c039970edb9457c589c3a0ab 2015-02-03 15:20:30 +08:00
Wang Bin
d2acf94693 code refactor, simplified trie model, also added cache for dictionary file 2014-08-13 18:21:41 +08:00
Wang Bin
8c785ad36a initial commit 2013-10-31 18:20:04 +08:00