Wang Bin
|
500e6bd10e
|
tweak style
|
2015-05-04 15:11:55 +08:00 |
|
Wang Bin
|
edef39719d
|
move jieba to a separate module, tweak posseg module
|
2015-04-30 17:01:02 +08:00 |
|
Wang Bin
|
d9f77563bf
|
added util module
|
2015-04-30 15:26:34 +08:00 |
|
Wang Bin
|
732196127b
|
added more tests for dictionary.go
|
2015-04-30 11:21:16 +08:00 |
|
Wang Bin
|
ac7628edaf
|
added tests for dictionary.go, fixed a small bug
|
2015-04-30 11:03:54 +08:00 |
|
Wang Bin
|
ae54d82c68
|
added tests for dictionary.go, fixed a small bug
|
2015-04-30 11:02:01 +08:00 |
|
Wang Bin
|
0124ebadce
|
put dictionary into a separate module
|
2015-04-29 18:51:38 +08:00 |
|
Wang Bin
|
b19eb4f6fe
|
code refactor, use uint for map key to improve performance
|
2015-04-06 20:24:07 +08:00 |
|
Wang Bin
|
17ab0b2cc7
|
small style refactor
|
2015-04-04 17:42:41 +08:00 |
|
Wang Bin
|
83efde1e61
|
small refactors, removed sort in dag, save logTotal in segmenter
|
2015-04-04 17:10:40 +08:00 |
|
Wang Bin
|
5c6a2eff74
|
Merge branch 'posseg' into develop
|
2015-04-04 15:39:21 +08:00 |
|
Wang Bin
|
847dae9d38
|
added bench.sh
|
2015-04-04 15:35:01 +08:00 |
|
Wang Bin
|
e8cf1e9a9c
|
small refactor
|
2015-04-04 15:30:11 +08:00 |
|
Wang Bin
|
188133261f
|
small tweaks, added bench.sh for benchmark
|
2015-04-04 15:26:26 +08:00 |
|
Wang Bin
|
bbe302a351
|
removed sorts to slightly improve performance
|
2015-04-03 16:48:45 +08:00 |
|
Wang Bin
|
d22cc9b6b6
|
fixed a typo in jieba_test.go
|
2015-04-02 18:29:24 +08:00 |
|
Wang Bin
|
84ad6fe25e
|
code refactor, updated RegexpSplit function to match Python's re.split function
|
2015-04-02 18:25:00 +08:00 |
|
Wang Bin
|
0ab9063f43
|
added benchmarks for posseg
|
2015-03-31 13:49:54 +08:00 |
|
Wang Bin
|
3852f660aa
|
added benchmark for Cut related functions
|
2015-03-31 12:03:01 +08:00 |
|
Wang Bin
|
7cf16072e6
|
updated all tests to use Fatal/Fatalf to fail tests earlier
|
2015-03-30 18:01:21 +08:00 |
|
Wang Bin
|
c397cafe8a
|
unify the API
|
2015-03-30 17:52:09 +08:00 |
|
Wang Bin
|
7a7f8af517
|
move DAG related function to a separate file, rename Calc to Routes
|
2015-03-30 17:10:48 +08:00 |
|
Wang Bin
|
68fed7e250
|
make struct Jieba's fields private
|
2015-03-30 16:12:02 +08:00 |
|
Wang Bin
|
c4c3a5f9ad
|
refactor Cut function, make CutAll a separate function to simplify the logic of Cut function
|
2015-03-30 15:18:36 +08:00 |
|
Wang Bin
|
556b96b137
|
removed unused method/property
|
2015-03-30 14:31:41 +08:00 |
|
Wang Bin
|
328310cfbb
|
removed all cache load/dump related code; benchmark shows reading from the dict file is faster than loading from a gob file
|
2015-03-30 14:25:08 +08:00 |
|
Wang Bin
|
0ca4053394
|
fixed the test failure in textrank
|
2015-03-30 13:06:44 +08:00 |
|
Wang Bin
|
51c63cb9ad
|
small refactor of the interface, use constructors instead of pointers for entry
|
2015-03-30 13:00:56 +08:00 |
|
Wang Bin
|
48a0bd390b
|
fixed a typo in previous commit
|
2015-03-30 11:13:00 +08:00 |
|
Wang Bin
|
a66bf2a0bd
|
make dictPath function private
|
2015-03-30 11:02:57 +08:00 |
|
Wang Bin
|
79adffe328
|
added a new interface for caching
|
2015-03-28 15:49:32 +08:00 |
|
Wang Bin
|
e11060513c
|
merge trie.go into jieba.go
|
2015-03-28 12:14:11 +08:00 |
|
Wang Bin
|
45c7854fac
|
finished generalization of dictionary load
|
2015-03-28 10:51:00 +08:00 |
|
Wang Bin
|
e155fe5467
|
refactor to generalize set dictionary function, not finished yet
|
2015-03-25 18:46:14 +08:00 |
|
Wang Bin
|
59da5b5e3a
|
removed dict.go, functions move to util.go, also use interface to simplify code
|
2015-03-25 18:28:37 +08:00 |
|
Wang Bin
|
7fe5e7d4c4
|
small refactor, replace WordTagFreq with Entry
|
2015-03-25 17:53:25 +08:00 |
|
Wang Bin
|
800ecaa8c9
|
small refactor
|
2015-03-25 16:01:05 +08:00 |
|
Wang Bin
|
8687ca58b8
|
removed unnecessary stateTag struct, using string instead
|
2015-03-25 15:13:46 +08:00 |
|
Wang Bin
|
1c378c28a7
|
finished all OOP refactor
|
2015-03-24 18:34:07 +08:00 |
|
Wang Bin
|
73d87e4ed6
|
refactor posseg, added Posseg struct
|
2015-03-24 16:54:02 +08:00 |
|
Wang Bin
|
0027927b6d
|
code refactor for RegexpSplit function, moved it to util.go, add return chan string
|
2015-03-24 14:40:06 +08:00 |
|
Wang Bin
|
323b6714fa
|
removed cache directory, the refactor I made before was not clear
|
2015-03-24 14:06:10 +08:00 |
|
Wang Bin
|
d257da40a7
|
try to refactor, not finished yet
|
2015-03-20 18:38:08 +08:00 |
|
Wang Bin
|
16929faf57
|
removed old tokenize module, updated README
|
2015-03-18 17:31:41 +08:00 |
|
Wang Bin
|
f596ac063d
|
added more tests
|
2015-03-17 16:34:36 +08:00 |
|
Wang Bin
|
a14788addb
|
fixed a bug in tokenizer under search mode, added more tests
|
2015-03-17 16:29:09 +08:00 |
|
Wang Bin
|
2c95c61d33
|
added jieba tokenizer for bleve
|
2015-03-17 15:30:13 +08:00 |
|
Wang Bin
|
1aabc4a2f3
|
removed unnecessary MarshalBinary/UnmarshalBinary method
|
2015-03-16 15:55:41 +08:00 |
|
Wang Bin
|
8bf9888a1c
|
make some public variables/functions private
|
2015-02-28 18:23:59 +08:00 |
|
Wang Bin
|
1c8d4fbf23
|
make some public variables/functions private
|
2015-02-28 18:17:48 +08:00 |
|