源文雨
|
75ee4a090e
|
optimize amd64 calls and memory usage
goos: darwin
goarch: amd64
pkg: github.com/fumiama/go-base16384
cpu: Intel(R) Core(TM) i5-8265U CPU @ 1.60GHz
name old time/op new time/op delta
EncodeTo/16-8 16.9ns ± 3% 16.7ns ± 1% -1.62% (p=0.048 n=5+5)
EncodeTo/256-8 78.0ns ± 1% 77.6ns ± 0% ~ (p=0.286 n=5+4)
EncodeTo/4K-8 942ns ± 0% 943ns ± 0% ~ (p=0.841 n=5+5)
EncodeTo/32K-8 7.59µs ± 1% 7.53µs ± 1% ~ (p=0.222 n=5+5)
DecodeTo/16-8 43.1ns ± 1% 12.2ns ± 0% -71.70% (p=0.008 n=5+5)
DecodeTo/256-8 179ns ± 1% 74ns ± 1% -58.93% (p=0.008 n=5+5)
DecodeTo/4K-8 1.67µs ± 1% 0.94µs ± 0% -43.89% (p=0.008 n=5+5)
DecodeTo/32K-8 13.2µs ± 0% 7.5µs ± 1% -43.48% (p=0.008 n=5+5)
Encoder/16-8 118ns ± 4% 112ns ± 0% -5.01% (p=0.008 n=5+5)
Encoder/256-8 350ns ± 0% 341ns ± 0% -2.48% (p=0.008 n=5+5)
Encoder/4K-8 3.86µs ± 2% 3.83µs ± 0% ~ (p=0.238 n=5+5)
Encoder/32K-8 29.6µs ± 0% 29.4µs ± 1% ~ (p=0.095 n=5+5)
Decoder/16-8 417ns ± 6% 406ns ± 1% ~ (p=0.056 n=5+5)
Decoder/256-8 471ns ± 1% 467ns ± 1% ~ (p=0.222 n=5+5)
Decoder/4K-8 1.65µs ± 1% 1.65µs ± 2% ~ (p=0.500 n=5+5)
Decoder/32K-8 14.3µs ±21% 12.7µs ± 1% ~ (p=0.151 n=5+5)
name old speed new speed delta
EncodeTo/16-8 946MB/s ± 3% 961MB/s ± 1% ~ (p=0.056 n=5+5)
EncodeTo/256-8 3.28GB/s ± 1% 3.30GB/s ± 0% ~ (p=0.286 n=5+4)
EncodeTo/4K-8 4.35GB/s ± 0% 4.34GB/s ± 0% ~ (p=0.841 n=5+5)
EncodeTo/32K-8 4.32GB/s ± 1% 4.35GB/s ± 1% ~ (p=0.222 n=5+5)
DecodeTo/16-8 510MB/s ± 1% 1803MB/s ± 0% +253.37% (p=0.008 n=5+5)
DecodeTo/256-8 1.65GB/s ± 1% 4.02GB/s ± 1% +143.45% (p=0.008 n=5+5)
DecodeTo/4K-8 2.80GB/s ± 1% 4.99GB/s ± 0% +78.22% (p=0.008 n=5+5)
DecodeTo/32K-8 2.83GB/s ± 0% 5.00GB/s ± 1% +76.93% (p=0.008 n=5+5)
Encoder/16-8 135MB/s ± 4% 142MB/s ± 0% +5.22% (p=0.008 n=5+5)
Encoder/256-8 731MB/s ± 0% 750MB/s ± 0% +2.55% (p=0.008 n=5+5)
Encoder/4K-8 1.06GB/s ± 2% 1.07GB/s ± 0% ~ (p=0.310 n=5+5)
Encoder/32K-8 1.11GB/s ± 0% 1.12GB/s ± 1% ~ (p=0.095 n=5+5)
Decoder/16-8 38.4MB/s ± 6% 39.4MB/s ± 1% ~ (p=0.056 n=5+5)
Decoder/256-8 544MB/s ± 1% 548MB/s ± 1% ~ (p=0.222 n=5+5)
Decoder/4K-8 2.49GB/s ± 1% 2.48GB/s ± 2% ~ (p=0.548 n=5+5)
Decoder/32K-8 2.32GB/s ±18% 2.59GB/s ± 1% ~ (p=0.151 n=5+5)
name old alloc/op new alloc/op delta
EncodeTo/16-8 0.00B 0.00B ~ (all equal)
EncodeTo/256-8 0.00B 0.00B ~ (all equal)
EncodeTo/4K-8 0.00B 0.00B ~ (all equal)
EncodeTo/32K-8 0.00B 0.00B ~ (all equal)
DecodeTo/16-8 48.0B ± 0% 0.0B -100.00% (p=0.008 n=5+5)
DecodeTo/256-8 576B ± 0% 0B -100.00% (p=0.008 n=5+5)
DecodeTo/4K-8 6.14kB ± 0% 0.00kB -100.00% (p=0.008 n=5+5)
DecodeTo/32K-8 49.2kB ± 0% 0.0kB -100.00% (p=0.008 n=5+5)
Encoder/16-8 24.0B ± 0% 24.0B ± 0% ~ (all equal)
Encoder/256-8 24.0B ± 0% 24.0B ± 0% ~ (all equal)
Encoder/4K-8 24.0B ± 0% 24.0B ± 0% ~ (all equal)
Encoder/32K-8 26.0B ± 0% 26.0B ± 0% ~ (all equal)
Decoder/16-8 1.39kB ± 0% 1.39kB ± 0% ~ (all equal)
Decoder/256-8 1.39kB ± 0% 1.39kB ± 0% ~ (all equal)
Decoder/4K-8 4.98kB ± 0% 4.98kB ± 0% ~ (all equal)
Decoder/32K-8 41.1kB ± 0% 41.1kB ± 0% ~ (all equal)
name old allocs/op new allocs/op delta
EncodeTo/16-8 0.00 0.00 ~ (all equal)
EncodeTo/256-8 0.00 0.00 ~ (all equal)
EncodeTo/4K-8 0.00 0.00 ~ (all equal)
EncodeTo/32K-8 0.00 0.00 ~ (all equal)
DecodeTo/16-8 1.00 ± 0% 0.00 -100.00% (p=0.008 n=5+5)
DecodeTo/256-8 1.00 ± 0% 0.00 -100.00% (p=0.008 n=5+5)
DecodeTo/4K-8 1.00 ± 0% 0.00 -100.00% (p=0.008 n=5+5)
DecodeTo/32K-8 1.00 ± 0% 0.00 -100.00% (p=0.008 n=5+5)
Encoder/16-8 1.00 ± 0% 1.00 ± 0% ~ (all equal)
Encoder/256-8 1.00 ± 0% 1.00 ± 0% ~ (all equal)
Encoder/4K-8 1.00 ± 0% 1.00 ± 0% ~ (all equal)
Encoder/32K-8 1.00 ± 0% 1.00 ± 0% ~ (all equal)
Decoder/16-8 3.00 ± 0% 3.00 ± 0% ~ (all equal)
Decoder/256-8 3.00 ± 0% 3.00 ± 0% ~ (all equal)
Decoder/4K-8 3.00 ± 0% 3.00 ± 0% ~ (all equal)
Decoder/32K-8 3.00 ± 0% 3.00 ± 0% ~ (all equal)
|
2022-12-14 10:38:19 +08:00 |
|
源文雨
|
5e0f486237
|
finish arm64 encode
|
2022-12-13 17:08:02 +08:00 |
|
fumiama
|
ceb3054caa
|
fix: amd64 asm read out of range
|
2022-08-22 13:11:31 +08:00 |
|
源文雨
|
2e6fe912c2
|
speed up encoder & decoder
name old time/op new time/op delta
Encoder/16-8 136ns ± 2% 102ns ± 1% -25.00% (p=0.008 n=5+5)
Encoder/256-8 490ns ± 1% 410ns ± 0% -16.24% (p=0.008 n=5+5)
Encoder/4K-8 4.47µs ± 1% 3.52µs ± 1% -21.10% (p=0.008 n=5+5)
Encoder/32K-8 38.9µs ± 0% 33.6µs ± 1% -13.72% (p=0.008 n=5+5)
Decoder/16-8 269ns ± 1% 253ns ± 1% -5.95% (p=0.008 n=5+5)
Decoder/256-8 421ns ± 1% 404ns ± 2% -4.22% (p=0.008 n=5+5)
Decoder/4K-8 1.68µs ± 1% 1.66µs ± 3% ~ (p=0.190 n=5+5)
Decoder/32K-8 12.9µs ± 1% 12.5µs ± 1% -2.68% (p=0.008 n=5+5)
name old speed new speed delta
Encoder/16-8 118MB/s ± 2% 157MB/s ± 1% +33.34% (p=0.008 n=5+5)
Encoder/256-8 523MB/s ± 1% 624MB/s ± 0% +19.38% (p=0.008 n=5+5)
Encoder/4K-8 917MB/s ± 1% 1162MB/s ± 1% +26.73% (p=0.008 n=5+5)
Encoder/32K-8 841MB/s ± 0% 975MB/s ± 1% +15.90% (p=0.008 n=5+5)
Decoder/16-8 59.5MB/s ± 1% 63.2MB/s ± 1% +6.34% (p=0.008 n=5+5)
Decoder/256-8 607MB/s ± 1% 634MB/s ± 2% +4.42% (p=0.008 n=5+5)
Decoder/4K-8 2.44GB/s ± 1% 2.46GB/s ± 3% ~ (p=0.222 n=5+5)
Decoder/32K-8 2.54GB/s ± 1% 2.61GB/s ± 1% +2.76% (p=0.008 n=5+5)
name old alloc/op new alloc/op delta
Encoder/16-8 40.0B ± 0% 24.0B ± 0% -40.00% (p=0.008 n=5+5)
Encoder/256-8 696B ± 0% 472B ± 0% -32.18% (p=0.008 n=5+5)
Encoder/4K-8 4.12kB ± 0% 0.02kB ± 0% -99.42% (p=0.008 n=5+5)
Encoder/32K-8 69.7kB ± 0% 41.0kB ± 0% -41.16% (p=0.000 n=5+4)
Decoder/16-8 752B ± 0% 752B ± 0% ~ (all equal)
Decoder/256-8 1.39kB ± 0% 1.39kB ± 0% ~ (all equal)
Decoder/4K-8 4.98kB ± 0% 4.98kB ± 0% ~ (all equal)
Decoder/32K-8 41.1kB ± 0% 41.1kB ± 0% ~ (all equal)
name old allocs/op new allocs/op delta
Encoder/16-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.008 n=5+5)
Encoder/256-8 3.00 ± 0% 2.00 ± 0% -33.33% (p=0.008 n=5+5)
Encoder/4K-8 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.008 n=5+5)
Encoder/32K-8 3.00 ± 0% 2.00 ± 0% -33.33% (p=0.008 n=5+5)
Decoder/16-8 3.00 ± 0% 3.00 ± 0% ~ (all equal)
Decoder/256-8 3.00 ± 0% 3.00 ± 0% ~ (all equal)
Decoder/4K-8 3.00 ± 0% 3.00 ± 0% ~ (all equal)
Decoder/32K-8 3.00 ± 0% 3.00 ± 0% ~ (all equal)
|
2022-04-22 21:05:19 +08:00 |
|
源文雨
|
87b51ceb35
|
add amd64 asm
name old time/op new time/op delta
EncodeTo/16-8 45.5ns ± 1% 35.9ns ± 1% -21.01% (p=0.008 n=5+5)
EncodeTo/256-8 241ns ± 1% 170ns ± 1% -29.64% (p=0.008 n=5+5)
EncodeTo/4K-8 2.90µs ± 0% 1.70µs ± 0% -41.60% (p=0.008 n=5+5)
EncodeTo/32K-8 23.5µs ± 2% 13.6µs ± 2% -42.20% (p=0.008 n=5+5)
DecodeTo/16-8 20.2ns ± 0% 10.3ns ± 2% -48.92% (p=0.008 n=5+5)
DecodeTo/256-8 141ns ± 1% 71ns ± 0% -49.55% (p=0.008 n=5+5)
DecodeTo/4K-8 2.03µs ± 1% 0.94µs ± 0% -53.82% (p=0.008 n=5+5)
DecodeTo/32K-8 16.1µs ± 0% 7.5µs ± 0% -53.22% (p=0.008 n=5+5)
name old speed new speed delta
EncodeTo/16-8 352MB/s ± 1% 445MB/s ± 1% +26.59% (p=0.008 n=5+5)
EncodeTo/256-8 1.06GB/s ± 1% 1.51GB/s ± 1% +42.13% (p=0.008 n=5+5)
EncodeTo/4K-8 1.41GB/s ± 0% 2.42GB/s ± 0% +71.24% (p=0.008 n=5+5)
EncodeTo/32K-8 1.40GB/s ± 2% 2.42GB/s ± 2% +73.01% (p=0.008 n=5+5)
DecodeTo/16-8 1.09GB/s ± 0% 2.14GB/s ± 2% +95.84% (p=0.008 n=5+5)
DecodeTo/256-8 2.10GB/s ± 1% 4.16GB/s ± 0% +98.21% (p=0.008 n=5+5)
DecodeTo/4K-8 2.30GB/s ± 1% 4.99GB/s ± 0% +116.55% (p=0.008 n=5+5)
DecodeTo/32K-8 2.33GB/s ± 0% 4.98GB/s ± 0% +113.78% (p=0.008 n=5+5)
|
2022-04-22 17:24:49 +08:00 |
|