mirror of https://github.com/fumiama/blake2b-simd.git synced 2026-06-05 18:20:29 +08:00

BLAKE2b-SIMD

Pure Go implementation of BLAKE2b using SIMD optimizations.

Introduction

This package is based on the pure Go BLAKE2b implementation of Dmitry Chestnykh and merges it with the (cgo-dependent) SSE-optimized BLAKE2 implementation, which in turn is based on the official implementation. It does so by using Go's assembler for amd64 architectures, with a fallback for other architectures.
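The "fast path on amd64, fallback elsewhere" idea can be sketched in a single file as follows. This is only an illustration under stated assumptions: `compressFunc`, `compressGeneric`, `compressSIMD`, and `selectCompress` are hypothetical names, not this package's API. Go packages commonly make this choice at compile time, using file names such as `*_amd64.go` and `*_amd64.s` together with build constraints; the runtime check below is used only so the sketch stays self-contained and runnable.

```go
package main

import (
	"fmt"
	"runtime"
)

// compressFunc is a hypothetical signature for a block-compression routine.
// The returned string only names the path taken; no hashing is done here.
type compressFunc func(block []byte) string

// compressGeneric stands in for the portable pure Go path.
func compressGeneric(block []byte) string { return "generic" }

// compressSIMD stands in for the assembly-backed amd64 path.
func compressSIMD(block []byte) string { return "simd" }

// selectCompress picks an implementation for the given architecture name.
func selectCompress(goarch string) compressFunc {
	if goarch == "amd64" {
		return compressSIMD
	}
	return compressGeneric
}

func main() {
	compress := selectCompress(runtime.GOARCH)
	fmt.Printf("GOARCH=%s uses the %s path\n", runtime.GOARCH, compress(nil))
}
```

Keeping both paths behind one function type means callers never need to know which implementation was selected.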

It gives roughly a 3x performance improvement over the non-optimized Go version.

Benchmarks

Duration         1 GB
blake2b-SIMD     1.59s
blake2b          4.66s

The example performance metrics were generated on an Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz (6 physical cores, 12 logical cores) running Ubuntu GNU/Linux with kernel version 4.4.0-24-generic (vanilla, with no optimizations).
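For readers who want to produce comparable numbers, a size-parameterized hash benchmark might look like the sketch below. It is not this package's benchmark code: `benchmarkHash` and `sink` are hypothetical names, and `crypto/sha512` from the standard library is used as a stand-in hash (an assumption) so the sketch runs without extra dependencies. The call to `b.SetBytes` is what lets Go's testing framework report throughput in MB/s alongside ns/op.

```go
package main

import (
	"crypto/sha512"
	"fmt"
	"testing"
)

// sink keeps the digest alive so the hashing work is not optimized away.
var sink [sha512.Size]byte

// benchmarkHash hashes a buffer of the given size once per iteration.
// sha512 is a stand-in (assumption); swap in the hash under test.
func benchmarkHash(size int) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		buf := make([]byte, size)
		b.SetBytes(int64(size)) // bytes per iteration, enables MB/s reporting
		b.ResetTimer()
		for i := 0; i < b.N; i++ {
			sink = sha512.Sum512(buf)
		}
	})
}

func main() {
	for _, size := range []int{64, 128, 1024} {
		r := benchmarkHash(size)
		fmt.Printf("Hash%d\t%s\n", size, r.String())
	}
}
```

In a regular `_test.go` file the same loop body would live in functions named like `BenchmarkHash64`, and `go test -bench .` would append the GOMAXPROCS value (such as `-12`) to each name.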

$ benchcmp old.txt new.txt
benchmark                old ns/op     new ns/op     delta
BenchmarkHash64-12       1481          849           -42.67%
BenchmarkHash128-12      1428          746           -47.76%
BenchmarkHash1K-12       6379          2227          -65.09%
BenchmarkHash8K-12       37219         11714         -68.53%
BenchmarkHash32K-12      140716        35935         -74.46%
BenchmarkHash128K-12     561656        142634        -74.60%

benchmark                old MB/s     new MB/s     speedup
BenchmarkHash64-12       43.20        75.37        1.74x
BenchmarkHash128-12      89.64        171.35       1.91x
BenchmarkHash1K-12       160.52       459.69       2.86x
BenchmarkHash8K-12       220.10       699.32       3.18x
BenchmarkHash32K-12      232.87       911.85       3.92x
BenchmarkHash128K-12     233.37       918.93       3.94x

The improvement over the pure Go version grows with input size, from about 1.7x for 64-byte inputs to about 3.9x for 128K inputs.
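The two benchcmp tables above are related by simple arithmetic, which the following sketch reproduces for a few rows. The helper names `mbPerSec` and `speedup` are hypothetical, and the payload sizes are assumptions read off the benchmark names (1K taken as 1024 bytes, 128K as 131072 bytes). Results can differ from the tables in the last digit because the ns/op values shown are already rounded.

```go
package main

import "fmt"

// mbPerSec converts a payload size in bytes and a time in ns/op
// to throughput in MB/s, with 1 MB = 10^6 bytes.
func mbPerSec(bytes int, nsPerOp float64) float64 {
	return float64(bytes) / nsPerOp * 1e3
}

// speedup reports how many times faster the new timing is than the old one.
func speedup(oldNs, newNs float64) float64 {
	return oldNs / newNs
}

func main() {
	// ns/op values copied from the benchcmp output; sizes are assumed.
	rows := []struct {
		name         string
		bytes        int
		oldNs, newNs float64
	}{
		{"Hash64", 64, 1481, 849},
		{"Hash1K", 1024, 6379, 2227},
		{"Hash128K", 131072, 561656, 142634},
	}
	for _, r := range rows {
		fmt.Printf("%-9s old %.2f MB/s  new %.2f MB/s  speedup %.2fx\n",
			r.name,
			mbPerSec(r.bytes, r.oldNs),
			mbPerSec(r.bytes, r.newNs),
			speedup(r.oldNs, r.newNs))
	}
}
```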