mirror of
https://github.com/fumiama/gozel.git
synced 2026-06-05 00:10:24 +08:00
49 lines
1.9 KiB
Markdown
49 lines
1.9 KiB
Markdown
# Vector Addition — Command Queue
|
|
|
|
> [!Note]
|
|
> **SYCL** is used to write this kernel, which is not a common practice.
|
|
> Please also have a look at the **OpenCL** kernel examples like [image_scale](../image_scale/).
|
|
|
|
A classic GPU compute example: perform element-wise addition of two large float32 vectors on the GPU, then validate the result against a CPU reference.
|
|
|
|
## What It Does
|
|
|
|
1. Discovers a GPU device and prints its basic & compute properties
|
|
2. Allocates host and device memory for two float32 vectors (256 MiB each)
|
|
3. Fills both vectors with random values and copies them to device memory
|
|
4. Loads a SPIR-V kernel (`vector_add`) that computes `a[i] += b[i]` in parallel
|
|
5. Launches the kernel via a **command queue** with explicit command lists (pre-copy → compute → post-copy)
|
|
6. Reads back the results and validates every element against the CPU reference
|
|
7. Reports GPU vs. CPU execution time and throughput
|
|
|
|
## Run
|
|
|
|
```bash
|
|
go run main.go
|
|
```
|
|
|
|
## Sample Output
|
|
|
|
```
|
|
=============== Device Basic Properties ===============
|
|
Running on device: ID = 32103 , Name = Intel(R) Graphics @ 0.00 GHz.
|
|
=============== Device Compute Properties ===============
|
|
Max Group Size (X, Y, Z): (1024, 1024, 1024)
|
|
Max Group Count (X, Y, Z): (4294967295, 4294967295, 4294967295)
|
|
Max Total Group Size: 1024
|
|
Max Shared Local Memory: 65536
|
|
Subgroup Sizes: [8 16 32]
|
|
=============== Computation Configuration ===============
|
|
Group Size (X, Y, Z): (1024, 1, 1)
|
|
Group Count: 65536
|
|
Total Elements (N): 67108864
|
|
Buffer Size: 256 MiB
|
|
=============== Calculation Results ===============
|
|
GPU Execution Time: 53.858600 ms
|
|
GPU Throughput: 4.98 GiB/s
|
|
=============== Validation Results ===============
|
|
CPU Execution Time: 65.882900 ms
|
|
CPU Throughput: 4.07 GiB/s
|
|
Test Passed!!!
|
|
```
|