GMKtec K10
Info
Benchmarks against DL380 Gen9
Green:
- Processor: Intel Core i9-13900HK @ 5.20GHz (14 Cores / 20 Threads)
- Motherboard: GMKtec (NucBox K10 0.12 BIOS)
- Chipset: Intel Alder Lake PCH
- Memory: 2 x 32 GB DDR5-5200MT/s
- Disk: 1000GB CT1000E100SSD8
- Graphics: Intel Raptor Lake-P [Iris Xe] (1500MHz)
- Audio: Conexant SN6140
- Network: Realtek RTL8125 2.5GbE + Intel Raptor Lake PCH CNVi WiFi
- OS: Rocky Linux 10.0
- Kernel: 6.12.0-55.22.1.el10_0.x86_64 (x86_64)
- Compiler: GCC 14.2.1 20250110
- File-System: xfs
- Physical dimensions: 9.8 x 10.3 x 4.2 cm
- Weight: 2.2kg
- Power draw (idle): 18W
- Power draw (max): 120W
Sol:
- Processor: 2 x Intel Xeon E5-2637 v3 @ 3.70GHz (8 Cores / 16 Threads)
- Motherboard: HP ProLiant DL380 Gen9 (P89 BIOS)
- Chipset: Intel Xeon E7 v3/Xeon
- Memory: 16 x 16 GB DDR4-2133MT/s 752369-081
- Disk: 2400GB logical volume (8 x 600GB 10K SAS in RAID10 on an HP Smart Array P440ar controller)
- Graphics: Matrox MGA G200EH
- Network: 4 x Broadcom NetXtreme BCM5719 PCIe
- OS: Debian 12
- Kernel: 6.8.12-9-pve (x86_64)
- Compiler: GCC 12.2.0
- File-System: ext4
- Physical dimensions: 8.73 x 44.55 x 67.94 cm
- Weight: 23.6 kg
- Power draw (idle): 112W
- Power draw (max): 500W
CPU Benchmarks
Sol has: 2 x Intel Xeon E5-2637 v3 @ 3.70GHz (8 Cores / 16 Threads)
Green has: Intel Core i9-13900HK @ 5.20GHz (14 Cores / 20 Threads)
Results on openbenchmarking.org:
- https://openbenchmarking.org/result/2509137-NE-SOLCPU35817
- https://openbenchmarking.org/result/2509117-NE-GREENCPU761
Test | Green | Sol | Difference |
---|---|---|---|
x265 4K | 5.91 fps | 5.55 fps | ~1.1x faster |
x265 1080p | 20.98 fps | 15.57 fps | ~1.3x faster |
7-Zip Compression | 90,807 MIPS | 48,654 MIPS | ~1.9x faster |
7-Zip Decompression | 57,362 MIPS | 42,452 MIPS | ~1.4x faster |
Kernel Compile time | 122.9 s | 227.1 s | ~1.8x faster |
OpenSSL SHA256 | 12.9 GB/s | 2.3 GB/s | ~5.6x faster |
OpenSSL SHA512 | 4.33 GB/s | 2.43 GB/s | ~1.8x faster |
RSA4096 Sign | 2095 ops/s | 1299 ops/s | ~1.6x faster |
RSA4096 Verify | 132,978 ops/s | 87,888 ops/s | ~1.5x faster |
AES-128-GCM | 65.8 GB/s | 28.4 GB/s | ~2.3x faster |
AES-256-GCM | 57.3 GB/s | 21.9 GB/s | ~2.6x faster |
ChaCha20 | 26.4 GB/s | 27.0 GB/s | ~1.0x (equal) |
ChaCha20-Poly1305 | 14.9 GB/s | 16.9 GB/s | ~1.1x slower (Xeon wins!)
Redis GET (50 conn) | 4.33M req/s | 2.15M req/s | ~2.0x more |
Redis GET (500 conn) | 2.66M req/s | 1.96M req/s | ~1.4x more |
Redis GET (1000 conn) | 2.32M req/s | 2.18M req/s | ~1.1x more |
Redis SET (50 conn) | 2.79M req/s | 1.69M req/s | ~1.6x more |
Redis SET (500 conn) | 2.24M req/s | 1.66M req/s | ~1.3x more |
Redis SET (1000 conn) | 2.31M req/s | 1.69M req/s | ~1.4x more |
Redis LPOP (50 conn) | 4.43M req/s | 2.42M req/s | ~1.8x more |
Redis LPOP (500 conn) | 3.14M req/s | 2.46M req/s | ~1.3x more |
Redis LPOP (1000 conn) | 2.72M req/s | 1.55M req/s | ~1.8x more |
Redis LPUSH (50 conn) | 2.75M req/s | 1.12M req/s | ~2.5x more |
Redis LPUSH (500 conn) | 1.79M req/s | 1.54M req/s | ~1.2x more |
Redis LPUSH (1000 conn) | 1.79M req/s | 1.49M req/s | ~1.2x more |
Redis SADD (50 conn) | 3.07M req/s | 1.93M req/s | ~1.6x more |
Redis SADD (500 conn) | 2.57M req/s | 1.85M req/s | ~1.4x more |
Redis SADD (1000 conn) | 2.48M req/s | 1.97M req/s | ~1.3x more |
MariaDB (1 client) | 2685 QPS | 207 QPS | ~13.0x more |
MariaDB (32 clients) | 2055 QPS | 103 QPS | ~20.0x more |
MariaDB (64 clients) | 1621 QPS | 76 QPS | ~21.0x more |
MariaDB (128 clients) | 820 QPS | 44 QPS | ~18.6x more |
MariaDB (256 clients) | 397 QPS | 7 QPS | ~57x more |
MariaDB (512 clients) | 184 QPS | 3 QPS | ~61x more |
MariaDB (1024-8192 clients) | ~83-84 QPS | ~3 QPS | ~28x more
CPU Benchmarks with only P-cores
Since the i9-13900HK has two kinds of cores, performance (P) cores and efficiency (E) cores (the latter derived from Intel's Atom line), it was an interesting opportunity to test performance with the E-cores excluded.
Check for E-cores: cat /sys/devices/cpu_atom/cpus
Run task on specific cores (0-11 are P-cores in this case): taskset -c 0-11 <COMMAND>
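Putting both together, a minimal sketch of a pinned run, assuming a hybrid-aware kernel that exposes the cpu_core/cpu_atom sysfs entries (the core ranges below match this machine; check your own output first):
```sh
# Which logical CPUs are P-cores vs E-cores (hybrid Intel only):
cat /sys/devices/cpu_core/cpus   # e.g. 0-11 (6 P-cores with Hyper-Threading)
cat /sys/devices/cpu_atom/cpus   # e.g. 12-19 (8 E-cores, no HT)

# Pin a benchmark run to the P-cores only, for example:
taskset -c 0-11 phoronix-test-suite benchmark pts/build-linux-kernel
```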
Results on openbenchmarking.org: https://openbenchmarking.org/result/2509142-NE-CPUGREENN25
Test | All cores | Only P-cores | Difference |
---|---|---|---|
x265 4K | 5.91 fps | 4.24 fps | ~1.39x faster |
x265 1080p | 20.98 fps | 17.51 fps | ~1.20x faster |
7-Zip Compression | 90,807 MIPS | 64,768 MIPS | ~1.40x faster |
7-Zip Decompression | 57,362 MIPS | 37,612 MIPS | ~1.53x faster |
Linux Kernel Compile | 122.90 s | 178.45 s | ~1.45x faster |
OpenSSL SHA256 | 12.90 GB/s | 7.45 GB/s | ~1.73x faster |
OpenSSL SHA512 | 4.33 GB/s | 2.66 GB/s | ~1.63x faster |
OpenSSL RSA4096 Sign | 2,095 /s | 1,625 /s | ~1.29x faster |
OpenSSL RSA4096 Verify | 132,978 /s | 104,828 /s | ~1.27x faster |
OpenSSL ChaCha20 | 26.41 GB/s | 18.92 GB/s | ~1.40x faster |
OpenSSL AES-128-GCM | 65.83 GB/s | 38.84 GB/s | ~1.70x faster |
OpenSSL AES-256-GCM | 57.26 GB/s | 33.47 GB/s | ~1.71x faster |
OpenSSL ChaCha20-Poly1305 | 14.91 GB/s | 11.60 GB/s | ~1.28x faster |
Redis GET (50 conns) | 4,334,402 req/s | 1,038,531 req/s | ~4.17x faster
Redis SET (50 conns) | 2,789,408 req/s | 1,037,535 req/s | ~2.69x faster
Redis GET (500 conns) | 2,664,632 req/s | 1,756,982 req/s | ~1.52x faster
Redis LPOP (50 conns) | 4,426,627 req/s | 796,807 req/s | ~5.55x faster
Redis SET (500 conns) | 2,238,991 req/s | 1,425,186 req/s | ~1.57x faster
Redis LPOP (1000 conns) | 2,720,310 req/s | 1,467,308 req/s | ~1.85x faster
Redis LPUSH (50 conns) | 2,748,150 req/s | 688,325 req/s | ~3.99x faster
Redis SADD (500 conns) | 2,567,202 req/s | 1,891,882 req/s | ~1.36x faster
Multi-threaded tasks benefit significantly from E-cores. Decompression sees a bigger relative improvement, likely due to better multi-thread scaling. Full core usage speeds up compilation by about a third.
Single-threaded or lightly threaded tasks benefit less but still gain some improvement when E-cores assist background threads.
Redis throughput improves dramatically with E-cores enabled, especially at lower concurrency where single-thread performance matters less, and total parallelism dominates.
For mixed workloads (databases, Redis, compression, video encoding), enabling all cores is needed for maximum performance. Disabling E-cores mainly limits total throughput for parallel tasks.
CPU Benchmarks with only one CPU on the dual socket
Since Sol has two CPU sockets with a CPU in each, it was interesting to run the benchmarks on just one of them: this makes the core/thread counts between the systems more comparable and, perhaps, shows how much NUMA affects the results.
Run task only on first CPU: numactl --cpubind=0 --membind=0 <COMMAND>
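A quick sketch for verifying the topology first, assuming each socket maps to its own NUMA node (true on this machine; the first command confirms it):
```sh
# Show NUMA topology: nodes, their CPUs, and per-node memory.
numactl --hardware

# Bind both execution and memory allocations to node 0 so nothing
# crosses the inter-socket (QPI) link during the run:
numactl --cpubind=0 --membind=0 phoronix-test-suite benchmark pts/compress-7zip
```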
Results on openbenchmarking.org: https://openbenchmarking.org/result/2509146-NE-CPUSOLSIN78
Test | Dual CPU | Single CPU socket | Difference |
---|---|---|---|
x265 4K (Bosphorus) | 5.55 fps | 6.16 fps | ~0.90x (slower dual) |
x265 1080p (Bosphorus) | 15.57 fps | 15.84 fps | ~0.98x (similar) |
7-Zip Compression | 48,654 MIPS | 25,680 MIPS | ~1.90x faster
7-Zip Decompression | 42,452 MIPS | 21,362 MIPS | ~1.99x faster
Linux Kernel Build (defconfig) | 227.11 s (lower=better) | 441.74 s | ~1.94x faster
OpenSSL SHA256 | 2.30 GB/s | 1.15 GB/s | ~2.00x faster
OpenSSL SHA512 | 2.43 GB/s | 1.22 GB/s | ~1.99x faster
OpenSSL RSA4096 sign | 1299.7 ops/s | 651.8 ops/s | ~1.99x faster
OpenSSL RSA4096 verify | 87,888 ops/s | 43,995 ops/s | ~2.00x faster
OpenSSL ChaCha20 | 27.0 GB/s | 13.5 GB/s | ~2.00x faster
OpenSSL AES-128-GCM | 28.4 GB/s | 14.2 GB/s | ~2.00x faster
OpenSSL AES-256-GCM | 21.9 GB/s | 11.0 GB/s | ~2.00x faster
OpenSSL ChaCha20-Poly1305 | 16.9 GB/s | 8.48 GB/s | ~2.00x faster
Redis GET (50 conn) | 2,145,363 req/s | 1,496,068 req/s | ~1.43x faster |
Redis SET (50 conn) | 1,688,719 req/s | 993,505 req/s | ~1.70x faster |
Redis LPOP (50 conn) | 2,417,681 req/s | 1,492,706 req/s | ~1.62x faster |
Redis SADD (50 conn) | 1,926,472 req/s | 1,170,568 req/s | ~1.65x faster |
Redis LPUSH (50 conn) | 1,116,586 req/s | 898,764 req/s | ~1.24x faster |
Redis GET (500 conn) | 1,963,410 req/s | 1,466,446 req/s | ~1.34x faster |
Redis SET (500 conn) | 1,659,920 req/s | 1,081,548 req/s | ~1.54x faster |
Redis LPOP (500 conn) | 2,459,230 req/s | 1,660,237 req/s | ~1.48x faster |
Redis SADD (500 conn) | 1,849,049 req/s | 1,323,014 req/s | ~1.40x faster |
Redis LPUSH (500 conn) | 1,538,552 req/s | 964,594 req/s | ~1.60x faster |
Redis GET (1000 conn) | 2,184,990 req/s | 1,437,471 req/s | ~1.52x faster |
Redis SET (1000 conn) | 1,688,344 req/s | 1,087,756 req/s | ~1.55x faster |
Redis LPOP (1000 conn) | 1,552,459 req/s | 1,493,120 req/s | ~1.04x faster |
Redis SADD (1000 conn) | 1,972,121 req/s | 1,279,570 req/s | ~1.54x faster |
Redis LPUSH (1000 conn) | 1,491,898 req/s | 951,028 req/s | ~1.57x faster |
Compute-heavy workloads (7-Zip, OpenSSL, kernel build) scale almost perfectly (~2x) with both sockets.
x265 scaling is poor: single-socket actually edges out dual-socket in 4K and is nearly identical in 1080p, likely because cross-socket synchronization and NUMA memory traffic cost more than the second CPU's cores contribute.
CPU Benchmark Summary
Test | Winner | Difference |
---|---|---|
Video Encoding (x265) | Green | ~1.1-1.3x faster |
Compression (7-Zip) | Green | ~1.4-1.9x faster |
Kernel Compilation | Green | ~1.8x faster |
Cryptography (SHA, RSA, AES) | Green | ~1.5-5.6x faster |
Cryptography (ChaCha20/Poly1305) | Sol | ~1.1x faster
Redis (in-memory DB) | Green | ~1.2-2.5x more req/s |
MariaDB (SQL DB) | Green | ~13-61x more QPS |
Green's modern architecture gives it far better IPC, and it wins even heavily parallel workloads (7-Zip compression) where a dual-socket server might be expected to hold its own.
The cryptography wins are likely due to newer ISA extensions on Green: SHA-NI (absent on Haswell-era Xeons) would explain the ~5.6x SHA256 gap, and vectorized AES helps the AES-GCM results. Surprisingly, ChaCha20 performance is equal or even slightly better on Sol: ChaCha20 has no dedicated hardware acceleration on either CPU and runs as plain AVX2 code, so raw vector throughput across two sockets lets the Xeons keep pace. RSA4096 and other public-key operations scale less dramatically (~1.5-1.6x), reflecting their compute-bound nature with little dependency on memory bandwidth.
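For reference, throughput figures like these can be spot-checked by hand with openssl speed; whether PTS passes exactly these flags is an assumption, but -multi fans the work out across cores the same way:
```sh
# EVP fast path, one worker process per logical CPU:
openssl speed -multi "$(nproc)" -evp aes-256-gcm
openssl speed -multi "$(nproc)" -evp chacha20-poly1305
openssl speed -multi "$(nproc)" -evp sha256    # SHA-NI vs. no SHA-NI shows up here
openssl speed -multi "$(nproc)" rsa4096
```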
Green consistently outperforms Sol on the Redis benchmarks. At higher connection counts, however, the delta shrinks slightly, indicating that the memory and I/O subsystems become the bottleneck at scale.
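The Redis rows above come from the PTS test profile; a rough hand-run equivalent (invocation details assumed) looks like:
```sh
# -t commands to test, -c parallel client connections, -n total requests, -q quiet output:
redis-benchmark -t get,set,lpush,lpop,sadd -c 50 -n 1000000 -q
redis-benchmark -t get -c 1000 -n 1000000 -q    # high-concurrency variant
```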
The massive MariaDB gap is not down to the CPU alone: NVMe vs SAS RAID latency matters too, and the CPU cannot shine when disk latency dominates.
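These QPS-per-client-count figures match the shape of a mysqlslap run, which is presumably what the PTS MariaDB test wraps; a hand-rolled equivalent (flags and scale assumed) might be:
```sh
# Auto-generate a synthetic schema and hammer the server with 64 concurrent clients:
mysqlslap --user=root --auto-generate-sql \
          --concurrency=64 --number-of-queries=100000
```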
Disk Benchmarks
Sol has: 2400GB logical volume (8 x 600GB 10K SAS in RAID10 on an HP Smart Array P440ar) on ext4
Green has: NVMe 1000GB CT1000E100SSD8 on xfs
Results on openbenchmarking.org:
- https://openbenchmarking.org/result/2509115-NE-DISKGREEN01
- https://openbenchmarking.org/result/2509145-NE-DISKSOL7014
Test | Green | Sol | Difference |
---|---|---|---|
SQLite 1 thread | 9.747 s | 231.45 s | ~24x faster |
SQLite 2 threads | 19.32 s | 416.77 s | ~22x faster |
SQLite 4 threads | 18.64 s | 556.95 s | ~30x faster |
SQLite 8 threads | 32.27 s | 719.51 s | ~22x faster |
SQLite 16 threads | 48.89 s | 948.19 s | ~19x faster |
FIO Random Write 2MB | 1.187 GB/s | 0.477 GB/s | ~2.5x faster
FIO Random Write 4KB | 1.176 GB/s | 0.489 GB/s | ~2.4x faster
FIO Sequential Read 2MB | 1.955 GB/s | 0.884 GB/s | ~2.2x faster
FIO Sequential Read 4KB | 1.942 GB/s | 0.870 GB/s | ~2.2x faster
FIO Sequential Write 2MB | 1.523 GB/s | 0.737 GB/s | ~2.1x faster
FIO Sequential Write 4KB | 1.501 GB/s | 0.761 GB/s | ~2x faster
FS-Mark 1000 Files | 829.7 files/s | 43.7 files/s | ~19x faster |
FS-Mark 4000 Files, 32 Dirs | 805.8 files/s | 44.8 files/s | ~18x faster |
Dbench 1 client | 398.87 MB/s | 24.40 MB/s | ~16x faster |
Dbench 12 clients | 2047.02 MB/s | 127.63 MB/s | ~16x faster |
IOR 2MB blocks | 2608.21 MB/s | 199.94 MB/s | ~13x faster
IOR 4MB blocks | 2897.53 MB/s | 236.61 MB/s | ~12x faster
IOR 8MB blocks | 3085.04 MB/s | 308.99 MB/s | ~10x faster
IOR 16MB blocks | 3168.04 MB/s | 346.90 MB/s | ~9x faster
IOR 32MB blocks | 2879.00 MB/s | 407.15 MB/s | ~7x faster
IOR 64MB blocks | 2875.59 MB/s | 465.83 MB/s | ~6x faster
IOR 256MB blocks | 3049.90 MB/s | 459.17 MB/s | ~6.6x faster
IOR 512MB blocks | 2986.88 MB/s | 409.78 MB/s | ~7.3x faster
IOR 1024MB blocks | 3391.72 MB/s | 382.75 MB/s | ~8.9x faster
Disk Benchmark Summary
Test | Winner | Difference |
---|---|---|
SQLite (1-16 threads) | Green | ~19-30x faster |
FIO Random Write 2MB / 4KB | Green | ~2-2.5x faster |
FIO Sequential Read 2MB / 4KB | Green | ~2.2x faster |
FIO Sequential Write 2MB / 4KB | Green | ~2x faster |
FS-Mark small files | Green | ~18-19x faster |
Dbench (1/12 clients) | Green | ~16x faster |
IOR small blocks (2-8MB) | Green | ~10-13x faster |
IOR medium blocks (16-64MB) | Green | ~6-9x faster |
IOR large blocks (256-1024MB) | Green | ~6-9x faster |
SQLite single-threaded performance shows a 24x advantage for Green; even with multiple threads, Green dominates. NVMe latency vs SAS latency explains this.
FIO sequential and random throughput is 2-2.5x higher on Green. NVMe scales better thanks to deep queue depths and a modern, efficient controller.
IOR shows Green achieving 6-13x higher throughput, with the delta shrinking for very large I/O blocks (32-1024MB). This suggests Sol's RAID10 overhead matters less for large sequential transfers, though NVMe still dominates.
FS-Mark and Dbench show Green is ~16-19x faster: the combined effect of NVMe's low latency, XFS efficiency, and the absence of the mechanical seek delays that 10K SAS disks suffer.
NVMe SSDs provide massive latency and throughput advantages for small and large workloads alike. RAID10 SAS arrays hold up in sustained large-block transfers, but fall behind in low-latency, metadata-heavy scenarios.
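For context, a rough stand-in for the FIO profiles discussed above (job parameters are assumptions, not the exact PTS options):
```sh
# Direct-I/O random 4KB writes at high queue depth (NVMe's strong suit):
fio --name=randwrite4k --rw=randwrite --bs=4k --size=4g --direct=1 \
    --ioengine=libaio --iodepth=64 --runtime=60 --time_based

# Direct-I/O sequential 2MB reads (where the RAID10 array fares best):
fio --name=seqread2m --rw=read --bs=2m --size=8g --direct=1 \
    --ioengine=libaio --iodepth=16 --runtime=60 --time_based
```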
RAM Benchmarks
Sol has: 16 x 16 GB DDR4-2133MT/s 752369-081
Green has: 2 x 32 GB DDR5-5200MT/s
Results on openbenchmarking.org:
- https://openbenchmarking.org/result/2509141-NE-RAMSOL48015
- https://openbenchmarking.org/result/2509145-NE-RAMGREEN277
Test | Green | Sol | Difference |
---|---|---|---|
RAMspeed SMP Add (Integer) | 49170.75 MB/s | 25245.05 MB/s | ~1.9x faster |
RAMspeed SMP Copy (Integer) | 46294.82 MB/s | 24593.03 MB/s | ~1.9x faster |
RAMspeed SMP Scale (Integer) | 44536.78 MB/s | 22317.15 MB/s | ~2x faster |
RAMspeed SMP Triad (Integer) | 47968.84 MB/s | 24709.48 MB/s | ~1.9x faster |
RAMspeed SMP Average (Integer) | 44173.83 MB/s | 24336.00 MB/s | ~1.8x faster |
RAMspeed SMP Add (Floating Point) | 45365.78 MB/s | 26331.63 MB/s | ~1.7x faster |
RAMspeed SMP Copy (Floating Point) | 43004.43 MB/s | 25118.17 MB/s | ~1.7x faster |
RAMspeed SMP Scale (Floating Point) | 42716.91 MB/s | 22584.29 MB/s | ~1.9x faster |
RAMspeed SMP Triad (Floating Point) | 44670.74 MB/s | 26515.81 MB/s | ~1.7x faster |
RAMspeed SMP Average (Floating Point) | 43906.89 MB/s | 25257.18 MB/s | ~1.7x faster |
Stream Copy | 64740.8 MB/s | 85814.2 MB/s | ~1.3x slower |
Stream Scale | 54172.5 MB/s | 65886.2 MB/s | ~1.2x slower |
Stream Triad | 56556.2 MB/s | 71293.8 MB/s | ~1.3x slower |
Stream Add | 56446.8 MB/s | 71080.0 MB/s | ~1.3x slower |
Tinymembench Memcpy | 28540.4 MB/s | 9522.9 MB/s | ~3x faster |
Tinymembench Memset | 56744.2 MB/s | 5950.6 MB/s | ~9.5x faster |
MBW Memory Copy | 26202.42 MiB/s | 11409.51 MiB/s | ~2.3x faster |
MBW Memory Copy Fixed Block | 14090.10 MiB/s | 4169.46 MiB/s | ~3.4x faster |
t-test1 Threads 1 | 18.05 s | 32.38 s | ~1.8x faster |
t-test1 Threads 2 | 5.814 s | 16.51 s | ~2.8x faster |
CacheBench Read Cache | 20550.62 MB/s | 9333.97 MB/s | ~2.2x faster |
CacheBench Write Cache | 268593.70 MB/s | 41921.13 MB/s | ~6.4x faster |
RAM Benchmark Summary
Test | Winner | Difference |
---|---|---|
RAMspeed SMP (Integer) | Green | ~1.8-2x faster |
RAMspeed SMP (Floating Point) | Green | ~1.7-1.9x faster
Stream | Sol | ~1.2-1.3x faster |
Tinymembench Memcpy | Green | ~3x faster |
Tinymembench Memset | Green | ~9.5x faster |
MBW Memory Copy | Green | ~2.3x faster |
MBW Memory Copy Fixed Block | Green | ~3.4x faster |
t-test1 Threads 1 | Green | ~1.8x faster |
t-test1 Threads 2 | Green | ~2.8x faster |
CacheBench Read Cache | Green | ~2.2x faster |
CacheBench Write Cache | Green | ~6.4x faster |
RAMspeed SMP results favor Green thanks to DDR5's much higher transfer rate (5200 vs 2133 MT/s).
STREAM favors Sol, likely due to its quad-channel-per-socket DDR4 configuration (8 channels total, 16 DIMMs across 2 CPUs), which provides higher aggregate bandwidth despite slower individual DIMMs. Many-channel DDR4 can beat dual-channel DDR5 for large sequential streams, but only for continuous bulk operations.
Green shows an enormous advantage in cache writes, reflecting a modern cache hierarchy with faster L2/L3.
Green's memcpy/memset results show its low-latency advantage for small-block memory operations.
Threaded memory tests show Green's superior per-thread latency and IPC, particularly in multi-threaded small-memory scenarios.
Green benefits from modern DDR5 and low-latency caches for small-block and per-thread operations, whereas Sol's memory excels in sustained high-bandwidth workloads thanks to its many-channel DDR4 configuration.
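To reproduce the STREAM numbers by hand (the array size here is an assumption; it only needs to dwarf the caches so DRAM bandwidth, not cache, is measured):
```sh
# McCalpin's stream.c; STREAM_ARRAY_SIZE is elements per array
# (4e8 doubles = ~3.2 GB per array, far beyond any L3):
gcc -O3 -fopenmp -DSTREAM_ARRAY_SIZE=400000000 stream.c -o stream
OMP_NUM_THREADS="$(nproc)" ./stream
```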
Overall notes
Green has fewer full-size performance cores, but high IPC, modern caches, and DDR5 bandwidth more than compensate. Many benchmarks (MariaDB, Redis, crypto) show that per-core performance matters more than server-class parallelism, especially for latency-sensitive workloads.
MariaDB and SQLite show extreme differences, not just due to CPU but also NVMe vs RAID SAS latency.
Fast DDR5 also helps hide NVMe latency. Small-block I/O (SQLite, Dbench) benefits from Green's faster memory and cache hierarchy; Sol's many-channel memory helps sustain sequential FIO reads but cannot compensate for mechanical seek times in small-block workloads.
Enterprise Xeons still shine in extreme multi-threaded sequential memory or I/O scenarios, but their architecture is tuned for consistency and parallel throughput, not latency-sensitive operations.