Skip to content

seung-lab/vqsort

Repository files navigation

vqsort

Sort arrays with SIMD vqsort in Python

VQSort supports 16,32,64 bit numeric types (signed/unsigned integer and floats).

import vqsort
import numpy as np

arr = np.random.randint(0, 1000, size=[100000], dtype=np.uint32)

 # still need to assign in case type is not supported by vqsort such as 8-bit
arr = vqsort.sort(arr, in_place=True)

After testing on Apple Silicon M3, floats show an advantage, integers do not.

Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz

uint64
         N | numpy GB/s | vqsort GB/s |  advantage
--------------------------------------------------
    100000 |       0.53 |       0.63 |       1.21x
    300000 |       0.49 |       0.60 |       1.22x
   1000000 |       0.46 |       0.56 |       1.22x
   3000000 |       0.43 |       0.53 |       1.23x
  10000000 |       0.41 |       0.49 |       1.22x
  30000000 |       0.38 |       0.47 |       1.22x
uint32
         N | numpy GB/s | vqsort GB/s |  advantage
--------------------------------------------------
    100000 |       1.87 |       1.80 |       0.96x
    300000 |       1.72 |       1.70 |       0.99x
   1000000 |       1.55 |       1.54 |       0.99x
   3000000 |       1.43 |       1.43 |       1.00x
  10000000 |       1.31 |       1.33 |       1.01x
  30000000 |       1.23 |       1.24 |       1.01x
uint16
         N | numpy GB/s | vqsort GB/s |  advantage
--------------------------------------------------
    100000 |       0.20 |       1.43 |       7.20x
    300000 |       0.19 |       1.27 |       6.84x
   1000000 |       0.18 |       1.13 |       6.31x
   3000000 |       0.18 |       1.03 |       5.70x
  10000000 |       0.18 |       1.04 |       5.69x
  30000000 |       0.18 |       1.15 |       6.33x
float64
         N | numpy GB/s | vqsort GB/s |  advantage
--------------------------------------------------
    100000 |       0.88 |       0.67 |       0.77x
    300000 |       0.83 |       0.64 |       0.77x
   1000000 |       0.71 |       0.55 |       0.78x
   3000000 |       0.70 |       0.56 |       0.80x
  10000000 |       0.62 |       0.50 |       0.80x
  30000000 |       0.58 |       0.47 |       0.81x
float32
         N | numpy GB/s | vqsort GB/s |  advantage
--------------------------------------------------
    100000 |       1.71 |       1.63 |       0.95x
    300000 |       1.59 |       1.53 |       0.96x
   1000000 |       1.41 |       1.41 |       1.00x
   3000000 |       1.32 |       1.32 |       1.00x
  10000000 |       1.16 |       1.17 |       1.02x
  30000000 |       1.08 |       1.10 |       1.02x
float16
         N | numpy GB/s | vqsort GB/s |  advantage
--------------------------------------------------
Abort at vqsort_f16a.cc:37: Assert 0:
Aborted (core dumped)

ARM64 NEON (Macbook Pro M3)

uint64
         N | numpy GB/s | vqsort GB/s |  advantage
--------------------------------------------------
    100000 |       0.86 |       0.87 |       1.01x
    300000 |       0.79 |       0.79 |       1.01x
   1000000 |       0.74 |       0.74 |       1.01x
   3000000 |       0.69 |       0.70 |       1.02x
  10000000 |       0.66 |       0.66 |       1.01x
  30000000 |       0.62 |       0.62 |       1.00x
uint32
         N | numpy GB/s | vqsort GB/s |  advantage
--------------------------------------------------
    100000 |       2.03 |       2.02 |       0.99x
    300000 |       1.86 |       1.85 |       0.99x
   1000000 |       1.64 |       1.63 |       0.99x
   3000000 |       1.51 |       1.50 |       0.99x
  10000000 |       1.45 |       1.45 |       1.00x
  30000000 |       1.36 |       1.35 |       1.00x
uint16
         N | numpy GB/s | vqsort GB/s |  advantage
--------------------------------------------------
    100000 |       3.76 |       3.72 |       0.99x
    300000 |       3.37 |       3.38 |       1.00x
   1000000 |       3.08 |       3.07 |       1.00x
   3000000 |       2.81 |       2.82 |       1.00x
  10000000 |       3.64 |       3.69 |       1.01x
  30000000 |       4.02 |       4.07 |       1.01x
float64
         N | numpy GB/s | vqsort GB/s |  advantage
--------------------------------------------------
    100000 |       0.69 |       0.92 |       1.34x
    300000 |       0.64 |       0.86 |       1.35x
   1000000 |       0.58 |       0.79 |       1.37x
   3000000 |       0.53 |       0.73 |       1.38x
  10000000 |       0.49 |       0.69 |       1.38x
  30000000 |       0.46 |       0.64 |       1.38x
float32
         N | numpy GB/s | vqsort GB/s |  advantage
--------------------------------------------------
    100000 |       1.40 |       1.98 |       1.41x
    300000 |       1.26 |       1.81 |       1.43x
   1000000 |       1.14 |       1.66 |       1.46x
   3000000 |       1.04 |       1.55 |       1.48x
  10000000 |       0.96 |       1.42 |       1.49x
  30000000 |       0.88 |       1.26 |       1.43x
float16
         N | numpy GB/s | vqsort GB/s |  advantage
--------------------------------------------------
    100000 |       2.21 |       3.47 |       1.57x
    300000 |       2.10 |       3.78 |       1.80x
   1000000 |       2.21 |       4.24 |       1.91x
   3000000 |       2.40 |       5.04 |       2.10x
  10000000 |       2.58 |       5.41 |       2.09x
  30000000 |       2.59 |       5.31 |       2.05x

About

Sort arrays with SIMD vqsort in Python

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors