diff --git a/src/graphics/lib/compute/hotsort/README.md b/src/graphics/lib/compute/hotsort/README.md
index 6797a00d3412ec447bb6163d6323481e4cbbfdaf..fb9bce5aa9ff4e3e36ff07fa1fa34ca3cd345fcb 100644
--- a/src/graphics/lib/compute/hotsort/README.md
+++ b/src/graphics/lib/compute/hotsort/README.md
@@ -24,26 +24,26 @@ implementations when sorting arrays of smaller than 500K-2M keys.
 Here is a throughput plot for HotSort sorting 32-bit and 64-bit keys with a 640-core Quadro M1200:
-
-
+
+
 HotSort throughput on Vulkan (Mesa) with a 704-core AMD V1807B APU:
-
+
 HotSort throughput on Vulkan with a 192-core Intel HD 630:
-
+
 ### Execution time
 Note that these sorting rates translate to sub-millisecond to multi-millisecond execution times on small GPUs:
-
-
-
-
+
+
+
+
 # Usage
@@ -64,11 +64,11 @@ The following architectures are supported:
 Vendor | Architecture | 32‑bit | 64‑bit | 32+32‑bit | Notes
 -------|-------------------------------------------|:------------------:|:------------------:|:-----------:|------
-NVIDIA | sm_35,sm_37,sm_50,sm_52,sm_60,sm_61,sm_70 | :white_check_mark: | :white_check_mark: | :x: | Not tested on all architectures
-NVIDIA | sm_30,sm_32,sm_53,sm_62 | :x: | :x: | :x: | Need to generate properly shaped kernels
-AMD | GCN | :white_check_mark: | :white_check_mark: | :x: | Tested on Linux MESA 18.2
-Intel | GEN8+ | :white_check_mark: | :white_check_mark: | :x: | Good but the assumed *best-shaped* kernels aren't being used due to a compiler issue
-Intel | APL/GLK using a 2x9 or 1x12 thread pool | :x: | :x: | :x: | Need to generate properly shaped kernels
+NVIDIA | sm_35,sm_37,sm_50,sm_52,sm_60,sm_61,sm_70 | ✔ | ✔ | ❌ | Not tested on all architectures
+NVIDIA | sm_30,sm_32,sm_53,sm_62 | ❌ | ❌ | ❌ | Need to generate properly shaped kernels
+AMD | GCN | ✔ | ✔ | ❌ | Tested on Linux MESA 18.2
+Intel | GEN8+ | ✔ | ✔ | ❌ | Good but the assumed *best-shaped* kernels aren't being used due to a compiler issue
+Intel | APL/GLK using a 2x9 or 1x12 thread pool | ❌ | ❌ | ❌ | Need to generate properly shaped kernels
 An architecture-specific instance of the HotSort algorithm is referred to as a "target".
@@ -163,7 +163,7 @@ In the slab sorting phase, each lane of a subgroup executes a bitonic sorting network on its registers and successively merges lanes until the slab of registers is sorted in serpentine order.
-
+
 ## Merging
@@ -179,7 +179,7 @@ sequences. This property also holds for non-power-of-two sequences.
 As an example, the *Streaming Flip Merge* kernel is illustrated below:
-
+
 # Future Enhancements
diff --git a/src/graphics/lib/compute/hotsort/docs/images/hs_amd_gcn_mkeys.png b/src/graphics/lib/compute/hotsort/docs/images/hs_amd_gcn_mkeys.png
new file mode 100644
index 0000000000000000000000000000000000000000..0d18746df897e4ee23fc47d51a24bb1e1f4eb901
Binary files /dev/null and b/src/graphics/lib/compute/hotsort/docs/images/hs_amd_gcn_mkeys.png differ
diff --git a/src/graphics/lib/compute/hotsort/docs/images/hs_amd_gcn_msecs.png b/src/graphics/lib/compute/hotsort/docs/images/hs_amd_gcn_msecs.png
new file mode 100644
index 0000000000000000000000000000000000000000..2549d007eb068f1ccf616c598442f63999693c5c
Binary files /dev/null and b/src/graphics/lib/compute/hotsort/docs/images/hs_amd_gcn_msecs.png differ
diff --git a/src/graphics/lib/compute/hotsort/docs/images/hs_flip_merge.png b/src/graphics/lib/compute/hotsort/docs/images/hs_flip_merge.png
new file mode 100644
index 0000000000000000000000000000000000000000..c213cd5859e0b48f0a084133f972b3e259a70469
Binary files /dev/null and b/src/graphics/lib/compute/hotsort/docs/images/hs_flip_merge.png differ
diff --git a/src/graphics/lib/compute/hotsort/docs/images/hs_intel_gen8_mkeys.png b/src/graphics/lib/compute/hotsort/docs/images/hs_intel_gen8_mkeys.png
new file mode 100644
index 0000000000000000000000000000000000000000..73f428b870baaf48fa520e7d9f8167cc42d7c0f5
Binary files /dev/null and b/src/graphics/lib/compute/hotsort/docs/images/hs_intel_gen8_mkeys.png differ
diff --git a/src/graphics/lib/compute/hotsort/docs/images/hs_intel_gen8_msecs.png b/src/graphics/lib/compute/hotsort/docs/images/hs_intel_gen8_msecs.png
new file mode 100644
index 0000000000000000000000000000000000000000..d239d3e226a7420fd927545528a542cf22193f13
Binary files /dev/null and b/src/graphics/lib/compute/hotsort/docs/images/hs_intel_gen8_msecs.png differ
diff --git a/src/graphics/lib/compute/hotsort/docs/images/hs_nvidia_sm35_u32_mkeys.png b/src/graphics/lib/compute/hotsort/docs/images/hs_nvidia_sm35_u32_mkeys.png
new file mode 100644
index 0000000000000000000000000000000000000000..fd0753a3f6506f8d2a827ac1ff6eff991b1678a0
Binary files /dev/null and b/src/graphics/lib/compute/hotsort/docs/images/hs_nvidia_sm35_u32_mkeys.png differ
diff --git a/src/graphics/lib/compute/hotsort/docs/images/hs_nvidia_sm35_u32_msecs.png b/src/graphics/lib/compute/hotsort/docs/images/hs_nvidia_sm35_u32_msecs.png
new file mode 100644
index 0000000000000000000000000000000000000000..abf79ead2e97102670f5c05a6f6b40d6c209299c
Binary files /dev/null and b/src/graphics/lib/compute/hotsort/docs/images/hs_nvidia_sm35_u32_msecs.png differ
diff --git a/src/graphics/lib/compute/hotsort/docs/images/hs_nvidia_sm35_u64_mkeys.png b/src/graphics/lib/compute/hotsort/docs/images/hs_nvidia_sm35_u64_mkeys.png
new file mode 100644
index 0000000000000000000000000000000000000000..a55316300fe64fc4e3de672caee59d1880f96f75
Binary files /dev/null and b/src/graphics/lib/compute/hotsort/docs/images/hs_nvidia_sm35_u64_mkeys.png differ
diff --git a/src/graphics/lib/compute/hotsort/docs/images/hs_nvidia_sm35_u64_msecs.png b/src/graphics/lib/compute/hotsort/docs/images/hs_nvidia_sm35_u64_msecs.png
new file mode 100644
index 0000000000000000000000000000000000000000..a36eae0ae21a965aee89ed68a7d2b2d515abc3e6
Binary files /dev/null and b/src/graphics/lib/compute/hotsort/docs/images/hs_nvidia_sm35_u64_msecs.png differ
diff --git a/src/graphics/lib/compute/hotsort/docs/images/hs_sorted_slab.png b/src/graphics/lib/compute/hotsort/docs/images/hs_sorted_slab.png
new file mode 100644
index 0000000000000000000000000000000000000000..09e624faf47ef39523d6468f8311eef0eaccb648
Binary files /dev/null and b/src/graphics/lib/compute/hotsort/docs/images/hs_sorted_slab.png differ
diff --git a/src/graphics/lib/compute/hotsort/docs/images/hs_amd_gcn_mkeys.svg b/src/graphics/lib/compute/hotsort/docs/images/src/hs_amd_gcn_mkeys.svg
similarity index 100%
rename from src/graphics/lib/compute/hotsort/docs/images/hs_amd_gcn_mkeys.svg
rename to src/graphics/lib/compute/hotsort/docs/images/src/hs_amd_gcn_mkeys.svg
diff --git a/src/graphics/lib/compute/hotsort/docs/images/hs_amd_gcn_msecs.svg b/src/graphics/lib/compute/hotsort/docs/images/src/hs_amd_gcn_msecs.svg
similarity index 100%
rename from src/graphics/lib/compute/hotsort/docs/images/hs_amd_gcn_msecs.svg
rename to src/graphics/lib/compute/hotsort/docs/images/src/hs_amd_gcn_msecs.svg
diff --git a/src/graphics/lib/compute/hotsort/docs/images/hs_flip_merge.svg b/src/graphics/lib/compute/hotsort/docs/images/src/hs_flip_merge.svg
similarity index 100%
rename from src/graphics/lib/compute/hotsort/docs/images/hs_flip_merge.svg
rename to src/graphics/lib/compute/hotsort/docs/images/src/hs_flip_merge.svg
diff --git a/src/graphics/lib/compute/hotsort/docs/images/hs_intel_gen8_mkeys.svg b/src/graphics/lib/compute/hotsort/docs/images/src/hs_intel_gen8_mkeys.svg
similarity index 100%
rename from src/graphics/lib/compute/hotsort/docs/images/hs_intel_gen8_mkeys.svg
rename to src/graphics/lib/compute/hotsort/docs/images/src/hs_intel_gen8_mkeys.svg
diff --git a/src/graphics/lib/compute/hotsort/docs/images/hs_intel_gen8_msecs.svg b/src/graphics/lib/compute/hotsort/docs/images/src/hs_intel_gen8_msecs.svg
similarity index 100%
rename from src/graphics/lib/compute/hotsort/docs/images/hs_intel_gen8_msecs.svg
rename to src/graphics/lib/compute/hotsort/docs/images/src/hs_intel_gen8_msecs.svg
diff --git a/src/graphics/lib/compute/hotsort/docs/images/hs_nvidia_sm35_u32_mkeys.svg b/src/graphics/lib/compute/hotsort/docs/images/src/hs_nvidia_sm35_u32_mkeys.svg
similarity index 100%
rename from src/graphics/lib/compute/hotsort/docs/images/hs_nvidia_sm35_u32_mkeys.svg
rename to src/graphics/lib/compute/hotsort/docs/images/src/hs_nvidia_sm35_u32_mkeys.svg
diff --git a/src/graphics/lib/compute/hotsort/docs/images/hs_nvidia_sm35_u32_msecs.svg b/src/graphics/lib/compute/hotsort/docs/images/src/hs_nvidia_sm35_u32_msecs.svg
similarity index 100%
rename from src/graphics/lib/compute/hotsort/docs/images/hs_nvidia_sm35_u32_msecs.svg
rename to src/graphics/lib/compute/hotsort/docs/images/src/hs_nvidia_sm35_u32_msecs.svg
diff --git a/src/graphics/lib/compute/hotsort/docs/images/hs_nvidia_sm35_u64_mkeys.svg b/src/graphics/lib/compute/hotsort/docs/images/src/hs_nvidia_sm35_u64_mkeys.svg
similarity index 100%
rename from src/graphics/lib/compute/hotsort/docs/images/hs_nvidia_sm35_u64_mkeys.svg
rename to src/graphics/lib/compute/hotsort/docs/images/src/hs_nvidia_sm35_u64_mkeys.svg
diff --git a/src/graphics/lib/compute/hotsort/docs/images/hs_nvidia_sm35_u64_msecs.svg b/src/graphics/lib/compute/hotsort/docs/images/src/hs_nvidia_sm35_u64_msecs.svg
similarity index 100%
rename from src/graphics/lib/compute/hotsort/docs/images/hs_nvidia_sm35_u64_msecs.svg
rename to src/graphics/lib/compute/hotsort/docs/images/src/hs_nvidia_sm35_u64_msecs.svg
diff --git a/src/graphics/lib/compute/hotsort/docs/images/hs_sorted_slab.svg b/src/graphics/lib/compute/hotsort/docs/images/src/hs_sorted_slab.svg
similarity index 100%
rename from src/graphics/lib/compute/hotsort/docs/images/hs_sorted_slab.svg
rename to src/graphics/lib/compute/hotsort/docs/images/src/hs_sorted_slab.svg
diff --git a/src/graphics/lib/compute/spinel/README.md b/src/graphics/lib/compute/spinel/README.md
index 27bf1474c07409c87f8db521dbf4cdeaa06d0bcc..6e7d1519a9b8b572d32d7719582dd956803d14dc 100644
--- a/src/graphics/lib/compute/spinel/README.md
+++ b/src/graphics/lib/compute/spinel/README.md
@@ -48,7 +48,7 @@ These two capabilities can enable significant work reuse.
 # Benchmarks
-:construction:
+🚧
 # Usage
@@ -60,26 +60,26 @@ The following architectures are under development:
 Vendor | Architecture | Status | Notes
 -------|-------------------------------------------|:--------------:|------
-AMD | GCN3+ | :construction: | Under construction
-ARM | Bifrost (4-wide) | :construction: | Under construction
-ARM | Bifrost (8-wide) | :construction: | Under construction
-NVIDIA | sm_35,sm_37,sm_50,sm_52,sm_60,sm_61,sm_70 | :construction: | Under construction
-NVIDIA | sm_30,sm_32,sm_53,sm_62 | :x: |
-Intel | GEN8+ | :construction: | Under construction
-Intel | APL/GLK using a 2x9 or 1x12 thread pool | :x: |
+AMD | GCN3+ | 🚧 | Under construction
+ARM | Bifrost (4-wide) | 🚧 | Under construction
+ARM | Bifrost (8-wide) | 🚧 | Under construction
+NVIDIA | sm_35,sm_37,sm_50,sm_52,sm_60,sm_61,sm_70 | 🚧 | Under construction
+NVIDIA | sm_30,sm_32,sm_53,sm_62 | ❌ |
+Intel | GEN8+ | 🚧 | Under construction
+Intel | APL/GLK using a 2x9 or 1x12 thread pool | ❌ |
 # Programming Idioms
-:construction:
+🚧
-
+
 # Architecture
-:construction:
+🚧
-
+
 # Future Enhancements
-:construction:
+🚧
diff --git a/src/graphics/lib/compute/spinel/docs/images/spinel_api.png b/src/graphics/lib/compute/spinel/docs/images/spinel_api.png
new file mode 100644
index 0000000000000000000000000000000000000000..48073d7f28abdf89522976a9e3f8208a23239e1d
Binary files /dev/null and b/src/graphics/lib/compute/spinel/docs/images/spinel_api.png differ
diff --git a/src/graphics/lib/compute/spinel/docs/images/spinel_pipeline.png b/src/graphics/lib/compute/spinel/docs/images/spinel_pipeline.png
new file mode 100644
index 0000000000000000000000000000000000000000..3b92c9320276a9bc82b283070485ca70d71c4cb5
Binary files /dev/null and b/src/graphics/lib/compute/spinel/docs/images/spinel_pipeline.png differ
diff --git a/src/graphics/lib/compute/spinel/images/spinel_api.svg b/src/graphics/lib/compute/spinel/docs/images/src/spinel_api.svg
similarity index 100%
rename from src/graphics/lib/compute/spinel/images/spinel_api.svg
rename to src/graphics/lib/compute/spinel/docs/images/src/spinel_api.svg
diff --git a/src/graphics/lib/compute/spinel/images/spinel_pipeline.svg b/src/graphics/lib/compute/spinel/docs/images/src/spinel_pipeline.svg
similarity index 100%
rename from src/graphics/lib/compute/spinel/images/spinel_pipeline.svg
rename to src/graphics/lib/compute/spinel/docs/images/src/spinel_pipeline.svg