title | weight | menu | show_toc |
---|---|---|---|
AnyDSL - A Partial Evaluation Framework for Programming High-Performance Libraries | -1 | Home | false |
AnyDSL is a framework for domain-specific libraries (DSLs). These are implemented in our language [Impala]({% link Impala.md %}). To achieve high performance, Impala partially evaluates any abstractions these libraries might impose. Partial evaluation and other optimizations are performed on AnyDSL's intermediate representation [Thorin]({% link Thorin.md %}).
You can ask for support on Discord.
Join the AnyDSL Workshop on July 21, 2022!
When developing a DSL, people from different areas come together:
- the application developer who just wants to use the DSL,
- the DSL designer who develops domain-specific abstractions, and
- the machine expert who knows the target machine very well and how to massage the code in order to achieve good performance.
AnyDSL allows a separation of these concerns using
- higher-order functions,
- partial evaluation, and
- triggered code generation.
// Application developer: just uses the DSL.
fn main() {
    let img = load("dragon.png");
    let blurred = gaussian_blur(img);
}

// DSL designer: builds the domain-specific abstraction on top of iterate.
fn gaussian_blur(field: Field) -> Field {
    let stencil: Stencil = { /* ... */ };
    let mut out: Field = { /* ... */ };
    for x, y in @iterate(out) {
        out.data(x, y) = apply_stencil(x, y, field, stencil);
    }
    out
}

// Machine expert: maps the iteration space onto an NVIDIA GPU.
fn iterate(field: Field, body: fn(int, int) -> ()) -> () {
    let grid = (field.cols, field.rows, 1);
    let block = (128, 1, 1);
    with nvvm(grid, block) {
        let x = nvvm_tid_x() + nvvm_ntid_x() * nvvm_ctaid_x();
        let y = nvvm_tid_y() + nvvm_ntid_y() * nvvm_ctaid_y();
        body(x, y);
    }
}
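
To illustrate the separation of concerns, here is a minimal sketch of how the machine expert could swap in a sequential CPU mapping without touching `gaussian_blur` or `main`. It is an assumption-laden illustration rather than code from the AnyDSL libraries: `cpu_range` is a hypothetical helper introduced here, and `Field` is assumed to be the same type as above, with only its `cols` and `rows` members being used.

```impala
// Hypothetical helper: calls body(i) for every i in [a, b).
fn cpu_range(a: int, b: int, body: fn(int) -> ()) -> () {
    if a < b {
        body(a);
        cpu_range(a + 1, b, body)
    }
}

// Sequential CPU mapping with the same signature as the GPU version of iterate.
// Field is assumed to be the type used above; only cols and rows are read.
fn iterate(field: Field, body: fn(int, int) -> ()) -> () {
    for y in cpu_range(0, field.rows) {
        for x in cpu_range(0, field.cols) {
            body(x, y);
        }
    }
}
```

Because `gaussian_blur` only passes its loop body to `iterate` as a function, the DSL code stays the same for both mappings. The `@` at the call site `@iterate(out)` requests partial evaluation of that call, so the higher-order abstraction is expected to be specialized away rather than remain as indirect calls.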
Rodent: https://github.com/anydsl/rodent
Rodent is a BVH traversal library and renderer implemented using the AnyDSL compiler framework. Rodent is a renderer-generating library that converts 3D scenes into optimized, specialized code that renders the scene on CPUs and GPUs. Compared with state-of-the-art renderers, we obtain the following speedups:
- Embree (Intel): up to 23% faster
- OptiX (NVIDIA): up to 31% faster (megakernel)
- OptiX (NVIDIA): up to 42% faster (wavefront)
Rodent also supports ARM CPUs and AMD GPUs.
Stincilla: https://github.com/anydsl/stincilla
Stincilla is a DSL for stencil codes. We used the Gaussian blur filter as an example and compared it against the implementations in OpenCV 3.0 as a reference, with the following results:
- Intel CPU: 40% faster
- Intel GPU: 25% faster
- AMD GPU: 50% faster
- NVIDIA GPU: 45% faster
- Up to 10x shorter code
RaTrace: https://github.com/anydsl/traversal
RaTrace is a DSL for ray traversal.
- 17% faster on NVIDIA GTX 970 (reference: Aila et al.)
- 11% faster on Intel Core i7-4790 using type inference (reference: Embree)
- 10% slower on Intel Core i7-4790 using auto-vectorization (reference: Embree)
- 1/10th of coding time according to Halstead measures