RISE & Shine

Programs are expressed at a high level in RISE and then transformed using a set of rewrite rules that encode implementation and optimization choices. From the rewritten program, the Shine compiler generates high-performance parallel C or OpenCL code while preserving the optimization choices made during rewriting.

RISE is a spiritual successor to the Lift project.

Overview

Starting from a High-Level RISE Program and an Elevate Optimization Strategy, the Shine compiler rewrites the high-level program, as specified by the optimization strategy, into a Low-Level RISE Program that encodes all implementation and optimization decisions explicitly.

The code generator processes the low-level program to generate the final Optimized C, OpenMP, OpenCL, or CUDA Program.


High-Level RISE Program

def highLevelProgram: ToBeTyped[Rise] =
  depFun((n: Nat, m: Nat, o: Nat) =>
    fun(n`.`o`.`f32)(A => fun(m`.`o`.`f32)(B =>
      A |> map(fun(rowOfA =>
        B |> map(fun(rowOfB =>
          zip(rowOfA)(rowOfB) |>
            map(fun(x => fst(x) * snd(x))) |>
              reduce(add)(l(0.0f)) )) )) )) )

Elevate Optimization Strategy

def optimizationStrategy: Strategy[Rise] =
  (`map |-> mapPar`       `@` outermost(isMap))  `;`
  (`map |-> mapSeq`       `@` outermost(isMap))  `;`
  (`reduce |-> reduceSeq` `@` everywhere)
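
To make the effect of the strategy concrete, the following C sketch models it on a deliberately simplified program representation. It is a toy model, not Elevate's implementation or API: the `Pattern` enum, the flat outermost-to-innermost array, and the functions `rewrite_outermost`, `rewrite_everywhere`, and `apply_strategy` are all hypothetical names introduced here for illustration. It also assumes that a step which rewrites nothing counts as a failure of the whole sequence.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model (assumption, not Elevate's API): a program is a flat list of
   pattern names, ordered from outermost to innermost. */
typedef enum { MAP, MAP_PAR, MAP_SEQ, REDUCE, REDUCE_SEQ } Pattern;

/* Rewrite only the outermost (first) occurrence of `from` into `to`.
   Returns 1 if a rewrite happened, 0 otherwise. */
int rewrite_outermost(Pattern *prog, size_t len, Pattern from, Pattern to) {
  assert(prog != NULL);
  for (size_t i = 0; i < len; i++) {
    if (prog[i] == from) {
      prog[i] = to;
      return 1;
    }
  }
  return 0;
}

/* Rewrite every occurrence of `from` into `to`.
   Returns the number of rewrites performed. */
int rewrite_everywhere(Pattern *prog, size_t len, Pattern from, Pattern to) {
  assert(prog != NULL);
  int count = 0;
  for (size_t i = 0; i < len; i++) {
    if (prog[i] == from) {
      prog[i] = to;
      count++;
    }
  }
  return count;
}

/* The three steps of the strategy, composed in sequence.
   Returns 1 only if every step rewrote something (toy-model assumption);
   evaluation stops at the first step that fails. */
int apply_strategy(Pattern *prog, size_t len) {
  return rewrite_outermost(prog, len, MAP, MAP_PAR)
      && rewrite_outermost(prog, len, MAP, MAP_SEQ)
      && rewrite_everywhere(prog, len, REDUCE, REDUCE_SEQ) > 0;
}
```

Applied to the pattern list `{ MAP, MAP, REDUCE, MAP }`, which mirrors the nesting of the high-level program, the sketch yields `{ MAP_PAR, MAP_SEQ, REDUCE_SEQ, MAP }`: the two outer maps are rewritten one after the other, and the innermost `map` is left untouched because no third map rule is part of the strategy.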

Shine Compiler


rewriting

Low-Level RISE Program

def lowLevelProgram: ToBeTyped[Rise] =
  depFun((n: Nat, m: Nat, o: Nat) =>
    fun(n`.`o`.`f32)(A => fun(m`.`o`.`f32)(B =>
      A |> mapPar(fun(rowOfA =>
        B |> mapSeq(fun(rowOfB =>
          zip(rowOfA)(rowOfB) |>
            map(fun(x => fst(x) * snd(x))) |>
              reduceSeq(add)(l(0.0f)) )) )) )) )

code generation

Optimized C / OpenMP / OpenCL / CUDA Program

#include <stdint.h>
void foo(float* output, int n, int m, int o, float* A, float* B){
  #pragma omp parallel for
  for (int i = 0; i < n; i = 1 + i) {
    for (int j = 0; j < m; j = 1 + j) {
        float acc;
        acc = 0.0f;
        for (int k = 0; k < o; k = 1 + k) {
          acc = acc + A[k + i * o] * B[k + j * o]; }
        output[j + i * m] = acc; } }
}
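
The indexing in the kernel shows that each output element is the dot product of row `i` of `A` with row `j` of `B`, so the result is an `n` by `m` matrix. As a quick check, the sketch below repeats the kernel and calls it on a small hand-picked input. The helper `run_small_example` and its input values are assumptions made for this illustration and are not produced by Shine.

```c
#include <assert.h>

/* Kernel repeated from the generated code so that this sketch is
   self-contained. Without OpenMP enabled, the pragma is ignored and
   the loop runs sequentially. */
void foo(float* output, int n, int m, int o, float* A, float* B){
  #pragma omp parallel for
  for (int i = 0; i < n; i = 1 + i) {
    for (int j = 0; j < m; j = 1 + j) {
      float acc;
      acc = 0.0f;
      for (int k = 0; k < o; k = 1 + k) {
        acc = acc + A[k + i * o] * B[k + j * o]; }
      output[j + i * m] = acc; } }
}

/* Hypothetical helper: runs the kernel with n = 2, m = 2, o = 3 and
   returns the output element at (row, col). */
float run_small_example(int row, int col) {
  assert(row >= 0 && row < 2 && col >= 0 && col < 2);
  float A[2 * 3] = { 1.0f, 2.0f, 3.0f,
                     4.0f, 5.0f, 6.0f };
  float B[2 * 3] = { 1.0f, 0.0f, 1.0f,
                     0.0f, 1.0f, 0.0f };
  float out[2 * 2];
  foo(out, 2, 2, 3, A, B);
  return out[col + row * 2];
}
```

For this input, element (0, 0) is 1·1 + 2·0 + 3·1 = 4 and element (1, 1) is 4·0 + 5·1 + 6·0 = 5. Both operands are stored row-major with the shared dimension `o` innermost, which matches the `n.o.f32` and `m.o.f32` array types of the RISE program.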

RISE in MLIR

We are implementing RISE as a dialect in MLIR. We argue that composing simple, reusable patterns, instead of relying on inflexible monolithic kernels, will make it easier to explore novel optimizations for machine learning workloads.

https://rise-lang.org/mlir
https://github.com/rise-lang/mlir/
