Practice 6 ‐ Code Generation

Ármin Zavada edited this page Oct 18, 2024 · 5 revisions

The goal of this laboratory session is to gain practical experience with code generators.

Supplementary Materials

Running example

In this lab session, we will use the same running example as in the previous ones: the DFD specification of the Document Similarity Estimation algorithm, shown below.

Data-flow Diagram of the Document Similarity Estimation

A DFD (Workflow) consists of the following elements:

  • Workers transform inputs to outputs. For example, the Tokenizer worker transforms Strings to Lists of Strings. Note that a Worker type may have multiple instances in a diagram; for example, there are two Tokenizer node instances in the process.
  • Each node may have one or more unique input pins, which consume input values of a specific type. For example, the Scalar Product node has two input pins "1" and "2", each accepting Vectors.
  • The input pins and the outputs are connected by dedicated channels, which forward the output of a node to the input of another node. The output of a node can be used by multiple input pins; in this case, each input pin receives the output. For example, the shingles of a document are processed by two different Scalar Product nodes.
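The three element kinds above can be sketched as a small in-memory model. This is only an illustration in Python, not the lab's Java workflow library: the class names, the instance names (`tokenizer1`, `product1`, ...) and the simplified wiring are assumptions and do not reproduce the actual diagram.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Worker:
    """One node instance; several instances may share the same worker type."""

    name: str                    # instance name (hypothetical naming scheme)
    type: str                    # worker type, e.g. "Tokenizer"
    input_pins: tuple[str, ...]  # unique input pin names of this node


@dataclass(frozen=True)
class Channel:
    """Forwards the output of `source` to the input pin `pin` of `target`."""

    source: str
    target: str
    pin: str


@dataclass
class Workflow:
    workers: dict[str, Worker] = field(default_factory=dict)
    channels: list[Channel] = field(default_factory=list)

    def add_worker(self, name: str, type: str, input_pins=("input",)) -> Worker:
        worker = Worker(name, type, tuple(input_pins))
        self.workers[name] = worker
        return worker

    def connect(self, source: str, target: str, pin: str) -> None:
        # A channel must start at a known worker and end at one of the target's pins.
        if source not in self.workers:
            raise ValueError(f"unknown source worker: {source}")
        if target not in self.workers or pin not in self.workers[target].input_pins:
            raise ValueError(f"unknown input pin: {target}.{pin}")
        self.channels.append(Channel(source, target, pin))

    def consumers_of(self, source: str) -> list[Channel]:
        """All channels fed by one worker's output (fan-out)."""
        return [c for c in self.channels if c.source == source]


def build_example() -> Workflow:
    """Simplified, made-up wiring that only demonstrates the three concepts."""
    wf = Workflow()
    # Two instances of the same worker type.
    wf.add_worker("tokenizer1", "Tokenizer")
    wf.add_worker("tokenizer2", "Tokenizer")
    # Nodes with two input pins named "1" and "2".
    wf.add_worker("product1", "ScalarProduct", ("1", "2"))
    wf.add_worker("product2", "ScalarProduct", ("1", "2"))
    # Fan-out: one output feeds input pins on two different nodes.
    wf.connect("tokenizer1", "product1", "1")
    wf.connect("tokenizer1", "product2", "1")
    wf.connect("tokenizer2", "product1", "2")
    return wf
```

Calling `build_example().consumers_of("tokenizer1")` returns two channels, one per consuming node, which is the fan-out case described in the last bullet.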

In this lab session, we will create a code generator capable of constructing a DFD Workflow Java implementation from the Workflow model.

Tasks

Task 0 - Preparations

Pull practice 6 materials. You don't need to push your changes in this lab since we are not focusing on CI/CD.

git clone https://github.com/ftsrg-edu/ase-labs.git
cd ase-labs
git switch practice-6

Task 1 - Why generate code?

The build task fails since unit tests in the Similarity module are failing. The problem is somewhere in the SimilarityWorkflow class. Find and fix the copy-paste error!

Such mistakes are prevalent in code full of important but repetitive patterns. This motivates generating some or all of such code instead of writing it by hand. What kinds of code generators do you know? List a few!

Code generation also comes with other benefits: the domain model is simpler to create, it can be validated and optimized, and multiple generators can be implemented for the same model.

Task 2 - Simple model generation

The code generation infrastructure has already been prepared for you in the Similarity module.

Code generation infrastructure

Code generation usually requires a lot of infrastructure for parsing the domain model, generating code, and integrating these steps into the build. In this lab session, we will use Jinja2, a template engine for Python (see supplementary materials).

The build.gradle.kts file contains a new generateSimilarityWorkflow task that executes the src/main/python/generate.py Python script. It also depends on another task that prepares the Python environment for execution.

The generate.py script uses Jinja2 to generate the files. It first loads the model file, adds some additional computed values to it, and then forwards it to the Jinja2 template engine. The Python script takes 3 CLI parameters: the model.json file path, the template path, and the output file path.
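The flow described above (load the model, add computed values, render the template, write the output) might look roughly like the sketch below. It is not the lab's actual generate.py: the function names and the `workerCount` computed value are made up for illustration, and only the three positional CLI parameters come from the description.

```python
import argparse
import json
from pathlib import Path


def parse_args(argv=None):
    """Parse the three positional CLI parameters."""
    parser = argparse.ArgumentParser(
        description="Render a workflow model with a Jinja2 template."
    )
    parser.add_argument("model", type=Path, help="path to model.json")
    parser.add_argument("template", type=Path, help="path to the .j2 template")
    parser.add_argument("output", type=Path, help="path of the generated file")
    return parser.parse_args(argv)


def load_model(path: Path) -> dict:
    """Read the JSON model file into a dictionary."""
    return json.loads(path.read_text(encoding="utf-8"))


def enrich(model: dict) -> dict:
    """Add computed values; `workerCount` is a hypothetical example."""
    enriched = dict(model)
    enriched["workerCount"] = len(model.get("workers", []))
    return enriched


def render(template_path: Path, model: dict) -> str:
    """Render the template with the model's top-level keys as variables."""
    # Imported here so the helpers above stay usable without Jinja2 installed.
    from jinja2 import Environment, FileSystemLoader

    env = Environment(loader=FileSystemLoader(str(template_path.parent)))
    return env.get_template(template_path.name).render(**model)


def main(argv=None) -> None:
    args = parse_args(argv)
    model = enrich(load_model(args.model))
    args.output.parent.mkdir(parents=True, exist_ok=True)
    args.output.write_text(render(args.template, model), encoding="utf-8")
```

A script built this way would end with a call to `main()`. Passing the model's keys as top-level template variables is what lets a template refer to `{{ name }}` directly; whether the real script does the same is an assumption.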

In this task, you only have to write the Jinja2 template file: src/main/jinja/workflow.java.j2.

Domain model

The Domain model is defined as a Langium grammar. The grammar and the Similarity workflow model can be found here: Langium Workflow DSL

Workflow template

We will use the existing SimilarityWorkflow class to write the workflow template. The first step is to identify the repeating patterns. This has already been documented in the SimilarityWorkflow.java file.

  • Write the common lines into the template file:
     package hu.bme.mit.ase.shingler.similarity;
     
     import hu.bme.mit.ase.shingler.workflow.impl.*;
     import hu.bme.mit.ase.shingler.workflow.lib.*;
     
     public class NAMEWorkflow extends Workflow<OUTPINTYPE> {
     
         // Input pin declarations
         // ...
     
         // Parameter declarations
         // ...
     
         @Override
         protected void initialize() {
             // Worker declarations
             // ...
     
             // Adding all workers
             // ...
     
             // Set output pin
             // ...
     
             // Input pin channel declarations
             // ...
     
             // Channel declarations
             // ...
     
             // Add input pin channels
             // ...
     
             // Add channels
             // ...
         }
     
     }
  • Add Workflow name: public class {{ name }}Workflow
  • Add Workflow type: extends Workflow<{{ outPin.type }}>
  • Add input pin declarations
    // Input pin declarations
    {%- for in_pin in inPins %}
    public final Pin<{{ in_pin.type }}> {{ in_pin.name }} = new Pin<>();
    {%- endfor %}
  • Add output pin setting: setOutputPin({{ outPin.worker }}.outputPin);
  • Add parameter related code:
    // Parameter declarations
    {%- for param in parameters %}
    private final {{ param.type }} {{ param.name }};
    {%- endfor %}
    
    public {{ name }}Workflow({% for param in parameters %}{{ param.type }} {{ param.name }}{% if not loop.last %}, {% endif %}{% endfor %}) {
        {%- for param in parameters %}
        this.{{ param.name }} = {{ param.name }};
        {%- endfor %}
    }
  • Add Worker declarations and additions
    {%- for worker in workers %}
    var {{ worker.name }} = new {{ worker.type }}Worker(
            {%- if worker.arguments -%}
            {%- for argument in worker.arguments -%}
            {{ argument }}
            {%- if not loop.last -%}, {% endif -%}
            {%- endfor -%}
            {%- endif -%}
    );
    {%- endfor %}
    {% for worker in workers %}
    addWorker({{ worker.name }});
    {%- endfor %}
  • Add Channel related code
    {% for in_pin in inPins %}
    var input{{ in_pin.name | capitalize }} = new Channel<>({{ in_pin.name }}, {{ in_pin.worker }}.{{ in_pin.pin }}Pin);
    {%- endfor %}
    {% for in_pin in inPins %}
    addChannel(input{{ in_pin.name | capitalize }});
    {%- endfor %}
    {% for channel in channels %}
    var {{ channel.name }} = new Channel<>({{ channel.fromWorker }}.outputPin, {{ channel.toWorker }}.{{ channel.toPin }}Pin);
    {%- endfor %}
    {% for channel in channels %}
    addChannel({{ channel.name }});
    {%- endfor %}
  • Run Gradle build
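Two idioms from the template steps above are worth checking by hand before running the build. The sketch below mirrors them in plain Python, without Jinja2: the `capitalize` filter behaves like Python's `str.capitalize()`, which lower-cases everything after the first character, and the `loop.last` separator pattern is equivalent to joining with `", "`. The function names and the sample pin and parameter names are hypothetical.

```python
def channel_variable(pin_name: str) -> str:
    """Mirror `input{{ in_pin.name | capitalize }}` from the template."""
    # Jinja2's `capitalize` filter behaves like str.capitalize():
    # first character upper-cased, every other character lower-cased.
    return "input" + pin_name.capitalize()


def upper_first(text: str) -> str:
    """Upper-case only the first character, leaving the rest untouched."""
    return text[:1].upper() + text[1:]


def constructor_signature(name: str, parameters: list) -> str:
    """Mirror the `{% if not loop.last %}, {% endif %}` idiom with str.join."""
    params = ", ".join(f"{p['type']} {p['name']}" for p in parameters)
    return f"public {name}Workflow({params})"
```

For a pin named `document1`, the generated variable is `inputDocument1`. A camelCase pin name such as `firstDocument` would become `inputFirstdocument`, so if the model ever uses such names, a helper like `upper_first` (exposed to the template as a custom filter or a precomputed value) may be the safer choice.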

Task 3 - Vary the Domain model

Optional Tasks

  • Implement the Diversity workflow using the created code generator