
Building DataSQRL Projects

This page provides an overview of how to compile, run, and debug SQRL scripts and projects using the DataSQRL command.
The command documentation has a complete reference of the DataSQRL command and all options.


To compile a SQRL script, invoke the DataSQRL compile command:

docker run --rm -v $PWD:/build datasqrl/cmd compile myscript.sqrl myapischema.graphqls

The compile command takes the SQRL script to compile as the first argument and an API specification as an optional second argument. If no API specification is provided, DataSQRL generates one. See API Design for more details.
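Because the second argument is optional, the same invocation covers both cases. The following is a minimal sketch that wraps the docker command from above in a shell function; the function name `sqrl_compile` is a hypothetical helper for illustration and is not part of DataSQRL.

```shell
# Hypothetical convenience wrapper (not provided by DataSQRL).
# It forwards all arguments to the compile command shown above.
sqrl_compile() {
  docker run --rm -v "$PWD:/build" datasqrl/cmd compile "$@"
}

# Script plus an explicit API specification:
#   sqrl_compile myscript.sqrl myapischema.graphqls
# Script only, so that DataSQRL generates the API specification:
#   sqrl_compile myscript.sqrl
```

Quoting `"$PWD:/build"` keeps the volume mount intact when the working directory contains spaces.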

The command compiles the script and API specification into an integrated data product. It creates a build directory that holds all the artifacts used during the compilation and build process (e.g. dependencies) and writes the deployment artifacts for the compiled data product into the build/deploy directory. Read more about deployment artifacts in the deployment documentation.

DataSQRL supports multiple engines and data pipeline architectures. That means you can configure the architecture of the targeted data pipeline and which systems execute the individual components of the compiled data pipeline.

[Figure: DataSQRL data pipeline architecture]

The figure shows a data pipeline architecture that consists of Apache Kafka, Apache Flink, a database engine, and an API server. Kafka holds the input and streaming data. Flink ingests the data, processes it, and writes the results to the database. The API server translates incoming requests into database queries and assembles the response from the returned query results.

The data pipeline architecture and engines are configured in the package configuration. The DataSQRL command looks for a package.json configuration file in the directory where it is executed. Alternatively, the package configuration file can be provided as an argument via the -c option. Check out the command line reference for all command line options.
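As a rough sketch of passing the package configuration explicitly, the wrapper below places the -c option directly after `compile`. Both the option position and the file name `mypackage.json` are assumptions made for illustration, and `sqrl_compile_with_config` is a hypothetical helper; consult the command line reference for the authoritative syntax.

```shell
# Hypothetical helper (not provided by DataSQRL).
# First argument: package configuration file; remaining arguments: script and optional API spec.
# Assumption: -c is accepted right after "compile".
sqrl_compile_with_config() {
  docker run --rm -v "$PWD:/build" datasqrl/cmd compile -c "$@"
}

# Example (file names are placeholders):
#   sqrl_compile_with_config mypackage.json myscript.sqrl myapischema.graphqls
```

Only files under the mounted directory are visible inside the container, so the configuration file needs to live in or below the directory where the command is executed.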

If no package configuration file is provided or found, DataSQRL generates a default package configuration with the example data pipeline architecture shown above and the following engines:

The package configuration contains additional compiler options and declares the dependencies of a script. Read more about the package configuration to learn how to configure your build.


To run the pipeline that DataSQRL compiles from your SQRL script and API specification, execute:

(cd build/deploy; docker compose up)

This command runs Docker Compose with the template that the DataSQRL compiler generated in the build/deploy directory. It starts all the engines and deploys the deployment artifacts of the compiled data pipeline to them, which runs the entire data pipeline.

The API server is exposed at localhost:8888/.
You can now access the API and execute queries against it to test your script and the compiled data pipeline. See the API access documentation for more details.
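One way to exercise the running API from the command line is to send a GraphQL request with curl. The sketch below assumes the GraphQL endpoint is served under the path `/graphql`, which is not confirmed on this page; the function name `query_api` is likewise a hypothetical helper. The example request uses a standard GraphQL introspection query, so it does not depend on the tables defined in your script.

```shell
# Hypothetical helper (not provided by DataSQRL).
# Assumption: the API server accepts GraphQL requests at /graphql on port 8888.
# Argument: a JSON request body containing a "query" field.
query_api() {
  curl -s -X POST "http://localhost:8888/graphql" \
    -H "Content-Type: application/json" \
    -d "$1"
}

# Example: ask the server for the name of its root query type.
#   query_api '{"query": "{ __schema { queryType { name } } }"}'
```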

Press CTRL-C to stop the running data pipeline. This stops all engines gracefully.


To debug a SQRL script, compile it with the -d flag:

docker run --rm -v $PWD:/build datasqrl/cmd compile -d myscript.sqrl myapischema.graphqls

and then run it with:

(cd build/deploy; docker compose up)

In debug mode, the stream engine exports a change log of each computed table to the configured debug sink, which is the print sink by default. This allows you to see what is being computed at each step in your script.
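If the pipeline runs in the background or the terminal output has scrolled away, the container logs can be followed from a second terminal. This is a sketch under the assumption that the print sink output ends up in the container logs of the stream engine; `follow_pipeline_logs` is a hypothetical helper, and the service names depend on the generated Docker Compose template.

```shell
# Hypothetical helper (not provided by DataSQRL).
# Follows the logs of the services defined in build/deploy.
# Optional arguments: one or more service names to restrict the output.
follow_pipeline_logs() {
  (cd build/deploy && docker compose logs --follow "$@")
}

# Example: follow the logs of all services.
#   follow_pipeline_logs
```

Running the command in a subshell leaves the current working directory unchanged, in the same way as the run command above.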


When you have finished development and are ready to deploy your SQRL script, check out the deployment options.