Deploying DataSQRL Projects

To deploy your DataSQRL project, the first step is to compile the deployment artifacts:

docker run --rm -v $PWD:/build datasqrl/cmd compile myscript.sqrl myapischema.graphqls

The compiler populates the build/ directory with all the artifacts needed to build the data pipeline. Inside the build directory, the deploy/ directory contains the deployment artifacts for each engine configured in the package configuration. If no package configuration is provided, DataSQRL uses the default engines.
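For reference, a package configuration that selects engines explicitly might look roughly like the following sketch. The engine names and the exact structure are illustrative assumptions, not the definitive schema; consult the package configuration documentation for the keys supported by your DataSQRL version.

```json
{
  "engines": {
    "flink": {},
    "postgres": {},
    "server": {}
  }
}
```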

You can deploy DataSQRL projects either with Docker or by deploying each engine separately. Docker is the easiest deployment option. Deploying each engine separately gives you more flexibility and lets you deploy on existing infrastructure or use managed cloud services.

Docker

To deploy a SQRL script and API specification with Docker, run docker compose up in the build/deploy folder:

(cd build/deploy; docker compose up)

Docker Compose uses the docker-compose.yml template in the deploy folder, which you can modify to your needs.
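As an illustration of what such a template typically contains, here is a minimal sketch assuming the default engines (a database, a Flink job, and an API server). The service names, images, and ports are assumptions for illustration only; the generated docker-compose.yml in your deploy folder will differ.

```yaml
# Illustrative sketch only -- not the generated template.
services:
  database:            # assumed Postgres database engine
    image: postgres:14
    environment:
      POSTGRES_PASSWORD: postgres
    ports:
      - "5432:5432"
  flink:               # assumed Flink engine running the compiled streaming job
    image: flink:1.16
    depends_on:
      - database
  server:              # assumed API server exposing the GraphQL endpoint
    depends_on:
      - database
    ports:
      - "8888:8888"
```

Because docker compose down -v removes the named volumes declared in this file, any state the services keep in those volumes is discarded on shutdown.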

info

To stop the pipeline, interrupt it with CTRL-C and shut it down with:

docker compose down -v

Removing the containers and volumes with this command before launching another data pipeline ensures that the updated containers are used.

Individually

Deploying each component of the data pipeline independently gives you complete control over how and where your data pipeline is deployed.

In this deployment mode, DataSQRL compiles deployment artifacts for each engine configured in the engines section of the package configuration, which you can then deploy in a way that works for your infrastructure and deployment requirements.

The deployment artifacts can be found in the build/deploy folder. How to deploy them individually depends on the engines that you are using for your data pipeline.

Check the documentation for a particular engine for more information on how to deploy the executables on existing data infrastructure (an existing Flink cluster, database cluster, etc.) or a managed service.