Well-crafted Kubernetes Operators pack a lot of power and help run and manage stateful applications on Kubernetes. We saw earlier how to install Airflow on Kubernetes using Helm charts. While Helm charts help you get started fast, they may not be suitable for day-2 operations like:
- Upgrades
- Backup & restore
- Auto recovery
- Automatic/On-demand scalability
- Configuration management
- Deep insights
Let’s see how to install Airflow on Kubernetes using the Airflow Operator.
# clone the Airflow Operator repo and register its CRDs
git clone https://github.com/GoogleCloudPlatform/airflow-operator
cd airflow-operator
kubectl apply -f config/crds
kubectl apply -f hack/appcrd.yaml
# First we need to build the docker image for the controller
# Set this to the name of the docker registry and image you want to use
export IMG=hiprabhat/airflow-controller:latest
# Build and push
docker build . -t $IMG
docker push ${IMG}
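Before pushing, it can help to sanity-check that `IMG` has the `registry/name:tag` shape Docker expects. This small guard is a hypothetical addition, not part of the operator repo:

```shell
# Hypothetical guard: fail fast if IMG does not look like registry/name:tag
IMG=hiprabhat/airflow-controller:latest
case "$IMG" in
  */*:*) img_ok=yes ;;
  *)     img_ok=no ;;
esac
echo "IMG format ok: $img_ok"
```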
Update the image in the controller’s Deployment manifest so it points at the image you just pushed.
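The update itself can be scripted with `sed`. The exact manifest path inside the operator checkout is an assumption here (check your copy of the repo), so the sketch below demonstrates the substitution on a scratch file:

```shell
# Demo of the image substitution on a scratch manifest; in the repo you
# would run the same sed against the controller's Deployment manifest.
IMG=hiprabhat/airflow-controller:latest
MANIFEST=$(mktemp)
printf 'containers:\n- name: manager\n  image: PLACEHOLDER\n' > "$MANIFEST"
sed -i.bak "s|image: .*|image: ${IMG}|" "$MANIFEST"
grep "image: ${IMG}" "$MANIFEST"
```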
# deploy base components first
kubectl apply -f hack/sample/mysql-celery/base.yaml
You can specify the source of DAGs in the hack/sample/mysql-celery/cluster.yaml file.
dags:
  subdir: "airflow/example_dags/"
  git:
    repo: "https://github.com/apache/incubator-airflow/"
    # setting once to false allows the DAGs to be refreshed every 5 minutes
    once: false
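For orientation, the `dags` section sits under the cluster’s `spec`. A rough sketch of the surrounding file is below; the `apiVersion`, `kind`, and `executor` values are assumptions based on the operator’s sample naming (the UI pod later in this post is `mc-cluster-airflowui-0`, hence the `mc-cluster` name), so verify them against your copy of `hack/sample/mysql-celery/cluster.yaml` before applying:

```yaml
# Sketch only -- fields other than `dags` are assumptions; check the
# sample file shipped with the operator before applying.
apiVersion: airflow.k8s.io/v1alpha1
kind: AirflowCluster
metadata:
  name: mc-cluster
spec:
  executor: celery
  dags:
    subdir: "airflow/example_dags/"
    git:
      repo: "https://github.com/apache/incubator-airflow/"
      once: false
```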
Now it’s time to deploy the Airflow components.
# after 30-60s deploy cluster components
# using celery + git as DAG source
kubectl apply -f hack/sample/mysql-celery/cluster.yaml
# port forward to access the UI
kubectl port-forward mc-cluster-airflowui-0 8080:8080
![Airflow DAG page](/images/blog/airflow-using-operator.png)
To set up authentication, follow the steps in the earlier blog post.