Apache Flink Kubernetes Operator: Version 1.1.0 offers native Kubernetes events
Version 1.1.0 of the operator can now emit native Kubernetes events for status changes and other relevant changes to Flink deployments and jobs.
It took the Apache Software Foundation (ASF) development team less than two months to deliver version 1.1.0 of the Apache Flink Kubernetes Operator, which brings a whole range of new features. The developers focused primarily on improving the experience of managing Flink resources, and of running the operator itself, in production environments.
Flink deployments with Kubernetes tools
At the beginning of June, the Apache team released version 1.0, the first major release of the Flink Kubernetes Operator. That release provided the core functions for the automated management of Flink deployments. In a blog post, the developers have now announced version 1.1.0 of the software, which enables lifecycle management for Apache Flink deployments with native Kubernetes tools.
In particular, the development team emphasizes that the operator can now emit native Kubernetes events for relevant changes to Flink deployments and jobs. These include state changes, changes to the custom resource specification, and deployment failures.
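Because these are ordinary Kubernetes events, they can be inspected with standard tooling. The following Python sketch shows one way to pick the Flink-related entries out of an event list. It assumes the events come from the JSON output of `kubectl get events -o json`, use the core Event fields `involvedObject`, `type`, `reason` and `message`, and refer to the operator's `FlinkDeployment` custom resource. The function name `flink_events` and the sample data, including the reason strings, are made up for illustration and are not taken from the operator.

```python
from typing import Any, Dict, List


def flink_events(events: List[Dict[str, Any]],
                 kind: str = "FlinkDeployment") -> List[Dict[str, str]]:
    """Summarize the events whose involved object has the given kind."""
    summaries = []
    for event in events:
        involved = event.get("involvedObject", {})
        if involved.get("kind") != kind:
            continue  # skip events that belong to other resource kinds
        summaries.append({
            "resource": involved.get("name", ""),
            "type": event.get("type", ""),
            "reason": event.get("reason", ""),
            "message": event.get("message", ""),
        })
    return summaries


# Hypothetical sample, shaped like the "items" list of `kubectl get events -o json`.
# The names, reasons and messages are invented for this example.
SAMPLE_EVENTS = [
    {
        "involvedObject": {"kind": "FlinkDeployment", "name": "example-job"},
        "type": "Normal",
        "reason": "ExampleSpecChange",
        "message": "Specification changed",
    },
    {
        "involvedObject": {"kind": "Pod", "name": "unrelated-pod"},
        "type": "Warning",
        "reason": "BackOff",
        "message": "Back-off restarting failed container",
    },
]

if __name__ == "__main__":
    for summary in flink_events(SAMPLE_EVENTS):
        print(summary["resource"], summary["type"], summary["reason"], summary["message"])
```

In practice, the input would be the `items` list parsed from the `kubectl` output with `json.load`; the filter itself depends only on the event structure.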
New metrics for a comprehensive overview
The previous version of the operator was limited to basic system-level metrics for monitoring the JVM process. In version 1.1.0, the developers have introduced an extended set of additional metrics. These cover lifecycle management, access to the Kubernetes API server, and the Java Operator SDK framework on which the operator itself is built. With these metrics, operator administrators should be able to get a comprehensive overview of what is happening in their environment. An extensive list on the Apache website describes the supported metrics in detail.
The development team has also introduced a number of key enhancements that help the operator better tolerate its own outages as well as transient Kubernetes API outages, behavior that can be critical in production environments. To achieve this, the team overhauled and streamlined the core reconciliation flow, which is responsible for executing and tracking resource upgrades, savepoints, rollbacks, and other operations.
Update to version 1.1.0
According to the development team, version 1.1.0 is backwards compatible as long as users follow the update instructions in the “Operator Upgrade Guide”. The upgrade should not have any impact on Flink resources currently in use. The source code is available on the downloads page of the Flink website. There is also a quick start guide for those who want to try out the new features in version 1.1.0. Official Docker images of the new operator version are available on Docker Hub.