The Photon query engine, designed for SQL and other languages, is now available for lakehouse data systems on the major cloud platforms.
Databricks has announced the General Availability (GA) of its query engine Photon. According to the announcement, the engine, which is designed for the Databricks Lakehouse architecture, has successfully completed the public preview phase that started in summer 2021. It is now available to all users for production use when querying data lakes on the major cloud platforms.
More areas of application – more speed
The newly developed query engine, which is compatible with Apache Spark, was originally designed to run typical SQL data warehousing queries on data lakes with high performance. Photon can now also be used with other languages such as Python, Scala, Java and R, and covers application areas in data engineering, data science and data analytics.
According to the provider, customers such as AT&T benefit from up to eight times faster queries when using the query engine with the Databricks SQL Warehouse. The shorter computing times are also reflected in lower costs: compared to the Spark-based Databricks Runtime, the savings are said to be up to 30 percent on average.
In the course of the GA release, Databricks also made further performance improvements to the query engine. For example, window functions, which perform calculations across a series of table rows for use cases such as aggregations, moving averages or deduplication, are said to run about twice as fast as in the preview phase. The vectorized sort function is also said to run faster in Photon than in Apache Spark, by up to a factor of 20.
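To make the kind of query concrete, here is a minimal sketch of functions that calculate across a series of table rows, commonly known as SQL window functions. It does not use Photon or Databricks: it runs against an in-memory SQLite database from the Python standard library (SQLite 3.25 or newer is assumed for window function support), and the `readings` table and its columns are hypothetical. It only illustrates the query shape; it says nothing about Photon's performance.

```python
import sqlite3

# Hypothetical sample data: sensor readings with one exact duplicate row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT, ts INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [
        ("a", 1, 10.0),
        ("a", 2, 20.0),
        ("a", 3, 30.0),
        ("a", 3, 30.0),  # duplicate of the previous row
        ("b", 1, 5.0),
        ("b", 2, 7.0),
    ],
)

# Deduplication: number the rows within each (sensor, ts) group
# and keep only the first row of every group.
dedup_sql = """
SELECT sensor, ts, value
FROM (
    SELECT sensor, ts, value,
           ROW_NUMBER() OVER (PARTITION BY sensor, ts ORDER BY value) AS rn
    FROM readings
) AS ranked
WHERE rn = 1
ORDER BY sensor, ts
"""
deduplicated = conn.execute(dedup_sql).fetchall()

# Moving average: per sensor, average the current and the previous reading.
moving_avg_sql = """
WITH dedup AS (
    SELECT DISTINCT sensor, ts, value FROM readings
)
SELECT sensor, ts,
       AVG(value) OVER (
           PARTITION BY sensor
           ORDER BY ts
           ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
       ) AS moving_avg
FROM dedup
ORDER BY sensor, ts
"""
moving_averages = conn.execute(moving_avg_sql).fetchall()

print(deduplicated)
print(moving_averages)
```

The same `OVER (PARTITION BY ... ORDER BY ...)` syntax is standard SQL, so queries of this shape are the ones the reported speedup refers to.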
More information about the query engine can be found in the official announcement on the Databricks blog. The recording of a talk from this year’s Data+AI Summit also provides a more comprehensive insight into Photon.