LETSQL is a query engine with a Pythonic dataframe interface, built on top of DataFusion, that can be used to write multi-engine workflows.
What LETSQL is not
LETSQL is not a dataframe library. Although it provides a familiar Pythonic dataframe interface, it is equipped with a query optimizer and can provide in-situ and federated query processing.
Why LETSQL?
By using LETSQL, you will:
Reduce errors thanks to a better Pythonic UX.
Accelerate the development process by lowering the cognitive burden induced by using multiple interacting data systems.
Improve security through in-situ processing (the data does not move).
Improve performance by avoiding data transfer and redundant operations.
Reduce costs by easily swapping to the cheapest tool available.
What is Multi-Engine?
What makes LETSQL stand out from other Ibis backends is that it can be used to build multi-engine workflows. Multi-engine means that it can execute an Ibis expression involving multiple backends in an optimal manner, segmenting the expression and executing each part in-situ on the corresponding backend.
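To make the idea of segmenting an expression concrete, here is a minimal toy sketch in plain Python. It does not use LETSQL or Ibis and is not how LETSQL is implemented; the `Table`, `Filter`, `Join`, `backend_of`, and `segment` names, as well as the table names and predicates, are hypothetical and exist only for illustration. It shows how an expression tree whose leaves live on different backends could be split into the largest subtrees that each run entirely on one backend.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Table:
    name: str
    backend: str  # engine that holds the data, e.g. "postgres" or "pandas"


@dataclass(frozen=True)
class Filter:
    child: "Node"
    predicate: str


@dataclass(frozen=True)
class Join:
    left: "Node"
    right: "Node"
    on: str


Node = Union[Table, Filter, Join]


def backend_of(node):
    """Return the single backend a subtree lives on, or None if it spans several."""
    if isinstance(node, Table):
        return node.backend
    if isinstance(node, Filter):
        return backend_of(node.child)
    left, right = backend_of(node.left), backend_of(node.right)
    return left if left is not None and left == right else None


def segment(node):
    """Split an expression into maximal subtrees that can each run on one backend."""
    backend = backend_of(node)
    if backend is not None:
        return [(backend, node)]
    if isinstance(node, Filter):
        return segment(node.child)
    return segment(node.left) + segment(node.right)


# Hypothetical expression: a filtered Postgres table joined with a filtered
# in-memory table. Only the join itself spans two backends.
batting = Filter(Table("batting", "postgres"), "year = 2015")
awards = Filter(Table("awards", "pandas"), "league = 'NL'")
expr = Join(batting, awards, on="player_id")

for backend, part in segment(expr):
    print(backend, "->", type(part).__name__)
```

In this toy model each filter stays on the backend that holds its table, and only the cross-backend join is left for a coordinating engine to perform. A real optimizer would also weigh data sizes and transfer costs, which this sketch ignores.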
For the following example we are going to use an Ibis table from a Postgres connection and perform a join with an in-memory pandas DataFrame.