Concepts

What is LETSQL?

LETSQL is a query engine with a pythonic dataframe interface, built on top of DataFusion that can be used to write multi-engine workflows.

What is not LETSQL?

Is not a dataframe library, while it provides a familiar pythonic dataframe interface, LETSQL is equipped with a query optimizer and can provide in-situ and federated query processing.

Why LETSQL?

By using LETSQL, you will:

  • Reduce errors thanks to a better Pythonic UX.
  • Accelerate the development process by lowering the cognitive burden induced by using multiple interacting data systems.
  • Gain in security by providing in-situ processing (the data does not move).
  • Improve performance by avoiding data transfer and redundant operations.
  • Reduce costs by easily swapping to the cheapest tool available.

What is Multi-Engine?

What makes LETSQL stand-out against other Ibis backends, is that it can be use to build multi-engine workflows, by multi-engine it means that it can an Ibis expression involving multiple backends in an optimal manner, segmenting the expression and executing each part in-situ on the corresponding backend.

For the following example we are going to use an Ibis table from a Postgres connection and perform a join with an in-memory pandas DataFrame.

import pathlib

import ibis
import pandas as pd

import letsql as ls

ibis.options.interactive = True

# create the letsql connection
con = ls.connect()


# create the Ibis Postgres connection
pg = ibis.postgres.connect(
    host="localhost",
    port=5432,
    user="postgres",
    password="postgres",
    database="ibis_testing",
)


# register Postgres table
batting = con.register(pg.table("batting"), table_name="batting")

# register csv file
df = pd.read_csv("https://raw.githubusercontent.com/ibis-project/testing-data/master/csv/awards_players.csv")
awards_players = con.register(df, table_name="awards_players")

left = batting[batting.yearID == 2015]
right = awards_players[awards_players.lgID == "NL"].drop("yearID", "lgID").execute()

left.join(right, ["playerID"], how="semi")
/home/runner/work/letsql/letsql/.venv/lib/python3.12/site-packages/ibis/expr/types/relations.py:685: FutureWarning: Selecting/filtering arbitrary expressions in `Table.__getitem__` is deprecated and will be removed in version 10.0. Please use `Table.select` or `Table.filter` instead.
  warnings.warn(
┏━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┓
┃ playerID   yearID  stint  teamID  lgID    G      AB     R      H      X2B    X3B    HR     RBI    SB     CS     BB     SO     IBB    HBP    SH     SF     GIDP  ┃
┡━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━┩
│ stringint64int64stringstringint64int64int64int64int64int64int64int64int64int64int64int64int64int64int64int64int64 │
├───────────┼────────┼───────┼────────┼────────┼───────┼───────┼───────┼───────┼───────┼───────┼───────┼───────┼───────┼───────┼───────┼───────┼───────┼───────┼───────┼───────┼───────┤
│ alvarpe0120151PIT   NL    150437601061802777204813192046 │
│ beltrad0120151TEX   AL    143567831633241883104165430818 │
│ cabreme0120151CHA   AL    1586297017236212773040882221018 │
│ cabremi0120151DET   AL    1194296414528118761177821530219 │
│ crawfbr0120151SFN   NL    14350765130334218464391199110418 │
│ desmoia0120151WAS   NL    1565836913627219621354518703649 │
│ dickera0120151TOR   AL    333000000000100100 │
│ fieldpr0120151TEX   AL    15861378187280239800648814110521 │
│ gallayo0120151TEX   AL    334021000000100000 │
│ hudsoda0120151ARI   NL    641000000000100000 │
│  │
└───────────┴────────┴───────┴────────┴────────┴───────┴───────┴───────┴───────┴───────┴───────┴───────┴───────┴───────┴───────┴───────┴───────┴───────┴───────┴───────┴───────┴───────┘