Concepts

What is LETSQL?

LETSQL is a query engine with a pythonic dataframe interface, built on top of DataFusion that can be used to write multi-engine workflows.

What is not LETSQL?

Is not a dataframe library, while it provides a familiar pythonic dataframe interface, LETSQL is equipped with a query optimizer and can provide in-situ and federated query processing.

Why LETSQL?

By using LETSQL, you will:

  • Reduce errors thanks to a better Pythonic UX.
  • Accelerate the development process by lowering the cognitive burden induced by using multiple interacting data systems.
  • Gain in security by providing in-situ processing (the data does not move).
  • Improve performance by avoiding data transfer and redundant operations.
  • Reduce costs by easily swapping to the cheapest tool available.

What is Multi-Engine?

What makes LETSQL stand-out against other Ibis backends, is that it can be use to build multi-engine workflows, by multi-engine it means that it can an Ibis expression involving multiple backends in an optimal manner, segmenting the expression and executing each part in-situ on the corresponding backend.

For the following example we are going to use an Ibis table from a Postgres connection and perform a join with an in-memory pandas DataFrame.

import pathlib

import ibis
import pandas as pd

import letsql as ls

ibis.options.interactive = True

# create the letsql connection
con = ls.connect()


# create the Ibis Postgres connection
pg = ibis.postgres.connect(
    host="localhost",
    port=5432,
    user="postgres",
    password="postgres",
    database="ibis_testing",
)


# register Postgres table
batting = con.register(pg.table("batting"), table_name="batting")

# register csv file
df = pd.read_csv("https://raw.githubusercontent.com/ibis-project/testing-data/master/csv/awards_players.csv")
awards_players = con.register(df, table_name="awards_players")

left = batting[batting.yearID == 2015]
right = awards_players[awards_players.lgID == "NL"].drop("yearID", "lgID").execute()

left.join(right, ["playerID"], how="semi")
/home/runner/work/letsql/letsql/.venv/lib/python3.12/site-packages/ibis/expr/types/relations.py:685: FutureWarning: Selecting/filtering arbitrary expressions in `Table.__getitem__` is deprecated and will be removed in version 10.0. Please use `Table.select` or `Table.filter` instead.
  warnings.warn(
┏━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━┓
┃ playerID   yearID  stint  teamID  lgID    G      AB     R      H      X2B    X3B    HR     RBI    SB     CS     BB     SO     IBB    HBP    SH     SF     GIDP  ┃
┡━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━┩
│ stringint64int64stringstringint64int64int64int64int64int64int64int64int64int64int64int64int64int64int64int64int64 │
├───────────┼────────┼───────┼────────┼────────┼───────┼───────┼───────┼───────┼───────┼───────┼───────┼───────┼───────┼───────┼───────┼───────┼───────┼───────┼───────┼───────┼───────┤
│ braunry0220151MIL   NL    14050687144273258424454115440320 │
│ buehrma0120151TOR   AL    327010000001300100 │
│ carpema0120151SLN   NL    1545741011564432884438115156045 │
│ cuddymi0120151NYN   NL    11737944981811041202488040113 │
│ degroja0120151NYN   NL    315931110040031600404 │
│ ethiean0120151LAN   NL    142395541162071453234375240311 │
│ goldspa0120151ARI   NL    159567103182382331102151181512920716 │
│ gordode0120151MIA   NL    145615882052484465820259122656 │
│ greinza0120151LAN   NL    326781520231011400810 │
│ harvema0120151NYN   NL    29651720170003100102 │
│  │
└───────────┴────────┴───────┴────────┴────────┴───────┴───────┴───────┴───────┴───────┴───────┴───────┴───────┴───────┴───────┴───────┴───────┴───────┴───────┴───────┴───────┴───────┘