Table expressions

Tables are one of the core data structures in Ibis.

Table

Table(self, arg)

An immutable and lazy dataframe.

Analogous to a SQL table or a pandas DataFrame. A table expression contains an ordered set of named columns, each with a single known type. Unless explicitly ordered with an .order_by(), the order of rows is undefined.

Table immutability means that the data underlying an Ibis Table cannot be modified: every method on a Table returns a new Table with those changes. Laziness means that an Ibis Table expression does not run your computation every time you call one of its methods. Instead, it is a symbolic expression that represents a set of operations to be performed, which typically is translated into a SQL query. That SQL query is then executed on a backend, where the data actually lives. The result (now small enough to be manageable) can then be materialized back into python as a pandas/pyarrow/python DataFrame/Column/scalar.

You will not create Table objects directly. Instead, you will create one

See the user guide for more info.

Methods

Name Description
alias Create a table expression with a specific name alias.
as_scalar Inform ibis that the table expression should be treated as a scalar.
count Compute the number of rows in the table.
difference Compute the set difference of multiple table expressions.
distinct Return a Table with duplicate rows removed.
dropna Remove rows with null values from the table.
fillna Fill null values in a table expression.
filter Select rows from table based on predicates.
intersect Compute the set intersection of multiple table expressions.
limit Select n rows from self starting at offset.
order_by Sort a table by one or more expressions.
sample Sample a fraction of rows from a table.
select Compute a new table expression using exprs and named_exprs.
sql Run a SQL query against a table expression.
union Compute the set union of multiple table expressions.
view Create a new table expression distinct from the current one.

alias

alias(alias)

Create a table expression with a specific name alias.

This method is useful for exposing an ibis expression to the underlying backend for use in the Table.sql method.

.alias will create a temporary view

.alias creates a temporary view in the database.

This side effect will be removed in a future version of ibis and is not part of the public API.

Parameters

Name Type Description Default
alias str Name of the child expression required

Returns

Type Description
Table An table expression

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = letsql.examples.penguins.fetch()
>>> expr = t.alias("pingüinos").sql('SELECT * FROM "pingüinos" LIMIT 5')
>>> expr
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species  island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex     year  ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ stringstringfloat64float64float64float64stringint64 │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Adelie Torgersen39.118.7181.03750.0male  2007 │
│ Adelie Torgersen39.517.4186.03800.0female2007 │
│ Adelie Torgersen40.318.0195.03250.0female2007 │
│ Adelie TorgersenNULLNULLNULLNULLNULL2007 │
│ Adelie Torgersen36.719.3193.03450.0female2007 │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘

as_scalar

as_scalar()

Inform ibis that the table expression should be treated as a scalar.

Note that the table must have exactly one column and one row for this to work. If the table has more than one column an error will be raised in expression construction time. If the table has more than one row an error will be raised by the backend when the expression is executed.

Returns

Type Description
Scalar A scalar subquery

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = letsql.examples.penguins.fetch()
>>> heavy_gentoo = t.filter(t.species == "Gentoo", t.body_mass_g > 6200)
>>> from_that_island = t.filter(t.island == heavy_gentoo.select("island").as_scalar())
>>> from_that_island.species.value_counts().order_by("species")
┏━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ species  species_count ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ stringint64         │
├─────────┼───────────────┤
│ Adelie 44 │
│ Gentoo 124 │
└─────────┴───────────────┘

count

count(where=None)

Compute the number of rows in the table.

Parameters

Name Type Description Default
where ir.BooleanValue | None Optional boolean expression to filter rows when counting. None

Returns

Type Description
IntegerScalar Number of rows in the table

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"a": ["foo", "bar", "baz"]})
>>> t
┏━━━━━━━━┓
┃ a      ┃
┡━━━━━━━━┩
│ string │
├────────┤
│ foo    │
│ bar    │
│ baz    │
└────────┘
>>> t.count()

3
>>> t.count(t.a != "foo")

2
>>> type(t.count())
ibis.expr.types.numeric.IntegerScalar

difference

difference(table, *rest, distinct=True)

Compute the set difference of multiple table expressions.

The input tables must have identical schemas.

Parameters

Name Type Description Default
table Table A table expression required
*rest Table Additional table expressions ()
distinct bool Only diff distinct rows not occurring in the calling table True

See Also

ibis.difference

Returns

Type Description
Table The rows present in self that are not present in tables.

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t1 = ibis.memtable({"a": [1, 2]})
>>> t1
┏━━━━━━━┓
┃ a     ┃
┡━━━━━━━┩
│ int64 │
├───────┤
│     1 │
│     2 │
└───────┘
>>> t2 = ibis.memtable({"a": [2, 3]})
>>> t2
┏━━━━━━━┓
┃ a     ┃
┡━━━━━━━┩
│ int64 │
├───────┤
│     2 │
│     3 │
└───────┘
>>> t1.difference(t2)
┏━━━━━━━┓
┃ a     ┃
┡━━━━━━━┩
│ int64 │
├───────┤
│     1 │
└───────┘

distinct

distinct(on=None, keep='first')

Return a Table with duplicate rows removed.

Similar to pandas.DataFrame.drop_duplicates().

Some backends do not support keep='last'

Parameters

Name Type Description Default
on str | Iterable[str] | s.Selector | None Only consider certain columns for identifying duplicates. By default deduplicate all of the columns. None
keep Literal[‘first’, ‘last’] | None Determines which duplicates to keep. - "first": Drop duplicates except for the first occurrence. - "last": Drop duplicates except for the last occurrence. - None: Drop all duplicates 'first'

Examples

>>> import ibis
>>> import letsql.examples as ex
>>> import ibis.selectors as s
>>> ibis.options.interactive = True
>>> t = ex.penguins.fetch()
>>> t
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species  island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex     year  ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ stringstringfloat64float64float64float64stringint64 │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Adelie Torgersen39.118.7181.03750.0male  2007 │
│ Adelie Torgersen39.517.4186.03800.0female2007 │
│ Adelie Torgersen40.318.0195.03250.0female2007 │
│ Adelie TorgersenNULLNULLNULLNULLNULL2007 │
│ Adelie Torgersen36.719.3193.03450.0female2007 │
│ Adelie Torgersen39.320.6190.03650.0male  2007 │
│ Adelie Torgersen38.917.8181.03625.0female2007 │
│ Adelie Torgersen39.219.6195.04675.0male  2007 │
│ Adelie Torgersen34.118.1193.03475.0NULL2007 │
│ Adelie Torgersen42.020.2190.04250.0NULL2007 │
│  │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘

Compute the distinct rows of a subset of columns

>>> t[["species", "island"]].distinct().order_by(s.all())
┏━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ species    island    ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━┩
│ stringstring    │
├───────────┼───────────┤
│ Adelie   Biscoe    │
│ Adelie   Dream     │
│ Adelie   Torgersen │
│ ChinstrapDream     │
│ Gentoo   Biscoe    │
└───────────┴───────────┘

Drop all duplicate rows except the first

>>> t.distinct(on=["species", "island"], keep="first").order_by(s.all())
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species    island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex     year  ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ stringstringfloat64float64float64float64stringint64 │
├───────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Adelie   Biscoe   37.818.3174.03400.0female2007 │
│ Adelie   Dream    39.516.7178.03250.0female2007 │
│ Adelie   Torgersen39.118.7181.03750.0male  2007 │
│ ChinstrapDream    46.517.9192.03500.0female2007 │
│ Gentoo   Biscoe   46.113.2211.04500.0female2007 │
└───────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘

Drop all duplicate rows except the last

>>> t.distinct(on=["species", "island"], keep="last").order_by(s.all())
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species    island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex     year  ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ stringstringfloat64float64float64float64stringint64 │
├───────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Adelie   Biscoe   42.718.3196.04075.0male  2009 │
│ Adelie   Dream    41.518.5201.04000.0male  2009 │
│ Adelie   Torgersen43.119.2197.03500.0male  2009 │
│ ChinstrapDream    50.218.7198.03775.0female2009 │
│ Gentoo   Biscoe   49.916.1213.05400.0male  2009 │
└───────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘

Drop all duplicated rows

>>> expr = t.distinct(on=["species", "island", "year", "bill_length_mm"], keep=None)
>>> expr.count()

273
>>> t.count()

344

You can pass selectors to on

>>> t.distinct(on=~s.numeric())
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species    island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex     year  ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ stringstringfloat64float64float64float64stringint64 │
├───────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ ChinstrapDream    46.517.9192.03500.0female2007 │
│ ChinstrapDream    50.019.5196.03900.0male  2007 │
│ Adelie   Biscoe   37.818.3174.03400.0female2007 │
│ Adelie   Biscoe   37.718.7180.03600.0male  2007 │
│ Adelie   Dream    39.516.7178.03250.0female2007 │
│ Adelie   Dream    37.218.1178.03900.0male  2007 │
│ Adelie   Dream    37.518.9179.02975.0NULL2007 │
│ Adelie   Torgersen39.118.7181.03750.0male  2007 │
│ Adelie   Torgersen39.517.4186.03800.0female2007 │
│ Adelie   TorgersenNULLNULLNULLNULLNULL2007 │
│  │
└───────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘

The only valid values of keep are "first", "last" and [`None][None]

>>> t.distinct(on="species", keep="second")  
IbisError: Invalid value for `keep`: 'second', must be 'first', 'last' or None

dropna

dropna(subset=None, how='any')

Remove rows with null values from the table.

Parameters

Name Type Description Default
subset Sequence[str] | str | None Columns names to consider when dropping nulls. By default all columns are considered. None
how Literal[‘any’, ‘all’] Determine whether a row is removed if there is at least one null value in the row ('any'), or if all row values are null ('all'). 'any'

Returns

Type Description
Table Table expression

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = letsql.examples.penguins.fetch()
>>> t
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species  island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex     year  ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ stringstringfloat64float64float64float64stringint64 │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Adelie Torgersen39.118.7181.03750.0male  2007 │
│ Adelie Torgersen39.517.4186.03800.0female2007 │
│ Adelie Torgersen40.318.0195.03250.0female2007 │
│ Adelie TorgersenNULLNULLNULLNULLNULL2007 │
│ Adelie Torgersen36.719.3193.03450.0female2007 │
│ Adelie Torgersen39.320.6190.03650.0male  2007 │
│ Adelie Torgersen38.917.8181.03625.0female2007 │
│ Adelie Torgersen39.219.6195.04675.0male  2007 │
│ Adelie Torgersen34.118.1193.03475.0NULL2007 │
│ Adelie Torgersen42.020.2190.04250.0NULL2007 │
│  │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘
>>> t.count()

344
>>> t.dropna(["bill_length_mm", "body_mass_g"]).count()

342
>>> t.dropna(how="all").count()  # no rows where all columns are null

344

fillna

fillna(replacements)

Fill null values in a table expression.

There is potential lack of type stability with the fillna API

For example, different library versions may impact whether a given backend promotes integer replacement values to floats.

Parameters

Name Type Description Default
replacements ir.Scalar | Mapping[str, ir.Scalar] Value with which to fill nulls. If replacements is a mapping, the keys are column names that map to their replacement value. If passed as a scalar all columns are filled with that value. required

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = letsql.examples.penguins.fetch()
>>> t.sex
┏━━━━━━━━┓
┃ sex    ┃
┡━━━━━━━━┩
│ string │
├────────┤
│ male   │
│ female │
│ female │
│ NULL   │
│ female │
│ male   │
│ female │
│ male   │
│ NULL   │
│ NULL   │
│       │
└────────┘
>>> t.fillna({"sex": "unrecorded"}).sex
┏━━━━━━━━━━━━┓
┃ sex        ┃
┡━━━━━━━━━━━━┩
│ string     │
├────────────┤
│ male       │
│ female     │
│ female     │
│ unrecorded │
│ female     │
│ male       │
│ female     │
│ male       │
│ unrecorded │
│ unrecorded │
│           │
└────────────┘

Returns

Type Description
Table Table expression

filter

filter(*predicates)

Select rows from table based on predicates.

Parameters

Name Type Description Default
predicates ir.BooleanValue | Sequence[ir.BooleanValue] | IfAnyAll Boolean value expressions used to select rows in table. ()

Returns

Type Description
Table Filtered table expression

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = letsql.examples.penguins.fetch()
>>> t
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species  island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex     year  ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ stringstringfloat64float64float64float64stringint64 │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Adelie Torgersen39.118.7181.03750.0male  2007 │
│ Adelie Torgersen39.517.4186.03800.0female2007 │
│ Adelie Torgersen40.318.0195.03250.0female2007 │
│ Adelie TorgersenNULLNULLNULLNULLNULL2007 │
│ Adelie Torgersen36.719.3193.03450.0female2007 │
│ Adelie Torgersen39.320.6190.03650.0male  2007 │
│ Adelie Torgersen38.917.8181.03625.0female2007 │
│ Adelie Torgersen39.219.6195.04675.0male  2007 │
│ Adelie Torgersen34.118.1193.03475.0NULL2007 │
│ Adelie Torgersen42.020.2190.04250.0NULL2007 │
│  │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘
>>> t.filter([t.species == "Adelie", t.body_mass_g > 3500]).sex.value_counts().dropna(
...     "sex"
... ).order_by("sex")
┏━━━━━━━━┳━━━━━━━━━━━┓
┃ sex     sex_count ┃
┡━━━━━━━━╇━━━━━━━━━━━┩
│ stringint64     │
├────────┼───────────┤
│ female22 │
│ male  68 │
└────────┴───────────┘

intersect

intersect(table, *rest, distinct=True)

Compute the set intersection of multiple table expressions.

The input tables must have identical schemas.

Parameters

Name Type Description Default
table Table A table expression required
*rest Table Additional table expressions ()
distinct bool Only return distinct rows True

Returns

Type Description
Table A new table containing the intersection of all input tables.

See Also

ibis.intersect

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t1 = ibis.memtable({"a": [1, 2]})
>>> t1
┏━━━━━━━┓
┃ a     ┃
┡━━━━━━━┩
│ int64 │
├───────┤
│     1 │
│     2 │
└───────┘
>>> t2 = ibis.memtable({"a": [2, 3]})
>>> t2
┏━━━━━━━┓
┃ a     ┃
┡━━━━━━━┩
│ int64 │
├───────┤
│     2 │
│     3 │
└───────┘
>>> t1.intersect(t2)
┏━━━━━━━┓
┃ a     ┃
┡━━━━━━━┩
│ int64 │
├───────┤
│     2 │
└───────┘

limit

limit(n, offset=0)

Select n rows from self starting at offset.

The result set is not deterministic without a call to order_by.

Parameters

Name Type Description Default
n int | None Number of rows to include. If None, the entire table is selected starting from offset. required
offset int Number of rows to skip first 0

Returns

Type Description
Table The first n rows of self starting at offset

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"a": [1, 1, 2], "b": ["c", "a", "a"]})
>>> t
┏━━━━━━━┳━━━━━━━━┓
┃ a      b      ┃
┡━━━━━━━╇━━━━━━━━┩
│ int64string │
├───────┼────────┤
│     1c      │
│     1a      │
│     2a      │
└───────┴────────┘
>>> t.limit(2)
┏━━━━━━━┳━━━━━━━━┓
┃ a      b      ┃
┡━━━━━━━╇━━━━━━━━┩
│ int64string │
├───────┼────────┤
│     1c      │
│     1a      │
└───────┴────────┘

You can use None with offset to slice starting from a particular row

>>> t.limit(None, offset=1)
┏━━━━━━━┳━━━━━━━━┓
┃ a      b      ┃
┡━━━━━━━╇━━━━━━━━┩
│ int64string │
├───────┼────────┤
│     1a      │
│     2a      │
└───────┴────────┘

See Also

Table.order_by

order_by

order_by(*by)

Sort a table by one or more expressions.

Similar to pandas.DataFrame.sort_values().

Parameters

Name Type Description Default
by str | ir.Column | s.Selector | Sequence[str] | Sequence[ir.Column] | Sequence[s.Selector] | None Expressions to sort the table by. ()

Returns

Type Description
Table Sorted table

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable(
...     {
...         "a": [3, 2, 1, 3],
...         "b": ["a", "B", "c", "D"],
...         "c": [4, 6, 5, 7],
...     }
... )
>>> t
┏━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ a      b       c     ┃
┡━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ int64stringint64 │
├───────┼────────┼───────┤
│     3a     4 │
│     2B     6 │
│     1c     5 │
│     3D     7 │
└───────┴────────┴───────┘

Sort by b. Default is ascending. Note how capital letters come before lowercase

>>> t.order_by("b")
┏━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ a      b       c     ┃
┡━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ int64stringint64 │
├───────┼────────┼───────┤
│     2B     6 │
│     3D     7 │
│     3a     4 │
│     1c     5 │
└───────┴────────┴───────┘

Sort in descending order

>>> t.order_by(ibis.desc("b"))
┏━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ a      b       c     ┃
┡━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ int64stringint64 │
├───────┼────────┼───────┤
│     1c     5 │
│     3a     4 │
│     3D     7 │
│     2B     6 │
└───────┴────────┴───────┘

You can also use the deferred API to get the same result

>>> from ibis import _
>>> t.order_by(_.b.desc())
┏━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ a      b       c     ┃
┡━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ int64stringint64 │
├───────┼────────┼───────┤
│     1c     5 │
│     3a     4 │
│     3D     7 │
│     2B     6 │
└───────┴────────┴───────┘

Sort by multiple columns/expressions

>>> t.order_by(["a", _.c.desc()])
┏━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ a      b       c     ┃
┡━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ int64stringint64 │
├───────┼────────┼───────┤
│     1c     5 │
│     2B     6 │
│     3D     7 │
│     3a     4 │
└───────┴────────┴───────┘

You can actually pass arbitrary expressions to use as sort keys. For example, to ignore the case of the strings in column b

>>> t.order_by(_.b.lower())
┏━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ a      b       c     ┃
┡━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ int64stringint64 │
├───────┼────────┼───────┤
│     3a     4 │
│     2B     6 │
│     1c     5 │
│     3D     7 │
└───────┴────────┴───────┘

This means that shuffling a Table is super simple

>>> t.order_by(ibis.random())
┏━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ a      b       c     ┃
┡━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ int64stringint64 │
├───────┼────────┼───────┤
│     3D     7 │
│     3a     4 │
│     2B     6 │
│     1c     5 │
└───────┴────────┴───────┘

sample

sample(fraction, *, method='row', seed=None)

Sample a fraction of rows from a table.

Results may be non-repeatable

Sampling is by definition a random operation. Some backends support specifying a seed for repeatable results, but not all backends support that option. And some backends (duckdb, for example) do support specifying a seed but may still not have repeatable results in all cases.

In all cases, results are backend-specific. An execution against one backend is unlikely to sample the same rows when executed against a different backend, even with the same seed set.

Parameters

Name Type Description Default
fraction float The percentage of rows to include in the sample, expressed as a float between 0 and 1. required
method Literal[‘row’, ‘block’] The sampling method to use. The default is “row”, which includes each row with a probability of fraction. If method is “block”, some backends may instead perform sampling a fraction of blocks of rows (where “block” is a backend dependent definition). This is identical to “row” for backends lacking a blockwise sampling implementation. For those coming from SQL, “row” and “block” correspond to “bernoulli” and “system” respectively in a TABLESAMPLE clause. 'row'
seed int | None An optional random seed to use, for repeatable sampling. The range of possible seed values is backend specific (most support at least [0, 2**31 - 1]). Backends that never support specifying a seed for repeatable sampling will error appropriately. Note that some backends (like DuckDB) do support specifying a seed, but may still not have repeatable results in all cases. None

Returns

Type Description
Table The input table, with fraction of rows selected.

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"x": [1, 2, 3, 4], "y": ["a", "b", "c", "d"]})
>>> t
┏━━━━━━━┳━━━━━━━━┓
┃ x      y      ┃
┡━━━━━━━╇━━━━━━━━┩
│ int64string │
├───────┼────────┤
│     1a      │
│     2b      │
│     3c      │
│     4d      │
└───────┴────────┘

Sample approximately half the rows, with a seed specified for reproducibility.

>>> t.sample(0.5, seed=1234)
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
 /home/runner/.cache/pypoetry/virtualenvs/letsql-uJiiVu6z-py3.10/lib/python3.10/site-packages/ibi 
 s/expr/types/core.py:99 in __rich_console__                                                      
                                                                                                  
    96 │   │                                                                                      
    97 │   │   try:                                                                               
    98 │   │   │   if opts.interactive:                                                           
  99 │   │   │   │   rich_object = to_rich(self, console_width=console_width)                   
   100 │   │   │   else:                                                                          
   101 │   │   │   │   rich_object = Text(self._noninteractive_repr())                            
   102 │   │   except Exception as e:                                                             
                                                                                                  
 /home/runner/.cache/pypoetry/virtualenvs/letsql-uJiiVu6z-py3.10/lib/python3.10/site-packages/ibi 
 s/expr/types/pretty.py:271 in to_rich                                                            
                                                                                                  
   268 │   │   │   expr, max_length=max_length, max_string=max_string, max_depth=max_depth        
   269 │   │   )                                                                                  
   270 else:                                                                                  
 271 │   │   return _to_rich_table(                                                             
   272 │   │   │   expr,                                                                          
   273 │   │   │   max_rows=max_rows,                                                             
   274 │   │   │   max_columns=max_columns,                                                       
                                                                                                  
 /home/runner/.cache/pypoetry/virtualenvs/letsql-uJiiVu6z-py3.10/lib/python3.10/site-packages/ibi 
 s/expr/types/pretty.py:342 in _to_rich_table                                                     
                                                                                                  
   339 │   │   if orig_ncols > len(computed_cols):                                                
   340 │   │   │   table = table.select(*computed_cols)                                           
   341                                                                                        
 342 result = table.limit(max_rows + 1).to_pyarrow()                                        
   343 # Now format the columns in order, stopping if the console width would                 
   344 # be exceeded.                                                                         
   345 col_info = []                                                                          
                                                                                                  
 /home/runner/.cache/pypoetry/virtualenvs/letsql-uJiiVu6z-py3.10/lib/python3.10/site-packages/too 
 lz/functoolz.py:304 in __call__                                                                  
                                                                                                  
    301                                                                                       
    302 def __call__(self, *args, **kwargs):                                                  
    303 │   │   try:                                                                              
  304 │   │   │   return self._partial(*args, **kwargs)                                         
    305 │   │   except TypeError as exc:                                                          
    306 │   │   │   if self._should_curry(args, kwargs, exc):                                     
    307 │   │   │   │   return self.bind(*args, **kwargs)                                         
                                                                                                  
 /home/runner/work/letsql/letsql/python/letsql/backends/let/hotfix.py:196 in letsql_invoke        
                                                                                                  
   193 │   │   if dt not in con._sources.sources:                                                 
   194 │   │   │   con.register(dt.to_expr(), dt.name)                                            
   195 method = getattr(con, f"{_methodname}")                                                
 196 return method(self, *args, **kwargs)                                                   
   197                                                                                            
   198                                                                                            
   199 for typ, methodnames in (                                                                  
                                                                                                  
 /home/runner/work/letsql/letsql/python/letsql/backends/let/__init__.py:233 in to_pyarrow         
                                                                                                  
   230                                                                                        
   231 def to_pyarrow(self, expr: ir.Expr, **kwargs: Any) -> pa.Table:                        
   232 │   │   backend, expr = self._get_backend_and_expr(expr)                                   
 233 │   │   return backend.to_pyarrow(expr.unbind(), **kwargs)                                 
   234                                                                                        
   235 def to_pyarrow_batches(                                                                
   236 │   │   self,                                                                              
                                                                                                  
 /home/runner/work/letsql/letsql/python/letsql/backends/datafusion/__init__.py:726 in to_pyarrow  
                                                                                                  
   723 │   │   )                                                                                  
   724                                                                                        
   725 def to_pyarrow(self, expr: ir.Expr, **kwargs: Any) -> pa.Table:                        
 726 │   │   batch_reader = self._to_pyarrow_batches(expr, **kwargs)                            
   727 │   │   arrow_table = batch_reader.read_all()                                              
   728 │   │   return expr.__pyarrow_result__(arrow_table)                                        
   729                                                                                            
                                                                                                  
 /home/runner/work/letsql/letsql/python/letsql/backends/datafusion/__init__.py:698 in             
 _to_pyarrow_batches                                                                              
                                                                                                  
   695 │   │   self._register_udfs(expr)                                                          
   696 │   │   self._register_in_memory_tables(expr)                                              
   697 │   │   table_expr = expr.as_table()                                                       
 698 │   │   raw_sql = self.compile(table_expr, **kwargs)                                       
   699 │   │   frame = self.con.sql(raw_sql)                                                      
   700 │   │   schema = table_expr.schema()                                                       
   701 │   │   pyarrow_schema = schema.to_pyarrow()                                               
                                                                                                  
 /home/runner/.cache/pypoetry/virtualenvs/letsql-uJiiVu6z-py3.10/lib/python3.10/site-packages/ibi 
 s/backends/sql/__init__.py:177 in compile                                                        
                                                                                                  
   174 │   │   **kwargs: Any,                                                                     
   175 ):                                                                                     
   176 │   │   """Compile an Ibis expression to a SQL string."""                                  
 177 │   │   query = self._to_sqlglot(expr, limit=limit, params=params, **kwargs)               
   178 │   │   sql = query.sql(dialect=self.dialect, pretty=pretty, copy=False)                   
   179 │   │   self._log(sql)                                                                     
   180 │   │   return sql                                                                         
                                                                                                  
 /home/runner/work/letsql/letsql/python/letsql/backends/let/__init__.py:316 in _to_sqlglot        
                                                                                                  
   313 │   │   op = expr.op()                                                                     
   314 │   │   out = op.map_clear(replace_cache_table)                                            
   315 │   │                                                                                      
 316 │   │   return super()._to_sqlglot(out.to_expr(), limit=limit, params=params)              
   317                                                                                        
   318 def sql(                                                                               
   319 │   │   self,                                                                              
                                                                                                  
 /home/runner/.cache/pypoetry/virtualenvs/letsql-uJiiVu6z-py3.10/lib/python3.10/site-packages/ibi 
 s/backends/sql/__init__.py:159 in _to_sqlglot                                                    
                                                                                                  
   156 │   │   if params is None:                                                                 
   157 │   │   │   params = {}                                                                    
   158 │   │                                                                                      
 159 │   │   sql = self.compiler.translate(table_expr.op(), params=params)                      
   160 │   │   assert not isinstance(sql, sge.Subquery)                                           
   161 │   │                                                                                      
   162 │   │   if isinstance(sql, sge.Table):                                                     
                                                                                                  
 /home/runner/.cache/pypoetry/virtualenvs/letsql-uJiiVu6z-py3.10/lib/python3.10/site-packages/ibi 
 s/backends/sql/compiler.py:457 in translate                                                      
                                                                                                  
    454 │   │   # substitute parameters immediately to avoid having to define a                   
    455 │   │   # ScalarParameter translation rule                                                
    456 │   │   params = self._prepare_params(params)                                             
  457 │   │   op, ctes = sqlize(                                                                
    458 │   │   │   op,                                                                           
    459 │   │   │   params=params,                                                                
    460 │   │   │   rewrites=self.rewrites,                                                       
                                                                                                  
 /home/runner/.cache/pypoetry/virtualenvs/letsql-uJiiVu6z-py3.10/lib/python3.10/site-packages/ibi 
 s/backends/sql/rewrites.py:235 in sqlize                                                         
                                                                                                  
   232                                                                                        
   233 # apply the backend specific rewrites                                                  
   234 if rewrites:                                                                           
 235 │   │   node = node.replace(reduce(operator.or_, rewrites))                                
   236                                                                                        
   237 # lower the expression graph to a SQL-like relational algebra                          
   238 context = {"params": params}                                                           
                                                                                                  
 /home/runner/.cache/pypoetry/virtualenvs/letsql-uJiiVu6z-py3.10/lib/python3.10/site-packages/ibi 
 s/common/graph.py:464 in replace                                                                 
                                                                                                  
   461 │   │                                                                                      
   462 │   │   """                                                                                
   463 │   │   replacer = _coerce_replacer(replacer, context)                                     
 464 │   │   results = self.map(replacer, filter=filter)                                        
   465 │   │   return results.get(self, self)                                                     
   466                                                                                            
   467                                                                                            
                                                                                                  
 /home/runner/.cache/pypoetry/virtualenvs/letsql-uJiiVu6z-py3.10/lib/python3.10/site-packages/ibi 
 s/common/graph.py:265 in map                                                                     
                                                                                                  
   262 │   │   │   │   k: _recursive_lookup(v, results)                                           
   263 │   │   │   │   for k, v in zip(node.__argnames__, node.__args__)                          
   264 │   │   │   }                                                                              
 265 │   │   │   results[node] = fn(node, results, **kwargs)                                    
   266 │   │                                                                                      
   267 │   │   return results                                                                     
   268                                                                                            
                                                                                                  
 /home/runner/.cache/pypoetry/virtualenvs/letsql-uJiiVu6z-py3.10/lib/python3.10/site-packages/ibi 
 s/common/graph.py:188 in fn                                                                      
                                                                                                  
   185 │   │   │   # upward in the hierarchy, using a specialized __recreate__ method             
   186 │   │   │   # improves the performance by 17% compared node.__class__(**kwargs)            
   187 │   │   │   recreated = node.__recreate__(kwargs)                                          
 188 │   │   │   if (result := obj.match(recreated, ctx)) is NoMatch:                           
   189 │   │   │   │   return recreated                                                           
   190 │   │   │   else:                                                                          
   191 │   │   │   │   return result                                                              
                                                                                                  
 /home/runner/.cache/pypoetry/virtualenvs/letsql-uJiiVu6z-py3.10/lib/python3.10/site-packages/ibi 
 s/common/patterns.py:930 in match                                                                
                                                                                                  
    927                                                                                       
    928 def match(self, value, context):                                                      
    929 │   │   for pattern in self.patterns:                                                     
  930 │   │   │   result = pattern.match(value, context)                                        
    931 │   │   │   if result is not NoMatch:                                                     
    932 │   │   │   │   return result                                                             
    933 │   │   return NoMatch                                                                    
                                                                                                  
 /home/runner/.cache/pypoetry/virtualenvs/letsql-uJiiVu6z-py3.10/lib/python3.10/site-packages/ibi 
 s/common/patterns.py:930 in match                                                                
                                                                                                  
    927                                                                                       
    928 def match(self, value, context):                                                      
    929 │   │   for pattern in self.patterns:                                                     
  930 │   │   │   result = pattern.match(value, context)                                        
    931 │   │   │   if result is not NoMatch:                                                     
    932 │   │   │   │   return result                                                             
    933 │   │   return NoMatch                                                                    
                                                                                                  
 /home/runner/.cache/pypoetry/virtualenvs/letsql-uJiiVu6z-py3.10/lib/python3.10/site-packages/ibi 
 s/common/patterns.py:930 in match                                                                
                                                                                                  
    927                                                                                       
    928 def match(self, value, context):                                                      
    929 │   │   for pattern in self.patterns:                                                     
  930 │   │   │   result = pattern.match(value, context)                                        
    931 │   │   │   if result is not NoMatch:                                                     
    932 │   │   │   │   return result                                                             
    933 │   │   return NoMatch                                                                    
                                                                                                  
 /home/runner/.cache/pypoetry/virtualenvs/letsql-uJiiVu6z-py3.10/lib/python3.10/site-packages/ibi 
 s/common/patterns.py:930 in match                                                                
                                                                                                  
    927                                                                                       
    928 def match(self, value, context):                                                      
    929 │   │   for pattern in self.patterns:                                                     
  930 │   │   │   result = pattern.match(value, context)                                        
    931 │   │   │   if result is not NoMatch:                                                     
    932 │   │   │   │   return result                                                             
    933 │   │   return NoMatch                                                                    
                                                                                                  
 /home/runner/.cache/pypoetry/virtualenvs/letsql-uJiiVu6z-py3.10/lib/python3.10/site-packages/ibi 
 s/common/patterns.py:930 in match                                                                
                                                                                                  
    927                                                                                       
    928 def match(self, value, context):                                                      
    929 │   │   for pattern in self.patterns:                                                     
  930 │   │   │   result = pattern.match(value, context)                                        
    931 │   │   │   if result is not NoMatch:                                                     
    932 │   │   │   │   return result                                                             
    933 │   │   return NoMatch                                                                    
                                                                                                  
 /home/runner/.cache/pypoetry/virtualenvs/letsql-uJiiVu6z-py3.10/lib/python3.10/site-packages/ibi 
 s/common/patterns.py:930 in match                                                                
                                                                                                  
    927                                                                                       
    928 def match(self, value, context):                                                      
    929 │   │   for pattern in self.patterns:                                                     
  930 │   │   │   result = pattern.match(value, context)                                        
    931 │   │   │   if result is not NoMatch:                                                     
    932 │   │   │   │   return result                                                             
    933 │   │   return NoMatch                                                                    
                                                                                                  
 /home/runner/.cache/pypoetry/virtualenvs/letsql-uJiiVu6z-py3.10/lib/python3.10/site-packages/ibi 
 s/common/patterns.py:930 in match                                                                
                                                                                                  
    927                                                                                       
    928 def match(self, value, context):                                                      
    929 │   │   for pattern in self.patterns:                                                     
  930 │   │   │   result = pattern.match(value, context)                                        
    931 │   │   │   if result is not NoMatch:                                                     
    932 │   │   │   │   return result                                                             
    933 │   │   return NoMatch                                                                    
                                                                                                  
 /home/runner/.cache/pypoetry/virtualenvs/letsql-uJiiVu6z-py3.10/lib/python3.10/site-packages/ibi 
 s/common/patterns.py:389 in match                                                                
                                                                                                  
    386 │   │   # use the `_` reserved variable to record the value being replaced                
    387 │   │   # in the context, so that it can be used in the replacer pattern                  
    388 │   │   context["_"] = value                                                              
  389 │   │   return self.replacer.resolve(context)                                             
    390                                                                                           
    391                                                                                           
    392 def replace(matcher):                                                                     
                                                                                                  
 /home/runner/.cache/pypoetry/virtualenvs/letsql-uJiiVu6z-py3.10/lib/python3.10/site-packages/ibi 
 s/common/deferred.py:325 in resolve                                                              
                                                                                                  
   322 │   │   super().__init__(func=func)                                                        
   323                                                                                        
   324 def resolve(self, context):                                                            
 325 │   │   return self.func(**context)                                                        
   326                                                                                            
   327                                                                                            
   328 class Attr(FrozenSlotted, Resolver):                                                       
                                                                                                  
 /home/runner/.cache/pypoetry/virtualenvs/letsql-uJiiVu6z-py3.10/lib/python3.10/site-packages/ibi 
 s/backends/sql/rewrites.py:325 in rewrite_sample_as_filter                                       
                                                                                                  
   322 Errors as unsupported if a `seed` is specified.                                        
   323 """                                                                                    
   324 if _.seed is not None:                                                                 
 325 │   │   raise com.UnsupportedOperationError(                                               
   326 │   │   │   "`Table.sample` with a random seed is unsupported"                             
   327 │   │   )                                                                                  
   328 return ops.Filter(_.parent, (ops.LessEqual(ops.RandomScalar(), _.fraction),))          
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
UnsupportedOperationError: `Table.sample` with a random seed is unsupported
UnsupportedOperationError: `Table.sample` with a random seed is unsupported

Translation to backend failed
Error message: UnsupportedOperationError('`Table.sample` with a random seed is unsupported')
Expression repr follows:
r0 := InMemoryTable
  data:
    PandasDataFrameProxy:
         x  y
      0  1  a
      1  2  b
      2  3  c
      3  4  d

Sample[r0, fraction=0.5, method='row', seed=1234]

select

select(*exprs, **named_exprs)

Compute a new table expression using exprs and named_exprs.

Passing an aggregate function to this method will broadcast the aggregate’s value over the number of rows in the table and automatically constructs a window function expression. See the examples section for more details.

For backwards compatibility the keyword argument exprs is reserved and cannot be used to name an expression. This behavior will be removed in v4.

Parameters

Name Type Description Default
exprs ir.Value | str | Iterable[ir.Value | str] Column expression, string, or list of column expressions and strings. ()
named_exprs ir.Value | str Column expressions {}

Returns

Type Description
Table Table expression

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = letsql.examples.penguins.fetch()
>>> t
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species  island     bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g  sex     year  ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ stringstringfloat64float64float64float64stringint64 │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Adelie Torgersen39.118.7181.03750.0male  2007 │
│ Adelie Torgersen39.517.4186.03800.0female2007 │
│ Adelie Torgersen40.318.0195.03250.0female2007 │
│ Adelie TorgersenNULLNULLNULLNULLNULL2007 │
│ Adelie Torgersen36.719.3193.03450.0female2007 │
│ Adelie Torgersen39.320.6190.03650.0male  2007 │
│ Adelie Torgersen38.917.8181.03625.0female2007 │
│ Adelie Torgersen39.219.6195.04675.0male  2007 │
│ Adelie Torgersen34.118.1193.03475.0NULL2007 │
│ Adelie Torgersen42.020.2190.04250.0NULL2007 │
│  │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘

Simple projection

>>> t.select("island", "bill_length_mm").head()
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ island     bill_length_mm ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ stringfloat64        │
├───────────┼────────────────┤
│ Torgersen39.1 │
│ Torgersen39.5 │
│ Torgersen40.3 │
│ TorgersenNULL │
│ Torgersen36.7 │
└───────────┴────────────────┘

In that simple case, you could also just use python’s indexing syntax

>>> t[["island", "bill_length_mm"]].head()
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ island     bill_length_mm ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ stringfloat64        │
├───────────┼────────────────┤
│ Torgersen39.1 │
│ Torgersen39.5 │
│ Torgersen40.3 │
│ TorgersenNULL │
│ Torgersen36.7 │
└───────────┴────────────────┘

Projection by zero-indexed column position

>>> t.select(t[0], t[4]).head()
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
┃ species  flipper_length_mm ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
│ stringfloat64           │
├─────────┼───────────────────┤
│ Adelie 181.0 │
│ Adelie 186.0 │
│ Adelie 195.0 │
│ Adelie NULL │
│ Adelie 193.0 │
└─────────┴───────────────────┘

Projection with renaming and compute in one call

>>> t.select(next_year=t.year + 1).head()
┏━━━━━━━━━━━┓
┃ next_year ┃
┡━━━━━━━━━━━┩
│ int64     │
├───────────┤
│      2008 │
│      2008 │
│      2008 │
│      2008 │
│      2008 │
└───────────┘

You can do the same thing with a named expression, and using the deferred API

>>> from ibis import _
>>> t.select((_.year + 1).name("next_year")).head()
┏━━━━━━━━━━━┓
┃ next_year ┃
┡━━━━━━━━━━━┩
│ int64     │
├───────────┤
│      2008 │
│      2008 │
│      2008 │
│      2008 │
│      2008 │
└───────────┘

Projection with aggregation expressions

>>> t.select("island", bill_mean=t.bill_length_mm.mean()).head()
┏━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ island     bill_mean ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━┩
│ stringfloat64   │
├───────────┼───────────┤
│ Torgersen43.92193 │
│ Torgersen43.92193 │
│ Torgersen43.92193 │
│ Torgersen43.92193 │
│ Torgersen43.92193 │
└───────────┴───────────┘

Projection with a selector

>>> import ibis.selectors as s
>>> t.select(s.numeric() & ~s.c("year")).head()
┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g ┃
┡━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ float64float64float64float64     │
├────────────────┼───────────────┼───────────────────┼─────────────┤
│           39.118.7181.03750.0 │
│           39.517.4186.03800.0 │
│           40.318.0195.03250.0 │
│           NULLNULLNULLNULL │
│           36.719.3193.03450.0 │
└────────────────┴───────────────┴───────────────────┴─────────────┘

Projection + aggregation across multiple columns

>>> from ibis import _
>>> t.select(s.across(s.numeric() & ~s.c("year"), _.mean())).head()
┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g ┃
┡━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ float64float64float64float64     │
├────────────────┼───────────────┼───────────────────┼─────────────┤
│       43.9219317.15117200.9152054201.754386 │
│       43.9219317.15117200.9152054201.754386 │
│       43.9219317.15117200.9152054201.754386 │
│       43.9219317.15117200.9152054201.754386 │
│       43.9219317.15117200.9152054201.754386 │
└────────────────┴───────────────┴───────────────────┴─────────────┘

sql

sql(query, dialect=None)

Run a SQL query against a table expression.

Parameters

Name Type Description Default
query str Query string required
dialect str | None Optional string indicating the dialect of query. Defaults to the backend’s native dialect. None

Returns

Type Description
Table An opaque table expression

Examples

>>> import ibis
>>> from ibis import _
>>> ibis.options.interactive = True
>>> t = letsql.examples.penguins.fetch(table_name="penguins")
>>> expr = t.sql(
...     """
...     SELECT island, mean(bill_length_mm) AS avg_bill_length
...     FROM penguins
...     GROUP BY 1
...     ORDER BY 2 DESC
...     """
... )
>>> expr
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ island     avg_bill_length ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ stringfloat64         │
├───────────┼─────────────────┤
│ Biscoe   45.257485 │
│ Dream    44.167742 │
│ Torgersen38.950980 │
└───────────┴─────────────────┘

Mix and match ibis expressions with SQL queries

>>> t = letsql.examples.penguins.fetch(table_name="penguins")
>>> expr = t.sql(
...     """
...     SELECT island, mean(bill_length_mm) AS avg_bill_length
...     FROM penguins
...     GROUP BY 1
...     ORDER BY 2 DESC
...     """
... )
>>> expr = expr.mutate(
...     island=_.island.lower(),
...     avg_bill_length=_.avg_bill_length.round(1),
... )
>>> expr
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ island     avg_bill_length ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ stringfloat64         │
├───────────┼─────────────────┤
│ biscoe   45.3 │
│ dream    44.2 │
│ torgersen39.0 │
└───────────┴─────────────────┘

Because ibis expressions aren’t named, they aren’t visible to subsequent .sql calls. Use the alias method to assign a name to an expression.

>>> expr.alias("b").sql("SELECT * FROM b WHERE avg_bill_length > 40")
┏━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ island  avg_bill_length ┃
┡━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ stringfloat64         │
├────────┼─────────────────┤
│ biscoe45.3 │
│ dream 44.2 │
└────────┴─────────────────┘

See Also

Table.alias

union

union(table, *rest, distinct=False)

Compute the set union of multiple table expressions.

The input tables must have identical schemas.

Parameters

Name Type Description Default
table Table A table expression required
*rest Table Additional table expressions ()
distinct bool Only return distinct rows False

Returns

Type Description
Table A new table containing the union of all input tables.

See Also

ibis.union

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t1 = ibis.memtable({"a": [1, 2]})
>>> t1
┏━━━━━━━┓
┃ a     ┃
┡━━━━━━━┩
│ int64 │
├───────┤
│     1 │
│     2 │
└───────┘
>>> t2 = ibis.memtable({"a": [2, 3]})
>>> t2
┏━━━━━━━┓
┃ a     ┃
┡━━━━━━━┩
│ int64 │
├───────┤
│     2 │
│     3 │
└───────┘
>>> t1.union(t2)  # union all by default
┏━━━━━━━┓
┃ a     ┃
┡━━━━━━━┩
│ int64 │
├───────┤
│     2 │
│     3 │
│     1 │
│     2 │
└───────┘
>>> t1.union(t2, distinct=True).order_by("a")
┏━━━━━━━┓
┃ a     ┃
┡━━━━━━━┩
│ int64 │
├───────┤
│     1 │
│     2 │
│     3 │
└───────┘

view

view()

Create a new table expression distinct from the current one.

Use this API for any self-referencing operations like a self-join.

Returns

Type Description
Table Table expression