Generic expressions

Scalars and columns of any element type.

Value

Value(self, arg)

Base class for a data generating expression having a known type.

Methods

Name Description
asc Sort an expression ascending.
cast Cast expression to indicated data type.
coalesce Return the first non-null value from args.
collect Aggregate this expression’s elements into an array.
identical_to Return whether this expression is identical to other.
isin Check whether this expression’s values are in values.
isnull Return whether this expression is NULL.
name Rename an expression to name.
notnull Return whether this expression is not NULL.
nullif Set values to null if they equal the values null_if_expr.
try_cast Try cast expression to indicated data type.

asc

asc(nulls_first=False)

Sort an expression ascending.

cast

cast(target_type)

Cast expression to indicated data type.

Similar to pandas.Series.astype.

Parameters

Name Type Description Default
target_type Any Type to cast to. Anything accepted by ibis.dtype() required

Returns

Name Type Description
Value Casted expression

See Also

Value.try_cast() ibis.dtype()

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> x = letsql.examples.penguins.fetch()["bill_depth_mm"]
>>> x
┏━━━━━━━━━━━━━━━┓
┃ bill_depth_mm ┃
┡━━━━━━━━━━━━━━━┩
│ float64       │
├───────────────┤
│          18.7 │
│          17.4 │
│          18.0 │
│          NULL │
│          19.3 │
│          20.6 │
│          17.8 │
│          19.6 │
│          18.1 │
│          20.2 │
│              │
└───────────────┘

python’s built-in types can be used

>>> x.cast(int)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Cast(bill_depth_mm, int64) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ int64                      │
├────────────────────────────┤
│                         18 │
│                         17 │
│                         18 │
│                       NULL │
│                         19 │
│                         20 │
│                         17 │
│                         19 │
│                         18 │
│                         20 │
│                           │
└────────────────────────────┘

or string names

>>> x.cast("int8")
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Cast(bill_depth_mm, int8) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ int8                      │
├───────────────────────────┤
│                        18 │
│                        17 │
│                        18 │
│                      NULL │
│                        19 │
│                        20 │
│                        17 │
│                        19 │
│                        18 │
│                        20 │
│                          │
└───────────────────────────┘

If you make an illegal cast, you won’t know until the backend actually executes it. Consider .try_cast().

>>> ibis.literal("a string").cast("int64")

ConversionException: Conversion Error: Could not convert string 'a string' to INT64
LINE 1: SELECT CAST('a string' AS BIGINT) AS "Cast('a ...
               ^

coalesce

coalesce(*args)

Return the first non-null value from args.

Parameters

Name Type Description Default
args Value Arguments from which to choose the first non-null value ()

Returns

Name Type Description
Value Coalesced expression

See Also

ibis.coalesce() Value.fill_null()

Examples

>>> import ibis
>>> ibis.coalesce(None, 4, 5).name("x")

┌───┐
│ 4 │
└───┘

collect

collect(where=None, order_by=None, include_null=False)

Aggregate this expression’s elements into an array.

This function is called array_agg, list_agg, or list in other systems.

Parameters

Name Type Description Default
where ir.BooleanValue | None An optional filter expression. If provided, only rows where where is True will be included in the aggregate. None
order_by Any An ordering key (or keys) to use to order the rows before aggregating. If not provided, the order of the items in the result is undefined and backend specific. None
include_null bool Whether to include null values when performing this aggregation. Set to True to include nulls in the result. False

Returns

Name Type Description
ArrayScalar Collected array

Examples

Basic collect usage

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"key": list("aaabb"), "value": [1, 2, 3, 4, 5]})
>>> t
┏━━━━━━━━┳━━━━━━━┓
┃ key     value ┃
┡━━━━━━━━╇━━━━━━━┩
│ stringint64 │
├────────┼───────┤
│ a     1 │
│ a     2 │
│ a     3 │
│ b     4 │
│ b     5 │
└────────┴───────┘
>>> t.value.collect()

┌────────────────┐
│ [1, 2, ... +3] │
└────────────────┘
>>> type(t.value.collect())
ibis.expr.types.arrays.ArrayScalar

Collect elements per group

>>> t.group_by("key").agg(v=lambda t: t.value.collect()).order_by("key")
┏━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ key     v              ┃
┡━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ stringarray<int64>   │
├────────┼────────────────┤
│ a     [1, 2, ... +1] │
│ b     [4, 5]         │
└────────┴────────────────┘

Collect elements per group using a filter

>>> t.group_by("key").agg(v=lambda t: t.value.collect(where=t.value > 1)).order_by("key")
┏━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ key     v            ┃
┡━━━━━━━━╇━━━━━━━━━━━━━━┩
│ stringarray<int64> │
├────────┼──────────────┤
│ a     [2, 3]       │
│ b     [4, 5]       │
└────────┴──────────────┘

identical_to

identical_to(other)

Return whether this expression is identical to other.

Corresponds to IS NOT DISTINCT FROM in SQL.

Parameters

Name Type Description Default
other Value Expression to compare to required

Returns

Name Type Description
BooleanValue Whether this expression is not distinct from other

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> one = ibis.literal(1)
>>> two = ibis.literal(2)
>>> two.identical_to(one + one)

┌──────┐
│ True │
└──────┘

isin

isin(values)

Check whether this expression’s values are in values.

NULL values are propagated in the output. See examples for details.

Parameters

Name Type Description Default
values Value | Sequence[Value] Values or expression to check for membership required

Returns

Name Type Description
BooleanValue Expression indicating membership

See Also

Value.notin()

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"a": [1, 2, 3], "b": [2, 3, 4]})
>>> t
┏━━━━━━━┳━━━━━━━┓
┃ a      b     ┃
┡━━━━━━━╇━━━━━━━┩
│ int64int64 │
├───────┼───────┤
│     12 │
│     23 │
│     34 │
└───────┴───────┘

Check against a literal sequence of values

>>> t.a.isin([1, 2])
┏━━━━━━━━━━━━━━━━━━━━━┓
┃ InValues(a, (1, 2)) ┃
┡━━━━━━━━━━━━━━━━━━━━━┩
│ boolean             │
├─────────────────────┤
│ True                │
│ True                │
│ False               │
└─────────────────────┘

Check against a derived expression

>>> t.a.isin(t.b + 1)
┏━━━━━━━━━━━━━━━┓
┃ InSubquery(a) ┃
┡━━━━━━━━━━━━━━━┩
│ boolean       │
├───────────────┤
│ False         │
│ False         │
│ True          │
└───────────────┘

Check against a column from a different table

>>> t2 = ibis.memtable({"x": [99, 2, 99]})
>>> t.a.isin(t2.x)
┏━━━━━━━━━━━━━━━┓
┃ InSubquery(a) ┃
┡━━━━━━━━━━━━━━━┩
│ boolean       │
├───────────────┤
│ False         │
│ True          │
│ False         │
└───────────────┘

NULL behavior

>>> t = ibis.memtable({"x": [1, 2]})
>>> t.x.isin([1, None])
┏━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ InValues(x, (1, None)) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━┩
│ boolean                │
├────────────────────────┤
│ True                   │
│ NULL                   │
└────────────────────────┘
>>> t = ibis.memtable({"x": [1, None, 2]})
>>> t.x.isin([1])
┏━━━━━━━━━━━━━━━━━━━┓
┃ InValues(x, (1,)) ┃
┡━━━━━━━━━━━━━━━━━━━┩
│ boolean           │
├───────────────────┤
│ True              │
│ NULL              │
│ False             │
└───────────────────┘
>>> t.x.isin([3])
┏━━━━━━━━━━━━━━━━━━━┓
┃ InValues(x, (3,)) ┃
┡━━━━━━━━━━━━━━━━━━━┩
│ boolean           │
├───────────────────┤
│ False             │
│ NULL              │
│ False             │
└───────────────────┘

isnull

isnull()

Return whether this expression is NULL.

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = letsql.examples.penguins.fetch().limit(5)
>>> t.bill_depth_mm
┏━━━━━━━━━━━━━━━┓
┃ bill_depth_mm ┃
┡━━━━━━━━━━━━━━━┩
│ float64       │
├───────────────┤
│          18.7 │
│          17.4 │
│          18.0 │
│          NULL │
│          19.3 │
└───────────────┘
>>> t.bill_depth_mm.isnull()
┏━━━━━━━━━━━━━━━━━━━━━━━┓
┃ IsNull(bill_depth_mm) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━┩
│ boolean               │
├───────────────────────┤
│ False                 │
│ False                 │
│ False                 │
│ True                  │
│ False                 │
└───────────────────────┘

name

name(name)

Rename an expression to name.

Parameters

Name Type Description Default
name The new name of the expression required

Returns

Name Type Description
Value self with name name

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"a": [1, 2]}, name="t")
>>> t.a
┏━━━━━━━┓
┃ a     ┃
┡━━━━━━━┩
│ int64 │
├───────┤
│     1 │
│     2 │
└───────┘
>>> t.a.name("b")
┏━━━━━━━┓
┃ b     ┃
┡━━━━━━━┩
│ int64 │
├───────┤
│     1 │
│     2 │
└───────┘

notnull

notnull()

Return whether this expression is not NULL.

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = letsql.examples.penguins.fetch().limit(5)
>>> t.bill_depth_mm
┏━━━━━━━━━━━━━━━┓
┃ bill_depth_mm ┃
┡━━━━━━━━━━━━━━━┩
│ float64       │
├───────────────┤
│          18.7 │
│          17.4 │
│          18.0 │
│          NULL │
│          19.3 │
└───────────────┘
>>> t.bill_depth_mm.notnull()
┏━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ NotNull(bill_depth_mm) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━┩
│ boolean                │
├────────────────────────┤
│ True                   │
│ True                   │
│ True                   │
│ False                  │
│ True                   │
└────────────────────────┘

nullif

nullif(null_if_expr)

Set values to null if they equal the values null_if_expr.

Commonly used to avoid divide-by-zero problems by replacing zero with NULL in the divisor.

Equivalent to (self == null_if_expr).ifelse(ibis.null(), self).

Parameters

Name Type Description Default
null_if_expr Value Expression indicating what values should be NULL required

Returns

Name Type Description
Value Value expression

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> vals = letsql.examples.penguins.fetch().head(5).sex
>>> vals
┏━━━━━━━━┓
┃ sex    ┃
┡━━━━━━━━┩
│ string │
├────────┤
│ male   │
│ female │
│ female │
│ NULL   │
│ female │
└────────┘
>>> vals.nullif("male")
┏━━━━━━━━━━━━━━━━━━━━━┓
┃ NullIf(sex, 'male') ┃
┡━━━━━━━━━━━━━━━━━━━━━┩
│ string              │
├─────────────────────┤
│ NULL                │
│ female              │
│ female              │
│ NULL                │
│ female              │
└─────────────────────┘

try_cast

try_cast(target_type)

Try cast expression to indicated data type.

If the cast fails for a row, the value is returned as null or NaN depending on target_type and backend behavior.

Parameters

Name Type Description Default
target_type Any Type to try cast to. Anything accepted by ibis.dtype() required

Returns

Name Type Description
Value Casted expression

See Also

Value.cast() ibis.dtype()

Examples

>>> import ibis
>>> from ibis import _
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"numbers": [1, 2, 3, 4], "strings": ["1.0", "2", "hello", "world"]})
>>> t
┏━━━━━━━━━┳━━━━━━━━━┓
┃ numbers  strings ┃
┡━━━━━━━━━╇━━━━━━━━━┩
│ int64string  │
├─────────┼─────────┤
│       11.0     │
│       22       │
│       3hello   │
│       4world   │
└─────────┴─────────┘
>>> t = t.mutate(numbers_to_strings=_.numbers.try_cast("string"))
>>> t = t.mutate(strings_to_numbers=_.strings.try_cast("int"))
>>> t
┏━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┓
┃ numbers  strings  numbers_to_strings  strings_to_numbers ┃
┡━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━┩
│ int64stringstringint64              │
├─────────┼─────────┼────────────────────┼────────────────────┤
│       11.0    1                 1 │
│       22      2                 2 │
│       3hello  3                 NULL │
│       4world  4                 NULL │
└─────────┴─────────┴────────────────────┴────────────────────┘

Scalar

Scalar(self, arg)

Methods

Name Description
as_table Promote the scalar expression to a table.

as_table

as_table()

Promote the scalar expression to a table.

Returns

Name Type Description
Table A table expression

Examples

Promote an aggregation to a table

>>> import ibis
>>> import ibis.expr.types as ir
>>> t = ibis.table(dict(a="str"), name="t")
>>> expr = t.a.length().sum().name("len").as_table()
>>> isinstance(expr, ir.Table)
True

Promote a literal value to a table

>>> import ibis.expr.types as ir
>>> lit = ibis.literal(1).name("a").as_table()
>>> isinstance(lit, ir.Table)
True

Column

Column(self, arg)

Methods

Name Description
approx_median Return an approximate of the median of self.
approx_nunique Return the approximate number of distinct elements in self.
arbitrary Select an arbitrary value in a column.
count Compute the number of rows in an expression.
first Return the first value of a column.
lag Return the row located at offset rows before the current row.
last Return the last value of a column.
lead Return the row located at offset rows after the current row.
max Return the maximum of a column.
median Return the median of the column.
min Return the minimum of a column.
nth Return the nth value (0-indexed) over a window.
nunique Compute the number of distinct rows in an expression.

approx_median

approx_median(where=None)

Return an approximate of the median of self.

The result may or may not be exact

Whether the result is an approximation depends on the backend.

Do not depend on the results being exact

Parameters

Name Type Description Default
where ir.BooleanValue | None Filter in values when where is True None

Returns

Name Type Description
Scalar An approximation of the median of self

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = letsql.examples.penguins.fetch()
>>> t.body_mass_g.approx_median()

┌─────────────┐
│ 4026.488095 │
└─────────────┘
>>> t.body_mass_g.approx_median(where=t.species == "Chinstrap")

┌────────┐
│ 3700.0 │
└────────┘

approx_nunique

approx_nunique(where=None)

Return the approximate number of distinct elements in self.

The result may or may not be exact

Whether the result is an approximation depends on the backend.

Do not depend on the results being exact

Parameters

Name Type Description Default
where ir.BooleanValue | None Filter in values when where is True None

Returns

Name Type Description
Scalar An approximate count of the distinct elements of self

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = letsql.examples.penguins.fetch()
>>> t.body_mass_g.approx_nunique()

Exception: This feature is not implemented: Support for 'approx_distinct' for data type Float64 is not implemented
>>> t.body_mass_g.approx_nunique(where=t.species == "Adelie")

Exception: This feature is not implemented: Support for 'approx_distinct' for data type Float64 is not implemented

arbitrary

arbitrary(where=None, how=None)

Select an arbitrary value in a column.

Returns an arbitrary (nondeterministic, backend-specific) value from the column. The value will be non-NULL, except if the column is empty or all values are NULL.

Parameters

Name Type Description Default
where ir.BooleanValue | None A filter expression None
how Any DEPRECATED None

Returns

Name Type Description
Scalar An expression

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"a": [1, 2, 2], "b": list("aaa"), "c": [4.0, 4.1, 4.2]})
>>> t
┏━━━━━━━┳━━━━━━━━┳━━━━━━━━━┓
┃ a      b       c       ┃
┡━━━━━━━╇━━━━━━━━╇━━━━━━━━━┩
│ int64stringfloat64 │
├───────┼────────┼─────────┤
│     1a     4.0 │
│     2a     4.1 │
│     2a     4.2 │
└───────┴────────┴─────────┘
>>> t.group_by("a").agg(arb=t.b.arbitrary(), c=t.c.sum()).order_by("a")
┏━━━━━━━┳━━━━━━━━┳━━━━━━━━━┓
┃ a      arb     c       ┃
┡━━━━━━━╇━━━━━━━━╇━━━━━━━━━┩
│ int64stringfloat64 │
├───────┼────────┼─────────┤
│     1a     4.0 │
│     2a     8.3 │
└───────┴────────┴─────────┘

count

count(where=None)

Compute the number of rows in an expression.

Parameters

Name Type Description Default
where ir.BooleanValue | None Filter expression None

Returns

Name Type Description
IntegerScalar Number of elements in an expression

first

first(where=None, order_by=None, include_null=False)

Return the first value of a column.

Parameters

Name Type Description Default
where ir.BooleanValue | None An optional filter expression. If provided, only rows where where is True will be included in the aggregate. None
order_by Any An ordering key (or keys) to use to order the rows before aggregating. If not provided, the meaning of first is undefined and will be backend specific. None
include_null bool Whether to include null values when performing this aggregation. Set to True to include nulls in the result. False

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"chars": ["a", "b", "c", "d"]})
>>> t
┏━━━━━━━━┓
┃ chars  ┃
┡━━━━━━━━┩
│ string │
├────────┤
│ a      │
│ b      │
│ c      │
│ d      │
└────────┘
>>> t.chars.first()

┌───┐
│ a │
└───┘
>>> t.chars.first(where=t.chars != "a")

┌───┐
│ b │
└───┘

lag

lag(offset=None, default=None)

Return the row located at offset rows before the current row.

Parameters

Name Type Description Default
offset int | ir.IntegerValue | None Index of row to select None
default Value | None Value used if no row exists at offset None

last

last(where=None, order_by=None, include_null=False)

Return the last value of a column.

Parameters

Name Type Description Default
where ir.BooleanValue | None An optional filter expression. If provided, only rows where where is True will be included in the aggregate. None
order_by Any An ordering key (or keys) to use to order the rows before aggregating. If not provided, the meaning of last is undefined and will be backend specific. None
include_null bool Whether to include null values when performing this aggregation. Set to True to include nulls in the result. False

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"chars": ["a", "b", "c", "d"]})
>>> t
┏━━━━━━━━┓
┃ chars  ┃
┡━━━━━━━━┩
│ string │
├────────┤
│ a      │
│ b      │
│ c      │
│ d      │
└────────┘
>>> t.chars.last()

┌───┐
│ d │
└───┘
>>> t.chars.last(where=t.chars != "d")

┌───┐
│ c │
└───┘

lead

lead(offset=None, default=None)

Return the row located at offset rows after the current row.

Parameters

Name Type Description Default
offset int | ir.IntegerValue | None Index of row to select None
default Value | None Value used if no row exists at offset None

max

max(where=None)

Return the maximum of a column.

Parameters

Name Type Description Default
where ir.BooleanValue | None Filter in values when where is True None

Returns

Name Type Description
Scalar The maximum value in self

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = letsql.examples.penguins.fetch()
>>> t.body_mass_g.max()

┌────────┐
│ 6300.0 │
└────────┘
>>> t.body_mass_g.max(where=t.species == "Chinstrap")

┌────────┐
│ 4800.0 │
└────────┘

median

median(where=None)

Return the median of the column.

Parameters

Name Type Description Default
where ir.BooleanValue | None Optional boolean expression. If given, only the values where where evaluates to true will be considered for the median. None

Returns

Name Type Description
Scalar Median of the column

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = letsql.examples.penguins.fetch()

Compute the median of bill_depth_mm

>>> t.bill_depth_mm.median()

┌──────┐
│ 17.3 │
└──────┘
>>> t.group_by(t.species).agg(median_bill_depth=t.bill_depth_mm.median()).order_by(
...     ibis.desc("median_bill_depth")
... )
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
┃ species    median_bill_depth ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
│ stringfloat64           │
├───────────┼───────────────────┤
│ Chinstrap18.45 │
│ Adelie   18.40 │
│ Gentoo   15.00 │
└───────────┴───────────────────┘

In addition to numeric types, any orderable non-numeric types such as strings and dates work with median.

>>> t.group_by(t.island).agg(median_species=t.species.median()).order_by(
...     ibis.desc("median_species")
... )

Exception: External error: External error: This feature is not implemented: MedianAccumulator not supported for median(t0.species) with Utf8

min

min(where=None)

Return the minimum of a column.

Parameters

Name Type Description Default
where ir.BooleanValue | None Filter in values when where is True None

Returns

Name Type Description
Scalar The minimum value in self

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = letsql.examples.penguins.fetch()
>>> t.body_mass_g.min()

┌────────┐
│ 2700.0 │
└────────┘
>>> t.body_mass_g.min(where=t.species == "Adelie")

┌────────┐
│ 2850.0 │
└────────┘

nth

nth(n)

Return the nth value (0-indexed) over a window.

.nth(0) is equivalent to .first(). Negative will result in NULL. If the value of n is greater than the number of rows in the window, NULL will be returned.

Parameters

Name Type Description Default
n int | ir.IntegerValue Desired rank value required

Returns

Name Type Description
Column The nth value over a window

nunique

nunique(where=None)

Compute the number of distinct rows in an expression.

Parameters

Name Type Description Default
where ir.BooleanValue | None Filter expression None

Returns

Name Type Description
IntegerScalar Number of distinct elements in an expression

Examples

>>> import ibis
>>> ibis.options.interactive = True
>>> t = letsql.examples.penguins.fetch()
>>> t.body_mass_g.nunique()

┌────┐
│ 94 │
└────┘
>>> t.body_mass_g.nunique(where=t.species == "Adelie")

┌────┐
│ 55 │
└────┘