All string operations are valid for both scalars and columns.
StringValue
Methods
ascii_str
Return the numeric ASCII code of the first character of a string.
authority
Parse a URL and extract authority.
capitalize
Uppercase the first letter, lowercase the rest.
concat
Concatenate strings.
contains
Return whether the expression contains substr.
endswith
Determine if self ends with end.
find
Return the position of the first occurrence of substring.
find_in_set
Find the first occurrence of str_list within a list of strings.
fragment
Parse a URL and extract fragment identifier.
host
Parse a URL and extract host.
length
Compute the length of a string.
levenshtein
Return the Levenshtein distance between two strings.
lower
Convert string to all lowercase.
lpad
Pad arg by truncating on the right or padding on the left.
lstrip
Remove whitespace from the left side of string.
path
Parse a URL and extract path.
protocol
Parse a URL and extract protocol.
query
Parse a URL and return the query string or a query string parameter.
re_extract
Return the specified match at index from a regex pattern.
re_replace
Replace all matches found by regex pattern with replacement.
re_search
Return whether the values match pattern.
re_split
Split a string by a regular expression pattern.
repeat
Repeat a string n times.
replace
Replace each exact match of pattern with replacement.
reverse
Reverse the characters of a string.
right
Return up to nchars from the end of each string.
rpad
Pad self by truncating or padding on the right.
rstrip
Remove whitespace from the right side of string.
split
Split a string on delimiter.
startswith
Determine whether self starts with start.
strip
Remove whitespace from left and right sides of a string.
substr
Extract a substring.
to_date
translate
Replace from_str characters in self with characters in to_str.
upper
Convert string to all uppercase.
userinfo
Parse a URL and extract user info.
ascii_str
Return the numeric ASCII code of the first character of a string.
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"s": ["abc", "def", "ghi"]})
>>> t.s.ascii_str()
┏━━━━━━━━━━━━━━━━┓
┃ StringAscii(s) ┃
┡━━━━━━━━━━━━━━━━┩
│ int32 │
├────────────────┤
│ 97 │
│ 100 │
│ 103 │
└────────────────┘
authority
Parse a URL and extract authority.
Examples
>>> import ibis
>>> url = ibis.literal("https://user:pass@example.com:80/docs/books")
>>> result = url.authority() # user:pass@example.com:80
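To see which part of a URL counts as the authority, the standard library's urllib.parse gives a rough pure-Python analogue. This is only an illustration of the URL components, not how ibis or any backend implements authority(); backends may treat edge cases differently.

```python
from urllib.parse import urlsplit

# Pure-Python illustration of URL components (not the ibis implementation).
parts = urlsplit("https://user:pass@example.com:80/docs/books")

authority = parts.netloc  # user info, host, and port together
userinfo = f"{parts.username}:{parts.password}"
host = parts.hostname
protocol = parts.scheme

print(authority)  # user:pass@example.com:80
```

The authority is the union of the pieces returned separately by userinfo and host, plus the port.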
capitalize
Uppercase the first letter, lowercase the rest.
This API matches the semantics of the Python method.
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"s": ["aBC", " abc", "ab cd", None]})
>>> t.s.capitalize()
┏━━━━━━━━━━━━━━━┓
┃ Capitalize(s) ┃
┡━━━━━━━━━━━━━━━┩
│ string │
├───────────────┤
│ Abc │
│ abc │
│ Ab cd │
│ NULL │
└───────────────┘
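Since the description says the semantics match the Python method, the same inputs can be checked against str.capitalize directly. A minimal sketch, with None standing in for NULL (an assumption of this example, not something ibis does in Python):

```python
# Plain-Python counterpart of the example above; None stands in for NULL.
values = ["aBC", " abc", "ab cd", None]
capitalized = [v.capitalize() if v is not None else None for v in values]

print(capitalized)  # ['Abc', ' abc', 'Ab cd', None]
```

Note that a leading space counts as the first character, so " abc" is left without an uppercase letter.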
concat
Concatenate strings.
NULLs are propagated. This method is equivalent to using the + operator.
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"s": ["abc", None]})
>>> t.s.concat("xyz", "123")
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ StringConcat((s, 'xyz', '123')) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ string │
├─────────────────────────────────┤
│ abcxyz123 │
│ NULL │
└─────────────────────────────────┘
>>> t.s + "xyz"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ StringConcat((s, 'xyz')) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ string │
├──────────────────────────┤
│ abcxyz │
│ NULL │
└──────────────────────────┘
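NULL propagation means that a single NULL operand makes the whole result NULL. A minimal pure-Python sketch of that rule, where the helper concat and the use of None for NULL are assumptions made for illustration:

```python
from typing import Optional


def concat(*parts: Optional[str]) -> Optional[str]:
    """Join parts, returning None if any part is None (NULL propagation)."""
    if any(p is None for p in parts):
        return None
    return "".join(parts)


print(concat("abc", "xyz", "123"))  # abcxyz123
print(concat(None, "xyz", "123"))   # None
```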
contains
Return whether the expression contains substr.
Returns
BooleanValue
Boolean indicating the presence of substr in the expression
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"s": ["bab", "ddd", "eaf"]})
>>> t.s.contains("a")
┏━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ StringContains(s, 'a') ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━┩
│ boolean │
├────────────────────────┤
│ True │
│ False │
│ True │
└────────────────────────┘
endswith
Determine if self ends with end.
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"s": ["Ibis project", "GitHub"]})
>>> t.s.endswith("project")
┏━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ EndsWith(s, 'project') ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━┩
│ boolean │
├────────────────────────┤
│ True │
│ False │
└────────────────────────┘
find
find(substr, start=None, end=None)
Return the position of the first occurrence of substring.
Parameters
substr
str | StringValue
Substring to search for
required
start
int | ir.IntegerValue | None
Zero-based index of where to start the search
None
end
int | ir.IntegerValue | None
Zero-based index of where to stop the search. Currently not implemented.
None
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"s": ["abc", "bac", "bca"]})
>>> t.s.find("a")
┏━━━━━━━━━━━━━━━━━━━━┓
┃ StringFind(s, 'a') ┃
┡━━━━━━━━━━━━━━━━━━━━┩
│ int64 │
├────────────────────┤
│ 0 │
│ 1 │
│ 2 │
└────────────────────┘
>>> t.s.find("z")
┏━━━━━━━━━━━━━━━━━━━━┓
┃ StringFind(s, 'z') ┃
┡━━━━━━━━━━━━━━━━━━━━┩
│ int64 │
├────────────────────┤
│ -1 │
│ -1 │
│ -1 │
└────────────────────┘
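The results above follow the same convention as Python's str.find: positions are zero-based and -1 means "not found". A small sketch using the same inputs, under the assumption that start behaves like Python's start argument (backends may differ):

```python
values = ["abc", "bac", "bca"]

found_a = [v.find("a") for v in values]
found_z = [v.find("z") for v in values]
# With start=1, a match that begins at index 0 is skipped.
found_from_1 = "abca".find("a", 1)

print(found_a)       # [0, 1, 2]
print(found_z)       # [-1, -1, -1]
print(found_from_1)  # 3
```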
find_in_set
Find the first occurrence of str_list within a list of strings.
No string in str_list can have a comma.
Returns
IntegerValue
Position of str_list in self. Returns -1 if self isn't found or if self contains ','.
Examples
>>> import ibis
>>> table = ibis.table(dict(string_col="string"))
>>> result = table.string_col.find_in_set(["a", "b"])
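The example above builds an expression without showing a result. The return rules can be sketched in plain Python as follows. The helper name find_in_set and the zero-based position are assumptions of this sketch; the text above only states that -1 is returned when the value is missing or contains a comma.

```python
from typing import Sequence


def find_in_set(needle: str, str_list: Sequence[str]) -> int:
    """Return the position of needle in str_list, or -1 (hypothetical helper).

    Assumes zero-based positions; a needle containing ',' never matches.
    """
    if "," in needle:
        return -1
    try:
        return list(str_list).index(needle)
    except ValueError:
        return -1


print(find_in_set("b", ["a", "b"]))    # 1
print(find_in_set("c", ["a", "b"]))    # -1
print(find_in_set("a,b", ["a", "b"]))  # -1
```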
fragment
Parse a URL and extract fragment identifier.
Examples
>>> import ibis
>>> url = ibis.literal("https://example.com:80/docs/#DOWNLOADING")
>>> result = url.fragment() # DOWNLOADING
host
Parse a URL and extract host.
Examples
>>> import ibis
>>> url = ibis.literal("https://user:pass@example.com:80/docs/books")
>>> result = url.host() # example.com
length
Compute the length of a string.
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"s": ["aaa", "a", "aa"]})
>>> t.s.length()
┏━━━━━━━━━━━━━━━━━┓
┃ StringLength(s) ┃
┡━━━━━━━━━━━━━━━━━┩
│ int32 │
├─────────────────┤
│ 3 │
│ 1 │
│ 2 │
└─────────────────┘
levenshtein
Return the Levenshtein distance between two strings.
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> s = ibis.literal("kitten")
>>> s.levenshtein("sitting")
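The Levenshtein distance is the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other. A minimal pure-Python sketch of the textbook dynamic-programming algorithm, for reference only; it is not how ibis or its backends compute the value:

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b, keeping only two rows of the DP table."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(
                min(
                    previous[j] + 1,         # delete ca
                    current[j - 1] + 1,      # insert cb
                    previous[j - 1] + cost,  # substitute ca with cb
                )
            )
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

For "kitten" and "sitting" the three edits are k→s, e→i, and appending g.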
lower
Convert string to all lowercase.
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"s": ["AAA", "a", "AA"]})
>>> t
┏━━━━━━━━┓
┃ s ┃
┡━━━━━━━━┩
│ string │
├────────┤
│ AAA │
│ a │
│ AA │
└────────┘
>>> t.s.lower()
┏━━━━━━━━━━━━━━┓
┃ Lowercase(s) ┃
┡━━━━━━━━━━━━━━┩
│ string │
├──────────────┤
│ aaa │
│ a │
│ aa │
└──────────────┘
lpad
Pad arg by truncating on the right or padding on the left.
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"s": ["abc", "def", "ghij"]})
>>> t.s.lpad(5, "-")
┏━━━━━━━━━━━━━━━━━┓
┃ LPad(s, 5, '-') ┃
┡━━━━━━━━━━━━━━━━━┩
│ string │
├─────────────────┤
│ --abc │
│ --def │
│ -ghij │
└─────────────────┘
lstrip
Remove whitespace from the left side of string.
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"s": [" \t a \t ", " \n b \n ", " \v c \t "]})
>>> t
┏━━━━━━━━┓
┃ s ┃
┡━━━━━━━━┩
│ string │
├────────┤
│ \t a \t │
│ \n b \n │
│ \v c \t │
└────────┘
>>> t.s.lstrip()
┏━━━━━━━━━━━┓
┃ LStrip(s) ┃
┡━━━━━━━━━━━┩
│ string │
├───────────┤
│ a \t │
│ b \n │
│ c \t │
└───────────┘
path
Parse a URL and extract path.
Examples
>>> import ibis
>>> url = ibis.literal(
... "https://example.com:80/docs/books/tutorial/index.html?name=networking"
... )
>>> result = url.path() # docs/books/tutorial/index.html
protocol
Parse a URL and extract protocol.
Examples
>>> import ibis
>>> url = ibis.literal("https://user:pass@example.com:80/docs/books")
>>> result = url.protocol() # https
query
Parse a URL and return the query string or a query string parameter.
If key is passed, return the value of the query string parameter named key. If key is absent, return the full query string.
Examples
>>> import ibis
>>> url = ibis.literal(
... "https://example.com:80/docs/books/tutorial/index.html?name=networking"
... )
>>> result = url.query() # name=networking
>>> query_name = url.query("name") # networking
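The two modes (whole query string versus one named parameter) have a close standard-library analogue in urllib.parse. A minimal sketch for illustration; it does not reflect how any ibis backend parses URLs:

```python
from urllib.parse import parse_qs, urlsplit

url = "https://example.com:80/docs/books/tutorial/index.html?name=networking"

# No key: the whole query string.
query_string = urlsplit(url).query
# With a key: the value of that parameter (parse_qs returns a list per key).
name = parse_qs(query_string)["name"][0]

print(query_string)  # name=networking
print(name)          # networking
```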
re_replace
re_replace(pattern, replacement)
Replace all matches found by regex pattern with replacement.
Parameters
pattern
str | StringValue
Regular expression string
required
replacement
str | StringValue
Replacement string or regular expression
required
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"s": ["abc", "bac", "bca", "this has multi \t whitespace"]})
>>> s = t.s
Replace all “a”s that are at the beginning of the string with “b”:
>>> s.re_replace("^a", "b")
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ RegexReplace(s, '^a', 'b') ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ string │
├───────────────────────────────┤
│ bbc │
│ bac │
│ bca │
│ this has multi \t whitespace │
└───────────────────────────────┘
Double up any “a”s or “b”s, using capture groups and backreferences:
>>> s.re_replace("([ab])", r"\0\0")
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ RegexReplace(s, '([ab])', '\\0\\0') ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ string │
├─────────────────────────────────────┤
│ aabbc │
│ bbaac │
│ bbcaa │
│ this haas multi \t whitespaace │
└─────────────────────────────────────┘
Normalize all whitespace to a single space:
>>> s.re_replace(r"\s+", " ")
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ RegexReplace(s, '\\s+', ' ') ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ string │
├──────────────────────────────┤
│ abc │
│ bac │
│ bca │
│ this has multi whitespace │
└──────────────────────────────┘
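For comparison, the same three replacements can be written with Python's re module. This is a sketch only: regex dialects vary between backends, and Python spells the whole-match backreference \g<0> rather than the \0 used in the example above.

```python
import re

values = ["abc", "bac", "bca", "this has multi \t whitespace"]

# Replace an "a" at the beginning of the string with "b".
anchored = [re.sub(r"^a", "b", v) for v in values]
# Double every "a" or "b"; \g<0> refers to the whole match in Python.
doubled = [re.sub(r"([ab])", r"\g<0>\g<0>", v) for v in values]
# Collapse each run of whitespace into a single space.
normalized = [re.sub(r"\s+", " ", v) for v in values]

print(anchored[0])    # bbc
print(doubled[1])     # bbaac
print(normalized[3])  # this has multi whitespace
```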
re_search
Return whether the values match pattern.
Returns True if the regex matches a string and False otherwise.
Parameters
pattern
str | StringValue
Regular expression to use for searching
required
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"s": ["Ibis project", "GitHub"]})
>>> t.s.re_search(".+Hub")
┏━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ RegexSearch(s, '.+Hub') ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ boolean │
├─────────────────────────┤
│ False │
│ True │
└─────────────────────────┘
re_split
Split a string by a regular expression pattern.
Parameters
pattern
str | StringValue
Regular expression string to split by
required
Returns
ArrayValue
Array of strings from splitting by pattern
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable(dict(s=["a.b", "b.....c", "c.........a", "def"]))
>>> t.s
┏━━━━━━━━━━━━━┓
┃ s ┃
┡━━━━━━━━━━━━━┩
│ string │
├─────────────┤
│ a.b │
│ b.....c │
│ c.........a │
│ def │
└─────────────┘
>>> t.s.re_split(r"\.+").name("splits")
┏━━━━━━━━━━━━━━━┓
┃ splits ┃
┡━━━━━━━━━━━━━━━┩
│ array<string> │
├───────────────┤
│ [ 'a' , 'b' ] │
│ [ 'b' , 'c' ] │
│ [ 'c' , 'a' ] │
│ [ 'def' ] │
└───────────────┘
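The pattern \.+ treats any run of dots as one separator, which is why "b.....c" yields two elements rather than six. The same behaviour can be checked with Python's re.split; this is a sketch for illustration, and backend regex engines may differ in corner cases:

```python
import re

values = ["a.b", "b.....c", "c.........a", "def"]

# One or more literal dots act as a single delimiter.
splits = [re.split(r"\.+", v) for v in values]

print(splits)  # [['a', 'b'], ['b', 'c'], ['c', 'a'], ['def']]
```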
repeat
Repeat a string n times.
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"s": ["a", "bb", "c"]})
>>> t.s.repeat(5)
┏━━━━━━━━━━━━━━┓
┃ Repeat(s, 5) ┃
┡━━━━━━━━━━━━━━┩
│ string │
├──────────────┤
│ aaaaa │
│ bbbbbbbbbb │
│ ccccc │
└──────────────┘
replace
replace(pattern, replacement)
Replace each exact match of pattern with replacement.
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"s": ["abc", "bac", "bca"]})
>>> t.s.replace("b", "z")
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ StringReplace(s, 'b', 'z') ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ string │
├────────────────────────────┤
│ azc │
│ zac │
│ zca │
└────────────────────────────┘
reverse
Reverse the characters of a string.
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"s": ["abc", "def", "ghi"]})
>>> t
┏━━━━━━━━┓
┃ s ┃
┡━━━━━━━━┩
│ string │
├────────┤
│ abc │
│ def │
│ ghi │
└────────┘
>>> t.s.reverse()
┏━━━━━━━━━━━━┓
┃ Reverse(s) ┃
┡━━━━━━━━━━━━┩
│ string │
├────────────┤
│ cba │
│ fed │
│ ihg │
└────────────┘
right
Return up to nchars from the end of each string.
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"s": ["abc", "defg", "hijlk"]})
>>> t.s.right(2)
┏━━━━━━━━━━━━━━━━┓
┃ StrRight(s, 2) ┃
┡━━━━━━━━━━━━━━━━┩
│ string │
├────────────────┤
│ bc │
│ fg │
│ lk │
└────────────────┘
rpad
Pad self by truncating or padding on the right.
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"s": ["abc", "def", "ghij"]})
>>> t.s.rpad(5, "-")
┏━━━━━━━━━━━━━━━━━┓
┃ RPad(s, 5, '-') ┃
┡━━━━━━━━━━━━━━━━━┩
│ string │
├─────────────────┤
│ abc-- │
│ def-- │
│ ghij- │
└─────────────────┘
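The examples for lpad and rpad only show padding; the truncating case mentioned in the descriptions can be sketched in plain Python. The helpers below are hypothetical, and the default pad of a single space and the exact truncation behaviour are assumptions that may not hold on every backend.

```python
def lpad(s: str, length: int, pad: str = " ") -> str:
    """Pad on the left to length, or truncate on the right if s is too long."""
    if len(s) >= length:
        return s[:length]
    missing = length - len(s)
    return (pad * missing)[:missing] + s


def rpad(s: str, length: int, pad: str = " ") -> str:
    """Pad on the right to length, or truncate on the right if s is too long."""
    if len(s) >= length:
        return s[:length]
    missing = length - len(s)
    return s + (pad * missing)[:missing]


print(lpad("ghij", 5, "-"))     # -ghij
print(rpad("ghij", 5, "-"))     # ghij-
print(lpad("abcdefg", 5, "-"))  # abcde
```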
rstrip
Remove whitespace from the right side of string.
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"s": [" \t a \t ", " \n b \n ", " \v c \t "]})
>>> t
┏━━━━━━━━┓
┃ s ┃
┡━━━━━━━━┩
│ string │
├────────┤
│ \t a \t │
│ \n b \n │
│ \v c \t │
└────────┘
>>> t.s.rstrip()
┏━━━━━━━━━━━┓
┃ RStrip(s) ┃
┡━━━━━━━━━━━┩
│ string │
├───────────┤
│ \t a │
│ \n b │
│ \v c │
└───────────┘
split
Split a string on delimiter.
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"col": ["a,b,c", "d,e", "f"]})
>>> t
┏━━━━━━━━┓
┃ col ┃
┡━━━━━━━━┩
│ string │
├────────┤
│ a,b,c │
│ d,e │
│ f │
└────────┘
>>> t.col.split(",")
┏━━━━━━━━━━━━━━━━━━━━━━━┓
┃ StringSplit(col, ',') ┃
┡━━━━━━━━━━━━━━━━━━━━━━━┩
│ array<string> │
├───────────────────────┤
│ [ 'a' , 'b' , ... +1 ] │
│ [ 'd' , 'e' ] │
│ [ 'f' ] │
└───────────────────────┘
startswith
Determine whether self starts with start.
Returns
BooleanValue
Boolean indicating whether self starts with start
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"s": ["Ibis project", "GitHub"]})
>>> t.s.startswith("Ibis")
┏━━━━━━━━━━━━━━━━━━━━━━━┓
┃ StartsWith(s, 'Ibis') ┃
┡━━━━━━━━━━━━━━━━━━━━━━━┩
│ boolean │
├───────────────────────┤
│ True │
│ False │
└───────────────────────┘
strip
Remove whitespace from left and right sides of a string.
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"s": [" \t a \t ", " \n b \n ", " \v c \t "]})
>>> t
┏━━━━━━━━┓
┃ s ┃
┡━━━━━━━━┩
│ string │
├────────┤
│ \t a \t │
│ \n b \n │
│ \v c \t │
└────────┘
>>> t.s.strip()
┏━━━━━━━━━━┓
┃ Strip(s) ┃
┡━━━━━━━━━━┩
│ string │
├──────────┤
│ a │
│ b │
│ c │
└──────────┘
substr
substr(start, length=None)
Extract a substring.
Parameters
start
int | ir.IntegerValue
First character to start splitting, indices start at 0
required
length
int | ir.IntegerValue | None
Maximum length of each substring. If not supplied, the substring extends to the end of the string
None
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"s": ["abc", "defg", "hijlk"]})
>>> t.s.substr(2)
┏━━━━━━━━━━━━━━━━━┓
┃ Substring(s, 2) ┃
┡━━━━━━━━━━━━━━━━━┩
│ string │
├─────────────────┤
│ c │
│ fg │
│ jlk │
└─────────────────┘
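The example above omits length. In plain Python, the zero-based start and optional length map onto slicing, as in the sketch below. The helper substr is hypothetical and assumes a non-negative start; negative or out-of-range arguments may behave differently on a given backend.

```python
from typing import Optional


def substr(s: str, start: int, length: Optional[int] = None) -> str:
    """Slice-based analogue: start is zero-based, length caps the result size."""
    if length is None:
        return s[start:]
    return s[start:start + length]


print(substr("hijlk", 2))     # jlk
print(substr("hijlk", 1, 2))  # ij
```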
translate
translate(from_str, to_str)
Replace from_str characters in self with characters in to_str.
To avoid unexpected behavior, from_str should be shorter than to_str.
Parameters
from_str
StringValue
Characters in arg to replace
required
to_str
StringValue
Characters to use for replacement
required
Examples
>>> import ibis
>>> table = ibis.table(dict(string_col="string"))
>>> result = table.string_col.translate("a", "b")
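Translation is character by character: the i-th character of from_str is replaced by the i-th character of to_str. A minimal pure-Python sketch of that mapping follows. The helper translate is hypothetical, and leaving unpaired from_str characters unchanged is an assumption of this sketch; backends may handle mismatched lengths differently.

```python
def translate(s: str, from_str: str, to_str: str) -> str:
    """Map each character of from_str to the character at the same index in to_str."""
    # zip stops at the shorter string, so unpaired characters are left as-is here.
    table = str.maketrans(dict(zip(from_str, to_str)))
    return s.translate(table)


print(translate("banana", "a", "b"))   # bbnbnb
print(translate("hello", "el", "ip"))  # hippo
```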
upper
Convert string to all uppercase.
Examples
>>> import ibis
>>> ibis.options.interactive = True
>>> t = ibis.memtable({"s": ["aaa", "A", "aa"]})
>>> t
┏━━━━━━━━┓
┃ s ┃
┡━━━━━━━━┩
│ string │
├────────┤
│ aaa │
│ A │
│ aa │
└────────┘
>>> t.s.upper()
┏━━━━━━━━━━━━━━┓
┃ Uppercase(s) ┃
┡━━━━━━━━━━━━━━┩
│ string │
├──────────────┤
│ AAA │
│ A │
│ AA │
└──────────────┘
userinfo
Parse a URL and extract user info.
Examples
>>> import ibis
>>> url = ibis.literal("https://user:pass@example.com:80/docs/books")
>>> result = url.userinfo() # user:pass