LENGTH function
LENGTH
returns the code points in
an encoded string.
Signatures
Parameter | Type | Description |
---|---|---|
str | string or bytea |
The string whose length you want. |
encoding_name | string |
The encoding you want to use for calculating the string’s length. Defaults to UTF-8. |
Return value
length
returns an int
.
Details
Errors
length
operations might return NULL
values indicating errors in the
following cases:
- The encoding_name provided is not available in our encoding package.
- Some byte sequence in str was not compatible with the selected encoding.
Fixed-width strings
Materialize returns the length of fixed-width strings as the maximum width of
the string. For example length
on a CHAR(15)
column returns 15
as each
string’s length.
Materialize receives strings from your database in the same format they are
emitted. In the case of fixed-width strings, e.g. CHAR
columns in PostgreSQL,
we receive the value padded by empty spaces. Because we cannot determine whether
those spaces were intentional or an artifact of a fixed-width string, we provide
the length of the string as we received it.
You can find any updates on this behavior in this GitHub issue.
Encoding details
-
Materialize uses the
encoding
crate. See the list of supported encodings, as well as their names within the API. -
Materialize attempts to convert PostgreSQL-style encoding names into the WHATWG-style encoding names used by the API.
For example, you can refer to
iso-8859-5
(WHATWG-style) asISO_8859_5
(PostrgreSQL-style).However, there are some differences in the names of the same encodings that we do not convert. For example, the windows-874 encoding is referred to as
WIN874
in PostgreSQL; Materialize does not convert these names.
Examples
SELECT length('你好') AS len;
len
-----
2
SELECT length('你好', 'big5') AS len;
len
-----
3