Changelog

Unicode string normalization

10.25.2025

Syntax

-- Default normalization (NFC)
SELECT normalize('café');

-- Explicit normalization form
SELECT normalize('café', NFC);   -- Canonical composition (default)
SELECT normalize('café', NFD);   -- Canonical decomposition
SELECT normalize('ﬁle', NFKC);  -- Compatibility composition (the ﬁ ligature becomes "fi")
SELECT normalize('ﬁle', NFKD);  -- Compatibility decomposition

For more details on Unicode normalization forms, see Unicode Standard Annex #15, "Unicode Normalization Forms".

Use cases

  • Text comparison: Ensure strings with different Unicode representations of the same characters compare as equal
  • Data cleaning: Standardize text input from various sources that may represent the same characters with different code point sequences (for example, precomposed versus decomposed accents)
  • Compatibility: Convert special characters like ligatures (ﬁ → fi) to their compatible forms
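
As a sketch of the text-comparison use case, the query below joins two tables on names normalized to the same form. The tables `customers_a` and `customers_b` and their `name` columns are hypothetical; only the `normalize` call comes from the syntax shown earlier.

```sql
-- Hypothetical tables: customers_a(name), customers_b(name).
-- Normalizing both sides to NFC lets 'é' stored as one code point (U+00E9)
-- match 'e' followed by a combining acute accent (U+0301).
SELECT a.name
FROM customers_a AS a
JOIN customers_b AS b
  ON normalize(a.name, NFC) = normalize(b.name, NFC);
```

If compatibility variants such as ligatures should also match, NFKC could be used on both sides instead of NFC.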

This feature is particularly useful when working with international data where the same visual character can have multiple Unicode representations.
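
One way to gauge how much of a column is affected is to look for values that change under normalization. This is a minimal sketch: the `products` table and its `id` and `title` columns are hypothetical, and it assumes that string comparison under your collation is sensitive to code point differences.

```sql
-- Hypothetical table: products(id, title).
-- Rows returned here hold text that is not already in NFC form.
SELECT id, title
FROM products
WHERE title <> normalize(title, NFC);
```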

For more information, see our official documentation.