Release notes
This page details changes between versions of Materialize, including:
- New features
- Major bug fixes
- Substantial API changes
For information about available versions, see our Versions page.
v0.26.6 LTS
Released on 20 March 2023.
- Correctly decode negative timestamps in Avro-formatted sources (#16609).
v0.26.5
Released on 7 October 2022.
- In Avro-formatted sinks, avoid emitting schemas with types that reference the null namespace (e.g., `.foo`) whenever possible (#14333). The Avro specification is ambiguous as to whether such references to the null namespace are allowed, and Avro libraries in other languages do not always handle them properly.
- Avoid crashing when a PostgreSQL or PubNub source produces a retraction and a query that depends on that source uses certain types of `DISTINCT` or `GROUP BY` clauses (#12783).
v0.26.4
Released on 30 June 2022.
- In Kafka sources that enable the `INCLUDE KEY` option in combination with `ENVELOPE NONE`, support ingesting messages whose key is null (#12275).
- Support Avro-formatted sources with Avro schemas containing arrays or maps with nullable element types (#8917).
v0.26.3
Released on 22 June 2022.
- Accept the special CORS origin `*` to allow cross-origin requests from any origin. See the `--allow-cors-origin` flag for details.
- Add the `dense_rank`, `lag`, `lead`, `first_value`, and `last_value` window functions (see the sketch after this list).
- Fix a crash when using the `DEBEZIUM UPSERT` envelope with composite keys whose order differs between the key and value (#12790).
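A minimal sketch of the new window functions; the `ticker` table and its `ts` and `price` columns are hypothetical:

```sql
SELECT ts,
       price,
       lag(price) OVER (ORDER BY ts) AS prev_price,          -- previous row's value
       first_value(price) OVER (ORDER BY ts) AS first_price  -- first row's value
FROM ticker;
```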
v0.26.2
Released on 23 May 2022.
- Restore CPU and memory Prometheus metrics in the monitoring dashboard.
v0.26.1
Released on 05 May 2022.
- Support configuring the `search_path` session parameter.
- Fix a bug that could cause dataflows to fail to reuse existing arrangements (#11793).
- Fix a bug that caused the memory usage visualization to fail due to a cross-origin resource sharing (CORS) misconfiguration (#11817).
- Avoid leaking memory when an index is dropped (#11949).
v0.26.0
Released on 13 April 2022.
- Breaking change. Use a restrictive CORS policy by default. The `--allow-cors-origin` flag can be used to override the default policy.
- Breaking change. Change the return type of the `list_length` function from `bigint` to `integer`.
- Optionally emit the message headers in Kafka sources via the new `INCLUDE HEADERS` option.
- Allow `ROW` literals to be used as arguments to the `coalesce` function and in `VALUES` clauses (#8597, #10422).
- Change most functions that accept polymorphic parameters to promote arguments to a "best common type." For example, when concatenating a `smallint list` to a `bigint list`, Materialize will now cast the `smallint list` to a `bigint list`, resulting in a concatenated list of type `bigint list` (see the sketch after this list).
- Add the `map_length` function, which computes the number of entries in a `map` value.
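A minimal sketch of the new type promotion behavior and `map_length`; the `LIST` constructor and `map` literal cast used here are assumed from Materialize's list and map syntax:

```sql
-- The smallint list is promoted to bigint list before concatenation,
-- so the result is a bigint list.
SELECT LIST[1::smallint] || LIST[2::bigint];

-- map_length counts the entries in a map value.
SELECT map_length('{a=>1, b=>2}'::map[text=>int]);
-- => 2
```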
v0.25.0
Released on 31 March 2022.
- Fix `base64` decoding to support numerals in encoded strings. Previously, if the encoded string contained a numeral, decoding would fail.
- Fix a panic that would occur if an object or type was over-qualified (i.e., the fully qualified object name consisted of more than three identifiers separated by a `.`).
v0.24.0
Released on 24 March 2022.
- Restore the documented behavior of the `--differential-idle-merge-effort` and `--timely-progress-mode` command line flags (#11256). Due to a bug, these parameters were silently ignored between v0.9.13 and v0.23.0.
- Fix a bug where a statement descriptor could change before execution, resulting in an error or a crash (#11214, #11258).
- Limit the views in the `pg_catalog` schema to reflect the state of only the current database (#11292). The new behavior matches PostgreSQL.
v0.23.0
Released on 18 March 2022.
- Breaking change. Change the default listen address to `127.0.0.1:6875`.

  Previously, Materialize would accept HTTP and SQL connections from any machine on the network by default; now it accepts HTTP and SQL connections from only the local machine by default. To return to the old behavior, specify the command line flag `--listen-addr=0.0.0.0:6875`.

  The `materialized` Docker image continues to use a listen address of `0.0.0.0:6875` by default.
- Improve PostgreSQL compatibility:
  - Add the `pg_collation`, `pg_inherits`, and `pg_policy` relations to the `pg_catalog` schema.
  - Add several columns to the existing `pg_attribute`, `pg_class`, `pg_index`, and `pg_type` relations in the `pg_catalog` schema.
  - Add the `pg_get_indexdef` function.
  - Change the `pg_get_constraintdef` function to always return `NULL` instead of raising an error.
  - Add a `USING <method>` clause to `CREATE INDEX`, with `arrangement` as the only valid method.

  Together these changes enable support for Apache Superset and the `\d <object>` command in the psql terminal.
- Support calling `date_trunc` with `interval` values (#9871). See the sketch after this list.
- Remove the mandatory default index on tables.
- Fix a crash when calling `array_to_string` with an empty array (#11073).
- Fix an error when calling `string_agg` with all `NULL` inputs (#11139).
- Include information about the experimental cluster feature in `SHOW INDEX` and `SHOW SINKS`.
- Improve recovery of Postgres sources when errors occur during initial data loading (#10938).
- Make `CREATE VIEWS` on a Postgres source resilient to changes to the upstream publication that are made after the source is created (#11083).
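A minimal sketch of `date_trunc` over an `interval` value; the output is shown assuming PostgreSQL-style interval formatting:

```sql
SELECT date_trunc('hour', INTERVAL '3 days 14:27:05');
-- => 3 days 14:00:00
```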
v0.22.0
Released on 03 March 2022.
- Breaking change. Standardize handling of the following unmaterializable functions (#10445):

  - `current_database`
  - `current_timestamp`
  - `current_role`
  - `current_schema`
  - `current_schemas`
  - `current_user`
  - `mz_cluster_id`
  - `mz_logical_timestamp`
  - `mz_uptime`
  - `mz_version`
  - `session_user`
  - `pg_backend_pid`
  - `pg_postmaster_start_time`
  - `version`

  Materialize now allows use of unmaterializable functions in views, but will refuse to create an index that directly or indirectly depends on an unmaterializable function. The one exception is `mz_logical_timestamp`, which can be used in limited contexts in a materialized view as a temporal filter.

  Previously `current_timestamp`, `mz_logical_timestamp`, and `mz_uptime` were incorrectly disallowed in unmaterialized views, while the remaining unmaterializable functions were incorrectly allowed in materialized views.
- Breaking change. Store days separately in `interval`. Unlike in previous versions, hours are not automatically converted to days. This means that an interval of 24 hours will not be equal to an interval of 1 day, you cannot subtract hours from days, and when ordering intervals, `d days > h hours` for all `d` and `h` (#10708). To force a conversion from hours to days, use the new `justify_hours` function (see the sketch after this list).
- Breaking change. Print all negative `interval` units as plural (e.g., `-1 days` will be printed instead of `-1 day`). This matches the behavior of PostgreSQL.
- Breaking change. Round the microsecond field of `interval` to 6 places before applying the given precision. For example, `INTERVAL '1.2345649' SECOND(5)` will be rounded to `00:00:01.23457`, not `00:00:01.23456`. This matches the behavior of PostgreSQL.
- Add several new time units to `interval` parsing: `yr`, `yrs`, `hr`, `hrs`, `min`, `mins`, `sec`, and `secs`. Thanks to external contributor @sunisdown.
- Add the `justify_days`, `justify_hours`, and `justify_interval` functions.
- Add support for named composite types. Unimplemented features are listed in the original issue (#10734).
- Change the range of the `oid` type from [-2^31, 2^31 - 1] to [0, 2^32 - 1] to match PostgreSQL.
- Change the claimed PostgreSQL version returned by the `version()` function to 9.5 to match the values of the `server_version` and `server_version_num` session parameters.
- In Kafka sources that use `INCLUDE KEY`, allow the key schema to be directly provided by the Confluent Schema Registry using the bare `FORMAT` syntax:

  ```sql
  CREATE SOURCE src
  FROM KAFKA BROKER '...' TOPIC '...'
  FORMAT AVRO USING CONFLUENT SCHEMA REGISTRY '...'
  INCLUDE KEY AS named;
  ```

  Previously, this required explicitly using the `KEY FORMAT ... VALUE FORMAT` syntax when also using the Confluent Schema Registry.
- Allow specifying the same command line flag multiple times. The last specification takes precedence. This matches the behavior of many standard Unix tools and is particularly useful for folks using `materialized` via Docker, as it allows overwriting the default `--log-file` option.
- Fix a panic that could occur if you rematerialized a source that had previously been materialized under a different name, e.g. via the following sequence of operations (#10904):

  ```sql
  CREATE SOURCE src ...;
  CREATE INDEX src_idx ON src ...;
  ALTER SOURCE src RENAME TO new_src;
  DROP INDEX src_idx;
  CREATE INDEX new_src_idx ON new_src ...;
  ```
- Fix a data loss bug in Postgres sources introduced in v0.20.0 (#10981).
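A minimal sketch of the new `interval` semantics and the `justify_hours` function:

```sql
-- 24 hours and 1 day are no longer equal...
SELECT INTERVAL '24 hours' = INTERVAL '1 day';
-- => false

-- ...but justify_hours explicitly converts hours to days.
SELECT justify_hours(INTERVAL '27 hours');
-- => 1 day 03:00:00
```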
v0.21.0
Released on 25 February 2022.
- Breaking change. Return an empty list for slice operations that yield no elements (e.g., when the beginning of the slice's range exceeds the length of the list); previously these operations returned `NULL` (#10557).
- Breaking change. Decrease the minimum `interval` value from `-2147483647 months -2147483647 days -2147483647 hours -59 minutes -59.999999 seconds` to `-2147483648 months -2147483648 days -2147483648 hours -59 minutes -59.999999 seconds` to match PostgreSQL (#10598).
- Support sequences of subscript operations on `array` values when indexing/accessing individual elements (as opposed to slicing/accessing ranges of elements) (#9815).
- Allow setting the `standard_conforming_strings` session parameter to its default value of `on` (#10691). Setting it to `off` remains unsupported.
- Allow setting the `client_min_messages` session parameter, which controls which messages are sent to the client based on the severity level (#10693).
- Improve the clarity of schema resolution errors generated by Avro-formatted sources (#8415).
- Add the `chr` function to convert a Unicode codepoint into a string.
- Change inclusive ranges of column indices in the plans generated by `EXPLAIN` to use `..=` instead of `..`.
- Support the `ARRAY(<subquery>)` expression for constructing an `array` from the result of a subquery (#10700). See the sketch after this list.
- In Kafka sources that use `ENVELOPE UPSERT`, fix renaming the key column via `INCLUDE KEY AS` (#10730).
- Return the correct number of columns when both a wildcard (`*`) and a table function appear in the `SELECT` list (#10363).
- Avoid panicking when negating certain intervals (#10729).
- Fix an issue in Avro-formatted sources where an invalid record could corrupt the next record, yielding either wrong results or a panic (#10767).
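A minimal sketch of `ARRAY(<subquery>)` construction and chained array subscripts:

```sql
-- Build an array from a subquery's results.
SELECT ARRAY(SELECT generate_series(1, 3));
-- => {1,2,3}

-- Index into a nested array with a sequence of subscripts.
SELECT (ARRAY[ARRAY[1, 2], ARRAY[3, 4]])[2][1];
-- => 3
```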
v0.20.0
Released on 14 February 2022.
- Breaking change. In Kafka sources and sinks, do not default the SSL parameters for the Confluent Schema Registry to the SSL parameters for the Kafka broker.

  SSL parameters for the Confluent Schema Registry must now always be provided explicitly, even when they are identical to the SSL parameters for the Kafka broker. See the Confluent Schema Registry options for details.

  Existing source definitions have been automatically migrated to account for the new behavior.
- Breaking change. Change the return type of the `extract` function from `float` to `numeric` (#9853). The new behavior matches PostgreSQL v14. See the sketch after this list.
- Breaking change. Return an error when `extract` is called with a `date` value but a time-related field (e.g., `SECOND`) (#9853). Previous versions of Materialize would incorrectly return `0` in these cases. The new behavior matches PostgreSQL. `date_part` still returns `0` in these cases, which matches the PostgreSQL behavior.
- Breaking change. Change the output of `format_type` to use SQL standard type names when possible, rather than PostgreSQL-specific type names.
- Support assuming AWS roles in S3 and Kinesis sources (#5895). See Specifying AWS credentials for details.
- Support arbitrary `SELECT` statements in `TAIL`. Previously, `TAIL` could only operate on sources, tables, and views.
- Add the inverse trigonometric functions `asin`, `asinh`, `acos`, `acosh`, `atan`, and `atanh`.
- Add the cryptography functions `md5`, `sha224`, `sha256`, `sha384`, and `sha512`.
- Add `microsecond`, `month`, `decade`, `century`, and `millennium` units to `interval` parsing using the PostgreSQL verbose format.
- Improve millisecond parsing for `interval` using the PostgreSQL verbose format (#6420).
- Follow PostgreSQL's type conversion rules for the relations involved in a `UNION`, `EXCEPT`, or `INTERSECT` operation (#3331).
- Correctly deduplicate Debezium-enveloped Kafka sources when the underlying Kafka topic has more than one partition (#10375). Previous versions of Materialize would lose data unless the `deduplication = 'full'` option was specified.
- Improve the performance of SQL `LIKE` expressions.
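A minimal sketch of the revised `extract` behavior:

```sql
-- Now returns a numeric value rather than a float.
SELECT extract(EPOCH FROM TIMESTAMP '2022-02-14 00:00:00');

-- Now an error, since a date has no time-related fields;
-- date_part still returns 0 here.
SELECT extract(SECOND FROM DATE '2022-02-14');
```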
v0.19.0
Released on 02 February 2022.
- Breaking change. Reject unknown `WITH` options in the `CONFLUENT SCHEMA REGISTRY` clause when creating a Kafka source or sink (#10129). Previously, unknown options were silently ignored. The new behavior matches how other clauses handle unknown `WITH` options.
- Breaking change. Fix interpretation of certain `interval` values involving time expressions with two components (e.g., `12:34`) (#7918). Previous versions of Materialize would assume the interval's head time unit to always be hours. The new behavior matches PostgreSQL.
- Breaking change. Drop support for the `consistency_topic` option when creating a source with `ENVELOPE DEBEZIUM`. This was an undocumented option that is no longer relevant.
- Fix planning of repeated constant expressions in a `GROUP BY` clause (#8302).
- Support calling multiple distinct table functions in the `SELECT` list, as long as those table functions are not nested inside other table functions (#9988).
v0.18.0
Released on 26 January 2022.
- Breaking change. Further improve consistency with PostgreSQL's column name inference rules:
  - When inferring a column name for a nested cast expression, prefer the name of the outermost cast rather than the innermost cast (#10167). Consider the following query:

    ```sql
    SELECT 'a'::int::text;
    ```

    This version of Materialize will infer the column name `text`, while previous versions of Materialize would incorrectly infer the name `int4`.
  - Infer the name `case` for `CASE` expressions unless column name inference on the `ELSE` expression produces a preferred name (see the sketch after this list).
- When creating Avro-formatted sinks, allow setting fullnames on the generated key and value schemas via the new `avro_key_fullname` and `avro_value_fullname` options (#8352).
- Detect and reject multiple materializations of sources that would silently lose data if materialized more than once (#8203). This enables safe use of unmaterialized PostgreSQL sources and S3 sources with SQS notifications enabled.
- Support `SHOW TIME ZONE` as an alias for `SHOW TIMEZONE` (#9908).
- Fix a bug in the `ILIKE` operator where matching against a `char` value did not take trailing spaces into account (#10076). The new behavior matches the behavior of the `LIKE` operator.
- Allow wildcards in `LIKE` patterns to match newline characters (#10077). The new behavior matches PostgreSQL.
- Fix parsing of nested empty `SELECT` statements, as in `SELECT * FROM (SELECT)` (#8723).
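A minimal sketch of the `CASE` name inference rule:

```sql
-- The ELSE branch produces no preferred name, so the column is named "case".
SELECT CASE WHEN true THEN 1 ELSE 2 END;
```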
v0.17.0
Released on 20 January 2022.
- Breaking change. Improve consistency with PostgreSQL's column name inference rules:
  - When inferring a column name for a cast expression, fall back to choosing the name of the target type. Consider the following query:

    ```sql
    SELECT 'a'::text;
    ```

    This version of Materialize will infer the column name `text`, while previous versions of Materialize would fall back to the default column name `?column?`.
  - When inferring a column name for a `boolean` or `interval` literal, fall back to choosing `bool` or `interval`, respectively.
- Support subscripting `jsonb` values to retrieve array elements or object values, as in:

  ```sql
  SELECT ('{"a": 1, "b": 2, "c": 3}'::jsonb)['b'];
  ```

  ```
   jsonb
  -------
   2
  ```
- In Kafka sources and sinks, support independent SSL configurations for the Confluent Schema Registry and the Kafka broker. See the new Confluent Schema Registry options for details.
- Fix a bug where using a `ROWS FROM` clause with an alias in a view would cause Materialize to fail to reboot (#10008).
- When initializing a PostgreSQL source, report an error if the configured publication does not exist (#9933). Previously, Materialize would silently import zero tables.
v0.16.0
Released on 13 January 2022.
- Breaking change. Return an error when `extract` is called with a `time` value but a date-related field (e.g., `YEAR`) (#9839). Previous versions of Materialize would incorrectly return `0` in these cases. The new behavior matches PostgreSQL.
- Breaking change. Disallow the string `'sNaN'` (in any casing) as a valid `numeric` value.
- Add the `array_remove` and `list_remove` functions.
- Support the special PostgreSQL syntax `SET NAMES` and `SET SCHEMA` for setting the `client_encoding` and `search_path` parameters, respectively.
- Fix a crash when a condition of a `CASE` statement evaluates to an error (#9995).
- Fix a crash in the optimizer when the branches of a `CASE` statement involved record types whose fields had differing nullability (#9931).
v0.15.0
Released on 05 January 2022.
- Breaking change. Disallow window functions outside of `SELECT` lists, `DISTINCT ON` clauses, and `ORDER BY` clauses (#9749). Window functions in other positions were never meant to be allowed and do not have sensible semantics, so there is no replacement for the old behavior.
- Improve timestamp selection when the first statement in a transaction does not reference any sources (#9751). This facilitates using Npgsql v6 to connect to Materialize.
- Permit passing the `fetch_message_max_bytes` librdkafka option to Kafka sources.
v0.14.0
Released on 29 December 2021.
- Breaking change. Disallow views with multiple unnamed columns (#9413).

  For example, this view is now rejected, as it has two columns without a name:

  ```sql
  CREATE VIEW view AS SELECT 1, 2;
  ```

  To make this view compatible with v0.14.0, adjust it to have at most one column without a name:

  ```sql
  CREATE VIEW view AS SELECT 1 AS named, 2;
  ```
- Breaking change. Change the internal representation of numbers in `jsonb` (#5919, #9669). Previously, JSON numbers were stored as either `int8` or `float8` values; now they are always stored as `numeric` values. The upshot is that the `jsonb` type has a wider range for integers but a smaller range for floats. We expect this to cause very little practical breakage.
- Breaking change. Don't consider join equivalences when determining whether a column is ungrouped. For example, Materialize now rejects this SQL query because `t1.a` does not appear in the `GROUP BY` clause:

  ```sql
  SELECT t1.a FROM t1 JOIN t2 ON t1.a = t2.a GROUP BY t2.a;
  ```

  Previous versions of Materialize permitted this query by noticing that the `JOIN` clause guaranteed that `t1.a` and `t2.a` were equivalent, but this behavior was incompatible with PostgreSQL.

  To fix this query, rewrite it to consistently refer to `t1.a` in the `GROUP BY` clause and the `SELECT` list:

  ```sql
  SELECT t1.a FROM t1 JOIN t2 ON t1.a = t2.a GROUP BY t1.a;
  ```
- Breaking change. When using an arbitrary expression in an `ORDER BY` or `DISTINCT ON` clause, only recognize references to input columns. Previously, Materialize would recognize references to output columns as well. See the column references section of the `SELECT` documentation for details. This change affects only obscure edge cases. We expect it to cause very little breakage in practice.
- Fix a bug that could cause wrong results in queries that used the `ROWS FROM` clause (#9686). The bug occurred if functions beyond the second function in the clause produced more rows than the first function in the clause.
- Support the `WITH ORDINALITY` modifier for function calls in the `FROM` clause (#8445). When present, the function produces an additional `bigint` column named `ordinality` that numbers the returned rows, starting with 1. See the sketch after this list.
- Support casts from `timestamp` and `timestamp with time zone` to `time`.
- Support casts from `smallint` and `bigint` to `oid`, as well as casts from `oid` to `bigint`.
- Improve PostgreSQL compatibility:
  - Add the `pg_catalog.pg_roles` view.
  - Add `pronamespace` and `proargdefaults` columns to the `pg_catalog.pg_proc` view.
  - Add a stub implementation of the `pg_backend_pid` function.
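A minimal sketch of the `WITH ORDINALITY` modifier:

```sql
SELECT * FROM generate_series(97, 99) WITH ORDINALITY;
--  generate_series | ordinality
-- -----------------+------------
--               97 |          1
--               98 |          2
--               99 |          3
```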
v0.13.0
Released on 21 December 2021.
- Allow join trees that mix `LATERAL` elements with `RIGHT` and `FULL` joins (#6875).
- Add the `generate_subscripts` function, which generates the valid subscripts of the selected dimension of an array (see the sketch after this list).
- Improve error messages when referencing a column that is inaccessible because it was not included in a `GROUP BY` clause or used in an aggregate function (#5314).
- Avoid crashing when the `ON` clause of a `LATERAL` join contains a reference to a column in an outer query (#9605).
- Fix several bugs in the planning of `ROWS FROM` clauses that incorrectly rejected certain valid column and function references (#9653, #9657).
- Correct results for certain queries where the condition in a `CASE` could evaluate to `NULL` (#9285).
- Correct results when filtering on `NULL` values in columns joined with `USING` constraints (#7618).
- Correct join associativity when using comma-separated `FROM` items (#9489). Previously the comma operator had higher precedence than the `JOIN` operator; now it correctly has lower precedence. This bug could cause incorrect results in queries that combined `RIGHT` and `FULL` joins with comma-separated `FROM` items.
- Materialize no longer inlines the CTE where it's referenced (#4867).
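A minimal sketch of `generate_subscripts`:

```sql
SELECT generate_subscripts(ARRAY[10, 20, 30], 1);
-- => 1, 2, 3
```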
v0.12.0
Released on 16 December 2021.
- Optionally emit the message partition, offset, and timestamp in Kafka sources via the new `INCLUDE PARTITION`, `INCLUDE OFFSET`, and `INCLUDE TIMESTAMP` options, respectively (see the sketch after this list).
- Add the `pg_type_is_visible` function.
- Add a stub implementation of the `pg_get_constraintdef` function.
- Avoid crashing when executing certain queries involving the `mz_logical_timestamp` function (#9504).
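A minimal sketch of the new metadata options; the broker, topic, and column names are placeholders, and the clause layout is assumed from the `INCLUDE KEY AS` example shown under v0.22.0:

```sql
CREATE SOURCE events
FROM KAFKA BROKER 'broker:9092' TOPIC 'events'
FORMAT BYTES
INCLUDE PARTITION, OFFSET, TIMESTAMP AS ts;
```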
v0.11.0
Released on 8 December 2021.
- Breaking change. Remove the `mz_workers` function (#9363).
- Provide pre-compiled binaries for the ARM CPU architecture. Support for this CPU architecture is in beta.
- Support `generate_series` for `timestamp` data.
- Support the `ROWS FROM` syntax in `FROM` clauses (#9076). See the sketch after this list.
- Improve PostgreSQL compatibility:
  - Support qualified operator names via the `OPERATOR ([<schema>.] <op>)` syntax (#9255). If the schema is specified, it must be `pg_catalog`. Referencing operators in other schemas is not yet supported.
  - Support explicit reference to the default collation via the `<expr> COLLATE pg_catalog.default` syntax (#9280). Other collations are not yet supported.
  - Support multiple identical table function invocations in `SELECT` lists (#9366).

  These changes enable the `\dt <pattern>` command in the psql terminal and PgJDBC's `getPrimaryKeys()` API.
- Fix a query optimization that could produce wrong results when a condition in a `CASE` expression returned `NULL` rather than `FALSE` (#9287).
- Speed up the creation or restart of a Kafka sink that uses the `reuse_topic` option (#9094).
- Accept message names in Protobuf sources that do not start with a leading dot (#9372). This fixes a regression introduced in v0.9.12.
- Fix decoding of Protobuf sources whose type contains a nested message type with one or more integer fields (#8930). These messages could cause previous versions of Materialize to crash.
- Avoid crashing when presented with queries that are too big to fit the limits of our intermediate representations. These queries now report an internal error of the form "exceeded recursion limit of {X}".
- Correctly autogenerate views from Postgres sources during `CREATE VIEWS` when the upstream table contains numeric columns with no specified scale and precision (#9268).
- Prevent overflow on operations combining `timestamp` and `interval` (#9254).
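A minimal sketch of the `ROWS FROM` syntax, which lines up the output of several table functions row by row:

```sql
SELECT * FROM ROWS FROM (generate_series(1, 2), generate_series(8, 9));
--  generate_series | generate_series
-- -----------------+-----------------
--                1 |               8
--                2 |               9
```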
v0.10.0
Released on 23 November 2021.
- Breaking change. Disallow creating views with columns of the same name (#9158). This change brings views' structure into closer alignment with tables.

  When creating views whose `SELECT` statements return multiple columns with the same identifier, you can use the optional column renaming syntax to provide unique identifiers for all columns. For example:

  ```sql
  CREATE MATERIALIZED VIEW m (col_a, col_b) AS
  SELECT a AS col, b AS col FROM t;
  ```

  For more details, see `CREATE MATERIALIZED VIEW` and `CREATE VIEW`.
- Breaking change. Disallow calls to aggregate functions that use `DISTINCT *` as their arguments, e.g. `COUNT(DISTINCT *)` (#9122).
- Breaking change. Disallow `SELECT DISTINCT` when applied to a 0-column relation, like a table with no columns (#9122).
- Allow creating Avro-formatted sources from Avro schemas whose top-level type is not a record.
- Support invoking a single table function in a `SELECT` list (#9100).
- Fix a bug that could cause wrong results when a window function appeared in a subquery (#9077).
v0.9.13
Released on 17 November 2021.
- Fix a crash or incorrect results when a join consumes data from a distinct operation (#9027).
- In Protobuf-formatted Kafka sources, accept messages whose encoded representation is zero bytes, which occurs when all the fields of the message are set to their default values. Previously these messages were incorrectly dropped.
- Support constructing lists from a subquery via the `LIST(<subquery>)` expression.
v0.9.12
Released on 11 November 2021.
- Known issue. Message names in Protobuf sources that do not start with a leading dot are erroneously rejected. As a workaround, add a leading dot to the message name. This regression is corrected in v0.11.0.
- Breaking change. Disallow ambiguous table references in queries (#4756). For example:

  ```sql
  SELECT * FROM a, a;
  SELECT * FROM a, b AS a;
  SELECT * FROM a, generate_series(1,3) AS a;
  ```

  These queries previously worked, but will now throw the error:

  ```
  table name "a" specified more than once
  ```

  However, you can always work around this limitation using aliases, e.g.

  ```sql
  SELECT * FROM a, a AS b;
  ```
- Deduplicate columns in arrangements that are shared between keys and values. This can result in memory savings of up to a factor of two for arrangements indexed by the whole row.
- Add the `date_bin` function, which is similar to `date_trunc` but supports "truncating" to arbitrary intervals (see the sketch after this list).
- Add support for the CSV format in `COPY FROM`.
- Fix incorrect results for a certain class of degenerate join queries.
- Fix a bug in `pg_catalog.pg_attribute` that would incorrectly omit certain rows based on the underlying column type.
- Add the `type_oid` column to `mz_columns`.
- When using `COPY FROM`, allow extra data after the end of copy marker (`\.`), but discard it. Previously, Materialize would error in this case.
- Replace the `mz_kafka_broker_rtt` and `mz_kafka_consumer_partitions` system catalog tables with a new table, `mz_kafka_source_statistics`, containing raw statistics from the underlying librdkafka library.
- Fix a bug that caused a panic when using a query containing `STRING_AGG`, `JSON_AGG`, `JSON_OBJECT_AGG`, `LIST_AGG`, or `ARRAY_AGG` on data sets containing exactly one record when that condition is known at optimization time.
- Support the `row_number` window function.
- Support PgJDBC's `getColumns()` API.
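A minimal sketch of `date_bin`, binning a timestamp into 15-minute buckets measured from an origin:

```sql
SELECT date_bin(
    INTERVAL '15 minutes',
    TIMESTAMP '2021-11-11 11:11:11',
    TIMESTAMP '2021-01-01 00:00:00'
);
-- => 2021-11-11 11:00:00
```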
v0.9.11
Released on 02 November 2021.
- Disallow `UPDATE` and `DELETE` operations on tables when booted in `--disable-user-indexes` mode.
- Support the `READ ONLY` transaction mode.
- Support `SET` in transactions, as well as `SET LOCAL`. This unblocks a problem with PostgreSQL JDBC 42.3.0.
v0.9.10
Released on 27 October 2021.
- Evaluate TopK operators on constant inputs at query compile time.
- Add the `session_user` system information function.
v0.9.9
Released on 21 October 2021.
- Breaking change. Fix a bug that inadvertently let users create `char list` columns and custom types. This type is not meant to be supported.
- Beta support for Redpanda sources.
- Let users express `JOIN`-like `DELETE`s with `DELETE ... USING`.
- Optimize some `(table).field1.field2` expressions to only generate the columns from `table` that are accessed subsequently (#8596). This avoids performance issues when extracting a single field from a table expression with several columns, for example records generated from `ROW`.
- Fix a bug that inadvertently let users create an `array` with elements of type `list` or `map`, which crashes Materialize (#8672).
- Format dates before AD 1 with the BC notation instead of using negative dates.
- Fix some sources of crashes and incorrect results from optimizing queries involving constants (#8713, #8717).
- Support the alternative `SUBSTRING(<string> [FROM <int>]? [FOR <int>]?)` syntax (see the sketch after this list).
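A minimal sketch of the alternative `SUBSTRING` syntax:

```sql
SELECT SUBSTRING('materialize' FROM 3 FOR 5);
-- => teria
```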
v0.9.8
Released on 12 October 2021.
- Throw errors on floating point arithmetic overflow and underflow.
v0.9.7
Released on 06 October 2021.
- Support the `PREPARE`, `EXECUTE`, and `DEALLOCATE` SQL statements (#3383).
- Support the `IS TRUE`, `IS FALSE`, and `IS UNKNOWN` operators (and their `NOT` variations) (#8455).
- Add support for retention settings on Kafka sinks.
- Support explicit `DROP DATABASE ... (CASCADE | RESTRICT)` statements. The default behavior remains `CASCADE`.
- Fix a bug that prevented some users from creating Protobuf-formatted sources (#8528).
v0.9.6
Released on 29 September 2021.
- Correctly handle TOASTed columns when using PostgreSQL sources (#8371).
- Return control of canceled sessions (`Ctrl+C`) while `SELECT` statements await results. Previously, this could cause the session to never terminate.
- Fix a bug that could prevent Materialize from booting when importing views into dataflows whose indexes had not yet been converted to dataflows themselves (#8021).
v0.9.5
Released on 23 September 2021.
- Accept case-insensitive time zone names, for compatibility with PostgreSQL.
- Add support for bitwise operators on integers.
- Persist the `mz_metrics` and `mz_metric_histogram` system tables and rehydrate the previous contents on restart. This is a small test of the system that will power upcoming persistence features. Users are free to opt out of this test by starting `materialized` with the `--disable-persistent-system-tables-test` flag.
v0.9.4
Released on 17 September 2021.
- Improve the performance of built-in sources and views in the system catalog, which should result in lower latency spikes and increased throughput when issuing many small queries, and reduce overall memory consumption. Additionally, the content of the views is now consistent at the introspection interval boundaries. Prior to this release, some views would reveal more details about ephemeral dataflows and operators.
- Fix a bug that caused a panic when computing the `max` of `int2` values.
- Ignore the trailing newline character of POSIX-compliant files instead of decoding it as an empty byte row (#8142).
- Support `ORDER BY` in aggregate functions.
- When issuing `COMMIT` or `ROLLBACK` commands outside of an explicit transaction, always return a warning. Previously, the warning could be suppressed.
- Fix a bug in the `CREATE SINK` syntax by updating the optional `CONSISTENCY` clause.
v0.9.3
Released on 08 September 2021.
- Fix a bug that prevented creating Avro sinks on old versions of Confluent Platform.
- Fix a bug that prevented upgrading to v0.9.2 if the catalog referenced CSV file sources with headers.
- Support the `USING CONFLUENT SCHEMA REGISTRY` schema option for Protobuf-formatted sources.
v0.9.2
Released on 02 September 2021.
- The metrics scraping interval to populate the `mz_metrics` table and its variants is now independent of the introspection interval. It is controlled by the `--metrics-scraping-interval` flag.
- Allow users to specify the names of columns that must be present in CSV objects with headers (#7507), and support CSV with headers in S3 (#7913).
- Add the `ALTER INDEX <name> SET ENABLED` syntax to aid troubleshooting and recovery of Materialize instances (#8079).
v0.9.1
Released on 23 August 2021.
- Change the type of the `mz_metrics` table's `time` column from `timestamp` to `timestamp with time zone` to better reflect that the timestamp is in UTC.
- Add the `array_agg` function.
- Add the `list_agg` function.
- Add the `string_agg` function.
- Add the `generate_series(start, stop, step)` table function (#7953).
v0.9.0
Released on 10 August 2021.
- Breaking change. Reject Protobuf sources whose schemas contain unsigned integer types (`uint32`, `uint64`, `fixed32`, and `fixed64`). Materialize previously converted these types to `numeric`. A future version of Materialize is likely to support unsigned integers natively, at which point the aforementioned Protobuf types will be converted to the appropriate Materialize types.
- Breaking change. The `HTTP_PROXY` variable is no longer respected. Use `http_proxy` instead.
- Respect the `no_proxy` environment variable to exclude certain hosts from the configured HTTP/HTTPS proxy, if any.
- Add `reuse_topic` as a beta feature for Kafka sinks. This allows re-using the output topic across restarts of Materialize.
- Add support for JSON-encoded Kafka sinks.
v0.8.3
Released on 20 July 2021.
- The `MZ_LOG` environment variable is no longer recognized. Setting the log level can be done using the `--log-filter` command line parameter or the `MZ_LOG_FILTER` environment variable.
- Refactor the `numeric` type's backing implementation. With this change comes more PostgreSQL-like semantics for unscaled values, as well as bug fixes (#7312).
- Add support for including Kafka keys in dataflows via the new `INCLUDE KEY` syntax (#6661).
- The `FORMAT` argument to `UPSERT` for specifying Kafka key formats has been deprecated; use the new `KEY FORMAT .. VALUE FORMAT ..` syntax instead (see the documentation). The `UPSERT FORMAT ..` syntax will be removed in a future release.
- Add `WITH` options to read SSL and SASL passwords from environment variables (#7467).
v0.8.2
Released on 08 July 2021.
- Stabilize Postgres sources (they no longer require `--experimental`).
- Breaking change. The `HOST` keyword when creating Postgres sources has been renamed to `CONNECTION`.
- Record the initial high watermark offset on the broker for Kafka sources. This enables clients to observe the progress of initial data loading. The table `mz_kafka_consumer_partitions` has an additional column, `initial_high_offset`, containing the first reported `hi_offset` from the broker for each partition.
- Add `left` to the string function suite.
v0.8.1
Released on 29 June 2021.
- Add timelines to all sources to prevent joining data whose time is not comparable. This only affects new CDC and Debezium consistency topic sources by default.
- Add the `isolation_level` `WITH` option to Kafka sources to allow changing the read behavior of transactionally written messages.
- Add the `kafka_time_offset` `WITH` option for Kafka sources, which allows setting `start_offset` based on Kafka timestamps.
- Add the `timestamp_frequency_ms` `WITH` option to Kinesis, S3, and file sources.
- Breaking change. The `timezone(String, Time)` function can no longer be used in views.
- Debezium sinks now emit `collection_data` attributes in their consistency topic.
- Breaking change. Rename the `timestamp`, `diff`, and `progressed` columns in `TAIL` to `mz_timestamp`, `mz_diff`, and `mz_progressed`.
- Add the `current_role` system information function.
- Support manually declaring a (non-enforced) primary key on sources.
- S3 sources now retry failed requests to list buckets and download objects.
v0.8.0
Released on 09 June 2021.
- Add the `COPY FROM` statement, which allows populating a table via the PostgreSQL `COPY` protocol.
- Stabilize the `ARRAY` constructor.
- Support the `#>` and `#>>` `jsonb` operators.
- Breaking change. Sort `NULL`s last, to match the default sort order in PostgreSQL.
- Breaking change. Rename the `consistency` parameter to `consistency_topic` for both Kafka sources and sinks. Additionally, change `consistency_topic` on sinks to be a string that specifies a topic name instead of a boolean. This harmonizes the parameter behavior between sources and sinks.
v0.7.3
Released on 17 May 2021.
- Add a new metric, `mz_log_message_total`, that counts the number of log messages emitted per severity.
- Return the number of days between two dates (an integer) when subtracting one date from the other. Previously, the interval between the two dates would be returned. The new behavior matches the behavior in PostgreSQL.
- Breaking change. Change the default for the `enable_auto_commit` option on Kafka sources to `false`.
- Support equality operators on array data.
- Stabilize temporal filters (they no longer require `--experimental`).
v0.7.2
Released on 09 April 2021.
- Introduce the concept of volatility to describe sources that do not provide reliability guarantees that Materialize relies on. The new volatility information is surfaced via `SHOW SOURCES`, `SHOW VIEWS`, and `SHOW SINKS`.
- Add PubNub sources.
- Add S3 sources.
- Add a `--log-filter` command-line option and a `MZ_LOG_FILTER` environment variable that control which log messages to emit. This behavior was previously available via the undocumented `MZ_LOG` environment variable, which will be removed in a future release.
- Record Kafka consumer metrics in the `mz_kafka_consumer_partitions` system table. Enabled by default for all Kafka sources.
- Add the `jsonb_object_agg` function to aggregate rows into a JSON object (see the sketch after this list).
- Permit the `jsonb` type to store all 64-bit integers (#5919). Previously integers in the ranges [-2^64, -(2^53 - 1)] and [2^53 - 1, 2^64 - 1] were rejected.
- Add the `pg_postmaster_start_time` function, which reports the time at which the server started.
- Add the `mz_workers` function, which reports the number of workers in use by the server.
- Add the `mz_uptime` function, which reports the duration for which the server has been running.
- Add the `repeat` function, which repeats a string N times.
- Avoid panicking when planning SQL queries of the form `SELECT DISTINCT ... ORDER BY <expr>` where `expr` is not a simple column reference (#6021).
- Support Kafka log compaction on Debezium topics via the `DEBEZIUM UPSERT` source envelope.
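A minimal sketch of `jsonb_object_agg`:

```sql
SELECT jsonb_object_agg(k, v)
FROM (VALUES ('a', 1), ('b', 2)) AS t(k, v);
-- => {"a": 1, "b": 2}
```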
v0.7.1
Released on 25 March 2021.
- Breaking change. Change the default `--logical-compaction-window` from 60 seconds to 1 millisecond.
- Breaking change. Remove `CREATE SINK ... AS OF`, which did not have sensible behavior after Materialize restarted. The intent is to reintroduce this feature with a more formal model of `AS OF` timestamps (#3467).
- Add the `cbrt` function for computing the cube root of a `double precision`. Thanks to external contributor @andrioni.
- Add the `encode` and `decode` functions to convert binary data to and from several textual representations. Thanks to external contributor @Posnet.
- Add many of the basic trigonometric functions. Thanks again to external contributor @andrioni.
- Add the `DROP TYPE` and `SHOW TYPES` commands.
- Multipartition Kafka sinks with consistency enabled will create single-partition consistency topics.
- Kafka sinks are now written via an idempotent producer to avoid duplicate or out-of-order messages.
- Breaking change. Change the behavior of the `round` function when applied to a `real` or `double precision` argument to round ties to the nearest even number, rather than away from zero. When applied to `numeric`, ties are rounded away from zero, as before. The new behavior matches PostgreSQL.
- Restore the `-D` command-line option as the short form of the `--data-directory` option.
- Allow setting index parameters when creating an index via the new `WITH` clause to `CREATE INDEX`. In older versions, setting these parameters required a separate call to `ALTER INDEX`.
- Fix a bug that prevented upgrading deployments from v0.6.1 or earlier to v0.7.0 if they contained:
  - Views whose embedded queries contain functions whose arguments are functions (#5802).
  - Sinks using `WITH SNAPSHOT AS OF ...` (#5808).
- Reduce memory usage and increase processing speed in materialized views involving sources with the "upsert" envelope (#5509). Users of the memory usage visualization will see that the operator "UpsertArrange" has changed to "Upsert", and that the "Upsert" operator no longer shows any records. Note, however, that the "Upsert" operator still has a memory footprint proportional to the number of unique keys in the source.
- Add the basic exponentiation, power, and logarithm functions.
- Add `position` to the string function suite.
- Add `right` to the string function suite.
v0.7.0
Released on 08 February 2021.
- Known issue. You cannot upgrade deployments created with versions v0.6.1 or earlier to v0.7.0 if they contain:
  - Views whose embedded queries contain functions whose arguments are functions (#5802).
  - Sinks using `WITH SNAPSHOT AS OF ...` (#5808).

  If you encounter this issue, you can:
  - Use a previous version of Materialize to drop the view or sink before upgrading.
  - Skip upgrading to v0.7.0, and instead upgrade to v0.7.1, which contains fixes for these bugs.
- Known issue. The `-D` command-line option, shorthand for the `--data-directory` option, was inadvertently removed. It will be restored in the next release.
- Breaking change. Require a valid user name when connecting to Materialize. Previously, Materialize did not support the concept of roles, so it accepted all user names.

  Materialize instances have a user named `materialize` installed, unless you drop this user with `DROP USER`. You can add additional users with `CREATE ROLE`.
- Allow setting most command-line flags via environment variables.
- Fix a bug that would cause `DROP` statements targeting multiple objects to fail when those objects had dependent objects in common (#5316).
- Prevent a bug that would allow `CREATE OR REPLACE` statements to create dependencies on objects that were about to be dropped (#5272).
- Remove the deprecated `MZ_THREADS` alias for `MZ_WORKERS`.
- Support equality operations on `uuid` data, which enables joins on `uuid` columns (#5540).
- Add the `current_user` system information function.
- Add the `CREATE ROLE`, `CREATE USER`, `DROP ROLE`, and `DROP USER` statements to manage roles in a Materialize instance. These roles do not yet serve any purpose, but they will enable authentication in a later release.
- Functions can now be resolved as schema-qualified objects, e.g. `SELECT pg_catalog.abs(-1);`.
- Support multi-partition Kafka sinks (#5537).
- Support gzip-compressed file sources (#5392).
v0.6.1
Released on 22 January 2021.
- Backwards-incompatible change. Validate `WITH` clauses in `CREATE SOURCE` and `CREATE SINK` statements. Previously Materialize would ignore any invalid options in these statements' `WITH` clauses.

  Upgrading to v0.6.1 will therefore fail if any of the sources or sinks within have invalid `WITH` options. If this occurs, drop these invalid sources or sinks using v0.6.0 and recreate them with valid `WITH` options.
- Backwards-incompatible change. Change the default value of the `timeout` option to `FETCH` from `0s` to `None`. The old default caused `FETCH` to return immediately even if no rows were available. The new default causes `FETCH` to wait for at least one row to be available.

  To maintain the old behavior, explicitly set the timeout to `0s`, as in:

  ```sql
  FETCH ... WITH (timeout = '0s');
  ```
- Backwards-incompatible change. Consider the following keywords to be fully reserved in SQL statements: `WITH`, `SELECT`, `WHERE`, `GROUP`, `HAVING`, `ORDER`, `LIMIT`, `OFFSET`, `FETCH`, `OPTION`, `UNION`, `EXCEPT`, `INTERSECT`. Previously only the `FROM` keyword was considered fully reserved.

  You can no longer use these keywords as bare identifiers anywhere in a SQL statement, except following an `AS` keyword in a table or column alias. They can continue to be used as identifiers if escaped. See the Keyword collision documentation for details.
- Backwards-incompatible change. Change the return type of `sum` over `bigint`s from `bigint` to `numeric`. This avoids the possibility of overflow when summing many large numbers (#5218). We expect the breakage from this change to be minimal, as the semantics of `bigint` and `numeric` are nearly identical.
- Speed up parsing of `real` and `numeric` values by approximately 2x and 100x, respectively (#5341, #5343).
- Ensure the first batch of updates in a source without consistency information is stamped with the current wall clock time, rather than timestamp `1` (#5201).
- When Materialize consumes a message from a Kafka source, commit that message's offset back to Kafka (#5324). This allows Kafka-related tools to monitor Materialize's consumer lag.
- Add the `SHOW OBJECTS` SQL statement to display all objects in a database, regardless of their type.
- Improve the PostgreSQL compatibility of several date- and time-related features:
  - Correct `date_trunc`'s rounding behavior when truncating by decade, century, or millennium (#5056). Thanks to external contributor @zRedShift.
  - Allow specifying units of `microseconds`, `milliseconds`, `month`, `quarter`, `decade`, `century`, or `millennium` when applying the `EXTRACT` function to an `interval` (#5107). Previously these units were only supported with the `timestamp` and `timestamptz` types. Thanks again to external contributor @zRedShift.
  - Support multiplying and dividing `interval`s by numbers (#5107). Thanks once more to external contributor @zRedShift.
  - Handle parsing `timestamp` and `timestamptz` from additional compact formats like `700203` (#4889).
  - Support conversion of `timestamp` and `timestamptz` to other time zones with the `AT TIME ZONE` and `timezone` functions.
- Add the `upper` and `lower` string functions, which convert any alphabetic characters in a string to uppercase and lowercase, respectively.
- Permit specifying `ALL` as a row count to `FETCH` to indicate that there is no limit on the number of rows you wish to fetch.
- Support the `ISNULL` operator as an alias for the `IS NULL` operator, which tests whether its argument is `NULL` (#5048).
- Support the `ILIKE` operator, which is the case-insensitive version of the `LIKE` operator for pattern matching on a string.
- Permit the `USING` clause of a join to reference columns with different types on the left and right-hand side of the join if there is an implicit cast between the types (#5276).
- Use SQL standard type names in error messages, rather than Materialize's internal type names (#5175).
- Fix two bugs involving common-table expressions (CTEs).
- Fix a bug that caused incorrect results when multiple aggregations of a certain type appeared in the same `SELECT` query (#5304).
- Add the advanced `--timely-progress-mode` and `--differential-idle-merge-effort` command-line arguments to tune dataflow performance. These arguments replace existing undocumented environment variables.
v0.6.0
Released on 18 December 2020.
- Support specifying default values for table columns via the new `DEFAULT` column option in `CREATE TABLE`. Thanks to external contributor @petrosagg.
- Add a `timeout` option to `FETCH` to facilitate using `FETCH` to poll a `TAIL` operation for new records.
- Add several new SQL functions:
  - The `digest` and `hmac` cryptography functions compute message digests and authentication codes, respectively. These functions are based on the `pgcrypto` PostgreSQL extension.
  - The `version` and `mz_version` functions report PostgreSQL-specific and Materialize-specific version information, respectively.
  - The `current_schema` function reports the name of the SQL schema that appears first in the search path.
- Fix a bug that would cause invalid data to be returned when requesting binary-formatted values with `FETCH` (#4976).
- Fix a bug when using `COPY` with `TAIL` that could cause some drivers to fail if the `TAIL` was idle for at least one second (#4976).
- Avoid panicking if a record in a regex-formatted source fails to decode as UTF-8 (#5008).
- Allow query hints in `SELECT` statements.
v0.5.3
Released on 8 December 2020.
- Add support for SQL cursors via the new `DECLARE`, `FETCH`, and `CLOSE` statements. Cursors facilitate fetching partial results from a query and are therefore particularly useful in conjunction with `TAIL`.

  Known issue. Requesting binary-formatted values with `FETCH` does not work correctly. This bug will be fixed in the next release.
- Support common-table expressions (CTEs) in `SELECT` statements.
- Add a `map` type to represent unordered key-value pairs. Avro map values in Avro-formatted sources will be decoded into the new `map` type.
- Fix a regression in the SQL parser, introduced in v0.5.2, in which nested field accesses, e.g. `SELECT ((col).field1).field2`, would fail to parse (#4827).
- Fix a bug that caused the `real` type to be incorrectly interpreted as `double precision` (#4918).
v0.5.2
Released on 18 November 2020.
- Provide the `list` type, which is an ordered sequence of homogeneously typed elements; lists are nestable, too! The type was previously available in v0.5.1, but this release lets you create `list`s from `text`, making their creation more accessible.
- Support the `pg_typeof` function.
- Teach `COPY TO` to support `FORMAT binary`.
- Support the `DISCARD` SQL statement.
- Change `TAIL` to:
  - Produce output ordered by timestamp.
  - Support timestamp progress with the `PROGRESSED` option.
  - Backwards-incompatible change. Use Materialize's standard `WITH` option syntax, meaning:
    - `WITH SNAPSHOT` is now `WITH (SNAPSHOT)`.
    - `WITHOUT SNAPSHOT` is now `WITH (SNAPSHOT = false)`.
- Report an error without crashing when a query contains unexpected UTF-8 characters, e.g., `SELECT ’1’` (#4755).
- Suppress logging of warnings and errors to stderr when users supply the `--log-file` command line flag (#4777).
- When using the systemd service distributed in the APT package, write log messages to the systemd journal instead of a file in the `mzdata` directory (#4781).
- Ingest SQL Server-style Debezium data (#4762).
- Allow slightly more complicated `INSERT` bodies, e.g. inserting `SELECT`ed literals (#4748).
v0.5.1
Released on 6 November 2020.
- Known issue. `COPY TO` panics if executed via the "simple query" protocol, which is notably used by the `psql` command-line client (#4742).

  A fix is available in the latest unstable builds and will ship in v0.5.2.

  Note that some PostgreSQL clients instead use the "extended query" protocol to issue `COPY TO` statements, or let you choose which protocol to use. If you are using one of these clients, you can safely issue `COPY TO` statements in v0.5.1.
- Backwards-incompatible change. Send the rows returned by the `TAIL` statement to the client normally (i.e., as if the rows were returned by a `SELECT` statement) rather than via the PostgreSQL `COPY` protocol. The new format additionally moves the timestamp and diff information to dedicated `timestamp` and `diff` columns at the beginning of each row.

  To replicate the old behavior of sending `TAIL` results via the `COPY` protocol, explicitly wrap the `TAIL` statement in a `COPY TO` statement:

  ```sql
  COPY (TAIL some_materialized_view) TO STDOUT
  ```
- Add the `COPY TO` statement, which sends the results of the statement it wraps via the special PostgreSQL `COPY` protocol.
- When creating a Kafka sink, permit specifying the columns to include in the key of each record via the new `KEY` connector option in `CREATE SINK`.
- Default to using a worker thread count equal to half of the machine's physical cores if the `--workers` command-line option is not specified.
- Add the `regexp_match` function to search a string for a match against a regular expression.
- Support `SELECT DISTINCT ON (...)` to deduplicate the output of a query based on only the specified columns in each row. Prior to this release, the `SELECT` documentation incorrectly claimed support for this feature.
- Reduce memory usage in several scenarios.
v0.5.0
Released on 21 October 2020.
- Support tables via the new `CREATE TABLE`, `DROP TABLE`, `INSERT`, and `SHOW CREATE TABLE` statements. Tables are conceptually similar to a source, but the data in a table is managed by Materialize, rather than by Kafka or a filesystem.

  Note that table data is currently ephemeral: data inserted into a table does not persist across restarts. We expect to add support for persistent table data in a future release.
- Generate a persistent, unique identifier associated with each cluster. This can be retrieved using the new `mz_cluster_id` SQL function.
- Automatically check for new versions of Materialize on server startup. If a new version is available, a warning will be logged encouraging you to upgrade.

  This version check involves reporting the cluster ID and current version to a server operated by Materialize Inc. To disable telemetry of this sort, use the new `--disable-telemetry` command-line option.
- Add a web-based, interactive memory usage visualization to aid in understanding and diagnosing unexpected memory consumption.
- Add the `lpad` function, which extends a string to a given length by prepending characters.
- Improve PostgreSQL compatibility:
  - Permit qualifying function names in SQL queries with the name of the schema and optionally the database to which the function belongs, as in `pg_catalog.abs(-1)` (#4293). Presently all built-in functions belong to the system `mz_catalog` or `pg_catalog` schemas.
  - Add an `oid` type to represent PostgreSQL object IDs.
  - Add basic support for array types, including the new `array_to_string` function.
  - Add the `current_schemas`, `obj_description`, `pg_table_is_visible`, and `pg_encoding_to_char` compatibility functions.

  Together these changes enable the `\l`, `\d`, `\dv`, `\dt`, and `\di` commands in the psql terminal.
- Correct a query optimization that could misplan queries that referenced the same relation multiple times with varying filters (#4361).
- Rename the output columns for `SHOW` statements to match the PostgreSQL convention of using all lowercase characters with words separated by underscores.

  For example, the `SHOW INDEX` command now returns a column named `seq_in_index` rather than `Seq_in_index`. This makes it possible to refer to the column without quoting when supplying a `WHERE` clause.

  The renamings are described in more detail in the documentation for each `SHOW` command that changed.
- Expose metadata about the running Materialize instance in the new system catalog, which contains various sources, tables, and views that can be queried via SQL.
- Rename the `global_id` column of the `mz_avro_ocf_sinks` and `mz_kafka_sinks` tables to `sink_id`, for better consistency with the other system catalog tables.
- Support Kafka sources on topics that use Zstandard compression (#4342).
v0.4.3
Released on 17 September 2020.
- Permit adjusting the logical compaction window on a per-index basis via the `logical_compaction_window` parameter to the new `ALTER INDEX` statement.
- Add the `uuid` type to efficiently represent universally unique identifiers (UUIDs).
- Report the `integer_datetime` parameter as `on` to ensure that PgJDBC correctly decodes date and time values returned by prepared statements (#4117).
- Fix a bug in the query optimizer that could result in incorrect plans for queries involving `UNION` operators and literals (#4195).
v0.4.2
Released on 3 September 2020.
- Backwards-incompatible change. Remove the `max_timestamp_batch_size` `WITH` option from sources. Materialize now automatically selects the optimal batch size.
- Restore support for specifying multiple Kafka broker addresses in Kafka sources (#3986). This fixes a regression introduced in v0.4.1.
- Sort the output of `SHOW COLUMNS` by the order in which the columns are defined in the targeted source, table, or view. Prior versions did not guarantee any particular ordering.
- Improve memory utilization:
  - Reduce memory usage of outer joins when the join key consists only of simple column equalities (#4047).
  - Consume only a constant amount of memory when computing a `min` or `max` aggregation on an append-only source (#3994).
- Always permit memory profiling via the `/prof` web UI, even if the `MALLOC_CONF` environment variable is not configured to enable profiling (#4005).
- Handle large `VALUES` expressions. Previously, `VALUES` expressions with more than several hundred entries would cause a stack overflow (#3995).
- Add the `mz_records_per_dataflow_global` metric to expose the number of active records in each dataflow (#4036).
v0.4.1
Released on 19 August 2020.
- Known regression. Specifying multiple Kafka broker addresses in Kafka sources, as in

  ```sql
  CREATE SOURCE ... FROM KAFKA BROKER 'host1:9092,host2:9092' ...;
  ```

  is incorrectly prohibited in this version. This change was unintentional and is reverted in v0.5.0.
- Enhance internal monitoring tools.
- Improve robustness of several source types:
  - Permit broker addresses in Kafka sources and Kafka sinks to use IP addresses in addition to hostnames.
  - Handle Snappy-encoded Avro OCF files.
  - In Avro sources that use the Debezium envelope, automatically filter out duplicate records generated by Debezium's PostgreSQL connector. This brings support for the PostgreSQL connector on par with the support for the MySQL connector.
- Improve the performance of the `TopK` operator (#3758).
- Add several new SQL features:
  - Add support for `LATERAL` subqueries in joins. `LATERAL` subqueries can be used to express Top-K by group queries.
  - Add the regular expression matching operators `~`, `~*`, `!~`, and `!~*`, which report whether a string does or does not match a regular expression.
  - Add the `split_part` function, which splits a string on a delimiter and returns one of the resulting chunks.
  - Allow ordinal references in `GROUP BY` clauses to refer to items in the `SELECT` list that are formed from arbitrary expressions, as in:

    ```sql
    SELECT a + 1, sum(b) FROM ... GROUP BY 1;
    ```

    Previously, Materialize only handled ordinal references to items that were simple column references, as in:

    ```sql
    SELECT a, sum(b) FROM ... GROUP BY 1;
    ```
- Fix two PostgreSQL compatibility issues:
  - Change the text format of the `timestamp with time zone` type to match PostgreSQL (#3798).
  - Respect client-provided parameter types in prepared statements (#3625).
v0.4.0
Released on 27 July 2020.
-
Rename the `--threads` command-line option to `--workers`, since it controls only the number of dataflow workers that Materialize will start, not the total number of threads that Materialize may use. The short form of this option, `-w`, remains unchanged. Backwards-incompatible change.
-
Add the `--experimental` command-line option to enable a new experimental mode, which grants access to experimental features at the risk of compromising stability and backwards compatibility. Forthcoming features that require experimental mode will be marked as such in their documentation.
-
Support SASL PLAIN authentication for Kafka sources. Notably, this allows Materialize to connect to Kafka clusters hosted by Confluent Cloud.
-
Do not require Kafka Avro sources that use `ENVELOPE NONE` or `ENVELOPE DEBEZIUM` to have key schemas whose fields are a subset of the value schema (#3677).
-
Teach Kafka sinks to emit Debezium-style consistency metadata if the new `consistency` option is enabled. The consistency metadata is emitted to a Kafka topic alongside the data topic; the combination of these two topics is considered the Materialize change data capture (CDC) format.
-
Introduce the `AS OF` and `WITH SNAPSHOT` options for `CREATE SINK` to provide more control over what data the sink will produce.
-
Change the default `TAIL` snapshot behavior from `WITHOUT SNAPSHOT` to `WITH SNAPSHOT`. Backwards-incompatible change.
-
Actively shut down Kafka sinks that encounter an unrecoverable error, rather than attempting to produce data until the sink is dropped (#3419).
-
Improve the performance, stability, and standards compliance of Avro encoding and decoding (#3397, #3557, #3568, #3579, #3583, #3584, #3585).
-
Support record types, which permit the representation of nested data in SQL. Avro sources also gain support for decoding nested records, which were previously disallowed, into this new SQL record type.
-
Allow dropping databases with cross-schema dependencies (#3558).
-
Avoid crashing if `date_trunc('week', ...)` is called on a date that is in the first week of a month (#3651).
-
Ensure the built-in `mz_avro_ocf_sinks`, `mz_catalog_names`, and `mz_kafka_sinks` views always reflect the latest state of the system (#3682). Previously these views could contain stale data that did not reflect the results of recent `CREATE` or `DROP` statements.
-
Introduce several new SQL statements:
-
`ALTER RENAME` renames an index, sink, source, or view.
-
`SHOW CREATE INDEX` displays information about an index.
-
`EXPLAIN <statement>` is shorthand for `EXPLAIN OPTIMIZED PLAN FOR <statement>`.
-
`SHOW TRANSACTION ISOLATION LEVEL` displays a dummy transaction isolation level, `serializable`, in order to satisfy various PostgreSQL tools that depend upon this statement (#800).
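For example (the view name is hypothetical, and output shapes may vary):

```sql
-- Always reports the dummy level 'serializable'.
SHOW TRANSACTION ISOLATION LEVEL;

-- Shorthand for EXPLAIN OPTIMIZED PLAN FOR SELECT ...
EXPLAIN SELECT count(*) FROM my_view;
```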
-
Adjust the semantics of several SQL expressions to match PostgreSQL’s semantics:
-
Consider `NULL < ANY(...)` to be false and `NULL < ALL(...)` to be true when the right-hand side is the empty set (#3319). Backwards-incompatible change.
-
Change the meaning of ordinal references in a `GROUP BY` clause, as in `SELECT ... GROUP BY 1`, to refer to columns in the target list, rather than columns in the input set of tables (#3686). Backwards-incompatible change.
-
When casting from `numeric` or `float` to `int`, round to the nearest integer rather than discarding the fractional component (#3700). Backwards-incompatible change.
-
Allow expressions in `GROUP BY` to refer to output columns, not just input columns, to match PostgreSQL. In the case of ambiguity, the input column takes precedence (#1673).
-
Permit expressions in `ORDER BY` to refer to input columns that are not selected for output, as in `SELECT rel.a FROM rel ORDER BY rel.b` (#3645).
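A rough illustration of the new PostgreSQL-compatible semantics described above:

```sql
-- True: the right-hand side is the empty set.
SELECT NULL::int < ALL (SELECT 1 WHERE false);

-- Returns 3: the cast now rounds rather than truncating.
SELECT 2.7::int;
```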
v0.3.1
Released on 3 July 2020.
-
Improve the ingestion speed of Kafka sources with multiple partitions by sharding responsibility for each partition across the available worker threads (#3190).
-
Improve JSON decoding performance when casting a `text` column to `json`, as in `SELECT text_col::json` (#3195).
-
Simplify converting non-materialized views into materialized views with `CREATE DEFAULT INDEX ON foo`. This creates the same index on a view that would have been created if you had used `CREATE MATERIALIZED VIEW`.
-
Permit control over the timestamp selection logic on a per-Kafka-source basis via three new `WITH` options:
- `timestamp_frequency_ms`
- `max_timestamp_batch_size`
- `topic_metadata_refresh_interval_ms`
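For instance, a sketch of setting one of these options (the broker, topic, source name, and format are hypothetical; consult the CREATE SOURCE docs of this era for exact clause placement):

```sql
CREATE SOURCE events
FROM KAFKA BROKER 'localhost:9092' TOPIC 'events'
WITH (timestamp_frequency_ms = 1000)
FORMAT BYTES;
```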
-
Support assigning aliases for column names when referencing a relation in a `SELECT` query, as in:
SELECT col1_alias, col2_alias FROM rel AS rel_alias (col1_alias, col2_alias);
-
Improve the string function suite:
- Add the trim family of functions to trim characters from the start and/or end of strings. The new functions are `btrim`, `ltrim`, `rtrim`, and `trim`.
- Add the SQL standard length functions `char_length`, `octet_length`, and `bit_length`.
- Improve the `length` function’s PostgreSQL compatibility by accepting `bytea` as the first argument, rather than `text`, when getting the length of encoded bytes.
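A few illustrative calls, following the PostgreSQL definitions of these functions:

```sql
SELECT btrim('xxhellox', 'x');  -- 'hello'
SELECT rtrim('hello   ');       -- 'hello'
SELECT char_length('héllo');    -- 5 (characters)
SELECT octet_length('héllo');   -- 6 (UTF-8 bytes)
```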
-
Enhance compatibility with PostgreSQL string literals:
- Allow the `TYPE 'string'` syntax to explicitly specify the type of a string literal. This syntax is equivalent to `CAST('string' AS TYPE)` and `'string'::TYPE`.
- Support escape string literals of the form `E'hello\nworld'`, which permit C-style escapes for several special characters.
- Automatically coerce string literals to the appropriate type, as required by their usage in calls to functions and operators (#481).
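A brief sketch of the literal forms described above:

```sql
-- TYPE 'string' syntax, equivalent to '2020-01-01'::date.
SELECT date '2020-01-01';

-- Escape string literal containing a C-style \t.
SELECT E'tab\there';
```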
-
Produce runtime errors in several new situations:
-
Fix several misplanned queries:
-
Fix several bugs related to negative intervals:
-
Expose monitoring metrics for Kafka sinks (#3336).
v0.3.0
Released on 1 June 2020.
-
Support temporary views.
-
Improve the reliability and performance of Kafka sources, especially when the underlying Kafka topic has many partitions and data is not evenly distributed across the partitions.
-
Infer primary keys based on the key schema for Kafka Avro sources that use the Debezium envelope to facilitate query optimization. This corrects a regression in v0.2.2.
The new `ignore_source_keys` option can be set to `true` to explicitly disable this behavior.
-
In Avro sources that use the Debezium envelope, automatically filter out duplicate records generated by Debezium’s MySQL connector.
This release does not include support for deduplicating records generated by other Debezium connectors (e.g., PostgreSQL).
-
Automatically refresh AWS credentials for Kinesis sources when credentials are sourced from an IAM instance or container profile (#2928).
-
Support TLS encryption for SQL and HTTP connections.
-
Improve compatibility with the pg8000 Python driver, and likely other drivers, by including the number of rows returned by a `SELECT` statement in the SQL protocol command tag (#2987).
-
Correct plans for `OUTER` joins that appear within subqueries, which could previously cause Materialize to crash (#3048).
-
Prevent a small memory leak when a `TAIL` command is uncleanly terminated (#2996).
-
Adjust the precedence of several SQL operators to match PostgreSQL (#3087).
-
Add a new command-line option, `-vv`, that prints some build information in addition to the version.
v0.2.2
Released on 11 May 2020.
-
Introduce an “upsert” envelope for sources that follow the Kafka key–value convention for representing inserts, upserts, and deletes. See the Upsert envelope section of the `CREATE SOURCE` docs for details.
-
Enable connections to Kafka brokers using either SSL authentication or Kerberos authentication. This includes support for SSL authentication with Confluent Schema Registries.
-
Introduce the `AS OF` and `WITH SNAPSHOT` options for `TAIL` to provide more control over what data `TAIL` will produce.
-
Improve reliability of Kinesis sources by rate-limiting Kinesis API calls. (#2807)
-
Improve startup speed for Kafka sources with many partitions by fetching from partitions evenly, rather than processing partitions sequentially, one after the next. (#2936)
-
Add two `WITH` options to Kafka sources:
- The `group_id_prefix` option affords some control over the consumer group ID Materialize uses when consuming from Kafka.
- The `statistics_interval_ms` option controls how often the underlying Kafka library reports statistics to the logs.
-
Improve reliability and performance of Kafka sinks with a smarter buffering and flushing policy (#2855) and a faster Avro encoding implementation (#2907).
-
Support decoding enum (#2923) and union (#2943) values in Avro-formatted sources.
-
Produce runtime errors when some numeric operations overflow, rather than silently wrapping around. (#2896)
-
Humanize the output of `SHOW CREATE VIEW` by avoiding quotes around identifiers that do not require them. (#2667)
-
Add the `generate_series` table function. (#2857)
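For example:

```sql
-- Produces the rows 1, 2, and 3.
SELECT * FROM generate_series(1, 3);
```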
-
Fix several bugs in the query optimizer that could cause crashes or incorrect query plans. (#2731, #2724)
-
Correct output for `LEFT JOIN`s when the same join key appears multiple times in the relation on the left-hand side of the join. (#2724)
-
Disallow trailing commas in `SELECT` lists, so that `SELECT a, b, FROM table` results in a syntax error outright, rather than parsing as `SELECT a, b, "from" AS table`, which would result in a confusing error about the unknown column `"from"`. (#2893)
v0.2.1
Released on 30 April 2020.
-
Allow query parameters (`$1`, `$2`, etc.) to appear in `EXPLAIN` statements.
-
Avoid crashing if queries are executed without a value for each parameter in the query.
-
Support runtime errors in dataflows. Views that encounter an error (e.g., division by zero) while executing will report that error when queried. Previously, the error would be silenced, and the erroring expression would be replaced with `NULL`.
-
Permit filtering the output of several `SHOW` commands with a `WHERE` or `LIKE` clause:
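A hedged sketch of the filtered forms (object and column names are hypothetical; the filterable columns depend on the particular `SHOW` command):

```sql
SHOW SOURCES LIKE 'kafka%';
SHOW COLUMNS FROM my_view WHERE name = 'id';
```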
-
Support reading from Kinesis streams with multiple shards. For details about Kinesis sources, see CREATE SOURCE: Kinesis Data Streams.
v0.2.0
Released on 11 April 2020.
-
Require the `-w` / `--threads` command-line option. Consult the CLI documentation to determine the correct value for your deployment.
-
Introduce the `--listen-addr` command-line option to control the address and port that `materialized` binds to.
-
Make formatting and parsing for `real` and `double precision` numbers more consistent with PostgreSQL. The strings `NaN` and `[+-]Infinity` are accepted as input, to select the special not-a-number and infinity states, respectively, of floating-point numbers.
-
Allow CSV-formatted sources to include a header row (`CREATE SOURCE ... FORMAT CSV WITH HEADER`).
-
Provide the option to name columns in sources (e.g., `CREATE SOURCE foo (col_foo, col_bar)...`).
-
Improve conformance of the Avro parser, enabling support for a wider variety of Avro schemas in Avro sources.
-
Introduce Avro Object Container File (OCF) sinks.
-
Make sink output more correct and consistent by writing to a new Kafka topic or file on every restart.
-
Add the `jsonb_agg()` aggregate function.
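For example:

```sql
-- Aggregates the column into a single jsonb array: [1, 2, 3].
SELECT jsonb_agg(x) FROM (VALUES (1), (2), (3)) AS t(x);
```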
-
Support casts for `time` -> `text`, `time` -> `interval`, `interval` -> `time`.
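Illustrative casts, assuming PostgreSQL-compatible semantics:

```sql
SELECT '11:30:00'::time::text;              -- '11:30:00'
SELECT '11:30:00'::time::interval;          -- 11:30:00
SELECT interval '1 hour 30 minutes'::time;  -- 01:30:00
```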
-
Improve the usability of the `EXPLAIN` statement:
-
Change the output format of `EXPLAIN` to make large plans more readable by avoiding nesting.
-
Add `EXPLAIN ... FOR VIEW ...` to display the plan for an existing view.
-
Add `EXPLAIN <stage> PLAN FOR ...` to display the plan at various stages of the planning process.
v0.1.3
Released on 17 March 2020.
-
Support Amazon Kinesis Data Streams sources.
-
Support the number functions `round(x: N)` and `round(x: N, y: N)`, which round `x` to the `y`th digit after the decimal (`y` defaults to 0).
-
Support addition and subtraction between `interval`s.
-
Support the string concatenation operator, `||`.
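A few examples of the functions and operators above:

```sql
SELECT round(2.4567, 2);                           -- 2.46
SELECT interval '1 hour' + interval '30 minutes';  -- 01:30:00
SELECT 'materi' || 'alize';                        -- 'materialize'
```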
-
In the event of a crash, print the stack trace to the log file, if logging to a file is enabled, as well as the standard error stream.
v0.1.2
Released on 04 March 2020.
-
Change `SHOW CREATE SOURCE` to render the full SQL statement used to create the source, in the style of `SHOW CREATE VIEW`, rather than displaying a URL that partially describes the source. The URL was a vestigial format used in `CREATE SOURCE` statements before v0.1.0.
-
Raise the maximum SQL statement length from approximately 8KiB to approximately 64MiB.
-
Support casts from `text` to `date`, `timestamp`, `timestamp with time zone`, and `interval`.
-
Support the `IF NOT EXISTS` clause in `CREATE VIEW` and `CREATE MATERIALIZED VIEW`.
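For instance (the view name is hypothetical):

```sql
SELECT '2020-02-29'::date;          -- 2020-02-29
SELECT '1 day 02:30:00'::interval;  -- 1 day 02:30:00
CREATE VIEW IF NOT EXISTS v AS SELECT 1 AS a;
```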
-
Attempt to automatically increase the nofile rlimit to acceptable levels, as creating multiple Kafka sources can quickly exhaust the default nofile rlimit on some platforms.
-
Improve CSV parsing speed by 5-6x.
v0.1.1
Released on 22 February 2020.
- Specifying the message name in a Protobuf-formatted source no longer requires a leading period.
- Indexes on sources: You can now create and drop indexes on sources, which lets you automatically store all of a source’s data in an index. Previously, you would have to create a source, and then create a materialized view that selected all of the source’s content.
v0.1.0
Released on 13 February 2020.