This past year Team Materialize struck out to do each day of 2023’s Advent of Code, an annual programming event with thought-provoking problems that you are encouraged to approach from non-standard directions. We figured we’d try and use SQL for the whole thing.
SQL is a bold choice because it is meant for querying data, and not as much for general computation. Several of the problems call for interesting algorithms, specific data structures, and some flexibility. However, Materialize’s core thesis is that you can do so much more with SQL that just query your data. If you want to move operational logic from bespoke code into SQL, you’ll need to be able to express that logic. And so, Advent of Code was a great opportunity to stretch our legs, and fingers, and see just how much logic fits into SQL.
Preliminaries
There’s a lot of content in the month’s problems. There are 49 problems, and although there is some overlap really there is too much to say about all of them. We aren’t going to recount each of the problems, the whimsical backstories, and the shape of the problem inputs. We’ll try and flag some surprising moments, though, and you should dive into those problems if you are keen (they can each be done on their own).
I (Frank) wrote all of my solutions using Materialize’s WITH MUTUALLY RECURSIVE
even when recursion was not required.
This just helped me start writing, as the blocks allow you to just start naming subqueries and writing SQL.
My solutions all had the same skeletal structure:
WITH MUTUALLY RECURSIVE
-- Parse the problem input into tabular form.
lines(line TEXT) AS ( .. ),
-- SQL leading up to part 1.
part1(part1 BIGINT) AS ( .. ),
-- SQL leading up to part 2.
part2(part2 BIGINT) AS ( .. )
SELECT * FROM part1, part2;
As mentioned, we won’t always need recursion. However, we often do use recursion, and may even need it. We’ll call this out, as the use (and ease) of recursion in SQL was one of the main unlocks.
Week one
Day one was largely about text manipulation, specifically extracting numbers from text, and was well-addressed by using regular expressions to manipulate and search the text.
See the solution
Part one
The newly-improved calibration document consists of lines of text; each line originally contained a specific calibration value that the Elves now need to recover. On each line, the calibration value can be found by combining the first digit and the last digit (in that order) to form a single two-digit number.
Consider your entire calibration document. What is the sum of all of the calibration values?
SELECT SUM(LEFT(r, 1)::int * 10 + RIGHT(r, 1)::int) AS part1
FROM (
SELECT regexp_replace(input, '[^\d]', '', 'g') AS r
FROM aoc_1201
);
Part two
Your calculation isn’t quite right. It looks like some of the digits are actually spelled out with letters: one, two, three, four, five, six, seven, eight, and nine also count as valid “digits”.
Equipped with this new information, you now need to find the real first and last digit on each line.
WITH
lines AS (
SELECT regexp_split_to_table(input, '\n') AS line
FROM aoc_1201
),
slices AS (
SELECT line, index, substring(line, index, width) AS slice
FROM
lines,
generate_series(1, length(line)) AS index,
generate_series(1, 5) AS width
),
numbers (t, n) AS (
VALUES ('0', 0), ('1', 1), ('2', 2), ('3', 3), ('4', 4), ('5', 5), ('6', 6), ('7', 7), ('8', 8), ('9', 9),
('zero', 0), ('one', 1), ('two', 2), ('three', 3), ('four', 4), ('five', 5), ('six', 6), ('seven', 7), ('eight', 8), ('nine', 9)
),
findings AS (
SELECT line, index, n AS number
FROM slices, numbers
WHERE slices.slice = numbers.t
),
first AS ( SELECT DISTINCT ON (line) line, number AS f FROM findings ORDER BY line, index ),
last AS ( SELECT DISTINCT ON (line) line, number AS l FROM findings ORDER BY line, index DESC )
SELECT SUM(f * 10 + l)
FROM first, last
WHERE first.line = last.line
Contributors
Day 1 was brought to you by: @chass, @def-, @doy-materialize, @frankmcsherry, @josharenberg, @morsapaes, @nrainer-materialize
Day two was largely about aggregation: rolling up counts and maxima for games involving numbers of colored cubes; SQL did great here.
See the solution
Part one
Given a table with the following format:
game_id | set_id | green_cnt | red_cnt | blue_cnt
----------+--------+-----------+---------+----------
Game 4 | set_2 | 12 | 0 | 0
...
WITH game_cnt AS (
SELECT split_part(game_id,' ', 2)::int AS game_id,
COUNT(set_id) AS total_set_cnt,
COUNT(set_id) FILTER (WHERE (green_cnt <= 13) AND (red_cnt <= 12) AND (blue_cnt <= 14)) AS possible_set_cnt
FROM aoc_1202
GROUP BY game_id
)
SELECT SUM(game_id) FROM game_cnt WHERE total_set_cnt = possible_set_cnt;
Part two
WITH game_min AS (
SELECT split_part(game_id,' ', 2)::int AS game_id,
MAX(green_cnt) AS green_min,
MAX(red_cnt) AS red_min,
MAX(blue_cnt) AS blue_min
FROM aoc_1202
GROUP BY split_part(game_id,' ', 2)::int
)
SELECT SUM(green_min*red_min*blue_min) FROM game_min;
Part one + two in one go!
-- Pre-supposes a view `input(input TEXT)` containing the string from AOC
with mutually recursive
-- Parse the input up
lines(line TEXT) as (select regexp_split_to_table(input, '\n') as line from input),
games(game TEXT, report TEXT) as (select regexp_split_to_array(line, ':')[1], regexp_split_to_array(line, ':')[2] from lines),
round(game TEXT, visible TEXT) as (select game, regexp_split_to_table(report, ';') from games),
bacon(game TEXT, color TEXT) as (select game, regexp_split_to_table(visible, ',') from round),
parsed(game INT, color TEXT, number INT) as (
select
substring(game, 5)::INT as game,
regexp_split_to_array(color, ' ')[3] as color,
regexp_split_to_array(color, ' ')[2]::INT as number
from bacon
),
-- PART 1
limits(color TEXT, number INT) as (SELECT * FROM (VALUES ('red', 12), ('green', 13), ('blue', 14))),
bad_news(game INT) as (
select game
from parsed, limits
where parsed.color = limits.color
AND parsed.number > limits.number
),
plausible(game INT) as (select distinct parsed.game from parsed left join bad_news on(parsed.game = bad_news.game) where bad_news.game IS NULL),
part1(part1 BIGINT) as (select SUM(game) from plausible),
-- PART 2
maximum(game INT, color TEXT, number INT) as (select game, color, max(number) from parsed GROUP BY game, color),
red(game INT) as (select game from maximum, generate_series(1, number) where color = 'red'),
blue(game INT) as (select game from maximum, generate_series(1, number) where color = 'blue'),
green(game INT) as (select game from maximum, generate_series(1, number) where color = 'green'),
power(game INT, product BIGINT) as (SELECT red.game, count(*) from red, blue, green where red.game = blue.game and blue.game = green.game GROUP BY red.game),
part2(part2 BIGINT) as (select sum(product)::BIGINT from power)
select * from part1, part2;
Contributors
Day 2 was brought to you by: @def-, @frankmcsherry, @morsapaes
Day three has inputs in grid form, where there can be interaction between multiple lines (with symbols above or below others).
You are looking for runs of numerals, and I used WMR
to track these down; reportedly you can also use regular expressions, but I was not clever enough for that!
See the solution
Part one + two in one go!
-- Pre-supposes a view `input(input TEXT)` containing the string from AOC
WITH MUTUALLY RECURSIVE
-- PART 0
-- Parse the input as lines of text with line numbers.
lines(line TEXT, row_idx INT) AS (
SELECT
regexp_split_to_array(input, '\n')[row_idx],
row_idx
FROM
input,
generate_series(1, (SELECT COUNT(*)::INT FROM (SELECT regexp_split_to_table(input, '\n') FROM input))) as row_idx
),
chars(symbol TEXT, row_idx INT, col_idx INT) AS (
SELECT
substring(line, start, 1),
row_idx,
start
FROM
lines,
generate_series(1, length(line)) as start
WHERE
substring(line, start, 1) != '.'
),
numerals(number TEXT, row_idx INT, col_idx INT) AS (
SELECT symbol, row_idx, col_idx
FROM chars
WHERE symbol IN ( VALUES ('0'), ('1'), ('2'), ('3'), ('4'), ('5'), ('6'), ('7'), ('8'), ('9') )
),
symbols(symbol TEXT, row_idx INT, col_idx INT) AS (
SELECT symbol, row_idx, col_idx
FROM chars
WHERE symbol NOT IN ( VALUES ('0'), ('1'), ('2'), ('3'), ('4'), ('5'), ('6'), ('7'), ('8'), ('9') )
),
-- PART 1
-- Recursively build up ranges of numerals that are "active", in the sense of being adjacent to a symbol.
-- Each range has an accumulated number (as a string), a row index, a column index and length of the run.
active(number TEXT, row_idx INT, col_idx INT, length INT) AS (
-- Base case: numerals adjacent to a symbol
SELECT numerals.*, 1
FROM
numerals,
symbols,
generate_series(-1, 1) row_off,
generate_series(-1, 1) col_off
WHERE numerals.row_idx = symbols.row_idx + row_off
AND numerals.col_idx = symbols.col_idx + col_off
UNION
-- Inductive case 1: Join to the left
SELECT numerals.number || active.number, numerals.row_idx, numerals.col_idx, active.length + 1
FROM numerals, active
WHERE numerals.row_idx = active.row_idx
AND numerals.col_idx = active.col_idx - 1
UNION
-- Inductive case 2: Join to the right
SELECT active.number || numerals.number, numerals.row_idx, active.col_idx, active.length + 1
FROM numerals, active
WHERE numerals.row_idx = active.row_idx
AND numerals.col_idx = active.col_idx + active.length
),
parts(number INT, row_idx INT, col_idx INT, length INT) AS (
SELECT active.number::INT, row_idx, col_idx, length
FROM active
WHERE (active.row_idx, active.col_idx-1) NOT IN (SELECT row_idx, col_idx FROM numerals)
AND (active.row_idx, active.col_idx+length) NOT IN (SELECT row_idx, col_idx FROM numerals)
),
part1(part1 BIGINT) AS ( SELECT SUM(parts.number::INT) FROM parts ),
-- PART 2
-- A "gear" is a `*` adjacent to exactly two part numbers. We want the sum over gears of their product.
-- A gear is identified by a location, which we will want to attempt to join with part numbers.
gear_adjacent(row_idx INT, col_idx INT, number INT, part_row INT, part_col INT) AS (
SELECT DISTINCT symbols.row_idx, symbols.col_idx, parts.number, parts.row_idx, parts.col_idx
FROM
symbols,
generate_series(-1, 1) gear_r_off,
generate_series(-1, 1) gear_c_off,
parts,
generate_series(parts.col_idx, parts.col_idx + parts.length - 1) part_col
WHERE symbols.symbol = '*'
AND symbols.row_idx + gear_r_off = parts.row_idx
AND symbols.col_idx + gear_c_off = part_col
),
gears(row_idx INT, col_idx INT) AS (
SELECT row_idx, col_idx
FROM gear_adjacent
GROUP BY row_idx, col_idx
HAVING COUNT(*) = 2
),
gear_products(row_idx INT, col_idx INT, product INT) AS (
SELECT DISTINCT gears.row_idx, gears.col_idx, p1.number * p2.number
FROM gears, gear_adjacent p1, gear_adjacent p2
WHERE gears.row_idx = p1.row_idx
AND gears.col_idx = p1.col_idx
AND gears.row_idx = p2.row_idx
AND gears.col_idx = p2.col_idx
AND (p1.part_row != p2.part_row OR p1.part_col != p2.part_col)
),
part2(part2 BIGINT) AS ( SELECT SUM(product) FROM gear_products)
SELECT * FROM part1, part2;
Contributors
Day 3 was brought to you by: @frankmcsherry, @morsapaes
Day four introduced scratch cards where each line of input has some winners and losers. This was easy SQL until part two, in which winners give you other scratch cards, which have winners that give you other scratch cards, which .. you can see the recursion. Despite being wordy and complicated, the SQL isn’t so bad:
-- PART 2
-- Each card provides a copy of the next `score` cards.
expanded(card INT, score BIGINT) AS (
SELECT * FROM matches
UNION ALL
SELECT
matches.card,
matches.score
FROM
expanded,
matches,
generate_series(1, expanded.score) as step
WHERE
expanded.card + step = matches.card
),
part2(part2 BIGINT) AS ( SELECT COUNT(*) FROM expanded)
This would be tricky to do with non-recursive SQL, as the data itself tells us how to unfold the results. Hooray for recursion!
See the solution
Part one
WITH parsed AS (
SELECT regexp_split_to_table(input, '\n') AS line FROM aoc_1204
),
numbers AS (
SELECT split_part(line,':',1) AS card_id,
replace(split_part(line,':',2),'|','') AS nrs
FROM parsed
),
arr AS (
SELECT card_id,
nrs,
regexp_split_to_array(ltrim(rtrim(nrs)),'\s') AS nrs_arr
FROM numbers
),
winning AS (
SELECT card_id,
unnest(array_remove(nrs_arr,'')) nr,
ROW_NUMBER() OVER (PARTITION BY card_id) AS row_num
FROM arr
GROUP BY card_id, nr HAVING COUNT(*)>1
ORDER BY card_id
),
winning_points AS (
SELECT ROUND(EXP(SUM(LN(CASE WHEN row_num = 1 THEN row_num ELSE 2 END)))) AS points
FROM winning
GROUP BY card_id
)
SELECT SUM(points)
FROM winning_points;
Part two
WITH MUTUALLY RECURSIVE
lines(line string) AS (
SELECT
regexp_split_to_table(input, '\n') AS line
FROM
aoc_1204
),
cards(match string[]) AS (
SELECT
regexp_match(line, 'Card +(\d+): (.*)') AS match
FROM
lines
),
card_parts(card_id int, parts string[]) AS (
SELECT
match[1]::int AS card_id,
regexp_split_to_array(match[2], ' \| ') AS parts
FROM
cards
),
winners(card_id int, val int) AS (
SELECT
card_id,
regexp_split_to_table(trim(parts[1]), '\s+')::int AS val
FROM
card_parts
),
ours(card_id int, val int) AS (
SELECT
card_id,
regexp_split_to_table(trim(parts[2]), '\s+')::int AS val
FROM
card_parts
),
count_winning_numbers(card_id int, count int) AS (
SELECT
ours.card_id,
count(winners.val)::int AS count
FROM
ours LEFT OUTER JOIN winners ON (
ours.card_id = winners.card_id AND
ours.val = winners.val
)
GROUP BY ours.card_id
),
prizes(card_id int, prize_id int) AS (
SELECT
card_id,
prize_id
FROM
count_winning_numbers CROSS JOIN generate_series(card_id + 1, card_id + count) AS prize_id
UNION
SELECT
0 AS card_id,
ours.card_id AS prize_id
FROM
ours
),
multipliers(card_id int, multiplier int) AS (
SELECT
prizes.prize_id AS card_id,
SUM(coalesce(multipliers.multiplier, 1))::int AS multiplier
FROM
prizes left outer JOIN multipliers ON (
prizes.card_id = multipliers.card_id
)
GROUP BY prizes.prize_id
)
SELECT
SUM(multiplier) AS answer
FROM
multipliers;
Part one + two in one go!
-- Pre-supposes a view `input(input TEXT)` containing the string FROM AOC
WITH MUTUALLY RECURSIVE
-- PART 0
-- Parse the input as lines of text with line numbers.
lines(line TEXT) AS (
SELECT regexp_split_to_table(input, '\n')
FROM input
),
blocks(card TEXT, wins TEXT, have TEXT) AS (
SELECT
TRIM (regexp_split_to_array(line, '(:|\|)')[1]),
TRIM (regexp_split_to_array(line, '(:|\|)')[2]),
TRIM (regexp_split_to_array(line, '(:|\|)')[3])
FROM
lines
),
parsed(card INT, wins TEXT[], have TEXT[]) AS (
SELECT
regexp_match(card, '[0-9]+')[1]::INT,
regexp_split_to_array(wins, ' '),
regexp_split_to_array(have, ' ')
FROM blocks
),
-- PART 1
-- Count "have"s in "wins" for each row, exponentiate, sum.
matches(card INT, score BIGINT) AS (
SELECT card, (
SELECT COUNT(*)
FROM (
SELECT unnest(wins) w
INTERSECT
SELECT unnest(have) w
)
WHERE w != ''
)
FROM parsed
),
part1(part1 NUMERIC) AS (
SELECT SUM(pow(2, score - 1))::NUMERIC
FROM matches
WHERE score > 0
),
-- PART 2
-- Each card provides a copy of the next `score` cards.
-- This could be prefix sum if we want to be clever ...
expanded(card INT, score BIGINT) AS (
SELECT * FROM matches
UNION ALL
SELECT
matches.card,
matches.score
FROM
expanded,
matches,
generate_series(1, expanded.score) as step
WHERE
expanded.card + step = matches.card
),
part2(part2 BIGINT) AS ( SELECT COUNT(*) FROM expanded)
select * from part1, part2;
Contributors
Day 4 was brought to you by: @chass, @doy-materialize, @frankmcsherry, @morsapaes
Day five was a bit of a bear. It was the same day we were doing a Materialize on-site and we were all a bit distracted, but also it was pretty beefy. You first have to “route” various elements through a sequence of remappings, whose length is defined in the data. You then have to expand that out to routing whole intervals (rather than elements), and .. there is just lots of potential for error. I used recursive SQL to handle all the remapping, but other folks just expanded out their SQL for each of the (ten-ish) remappings.
See the solution
Part one
WITH seeds AS (
SELECT
regexp_split_to_table(
regexp_split_to_array(
regexp_split_to_array(input, '\n')[1],
': '
)[2],
' '
)::bigint AS seed
FROM
input
),
seed_to_soil_lines AS (
SELECT
regexp_split_to_array(
regexp_split_to_table(
regexp_match(input, 'seed-to-soil map:\n([0-9 \n]*?)\n\n')[1],
'\n'
),
' '
)::bigint[] AS line
FROM
input
),
seed_to_soil AS (
SELECT
line[1] AS dst_base,
line[2] AS src_base,
line[3] AS len
FROM
seed_to_soil_lines
),
soil_to_fertilizer_lines AS (
SELECT
regexp_split_to_array(
regexp_split_to_table(
regexp_match(input, 'soil-to-fertilizer map:\n([0-9 \n]*?)\n\n')[1],
'\n'
),
' '
)::bigint[] AS line
FROM
input
),
soil_to_fertilizer AS (
SELECT
line[1] AS dst_base,
line[2] AS src_base,
line[3] AS len
FROM
soil_to_fertilizer_lines
),
fertilizer_to_water_lines AS (
SELECT
regexp_split_to_array(
regexp_split_to_table(
regexp_match(input, 'fertilizer-to-water map:\n([0-9 \n]*?)\n\n')[1],
'\n'
),
' '
)::bigint[] AS line
FROM
input
),
fertilizer_to_water AS (
SELECT
line[1] AS dst_base,
line[2] AS src_base,
line[3] AS len
FROM
fertilizer_to_water_lines
),
water_to_light_lines AS (
SELECT
regexp_split_to_array(
regexp_split_to_table(
regexp_match(input, 'water-to-light map:\n([0-9 \n]*?)\n\n')[1],
'\n'
),
' '
)::bigint[] AS line
FROM
input
),
water_to_light AS (
SELECT
line[1] AS dst_base,
line[2] AS src_base,
line[3] AS len
FROM
water_to_light_lines
),
light_to_temperature_lines AS (
SELECT
regexp_split_to_array(
regexp_split_to_table(
regexp_match(input, 'light-to-temperature map:\n([0-9 \n]*?)\n\n')[1],
'\n'
),
' '
)::bigint[] AS line
FROM
input
),
light_to_temperature AS (
SELECT
line[1] AS dst_base,
line[2] AS src_base,
line[3] AS len
FROM
light_to_temperature_lines
),
temperature_to_humidity_lines AS (
SELECT
regexp_split_to_array(
regexp_split_to_table(
regexp_match(input, 'temperature-to-humidity map:\n([0-9 \n]*?)\n\n')[1],
'\n'
),
' '
)::bigint[] AS line
FROM
input
),
temperature_to_humidity AS (
SELECT
line[1] AS dst_base,
line[2] AS src_base,
line[3] AS len
FROM
temperature_to_humidity_lines
),
humidity_to_location_lines AS (
SELECT
regexp_split_to_array(
regexp_split_to_table(
regexp_match(input, 'humidity-to-location map:\n([0-9 \n]*)')[1],
'\n'
),
' '
)::bigint[] AS line
FROM
input
),
humidity_to_location AS (
SELECT
line[1] AS dst_base,
line[2] AS src_base,
line[3] AS len
FROM
humidity_to_location_lines
),
soil AS (
SELECT
seed,
coalesce(
MAX(
CASE
WHEN seed >= src_base AND seed < src_base + len
THEN dst_base + (seed - src_base)
ELSE null
END
),
seed
) AS soil
FROM
seeds, seed_to_soil
GROUP BY seed
),
fertilizer AS (
SELECT
soil,
coalesce(
MAX(
CASE
WHEN soil >= src_base AND soil < src_base + len
THEN dst_base + (soil - src_base)
ELSE null
END
),
soil
) AS fertilizer
FROM
soil, soil_to_fertilizer
GROUP BY soil
),
water AS (
SELECT
fertilizer,
coalesce(
MAX(
CASE
when fertilizer >= src_base AND fertilizer < src_base + len
then dst_base + (fertilizer - src_base)
else null
END
),
fertilizer
) AS water
FROM
fertilizer, fertilizer_to_water
GROUP BY fertilizer
),
light AS (
SELECT
water,
coalesce(
MAX(
CASE
WHEN water >= src_base AND water < src_base + len
THEN dst_base + (water - src_base)
ELSE null
END
),
water
) AS light
FROM
water, water_to_light
GROUP BY water
),
temperature AS (
SELECT
light,
coalesce(
MAX(
CASE
WHEN light >= src_base AND light < src_base + len
THEN dst_base + (light - src_base)
ELSE null
END
),
light
) AS temperature
FROM
light, light_to_temperature
GROUP BY light
),
humidity AS (
SELECT
temperature,
coalesce(
MAX(
CASE
WHEN temperature >= src_base AND temperature < src_base + len
THEN dst_base + (temperature - src_base)
ELSE null
END
),
temperature
) AS humidity
FROM
temperature, temperature_to_humidity
GROUP BY temperature
),
location AS (
SELECT
humidity,
coalesce(
MAX(
CASE
WHEN humidity >= src_base AND humidity < src_base + len
THEN dst_base + (humidity - src_base)
ELSE null
END
),
humidity
) AS location
FROM
humidity, humidity_to_location
GROUP BY humidity
)
SELECT
MIN(location) AS answer
FROM
location;
Part one + two in one go!
WITH MUTUALLY RECURSIVE
blocks(head TEXT, body TEXT) AS (
SELECT
split_part(regexp_split_to_table(input, '\n\n'), ':', 1),
split_part(regexp_split_to_table(input, '\n\n'), ':', 2)
FROM
input
),
seeds(seed BIGINT) AS (
SELECT regexp_split_to_table(trim(body), ' ')::BIGINT
FROM blocks
WHERE head = 'seeds'
),
entry0(src_name TEXT, dst_name TEXT, dst_idx TEXT, src_idx TEXT, len TEXT) AS (
SELECT
split_part(split_part(head, ' ', 1), '-', 1),
split_part(split_part(head, ' ', 1), '-', 3),
split_part(regexp_split_to_table(body, '\n'), ' ', 1),
split_part(regexp_split_to_table(body, '\n'), ' ', 2),
split_part(regexp_split_to_table(body, '\n'), ' ', 3)
FROM
blocks
WHERE
head != 'seeds'
),
entry(src_name TEXT, dst_name TEXT, src_idx BIGINT, dst_idx BIGINT, len BIGINT) AS (
SELECT
src_name,
dst_name,
src_idx::BIGINT,
dst_idx::BIGINT,
len::BIGINT
FROM
entry0
WHERE
src_idx != ''
),
-- PART 1
-- Our active inventory of .. "stuff"
active(name TEXT, idx BIGINT) AS (
SELECT 'seed', seed FROM seeds
UNION ALL
SELECT
intent.dst_name,
COALESCE(intent.idx + (entry.dst_idx - entry.src_idx), idx)
FROM intent LEFT JOIN entry ON (
intent.src_name = entry.src_name AND
intent.dst_name = entry.dst_name AND
intent.idx BETWEEN entry.src_idx AND entry.src_idx + len - 1)
),
-- We would like to perform this mapping, but must find a range.
intent(src_name TEXT, dst_name TEXT, idx BIGINT) AS (
SELECT DISTINCT entry.src_name, dst_name, idx
FROM active, entry
WHERE active.name = entry.src_name
),
part1(part1 BIGINT) AS (
SELECT MIN(idx) FROM active WHERE name = 'location'
),
-- PART 2
-- Now we are doing *ranges* of seeds, rather than seed identifiers.
-- They are big ranges, so we'll need to be smarter!
seeds2(start_idx BIGINT, end_idx BIGINT) AS (
SELECT
regexp_split_to_array(trim(body), ' ')[2*x-1]::BIGINT,
regexp_split_to_array(trim(body), ' ')[2*x-1]::BIGINT + regexp_split_to_array(trim(body), ' ')[2*x]::BIGINT
FROM
blocks,
generate_series(1, array_length(regexp_split_to_array(trim(body), ' '), 1)/2) x
WHERE head = 'seeds'
),
active2(name TEXT, start_idx BIGINT, end_idx BIGINT) AS (
SELECT 'seed', start_idx, end_idx
FROM seeds2
UNION
SELECT
dst_name,
clipped_start + (entry_dst - entry_start),
clipped_end + (entry_dst - entry_start)
FROM intersection
UNION
SELECT
name,
start_idx,
end_idx
FROM hole
),
-- We would like to perform this mapping, but must find a range.
intent2(src_name TEXT, dst_name TEXT, start_idx BIGINT, end_idx BIGINT) AS (
SELECT DISTINCT entry.src_name, dst_name, start_idx, end_idx
FROM active2, entry
WHERE active2.name = entry.src_name
),
-- Each mapping has a potential intersection with a requested range.
intersection(src_name TEXT, dst_name TEXT, start_idx BIGINT, end_idx BIGINT, entry_start BIGINT, entry_end BIGINT, clipped_start BIGINT, clipped_end BIGINT, entry_dst BIGINT) AS (
SELECT
intent2.src_name,
intent2.dst_name,
intent2.start_idx,
intent2.end_idx,
entry.src_idx,
entry.src_idx + entry.len,
GREATEST(start_idx, entry.src_idx),
LEAST(end_idx, entry.src_idx + entry.len),
entry.dst_idx
FROM intent2, entry
WHERE intent2.src_name = entry.src_name
AND intent2.dst_name = entry.dst_name
AND GREATEST(intent2.start_idx, entry.src_idx)
< LEAST(intent2.end_idx, entry.src_idx + entry.len)
),
-- We may have holes in our intervals. Each intersection's start and end is the end and
-- start, respectively, of some hole we may have that needs to remain the identity.
hole(name TEXT, start_idx BIGINT, end_idx BIGINT) AS (
SELECT * FROM (
SELECT
dst_name,
clipped_end start_idx,
(
SELECT COALESCE(MIN(i2.clipped_start), i1.end_idx)
FROM intersection i2
WHERE i2.clipped_start >= i1.clipped_end
AND i2.clipped_start < i1.end_idx
AND i1.src_name = i2.src_name
AND i1.dst_name = i2.dst_name
AND i1.start_idx = i2.start_idx
AND i1.end_idx = i2.end_idx
) end_idx
FROM intersection i1
UNION
SELECT DISTINCT
dst_name,
start_idx,
(
SELECT COALESCE(MIN(i2.clipped_start), i1.end_idx)
FROM intersection i2
WHERE i2.clipped_start >= i1.start_idx
AND i2.clipped_start < i1.end_idx
AND i1.src_name = i2.src_name
AND i1.dst_name = i2.dst_name
AND i1.start_idx = i2.start_idx
AND i1.end_idx = i2.end_idx
)
FROM intent2 i1
)
WHERE start_idx < end_idx
),
part2(part2 BIGINT) AS ( SELECT MIN(start_idx) FROM active2 WHERE name = 'location')
SELECT * FROM part1, part2;
Contributors
Day 5 was brought to you by: @doy-materialize, @frankmcsherry, @nrainer-materialize
Day six was about whether you knew (or were willing to learn about) the quadratic formula.
See the solution
Part one
WITH options AS
(
SELECT
(floor((time - sqrt(time * time - 4 * record)) / 2) + 1)::int low,
(ceil((time + sqrt(time * time - 4 * record)) / 2) - 1)::int hi,
FROM input
)
SELECT exp(sum(ln(hi - low + 1)))::int
FROM options;
Part one + two in one go!
WITH MUTUALLY RECURSIVE
ties(slower NUMERIC, faster NUMERIC) AS (
SELECT
(time + sqrt(time * time - 4 * distance)) / 2 as slower,
(time - sqrt(time * time - 4 * distance)) / 2 as faster
FROM input
),
options(choices NUMERIC) AS (
SELECT 1 + FLOOR(slower)::NUMERIC - CEIL(faster)::NUMERIC FROM ties
),
part12(part12 NUMERIC) AS (
SELECT pow(10.0, SUM(log(choices))) FROM options
)
SELECT * FROM part12;
Contributors
Day 6 was brought to you by: @doy-materialize, @frankmcsherry, @nrainer-materialize, @petrosagg
Day seven is about scoring poker hands, using some new rules for tie breaking.
This was mostly SQL aggregation, as the numbers of each card in each hand largely determine the outcome, other than tie-breaking where I learned about the translate
function.
See the solution
Part one + two in one go!
-- Hands of cards (e.g. 'AKJQT') and integer bids.
WITH MUTUALLY RECURSIVE
lines(line TEXT) AS ( SELECT regexp_split_to_table(input, '\n') FROM input ),
hands(hand TEXT, bid INT) as (
SELECT regexp_split_to_array(line, ' ')[1],
regexp_split_to_array(line, ' ')[2]::INT
FROM lines
),
cards(hand TEXT, value TEXT, position INT) AS (
SELECT hand, substring(hand, pos, 1), pos
FROM hands, generate_series(1, 5) pos
),
-- Part1
counts(hand TEXT, value TEXT, count INT) AS (
SELECT hand, value, COUNT(*)
FROM cards
GROUP BY hand, value
),
ranked(hand TEXT, bid INT, rank INT, score TEXT) AS (
SELECT
hand,
bid,
CASE WHEN hand IN (SELECT hand FROM counts WHERE count = 5) THEN 1
WHEN hand IN (SELECT hand FROM counts WHERE count = 4) THEN 2
WHEN hand IN (SELECT hand FROM counts WHERE count = 3)
AND hand IN (SELECT hand FROM counts WHERE count = 2) THEN 3
WHEN hand IN (SELECT hand FROM counts WHERE count = 3) THEN 4
WHEN hand IN (SELECT hand FROM (SELECT hand FROM counts WHERE count = 2) GROUP BY hand HAVING COUNT(*) = 2) THEN 5
WHEN hand IN (SELECT hand FROM counts WHERE count = 2) THEN 6
ELSE 7
END,
translate(hand, 'AKQJT98765432', 'ABCDEFGHIJKLM')
FROM
hands
),
part1(part1 INT) AS (
SELECT SUM(r1.bid)
FROM ranked r1, ranked r2
WHERE r1.rank < r2.rank OR (r1.rank = r2.rank AND r1.score <= r2.score)
),
-- Part2: J are now wild for determining rank, but last for score.
wild(hand TEXT, value TEXT, position INT) AS (
SELECT * FROM cards
UNION
SELECT c1.hand, c2.value, c1.position
FROM cards c1, cards c2
WHERE c1.hand = c2.hand
AND c1.value = 'J'
),
wild_hands(hand TEXT, new_hand TEXT) AS (
SELECT DISTINCT w1.hand, w1.value || w2.value || w3.value || w4.value || w5.value
FROM (SELECT * FROM wild w1 WHERE position = 1) w1,
(SELECT * FROM wild w2 WHERE position = 2) w2,
(SELECT * FROM wild w3 WHERE position = 3) w3,
(SELECT * FROM wild w4 WHERE position = 4) w4,
(SELECT * FROM wild w5 WHERE position = 5) w5
WHERE w1.hand = w2.hand
AND w1.hand = w3.hand
AND w1.hand = w4.hand
AND w1.hand = w5.hand
),
wild_cards(hand TEXT, value TEXT, position INT) AS (
SELECT DISTINCT new_hand, substring(new_hand, pos, 1), pos
FROM wild_hands, generate_series(1, 5) pos
),
wild_counts(hand TEXT, value TEXT, count INT) AS (
SELECT hand, value, COUNT(*)
FROM wild_cards
GROUP BY hand, value
),
wild_ranked(hand TEXT, new_hand TEXT, rank INT, score TEXT) AS (
SELECT
hand,
new_hand,
CASE WHEN new_hand IN (SELECT hand FROM wild_counts WHERE count = 5) THEN 1
WHEN new_hand IN (SELECT hand FROM wild_counts WHERE count = 4) THEN 2
WHEN new_hand IN (SELECT hand FROM wild_counts WHERE count = 3)
AND new_hand IN (SELECT hand FROM wild_counts WHERE count = 2) THEN 3
WHEN new_hand IN (SELECT hand FROM wild_counts WHERE count = 3) THEN 4
WHEN new_hand IN (SELECT hand FROM (SELECT hand FROM wild_counts WHERE count = 2) GROUP BY hand HAVING COUNT(*) = 2) THEN 5
WHEN new_hand IN (SELECT hand FROM wild_counts WHERE count = 2) THEN 6
ELSE 7
END,
translate(hand, 'AKQT98765432J', 'ABCDEFGHIJKLM')
FROM
wild_hands
),
best_hands(hand TEXT, new_hand TEXT, rank INT, score TEXT) AS (
SELECT DISTINCT ON (hand) hand, new_hand, rank, score
FROM wild_ranked
ORDER BY hand, rank, score
),
wild_bids(hand TEXT, bid INT, rank INT, score TEXT) AS (
SELECT hands.hand, hands.bid, rank, score
FROM hands, best_hands
WHERE hands.hand = best_hands.hand
),
part2(part2 INT) AS (
SELECT SUM(r1.bid)
FROM wild_bids r1, wild_bids r2
WHERE r1.rank < r2.rank OR (r1.rank = r2.rank AND r1.score <= r2.score)
)
SELECT * FROM part1, part2;
Contributors
Day 7 was brought to you by: @frankmcsherry, @nrainer-materialize
Week two
Day eight involved some graph navigation (recursion), and some mathematics. The mathematics were of the form “notice that various things are relatively prime”, and it was important to rely on SQL as a tool to support reasoning, as opposed to directly attacking the specified computation. In this case, my problem called for 14,935,034,899,483 steps, and no tool is going to make direct simulation be the right answer.
See the solution
Part one + two in one go!
WITH MUTUALLY RECURSIVE
route(step TEXT, steps INT) AS (
SELECT substring(input, steps, 1), steps
FROM steps_input, generate_series(1, length(input)) steps
),
-- Part 1: Start at 'AAA` and go until `ZZZ`.
pos1(state TEXT, steps INT) AS (
SELECT 'AAA', 0
UNION ALL
SELECT
CASE WHEN route.step = 'L' THEN paths.left
WHEN route.step = 'R' THEN paths.right
ELSE '???'
END,
pos1.steps + 1
FROM paths, pos1, route
WHERE pos1.state = paths.state
AND 1 + (pos1.steps % 263) = route.steps
AND pos1.state != 'ZZZ'
AND pos1.state != '???'
)
part1(part1 INT) AS (SELECT steps FROM pos1 WHERE pos1.state = 'ZZZ'),
-- Part 2: Start at all '**A` and go until all at '**Z'
pos2(start TEXT, state TEXT, steps INT) AS (
SELECT state, state, 0
FROM paths
WHERE substring(state, 3, 1) = 'A'
UNION ALL
SELECT
pos2.start,
CASE WHEN route.step = 'L' THEN paths.left
WHEN route.step = 'R' THEN paths.right
ELSE '???'
END,
pos2.steps + 1
FROM paths, pos2, route
WHERE pos2.state = paths.state
AND 1 + (pos2.steps % 263) = route.steps
AND substring(pos2.state, 3, 1) != 'Z'
)
SELECT * FROM pos2 WHERE substring(state, 3, 1) = 'Z';
Contributors
Day 8 was brought to you by: @doy-materialize, @frankmcsherry, @nrainer-materialize
Day nine was a refreshing introduction to polynomials, and how if you take enough derivatives of them they end up at zero. The task was to do this, repeatedly difference adjacent measurements, or adjacent differences, etc., until you get all zeros. Then, integrate back up to get projections in the forward and reverse direction. I used recursion here to accommodate the unknown degree of the polynomial (somewhere in the twenties).
See the solution
Part one + two in one go!
WITH MUTUALLY RECURSIVE (RETURN AT RECURSION LIMIT 30)
lines (line TEXT, line_no INT) AS (
SELECT regexp_split_to_array(input, '\n')[i], i
FROM input, generate_series(1, array_length(regexp_split_to_array(input, '\n'), 1)) i
),
numbers(value INT, line_no INT, col_no INT) AS (
SELECT regexp_split_to_array(line, ' ')[j]::INT, line_no, j
FROM lines, generate_series(1, array_length(regexp_split_to_array(line, ' '), 1)) j
),
-- Contains non-zero values of differences after each round.
derivatives(value INT, line_no INT, col_no INT, round INT) AS (
SELECT numbers.*, 1
FROM numbers
UNION
SELECT
COALESCE(i2.value, 0) - COALESCE(i1.value, 0),
COALESCE(i1.line_no, i2.line_no),
COALESCE(i1.col_no + 1, i2.col_no),
COALESCE(i1.round, i2.round) + 1
FROM derivatives i1 FULL OUTER JOIN derivatives i2 ON (i1.line_no = i2.line_no AND i1.round = i2.round AND i1.col_no + 1 = i2.col_no)
WHERE COALESCE(i2.value, 0) - COALESCE(i1.value, 0) != 0
AND COALESCE(i1.col_no + 1, i2.col_no) > COALESCE(i1.round, i2.round)
AND COALESCE(i1.col_no + 1, i2.col_no) <= 21
),
-- Accumulate the derivatives at the leading edge
part1(part1 BIGINT) AS (
SELECT SUM(value)
FROM derivatives
WHERE col_no = 21
),
-- Accumulate the derivatives at the preceding edge
part2(part2 BIGINT) AS (
SELECT SUM(pow(-1, round + 1) * value)
FROM derivatives
WHERE col_no = round
)
-- SELECT * FROM derivatives WHERE line_no = 1 ORDER BY round, col_no;
SELECT * FROM part1, part2;
Contributors
Day 9 was brought to you by: @frankmcsherry, @nrainer-materialize
Day ten presents you with a grid of pipe (symbols |
, -
, J
, 7
, F
, and L
), and questions about how long a loop of pipe is, and then how many cells are contained within it. The first part involved recursion, and I used it again for a dynamic programming solution to the second part.
See the solution
Part one + two in one go!
WITH MUTUALLY RECURSIVE
lines(line TEXT, row_no INT) AS (
SELECT regexp_split_to_array(input, '\n')[i], i
FROM input, generate_series(1, array_length(regexp_split_to_array(input, '\n'), 1)) i
),
symbols(symb TEXT, row_no INT, col_no INT) as (
SELECT substring(line, j, 1), row_no, j
FROM lines, generate_series(1, length(line)) j
),
-- Each location that is pipe has two neighbors
edge1(r1 INT, c1 INT, r2 INT, c2 INT) AS (
SELECT
row_no,
col_no,
CASE WHEN symb = '-' THEN row_no
WHEN symb = '|' THEN row_no - 1
WHEN symb = 'F' THEN row_no + 1
WHEN symb = 'L' THEN row_no - 1
WHEN symb = 'J' THEN row_no
WHEN symb = '7' THEN row_no
ELSE NULL
END,
CASE WHEN symb = '-' THEN col_no - 1
WHEN symb = '|' THEN col_no
WHEN symb = 'F' THEN col_no
WHEN symb = 'L' THEN col_no
WHEN symb = 'J' THEN col_no - 1
WHEN symb = '7' THEN col_no - 1
ELSE NULL
END
FROM symbols
WHERE symb != '.' AND symb != 'S'
),
edge2(r1 INT, c1 INT, r2 INT, c2 INT) AS (
SELECT
row_no,
col_no,
CASE WHEN symb = '-' THEN row_no
WHEN symb = '|' THEN row_no + 1
WHEN symb = 'F' THEN row_no
WHEN symb = 'L' THEN row_no
WHEN symb = 'J' THEN row_no - 1
WHEN symb = '7' THEN row_no + 1
ELSE NULL
END,
CASE WHEN symb = '-' THEN col_no + 1
WHEN symb = '|' THEN col_no
WHEN symb = 'F' THEN col_no + 1
WHEN symb = 'L' THEN col_no + 1
WHEN symb = 'J' THEN col_no
WHEN symb = '7' THEN col_no
ELSE NULL
END
FROM symbols
WHERE symb != '.' AND symb != 'S'
),
-- Symmetrized graph
symm(r1 INT, c1 INT, r2 INT, c2 INT) AS (
SELECT r1, c1, r2, c2
FROM (
SELECT * FROM edge1
UNION ALL
SELECT * FROM edge2
UNION ALL
SELECT r2, c2, r1, c1 FROM edge1
UNION ALL
SELECT r2, c2, r1, c1 FROM edge2
UNION ALL
SELECT row_no, col_no, row_no + 1, col_no FROM symbols WHERE symb = 'S'
UNION ALL
SELECT row_no, col_no, row_no, col_no + 1 FROM symbols WHERE symb = 'S'
UNION ALL
SELECT row_no, col_no, row_no - 1, col_no FROM symbols WHERE symb = 'S'
UNION ALL
SELECT row_no, col_no, row_no, col_no - 1 FROM symbols WHERE symb = 'S'
)
GROUP BY r1, c1, r2, c2
HAVING COUNT(*) = 2
),
reach(r INT, c INT) AS (
SELECT row_no, col_no
FROM symbols
WHERE symb = 'S'
UNION
SELECT r2, c2
FROM reach, symm
WHERE r = r1 AND c = c1
),
part1(part1 BIGINT) AS (
SELECT COUNT(*)/2 FROM reach
),
-- Part 2: how many cells are *inside* the loop?
-- All (1, *) and (*, 1) cells have their upper left outside the loop (outer edge of the diagram).
-- Each cell inherits from its UL neighbor, toggled by any pipe except '7' and 'L' pipe.
-- Rewrite the pipe to have symbols, and resolve 'S' to actual oriented pipe.
pipe(r INT, c INT, symb TEXT) AS (
SELECT r, c, symb
FROM reach, symbols
WHERE r = row_no AND c = col_no AND symb != 'S'
UNION
SELECT
row_no,
col_no,
CASE WHEN row_no = s1.r1 AND col_no = s1.c1 + 1 AND row_no = s2.r2 + 1 AND col_no = s2.c2 THEN 'J' -- toggle
WHEN row_no = s1.r1 AND col_no = s1.c1 + 1 AND row_no = s2.r2 AND col_no = s2.c2 - 1 THEN '-' -- toggle
WHEN row_no = s1.r1 AND col_no = s1.c1 + 1 AND row_no = s2.r2 - 1 AND col_no = s2.c2 THEN '7' -- no toggle
WHEN row_no = s1.r1 + 1 AND col_no = s1.c1 AND row_no = s2.r2 AND col_no = s2.c2 - 1 THEN 'L' -- no toggle
WHEN row_no = s1.r1 + 1 AND col_no = s1.c1 AND row_no = s2.r2 - 1 AND col_no = s2.c2 THEN '|' -- toggle
WHEN row_no = s1.r1 AND col_no = s1.c1 - 1 AND row_no = s2.r2 AND col_no = s2.c2 - 1 THEN 'F' -- toggle
ELSE '???'
END
FROM symbols, symm s1, symm s2
WHERE symb = 'S'
AND row_no = s1.r1
AND col_no = s1.c1
AND row_no = s2.r1
AND col_no = s2.c1
),
-- Enclosed(1,*) and Enclosed(*,1) are all false.
-- Enclosed(x+1,y+1) = Enclosed(x,y) perhaps toggled by pipe(x,y)
status(r INT, c INT, encl BOOL) AS (
SELECT row_no, col_no, false
FROM symbols
WHERE row_no = 1 OR col_no = 1
UNION
SELECT
row_no + 1,
col_no + 1,
CASE WHEN pipe.symb IN (VALUES ('J'),('-'),('|'),('F')) THEN NOT encl
ELSE encl
END
FROM status LEFT JOIN pipe ON (status.r = pipe.r AND status.c = pipe.c)
JOIN symbols ON (status.r = symbols.row_no AND status.c = symbols.col_no)
),
part2(part2 BIGINT) AS (
SELECT COUNT(*)
FROM status
WHERE encl = true AND (r, c) NOT IN (SELECT r, c FROM pipe)
)
SELECT * FROM part1, part2;
Contributors
Day 10 was brought to you by: @frankmcsherry
Day eleven presents a grid of “galaxies” and has you calculate the distance between pairs (the L1 or “Manhattan” distance, always the sum of absolute values of coordinate differences). Parts one and two were the same, but with different magnitudes of numbers. No recursion here!
See the solution
Part one + two in one go!
WITH MUTUALLY RECURSIVE
lines(line TEXT, r INT) AS (
SELECT regexp_split_to_array(input, '\n')[i], i
FROM input, generate_series(1, array_length(regexp_split_to_array(input, '\n'), 1)) i
),
symbols(symb TEXT, r INT, c INT) as (
SELECT substring(line, j, 1), r, j
FROM lines, generate_series(1, length(line)) j
),
row_gaps(r INT) AS (
SELECT r
FROM symbols
GROUP BY r
HAVING COUNT(*) FILTER (WHERE symb = '#') = 0
),
col_gaps(c INT) AS (
SELECT c
FROM symbols
GROUP BY c
HAVING COUNT(*) FILTER (WHERE symb = '#') = 0
),
-- Part1: Expand space and restrict to galaxies
galaxies(r INT, c INT) AS (
SELECT
r + (SELECT COUNT(*) FROM row_gaps WHERE row_gaps.r < symbols.r),
c + (SELECT COUNT(*) FROM col_gaps WHERE col_gaps.c < symbols.c)
FROM symbols
WHERE symb = '#'
),
-- Sum of L1 distance between distinct galaxies
part1(part1 BIGINT) AS (
SELECT SUM(ABS(g1.r - g2.r) + ABS(g1.c - g2.c))
FROM galaxies g1, galaxies g2
WHERE g1.r < g2.r
OR (g1.r = g2.r AND g1.c < g2.c)
)
-- Part2: Expand space MORE and restrict to galaxies
galaxies2(r INT, c INT) AS (
SELECT
r + 999999 * (SELECT COUNT(*) FROM row_gaps WHERE row_gaps.r < symbols.r),
c + 999999 * (SELECT COUNT(*) FROM col_gaps WHERE col_gaps.c < symbols.c)
FROM symbols
WHERE symb = '#'
),
-- Sum of L1 distance between distinct galaxies
part2(part2 BIGINT) AS (
SELECT SUM(ABS(g1.r - g2.r) + ABS(g1.c - g2.c))
FROM galaxies2 g1, galaxies2 g2
WHERE g1.r < g2.r
OR (g1.r = g2.r AND g1.c < g2.c)
)
SELECT * FROM part1, part2;
Contributors
Day 11 was brought to you by: @frankmcsherry, @nrainer-materialize
Day twelve was about sequence alignment, matching partial observations with hard constraints. Dynamic programming was a great solution here, using recursion.
See the solution
Part one
WITH MUTUALLY RECURSIVE
lines(r INT, characters TEXT, springs TEXT) AS (
SELECT
row_id,
regexp_split_to_array(regexp_split_to_array(input, '\n')[row_id], ' ')[1] || '.',
regexp_split_to_array(regexp_split_to_array(input, '\n')[row_id], ' ')[2]
FROM
input,
generate_series(1, array_length(regexp_split_to_array(input, '\n'), 1)) row_id
),
characters(r INT, pos INT, symb TEXT) AS (
SELECT
r,
pos,
substring(characters, pos, 1)
FROM
lines,
generate_series(1, length(characters)) pos
),
springs(r INT, pos INT, len INT) AS (
SELECT
r,
pos,
regexp_split_to_array(springs, ',')[pos]::INT
FROM
lines,
generate_series(1, array_length(regexp_split_to_array(springs, ','), 1)) pos
),
-- How many ways can we pack row `r`'s first `spring` springs (plus a space) into the first `chars` characters?
-- Importantly, the "plus a space" applies to the last spring also! Each of these should admit the immediate appending of a new spring.
fits(r INT, chars INT, spring INT) AS (
-- We can pack no springs into no characters.
SELECT r, 0, 0
FROM lines
-- We can extend any fits with a blank, as long as there are no '#' observations.
UNION ALL
SELECT fits.r, fits.chars + 1, fits.spring
FROM fits, characters
WHERE fits.r = characters.r
AND fits.chars + 1 = characters.pos
AND characters.symb != '#'
-- We can extend any fits with the next spring and a blank, as long as no '.' in the spring and no '#' in the blank.
UNION ALL
SELECT fits.r, fits.chars + springs.len + 1, fits.spring + 1
FROM
fits,
springs,
characters
WHERE fits.r = springs.r
AND fits.spring + 1 = springs.pos
AND fits.r = characters.r
AND fits.chars + springs.len + 1 = characters.pos
AND characters.symb != '#'
AND NOT EXISTS (SELECT FROM characters c WHERE c.r = fits.r AND c.symb = '.' AND c.pos BETWEEN fits.chars + 1 AND fits.chars + springs.len)
),
fit_counts(r INT, chars INT, spring INT, count BIGINT) AS (
SELECT r, chars, spring, COUNT(*) AS count
FROM fits
GROUP BY r, chars, spring
),
counts(r INT, chars INT, spring INT, count BIGINT) AS (
SELECT DISTINCT ON (r) r, chars, spring, count
FROM fit_counts
ORDER BY r, chars DESC, spring DESC
),
potato (x INT) AS ( SELECT 1 )
SELECT SUM(count) FROM counts;
Contributors
Day 12 was brought to you by: @frankmcsherry, @nrainer-materialize
Day thirteen had grids of observations with the hypothesis that each is mirrored, horizontally or vertically, at some point that you need to find. SQL and subqueries were a great way to validate hypothetical mirroring axes.
See the solution
Part one + two in one go!
WITH MUTUALLY RECURSIVE
blocks(b INT, block TEXT) AS (
SELECT b, regexp_split_to_array(input, '\n\n')[b] as block
FROM input, generate_series(1, array_length(regexp_split_to_array(input, '\n\n'), 1)) b
),
lines(b INT, r INT, line TEXT) AS (
SELECT b, r, regexp_split_to_array(block, '\n')[r] as block
FROM blocks, generate_series(1, array_length(regexp_split_to_array(block, '\n'), 1)) r
),
cells(b INT, r INT, c INT, symbol TEXT) AS (
SELECT b, r, c, substring(line, c, 1)
FROM lines, generate_series(1, length(line)) c
),
columns(b INT, c INT, column TEXT) AS (
SELECT b, c, string_agg(symbol, '' ORDER BY r) FROM cells GROUP BY b, c
),
row_mirror(b INT, r INT) AS (
SELECT *
FROM (SELECT DISTINCT b, r FROM cells) o
WHERE NOT EXISTS (
-- We would be upset to find rows at mirrored positions that do not match
-- Rows that match, or have no mirrored position, are fine.
SELECT FROM lines
WHERE o.b = lines.b
GROUP BY abs(2 * lines.r - (2 * o.r - 1))
HAVING COUNT(DISTINCT lines.line) > 1
)
),
col_mirror(b INT, c INT) AS (
SELECT *
FROM (SELECT DISTINCT b, c FROM cells) o
WHERE NOT EXISTS (
-- We would be upset to find rows at mirrored positions that do not match
-- Rows that match, or have no mirrored position, are fine.
SELECT FROM columns
WHERE o.b = columns.b
GROUP BY abs(2 * columns.c - (2 * o.c - 1))
HAVING COUNT(DISTINCT columns.column) > 1
)
),
part1(part1 BIGINT) AS (
SELECT COALESCE((SELECT SUM(r-1) FROM row_mirror), 0) * 100
+ COALESCE((SELECT SUM(c-1) FROM col_mirror), 0)
),
row_mirror2(b INT, r INT) AS (
SELECT *
FROM (SELECT DISTINCT b, r FROM cells) o
WHERE 1 = (
SELECT COUNT(*)
FROM cells c1, cells c2
WHERE abs(2 * c1.r - (2 * o.r - 1)) = abs(2 * c2.r - (2 * o.r - 1))
AND c1.r < c2.r
AND c1.c = c2.c
AND c1.b = c2.b
AND c1.b = o.b
AND c1.symbol != c2.symbol
)
),
col_mirror2(b INT, c INT) AS (
SELECT *
FROM (SELECT DISTINCT b, c FROM cells) o
WHERE 1 = (
SELECT COUNT(*)
FROM cells c1, cells c2
WHERE abs(2 * c1.c - (2 * o.c - 1)) = abs(2 * c2.c - (2 * o.c - 1))
AND c1.c < c2.c
AND c1.r = c2.r
AND c1.b = c2.b
AND c1.b = o.b
AND c1.symbol != c2.symbol
)
),
part2(part2 BIGINT) AS (
SELECT COALESCE((SELECT SUM(r-1) FROM row_mirror2), 0) * 100
+ COALESCE((SELECT SUM(c-1) FROM col_mirror2), 0)
),
potato (x INT) AS ( SELECT 1 )
SELECT * FROM part1, part2;
Contributors
Day 13 was brought to you by: @frankmcsherry, @nrainer-materialize
Day fourteen was a treat, in that it used nested recursion: a WMR
block within a WMR
block.
The problem was simulation of rocks that roll in cardinal directions, changing the direction ninety degrees, and repeating.
Each simulation was recursive (rocks roll until they stop), and we were meant to repeat the larger progress a great many times (1,000,000,000 cycles).
The only bummer here was the amount of copy/paste re-use, as each of the four cardinal directions had different subqueries.
See the solution
Part 1
WITH MUTUALLY RECURSIVE
lines(r INT, line TEXT) AS (
SELECT r, regexp_split_to_array(input, '\n')[r] as block
FROM input, generate_series(1, array_length(regexp_split_to_array(input, '\n'), 1)) r
),
cells(r INT, c INT, symbol TEXT) AS (
SELECT r, c, substring(line, c, 1)
FROM lines, generate_series(1, length(line)) c
),
northward(r INT, c INT, symbol TEXT) AS (
SELECT * FROM northward
-- Anyone on the move does so
UNION ALL SELECT r - 1, c, 'O' FROM north_move
EXCEPT ALL SELECT r - 1, c, '.' FROM north_move
UNION ALL SELECT r, c, '.' FROM north_move
EXCEPT ALL SELECT r, c, 'O' FROM north_move
-- Initial state is cells, but not refreshed each round.
UNION ALL SELECT * FROM cells
EXCEPT ALL SELECT * FROM cells_delay
),
-- Each 'O' with a '.' to the north will move.
north_move(r INT, c INT) AS (
SELECT n1.r, n1.c
FROM northward n1, northward n2
WHERE n1.symbol = 'O'
AND n1.r = n2.r + 1
AND n1.c = n2.c
AND n2.symbol = '.'
),
part1(part1 BIGINT) AS (
SELECT SUM(1 + (SELECT MAX(r) FROM lines) - r)
FROM northward
WHERE symbol = 'O'
),
output (r INT, line TEXT) AS (
SELECT r, string_agg(symbol, ' ' ORDER BY c)
FROM northward
GROUP BY r
),
cells_delay(r INT, c INT, symbol TEXT) AS ( SELECT * FROM cells )
SELECT * FROM part1;
Part 2
WITH MUTUALLY RECURSIVE (RETURN AT RECURSION LIMIT 142)
lines(r INT, line TEXT) AS (
SELECT r, regexp_split_to_array(input, '\n')[r] as block
FROM input, generate_series(1, array_length(regexp_split_to_array(input, '\n'), 1)) r
),
cells(r INT, c INT, symbol TEXT) AS (
SELECT r, c, substring(line, c, 1)
FROM lines, generate_series(1, length(line)) c
),
-- Where should we start each iteration from?
-- From `east`, once it exits, but initially `cells`.
round(r INT, c INT, symbol TEXT) AS (
SELECT * FROM east
UNION ALL SELECT * FROM cells
EXCEPT ALL SELECT * FROM cells_delay
),
north(r INT, c INT, symbol TEXT) AS (
WITH MUTUALLY RECURSIVE
start(r INT, c INT, symbol TEXT) AS (
SELECT * FROM round
),
northward(r INT, c INT, symbol TEXT) AS (
SELECT * FROM northward
-- Anyone on the move does so
UNION ALL SELECT r - 1, c, 'O' FROM north_move
EXCEPT ALL SELECT r - 1, c, '.' FROM north_move
UNION ALL SELECT r, c, '.' FROM north_move
EXCEPT ALL SELECT r, c, 'O' FROM north_move
-- Second time around, the above cancels and `east` is non-empty.
UNION ALL SELECT * FROM start
EXCEPT ALL SELECT * FROM start_delay
),
-- Each 'O' with a '.' in front of them will move.
north_move(r INT, c INT) AS (
SELECT n1.r, n1.c
FROM northward n1, northward n2
WHERE n1.symbol = 'O'
AND n1.r = n2.r + 1
AND n1.c = n2.c
AND n2.symbol = '.'
),
start_delay(r INT, c INT, symbol TEXT) AS ( SELECT * FROM start )
SELECT * FROM northward
),
west(r INT, c INT, symbol TEXT) AS (
WITH MUTUALLY RECURSIVE
start(r INT, c INT, symbol TEXT) AS (
SELECT * FROM north
),
westward(r INT, c INT, symbol TEXT) AS (
SELECT * FROM westward
-- Anyone on the move does so
UNION ALL SELECT r, c - 1, 'O' FROM west_move
EXCEPT ALL SELECT r, c - 1, '.' FROM west_move
UNION ALL SELECT r, c, '.' FROM west_move
EXCEPT ALL SELECT r, c, 'O' FROM west_move
-- Initial state is cells, but not refreshed each round.
UNION ALL SELECT * FROM start
EXCEPT ALL SELECT * FROM start_delay
),
-- Each 'O' with a '.' in front of them will move.
west_move(r INT, c INT) AS (
SELECT w1.r, w1.c
FROM westward w1, westward w2
WHERE w1.symbol = 'O'
AND w1.r = w2.r
AND w1.c = w2.c + 1
AND w2.symbol = '.'
),
start_delay(r INT, c INT, symbol TEXT) AS ( SELECT * FROM start )
SELECT * FROM westward
),
south(r INT, c INT, symbol TEXT) AS (
WITH MUTUALLY RECURSIVE
start(r INT, c INT, symbol TEXT) AS (
SELECT * FROM west
),
southward(r INT, c INT, symbol TEXT) AS (
SELECT * FROM southward
-- Anyone on the move does so
UNION ALL SELECT r + 1, c, 'O' FROM south_move
EXCEPT ALL SELECT r + 1, c, '.' FROM south_move
UNION ALL SELECT r, c, '.' FROM south_move
EXCEPT ALL SELECT r, c, 'O' FROM south_move
-- Initial state is cells, but not refreshed each round.
UNION ALL SELECT * FROM start
EXCEPT ALL SELECT * FROM start_delay
),
-- Each 'O' with a '.' in front of them will move.
south_move(r INT, c INT) AS (
SELECT s1.r, s1.c
FROM southward s1, southward s2
WHERE s1.symbol = 'O'
AND s1.r = s2.r - 1
AND s1.c = s2.c
AND s2.symbol = '.'
),
start_delay(r INT, c INT, symbol TEXT) AS ( SELECT * FROM start )
SELECT * FROM southward
),
east(r INT, c INT, symbol TEXT) AS (
WITH MUTUALLY RECURSIVE
start(r INT, c INT, symbol TEXT) AS (
SELECT * FROM south
),
eastward(r INT, c INT, symbol TEXT) AS (
SELECT * FROM eastward
-- Anyone on the move does so
UNION ALL SELECT r, c + 1, 'O' FROM east_move
EXCEPT ALL SELECT r, c + 1, '.' FROM east_move
UNION ALL SELECT r, c, '.' FROM east_move
EXCEPT ALL SELECT r, c, 'O' FROM east_move
-- Initial state is cells, but not refreshed each round.
UNION ALL SELECT * FROM start
EXCEPT ALL SELECT * FROM start_delay
),
-- Each 'O' with a '.' in front of them will move.
east_move(r INT, c INT) AS (
SELECT e1.r, e1.c
FROM eastward e1, eastward e2
WHERE e1.symbol = 'O'
AND e1.r = e2.r
AND e1.c = e2.c - 1
AND e2.symbol = '.'
),
start_delay(r INT, c INT, symbol TEXT) AS ( SELECT * FROM start )
SELECT * FROM eastward
),
output (r INT, line TEXT) AS (
SELECT r, string_agg(symbol, ' ' ORDER BY c)
FROM round
GROUP BY r
),
transitions(source TEXT, target TEXT) AS (
SELECT
(SELECT string_agg(symbol, '' ORDER BY r, c) FROM round),
(SELECT string_agg(symbol, '' ORDER BY r, c) FROM east)
UNION ALL
SELECT * FROM transitions
),
part2(part2 BIGINT) AS (
SELECT SUM(1 + (SELECT MAX(r) FROM lines) - r)
FROM east
WHERE symbol = 'O'
),
cells_delay(r INT, c INT, symbol TEXT) AS ( SELECT * FROM cells )
-- SELECT count, COUNT(*)
-- FROM (
-- SELECT source, target, COUNT(*) count
-- FROM transitions
-- GROUP BY source, target)
-- GROUP BY count;
-- SELECT * FROM output ORDER BY r;
SELECT * FROM part2;
Contributors
Day 14 was brought to you by: @frankmcsherry
Week three
Day fifteen has you implement a hash function, and then a hash map. Recursion was a handy way to walk through the input to be hashed, though the hash function was simple enough that you could have used math directly instead. The second part (the hash map) did not require recursion, as rather than simulate the operations you could leap to the final state you were looking for.
See the solution
Part one + two in one go!
WITH MUTUALLY RECURSIVE (RETURN AT RECURSION LIMIT 10)
strings(r INT, string TEXT) AS (
SELECT r, regexp_split_to_array(input, ',')[r]
FROM input, generate_series(1, array_length(regexp_split_to_array(input, ','), 1)) r
),
-- Advance the hash by one character, until all strings are empty.
hashes(string TEXT, hash BIGINT) AS (
SELECT string, 0 as hash
FROM strings
UNION ALL
SELECT substring(string, 2), ((hash + ascii(substring(string, 1, 1))) * 17) % 256
FROM hashes
WHERE length(string) > 0
),
part1(part1 BIGINT) AS (
SELECT SUM(hash)
FROM hashes
WHERE string = ''
),
-- Parse strings as symbol plus commands; either `-` or `=X`.
commands(r INT, symb TEXT, op INT) AS (
SELECT
r,
CASE WHEN substring(string, length(string)) = '-'
THEN substring(string, 1, length(string)-1)
ELSE substring(string, 1, length(string)-2)
END,
CASE WHEN substring(string, length(string)) = '-'
THEN 0
ELSE substring(string, length(string))::INT
END
FROM strings
),
-- Operations that happen after a symbol's last delete operation.
-- All other operations do not matter, and do not affect the state.
final_ops(r INT, symb TEXT, op INT) AS (
SELECT *
FROM commands
WHERE r > COALESCE(
(SELECT MAX(r)
FROM commands c2
WHERE commands.symb = c2.symb
AND c2.op = 0), 0)
),
-- Each symbol is summarized by their first final insert time, and the last final operation
final_state(r INT, symb TEXT, op INT) AS (
SELECT DISTINCT ON(symb)
(SELECT MIN(r) FROM final_ops fo2 WHERE fo2.symb = final_ops.symb),
symb,
op
FROM final_ops
ORDER BY symb, r DESC, op
),
-- Redo the hash computation on symbols rather than commands.
hashes2(start TEXT, string TEXT, hash BIGINT) AS (
SELECT symb as start, symb as string, 0 as hash
FROM final_state
UNION ALL
SELECT start, substring(string, 2), ((hash + ascii(substring(string, 1, 1))) * 17) % 256
FROM hashes2
WHERE length(string) > 0
),
-- Bin up the state, so's we can tabulate it
binned(hash BIGINT, r INT, symb TEXT, op INT) AS (
SELECT hash, final_state.*
FROM hashes2, final_state
WHERE hashes2.start = symb
AND hashes2.string = ''
),
-- Sum the product of 1 + hash, the position in bin by r, and the op.
part2(part2 BIGINT) AS (
SELECT SUM(
(1 + hash) *
(SELECT COUNT(*) FROM binned b2 WHERE binned.hash = b2.hash AND binned.r >= b2.r) *
op
)
FROM binned
),
potato(x int) as (select 1)
SELECT * FROM part1, part2;
Contributors
Day 15 was brought to you by: @frankmcsherry, @nrainer-materialize
Day sixteen was about bouncing light around in a grid, and seeing how many grid cells are illuminated.
The illumination process was classic recursive SQL, where you keep expanding (row, col, dir)
triples until the set reaches a fixed point.
In the second part the light sources had an origin, which is just a fourth column to add, tracking the source of each ray of light.
See the solution
Part one + two in one go!
WITH MUTUALLY RECURSIVE
lines(r INT, line TEXT) AS (
SELECT r, regexp_split_to_array(input, '\n')[r] as block
FROM input, generate_series(1, array_length(regexp_split_to_array(input, '\n'), 1)) r
),
cells(r INT, c INT, symbol TEXT) AS (
SELECT r, c, substring(line, c, 1)
FROM lines, generate_series(1, length(line)) c
),
shift(dir TEXT, symbol TEXT, dr INT, dc INT, new_dir TEXT) AS (
VALUES
('r', '.', 0, 1, 'r'),
('r', '-', 0, 1, 'r'),
('r', '|', 1, 0, 'd'),
('r', '|', -1, 0, 'u'),
('r', '/', -1, 0, 'u'),
('r', '\', 1, 0, 'd'),
('l', '.', 0, -1, 'l'),
('l', '-', 0, -1, 'l'),
('l', '|', 1, 0, 'd'),
('l', '|', -1, 0, 'u'),
('l', '/', 1, 0, 'd'),
('l', '\', -1, 0, 'u'),
('u', '.', -1, 0, 'u'),
('u', '-', 0, 1, 'r'),
('u', '-', 0, -1, 'l'),
('u', '|', -1, 0, 'u'),
('u', '/', 0, 1, 'r'),
('u', '\', 0, -1, 'l'),
('d', '.', 1, 0, 'd'),
('d', '-', 0, 1, 'r'),
('d', '-', 0, -1, 'l'),
('d', '|', 1, 0, 'd'),
('d', '/', 0, -1, 'l'),
('d', '\', 0, 1, 'r')
),
-- Light is in a location, and has a direction.
light(r INT, c INT, dir TEXT) AS (
SELECT 1, 1, 'r'
UNION
SELECT light.r + dr, light.c + dc, new_dir
FROM light, cells, shift
WHERE light.r = cells.r
AND light.c = cells.c
AND light.dir = shift.dir
AND cells.symbol = shift.symbol
),
part1(part1 BIGINT) AS (
SELECT COUNT(*) FROM (
SELECT DISTINCT light.r, light.c
FROM light, cells
WHERE light.r = cells.r
AND light.c = cells.c
)
),
-- Light is in a location, a direction, and an origin.
light2(r INT, c INT, dir TEXT, source TEXT) AS (
SELECT DISTINCT * FROM (SELECT r, (SELECT MIN(c) FROM cells), 'r', 'r' || r FROM cells) UNION
SELECT DISTINCT * FROM (SELECT r, (SELECT MAX(c) FROM cells), 'l', 'l' || r FROM cells) UNION
SELECT DISTINCT * FROM (SELECT (SELECT MIN(r) FROM cells), c, 'd', 'd' || c FROM cells) UNION
SELECT DISTINCT * FROM (SELECT (SELECT MAX(c) FROM cells), c, 'u', 'u' || c FROM cells) UNION
SELECT light2.r + dr, light2.c + dc, new_dir, source
FROM light2, cells, shift
WHERE light2.r = cells.r
AND light2.c = cells.c
AND light2.dir = shift.dir
AND cells.symbol = shift.symbol
),
part2(part2 BIGINT) AS (
SELECT MAX(count) FROM (
SELECT source, COUNT(*) FROM (
SELECT DISTINCT light2.r, light2.c, source
FROM light2, cells
WHERE light2.r = cells.r
AND light2.c = cells.c
)
GROUP BY source
)
)
SELECT * FROM part1, part2;
Contributors
Day 16 was brought to you by: @frankmcsherry
Day seventeen is a pathfinding problem, with constraints on how you move around the path (not too short or too long in any direction at once). Classic recursive SQL to implement Bellman-Ford.
See the solution
Part one + two in one go!
WITH MUTUALLY RECURSIVE
lines(r INT, line TEXT) AS (
SELECT r, regexp_split_to_array(input, '\n')[r] as block
FROM input, generate_series(1, array_length(regexp_split_to_array(input, '\n'), 1)) r
),
cells(r INT, c INT, cost INT) AS (
SELECT r, c, substring(line, c, 1)::INT
FROM lines, generate_series(1, length(line)) c
),
-- For each cell, we can be headed n, e, w, s and have gone 1, 2, 3 steps already.
-- There is a mimimum cost path to reach this configuration, and .. we might need
-- to remember how we got there but let's do that in part 2.
min_cost(r INT, c INT, dr INT, dc INT, steps INT, cost INT) AS (
SELECT r, c, dr, dc, steps, MIN(cost)
FROM (
SELECT 1 as r, 1 as c, 1 as dr, 0 as dc, 0 as steps, 0 as cost
UNION ALL
SELECT 1, 1, 0, 1, 0, 0
-- We could have just stepped to r, c in a few ways, incurring its cost.
UNION ALL
SELECT cells.r, cells.c, dr, dc, steps + 1, min_cost.cost + cells.cost
FROM min_cost, cells
WHERE steps < 3
AND cells.r = min_cost.r + dr
AND cells.c = min_cost.c + dc
-- We could take a ??? turn
UNION ALL
SELECT cells.r, cells.c, dc, dr, 1, min_cost.cost + cells.cost
FROM min_cost, cells
WHERE cells.r = min_cost.r + dc
AND cells.c = min_cost.c + dr
-- We could take a ??? turn
UNION ALL
SELECT cells.r, cells.c, -dc, -dr, 1, min_cost.cost + cells.cost
FROM min_cost, cells
WHERE cells.r = min_cost.r - dc
AND cells.c = min_cost.c - dr
)
GROUP BY r, c, dr, dc, steps
),
part1(part1 INT) AS (
SELECT MIN(cost)
FROM min_cost
WHERE r = (SELECT MAX(r) FROM cells)
AND c = (SELECT MAX(c) FROM cells)
),
potato(x INT) AS (SELECT 1),
-- For each cell, we can be headed n, e, w, s and have gone 1, 2, 3 steps already.
-- There is a mimimum cost path to reach this configuration, and .. we might need
-- to remember how we got there but let's do that in part 2.
min_cost2(r INT, c INT, dr INT, dc INT, steps INT, cost INT) AS (
SELECT r, c, dr, dc, steps, MIN(cost)
FROM (
SELECT 1 as r, 1 as c, 1 as dr, 0 as dc, 0 as steps, 0 as cost
UNION ALL
SELECT 1, 1, 0, 1, 0, 0
-- We could have just stepped to r, c in a few ways, incurring its cost.
UNION ALL
SELECT cells.r, cells.c, dr, dc, steps + 1, min_cost2.cost + cells.cost
FROM min_cost2, cells
WHERE steps < 10
AND cells.r = min_cost2.r + dr
AND cells.c = min_cost2.c + dc
-- We could take a XYZ turn
UNION ALL
SELECT cells.r, cells.c, dc, dr, 1, min_cost2.cost + cells.cost
FROM min_cost2, cells
WHERE steps >= 4
AND cells.r = min_cost2.r + dc
AND cells.c = min_cost2.c + dr
-- We could take a ZYX turn
UNION ALL
SELECT cells.r, cells.c, -dc, -dr, 1, min_cost2.cost + cells.cost
FROM min_cost2, cells
WHERE steps >= 4
AND cells.r = min_cost2.r - dc
AND cells.c = min_cost2.c - dr
)
GROUP BY r, c, dr, dc, steps
),
part2(part2 INT) AS (
SELECT MIN(cost)
FROM min_cost2
WHERE r = (SELECT MAX(r) FROM cells)
AND c = (SELECT MAX(c) FROM cells)
AND steps >= 4
),
SELECT * FROM part1, part2;
Contributors
Day 17 was brought to you by: @frankmcsherry
Day eighteen provides instructions of how a digger will move around, excavating earth, and asks you to calculate the area. This is an opportunity to learn about the Trapezoid formula for computing the area as the addition and subtraction of trapezoid areas.
See the solution
Part one + two in one go!
WITH MUTUALLY RECURSIVE
lines(r INT, line TEXT) AS (
SELECT r, regexp_split_to_array(input, '\n')[r] as line
FROM input, generate_series(1, array_length(regexp_split_to_array(input, '\n'), 1)) r
),
split1(r INT, dr INT, dc INT, steps INT) AS (
SELECT
r,
CASE WHEN regexp_split_to_array(line, ' ')[1] = 'U' THEN -1
WHEN regexp_split_to_array(line, ' ')[1] = 'D' THEN 1
ELSE 0
END,
CASE WHEN regexp_split_to_array(line, ' ')[1] = 'L' THEN -1
WHEN regexp_split_to_array(line, ' ')[1] = 'R' THEN 1
ELSE 0
END,
regexp_split_to_array(line, ' ')[2]::INT
FROM lines
),
-- Part 1 is prefix sum followed by area calculations.
-- We'll brute force the prefix sum part, and use the
-- "trapezoid formula", summing + and - contributions
-- as the path moves around.
path1(r1 INT, c1 INT, r2 INT, c2 INT, rounds INT) AS (
SELECT 0, 0, 0, 0, 1
UNION
SELECT
path1.r2,
path1.c2,
path1.r2 + split1.dr * split1.steps,
path1.c2 + split1.dc * split1.steps,
path1.rounds + 1
FROM path1, split1
WHERE path1.rounds = split1.r
),
-- The area carved by the path, plus half a unit of area
-- for each path step, plus 4 * (1/4) units for the net
-- four 90 degree turns.
part1(part1 BIGINT) AS (
SELECT
ABS((SELECT SUM((r1 + r2) * (c1 - c2)) FROM path1)) / 2
+ (SELECT SUM(steps) FROM split1) / 2
+ 1
),
-- Part 2 changes how we parse each line to give long paths.
split2(r INT, dr INT, dc INT, steps INT) AS (
SELECT
r,
CASE WHEN substring(regexp_split_to_array(line, ' ')[3], 8, 1) = '3' THEN -1
WHEN substring(regexp_split_to_array(line, ' ')[3], 8, 1) = '1' THEN 1
ELSE 0
END,
CASE WHEN substring(regexp_split_to_array(line, ' ')[3], 8, 1) = '2' THEN -1
WHEN substring(regexp_split_to_array(line, ' ')[3], 8, 1) = '0' THEN 1
ELSE 0
END,
256 * 256 * get_byte(decode('0' || substring(regexp_split_to_array(line, ' ')[3], 3, 5), 'hex'), 0)
+ 256 * get_byte(decode('0' || substring(regexp_split_to_array(line, ' ')[3], 3, 5), 'hex'), 1)
+ get_byte(decode('0' || substring(regexp_split_to_array(line, ' ')[3], 3, 5), 'hex'), 2)
FROM lines
),
path2(r1 BIGINT, c1 BIGINT, r2 BIGINT, c2 BIGINT, rounds INT) AS (
SELECT 0, 0, 0, 0, 1
UNION
SELECT
path2.r2,
path2.c2,
path2.r2 + split2.dr * split2.steps,
path2.c2 + split2.dc * split2.steps,
path2.rounds + 1
FROM path2, split2
WHERE path2.rounds = split2.r
),
-- The area carved by the path, plus half a unit of area
-- for each path step, plus 4 * (1/4) units for the net
-- four 90 degree turns.
part2(part2 BIGINT) AS (
SELECT
ABS((SELECT SUM((r1 + r2) * (c1 - c2)) FROM path2)) / 2
+ (SELECT SUM(steps) FROM split2) / 2
+ 1
)
SELECT * FROM part1, part2;
Contributors
Day 18 was brought to you by: @frankmcsherry
Day nineteen sneakily introduces you to binary space partitioning, where rules based on inequality tests route you to new rules, until eventually you reach some rule that says “accept” or “reject”. This was all pretty easy, except for a substantial amount of SQL overhead related to the various symbols and characters and coordinates all of which required their own columns.
See the solution
Part one + two in one go!
WITH MUTUALLY RECURSIVE
blocks(block1 TEXT, block2 TEXT) AS (
SELECT
regexp_split_to_array(input, '\n\n')[1] block1,
regexp_split_to_array(input, '\n\n')[2] block2
FROM input
),
states(state TEXT, trans TEXT) AS (
SELECT
regexp_split_to_array(line, '\{')[1] state,
trim('}' FROM regexp_split_to_array(line, '\{')[2]) trans
FROM (SELECT regexp_split_to_table(block1, '\n') line FROM blocks)
),
steps(state TEXT, priority INT, rule TEXT) AS (
SELECT
state,
priority,
regexp_split_to_array(trans, ',')[priority]
FROM states, generate_series(1, array_length(regexp_split_to_array(trans, ','), 1)) priority
),
starts(x INT, m INT, a INT, s INT) AS (
SELECT
substring(regexp_split_to_array(trimmed, ',')[1], 3)::INT,
substring(regexp_split_to_array(trimmed, ',')[2], 3)::INT,
substring(regexp_split_to_array(trimmed, ',')[3], 3)::INT,
substring(regexp_split_to_array(trimmed, ',')[4], 3)::INT
FROM (SELECT trim('\{' FROM trim('\}' FROM regexp_split_to_table(block2, '\n'))) trimmed FROM blocks)
),
--
rules(state TEXT, priority INT, field TEXT, cmp TEXT, val INT, next TEXT) AS (
SELECT
state,
priority,
CASE WHEN substring(rule, 2, 1) = '<' OR substring(rule, 2, 1) = '>'
THEN substring(rule, 1, 1)
ELSE 'x'
END,
CASE WHEN substring(rule, 2, 1) = '<' OR substring(rule, 2, 1) = '>'
THEN substring(rule, 2, 1)
ELSE '>'
END,
CASE WHEN substring(rule, 2, 1) = '<' OR substring(rule, 2, 1) = '>'
THEN regexp_split_to_array(substring(rule, 3), ':')[1]::INT
ELSE '0'
END,
CASE WHEN substring(rule, 2, 1) = '<' OR substring(rule, 2, 1) = '>'
THEN regexp_split_to_array(substring(rule, 3), ':')[2]
ELSE rule
END
FROM steps
),
-- PART 1: iterate folks forward from `in`
movement(state TEXT, x INT, m INT, a INT, s INT) AS (
SELECT 'in' state, * FROM starts
UNION ALL
SELECT next, x, m, a, s
FROM (
SELECT DISTINCT ON (state, x, m, a, s) state, x, m, a, s, priority, next
FROM (
SELECT movement.*, rules.next, rules.priority
FROM movement, rules
WHERE movement.state = rules.state
AND CASE WHEN rules.cmp = '<'
THEN CASE WHEN rules.field = 'x' THEN x < val
WHEN rules.field = 'm' THEN m < val
WHEN rules.field = 'a' THEN a < val
WHEN rules.field = 's' THEN s < val
ELSE false
END
WHEN rules.cmp = '>'
THEN CASE WHEN rules.field = 'x' THEN x > val
WHEN rules.field = 'm' THEN m > val
WHEN rules.field = 'a' THEN a > val
WHEN rules.field = 's' THEN s > val
ELSE false
END
ELSE false
END
)
ORDER BY state, x, m, a, s, priority
)
),
part1(part1 BIGINT) AS (
SELECT SUM(x + m + a + s)
FROM movement
WHERE state = 'A'
),
-- PART 2: just find all the bounding regions and label them 'A' or 'R'.
region(state TEXT, priority INT, xl INT, xu INT, ml INT, mu INT, al INT, au INT, sl INT, su INT) AS (
SELECT 'in', 1, 1, 4000, 1, 4000, 1, 4000, 1, 4000
-- Could satisfy the rule, and transition to the next state ..
UNION ALL
SELECT
next,
1,
CASE WHEN rules.field = 'x' AND rules.cmp = '>' THEN GREATEST(val+1, xl) ELSE xl END,
CASE WHEN rules.field = 'x' AND rules.cmp = '<' THEN LEAST(val-1, xu) ELSE xu END,
CASE WHEN rules.field = 'm' AND rules.cmp = '>' THEN GREATEST(val+1, ml) ELSE ml END,
CASE WHEN rules.field = 'm' AND rules.cmp = '<' THEN LEAST(val-1, mu) ELSE mu END,
CASE WHEN rules.field = 'a' AND rules.cmp = '>' THEN GREATEST(val+1, al) ELSE al END,
CASE WHEN rules.field = 'a' AND rules.cmp = '<' THEN LEAST(val-1, au) ELSE au END,
CASE WHEN rules.field = 's' AND rules.cmp = '>' THEN GREATEST(val+1, sl) ELSE sl END,
CASE WHEN rules.field = 's' AND rules.cmp = '<' THEN LEAST(val-1, su) ELSE su END
FROM region, rules
WHERE region.state = rules.state
AND region.priority = rules.priority
-- .. or could fail the rule, and advance to the next priority.
UNION ALL
SELECT
region.state,
region.priority + 1,
CASE WHEN rules.field = 'x' AND rules.cmp = '<' THEN GREATEST(val, xl) ELSE xl END,
CASE WHEN rules.field = 'x' AND rules.cmp = '>' THEN LEAST(val, xu) ELSE xu END,
CASE WHEN rules.field = 'm' AND rules.cmp = '<' THEN GREATEST(val, ml) ELSE ml END,
CASE WHEN rules.field = 'm' AND rules.cmp = '>' THEN LEAST(val, mu) ELSE mu END,
CASE WHEN rules.field = 'a' AND rules.cmp = '<' THEN GREATEST(val, al) ELSE al END,
CASE WHEN rules.field = 'a' AND rules.cmp = '>' THEN LEAST(val, au) ELSE au END,
CASE WHEN rules.field = 's' AND rules.cmp = '<' THEN GREATEST(val, sl) ELSE sl END,
CASE WHEN rules.field = 's' AND rules.cmp = '>' THEN LEAST(val, su) ELSE su END
FROM region, rules
WHERE region.state = rules.state
AND region.priority = rules.priority
),
part2(part2 NUMERIC) AS (
SELECT SUM((1 + xu - xl)::BIGINT * (1 + mu - ml)::BIGINT * (1 + au - al)::BIGINT * (1 + su - sl)::BIGINT)
FROM region
WHERE state = 'A'
),
potato(x INT) AS (SELECT 1)
SELECT * FROM part1, part2;
Contributors
Day 19 was brought to you by: @frankmcsherry
Day twenty presents you with the simulation of an asynchronous circuit, and this is the day that almost broke me. Mechanically the SQL isn’t that complicated, but debugging the SQL was a real challenge. It got done over the course of a quite long train ride into the evening.
See the solution
Part one + two in one go!
WITH MUTUALLY RECURSIVE
lines(line TEXT) AS ( SELECT regexp_split_to_table(input, '\n') FROM input ),
links(name TEXT, link TEXT) AS (
SELECT
substring(regexp_split_to_array(line, ' ')[1], 2),
trim(',' FROM regexp_split_to_array(line, ' ')[x])
FROM
lines, generate_series(3, array_length(regexp_split_to_array(line, ' '), 1)) x
),
-- One special line has op 'b' and name 'roadcaster'.
types(op TEXT, name TEXT) AS (
SELECT
substring(regexp_split_to_array(line, ' ')[1], 1, 1),
substring(regexp_split_to_array(line, ' ')[1], 2)
FROM
lines
),
-- Part one: simulate 1000 steps of 'broadcaster' being activated with a low pulse.
-- tally up total low and high pulses, and then multiply.
-- The state carried across steps are the last-transmitted pulses of each operator.
-- This should also tell us the final state of the `%` operators.
-- We'll also need the totals of low and high pulses, so that we can add them up.
seed(press INT, counter INT) AS (
SELECT 1, 1
UNION
SELECT press, counter - 1
FROM seed
WHERE counter > 0
UNION
SELECT press + 1, 20
FROM seed
WHERE counter = 0
AND press < 4100
),
-- Emitted pulses after various button presses, in various rounds of resolution.
pulses(name TEXT, press INT, round INT, pulse TEXT) AS (
-- One thousand button presses, each followed by rounds of resolution.
SELECT 'roadcaster', press, 1, 'lo' FROM seed WHERE counter = 0
UNION ALL SELECT * FROM flip
UNION ALL SELECT * FROM conj
),
-- Counters; every 'lo' input pulse flips and emits the state.
flip(name TEXT, press INT, round INT, pulse TEXT) AS (
-- Each `signal` needs to behave as if all "prior" signals have been processed, ordered by (press, round, source).
SELECT
name,
press,
round + 1,
-- Look for the most recently emitted signal, and we'll produce the opposite of that one.
CASE WHEN (
SELECT COUNT(*)
FROM signal s1
WHERE s1.target = types.name
AND s1.pulse = 'lo'
AND ((s1.press < signal.press) OR
(s1.press = signal.press AND s1.round < signal.round) OR
(s1.press = signal.press AND s1.round = signal.round AND s1.source < signal.source))
) % 2 = 0
THEN 'hi'
ELSE 'lo'
END
FROM signal, types
WHERE signal.target = types.name
AND types.op = '%'
AND signal.pulse = 'lo'
),
-- NAND gates; every input pulse evokes the NAND of most recent inputs.
conj(name TEXT, press INT, round INT, pulse TEXT) AS (
SELECT
name,
press,
round + 1,
-- Look for the most recently received signals from each input,
-- including this one, and iff all 'hi' then 'lo'.
CASE WHEN (
(SELECT COUNT(*) FROM links WHERE link = types.name)
=
(SELECT COUNT(*) FROM (
SELECT DISTINCT ON (source) source, pulse
FROM signal s1
WHERE s1.target = types.name
AND ((s1.press < signal.press) OR
(s1.press = signal.press AND s1.round < signal.round) OR
(s1.press = signal.press AND s1.round = signal.round AND s1.source <= signal.source))
OPTIONS (DISTINCT ON INPUT GROUP SIZE = 1000)
ORDER BY source, press DESC, round DESC
)
WHERE pulse = 'hi'))
THEN 'lo'
ELSE 'hi'
END
FROM signal, types
WHERE signal.target = types.name
AND types.op = '&'
),
-- A record of a pulse into an operator, from another operator.
-- We track the source so that '&' operators can make any sense.
signal(source TEXT, target TEXT, press INT, round INT, pulse TEXT) AS (
SELECT pulses.name, links.link, pulses.press, pulses.round, pulses.pulse
FROM pulses, links
WHERE pulses.name = links.name
AND pulses.round > 0
),
part1(pulse TEXT, count BIGINT) AS (
SELECT pulse, count(*) FROM signal GROUP BY pulse
),
potato(x INT) AS (SELECT 1)
SELECT * FROM signal WHERE target = 'cn' AND pulse = 'hi';
Contributors
Day 20 was brought to you by: @frankmcsherry
Day twenty-one was another example of some (recursive) SQL for grid exploration, followed by some mathematics. In this case the grid exploration was standard, determining reachable locations on the grid, and then the math was quadratic extrapolation from a sequence of measurements (to something too large to actually evaluate, an answer of 621,289,922,886,149 reachable states).
See the solution
Part one + two in one go!
WITH MUTUALLY RECURSIVE
lines(r INT, line TEXT) AS (
SELECT r, regexp_split_to_array(input, '\n')[r] as block
FROM input, generate_series(1, array_length(regexp_split_to_array(input, '\n'), 1)) r
),
cells(r INT, c INT, symbol TEXT) AS (
SELECT r, c, substring(line, c, 1)
FROM lines, generate_series(1, length(line)) c
),
steps(r INT, c INT) AS (
SELECT r, c FROM cells WHERE symbol = 'S'
EXCEPT ALL
SELECT * FROM s_delay
UNION
SELECT cells.r, cells.c
FROM cells, (
SELECT r + 1, c FROM steps
UNION SELECT r - 1, c FROM steps
UNION SELECT r, c + 1 FROM steps
UNION SELECT r, c - 1 FROM steps
) as potato(r,c)
WHERE cells.r = potato.r
AND cells.c = potato.c
AND cells.symbol != '#'
),
s_delay(r INT, c INT) AS (
SELECT r, c FROM cells WHERE symbol = 'S'
),
part1(part1 BIGINT) AS (
SELECT COUNT(*) FROM (SELECT DISTINCT * FROM steps)
),
-- PART 2 wants a much larger step count on an infinite repeating grid.
-- We know it will be quadratic based on the clear paths if nothing else.
-- Map out enough points to reverse out polynomial coefficients.
-- For me they were `ax^2 + bx + c` with a = 60724, b = 30602, c = 3849.
dists(r INT, c INT, d INT) AS (
SELECT r, c, MIN(d)
FROM (
SELECT r, c, 0 d
FROM cells
WHERE symbol = 'S'
UNION ALL
SELECT potato.r, potato.c, d + 1
FROM cells, (
SELECT r + 1, c, d FROM dists
UNION SELECT r - 1, c, d FROM dists
UNION SELECT r, c + 1, d FROM dists
UNION SELECT r, c - 1, d FROM dists
) as potato(r,c,d)
WHERE cells.r = 1 + (((potato.r - 1) % 131) + 131) % 131
AND cells.c = 1 + (((potato.c - 1) % 131) + 131) % 131
AND cells.symbol != '#'
AND potato.d < 1000
)
GROUP BY r, c
),
part2(x0 BIGINT, x2 BIGINT, x4 BIGINT, x6 BIGINT) AS (
SELECT
(SELECT COUNT(*) FROM dists WHERE d <= 0 * 131 + 65 AND d % 2 = 1),
(SELECT COUNT(*) FROM dists WHERE d <= 2 * 131 + 65 AND d % 2 = 1),
(SELECT COUNT(*) FROM dists WHERE d <= 4 * 131 + 65 AND d % 2 = 1),
(SELECT COUNT(*) FROM dists WHERE d <= 6 * 131 + 65 AND d % 2 = 1)
),
potato (x INT) AS ( SELECT 1 )
SELECT 'idk';
Contributors
Day 21 was brought to you by: @frankmcsherry
Week four
The last week was shorter, but also culminated in some pretty exciting problems and techniques.
Day twenty-two had shapes made of cubes falling into a well, and coming to rest on others (or the ground). There were then questions about how many pieces are load bearing, and also for each load bearing piece how many others would fall if they were removed. Dropping the pieces used recursive SQL, determining the load bearing pieces did not, but then scoring the load bearing pieces again required recursion.
See the solution
Part one + two in one go!
WITH MUTUALLY RECURSIVE
lines(r INT, line TEXT) AS (
SELECT r, regexp_split_to_array(input, '\n')[r] as line
FROM input, generate_series(1, array_length(regexp_split_to_array(input, '\n'), 1)) r
),
cells(r INT, x INT, y INT, z INT) AS (
SELECT xs.r, x, y, z
FROM (SELECT r, generate_series(regexp_split_to_array(regexp_split_to_array(line, '~')[1], ',')[1]::INT,
regexp_split_to_array(regexp_split_to_array(line, '~')[2], ',')[1]::INT) x FROM lines) xs,
(SELECT r, generate_series(regexp_split_to_array(regexp_split_to_array(line, '~')[1], ',')[2]::INT,
regexp_split_to_array(regexp_split_to_array(line, '~')[2], ',')[2]::INT) y FROM lines) ys,
(SELECT r, generate_series(regexp_split_to_array(regexp_split_to_array(line, '~')[1], ',')[3]::INT,
regexp_split_to_array(regexp_split_to_array(line, '~')[2], ',')[3]::INT) z FROM lines) zs
WHERE xs.r = ys.r
AND xs.r = zs.r
),
-- Part one: let the pieces fall, with a minimum z value of one.
parts(r INT, x INT, y INT, z INT) AS (
SELECT * FROM cells
EXCEPT ALL SELECT * FROM cells_delayed
UNION ALL
SELECT r, x, y, CASE WHEN r IN (SELECT * FROM supported) THEN z ELSE z - 1 END
FROM parts
),
-- One piece supports a *different* piece if it is directly below a piece of the other.
supports(r1 INT, r2 INT) AS (
SELECT DISTINCT p1.r, p2.r
FROM parts p1, parts p2
WHERE p1.x = p2.x
AND p1.y = p2.y
AND p1.z + 1 = p2.z
AND p1.r != p2.r
),
supported(r INT) AS (
SELECT r FROM parts WHERE z = 1
UNION
SELECT r2 FROM supports
),
-- A piece is safe to remove if it is does not uniquely support any other piece.
part1(part1 BIGINT) AS (
SELECT COUNT(DISTINCT r)
FROM lines
WHERE r NOT IN (
SELECT r1
FROM supports
WHERE r2 IN (
SELECT r2
FROM supports
GROUP BY r2
HAVING COUNT(*) = 1
)
)
),
cells_delayed(r INT, x INT, y INT, z INT) AS ( SELECT * FROM cells ),
-- Part two: for each piece, how many pieces would fall if you removed it?
-- Extend `supports` to transitive support: if r1 vanished would r2 fall?
supports_trans(r1 INT, r2 INT) AS (
-- Uniquely supported pieces would certainly fall.
SELECT *
FROM supports
WHERE r2 IN (SELECT r2 FROM supports GROUP BY r2 HAVING COUNT(*) = 1)
-- Any piece all of whose supports would fall without 'a' also falls without it.
UNION
SELECT st.r1, s1.r2
FROM supports_trans st, supports s1
WHERE st.r2 = s1.r1
GROUP BY st.r1, s1.r2
HAVING COUNT(*) = (SELECT COUNT(*) FROM supports WHERE supports.r2 = s1.r2)
),
part2(part2 BIGINT) AS (SELECT COUNT(*) FROM supports_trans)
SELECT * FROM part1, part2;
Contributors
Day 22 was brought to you by: @frankmcsherry
Day twenty-three is a classic example of finding the “longest path” in a directed graph. This is a relatively easy problem when the input is acyclic (part one), and it is NP-hard when the input may have cycles (part two). Part one was a mostly vanilla recursive SQL query, and part two encoded the 32 prior state options in a large integer and just did a lot of work.
See the solution
Part one + two in one go!
WITH MUTUALLY RECURSIVE
lines(r INT, line TEXT) AS (
SELECT r, regexp_split_to_array(input, '\n')[r] as line
FROM input, generate_series(1, array_length(regexp_split_to_array(input, '\n'), 1)) r
),
cells(r INT, c INT, symbol TEXT) AS (
SELECT r, c, substring(line, c, 1)
FROM lines, generate_series(1, length(line)) c
),
-- Part one: longest path (on probably a DAG)
paths(r INT, c INT) AS (
SELECT r, c FROM cells WHERE symbol = '.'
),
steps(r1 INT, c1 INT, r2 INT, c2 INT) AS (
SELECT r, c, r + 1, c FROM paths WHERE (r + 1, c) IN (SELECT * FROM PATHS) UNION
SELECT r, c, r - 1, c FROM paths WHERE (r - 1, c) IN (SELECT * FROM PATHS) UNION
SELECT r, c, r, c + 1 FROM paths WHERE (r, c + 1) IN (SELECT * FROM PATHS) UNION
SELECT r, c, r, c - 1 FROM paths WHERE (r, c - 1) IN (SELECT * FROM PATHS)
),
-- A directional trip, forced by a slope and the no-revisting rule.
force(r1 INT, c1 INT, r2 INT, c2 INT) AS (
SELECT r-1, c, r+1, c FROM cells WHERE symbol = 'v' UNION ALL
SELECT r+1, c, r-1, c FROM cells WHERE symbol = '^' UNION ALL
SELECT r, c-1, r, c+1 FROM cells WHERE symbol = '>' UNION ALL
SELECT r, c+1, r, c-1 FROM cells WHERE symbol = '<'
),
dists(r INT, c INT, d INT) AS (
SELECT 1, 2, 0
UNION
SELECT steps.r2, steps.c2, 1 + MIN(d)
FROM dists, steps
WHERE dists.r = steps.r1
AND dists.c = steps.c1
GROUP BY steps.r2, steps.c2
UNION
SELECT force.r2, force.c2, 2 + MAX(d)
FROM dists, force
WHERE dists.r = force.r1
AND dists.c = force.c1
GROUP BY force.r2, force.c2
),
-- Part two: longest path on definitely not a DAG.
-- There are 32 optional nodes (not including first and last nodes)
-- Clearly meant to pack in to an int and avoid duplication.
paths2(r INT, c INT) AS (
SELECT r, c FROM cells WHERE symbol != '#'
),
steps2(r1 INT, c1 INT, r2 INT, c2 INT) AS (
SELECT r, c, r + 1, c FROM paths2 WHERE (r + 1, c) IN (SELECT * FROM paths2) UNION
SELECT r, c, r - 1, c FROM paths2 WHERE (r - 1, c) IN (SELECT * FROM paths2) UNION
SELECT r, c, r, c + 1 FROM paths2 WHERE (r, c + 1) IN (SELECT * FROM paths2) UNION
SELECT r, c, r, c - 1 FROM paths2 WHERE (r, c - 1) IN (SELECT * FROM paths2)
),
-- Locations where a choice exists (or start/end).
nodes(r INT, c INT) AS (
SELECT r1, c1 FROM steps2 GROUP BY r1, c1 HAVING COUNT(*) != 2
),
-- Determine node-to-node path lengths. Do not cross nodes.
trail(r1 INT, c1 INT, d INT, r2 INT, c2 INT) AS (
SELECT r1, c1, MIN(d), r2, c2
FROM (
SELECT r1, c1, 1 d, r2, c2 FROM steps2 WHERE (r1, c1) IN (SELECT * FROM nodes)
UNION ALL
SELECT trail.r1, trail.c1, d + 1, steps2.r2, steps2.c2
FROM trail, steps2
WHERE trail.r2 = steps2.r1
AND trail.c2 = steps2.c1
AND (trail.r1 != steps2.r2 OR trail.c1 != steps2.c2)
AND (steps2.r1, steps2.c1) NOT IN (SELECT * FROM nodes)
)
GROUP BY r1, c1, r2, c2
),
links(r1 INT, c1 INT, d INT, r2 INT, c2 INT) AS (
SELECT * FROM trail WHERE (r2, c2) IN (SELECT * FROM nodes)
),
-- These rows in links show that (12, 20) and (130, 126) are mandatory,
-- and are the first moments we have a choice. The remainaing 32 nodes
-- can each get a number, and be used in a bit pattern somewhere.
--
-- 1 | 2 | 105 | 12 | 20
-- 141 | 140 | 121 | 130 | 126
-- Re-key nodes to dense integers.
internal(r INT, c INT, id INT) AS (
SELECT r, c, (
SELECT COUNT(*)
FROM nodes n1
WHERE (n1.r < n2.r OR (n1.r = n2.r AND n1.c < n2.c))
AND (n1.r, n1.c) NOT IN (VALUES (1,2), (12,20), (130,126), (141,140))
)
FROM nodes n2
WHERE (r, c) NOT IN (VALUES (1,2), (12,20), (130,126), (141,140))
),
longest(r INT, c INT, d INT, v BIGINT) AS (
SELECT r, c, MAX(d), v
FROM (
SELECT 12 r, 20 c, 0 d, 0 v
UNION ALL
SELECT r2, c2, longest.d + links.d, v + (1::BIGINT << internal.id)
FROM longest, links, internal
WHERE longest.r = links.r1
AND longest.c = links.c1
AND links.r2 = internal.r
AND links.c2 = internal.c
AND ((v >> internal.id) % 2) != 1
)
GROUP BY r, c, v
),
potato(x INT) AS ( SELECT 1 )
SELECT * FROM longest ORDER BY d DESC;
Contributors
Day 23 was brought to you by: @frankmcsherry
Day twenty-four had most folks reach for a numerical solver, something like Mathematica or z3. That is less easy in SQL, and I needed to learn some math instead (specifically how to find the intersection of two line segments). Although part two seemed quite complex, it ended up being relatively easy when you realize a few simplifications (an added dimension that can be ignored until the end, allowing you to re-use part one).
See the solution
Part one + two in one go!
WITH MUTUALLY RECURSIVE
lines(r INT, line TEXT) AS (
SELECT r, regexp_split_to_array(input, '\n')[r] as line
FROM input, generate_series(1, array_length(regexp_split_to_array(input, '\n'), 1)) r
),
observation(r INT, x NUMERIC, y NUMERIC, z NUMERIC, dx NUMERIC, dy NUMERIC, dz NUMERIC) AS (
SELECT
r,
trim(',' FROM regexp_split_to_array(line, ' ')[1])::NUMERIC,
trim(',' FROM regexp_split_to_array(line, ' ')[2])::NUMERIC,
trim(',' FROM regexp_split_to_array(line, ' ')[3])::NUMERIC,
trim(',' FROM regexp_split_to_array(line, ' ')[5])::NUMERIC,
trim(',' FROM regexp_split_to_array(line, ' ')[6])::NUMERIC,
trim(',' FROM regexp_split_to_array(line, ' ')[7])::NUMERIC
FROM
lines
),
-- Part one: for each pair, solve for a future (x,y) intersection of their traced paths.
-- https://en.wikipedia.org/wiki/Line–line_intersection#Given_two_points_on_each_line_segment
meeting(r1 INT, r2 INT, x NUMERIC, y NUMERIC, t1 NUMERIC, t2 NUMERIC) AS (
SELECT
o1.r,
o2.r,
o1.x + o1.dx * (((o2.x - o1.x) * o2.dy) - ((o2.y - o1.y) * o2.dx)) / (o1.dx * o2.dy - o1.dy * o2.dx),
o1.y + o1.dy * (((o2.x - o1.x) * o2.dy) - ((o2.y - o1.y) * o2.dx)) / (o1.dx * o2.dy - o1.dy * o2.dx),
(((o2.x - o1.x) * o2.dy) - ((o2.y - o1.y) * o2.dx)) / (o1.dx * o2.dy - o1.dy * o2.dx),
(((o2.x - o1.x) * o1.dy) - ((o2.y - o1.y) * o1.dx)) / (o1.dx * o2.dy - o1.dy * o2.dx)
FROM observation o1, observation o2
WHERE o1.dx * o2.dy != o1.dy * o2.dx
AND o1.r < o2.r
),
part1(part1 BIGINT) AS (
SELECT COUNT(*)
FROM meeting
WHERE t1 >= 0
AND t2 >= 0
AND x BETWEEN 200000000000000 AND 400000000000000
AND y BETWEEN 200000000000000 AND 400000000000000
),
-- Part two: find an initial x, y, z, dx, dy, dz such that you intersect every observation in the future.
-- Hypothesize dx and dy, subtract them, and assses the number of coincidences.
hypotheses(r INT, x NUMERIC, y NUMERIC, dx NUMERIC, dy NUMERIC, ox NUMERIC, oy NUMERIC) AS (
SELECT
r, x, y, dx - ox, dy - oy, ox, oy
FROM
observation,
generate_series(-500, 500) ox,
generate_series(-500, 500) oy
WHERE r < 10
AND 5 * (ox + 21) = 16 * (oy + 39) -- derived from input pair with same (dx, dy).
),
coincidence(r1 INT, r2 INT, x NUMERIC, y NUMERIC, ox NUMERIC, oy NUMERIC) AS (
SELECT
o1.r,
o2.r,
o1.x + o1.dx * (((o2.x - o1.x) * o2.dy) - ((o2.y - o1.y) * o2.dx)) / (o1.dx * o2.dy - o1.dy * o2.dx),
o1.y + o1.dy * (((o2.x - o1.x) * o2.dy) - ((o2.y - o1.y) * o2.dx)) / (o1.dx * o2.dy - o1.dy * o2.dx),
o1.ox,
o1.oy
FROM hypotheses o1, hypotheses o2
WHERE o1.dx * o2.dy != o1.dy * o2.dx
AND o1.r < o2.r
AND o1.ox = o2.ox
AND o1.oy = o2.oy
),
hypotheses_xz(r INT, x NUMERIC, y NUMERIC, dx NUMERIC, dy NUMERIC, ox NUMERIC, oy NUMERIC) AS (
SELECT
r, x, z, dx - ox, dz - oz, ox, oz
FROM
observation,
generate_series(-117, -117) ox,
generate_series(-500, 500) oz
WHERE r < 10
),
coincidence_xz(r1 INT, r2 INT, x NUMERIC, y NUMERIC, ox NUMERIC, oy NUMERIC) AS (
SELECT
o1.r,
o2.r,
o1.x + o1.dx * (((o2.x - o1.x) * o2.dy) - ((o2.y - o1.y) * o2.dx)) / (o1.dx * o2.dy - o1.dy * o2.dx),
o1.y + o1.dy * (((o2.x - o1.x) * o2.dy) - ((o2.y - o1.y) * o2.dx)) / (o1.dx * o2.dy - o1.dy * o2.dx),
o1.ox,
o1.oy
FROM hypotheses_xz o1, hypotheses_xz o2
WHERE o1.dx * o2.dy != o1.dy * o2.dx
AND o1.r < o2.r
AND o1.ox = o2.ox
AND o1.oy = o2.oy
),
potato (x INT) AS ( SELECT 1 )
-- SELECT x, y, ox, oy, COUNT(*) FROM coincidence GROUP BY x, y, ox, oy HAVING COUNT(*) > 1;
SELECT x, y, ox, oy, COUNT(*) FROM coincidence_xz GROUP BY x, y, ox, oy HAVING COUNT(*) > 1;
Contributors
Day 24 was brought to you by: @frankmcsherry
Day twenty-five asked for a minimum graph cut (of three edges). This is a standard optimization problem, but rather than try to implement the Stoer-Wagner algorithm I went with something from my PhD thesis: partitioning the graph based on the Fiedler vector. It turns out this gave the right answer on the first try, and the holidays were saved!
See the solution
Part one
WITH MUTUALLY RECURSIVE (RETURN AT RECURSION LIMIT 50)
lines(r INT, line TEXT) AS (
SELECT r, regexp_split_to_array(input, '\n')[r] as line
FROM input, generate_series(1, array_length(regexp_split_to_array(input, '\n'), 1)) r
),
edges(src TEXT, dst TEXT) AS (
SELECT
trim(':' FROM regexp_split_to_array(line, ' ')[1]),
trim(',' FROM regexp_split_to_array(line, ' ')[x])
FROM
lines, generate_series(2, array_length(regexp_split_to_array(line, ' '), 1)) x
),
symm(src TEXT, dst TEXT) AS (
SELECT src, dst FROM edges
UNION ALL
SELECT dst, src FROM edges
),
init(src TEXT, val NUMERIC) AS (
SELECT src, CASE WHEN src < 'n' THEN 1.0 ELSE -1.0 END
FROM (SELECT src FROM edges UNION ALL SELECT dst FROM edges)
),
-- determine the second eigenvector of the adjacency matrix
weight(src TEXT, val NUMERIC) AS (
SELECT * FROM init
EXCEPT ALL
SELECT * FROM init_delayed
UNION ALL
SELECT symm.src, SUM((val - (SELECT AVG(val) FROM weight))/(SELECT STDDEV(val) FROM weight))
FROM symm, weight
WHERE symm.dst = weight.src
GROUP BY symm.src
),
init_delayed(src TEXT, val NUMERIC) AS ( SELECT * FROM init ),
part1(part1 BIGINT) AS (
SELECT
(SELECT COUNT(*) FROM weight WHERE val < 0.0) *
(SELECT COUNT(*) FROM weight WHERE val > 0.0)
),
potato(x INT) AS ( SELECT 1 )
SELECT * FROM part1;
Contributors
Day 25 was brought to you by: @frankmcsherry
Conclusions
The exercise was certainly helpful and informative, on multiple levels.
First, it really reinforced for me that WITH MUTUALLY RECURSIVE
is a very valuable tool to have access to when faced with a new problem.
Often your problem is a bunch of joins and reductions, but when it isn’t you are immediately in a bit of a pickle.
In most cases, algorithmic challenges immediately gave way to recursive SQL.
That being said, there’s clearly an accessibility gap when reaching for recursive SQL. I find the idioms approachable, but I’ve spent a while working with data-parallel algorithms, and have seen several of the tricks. There’s still plenty of work to do before the casual SQL author feels comfortable with recursive SQL.
Second, the majority of my time was spent debugging rather than authoring. This is a classic challenge with declaritive languages, who go from input program to output data in often inscrutable ways. I borrowed some techniques from debugging Datalog, but ideally the system itself would help me with this (and several research systems do provide integrated lineage).
Debugging the logic of SQL queries only gets harder when the data are changing underneath you.
Techniques like spot checking data become infeasible when the data changes faster than you can observe records that are meant to line up.
Materialize should help in these cases, with maintained diagnostic views that represent assertions, or better violations thereof, whose contents spell out records that at some moment violated something that was meant to be true.
Materialize’s SUBSCRIBE
provides a full account of these views, reporting records that existed even for a moment, where anything other than “always empty” represents an error in your SQL (or your assertions).
Third, using Materialize in new and weird ways shook out several bugs.
We’ve already fixed them.
Dogfooding your own product, especially in surprising contexts, is a great way to forcibly increase your test coverage.
Issues ranged from the silly (“why would you name a table count
?”) to the abstruse (doubly nested recursive SQL blocks), but they spilled out in the early days and became less frequent as the weeks went on.
Finally, the main conclusion was that it was all possible. Despite substantial anxiety about whether and when we would need to bail out, defeated, the whole project did work out. We were able to express a rich variety of computational tasks as data-driven SQL both expressed and maintained by Materialize.