R setup
library(tidyverse)
library(lubridate)
library(gt)
My solutions to the #AdventOfCode2021 coding challenges, days 1 through 5.
December 1, 2021
The Advent of Code has begun for 2021, and I decided to participate this year to work on my programming and problem-solving skills in R. When I have the time, I’ll also try to translate the solutions to Python. Load the reticulate package and activate my virtual Python environment:
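A minimal sketch of that setup; the environment name matches the reticulate paths in the session info at the end of this post, but the exact calls are my assumption:

library(reticulate)
use_virtualenv("r-reticulate")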
I’ll also be “competing” in a private leaderboard started by Tan Ho. I don’t expect to rank highly here because the puzzles are released at 1AM my time (and scores are based on time from release) but it’ll be a good source of motivation throughout the month. There are 25 days of challenges, so my current plan is to split up the posts into 5-day chunks.
The first order of business is to figure out how quickly the depth increases, just so you know what you’re dealing with - you never know if the keys will get carried into deeper water by an ocean current or a fish or something. To do this, count the number of times a depth measurement increases from the previous measurement. (There is no measurement before the first measurement.)
Import the measurements:
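A sketch of the import step, assuming one integer measurement per line in a file day1.txt (the path is my assumption):

day1 <- as.integer(readLines("day1.txt"))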
The tidyverse solution to this problem is to use the dplyr::lag()/lead() functions to refer to previous/next values. For example, for a vector of values 1-10 (in random order), I can show the cases where the value has increased like this:
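A sketch of that demonstration; without the original seed, sample() will return a different order than the output shown:

d <- sample(1:10)
bind_cols(
  value = d,
  increased = lag(d) < d
)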
# A tibble: 10 × 2
value increased
<int> <lgl>
1 10 NA
2 9 FALSE
3 8 FALSE
4 2 FALSE
5 3 TRUE
6 7 TRUE
7 1 FALSE
8 5 TRUE
9 6 TRUE
10 4 FALSE
Excluding the NA value, which occurs because the first measurement has nothing to compare to, sum() up the cases of larger measurements:
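That is, with na.rm = TRUE dropping the leading NA:

sum(lag(day1) < day1, na.rm = TRUE)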
For the Python solution, I will use the numpy.diff function to calculate the difference between consecutive values:
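A sketch, using the day1 vector exposed to Python as r.day1 by reticulate:

import numpy as np
np.diff(r.day1) > 0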
array([False, True, True, ..., True, True, True])
Then chain the .sum() function to add up the True values:
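Something like:

(np.diff(r.day1) > 0).sum()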
Note that this method is also possible in base R, and is a bit simpler than the tidyverse solution:
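A base R sketch with diff():

sum(diff(day1) > 0)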
Your goal now is to count the number of times the sum of measurements in this sliding window increases from the previous sum. So, compare A with B, then compare B with C, then C with D, and so on. Stop when there aren’t enough measurements left to create a new three-measurement sum.
Here, I will use both lag and lead to compute the sum of the window:
d_sum3 <- lag(d) + d + lead(d)
bind_cols(
value = d,
sum3 = d_sum3,
increased = lag(d_sum3) < d_sum3
)
# A tibble: 10 × 3
value sum3 increased
<int> <int> <lgl>
1 10 NA NA
2 9 27 NA
3 8 19 FALSE
4 2 13 FALSE
5 3 12 FALSE
6 7 11 FALSE
7 1 13 TRUE
8 5 12 FALSE
9 6 15 TRUE
10 4 NA NA
Now sum the number of increases in the day 1 data:
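Presumably by reusing the same pattern on the full data:

day1_sum3 <- lag(day1) + day1 + lead(day1)
sum(lag(day1_sum3) < day1_sum3, na.rm = TRUE)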
In Python, the np.convolve function allows computation in sliding windows:
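A sketch of the first attempt, with the default mode:

day1_sum3 = np.convolve(r.day1, np.ones(3, dtype = int))
day1_sum3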
Above, we provided the np.ones(3, dtype = int) array, which is simply [1, 1, 1] and works as the convolution operator that slides along the r.day1 array. Note, however, that the first two elements are not correct because the default mode returns the boundaries (windows with fewer than 3 values). Fix this with the mode argument:
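Presumably mode = 'valid', which keeps only the complete three-measurement windows; counting the increases then answers the puzzle:

day1_sum3 = np.convolve(r.day1, np.ones(3, dtype = int), mode = 'valid')
(np.diff(day1_sum3) > 0).sum()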
Your horizontal position and depth both start at 0. The steps above would then modify them as follows:
- forward 5 adds 5 to your horizontal position, a total of 5.
- down 5 adds 5 to your depth, resulting in a value of 5.
- forward 8 adds 8 to your horizontal position, a total of 13.
- up 3 decreases your depth by 3, resulting in a value of 2.
- down 8 adds 8 to your depth, resulting in a value of 10.
- forward 2 adds 2 to your horizontal position, a total of 15.
Calculate the horizontal position and depth you would have after following the planned course. What do you get if you multiply your final horizontal position by your final depth?
Import the steps:
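Again a sketch of the import, assuming one instruction per line in day2.txt:

day2 <- readLines("day2.txt")
head(day2)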
[1] "forward 7" "down 2" "forward 7" "down 6" "forward 1" "forward 7"
Put it in a tibble, and tidyr::separate the instruction and the amount:
d_day2 <- tibble(step = day2) %>%
separate(step, into = c("instruction", "amount"), sep = " ", convert = TRUE)
head(d_day2)
# A tibble: 6 × 2
instruction amount
<chr> <int>
1 forward 7
2 down 2
3 forward 7
4 down 6
5 forward 1
6 forward 7
Then summarize the horizontal position and depth, and multiply the result:
d_day2 %>%
summarise(
horizontal_position = sum(amount[instruction == "forward"]),
# Depth is inverse, so down - up
depth = sum(amount[instruction == "down"]) -
sum(amount[instruction == "up"]),
.groups = "drop"
) %>%
mutate(product = horizontal_position * depth)
# A tibble: 1 × 3
horizontal_position depth product
<int> <int> <int>
1 1990 1000 1990000
For the Python solution, I’ll use pandas:
import pandas as pd
day2_df = pd.DataFrame(r.day2, dtype = str, columns = ['step']) \
.step.str.split(' ', expand = True) \
.rename(columns = {0: 'instruction', 1: 'amount'}) \
.astype({'amount': 'int32'})
day2_df
instruction amount
0 forward 7
1 down 2
2 forward 7
3 down 6
4 forward 1
.. ... ...
995 forward 9
996 down 3
997 down 7
998 down 5
999 forward 7
[1000 rows x 2 columns]
Then it is easy enough to sum up the different columns:
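For the horizontal position (mirroring the depth calculation below):

day2_df[day2_df.instruction == 'forward'].amount.sum()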
1990
day2_df[day2_df.instruction == 'down'].amount.sum() - \
day2_df[day2_df.instruction == 'up'].amount.sum()
1000
Here is another way with the groupby and aggregate functions:
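A sketch; grouping by instruction and summing amount gives the per-instruction totals in one step:

day2_df.groupby('instruction').aggregate({'amount': 'sum'})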
In addition to horizontal position and depth, you’ll also need to track a third value, aim, which also starts at 0. The commands also mean something entirely different than you first thought:
- down X increases your aim by X units.
- up X decreases your aim by X units.
- forward X does two things:
- It increases your horizontal position by X units.
- It increases your depth by your aim multiplied by X.
Using this new interpretation of the commands, calculate the horizontal position and depth you would have after following the planned course. What do you get if you multiply your final horizontal position by your final depth?
First, I’ll use cumsum() to add a running total of the aim variable from the “down” and “up” instructions:
d_day2 <- d_day2 %>%
mutate(
# Have to use a placeholder variable so it has the same length as the
# "aim" variable below
aim_placeholder = case_when(
instruction == "down" ~ amount,
instruction == "up" ~ -amount,
TRUE ~ 0L
),
aim = cumsum(aim_placeholder)
) %>%
select(-aim_placeholder)
head(d_day2, 9)
# A tibble: 9 × 3
instruction amount aim
<chr> <int> <int>
1 forward 7 0
2 down 2 2
3 forward 7 2
4 down 6 8
5 forward 1 8
6 forward 7 8
7 down 3 11
8 up 5 6
9 forward 7 6
Now with the running total of aim, I can compute horizontal position and depth:
d_day2 %>%
summarise(
horizontal_position = sum(amount[instruction == "forward"]),
depth = sum(
# Depth is aim multiplied by forward amount
aim[instruction == "forward"] * amount[instruction == "forward"]
),
.groups = "drop"
) %>%
mutate(product = horizontal_position * depth)
# A tibble: 1 × 3
horizontal_position depth product
<int> <int> <int>
1 1990 992674 1975421260
In Python, I will assign a new aim column, and use the np.select() function to conditionally sum the values:
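A sketch of that step; np.select() picks amount, -amount, or 0 depending on the instruction, and cumsum() turns it into the running aim:

day2_df = day2_df.assign(
    aim = np.select(
        [day2_df.instruction == 'down', day2_df.instruction == 'up'],
        [day2_df.amount, -day2_df.amount],
        default = 0
    ).cumsum()
)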
The aggregate function can only operate on single columns, so I need to make a new depth column first by multiplying aim with amount (for instruction = 'forward'):
day2_df = day2_df \
.assign(
depth = np.where(
day2_df.instruction == 'forward', day2_df.aim * day2_df.amount, 0
),
horizontal_position = np.where(
day2_df.instruction == 'forward', day2_df.amount, 0
)
)
day2_df
instruction amount aim depth horizontal_position
0 forward 7 0 0 7
1 down 2 2 0 0
2 forward 7 2 14 7
3 down 6 8 0 0
4 forward 1 8 8 1
.. ... ... ... ... ...
995 forward 9 985 8865 9
996 down 3 988 0 0
997 down 7 995 0 0
998 down 5 1000 0 0
999 forward 7 1000 7000 7
[1000 rows x 5 columns]
I’ve also added the horizontal_position variable, so that I can compute the sums with a simple aggregate:
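Something like the following; sums is my placeholder name for the aggregated totals:

sums = day2_df[['horizontal_position', 'depth']].aggregate('sum')
sums.horizontal_position * sums.depth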
The diagnostic report (your puzzle input) consists of a list of binary numbers which, when decoded properly, can tell you many useful things about the conditions of the submarine. The first parameter to check is the power consumption.
You need to use the binary numbers in the diagnostic report to generate two new binary numbers (called the gamma rate and the epsilon rate). The power consumption can then be found by multiplying the gamma rate by the epsilon rate.
Each bit in the gamma rate can be determined by finding the most common bit in the corresponding position of all numbers in the diagnostic report.
The epsilon rate is calculated in a similar way; rather than use the most common bit, the least common bit from each position is used.
Use the binary numbers in your diagnostic report to calculate the gamma rate and epsilon rate, then multiply them together. What is the power consumption of the submarine?
Import the binary numbers:
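A sketch, assuming one binary string per line in day3.txt:

day3 <- readLines("day3.txt")
head(day3)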
[1] "001000010101" "010010111110" "001010110111" "001001011101" "001001010011"
[6] "001111100111"
Each bit needs to be considered separately, so use strsplit like this:
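For example, on the first two numbers:

strsplit(day3[1:2], split = "")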
[[1]]
[1] "0" "0" "1" "0" "0" "0" "0" "1" "0" "1" "0" "1"
[[2]]
[1] "0" "1" "0" "0" "1" "0" "1" "1" "1" "1" "1" "0"
Split every binary number and put it into a tibble of integers:
day3_split <- strsplit(day3, split = "")
day3_df <- matrix(unlist(day3_split), ncol = 12, byrow = TRUE) %>%
as_tibble(.name_repair = "unique") %>%
mutate(across(everything(), as.integer))
head(day3_df)
# A tibble: 6 × 12
...1 ...2 ...3 ...4 ...5 ...6 ...7 ...8 ...9 ...10 ...11 ...12
<int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
1 0 0 1 0 0 0 0 1 0 1 0 1
2 0 1 0 0 1 0 1 1 1 1 1 0
3 0 0 1 0 1 0 1 1 0 1 1 1
4 0 0 1 0 0 1 0 1 1 1 0 1
5 0 0 1 0 0 1 0 1 0 0 1 1
6 0 0 1 1 1 1 1 0 0 1 1 1
Before computing the solution, are there any bits with an equal number of 0s and 1s?
day3_df %>%
summarise(across(everything(), mean)) %>%
pivot_longer(everything(), names_to = "bit", values_to = "prop")
# A tibble: 12 × 2
bit prop
<chr> <dbl>
1 ...1 0.489
2 ...2 0.509
3 ...3 0.487
4 ...4 0.504
5 ...5 0.498
6 ...6 0.498
7 ...7 0.493
8 ...8 0.52
9 ...9 0.493
10 ...10 0.534
11 ...11 0.486
12 ...12 0.493
No, none of the bits have prop = 0.500. Compute the gamma and epsilon rates:
# There isn't a function available in base R to compute the mode of a vector,
# so define one here that takes the most frequent by default (freq_rank = 1)
vector_mode <- function(x, freq_rank = 1) {
# Frequency table
table(x) %>%
# Sort it by count
sort(decreasing = TRUE) %>%
# Get the labels of the counts
names() %>%
pluck(freq_rank)
}
day3_rates <- day3_df %>%
pivot_longer(everything(), names_to = "bit", values_to = "value") %>%
mutate(bit = as.integer(str_remove(bit, "..."))) %>%
group_by(bit) %>%
summarise(
gamma = vector_mode(value, freq_rank = 1),
epsilon = vector_mode(value, freq_rank = 2),
.groups = "drop"
) %>%
summarise(
# Collapse the most/least frequent values into a single string
across(c(gamma, epsilon), str_c, collapse = "")
)
day3_rates
# A tibble: 1 × 2
gamma epsilon
<chr> <chr>
1 010100010100 101011101011
We now have the binary representations, which we convert using strtoi:
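Presumably mirroring the conversion used for part 2 below:

day3_rates %>%
  mutate(across(c(gamma, epsilon), strtoi, base = 2),
         prod = gamma * epsilon)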
# A tibble: 1 × 3
gamma epsilon prod
<int> <int> <int>
1 1300 2795 3633500
To put this into a pandas DataFrame, use a list comprehension to split the strings into characters:
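A sketch; list() splits each binary string into single characters:

day3_df = pd.DataFrame([list(number) for number in r.day3]).astype(int)
day3_df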
0 1 2 3 4 5 6 7 8 9 10 11
0 0 0 1 0 0 0 0 1 0 1 0 1
1 0 1 0 0 1 0 1 1 1 1 1 0
2 0 0 1 0 1 0 1 1 0 1 1 1
3 0 0 1 0 0 1 0 1 1 1 0 1
4 0 0 1 0 0 1 0 1 0 0 1 1
.. .. .. .. .. .. .. .. .. .. .. .. ..
995 0 0 1 0 1 1 1 1 1 0 1 0
996 1 0 0 1 1 0 0 1 0 0 1 1
997 1 0 0 1 1 1 1 1 1 0 1 1
998 0 0 0 0 1 0 0 1 0 0 1 1
999 1 0 0 1 1 1 1 0 1 1 0 0
[1000 rows x 12 columns]
Then get gamma and epsilon rates:
gamma = day3_df.aggregate('mode')
# For epsilon rate, just swap the numbers
epsilon = gamma.replace([0, 1], [1, 0])
# Concatenate the bits into a single string
gamma = gamma.apply(lambda row: ''.join(row.values.astype(str)), axis = 1)[0]
epsilon = epsilon.apply(lambda row: ''.join(row.values.astype(str)), axis = 1)[0]
gamma; epsilon
'010100010100'
'101011101011'
Finally, use int() with base = 2 to convert to decimal:
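That is:

int(gamma, 2) * int(epsilon, 2)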
Next, you should verify the life support rating, which can be determined by multiplying the oxygen generator rating by the CO2 scrubber rating.
Both the oxygen generator rating and the CO2 scrubber rating are values that can be found in your diagnostic report - finding them is the tricky part. Both values are located using a similar process that involves filtering out values until only one remains. Before searching for either rating value, start with the full list of binary numbers from your diagnostic report and consider just the first bit of those numbers. Then:
- Keep only numbers selected by the bit criteria for the type of rating value for which you are searching. Discard numbers which do not match the bit criteria.
- If you only have one number left, stop; this is the rating value for which you are searching.
- Otherwise, repeat the process, considering the next bit to the right.
The bit criteria depends on which type of rating value you want to find:
- To find oxygen generator rating, determine the most common value (0 or 1) in the current bit position, and keep only numbers with that bit in that position. If 0 and 1 are equally common, keep values with a 1 in the position being considered.
- To find CO2 scrubber rating, determine the least common value (0 or 1) in the current bit position, and keep only numbers with that bit in that position. If 0 and 1 are equally common, keep values with a 0 in the position being considered.
Before doing anything, I need to alter my vector_mode function to deal with ties:
vector_mode_part2 <- function(x, freq_rank = 1) {
freq_table <- table(x) %>%
sort(decreasing = TRUE)
# If there is a tie
if (freq_table["0"] == freq_table["1"]) {
# And we're looking for the most frequent (oxygen rating)
if (freq_rank == 1) {
# Then return 1
return(1)
} else {
# Otherwise return 0 (CO2 rating)
return(0)
}
# Otherwise, return the value from the table as usual
} else {
freq_table %>%
names() %>%
pluck(freq_rank) %>%
as.integer()
}
}
This definitely isn’t the most efficient way to implement the bit criteria, but an easy solution is to just filter bit-by-bit.
oxygen_rating <- day3_df
for (bit in names(day3_df)) {
# If 1 number (row) remains, we have found the single oxygen rating
if (nrow(oxygen_rating) == 1) break
most_freq <- vector_mode_part2(oxygen_rating[[bit]])
oxygen_rating <- oxygen_rating %>%
filter(!!sym(bit) == most_freq)
}
oxygen_rating
# A tibble: 1 × 12
...1 ...2 ...3 ...4 ...5 ...6 ...7 ...8 ...9 ...10 ...11 ...12
<int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
1 0 1 0 1 0 0 1 0 1 1 1 1
co2_rating <- day3_df
for (bit in names(day3_df)) {
if (nrow(co2_rating) == 1) break
least_freq <- vector_mode_part2(co2_rating[[bit]], 2)
co2_rating <- co2_rating %>%
filter(!!sym(bit) == least_freq)
}
co2_rating
# A tibble: 1 × 12
...1 ...2 ...3 ...4 ...5 ...6 ...7 ...8 ...9 ...10 ...11 ...12
<int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
1 1 1 0 1 0 1 1 0 0 1 0 1
Convert the binary representations and compute the product:
tibble(
oxygen_rating = oxygen_rating %>% str_c(collapse = ""),
co2_rating = co2_rating %>% str_c(collapse = "")
) %>%
mutate(across(c(oxygen_rating, co2_rating), strtoi, base = 2),
prod = oxygen_rating * co2_rating)
# A tibble: 1 × 3
oxygen_rating co2_rating prod
<int> <int> <int>
1 1327 3429 4550283
It is simple enough to reproduce those loops in Python:
oxygen_rating = day3_df
for bit in day3_df:
if len(oxygen_rating) == 1:
break
bit_value_counts = oxygen_rating[bit].value_counts()
if bit_value_counts[1] >= bit_value_counts[0]:
oxygen_rating = oxygen_rating[oxygen_rating[bit] == 1]
else:
oxygen_rating = oxygen_rating[oxygen_rating[bit] == 0]
co2_rating = day3_df
for bit in day3_df:
if len(co2_rating) == 1:
break
bit_value_counts = co2_rating[bit].value_counts()
# In cases where there are no 0s or no 1s, need to fill with 0
bit_value_counts = bit_value_counts.reindex([0, 1], fill_value = 0)
if bit_value_counts[0] <= bit_value_counts[1]:
co2_rating = co2_rating[co2_rating[bit] == 0]
else:
co2_rating = co2_rating[co2_rating[bit] == 1]
# In part 1, I used apply, here I'll use aggregate along axis = 1
co2_rating = co2_rating.astype(str).aggregate(''.join, axis = 1).values[0]
oxygen_rating = oxygen_rating.astype(str).aggregate(''.join, axis = 1).values[0]
int(co2_rating, 2) * int(oxygen_rating, 2)
4550283
Bingo is played on a set of boards each consisting of a 5x5 grid of numbers. Numbers are chosen at random, and the chosen number is marked on all boards on which it appears. (Numbers may not appear on all boards.) If all numbers in any row or any column of a board are marked, that board wins. (Diagonals don’t count.)
The submarine has a bingo subsystem to help passengers (currently, you and the giant squid) pass the time. It automatically generates a random order in which to draw numbers and a random set of boards (your puzzle input).
The score of the winning board can now be calculated. Start by finding the sum of all unmarked numbers on that board; in this case, the sum is 188. Then, multiply that sum by the number that was just called when the board won, 24, to get the final score, 188 * 24 = 4512.
To guarantee victory against the giant squid, figure out which board will win first. What will your final score be if you choose that board?
Import the bingo input:
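A sketch of the import, assuming the raw lines live in day4.txt:

day4 <- readLines("day4.txt")
str_trunc(head(day4, 8), 50)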
[1] "94,21,58,16,4,1,44,6,17,48,20,92,55,36,40,63,62..."
[2] ""
[3] "49 74 83 34 40"
[4] "87 16 57 75 3"
[5] "68 94 77 78 89"
[6] "56 38 29 26 60"
[7] "41 42 45 19 1"
[8] ""
The data needs to be split up into the called numbers (at the top) and the boards. To do this, I’ll use this trick with cumsum that I found on Stack Overflow:
day4_split <- split(
day4[day4 != ""],
cumsum(day4 == "")[day4 != ""]
)
called_numbers <- day4_split[[1]]
bingo_boards <- day4_split[2:length(day4_split)]
str_trunc(called_numbers, 50); bingo_boards[1]
[1] "94,21,58,16,4,1,44,6,17,48,20,92,55,36,40,63,62..."
$`1`
[1] "49 74 83 34 40" "87 16 57 75 3" "68 94 77 78 89" "56 38 29 26 60"
[5] "41 42 45 19 1"
Now I need to convert the called_numbers to a numeric vector, and the bingo_boards to numeric matrices:
called_numbers <- strsplit(called_numbers, ",")[[1]] %>% as.integer()
bingo_boards <- bingo_boards %>%
map(
~ {
.x %>%
# str_squish replaces the double spaces before single digits numbers
# with single spaces, so that we can properly strsplit by " "
str_squish() %>%
str_split(" ") %>%
map(as.integer) %>%
unlist() %>%
matrix(nrow = 5, byrow = TRUE)
}
)
head(called_numbers); bingo_boards[1]
[1] 94 21 58 16 4 1
$`1`
[,1] [,2] [,3] [,4] [,5]
[1,] 49 74 83 34 40
[2,] 87 16 57 75 3
[3,] 68 94 77 78 89
[4,] 56 38 29 26 60
[5,] 41 42 45 19 1
Here is my iteration strategy for identifying and marking called numbers (not evaluated, just a demonstration with one board and one number):
bingo_board1 <- bingo_boards[[1]]
called_number1 <- 49 # suppose 49 was called
# Replace any 49s with -1
bingo_board1[bingo_board1 == called_number1] <- -1
# Look for any row or column sums that = -5 (all values = -1)
row_sums1 <- rowSums(bingo_board1)
col_sums1 <- colSums(bingo_board1)
# If we have bingo
if (-5 %in% c(row_sums1, col_sums1)) {
# Compute the sum of the uncalled (non-negative) numbers
uncalled_sum <- sum(bingo_board1[bingo_board1 > 0])
# Return the product as the answer to the puzzle
called_number1 * uncalled_sum
}
Now put it into a loop over all numbers and boards:
bingo_boards_part1 <- bingo_boards
for (called_number in called_numbers) {
bingo_boards_part1 <- map(
bingo_boards_part1,
~{
.x[.x == called_number] <- -1
.x
}
)
# Find any winning boards
bingo_board_winner <- map_lgl(
bingo_boards_part1,
~{-5 %in% c(rowSums(.x), colSums(.x))}
)
if (sum(bingo_board_winner) > 0) {
bingo_board_final <- bingo_boards_part1[bingo_board_winner]
break
}
}
bingo_board_final; called_number
$`19`
[,1] [,2] [,3] [,4] [,5]
[1,] 93 -1 26 35 39
[2,] 91 -1 85 69 -1
[3,] -1 -1 27 57 10
[4,] -1 -1 30 73 22
[5,] -1 -1 -1 -1 9
[1] 7
Then the solution is:
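Multiply the sum of the winning board's uncalled numbers by the last called number; final_board is my placeholder name, and the result should match the Python answer of 4662 below:

final_board <- bingo_board_final[[1]]
sum(final_board[final_board > 0]) * called_number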
For the Python solution, I’ll practice my list comprehension to compile the bingo boards:
called_numbers = [int(s) for s in r.day4[0].split(',')]
# Find the indices of the '' characters separating the bingo boards
bingo_boards_sep = [i for i,j in enumerate(r.day4) if j == '']
# Compile a list of bingo boards
bingo_boards = [r.day4[(i+1):(i+6)] for i in bingo_boards_sep]
# For each row of each board, split the string into multiple values
bingo_boards = [[board_row.split() for board_row in board] for board in bingo_boards]
That last line is a bit of a mess – it is a nested list comprehension loop which iterates over boards and then iterates over rows of each board to split the string into single values – but converting it all to numeric arrays is now simple:
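For example:

bingo_boards = [np.array(board, dtype = int) for board in bingo_boards]
bingo_boards[0]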
array([[49, 74, 83, 34, 40],
[87, 16, 57, 75, 3],
[68, 94, 77, 78, 89],
[56, 38, 29, 26, 60],
[41, 42, 45, 19, 1]])
Now I can re-create the same loop from the R solution:
# In Python, you use deepcopy() to make copies of nested structures like this
import copy
bingo_boards_part1 = copy.deepcopy(bingo_boards)
for called_number in called_numbers:
# For each board, mark the called numbers as -1
for i,b in enumerate(bingo_boards_part1):
bingo_boards_part1[i][bingo_boards_part1[i] == called_number] = -1
# Find winning boards
winners = [-5 in np.concatenate([board.sum(axis = 0), board.sum(axis = 1)]) \
for board in bingo_boards_part1]
if True in winners:
bingo_board_final = bingo_boards_part1[winners.index(True)]
break
bingo_board_final[bingo_board_final > 0].sum() * called_number
4662
On the other hand, it might be wise to try a different strategy: let the giant squid win.
You aren’t sure how many bingo boards a giant squid could play at once, so rather than waste time counting its arms, the safe thing to do is to figure out which board will win last and choose that one. That way, no matter which boards it picks, it will win for sure.
Figure out which board will win last. Once it wins, what would its final score be?
Simple enough to alter the loop to iteratively remove winning boards until one remains:
bingo_boards_part2 <- bingo_boards
for (called_number in called_numbers) {
bingo_boards_part2 <- map(
bingo_boards_part2,
~{
.x[.x == called_number] <- -1
.x
}
)
# Find any winning boards
bingo_board_winner <- map_lgl(
bingo_boards_part2,
~{-5 %in% c(rowSums(.x), colSums(.x))}
)
# If more than one board remains, remove winners
if (length(bingo_boards_part2) > 1) {
bingo_boards_part2 <- bingo_boards_part2[!bingo_board_winner]
} else {
# Otherwise, continue until the last board wins
if (sum(bingo_board_winner) > 0) {
bingo_board_final <- bingo_boards_part2[bingo_board_winner]
break
}
}
}
bingo_board_final; called_number
$`14`
[,1] [,2] [,3] [,4] [,5]
[1,] -1 -1 -1 -1 -1
[2,] -1 -1 -1 -1 0
[3,] 10 38 -1 -1 25
[4,] -1 11 -1 -1 -1
[5,] -1 -1 67 -1 -1
[1] 80
And the product:
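As in part 1 (final_board is again my placeholder name); this should match the Python answer of 12080 below:

final_board <- bingo_board_final[[1]]
sum(final_board[final_board > 0]) * called_number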
Python:
bingo_boards_part2 = copy.deepcopy(bingo_boards)
for called_number in called_numbers:
for i,b in enumerate(bingo_boards_part2):
bingo_boards_part2[i][bingo_boards_part2[i] == called_number] = -1
winners = [-5 in np.concatenate([board.sum(axis = 0), board.sum(axis = 1)]) \
for board in bingo_boards_part2]
# If more than one board remains, remove winners
if len(bingo_boards_part2) > 1:
bingo_boards_part2 = [b for i,b in \
enumerate(bingo_boards_part2) if not winners[i]]
else:
if True in winners:
bingo_board_final = bingo_boards_part2[winners.index(True)]
break
bingo_board_final[bingo_board_final > 0].sum() * called_number
12080
You come across a field of hydrothermal vents on the ocean floor! These vents constantly produce large, opaque clouds, so it would be best to avoid them if possible. They tend to form in lines; the submarine helpfully produces a list of nearby lines of vents (your puzzle input) for you to review.
Each line of vents is given as a line segment in the format x1,y1 -> x2,y2 where x1,y1 are the coordinates of one end of the line segment and x2,y2 are the coordinates of the other end. These line segments include the points at both ends. For now, only consider horizontal and vertical lines: lines where either x1 = x2 or y1 = y2.
To avoid the most dangerous areas, you need to determine the number of points where at least two lines overlap. Consider only horizontal and vertical lines. At how many points do at least two lines overlap?
Import the lines:
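A sketch, assuming one segment per line in day5.txt:

day5 <- readLines("day5.txt")
head(day5)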
[1] "223,805 -> 223,548" "609,164 -> 609,503" "461,552 -> 796,552"
[4] "207,361 -> 207,34" "503,879 -> 503,946" "937,52 -> 937,268"
I’ll use a series of separate() calls to get the numeric coordinates:
day5_df <- tibble(x = day5) %>%
separate(x, into = c("x1_y1", "x2_y2"), sep = " -> ") %>%
separate(x1_y1, into = c("x1", "y1"), sep = ",", convert = TRUE) %>%
separate(x2_y2, into = c("x2", "y2"), sep = ",", convert = TRUE)
head(day5_df)
# A tibble: 6 × 4
x1 y1 x2 y2
<int> <int> <int> <int>
1 223 805 223 548
2 609 164 609 503
3 461 552 796 552
4 207 361 207 34
5 503 879 503 946
6 937 52 937 268
Get the straight lines by looking for x1 == x2 or y1 == y2, then use crossing to get all the points touched by each line:
# Only straight lines
day5_straight <- day5_df %>%
filter((x1 == x2) | (y1 == y2)) %>%
rowwise() %>%
mutate(xy = list(crossing(x = x1:x2, y = y1:y2))) %>%
ungroup()
# As an example, show the first few points crossed by the first line
day5_straight %>%
slice(1) %>%
unnest(xy) %>%
head()
# A tibble: 6 × 6
x1 y1 x2 y2 x y
<int> <int> <int> <int> <int> <int>
1 223 805 223 548 223 548
2 223 805 223 548 223 549
3 223 805 223 548 223 550
4 223 805 223 548 223 551
5 223 805 223 548 223 552
6 223 805 223 548 223 553
Now to find the dangerous points, I just need to look for any combinations of x and y that occur more than once:
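Presumably by unnesting the points and counting, as in the combined part 2 solution below:

day5_straight %>%
  unnest(xy) %>%
  count(x, y) %>%
  summarise(dangerous_points = sum(n > 1))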
# A tibble: 1 × 1
dangerous_points
<int>
1 7142
Create the same data frame as a pandas DataFrame:
day5_df = [[coord.split(',') for coord in line.split(' -> ')] \
for line in r.day5]
# "Flatten" the lists so that each element has the four coordinates
day5_df = [xy[0] + xy[1] for xy in day5_df]
day5_df = pd.DataFrame(day5_df, columns = ['x1', 'y1', 'x2', 'y2']).astype(int)
day5_df.head()
x1 y1 x2 y2
0 223 805 223 548
1 609 164 609 503
2 461 552 796 552
3 207 361 207 34
4 503 879 503 946
Find the straight lines with query:
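Mirroring the diagonal-line query used in part 2:

day5_straight = day5_df.query('(x1 == x2) | (y1 == y2)')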
I’m going to brute force a solution here with a for loop and a grid of values:
ocean_floor = np.zeros((1000, 1000))
for index, row in day5_straight.iterrows():
# Need to fix the range() step if going "backwards"
x_step = 1 if row.x1 <= row.x2 else -1
y_step = 1 if row.y1 <= row.y2 else -1
for x in range(row.x1, row.x2 + x_step, x_step):
for y in range(row.y1, row.y2 + y_step, y_step):
ocean_floor[x, y] += 1
np.count_nonzero(ocean_floor > 1)
7142
Unfortunately, considering only horizontal and vertical lines doesn’t give you the full picture; you need to also consider diagonal lines. Because of the limits of the hydrothermal vent mapping system, the lines in your list will only ever be horizontal, vertical, or a diagonal line at exactly 45 degrees. Consider all of the lines. At how many points do at least two lines overlap?
Find the diagonal line points:
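A sketch that builds x and y as list-columns (so they can be unnested below); because the diagonals are exactly 45 degrees, x1:x2 and y1:y2 always have equal lengths:

day5_diag <- day5_df %>%
  filter((x1 != x2) & (y1 != y2)) %>%
  rowwise() %>%
  mutate(x = list(x1:x2), y = list(y1:y2)) %>%
  ungroup()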
Combine the straight and diagonal lines and add up the points:
bind_rows(
day5_straight %>% unnest(xy),
day5_diag %>% unnest(c(x, y))
) %>%
count(x, y) %>%
summarise(dangerous_points = sum(n > 1))
# A tibble: 1 × 1
dangerous_points
<int>
1 20012
For the Python brute force solution, I can continue adding to the existing ocean_floor grid from part 1:
day5_diag = day5_df.query('(x1 != x2) & (y1 != y2)')
for index, row in day5_diag.iterrows():
x_step = 1 if row.x1 <= row.x2 else -1
y_step = 1 if row.y1 <= row.y2 else -1
for x, y in zip(range(row.x1, row.x2 + x_step, x_step),
range(row.y1, row.y2 + y_step, y_step)):
ocean_floor[x, y] += 1
np.count_nonzero(ocean_floor > 1)
20012
Here are my personal stats so far:
tibble::tribble(
~Part, ~Day, ~Time, ~Rank, ~Score,
1, 5, "00:50:24", 6542, 0,
2, 5, "00:57:17", 4865, 0,
1, 4, "10:52:11", 33771, 0,
2, 4, "11:07:58", 30829, 0,
1, 3, "10:43:14", 64952, 0,
2, 3, "11:43:13", 45788, 0,
1, 2, "11:48:16", 74444, 0,
2, 2, "12:21:09", 72356, 0,
1, 1, "13:02:23", 72332, 0,
2, 1, "13:23:44", 63804, 0
) %>%
pivot_wider(names_from = Part, values_from = c(Time, Rank, Score),
names_glue = "Part {Part}_{.value}") %>%
mutate(
`Time between parts` = as.numeric(hms(`Part 2_Time`) - hms(`Part 1_Time`),
"minutes") %>% round(1)
) %>%
gt() %>%
tab_spanner_delim(delim = "_", split = "first")
Day | Part 1 Time | Part 2 Time | Part 1 Rank | Part 2 Rank | Part 1 Score | Part 2 Score | Time between parts (min)
---|---|---|---|---|---|---|---
5 | 00:50:24 | 00:57:17 | 6542 | 4865 | 0 | 0 | 6.9
4 | 10:52:11 | 11:07:58 | 33771 | 30829 | 0 | 0 | 15.8
3 | 10:43:14 | 11:43:13 | 64952 | 45788 | 0 | 0 | 60.0
2 | 11:48:16 | 12:21:09 | 74444 | 72356 | 0 | 0 | 32.9
1 | 13:02:23 | 13:23:44 | 72332 | 63804 | 0 | 0 | 21.4
Except for day 5 (when I stayed up late because it was the weekend), I’ve been completing the puzzles around lunch time on my break from work.
These 0 scores come from the global leaderboard, which only gives points to the first 100 users to finish each puzzle; I definitely won’t be finishing that fast. A better benchmark is the private leaderboard:
library(httr)
leaderboard <- httr::GET(
url = "https://adventofcode.com/2021/leaderboard/private/view/1032765.json",
httr::set_cookies(session = Sys.getenv("AOC_COOKIE"))
) %>%
content() %>%
as_tibble() %>%
unnest_wider(members) %>%
arrange(desc(local_score)) %>%
transmute(
Rank = 1:n(), Name = name, Score = local_score, Stars = stars
)
leaderboard %>%
gt() %>%
text_transform(
locations = cells_body(columns = Stars),
fn = function(stars_col) {
map_chr(stars_col,
~ html(rep(fontawesome::fa('star', fill = 'gold'),
times = as.integer(.x))))
}
) %>%
cols_align("left") %>%
tab_style(
style = list(cell_text(weight = "bold")),
locations = cells_body(
rows = (Name == "taylordunn")
)
) %>%
tab_options(container.height = 500)
Rank | Name | Score
---|---|---
1 | Emil Hvitfeldt | 1191
2 | Colin Rundel | 1183
3 | trang1618 | 1170
4 | @_TanHo | 1154
5 | mkiang | 1126
6 | gpecci | 1112
7 | David Robinson | 1112
8 | Jarosław Nirski | 1110
9 | dhimmel | 1105
10 | hrushikeshrv | 1032
11 | pritikadasgupta | 1028
12 | Ildikó Czeller | 1026
13 | Sherry Zhang | 1023
14 | john-b-edwards | 1014
15 | @_mnar99 | 1012
16 | Melinda Tang | 984
17 | Josh Gray | 952
18 | Jonathan Spring | 952
19 | Derek Holliday | 946
20 | Jacqueline Nolis | 905
21 | Anna Fergusson | 901
22 | Andrew Argeros | 894
23 | Zach Bogart 💙 | 891
24 | exunckly | 876
25 | Riinu Pius | 869
26 | HannesOberreiter | 863
27 | Tokhir Dadaev | 856
28 | ashbaldry | 835
29 | cramosu | 831
30 | Tom Jemmett | 795
31 | fabio machado | 791
32 | @Mid1995Sed | 766
33 | karawoo | 761
34 | Flavien Petit | 756
35 | Farhan Reynaldo | 751
36 | patelis | 746
37 | Arun Chavan | 746
38 | @woodspock | 740
39 | delabj | 737
40 | MetaMoraleMundo | 715
41 | Calum You | 712
42 | Daniel Coulton | 708
43 | mbjoseph | 684
44 | Alex N | 681
45 | KT421 | 680
46 | jordi figueras puig | 680
47 | pi55p00r | 671
48 | scalgary | 659
49 | Erez Shomron | 654
50 | Jim Leach | 649
51 | rywhale | 641
52 | Matt Onimus | 635
53 | collinberke | 632
54 | Ghislain Nono Gueye | 628
55 | ldnam | 612
56 | taylordunn | 606
57 | jaapwalhout | 603
58 | Jeffrey Brabec | 589
59 | dirkschumacher | 574
60 | AlbertRapp | 561
61 | Nathan Moore | 556
62 | mfiorina | 553
63 | Miha Gazvoda | 553
64 | Nerwosolek | 545
65 | TylerGrantSmith | 542
66 | Darrin Speegle | 540
67 | duju211 | 525
68 | blongworth | 514
69 | Sydney | 510
70 | Kelly N. Bodwin | 508
71 | David Schoch | 503
72 | long39ng | 489
73 | CarlssonLeo | 486
74 | A-Farina | 468
75 | cathblatter | 463
76 | Scott-Gee | 445
77 | Julian Tagell | 434
78 | Josiah Parry | 419
79 | thedivtagguy | 400
80 | jwinget | 397
81 | andrew-tungate-cms | 385
82 | @mfarkhann | 382
83 | @Maatspencer | 372
84 | @KentWeyrauch | 368
85 | Andrew Tungate | 365
86 | Emryn Hofmann | 359
87 | columbaspexit | 346
88 | ALBERT | 343
89 | Maya Gans | 338
90 | Alan Feder | 314
91 | Jenna Jordan | 306
92 | Kevin Kent | 305
93 | olmgeorg | 296
94 | Wendy Christensen | 291
95 | Eric Ekholm | 287
96 | Daniel Gemara | 276
97 | AmitLevinson | 268
98 | quickcoffee | 258
99 | cynthiahqy | 235
100 | Andrew Fraser | 226
101 | jennifer-furman | 223
102 | soto solo | 222
103 | antdurrant | 211
104 | Adrian Perez | 196
105 | Billy Fryer | 186
106 | April | 181
107 | Lukas Gröninger | 156
108 | Jose Pliego San Martin | 113
109 | aleighbrown | 106
110 | Kyle Ligon | 105
111 | Bruno Mioto | 85
112 | Duncan Gates | 68
113 | @jdknguyen | 30
114 | Matthew Wankiewicz | 17
115 | chapmandu2 | 0
116 | NA | 0
117 | Wiktor Jacaszek | 0
118 | jacquietran | 0
119 | Tony ElHabr | 0
120 | Rizky Luthfianto | 0
121 | CaioBrighenti | 0
Currently at rank 56, so about middle of the pack.
setting value
version R version 4.2.1 (2022-06-23 ucrt)
os Windows 10 x64 (build 19044)
system x86_64, mingw32
ui RTerm
language (EN)
collate English_Canada.utf8
ctype English_Canada.utf8
tz America/Curacao
date 2022-10-27
pandoc 2.18 @ C:/Program Files/RStudio/bin/quarto/bin/tools/ (via rmarkdown)
python: C:/Users/tdunn/Documents/.virtualenvs/r-reticulate/Scripts/python.exe
libpython: C:/Users/tdunn/AppData/Local/r-reticulate/r-reticulate/pyenv/pyenv-win/versions/3.9.13/python39.dll
pythonhome: C:/Users/tdunn/Documents/.virtualenvs/r-reticulate
version: 3.9.13 (tags/v3.9.13:6de2ca5, May 17 2022, 16:36:42) [MSC v.1929 64 bit (AMD64)]
Architecture: 64bit
numpy: C:/Users/tdunn/Documents/.virtualenvs/r-reticulate/Lib/site-packages/numpy
numpy_version: 1.23.3
NOTE: Python version was forced by use_python function
Local: main C:/Users/tdunn/Documents/tdunn-quarto
Remote: main @ origin (https://github.com/taylordunn/tdunn-quarto.git)
Head: [4eb5bf2] 2022-10-26: Added font import to style sheet
@online{dunn2021,
author = {Dunn, Taylor},
title = {Advent of {Code} 2021: {Days} 1-5},
date = {2021-12-01},
url = {https://tdunn.ca/posts/2021-12-01-advent-of-code-2021-days-1-5},
langid = {en}
}