Advent of Code (Day 1, Part 2)

Advent of Code (Day 1, Part 2)

~Counting down the days to the Christmas holidays :)

In my previous blog post, I went through my solution in detail for part 1 of AoC 2023 day 1. In this post, I would like to do a walkthrough of day 1, part 2. The solution and documentation for part 2 will be relatively shorter as compared to part 1. That being said, after going through the solution, I would also like to discuss about some challenges I faced in part 2 and some key learning points.

I would like to reemphasize that I am by no means a competitive programmer nor that my solution is the most efficient.

With that out of the way, let's begin.

The Puzzle (Part 2)

Above is the puzzle given for part 2. The calibration document which contains all the lines is also the same from part 1.

Seems like the task is largely the same as before except now written numbers must be converted to their numerical representation. After the conversion is complete, the first and last number in the line will be concatenated to form a 2-digit number. The sum of all the 2-digit numbers for all the 1001 lines will be the answer.

Since part 1 was solved using Python3, it would only make sense to implement part 2 using the same language.

Before considering producing the 2-digit value for each line in the calibration document, I made a small test program to check if my solution was able to convert the written numbers to numerals. Once the test program was able to correctly run all the 7 lines given in the puzzle above, I made the program into a function and integrated said function to my code from part 1.

Code Walkthrough

def converter(rawinput):

    cases = { "one": "one1one",  "two": "two2two", "three": "three3three", "four": "four4four",  "five": "five5five", "six": "six6six",  "seven": "seven7seven", "eight": "eight8eight", "nine":"nine9nine"}

    for key, value in cases.items():
        rawinput = rawinput.replace(key, value)

    return rawinput

Above is the aptly named converter(rawinput) function. This function takes in a single line of the calibration document as an input argument as a string(rawinput).

Within this function, a dictionary (cases) is declared. The values in the keys of this dictionary are the written numbers and their corresponding values are the numerical representations.

However, you may notice that the values in the dictionary are somewhat strange. Instead of the "one" key being mapped to a "1" value, the value is instead "one1one". The explanation for this can be found below in Subsection 2 of the "Challenges Faced, Lessons Learned" section where I discuss the technical difficulties faced for part 2.

The function will then search for the keys of the dictionary (which can be thought of as substrings) in the input string (rawinput) using the for-loop. If any of the key's exist in the input string, the substring will be swapped with the mapped value of the key. The input string will then be updated.

Once the for-loop has completed execution, the function will the return the converted input string (rawinput). Yes, in hindsight, I could have assigned the return output string of the function to another variable instead of keeping the same name to avoid potential confusion.

The return value of this function can then be fed as an input argument to the lineprocessor(name) function in part 1.

Before moving forward, let's look at a few examples of the conversion done by the function.

  • two1nine = two2two1nine9nine

  • eightwothree = eight8eighttwo2twothree3three

  • zoneight234 = zone1oneeight8eight234

Compiling Function Outputs

for i in range(len(mainlst)):
    convertedlst.append(converter(mainlst[mainlst_index]))    #converter function applied to all elements in the input file. Converter function converts all the written numbers in input text file to digits. For example, "eight" to "8". 
                                                                #Appends all converted elements into convertedlst.
    mainlst_index = mainlst_index + 1

mainlst_index = 0

for i in range(len(convertedlst)):
    runningsum.append(lineprocessor(convertedlst[mainlst_index]))    #Runs the lineprocessor function on all elements in the convertedlst and appends the output of each element to runningsum list.
    mainlst_index = mainlst_index + 1

totalsum = sum(runningsum)    #Finds the sum of all elements in the runningsum list

print("Number of distinct elements in input file=", len(mainlst))
print("Sum of all calibration values=", totalsum)

Above are the lines of code to process the lines of the calibration document input text file through the 2 functions. Since there are 2 functions to be run, the flow of the code has changed slightly compared to part 1.

for i in range(len(mainlst)):
    convertedlst.append(converter(mainlst[mainlst_index]))    #converter function applied to all elements in the input file. Converter function converts all the written numbers in input text file to digits. For example, "eight" to "8". 
                                                                #Appends all converted elements into convertedlst.
    mainlst_index = mainlst_index + 1

mainlst_index = 0

This for-loop essentially takes processes the lines of the calibration document. Since the calibration document has been saved into the program as a list (mainlst), the for-loop shown here simply goes through the list by incrementing the list index (mainlst_index).

For every index of the list, the converter() function will be applied to the element. Once the element has been processed, it will be appended to another list named "convertedlst".

Once the entire input list (mainlst) has been processed, the index of said list will be reset to "0".

for i in range(len(convertedlst)):
    runningsum.append(lineprocessor(convertedlst[mainlst_index]))    #Runs the lineprocessor function on all elements in the convertedlst and appends the output of each element to runningsum list.
    mainlst_index = mainlst_index + 1

totalsum = sum(runningsum)    #Finds the sum of all elements in the runningsum list

print("Number of distinct elements in input file=", len(mainlst))
print("Sum of all calibration values=", totalsum)

This portion of the program is very similar to part 1. The only difference is that the lineprocessor() function will take in the previously generated "convertedlst" as an input argument instead.

This way, the lineprocessor() can take into account the entire calibration document which has had the written numbers swapped for numerical digits.

Submission Time!!

And there it is!! Another gold star :)

This marks the end of day 1 of AoC 2023. I do have to say that part 2 was much trickier to get the correct answer. In fact, it took 2 wrong submission attempts before the answer turned out to be correct.

Challenges Faced, Lessons Learned

Alright, it's time to review and recap on the challenges I faced along the way. I will break this section down into 2 subsections. Each of the subsections cover a particular challenge I encountered during part 2.

Subsection 1 (A Makeshift Dictionary)

def converter(rawinput):

    substring_lst= ["one", "two", "three", "four", "five", "six", "seven", "eight", "nine"]
    replacement_lst = ["1", "2", "3", "4", "5", "6", "7", "8", "9"]
    shared_index = 9

    for i in range(shared_index):    

    #everytime the loop runs, it starts checking if the last element of substring_lst exists in the input. The last element is "nine". If a "nine" exists, it will be swapped with the corresponding index of replacement_lst.
    #after the first iteration, it will check "eight" and do the same. So on and so fourth till till first element of sublist_lst which is "zero"

        if substring_lst[shared_index] not in rawinput:    #if "nine" is not in the string, minus 1 from shared_index to check if "eight" is in.
            shared_index = shared_index -1

        elif substring_lst[shared_index]  in rawinput:    #if "nine" is in, swap "nine" with "9".
            rawinput = rawinput.replace(substring_lst[shared_index] , replacement_lst[shared_index])
            shared_index = shared_index -1

    return rawinput

Above is an early version of the converter(rawinput) function. The approach used here is different from my final implementation and this subsection will cover the details.

In this implementation I did not think of using a dictionary to swap the written numbers of the input argument with the appropriate numerical representations. Instead, I created 2 lists containing 9 elements that share a common index.

Within the first list (substring_lst), the 9 elements are the written numbers from "one" to "nine" with "one" being at index "0".

The second list (replacement_lst) is identical except the elements are the numerical representations of the written numbers "1" to "9" with "1" being at index "0".

The use of these 2 lists will be explained below when I go into detail about the for-loop block.

for i in range(shared_index):    

    #everytime the loop runs, it starts checking if the last element of substring_lst exists in the input. The last element is "nine". If a "nine" exists, it will be swapped with the corresponding index of replacement_lst.
    #after the first iteration, it will check "eight" and do the same. So on and so fourth till till first element of sublist_lst which is "zero"

        if substring_lst[shared_index] not in rawinput:    #if "nine" is not in the string, minus 1 from shared_index to check if "eight" is in.
            shared_index = shared_index -1

        elif substring_lst[shared_index]  in rawinput:    #if "nine" is in, swap "nine" with "9".
            rawinput = rawinput.replace(substring_lst[shared_index] , replacement_lst[shared_index])
            shared_index = shared_index -1

This for-loop will run the same number of times as the initialized value of the "shared_index" local variable which has been set to "9".

Every time the loop runs, the first "if" block will check if index "9" of the "substring_lst" list exists in the function input. In other words, it will check if the input string contains the substring "nine". If the input does not contain the substring, the "shared_index" will be decremented by 1 so that the previous element of "substring_lst" can be checked.

In the case where the first "if" block does detect that the substring exists in the input string, the first "elif" block will be executed. Within this block, the substring that was detected in the input string will be replaced with its numerical equivalent. For example, if a "nine" is detected, the code block will swap the "nine" with a "9".

The program knows which element of the "substring_lst" list should be swapped with which element of the "replacement_lst" since both these lists share a common index (shared_index). At any point within the external for-loop, the shared index will be the same unless either the "if" or the "elif" block has been executed. Then, the common index will be decremented by 1 so that the next element of "substring_lst" can be searched for in the input string.

This was the solution I came up with before using a dictionary. The outputs given by this solution were incorrect, but it is not because of a flaw in code logic.

Below is an example of an incorrect output generated by this early implementation:

  • zoneight234 = zon8234

In this example, the correct output should be "z18234". There are 2 reasons why the output is incorrect.

  1. First, the "eight" in this string is replaced with "8". Doing this will make the input "zon8234". Because the "eight" shares its first letter "e" with the "one" that comes before it, replacing the "eight" will get rid of the last letter of "one". As the program checks down the elements of "substring_lst", a "one" will not be found. This is the first error.

  2. Secondly, there is an issue of which written number should be replaced first. In my implementation I started checking if the input string contains the last element at the end of "substring_lst" which is "nine". This means that larger numbers will be replaced first and if that were to happen, the smaller numbers may not be recognized and replaced since a letter is missing.

    If the order at which the elements of the list are checked were to be reversed, the same thing will happen to bigger numbers that come after the smaller numbers that have already been replaced.

    If the program checks starting from the end of the "substring_lst" list, the output would be "zon8234".

    If it were to check from the start of said list, the output would then be "z1ight234".

    This is the second error.

In hindsight, while this implementation might be less efficient than a dictionary, I do believe that both errors can be rectified if I made one minor adjustment. I will discuss in detail about this "minor adjustment" in subsection 2.

Subsection 2 (Replacement with Padding)

cases = { "one": "one1one",  "two": "two2two", "three": "three3three", "four": "four4four",  "five": "five5five", "six": "six6six",  "seven": "seven7seven", "eight": "eight8eight", "nine":"nine9nine"}

Above is the dictionary declared in the final implementation of the solution. In addition to switching to a dictionary instead of using the 2 lists that share a common index, there is one additional minor adjustment.

As shown, the key of the first item in the dictionary which is "one" is mapped to "one1one".

Since the key is mapped to the value of "one1one", every time the key appears in the input string of the function, the value will be replaced in the input string.

Doing this solves both the issues raised in Subsection 1. Let's examine by using the same input string as Subsection 1.

  1. The first issue raised was that replacing the written number substring that exists in the input string with its numerical value may "obscure" other written digits. However, if we were to do the same process but change what we replace the written number with, we can bypass this issue. This is why the value of "one1one" for example is padded by the written numbers "one" at both the start and the end.

    Let's try it out on the input string "zoneight234".

    Now, the string becomes "zone1oneeight8eight234". The "one" and "eight" are successfully recognized and replaced and we have solved the first issue.

  2. The second issue raised was the "priority" problem. No matter which way we start checking the "substring_lst" list from, a smaller or bigger number that has yet to be checked and replaced may end up being obscured.

    By replacing the written numbers with padding, the issue of priority effectively disappears. It does not matter whether we check from the start or end of the list anymore since the replaced written numbers that were replaced first will no longer obscure any numbers to be replaced later.

Conclusion

That's all for day 1 of AoC 2023!

Upon completion of both parts, I believe I can now better approach future problems from multiple viewpoints. I believe that there are many ways a problem can be solved and sometimes a little "out-of-the-box" thinking is necessary.

I believe a possible area of self-improvement that was identified upon completion was to avoid hyper-fixation on only 1 approach to solve a complex problem. Admittedly, this is quite a common pitfall that I find myself in.

Moving forward, my hope is that I will better avoid this pitfall in the future and leave myself more open to exploring other possible solutions.

Perhaps I'll catch less of myself doing this for future AoC puzzles :)

Happy Holidays!!