cm

February 5, 2024

Using Deque and Set: Breaking Down Another Meta Level 1 Coding Challenge

In my last blog post I broke down a public level 1 Meta coding challenge from their website metacareers.com. These code puzzles are intended as practice for technical interviews for engineering roles at Meta. Here I will use Python again to break down one more level 1 puzzle: "kaitenzushi". I chose to write a blog post about it, because I got particular enjoyment out of it despite its simplicity.

First, let's take a look at the puzzle. Just like last time we're given a prompt, some constraints, and the function parameters. The basic premise is that you are at a sushi place that uses a kaiten (conveyor) belt to send sushi around to customers. You can't control what sushi is on the belt, represented by integer List D of length N, but you decide to take and eat only the sushi dishes that are of a different type (represented as integers) than the previous K (integer) types. The puzzle is to return the integer value of the maximum number of sushi dishes you can eat. Here is the starter function (in Python):
from typing import List
# Write any import statements here

def getMaximumEatenDishCount(N: int, D: List[int], K: int) -> int:
  # Write your code here
  return 0

And the constraints (the upper bounds are somewhat unrealistic!):
1 ≤ N ≤ 500,000
1 ≤ K N
1 ≤ Di ​≤ 1,000,000
Execution time ≤ 5 seconds

To begin solving this puzzle my first thought was to use a sliding window, because I pictured needing to know the last K number of dishes. But what do we actually need to check? Two things:
  1. Have I eaten this (Di) type of dish yet?
  2. Have I eaten K different types of dishes yet?

At first, it sounds like a Set might be an efficient and sufficient data structure to use to track our eaten dish types. Sets are often useful when we need a unique collection of items, because they enforce uniqueness (they also offer some neat comparison methods, but we don't need those here). For example, if we have a Set { 1, 2, 3 } and try to add 1 again we'll find the Set unchanged after the operation and without an error thrown. Sets are like key-only dictionaries with comparable lookup and insertion times to that of Dictionaries (O(1) in best case).

We can iterate over the dishes in List D and add them to our Set. Only unique ones will get added. Then we can check the length of the Set and remove the first dish type if the length equals K and we encounter a new type we're going to eat. Unfortunately, Sets are not ordered so how can we determine which element to remove? Since Python 3.7 Dictionaries are ordered. However, if we implement this solution we will find that it fails due to exceeding the execution time limit (5 seconds). Let's take a look at the function:
def getMaximumEatenDishCount(N: int, D: List[int], K: int) -> int:
  tried_types: dict = {}
  acc: int = 0
  
  for d in D:
    if d not in tried_types:
      tt_keys: List[int] = list(tried_types)
      if len(tt_keys) == K:
        del tried_types[tt_keys[0]]

      tried_types[d]: int = d
      acc += 1
  
  return acc

Right away we can see that the problem likely lies in the fact that in the worst case on each iteration of D we have to perform three lookups on our tried_types Dictionary while also creating a new List out of the existing keys. Additionally, you may have noticed that we end up checking if d (Di) is not in tried_types (the Dictionary). Since we have to perform more logic than simply adding new types to our data object, it is more efficient to check if the dish type is there first. Therefore, the uniqueness property of Dictionaries (and Sets) are not directly needed, but their quick lookup is even more helpful.

So rather than create a List on each iteration, let's move the List outside of the For loop entirely. We can track the types in the List by updating the List in our For loop logic. We can continue to use the Dictionary for lookups, but since we don't need key-value pairs let's refactor it to use Sets:
def getMaximumEatenDishCount(N: int, D: List[int], K: int) -> int:
  tried_types_list: List[int] = []
  tried_types_set: set[int] = set()
  acc: int = 0
  
  for d in D:
    if d not in tried_types_set:
      if len(tried_types_set) == K:
        old: int = tried_types_list.pop(0)
        tried_types_set.remove(old)

      tried_types_set.add(d)
      tried_types_list.append(d)
      acc += 1
  
  return acc

This solution fails on only one out of 33 tests, due to exceeding the execution time limit (5 seconds). To pass this puzzle we'll need to make this algorithm even more efficient. Lists are known for being fairly efficient at adding and removing elements from the end (also known as a Stack), but what about our use case of removing an element from the front as well (also known as a Queue)? Is there something more efficient than a List to act as a Queue? Yes: the deque, whose name is a portmanteau of double-ended and queue. Deques have some nifty features we can use to make our code slightly simpler as well. Let's take a look at our function utilizing deque and then I'll explain:
from typing import List
from collections import deque

def getMaximumEatenDishCount(N: int, D: List[int], K: int) -> int:
  tried_types_deque: deque[int] = deque(maxlen=K)
  tried_types_set: set[int] = set()
  acc: int = 0
  
  for d in D:
    if d not in tried_types_set:
      if len(tried_types_set) == K:
        tried_types_set.remove(tried_types_deque[0])
        
      tried_types_deque.append(d)
      tried_types_set.add(d)
      acc += 1
  
  return acc

This function passes all 33 of the tests. You may notice that the deque (tried_types_deque) only has two operations performed on it: a lookup of the zeroth index and an append to the end. We never remove the first element. Or do we? If we look at where the deque is initialized you can see that I provided the optional parameter of "maxlen" (maximum length) and set it to K. This means our deque will never exceed that length and additionally it will automatically remove elements on a FIFO basis (first-in, first-out). So by simply specifying the max length we get the Queue behavior we wanted from a data structure that is already optimized for this behavior. 

Each of the steps in the For loop can be done in constant time in the best case (O(1)). Therefore, the overall time complexity of this algorithm is O(N) where N is the length of the List D. 

About cm