Thawing from deep freeze

Like many people in the world, I’m still recuperating from the charred and smoking ruins of 2020, but have started to figure out how to function again. You can glimpse a little sliver of that effort at innerdemons.me, a mental health resource I built that’s sorted by difficult emotions you may currently be having, or for people who are looking to start therapy.

On the bright side, I’m finally, finally going to Recurse Center for a full batch! The minibatch I did right before lockdown whetted my appetite, but I got derailed by all of 2020.

I’m pretty sure that working out my volitional muscles at Recurse Center much earlier in life would have paid off massive dividends in fulfillment by now – I’ve been considering going since 2013, when it was still called “Hacker School”, but kept postponing for some logistical reason or other. If you love the idea but are logistically on the fence, all I can say is: JUST GO FOR IT, BUDDY.

My batch doesn’t start for another month, but I wanted to get into the mood and loosen up my code-brain, which is gaunt and starved from hibernation. I ran a very overdue OS update, and then warmed up with a handful of r/dailyprogrammer problems for fun! (r/dailyprogrammer is not in fact daily, and has been dormant for some time, but there are still lots of great problems with a variety of solution approaches in the archives).

#383: Necklace Matching

Two strings describe the same necklace if you can remove some number of letters from the beginning of one, attach them to the end in their original ordering, and get the other string. Reordering the letters in some other way does not, in general, produce a string that describes the same necklace. Write a function that describes whether two strings can be the same necklace.

In other words, determine whether two strings are equivalent under some rotation cipher. I wrote a super straightforward brute-force solution since it only costs O(n) to do so.

def same_necklace(a, b):
  # Sanity check that the strings match expectations
  if len(a) != len(b):
    return False
  # Already done
  if a==b:
    return True
  # Loop through all possible rotations of first string
  for split_index in range(len(a)):
    if a[split_index:] + a[0:split_index] == b:
      return True
  return False

This does some quick sanity checks, then loops through all possible rotations of the first string and checks if any of them are equivalent to the second string.

I felt dissatisfied by this problem, so I masochistically decided to redo it in bash. I just really like figuring out new bash tricks! (I was also inspired to seek out opportunities for learning & understanding, rather than simply Solving The Problem, by this excellent blog post on systematic debugging (with unexpected twists) from fellow Recurser Allison Kaptur.

Take 2: Bash approach

While bash tools are incredibly powerful when you’re schlepping data around your filesystem, it’s a massive syntactic pain to do some “simple” things like arithmetic or conditionals, making it a great (painful) learning opportunity for an algorithmically simple problem.

You can use the expr command to do arithmetic more easily: $(expr \( $x \) + 1). However, this may not always be the appropriate tool, as expr is a binary rather than a shell built-in, and can be slower. In addition, expr behavior may vary between systems.

Some more vanilla bash-safe ways to do inline addition are let y=x+1 or $((x+1)). (If you’re actually incrementing or decrementing your variable, rather than using an inline expression, you can just use the -- / ++ operators.)

Rough draft (not correct)

Ok, back to the problem. (Note that LEN is ignored and hardcoded to 6 for drafting simplicity.)

#!/bin/bash
function same_necklace() {
  STR=$1
  # (there is an error in the following line)
  LEN=`echo ${STR} | wc -c`; echo $LEN
  for i in {1..6}; do
    # Rotate string by concatenating substr[i, len] and substr[1, i-1]
    ROT="`echo ${STR} | cut -c${i}-6``[[ ${i} > 1 ]] && (echo ${STR} | cut -c1-$((i-1)))`";
    [[ $ROT == $2 ]] && return 1
  done
  return 0
}

Yik. There’s a real syntactic mess up in here, but the gist of it is pretty much exactly the same as the Python script, minus the two convenience checks at the beginning because I truly hate remembering bash conditional syntax.

Explanation of cut syntax

The cut command can be used to cut on fields, delimiters, character indices, etc. In this particular case, a command like echo "abcdef" | cut -c1-3 will give you abc. Note that cut is 1-indexed, and indices are inclusive.

So the ROT=... line is just taking the second substring (at position i-6, with -c${i}-6) and gluing it to the first substring (at position 1-i-1, with -c1-$((i-1)).

Have you checked your whitespace?

This worked great for my 6-character test values "string" and "ringst". But when I went to switch over the hardcoded length of 6 to the generalized $LEN I wrote earlier, I ran into bizarre errors!

cut: 7: No such file or directory

cut: [-cf] list: values may not include zero

I futzed around for a while trying to understand these cryptic messages, before remembering that the output of wc is lousy with whitespace :sob:

> echo "string" | wc -c
       7
> printf "string" | wc -c
       6

Oh NO. Have I been counting the extra newline all my life? Always check your intermediate steps in the shell!

We can strip that trailing whitespace by piping the output to xargs.

echo ${STR} | xargs | wc -c

Word of caution about &&

A warning about bash bitwise logical operators which I discovered while debugging the above.

It’s fine in certain cases to use && as shorthand for an inline conditional. In my particular case, it’s safe because I’m using it like a binary conditional to only emit the second part of the string when the indices are valid.

Unfortunately, if you’re doing a ternary bash conditional, a && b || c will not always behave as expected. This is because we’re not in a nice, controlled programming environment with types or anything so fancy; we’re in the shell.

a && b || c is merely a series of logic operations that may or may not fail if the shell commands between the operands emit non-zero exit statuses. So if you want a safe ternary conditional, use if..then..fi instead!

Final solution

The final, awful bash mess I came up with. It’s the journey that matters.

#!/bin/bash/
function same_necklace() {
  STR=$1
  # Get the length of the input string
  LEN=`printf ${STR} | wc -c | xargs`
  for i in $(seq 1 ${LEN}); do
    # Rotate string by cutting at the index and concating flipped
    ROT="`echo ${STR} | cut -c"${i}-${LEN}"``[[ ${i} > 1 ]] && (echo ${STR} | cut -c1-$((i-1)))`";
    [[ $ROT == $2 ]] && echo "yes" && return 1
  done
  echo "no" && return 0
}

Tangent: the many uses of [tr]

While looking around for string utility documentation, I ran into a neat thing I didn’t know about, that’s totally unrelated to this problem. tr (short for “translate”) can be used to transform standard output with substitution or deletion of selected characters! So you can do all kinds of one-line CLI witchery.

We can quickly transform a string to uppercase:

> echo "HelLo wOrLd" | tr a-z A-Z
HELLO WORLD

or put all words in a string on their own line,

> echo "One fish two fish" | tr -cs 'a-zA-Z0-9' '[\n*]'
One
fish
two
fish

or even one-line a Caesar cipher!

> echo "Caesar Cipher" | tr 'A-Za-z' 'N-ZA-Mn-za-m'
Pnrfne Pvcure

Bookmarking this as future fodder for a Joy of Bash zine!

#371: N-Queens Validator

Given an array of 8 integers between 1 and 8, determine whether it represents a valid 8 queens solution. You may optionally handle solutions for any N, not just N = 8.

N-Queens is a classic problem of placing N queens upon an NxN chessboard such that none of them can attack each other. It’s a good algorithms problem to think about because the brute force solution is uncomfortably exponential, but it can be optimized in a wide variety of ways including backtracking, genetic algorithms, and constraint solving.

(For an arcane witch’s tale featuring N-Queens, see Aphyr’s Typing The Technical Interview, whose Haskell magick I cannot claim to comprehend.)

“Are you really unable,” you ask, voice as calm as stone, “to imagine eight powerful women in the same room without them trying to kill each other?”

“It’s… it’s just how the problem works.”

For the validator, I used set checks to get an O(n) runtime.

#!/usr/bin/env python3
def qcheck(queens):
  """O(n) check if an N-queens solution is valid"""
  # Check if any on the same row
  if len(set(queens)) != len(queens):
    return False
  # The difference between each queen's row and col,
  # 0-indexed, uniquely represents its diagonal of
  # slope +1. Check if any are the same.
  diffs = [
    (col - row - 1) for col, row in enumerate(queens)]
  if len(set(diffs)) != len(diffs):
    return False
  # If we reverse the col indices, we get the
  # diagonals of slope -1.
  diffs = [
    (-col - row - 1) for col, row in enumerate(queens)]
  if len(set(diffs)) != len(diffs):
    return False
  return True

I relied on the constant difference between squares on the same diagonal of slope=+1: (1,1), (3,3), etc. Instead of comparing them all with each other, I just use a set to check for uniqueness; if the set length and list length are different, then the elements aren’t all unique.

You don’t need to wrap around with a modulus here (unless you’re playing toroidal chess!) since queen attacks don’t wrap around the board.

I forgot for a hot second about the slope=-1 diagonals! After staring at (2,0) and (1,1) with horror, I realized I could just reverse the column indices on the board to get a constant difference unique to each diagonal, which I was pretty happy with.

#368: Single Symbol Squares

Given a grid size N, find an NxN layout of X’s and O’s such that no axis-aligned square (2x2 or larger) within the grid has the same symbol at each of its four corners. That is, if four cells of the grid form a square, they must not be either all X’s or all O’s.

I started by writing a validator for a Single Symbol Square, which, for brevity’s sake, I shall henceforth refer to as an SSS.

def sss_checker(mat):
  """Check if an NxM bool matrix is an SSS"""
  N = len(mat)
  M = len(mat[0])
  def check_subsq(i,j,k):
    """check corners of square of size k starting at i,j, where k>1"""
    return len(
      {mat[i][j], mat[i][j+k-1], mat[i+k-1][j], mat[i+k-1][j+k-1]}) == 2
  for i in range(N):
    for j in range(M):
      for k in range(2, min(N-i+1, M-j+1)):
        if not check_subsq(i,j,k):
          return False
  return True

Just out of curiosity, I wrote a bogo (or brute force) solver for comparison, to see how slow it would be. It was only performant to N=3.

def get_all_grids(N):
  """Return iterator of all possible {0,1} matrices of size NxN"""
  return iter(
    np.reshape(np.array(i), (N,N))
    for i in itertools.product([0,1], repeat=N*N)
  )

def find_sss_bogo(N):
  """Slow brute force SSS finder."""
  for mat in get_all_grids(N):
    if sss_checker(mat):
      return mat

(I love itertools! itertools.product is a great way to quickly generate combinations; I use it a lot for generating test cases.)

Now I had a sense of how to approach this a little more performantly. I mutate the 0-row and 0-column – think of them as the 0-side edge of the grid – and recursively solve for subsquares of size N-1 that satisfy the outer matrix.

This was able to achieve a valid N=7 solution in 0.2s, but was way too slow to handle N=8. The actual code is mostly matrix fluff, so here’s some quick pseudocode.

  1. If we’re starting from scratch, init the matrix as an NxN 2D array of zeros.

  2. Base case: If N=2, return a brute force check using all valid (precomputed) 2x2 SSSs as our inner subsquare.

  3. Otherwise, mutate the first row. For each possibility, check if the outer layers are still valid.

  4. (Nested) Mutate the first column. For each possibility, check if the outer layers are still valid.

  5. (Nested) Check if there exists any (N-1)x(N-1) subsquare one layer in such that the entire matrix is a valid SSS. If not, backtrack and continue mutating.

In retrospect, I think I could have potentially made this a lot faster by flipping the search around USACO-style, but I ran out of energy to redo it after I was done.

Last but not least, this blog post

I spent more time writing this blog post as I did actually writing code, but I’m glad I did! It forced me to lock in my understandings about the funky errata I came across, and to explain my reasoning and process. Also, a lot of random junk can break when you update your OS, and it was good to dust off this blog once more.