How to Explain Encode and Decode Strings in a Coding Interview (Step-by-Step)

How to Explain Encode and Decode Strings in a Coding Interview (Step-by-Step)

Encode and Decode Strings looks simple on the surface — but the entire challenge is in the edge cases. A naive delimiter breaks the moment a string contains that delimiter. The key insight is length-prefix encoding, a real technique used in network protocols.

In this post, you’ll learn exactly how to walk an interviewer through this problem: why naive approaches fail, how length-prefix encoding works, and how to implement both encode and decode cleanly.


Problem Statement

LeetCode 271 Medium ⭐ Premium

Design an algorithm to encode a list of strings to a single string. The encoded string is then sent over the network and is decoded back to the original list of strings.

Implement encode and decode methods such that decode(encode(strs)) == strs for any valid input.

Examples

Example 1
Input: [“neet”, “code”, “love”, “you”]
Encoded: “4#neet4#code4#love3#you”
Output: [“neet”, “code”, “love”, “you”]
Example 2
Input: [“we”, “say”, “:”, “yes”]
Encoded: “2#we3#say1#:3#yes”
Output: [“we”, “say”, “:”, “yes”]

Constraints

  • 0 <= strs.length <= 200
  • 0 <= strs[i].length <= 200
  • strs[i] contains any possible character — including #, spaces, and special chars

What Is the Problem?

“I need to serialize a list of strings into a single string in a way that is completely reversible — no matter what characters the original strings contain. The encoded format must allow me to recover the exact original list.”

The critical phrase is “any possible character” — that’s what makes this problem interesting and eliminates most naive approaches.


Step 1: Explain Why Naive Approaches Fail

Walk the interviewer through the natural first instincts — and why they break:

❌ Join with delimiter

e.g. "neet|code|love"

Breaks if any string contains "|". Since any character is allowed, no delimiter is safe.

❌ Escape the delimiter

e.g. escape "|" as "\|"

Works in theory but gets complicated fast — what if the string contains the escape character itself? Fragile and error-prone.

⚠️ The Core Problem

Any character-based delimiter can appear inside a string. We need a scheme where the decoder can always tell where one string ends and the next begins — without relying on a reserved character.


The Key Insight — Length-Prefix Encoding

💡 Core Insight

Instead of marking where strings end, tell the decoder how long each string is upfront. Prefix every string with its length and a separator: 4#neet. The decoder reads the length first, then reads exactly that many characters — no ambiguity possible.

“The format is [length]#[string] for each word. The # acts as a delimiter between the length number and the string content — but since we always read exactly length characters after #, it doesn’t matter if the string itself contains #.”

This is the same principle used in real-world protocols like HTTP chunked transfer encoding and length-delimited protobuf fields.


Step 2: Explain Encode and Decode in Plain English

Encode

“For each string, I prepend its length followed by #, then concatenate everything into one long string.”

Decode

“I maintain a pointer i starting at 0. I find the next # character, read the number before it as the length, then slice exactly length characters after the #. That’s one word. I advance i and repeat until the string is exhausted.”

Step 3: Talk Through the Code

class Codec:

    def encode(self, strs: List[str]) -> str:
        result = ""
        for s in strs:
            result += str(len(s)) + "#" + s  # Prefix each string with "length#"
        return result

    def decode(self, s: str) -> List[str]:
        result = []
        i = 0

        while i < len(s):
            j = i
            while s[j] != "#":   # Scan forward to find the '#' separator
                j += 1

            length = int(s[i:j])         # Everything before '#' is the length
            word = s[j + 1: j + 1 + length]  # Read exactly 'length' chars after '#'
            result.append(word)
            i = j + 1 + length           # Advance pointer past this word

        return result

Point out that we never split on # globally — we only use it to find the end of the length prefix. The actual word is read by position, not by delimiter.


Visual Walkthrough

Encoding ["neet", "code", ":", "yes"]:

Encoding: each word → length# + word
Input
neet code : yes
Encoded
4#neet 4#code 1#: 3#yes
Result
"4#neet4#code1#:3#yes"

Decoding steps for "4#neet4#code1#:3#yes":

1
i=0 → scan to # at index 1 → length = 4 → read s[2:6] = "neet" → advance i=6
2
i=6 → scan to # at index 7 → length = 4 → read s[8:12] = "code" → advance i=12
3
i=12 → scan to # at index 13 → length = 1 → read s[14:15] = ":" → advance i=15
4
i=15 → scan to # at index 16 → length = 3 → read s[17:20] = "yes" → advance i=20

Step 4: Highlight Edge Cases

1 String contains #
"["a#b"] encodes to "3#a#b". During decode, we find the first # at index 1, read length 3, then read exactly 3 characters: a#b. The # inside the string is consumed as data, not as a delimiter."
2 Empty string in list
"An empty string encodes as "0#". During decode we read length 0, then read 0 characters — appending an empty string to the result. Correct."
3 Empty list
"If strs = [], encode returns an empty string "". The decode while loop condition i < len(s) is immediately false — we return an empty list. Correct."
4 String length greater than 9
"The length prefix can be multiple digits — e.g. "12#hello world...". The inner while loop scans until it finds #, so it correctly reads multi-digit lengths like 12, 100, or 200."

Step 5: State Time and Space Complexity

Let n = total number of characters across all strings.

OperationTimeSpace
Encode O(n) O(n)
Decode O(n) O(n)
"Both operations are O(n) in time and space where n is the total number of characters. Each character is visited exactly once during encoding and exactly once during decoding."

Interview Checklist for Encode and Decode Strings

  • Restate — "serialize a list of strings into one string, fully reversible"
  • Explain why delimiter-based approaches fail — any char can appear in a string
  • State the key insight — length-prefix encoding removes delimiter ambiguity
  • Explain encode: prepend length# to each string
  • Explain decode: find #, read length, slice exactly that many chars
  • Walk through the #-inside-string edge case explicitly
  • Handle empty string, empty list, and multi-digit length edge cases
  • State O(n) time and O(n) space for both operations
  • Mention real-world relevance — HTTP chunked encoding, protobuf
  • Ask before coding, don't just dive in

If you can solve Encode and Decode Strings cleanly, try these next:

💡 Practicing your explanations out loud is just as important as writing correct code. The best candidates aren't the ones who solve it fastest — they're the ones who make the interviewer feel like they're thinking alongside them.
Scroll to Top