How to Explain Encode and Decode Strings in a Coding Interview (Step-by-Step)
Encode and Decode Strings looks simple on the surface — but the entire challenge is in the edge cases. A naive delimiter breaks the moment a string contains that delimiter. The key insight is length-prefix encoding, a real technique used in network protocols.
In this post, you’ll learn exactly how to walk an interviewer through this problem: why naive approaches fail, how length-prefix encoding works, and how to implement both encode and decode cleanly.
Problem Statement
Design an algorithm to encode a list of strings to a single string. The encoded string is then sent over the network and is decoded back to the original list of strings.
Implement encode and decode methods such that decode(encode(strs)) == strs for any valid input.
Examples
Encoded: “4#neet4#code4#love3#you”
Output: [“neet”, “code”, “love”, “you”]
Encoded: “2#we3#say1#:3#yes”
Output: [“we”, “say”, “:”, “yes”]
Constraints
0 <= strs.length <= 2000 <= strs[i].length <= 200strs[i]contains any possible character — including#, spaces, and special chars
What Is the Problem?
“I need to serialize a list of strings into a single string in a way that is completely reversible — no matter what characters the original strings contain. The encoded format must allow me to recover the exact original list.”
The critical phrase is “any possible character” — that’s what makes this problem interesting and eliminates most naive approaches.
Step 1: Explain Why Naive Approaches Fail
Walk the interviewer through the natural first instincts — and why they break:
e.g. "neet|code|love"
Breaks if any string contains "|". Since any character is allowed, no delimiter is safe.
e.g. escape "|" as "\|"
Works in theory but gets complicated fast — what if the string contains the escape character itself? Fragile and error-prone.
Any character-based delimiter can appear inside a string. We need a scheme where the decoder can always tell where one string ends and the next begins — without relying on a reserved character.
The Key Insight — Length-Prefix Encoding
Instead of marking where strings end, tell the decoder how long each string is upfront. Prefix every string with its length and a separator: 4#neet. The decoder reads the length first, then reads exactly that many characters — no ambiguity possible.
“The format is[length]#[string]for each word. The#acts as a delimiter between the length number and the string content — but since we always read exactlylengthcharacters after#, it doesn’t matter if the string itself contains#.”
This is the same principle used in real-world protocols like HTTP chunked transfer encoding and length-delimited protobuf fields.
Step 2: Explain Encode and Decode in Plain English
Encode
“For each string, I prepend its length followed by #, then concatenate everything into one long string.”
Decode
“I maintain a pointeristarting at 0. I find the next#character, read the number before it as the length, then slice exactlylengthcharacters after the#. That’s one word. I advanceiand repeat until the string is exhausted.”
Step 3: Talk Through the Code
class Codec:
def encode(self, strs: List[str]) -> str:
result = ""
for s in strs:
result += str(len(s)) + "#" + s # Prefix each string with "length#"
return result
def decode(self, s: str) -> List[str]:
result = []
i = 0
while i < len(s):
j = i
while s[j] != "#": # Scan forward to find the '#' separator
j += 1
length = int(s[i:j]) # Everything before '#' is the length
word = s[j + 1: j + 1 + length] # Read exactly 'length' chars after '#'
result.append(word)
i = j + 1 + length # Advance pointer past this word
return result
Point out that we never split on # globally — we only use it to find the end of the length prefix. The actual word is read by position, not by delimiter.
Visual Walkthrough
Encoding ["neet", "code", ":", "yes"]:
Decoding steps for "4#neet4#code1#:3#yes":
i=0 → scan to # at index 1 → length = 4 → read s[2:6] = "neet" → advance i=6i=6 → scan to # at index 7 → length = 4 → read s[8:12] = "code" → advance i=12i=12 → scan to # at index 13 → length = 1 → read s[14:15] = ":" → advance i=15i=15 → scan to # at index 16 → length = 3 → read s[17:20] = "yes" → advance i=20Step 4: Highlight Edge Cases
#"["a#b"]encodes to"3#a#b". During decode, we find the first#at index 1, read length3, then read exactly 3 characters:a#b. The#inside the string is consumed as data, not as a delimiter."
"An empty string encodes as"0#". During decode we read length0, then read 0 characters — appending an empty string to the result. Correct."
"Ifstrs = [], encode returns an empty string"". The decode while loop conditioni < len(s)is immediately false — we return an empty list. Correct."
"The length prefix can be multiple digits — e.g."12#hello world...". The inner while loop scans until it finds#, so it correctly reads multi-digit lengths like12,100, or200."
Step 5: State Time and Space Complexity
Let n = total number of characters across all strings.
| Operation | Time | Space |
|---|---|---|
| Encode | O(n) | O(n) |
| Decode | O(n) | O(n) |
"Both operations are O(n) in time and space where n is the total number of characters. Each character is visited exactly once during encoding and exactly once during decoding."
Interview Checklist for Encode and Decode Strings
- Restate — "serialize a list of strings into one string, fully reversible"
- Explain why delimiter-based approaches fail — any char can appear in a string
- State the key insight — length-prefix encoding removes delimiter ambiguity
- Explain encode: prepend
length#to each string - Explain decode: find
#, read length, slice exactly that many chars - Walk through the
#-inside-string edge case explicitly - Handle empty string, empty list, and multi-digit length edge cases
- State O(n) time and O(n) space for both operations
- Mention real-world relevance — HTTP chunked encoding, protobuf
- Ask before coding, don't just dive in
Related Problems to Practice
If you can solve Encode and Decode Strings cleanly, try these next:
