Longest Substring Without Repeating Characters

The "Longest Substring Without Repeating Characters" problem is a common challenge in computer science, particularly in the field of string processing and manipulation. This problem tests the understanding of hash tables, two-pointer techniques, and string operations. Let's break down the problem and its solution from first principles.

Problem Statement:

  1. Input: A string.
  2. Goal: Find the length of the longest substring without repeating characters.

Fundamental Concepts:

  • String: A sequence of characters.
  • Substring: A contiguous sequence of characters within a string.
  • Non-Repeating Characters: Characters that do not repeat within a given substring.

Solution Approach:

  1. Sliding Window Technique:

    • Idea: Maintain a moving window that expands or contracts, ensuring no character repeats within it.
    • Implementation: Use two pointers (indices), one for the start and one for the end of the window, and a hash table to store characters and their indices.
    • Process: Move the end pointer to expand the window and include new characters until a repeat character is found. When a repeat is found, move the start pointer to exclude characters until no repeat remains in the window.
    • Tracking: Use the hash table to track the index of each character. If a character is repeated, the hash table will have its previous index.
  2. Time and Space Complexity:

    • Time Complexity: O(n), where n is the length of the string. Each character is visited at most twice (once by the end pointer and once by the start pointer).
    • Space Complexity: O(min(m, n)), where m is the size of the character set (alphabet) and n is the size of the string. The space is used by the hash table, and in the worst case, it will store all unique characters of the string.

Example Implementation in Python:

def length_of_longest_substring(s):
    """
    Finds the length of the longest substring without repeating characters.

    Args:
    s (str): The input string.

    Returns:
    int: The length of the longest substring without repeating characters.
    """
    char_index_map = {}  # Hash table to store the index of characters.
    start = maxLength = 0  # Initialize start pointer and maxLength.

    for end in range(len(s)):
        if s[end] in char_index_map:
            # Update start pointer if repeating character is found.
            start = max(start, char_index_map[s[end]] + 1)
        # Update the index of the current character.
        char_index_map[s[end]] = end
        # Update maxLength with the length of the current non-repeating substring.
        maxLength = max(maxLength, end - start + 1)

    return maxLength

This implementation:

  • Efficiently tracks the longest non-repeating substring using a sliding window.
  • Updates the start of the window when a repeating character is found.
  • Ensures that the start pointer never moves backward, using max(start, char_index_map[s[end]] + 1).
  • Continuously updates the maximum length of the non-repeating substring found so far.

Next