Understanding C#’s String Intern Pool

C# is generally considered a safe language, but its string handling reveals interesting aspects about memory management and the nature of immutability guarantees in language design.

Consider this code:

using System;
class Program
{
    static void Main()
    {
        string original = "Hello, World!";
        unsafe
        {
            fixed (char* p = original)
            {
                p[0] = 'J';
                p[1] = 'e';
                p[2] = 'l';
                p[3] = 'l';
                p[4] = 'y';
            }
        }
        Console.WriteLine($"Hello, World!");
    }
}

This outputs Jelly, World! – not Hello, World! as you might expect.

Let that sink in. We modified one string, and a completely different string literal—one we never touched—changed.

This isn’t a compiler bug. It’s not even an oversight. It’s a deliberate architectural choice hidden behind layers of abstraction that you’ve been told not to question.

How String Internment Works

The .NET runtime implements a concept called the “string intern pool.” While we’re taught that strings are immutable, the runtime actually recycles string instances behind the scenes.

This pool is primarily a memory optimization, but it also represents an interesting design choice about identity. The runtime treats two identical string literals not just as equal values, but as the same object occupying a single memory address.

When you write:

string a = "Hello";
string b = "Hello";

You’re not creating two strings. You’re creating two references to a single dictionary entry in the intern pool. The CLR has decided that string identity transcends your local variables.

The Role of Unsafe Code

The unsafe keyword in C# serves as a bridge between the managed environment and lower-level memory access. When you use it, you’re essentially stepping outside the normal guardrails of the language.

Using a character pointer with fixed (char* p = original) provides direct memory access to what is normally a protected structure in C#.

What’s particularly interesting is what this reveals: strings that appear as separate entities in your code are actually shared objects in memory. This optimization, while efficient, creates a connection between seemingly independent parts of your code.

The Hidden Architecture of Meaning

This behavior reveals something deeper about programming languages and their designers. Every language makes philosophical choices, disguised as technical ones:

  • C# chose efficiency over true isolation. String literals share storage to save memory, creating hidden connections between supposedly separate parts of your program.

  • The intern pool is hidden. You’re not supposed to know it exists or interact with it directly. It’s a behind-the-scenes optimization that becomes a front-and-center vulnerability when combined with unsafe code.

  • The CLR assumes you’ll play by the rules. The entire premise of string immutability depends on the assumption that no one will use unsafe code to modify strings. It’s security through obscurity.

What Are You Actually Building?

After 18 years of C#, I’ve realized something profound: we’re not actually building programs that manipulate data according to rules. We’re building worlds with rules we selectively break when their limitations become inconvenient.

The string intern pool isn’t just an implementation detail—it’s a manifestation of this tension. It’s the language designers saying: “Strings are immutable and unique… except when that’s too expensive.”

And unsafe code isn’t just a performance tool—it’s an acknowledgment that sometimes, you need to step outside the comfortable abstractions to do what needs to be done.

Implications for Developers

Understanding these mechanisms provides valuable insights. Even fundamental guarantees in a language have edge cases and exceptions that are worth exploring.

This suggests that abstractions in programming languages should be understood as useful models rather than absolute truths. Being aware of how these abstractions are implemented helps us make better design decisions.

The string intern pool example demonstrates a classic principle in computer science: all abstractions leak to some degree. Understanding these implementation details can be crucial when debugging unexpected behavior or optimizing performance.

When writing C# code with string literals, it’s worth remembering that identical strings may share the same memory location. In most cases, this optimization is beneficial, but in certain edge cases—particularly with unsafe code—it may lead to surprising behavior.

This exploration of C#’s string internment mechanism helps us better understand the tradeoffs language designers make between safety, performance, and simplicity.