CWE-120: Buffer Copy Without Checking Size of Input (“Classic Buffer Overflow”)

Few vulnerabilities are as historically significant, or remain as dangerous, as the classic buffer overflow. Despite decades of awareness, CWE-120 remains relevant because legacy code, memory-unsafe languages, and performance-driven low-level software continue to rely on manual memory management and unchecked copy operations.

CWE-120 occurs when software copies input into a fixed-size buffer without verifying that the input fits within the destination.

In practical terms:

The application copies more data than the buffer can hold, overwriting adjacent memory.

This article breaks down how classic buffer overflows work, why developers still introduce them, modern exploitation techniques, compiler, runtime, and language mitigations, and secure coding patterns.

What Is a Classic Buffer Overflow?

A buffer overflow occurs when a program writes more bytes into a buffer than it was allocated to contain.

Unsafe example:

char buf[16];
strcpy(buf, userInput);

If userInput holds 16 or more characters, the copy writes beyond buf: strcpy also copies the terminating null byte, so only 15 characters fit safely.

That overflow may corrupt:

  • Adjacent stack/heap variables
  • Saved frame pointers
  • Return addresses
  • Function pointers
  • Security-critical flags

How Buffer Overflows Actually Work

The root issue is unchecked copy operations into fixed-length memory regions.

Attack Flow

  1. Buffer allocated with fixed size
  2. Input larger than buffer received
  3. Copy operation ignores destination bounds
  4. Excess bytes overwrite adjacent memory
  5. Crash, corruption, or code execution occurs
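
The flow above breaks at step 3 if the copy checks the destination bound first. The following is a minimal sketch, not a definitive implementation; the names copy_input and BUF_SIZE are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 16  /* step 1: fixed-size destination (illustrative size) */

/* Copies input into dest only if it fits, including the terminating
 * null byte. Returns 0 on success, -1 if the input is too long. */
int copy_input(char dest[BUF_SIZE], const char *input)
{
    size_t len = strlen(input);    /* step 2: input of arbitrary length */

    if (len >= BUF_SIZE) {         /* step 3: honor the destination bound */
        return -1;                 /* steps 4 and 5 never happen */
    }
    memcpy(dest, input, len + 1);  /* len + 1 includes the '\0' */
    return 0;
}
```

Rejecting oversized input, rather than truncating it, is a design choice here; truncation can be acceptable for display strings but is risky for paths, keys, or identifiers.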

Visual: Buffer Overflow Data Flow

[Figure: the Allocated Buffer followed by the Overflow Region; an Unchecked Copy Exceeds Buffer Size and spills from the first into the second.]

Why Developers Still Get Buffer Overflows Wrong

Legacy APIs Persist

Dangerous functions remain in use:

  • strcpy
  • strcat
  • sprintf
  • gets
  • scanf("%s")
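
Each of these functions has a bounded counterpart in the standard C library. The sketch below shows one possible replacement for gets, scanf("%s"), and sprintf/strcat; the helper names (read_line, parse_word, join_path) and the 16-byte word buffer are assumptions chosen for illustration.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Bounded alternative to gets(): fgets reads at most cap - 1 characters.
 * Strips a trailing newline. Returns 0 on success, -1 on EOF or error. */
int read_line(FILE *stream, char *buf, size_t cap)
{
    if (cap == 0 || cap > INT_MAX || fgets(buf, (int)cap, stream) == NULL) {
        return -1;
    }
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}

/* Bounded alternative to scanf("%s"): the field width 15 leaves room
 * for the '\0' in a 16-byte buffer. Returns 1 if a word was read. */
int parse_word(const char *input, char word[16])
{
    return sscanf(input, "%15s", word) == 1;
}

/* Bounded alternative to sprintf()/strcat(): snprintf never writes more
 * than cap bytes. Returns 0 if the whole string fit, -1 otherwise. */
int join_path(char *out, size_t cap, const char *dir, const char *file)
{
    int n = snprintf(out, cap, "%s/%s", dir, file);
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}
```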

Misunderstood “Safe” Replacements

Functions like strncpy() can still be misused:

  • May omit null termination
  • Frequently paired with incorrect lengths
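
The missing-terminator pitfall can be shown in a few lines. This is a small sketch with hypothetical helper names: the first function reports whether strncpy left the destination terminated, and the second shows one common fix.

```c
#include <stddef.h>
#include <string.h>

/* Pitfall: when strlen(src) >= cap, strncpy fills all cap bytes and
 * writes no terminating '\0'. Returns 1 if dest ended up terminated. */
int strncpy_is_terminated(char *dest, size_t cap, const char *src)
{
    strncpy(dest, src, cap);
    return memchr(dest, '\0', cap) != NULL;
}

/* One common fix: copy at most cap - 1 bytes, then terminate explicitly.
 * This truncates silently; callers that must not lose data should
 * compare strlen(src) against cap first. */
void copy_truncating(char *dest, size_t cap, const char *src)
{
    if (cap == 0) {
        return;
    }
    strncpy(dest, src, cap - 1);
    dest[cap - 1] = '\0';
}
```

With a 4-byte destination, copying "ab" leaves a terminated string, while copying "abcd" leaves four characters and no terminator, so any later strlen or printf on that buffer reads past its end.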

Trusting Input Length Constraints

Developers assume:

  • UI fields limit length
  • Protocols enforce size
  • Upstream validation is sufficient

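Attackers bypass these assumptions by sending crafted input directly to the code that performs the copy, skipping the UI, the well-behaved client, or the upstream check.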
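
In practice this means re-checking lengths at the point of the copy. The sketch below assumes a made-up wire format (one length byte followed by the payload) and a hypothetical parse_message function; it validates the sender's declared length against both the bytes actually received and the destination capacity.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wire format: pkt[0] is the declared payload length,
 * followed by that many payload bytes. Returns 0 on success, -1 if the
 * packet is malformed or the payload does not fit in out. */
int parse_message(const uint8_t *pkt, size_t pkt_len,
                  char *out, size_t out_cap)
{
    if (pkt_len < 1 || out_cap == 0) {
        return -1;
    }
    size_t declared = pkt[0];

    if (declared > pkt_len - 1) {  /* sender claims more than it sent */
        return -1;
    }
    if (declared >= out_cap) {     /* no room for payload plus '\0' */
        return -1;
    }
    memcpy(out, pkt + 1, declared);
    out[declared] = '\0';
    return 0;
}
```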

Performance / Low-Level Constraints

Systems software often prioritizes speed and manual memory control, so bounds checks get skipped or are left to the caller.

Modern Exploitation Techniques

Stack Return Address Overwrite

Classic overwrite of saved instruction pointer.

ROP (Return-Oriented Programming)

Chain existing code snippets (“gadgets”) to bypass NX/DEP.

Heap Overflow Chaining

Overflow adjacent heap objects/metadata.

Partial Overwrites

Modify only key bytes of pointers/flags.

Data-Only Exploitation

Corrupt program state without hijacking control flow.
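
A data-only corruption can be illustrated without any control-flow hijack. The sketch below is a simulation under stated assumptions: the struct layout and names are hypothetical, and the write goes through a pointer to the whole struct with a cap at its size so the demo stays well defined. A real overflow has no such cap.

```c
#include <stddef.h>
#include <string.h>

struct session {
    char name[8];
    int  is_admin;  /* security-critical flag stored right after name */
};

/* Simulates an unchecked copy "into name". The length is never compared
 * with sizeof s->name, so input longer than 8 bytes spills into the
 * bytes that follow, including is_admin. */
void unchecked_copy_sim(struct session *s,
                        const unsigned char *input, size_t len)
{
    if (len > sizeof *s) {  /* cap only to keep the simulation defined */
        len = sizeof *s;
    }
    memcpy(s, input, len);
}
```

Feeding this function sizeof(struct session) bytes of 'A' turns is_admin from 0 into a nonzero value, while no return address or function pointer is touched.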

Visual: Buffer Overflow Exploitation Chain

Buffer Overflow → Memory Corruption → Control/Data Overwrite → ROP → RCE

How CWE-120 Differs from CWE-787

Developers often confuse these two.

CWE-120

Specifically:

Copying input into buffer without checking destination size.

Focused on unchecked copy operations.

CWE-787

Broader category:

Any write beyond buffer bounds.

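CWE-120 is effectively a specialized case of CWE-787: an unchecked copy is one specific way to produce an out-of-bounds write. In the formal CWE hierarchy, both entries sit under CWE-119 (Improper Restriction of Operations within the Bounds of a Memory Buffer).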

Secure Coding Examples

Unsafe

strcpy(dest, src);

Safer

snprintf(dest, sizeof(dest), "%s", src);  /* dest must be an array here, not a pointer; long input is truncated */

Better (C++)

std::string dest = src;

Use safe abstractions where possible.
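
For code that has to stay in C, the snprintf pattern is easier to get right when the capacity is passed explicitly and the return value is checked. The following sketch uses a hypothetical copy_string helper; it is one way to surface truncation, not the only one.

```c
#include <stddef.h>
#include <stdio.h>

/* dest_cap must be the real capacity of dest. sizeof(dest) is only
 * correct when dest is an array in the current scope; applied to a
 * pointer parameter it yields the pointer size instead.
 * Returns 0 if src fit, 1 if it was truncated, -1 on an output error. */
int copy_string(char *dest, size_t dest_cap, const char *src)
{
    int n = snprintf(dest, dest_cap, "%s", src);

    if (n < 0) {
        return -1;
    }
    /* snprintf returns the length it wanted to write, so a value of
     * dest_cap or more means the output was cut short. */
    return ((size_t)n >= dest_cap) ? 1 : 0;
}
```

Callers can then decide whether truncated output is acceptable or should be treated as an error.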

Framework / Language Mitigations

Compiler and OS Protections

Use:

  • Stack canaries
  • ASLR
  • DEP/NX
  • CFI

These make exploitation harder; they do not fix the root cause.

Runtime Sanitizers

Use in CI/testing:

  • AddressSanitizer
  • UBSan
  • MemorySanitizer

Memory-Safe Languages

Prefer when feasible:

  • Rust
  • Go
  • Java
  • C#

Defense in Depth

Ban Unsafe APIs

Enforce secure coding standards prohibiting dangerous functions.

Validate Length Before Copy

Every copy operation should verify destination capacity.
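
One way to make that check hard to forget is to keep the capacity next to the data. The sketch below uses a hypothetical bounded_buf type with an illustrative 32-byte capacity; every append is compared against the space that remains.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical buffer type that tracks how much of its storage is
 * used, so every write can be checked against the remaining space. */
struct bounded_buf {
    char   data[32];
    size_t used;  /* invariant: used <= sizeof data */
};

/* Appends len bytes only if they fit.
 * Returns 0 on success, -1 (buffer unchanged) otherwise. */
int bounded_append(struct bounded_buf *b, const void *src, size_t len)
{
    /* Compare against the remaining space instead of testing
     * used + len, which could wrap around for a huge len. */
    if (len > sizeof b->data - b->used) {
        return -1;
    }
    memcpy(b->data + b->used, src, len);
    b->used += len;
    return 0;
}
```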

Fuzz Boundary Conditions

Fuzzing excels at detecting overflow conditions.
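
As a rough illustration, a libFuzzer harness for a copy routine can be very small. In this sketch set_name is a hypothetical function under test; LLVMFuzzerTestOneInput is the entry point libFuzzer calls with each generated input.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical function under test: stores a name in a 16-byte buffer. */
static int set_name(char dest[16], const char *name)
{
    size_t len = strlen(name);
    if (len >= 16) {
        return -1;
    }
    memcpy(dest, name, len + 1);
    return 0;
}

/* libFuzzer entry point. One possible build command:
 *   clang -g -fsanitize=fuzzer,address harness.c */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    char input[256];
    char dest[16];

    if (size >= sizeof input) {
        size = sizeof input - 1;
    }
    if (size > 0) {
        memcpy(input, data, size);
    }
    input[size] = '\0';  /* fuzz data is not null-terminated */

    set_name(dest, input);
    return 0;
}
```

Building with AddressSanitizer matters here: without it, a small overflow inside the function under test may corrupt memory silently instead of stopping the fuzzer with a report.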

Isolate High-Risk Parsers

Sandbox legacy/native parsing code.

Final Thoughts

Classic buffer overflows remain relevant because memory corruption remains relevant.

They persist because:

  • Unsafe languages remain widespread
  • Legacy code survives indefinitely
  • “Safer” APIs are still misused
  • Exploit mitigations create complacency

The core lesson is simple:

If you copy data without proving it fits, attackers may decide what memory gets overwritten next.