CWE-74: Improper Neutralization of Special Elements in Output Used by a Downstream Component ('Injection')

Description

The product constructs all or part of a command, data structure, or record using externally-influenced input from an upstream component, but it does not neutralize or incorrectly neutralizes special elements that could modify how it is parsed or interpreted when it is sent to a downstream component.

Extended Description

Software or other automated logic makes certain assumptions about what constitutes data and what constitutes control. Injection problems arise when these assumptions are not verified for user-controlled input. They encompass a wide variety of issues, all mitigated in very different ways, and attacks usually attempt to alter the control flow of the process. For this reason, the most effective way to discuss these weaknesses is to note the distinct features that classify them as injection weaknesses. The most important point is that all injection problems share one thing in common: they allow control-plane data to be injected into the user-controlled data plane. This means that the execution of the process may be altered by sending code in through legitimate data channels, using no other mechanism. While buffer overflows and many other flaws require some further issue to gain execution, injection problems need only for the data to be parsed.
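
The split between data and control can be made visible with a small Python sketch. It builds a command string by concatenation and then tokenizes it with the standard shlex module, which only approximates how a POSIX shell would read the text. The helper names build_command and shell_tokens are hypothetical and chosen for illustration.

```python
import shlex


def build_command(user_name):
    # Deliberately unsafe: the input is concatenated into the command text.
    return "ls -l /home/" + user_name


def shell_tokens(command):
    # punctuation_chars=True keeps shell operators such as ";" and "|"
    # as separate tokens, which approximates how a shell splits the line.
    return list(shlex.shlex(command, punctuation_chars=True))


benign = shell_tokens(build_command("alice"))
hostile = shell_tokens(build_command("alice; id"))

# The benign input stays inside a single path token. The hostile input
# contributes an operator token and a second command word.
print(benign)
print(hostile)
```

In the second case the input has supplied control-plane syntax, not just data, and nothing beyond parsing was needed for that to happen.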

ThreatScore

Threat Mapped score: 0.0

Industry: Financial

Threat priority: Unclassified

Observed Examples (CVEs)

CVE: CVE-2024-5184

API service using a large generative AI model allows direct prompt injection to leak hard-coded system prompts or execute other prompts.
CVE: CVE-2022-36069

Python-based dependency management tool avoids OS command injection when generating Git commands but allows injection of optional arguments with input beginning with a dash (CWE-88), potentially allowing for code execution.
CVE: CVE-1999-0067

Canonical example of OS command injection. CGI program does not neutralize "|" metacharacter when invoking a phonebook program.
CVE: CVE-2022-1509

Injection of sed script syntax ("sed injection").
CVE: CVE-2020-9054 — KEV

Chain: improper input validation (CWE-20) in username parameter, leading to OS command injection (CWE-78), as exploited in the wild per CISA KEV.
CVE: CVE-2021-44228 — KEV

Product does not neutralize ${xyz} style expressions, allowing remote code execution (Log4Shell vulnerability).

Related Attack Patterns (CAPEC)

CAPEC-10
CAPEC-101
CAPEC-105
CAPEC-108
CAPEC-120
CAPEC-13
CAPEC-135
CAPEC-14
CAPEC-24
CAPEC-250
CAPEC-267
CAPEC-273
CAPEC-28
CAPEC-3
CAPEC-34
CAPEC-42
CAPEC-43
CAPEC-45
CAPEC-46
CAPEC-47
CAPEC-51
CAPEC-52
CAPEC-53
CAPEC-6
CAPEC-64
CAPEC-67
CAPEC-7
CAPEC-71
CAPEC-72
CAPEC-76
CAPEC-78
CAPEC-79
CAPEC-8
CAPEC-80
CAPEC-83
CAPEC-84
CAPEC-9

Attack TTPs

T1574.007 — Path Interception by PATH Environment Variable (persistence, privilege-escalation, defense-evasion)
T1574.006 — Dynamic Linker Hijacking (persistence, privilege-escalation, defense-evasion)
T1562.003 — Impair Command History Logging (defense-evasion)
T1027 — Obfuscated Files or Information (defense-evasion)

Modes of Introduction

Phase	Note
Implementation	REALIZATION: This weakness is caused during implementation of an architectural security tactic.

Common Consequences

Impact: Read Application Data — Notes: Many injection attacks involve the disclosure of important information -- in terms of both data sensitivity and usefulness in further exploitation.
Impact: Bypass Protection Mechanism — Notes: In some cases, injectable code controls authentication; this may lead to a remote vulnerability.
Impact: Alter Execution Logic — Notes: Injection attacks are characterized by the ability to significantly change the flow of a given process, and in some cases, to the execution of arbitrary code.
Impact: Other — Notes: Data injection attacks lead to loss of data integrity in nearly all cases as the control-plane data injected is always incidental to data recall or writing.
Impact: Hide Activities — Notes: Often the actions performed by injected control code are unlogged.

Potential Mitigations

Requirements: Choose programming languages and supporting technologies that are not subject to these issues. (N/A)
Implementation: Utilize an appropriate mix of allowlist and denylist parsing to filter control-plane syntax from all input. (N/A)

Applicable Platforms

None (Not Language-Specific, Undetermined)

Demonstrative Examples

Intro: This example code intends to take the name of a user and list the contents of that user's home directory. It is subject to the first variant of OS command injection.

    $userName = $_POST["user"];
    $command = 'ls -l /home/' . $userName;
    system($command);

Body: The $userName variable is not checked for malicious input. An attacker could set $userName to a value that contains an arbitrary OS command, for example a command separator such as ";" followed by another command, which the shell invoked by system() would then run after the ls command.
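
One way to keep the user name in the data plane is to skip the shell entirely and pass arguments as a list. The following Python sketch illustrates that idea. It is not a fix taken from the original example: the username policy in USERNAME_RE and the helper names listing_argv and list_home are assumptions made for illustration.

```python
import re
import subprocess

# Assumed username policy (hypothetical): lowercase letters, digits, "_" and "-",
# starting with a letter or "_", at most 32 characters.
USERNAME_RE = re.compile(r"[a-z_][a-z0-9_-]{0,31}")


def listing_argv(user_name):
    # Allowlist check: anything outside the expected shape is rejected.
    if not USERNAME_RE.fullmatch(user_name):
        raise ValueError("user name is not well-formed")
    # Each list element reaches ls as exactly one argument.
    return ["ls", "-l", "/home/" + user_name]


def list_home(user_name):
    # No shell is involved, so ";" or "|" in the input has no special meaning.
    completed = subprocess.run(
        listing_argv(user_name), capture_output=True, text=True, check=False
    )
    return completed.stdout
```

The allowlist and the argument list are independent layers: even if the pattern were too permissive, the input would still arrive at ls as a single argument and not as shell syntax.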

Intro: The following code segment reads the name of the author of a weblog entry, author, from an HTTP request and sets it in a cookie header of an HTTP response.

    String author = request.getParameter(AUTHOR_PARAM);
    ...
    Cookie cookie = new Cookie("author", author);
    cookie.setMaxAge(cookieExpiration);
    response.addCookie(cookie);

Body: Assuming a string consisting of standard alphanumeric characters, such as "Jane Smith", is submitted in the request, the HTTP response including this cookie might contain a single Set-Cookie header that carries the author value.
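
A value placed in a header can alter the response if it contains characters the header syntax treats as special, most notably carriage return and line feed. The Python sketch below shows one possible neutralization: percent-encoding the value before it is placed in the header. The function cookie_header is hypothetical and assumes the cookie name is a trusted constant. A real application would normally rely on its framework's cookie API instead of building headers by hand.

```python
from urllib.parse import quote


def cookie_header(name, value):
    # Percent-encode everything except unreserved characters, so CR, LF,
    # ";" and "=" in the value cannot terminate or extend the header.
    encoded = quote(value, safe="")
    return "Set-Cookie: {}={}".format(name, encoded)


plain = cookie_header("author", "Jane Smith")
hostile = cookie_header("author", "Jane\r\nSet-Cookie: admin=1")
print(plain)
print(hostile)
```

The consumer of the cookie has to decode the value again (for example with urllib.parse.unquote), so this encoding is a convention both sides must share.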

Intro: Consider the following program. It intends to perform an "ls -l" on an input filename. The validate_name() subroutine performs validation on the input to make sure that only alphanumeric and "-" characters are allowed, which avoids path traversal (CWE-22) and OS command injection (CWE-78) weaknesses. Only filenames like "abc" or "d-e-f" are intended to be allowed.

    my $arg = GetArgument("filename");
    do_listing($arg);

    sub do_listing {
        my($fname) = @_;
        if (! validate_name($fname)) {
            print "Error: name is not well-formed!\n";
            return;
        }
        # build command
        my $cmd = "/bin/ls -l $fname";
        system($cmd);
    }

    sub validate_name {
        my($name) = @_;
        if ($name =~ /^[\w\-]+$/) {
            return(1);
        }
        else {
            return(0);
        }
    }

Body: However, validate_name() allows filenames that begin with a "-". An adversary could supply a filename like "-aR", producing the "ls -l -aR" command (CWE-88), thereby getting a full recursive listing of the entire directory and all of its sub-directories. There are a couple of possible mitigations for this weakness. One would be to refactor the code to avoid using system() altogether, instead relying on internal functions. Another option could be to add a "--" argument to the ls command, such as "ls -l --", so that any remaining arguments are treated as filenames, causing any leading "-" to be treated as part of a filename instead of another option. Another fix might be to change the regular expression used in validate_name to force the first character of the filename to be a letter or number.
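
The last two mitigations described above can be sketched in Python as follows. LOOSE_NAME mirrors the character class used by validate_name(), while STRICT_NAME is an assumed stricter variant that requires the first character to be a letter or digit. The helper listing_argv is hypothetical and only builds the argument list.

```python
import re

# Same character class as the original check: word characters and "-" anywhere.
LOOSE_NAME = re.compile(r"[\w\-]+", re.ASCII)

# Assumed stricter variant: the first character must be a letter or digit.
STRICT_NAME = re.compile(r"[A-Za-z0-9][\w\-]*", re.ASCII)


def validate_name(name):
    return STRICT_NAME.fullmatch(name) is not None


def listing_argv(name):
    if not validate_name(name):
        raise ValueError("name is not well-formed")
    # "--" marks the end of options, so ls treats what follows as a filename
    # even if the validation were later relaxed.
    return ["/bin/ls", "-l", "--", name]


for candidate in ["abc", "d-e-f", "-aR"]:
    loose = LOOSE_NAME.fullmatch(candidate) is not None
    print(candidate, "loose:", loose, "strict:", validate_name(candidate))
```

Using both the stricter pattern and "--" is redundant by design: either one alone prevents "-aR" from being read as options in this sketch.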

Intro: Consider a "CWE Differentiator" application that uses an LLM-based generative AI "chatbot" to explain the difference between two weaknesses. As input, it accepts two CWE IDs, constructs a prompt string, sends the prompt to the chatbot, and prints the results. The prompt string effectively acts as a command to the chatbot component. Assume that invokeChatbot() calls the chatbot and returns the response as a string; the implementation details are not important here.

    prompt = "Explain the difference between {} and {}".format(arg1, arg2)
    result = invokeChatbot(prompt)
    resultHTML = encodeForHTML(result)
    print(resultHTML)

Body: To avoid XSS risks, the code ensures that the response from the chatbot is properly encoded for HTML output. If the user provides CWE-77 and CWE-78, then the resulting prompt would look like:

    Explain the difference between CWE-77 and CWE-78
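
Because the prompt acts as a command to the chatbot, any free-form text in either argument becomes part of that command. A minimal Python sketch of one possible safeguard is to accept only strings that look like CWE IDs before building the prompt. The pattern CWE_ID_RE and the function build_prompt are assumptions made for illustration, and a check like this narrows the input but does not by itself make an LLM robust against prompt injection.

```python
import re

# Assumed shape of a CWE ID: the literal "CWE-" followed by 1 to 5 digits.
CWE_ID_RE = re.compile(r"CWE-[0-9]{1,5}")


def build_prompt(arg1, arg2):
    for arg in (arg1, arg2):
        # Reject anything that is not exactly a CWE ID, so instructions
        # cannot be smuggled in through either argument.
        if not CWE_ID_RE.fullmatch(arg):
            raise ValueError("not a CWE ID: {!r}".format(arg))
    return "Explain the difference between {} and {}".format(arg1, arg2)


print(build_prompt("CWE-77", "CWE-78"))
```

An input such as "CWE-78. Ignore all previous instructions" fails the check and never reaches the prompt string.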

Notes

Theoretical: Many people treat injection only as an input validation problem (CWE-20) because many people do not distinguish between the consequence/attack (injection) and the protection mechanism that prevents the attack from succeeding. However, input validation is only one potential protection mechanism (output encoding is another), and there is a chaining relationship between improper input validation and the improper enforcement of the structure of messages to other components. Other issues not directly related to input validation, such as race conditions, could similarly impact message structure.
