CWE-838 (Inappropriate Encoding for Output Context) - Vulnerability Class 3

3 vulnerabilities classified as CWE-838 (Inappropriate Encoding for Output Context). AI Chinese analysis included.

CWE-838 is an output encoding weakness: the software generates output using an encoding that does not match what the downstream component expects. The receiver then decodes the data differently from what the sender intended, so characters meant as plain data can turn into control characters or structural elements. Attackers exploit this by supplying input that passes the sender's encoding step but becomes active content once the receiver interprets it in a different context, which can lead to cross-site scripting or other injection attacks. To mitigate the risk, determine which encoding each downstream consumer expects, state the encoding explicitly (for example, UTF-8) instead of relying on defaults, and encode for the specific output context, such as an HTML body, an HTML attribute, or a script block.
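The mismatch described above can be reproduced with a character-set example. The following is a minimal sketch, assuming PHP with the iconv extension available; the variable names are illustrative only. It shows the same bytes producing different text depending on which encoding the receiver assumes.

```php
<?php
// The sender emits "café" as UTF-8: "é" is the two bytes 0xC3 0xA9.
$sent = "caf\xC3\xA9";

// A receiver that wrongly assumes ISO-8859-1 treats each byte as one character.
// iconv(from, to, string) re-expresses that misreading as UTF-8 so it can be printed.
$misread = iconv('ISO-8859-1', 'UTF-8', $sent);

// A receiver that uses the encoding the sender actually used recovers the text.
$correct = iconv('UTF-8', 'UTF-8', $sent);

echo $correct, "\n"; // café (on a UTF-8 terminal)
echo $misread, "\n"; // cafÃ© (on a UTF-8 terminal)
```

Here the damage is only cosmetic, but the same mechanism matters for security when the misread bytes happen to form quotes, angle brackets, or other delimiters that the sender never intended to send.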

MITRE CWE Description

The product uses or specifies an encoding when generating output to a downstream component, but the specified encoding is not the same as the encoding that is expected by the downstream component. This weakness can cause the downstream component to use a decoding method that produces different data than what the product intended to send. When the wrong encoding is used - even if closely related - the downstream component could decode the data incorrectly. This can have security consequences when the provided boundaries between control and data are inadvertently broken, because the resulting data could introduce control characters or special elements that were not sent by the product. The resulting data could then be used to bypass protection mechanisms such as input validation, and enable injection attacks. While using output encoding is essential for ensuring that communications between components are accurate, the use of the wrong encoding - even if closely related - could cause the downstream component to misinterpret the output. For example, HTML entity encoding is used for elements in the HTML body of a web page. However, a programmer might use entity encoding when generating output that is used within an attribute of an HTML tag, which could contain functional JavaScript that is not affected by the HTML encoding. While web applications have received the most attention for this problem, this weakness could potentially apply to any type of product that uses a communi…

Common Consequences (1)

Scope: Integrity, Confidentiality, Availability
Impact: Modify Application Data; Execute Unauthorized Code or Commands

An attacker could modify the structure of the message or data being sent to the downstream component, possibly injecting commands.

Mitigations (3)

Implementation: Use context-aware encoding. That is, understand which encoding is being used by the downstream component, and ensure that this encoding is used. If an encoding can be specified, do so, instead of assuming that the default encoding is the same as the default being assumed by the downstream component.

Architecture and Design: Where possible, use communications protocols or data formats that provide strict boundaries between control and data. If this is not feasible, ensure that the protocols or formats allow the communicating components to explicitly state which encoding/decoding method is being used. Some template frameworks provide built-in support.

Architecture and Design: Use a vetted library or framework that does not allow this weakness to occur or provides constructs that make this weakness easier to avoid. For example, consider using the ESAPI Encoding control [REF-45] or a similar tool, library, or framework. These will help the programmer encode outputs in a manner less prone to error. Note that some template mechanisms provide built-in support for the approp…
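The context-aware encoding mitigation can be sketched as one small encoder per output context. This is an illustration under stated assumptions, not a replacement for a vetted library: the helper names encodeForHtml, encodeForScript, and encodeForUrlComponent are hypothetical, and only the built-in PHP functions they wrap (htmlspecialchars, json_encode, rawurlencode) are real APIs.

```php
<?php
declare(strict_types=1);

// Hypothetical helper names; each wraps one built-in PHP function.

/** Encode text for an HTML element body or a quoted attribute value. */
function encodeForHtml(string $value): string
{
    // ENT_QUOTES covers both quote styles; the charset is stated, not assumed.
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

/** Encode a value as a JavaScript string literal for use inside a script block. */
function encodeForScript(string $value): string
{
    // The JSON_HEX_* flags emit <, >, &, ' and " as \uXXXX escapes.
    return json_encode(
        $value,
        JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_THROW_ON_ERROR
    );
}

/** Encode a value as a single URL path segment or query component. */
function encodeForUrlComponent(string $value): string
{
    return rawurlencode($value);
}

$name = "O'Reilly </script><b>";

echo encodeForHtml($name), "\n";
echo encodeForScript($name), "\n";
echo encodeForUrlComponent($name), "\n";
```

The same input yields three different encodings because each context has different delimiters. When serving the page, declaring the matching character set in the Content-Type header (for example, text/html; charset=UTF-8) keeps the receiver's decoding aligned with the charset passed to htmlspecialchars.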

Examples (1)

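This PHP code dynamically builds an HTML page from POST data. The programmer tries to prevent cross-site scripting by passing each value through htmlentities(), but that encoding targets HTML body text. With the flags that were the default before PHP 8.1 (ENT_COMPAT), single quotes pass through unchanged, so a value placed inside a single-quoted attribute can close the attribute and add new ones.

Bad · PHP

    $username = $_POST['username'];
    $picSource = $_POST['picsource'];
    $picAltText = $_POST['picalttext'];
    ...
    echo "<title>Welcome, " . htmlentities($username) . "</title>";
    echo "<img src='" . htmlentities($picSource) . "' alt='" . htmlentities($picAltText) . "' />";
    ...

Attack

An attacker can set picAltText to:

    "altTextHere' onload='alert(document.cookie)"

The injected single quote ends the alt attribute, and the rest of the value becomes an onload attribute that runs attacker-supplied JavaScript.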
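To see why the attack value survives the encoding step, the sketch below runs it through htmlentities() twice: once with ENT_COMPAT (the default before PHP 8.1) and once with ENT_QUOTES. The file name pic.jpg is a placeholder assumed for illustration; the payload is the one shown in the example.

```php
<?php
// Attack value from the example.
$picAltText = "altTextHere' onload='alert(document.cookie)";

// ENT_COMPAT encodes double quotes but leaves single quotes untouched.
$compat = htmlentities($picAltText, ENT_COMPAT, 'UTF-8');

// ENT_QUOTES also encodes single quotes (as &#039; for the HTML 4.01 doctype).
$quoted = htmlentities($picAltText, ENT_QUOTES, 'UTF-8');

// pic.jpg is a placeholder source value.
echo "<img src='pic.jpg' alt='" . $compat . "' />\n"; // onload attribute is injected
echo "<img src='pic.jpg' alt='" . $quoted . "' />\n"; // payload stays inside alt
```

Passing the flags and charset explicitly avoids depending on the PHP version's defaults, which matches the guidance to specify an encoding instead of assuming one.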

CVE ID	Title	CVSS	Severity	Published
CVE-2024-34006	moodle: unsanitized HTML in site log for config_log_created	3.5	-	2024-05-31
CVE-2023-5770	HTML injection in email body through email subject — Proofpoint Enterprise Protection	5.3	Medium	2024-01-09
CVE-2020-7292	Web Gateway (MWG) - Inappropriate Encoding for output context — McAfee Web Gateway (MWG)	4.3	Medium	2020-07-15

CWE-838 (Inappropriate Encoding for Output Context) currently has 3 CVEs classified under it. The CWE taxonomy describes the weakness; review the individual CVEs for product-specific impact.
